Commit Graph

334 Commits

Author SHA1 Message Date
Jinank Jain
96bc282759 hypervisor: mshv: Add VmFd to MshvVcpu struct
This would be required later to implement few additional operations on
top of it.

Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
2023-11-16 14:58:53 -08:00
Jinank Jain
0287e6a603 hypervisor: Add support for MMIO write emulation
This is very similar MMIO read emulation for SEV-SNP guest.

Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
2023-10-30 10:23:52 -07:00
Jinank Jain
ac43825f79 hypervisor: Add support MMIO read VMGEXIT
MMIO emulation is also performed via VMGEXIT in case of SEV-SNP guest.
Emulation is done in a very similar way like a regular guest. Just need
to make sure that guest memory is access via read/write GPA hypercall
instead of directly accessing it.

Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
2023-10-30 10:23:52 -07:00
Jinank Jain
7975207e0f hypervisor: Add support for legacy I/O port emulation
Legacy port emulation requires reading RAX register from GHCB page for
SEV-SNP guest. This is the major difference between a regular guest and
SEV-SNP enabled guest.

Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
2023-10-30 10:23:52 -07:00
Jinank Jain
e2288a8d2c hypervisor: Add support for handling extended guest request
Currently MSHV does not support fetching extended guest report and thus
return an appropriate error stating the NAE event is not valid.

Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
Signed-off-by: Nuno Das Neves <nudasnev@microsoft.com>
2023-10-30 10:23:52 -07:00
Jinank Jain
cb5ea05945 hypervisor: Add support for handling #HV Doorbell Page
As part of this handling there are 4 different operations:

1. Getting the hypervisor preffered doorbell page GPA.
2. Informing hypervisor about the doorbell page chosen by the guest
3. Querying the GPA of the doorbell page
4. Clearing the GPA of the doorbell page from hypervisor

Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
2023-10-30 10:23:52 -07:00
Jinank Jain
d68fec594e hypervisor: Add support for handling SEV INFO request
As part of handling this request, hypervisor is expected to three
things:

1. Maximum GHCB protocol version supported.
2. Minimum GHCB protocol version supported.
3. SEV-page table encryption bit number.

If the guest cannot support the protocol range supplied by the
hypervisor, it should terminate

Signed-off-by: Muminul Islam <muislam@microsoft.com>
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
2023-10-30 10:23:52 -07:00
Jinank Jain
6f4d82bd61 hypervisor: Add support for registering GHCB GPA with hypervisor
SEV-SNP guest allocates a GHCB page and in order to update hypervisor
about the same, there is a vmgexit which allows registering GHCB page
with the hypervisor.

Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
2023-10-30 10:23:52 -07:00
Jinank Jain
437e6088e6 hypervisor: Add support for handling VMGEXIT for SEV-SNP guest
A VMGEXIT exit occurs for any of the listed NAE events in the GHCB
specification [1] (e.g. CPUID, RDMSR/WRMSR, MMIO, port IO, etc.). Some
of these events are handled by hypervisor while other are handled by
VMM. Currently, we are adding support for one such request i.e.,
report supported SEV-SNP features by hypervisor.

[1] GHCB protocol specification:
https://www.amd.com/content/dam/amd/en/documents/epyc-technical-docs/specifications/56421.pdf

Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
2023-10-30 10:23:52 -07:00
Muminul Islam
5bd113e625 hypervisor: Add API to complete isolated import
This is the function that needs to be called by the VMM
to inform the MSHV that isolation is complete and inform
PSP about this completion.

Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
2023-10-24 13:02:34 -07:00
Muminul Islam
dc3903012d hypervisor: Add API to import the isolated pages
Add hypervisor VM specific API to import the isolated
pages. Hypervisor adds those pages for PSP measurement.

Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
2023-10-24 13:02:34 -07:00
Jinank Jain
1afac185ff hypervisor: Enable VMGEXIT offload for SEV-SNP partition
For a SEV-SNP enabled partition on MSHV, some of the VMGEXITS are
offloaded for Hypervisor to handle while the rest are handled by VMM.
By setting this additional partition property hypervisor is informed
about the VMGEXITs it needs to take care off, rest all would be handled
by the CloudHypervisor.

Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
2023-10-17 14:15:38 -07:00
Jinank Jain
1b59ab3d7b vmm, hypervisor: Initialize SEV-SNP VM
As part of this initialization for a SEV-SNP VM on MSHV, it is required
that we transition the guest state to secure state using partition
hypercall. This implies all the created VPs will transition to secure
state and could access the guest encrypted memory.

Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
2023-10-17 17:45:28 +01:00
Jinank Jain
a5763bcb6c hypervisor: Set isolation policy for SNP guest
It's a requirement that a SEV-SNP enabled guest on MSHV must have
isolation policy set before launching the guest.

Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
2023-10-11 17:15:51 -07:00
Rob Bradford
44f200d67d hypervisor: Set destination vCPU TSC frequency to source
Include the TSC frequency as part of the KVM state so that it will be
restored at the destination.

This ensures migration works correctly between hosts that have a
different TSC frequency if the guest is running with TSC as the source
of timekeeping.

Fixes: #5786

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2023-09-20 09:13:42 -07:00
Philipp Schuster
7bf0cc1ed5 misc: Fix various spelling errors using typos
This fixes all typos found by the typos utility with respect to the config file.

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
2023-09-09 10:46:21 +01:00
Jinank Jain
200cba0e20 vmm: Refactor VM creation workflow
This refactoring is required to add support for creating SEV-SNP enabled
VM.

Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
2023-09-07 12:52:27 +01:00
Philipp Schuster
556bda74a0 hypervisor: emulator: Use wrapping add for memory offset
Assume rax is 0xfee003e0 and the displacement is negative 0x60. The effective
address is then 0xfee00380. This is perfectly valid.

Example instruction:
c7 40 a0 00 10 00 00    movl   $0x1000,-0x60(%rax)

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
2023-09-01 14:27:54 +01:00
Rob Bradford
239f422203 hypervisor: x86: emulator: Remove unncessary mut from reference
warning: this argument is a mutable reference, but not used mutably
  --> hypervisor/src/arch/x86/emulator/instructions/mod.rs:22:15
   |
22 |     platform: &mut dyn PlatformEmulator<CpuState = T>,
   |               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: consider changing to: `&dyn PlatformEmulator<CpuState = T>`
   |
   = note: this is cfg-gated and may require further changes
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_pass_by_ref_mut

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2023-08-22 12:01:54 +01:00
Wei Liu
dbe67fca7f hypervisor: mshv: handle APIC EOI message
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2023-08-21 17:20:05 -07:00
Philipp Schuster
442ac9056c x86 emulator: add Mov_moffs_AX & Mov_AX_moffs (16,32,64)
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
2023-08-01 20:14:10 +01:00
Yu Li
8ab2d5e539 build: Fix beta clippy issue: private item shadows public glob re-export
error: private item shadows public glob re-export
Error:   --> hypervisor/src/mshv/mod.rs:42:27
   |
42 |     CpuIdEntry, FpuState, LapicState, MsrEntry, SpecialRegisters, StandardRegisters,
   |                           ^^^^^^^^^^
   |
note: the name `LapicState` in the type namespace is supposed to be publicly re-exported here
  --> hypervisor/src/mshv/mod.rs:16:9
   |
16 | pub use mshv_bindings::*;
   |         ^^^^^^^^^^^^^^^^
note: but the private item here shadows it
  --> hypervisor/src/mshv/mod.rs:42:27
   |
42 |     CpuIdEntry, FpuState, LapicState, MsrEntry, SpecialRegisters, StandardRegisters,
   |                           ^^^^^^^^^^
   = note: `-D hidden-glob-reexports` implied by `-D warnings`

Signed-off-by: Yu Li <liyu.yukiteru@bytedance.com>
2023-07-13 08:16:30 -07:00
Christian Blichmann
b6d009830d hypervisor: x86: Emulator is only needed on mshv, not kvm
On x86-64, when the underlying hypervisor platform is KVM, no
instruction emulator is necessary. KVM handles instruction boundaries
internally.

This change allows to skip the iced-x86 dependency on KVM, improving
build times, prunes the dependency graph and reduces network traffic
during the initial build.

For Hyper-V, the emulator is still necessary on x86-64, so nothing
changes there.

Signed-off-by: Christian Blichmann <cblichmann@google.com>
2023-07-04 08:29:24 +01:00
Anatol Belski
7df80220ec hyperivsor: Add infrastructure to determine CPU vendor
Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
2023-05-31 23:54:33 +02:00
Anatol Belski
35ecfb6ec5 hypervisor: mshv: Implement set_cpuid2 call
Passing the CPUID leafs with the topology is integrated into the common
mechanism of setting and patching CPUID in Cloud Hypervisor. All the
CPUID values will be passed to the hypervisor through the register
intercept call.

Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
2023-05-08 08:50:09 -07:00
Rob Bradford
ceb8151747 hypervisor, vmm: Limit max number of vCPUs to hypervisor maximum
On KVM this is provided by an ioctl, on MSHV this is constant. Although
there is a HV_MAXIMUM_PROCESSORS constant the MSHV ioctl API is limited
to u8.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2023-04-22 10:35:39 +01:00
Rafael Mendonca
6379074264 misc: Remove unnecessary clippy directives
Clippy passes fine without these.

Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com>
2023-04-18 10:48:31 -07:00
Alyssa Ross
9b724303ac vmm: only use KVM_ARM_VCPU_PMU_V3 if available
Having PMU in guests isn't critical, and not all hardware supports
it (e.g. Apple Silicon).

CpuManager::init_pmu already has a fallback for if PMU is not
supported by the VCPU, but we weren't getting that far, because we
would always try to initialise the VCPU with KVM_ARM_VCPU_PMU_V3, and
then bail when it returned with EINVAL.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2023-04-13 09:02:55 +08:00
Muminul Islam
3096f1d42f hypervisor: Fix few register names on MSHV
Recently generated mshv-bindings has most of the registers
renamed. This patch renames some of the MSHV registers.

Signed-off-by: Muminul Islam <muislam@microsoft.com>
2023-04-10 14:28:41 -07:00
Alyssa Ross
755cabea4c hypervisor: use proper doc tests for examples
It seems like these examples were always intended to be doctests,
since there are lines marked with "#" so that they are excluded from
the generated documentation, but they were not recognised as doc tests
because they were not formatted correctly.

The code needed some adjustments so that it would actually compile and
run as doctests.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2023-04-05 11:22:47 +01:00
Alyssa Ross
1ed4898d28 hypervisor: fix building doctests
When doctests are built, the crate is built with itself as a
dependency via --extern.  This causes a compiler error if using a
module with the name same as the crate, because it's ambiguous whether
it's referring to the module, or the extern version of the crate, so
it's necessary to disambiguate when using the hypervisor module here.

Fixes running cargo test --doc --workspace.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2023-04-05 11:22:47 +01:00
Wei Liu
de3ca97095 hypervisor: rename get_cpuid to get_supported_cpuid
To better reflect its nature and avoid confusion with get_cpuid2.

No functional change.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2023-02-23 13:03:12 +00:00
Rob Bradford
c22c4675b3 arch, hypervisor: Populate CPUID leaf 0x4000_0010 (TSC frequency)
This hypervisor leaf includes details of the TSC frequency if that is
available from KVM. This can be used to efficiently calculate time
passed when there is an invariant TSC.

TEST=Run `cpuid` in the guest and observe the frequency populated.

Fixes: #5178

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2023-02-09 18:32:21 +01:00
Rob Bradford
69e8f60b91 tdx: Set the SEPT_VE_DISABLE attribute
This is required for booting Linux:

From: https://lore.kernel.org/all/20221028141220.29217-3-kirill.shutemov@linux.intel.com/

"""

Virtualization Exceptions (#VE) are delivered to TDX guests due to
specific guest actions such as using specific instructions or accessing
a specific MSR.

Notable reason for #VE is access to specific guest physical addresses.
It requires special security considerations as it is not fully in
control of the guest kernel. VMM can remove a page from EPT page table
and trigger #VE on access.

The primary use-case for #VE on a memory access is MMIO: VMM removes
page from EPT to trigger exception in the guest which allows guest to
emulate MMIO with hypercalls.

MMIO only happens on shared memory. All conventional kernel memory is
private. This includes everything from kernel stacks to kernel text.

Handling exceptions on arbitrary accesses to kernel memory is
essentially impossible as handling #VE may require access to memory
that also triggers the exception.

TDX module provides mechanism to disable #VE delivery on access to
private memory. If SEPT_VE_DISABLE TD attribute is set, private EPT
violation will not be reflected to the guest as #VE, but will trigger
exit to VMM.

Make sure the attribute is set by VMM. Panic otherwise.

There's small window during the boot before the check where kernel has
early #VE handler. But the handler is only for port I/O and panic as
soon as it sees any other #VE reason.

SEPT_VE_DISABLE makes SEPT violation unrecoverable and terminating the
TD is the only option.

Kernel has no legitimate use-cases for #VE on private memory. It is
either a guest kernel bug (like access of unaccepted memory) or
malicious/buggy VMM that removes guest page that is still in use.

In both cases terminating TD is the right thing to do.

"""

With this change Cloud Hypervisor can boot the current Linux guest
kernel.

Reported-By: Jiaqi Gao <jiaqi.gao@intel.com
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2023-02-02 14:53:59 +00:00
Praveen K Paladugu
ad202f9b7a hypervisor: x86: emulate MOVSB
Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
2023-01-27 21:14:38 +00:00
Wei Liu
3a225aaa23 hypervisor: x86: emulate MOVSW
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2023-01-27 21:14:38 +00:00
Wei Liu
1bfa07f48e hypervisor: x86: use a macro to generate emulate function for movs
No functional change.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2023-01-27 21:14:38 +00:00
Sebastien Boeuf
e4ae668bcd tdx: Update support based on kvm-upstream v5.19
In order to comply with latest TDX version, we rely onto the branch
kvm-upstream-2022.08.07-v5.19-rc8 from https://github.com/intel/tdx
repository. Updates are based on changes that happened in
arch/x86/include/uapi/asm/kvm.h headers file.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2023-01-20 09:59:56 +00:00
Muminul Islam
7d8f795430 hypervisor: remove unnecessary derive of HypervisorType
There was an unnecessary change in previous PR #5077.
This is the follow-up clean up patch.

Right now there is no use case of the drive of
Eq and PartialEq.

Signed-off-by: Muminul Islam <muislam@microsoft.com>
2023-01-12 09:03:28 +01:00
Muminul Islam
4e3bc20f2c vmm: Ensure PIO/MMIO exits complete before pausing only for KVM
MSHV does not require to ensure MMIO/PIO exits complete
before pausing. This patch makes sure the above requirement
by checking the hypervisor type run-time.

Fixes #5037

Signed-off-by: Muminul Islam <muislam@microsoft.com>
2023-01-11 17:15:56 +00:00
Jinank Jain
8914ce9da8 build: Bump mshv-ioctls from 10d0c52 to ef01a5a
With this bump there was a change in one of the externally exposed
variable. Thus, the use of that variable in CLH must be adjusted
accordingly.

Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
2022-12-20 10:10:34 +00:00
Wei Liu
3a232ef31a hypervisor: kvm: aarch64: remove repetition in offset_of
The repetition served no purpose.

No functional change.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-12-16 17:04:43 +00:00
Wei Liu
cd83d258b8 hypervisor: kvm: aarch64: rename offset__of to offset_of
The double underscore made it different from how other projects would
name this particular macro.

No functional change.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-12-16 17:04:43 +00:00
Rob Bradford
5e52729453 misc: Automatically fix cargo clippy issues added in 1.65 (stable)
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-12-14 14:27:19 +00:00
Rob Bradford
a48d7c281e vmm: seccomp: Remove unreachable patterns
Make HypervisorType enum's members conditional on build time features.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-12-13 18:10:42 +00:00
Rob Bradford
4b08142117 misc: Remove #![allow(clippy::significant_drop_in_scrutinee)]
This isn't supported by clippy on Rust 1.60 but also no longer seems to
be required.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-12-07 17:50:48 +00:00
Rob Bradford
3888f57600 aarch64: Remove unnecessary casts (beta clippy check)
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-12-01 17:02:30 +00:00
Wei Liu
7f2723d9c6 hypervisor: x86: add two safety comments
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-11-18 12:50:01 +00:00
Wei Liu
6c89c541da hypervisor: kvm: add two safety comments
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-11-18 12:50:01 +00:00
Wei Liu
d272113d0c hypervisor: mshv: modify two safety comments
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-11-18 12:50:01 +00:00