... enabled VMs. IOEvents are not supported in case of SEV-SNP VMs. All
the IO events are delievered via GHCB protocol.
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
This will help in identify whether a VM supports sev-snp and based on
that disable/enable certain features.
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
Unaccepted GPA is usually thrown by Microsoft hypervisor in case of
mismatch between GPA and GVA mappings. This is a fatal message from the
hypervisor perspective so we would need to error out from the vcpu run
loop. Along with add some debug message to identify the broken mapping
between GVA and GPA.
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
This patch bumps the following crates, including `kvm-bindings@0.7.0`*,
`kvm-ioctls@0.16.0`**, `linux-loader@0.11.0`, `versionize@0.2.0`,
`versionize_derive@0.1.6`***, `vhost@0.10.0`,
`vhost-user-backend@0.13.1`, `virtio-queue@0.11.0`, `vm-memory@0.14.0`,
`vmm-sys-util@0.12.1`, and the latest of `vfio-bindings`, `vfio-ioctls`,
`mshv-bindings`,`mshv-ioctls`, and `vfio-user`.
* A fork of the `kvm-bindings` crate is being used to support
serialization of various structs for migration [1]. Also, code changes
are made to accommodate the updated `struct xsave` from the Linux
kernel. Note: these changes related to `struct xsave` break
live-upgrade.
** The new `kvm-ioctls` crate introduced breaking changes for
the `get/set_one_reg` API on `aarch64` [2], so code changes are made to
the new APIs.
*** A fork of the `versionize_derive` crate is being used to support
versionize on packed structs [3].
[1] https://github.com/cloud-hypervisor/kvm-bindings/tree/ch-v0.7.0
[2] https://github.com/rust-vmm/kvm-ioctls/pull/223
[3] https://github.com/cloud-hypervisor/versionize_derive/tree/ch-0.1.6Fixes: #6072
Signed-off-by: Bo Chen <chen.bo@intel.com>
This register configures the SEV feature control
state on a virtual processor.
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
Consistent with the other data structures and constants used in TDX
support code import the necessary structures from the kernel for
accessing the vmcall structure.
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
Use right and exact size 32 bytes for host data field
for completing the isolated import. This way OOB
can be avoided during a function call.
Signed-off-by: Muminul Islam <muislam@microsoft.com>
This patch adds missing new lines after functions,
fixes few typos in the comments, adds few missing
comments to SNP related functions.
Signed-off-by: Muminul Islam <muislam@microsoft.com>
Currently there are some inconsistencies in Cargo.toml which is causing
the following warnings during the build process:
Error parsing Cargo.toml manifest, fallback to caching entire file:
Invalid TOML document: expected key-value, found comma
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
Add necessary API to retrieve cpuid leaf on MSHV.
This API is used to update cpuid information
during the parsing of the igvm file.
Microsoft hypervisor does not provide common
CpuID like KVM. That's why we need to call this API
during the IGVM parsing.
Signed-off-by: Muminul Islam <muislam@microsoft.com>
As part SMP bringup for a SEV-SNP guest, BSP sets up the VMSA page for
each AP threads and informs hypervisor about the same using a VMGEXIT.
Thus, extend the current GHCB interface to handle this scenario.
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
SEV-SNP guest can request AMD's secure co-processor i.e., PSP to
generate an runtime attesation report. During this process guest needs
to inform PSP about the request and response GPAs where that report
would be generated by the PSP. This is handled via a VMGEXIT request.
Thus, extend the current GHCB handling to add support for it.
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
This is very similar MMIO read emulation for SEV-SNP guest.
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
MMIO emulation is also performed via VMGEXIT in case of SEV-SNP guest.
Emulation is done in a very similar way like a regular guest. Just need
to make sure that guest memory is access via read/write GPA hypercall
instead of directly accessing it.
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
Legacy port emulation requires reading RAX register from GHCB page for
SEV-SNP guest. This is the major difference between a regular guest and
SEV-SNP enabled guest.
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
Currently MSHV does not support fetching extended guest report and thus
return an appropriate error stating the NAE event is not valid.
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
Signed-off-by: Nuno Das Neves <nudasnev@microsoft.com>
As part of this handling there are 4 different operations:
1. Getting the hypervisor preffered doorbell page GPA.
2. Informing hypervisor about the doorbell page chosen by the guest
3. Querying the GPA of the doorbell page
4. Clearing the GPA of the doorbell page from hypervisor
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
As part of handling this request, hypervisor is expected to three
things:
1. Maximum GHCB protocol version supported.
2. Minimum GHCB protocol version supported.
3. SEV-page table encryption bit number.
If the guest cannot support the protocol range supplied by the
hypervisor, it should terminate
Signed-off-by: Muminul Islam <muislam@microsoft.com>
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
SEV-SNP guest allocates a GHCB page and in order to update hypervisor
about the same, there is a vmgexit which allows registering GHCB page
with the hypervisor.
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
A VMGEXIT exit occurs for any of the listed NAE events in the GHCB
specification [1] (e.g. CPUID, RDMSR/WRMSR, MMIO, port IO, etc.). Some
of these events are handled by hypervisor while other are handled by
VMM. Currently, we are adding support for one such request i.e.,
report supported SEV-SNP features by hypervisor.
[1] GHCB protocol specification:
https://www.amd.com/content/dam/amd/en/documents/epyc-technical-docs/specifications/56421.pdf
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
This is the function that needs to be called by the VMM
to inform the MSHV that isolation is complete and inform
PSP about this completion.
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
Add hypervisor VM specific API to import the isolated
pages. Hypervisor adds those pages for PSP measurement.
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
For a SEV-SNP enabled partition on MSHV, some of the VMGEXITS are
offloaded for Hypervisor to handle while the rest are handled by VMM.
By setting this additional partition property hypervisor is informed
about the VMGEXITs it needs to take care off, rest all would be handled
by the CloudHypervisor.
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
As part of this initialization for a SEV-SNP VM on MSHV, it is required
that we transition the guest state to secure state using partition
hypercall. This implies all the created VPs will transition to secure
state and could access the guest encrypted memory.
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
It's a requirement that a SEV-SNP enabled guest on MSHV must have
isolation policy set before launching the guest.
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
Include the TSC frequency as part of the KVM state so that it will be
restored at the destination.
This ensures migration works correctly between hosts that have a
different TSC frequency if the guest is running with TSC as the source
of timekeeping.
Fixes: #5786
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
This fixes all typos found by the typos utility with respect to the config file.
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
Update to the latest vm-memory and all the crates that also depend upon
it.
Fix some deprecation warnings.
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
This feature flag gates the development for SEV-SNP enabled guest.
Also add a helper function to identify if SNP should be enabled for the
guest.
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
Assume rax is 0xfee003e0 and the displacement is negative 0x60. The effective
address is then 0xfee00380. This is perfectly valid.
Example instruction:
c7 40 a0 00 10 00 00 movl $0x1000,-0x60(%rax)
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
warning: this argument is a mutable reference, but not used mutably
--> hypervisor/src/arch/x86/emulator/instructions/mod.rs:22:15
|
22 | platform: &mut dyn PlatformEmulator<CpuState = T>,
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: consider changing to: `&dyn PlatformEmulator<CpuState = T>`
|
= note: this is cfg-gated and may require further changes
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_pass_by_ref_mut
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
error: private item shadows public glob re-export
Error: --> hypervisor/src/mshv/mod.rs:42:27
|
42 | CpuIdEntry, FpuState, LapicState, MsrEntry, SpecialRegisters, StandardRegisters,
| ^^^^^^^^^^
|
note: the name `LapicState` in the type namespace is supposed to be publicly re-exported here
--> hypervisor/src/mshv/mod.rs:16:9
|
16 | pub use mshv_bindings::*;
| ^^^^^^^^^^^^^^^^
note: but the private item here shadows it
--> hypervisor/src/mshv/mod.rs:42:27
|
42 | CpuIdEntry, FpuState, LapicState, MsrEntry, SpecialRegisters, StandardRegisters,
| ^^^^^^^^^^
= note: `-D hidden-glob-reexports` implied by `-D warnings`
Signed-off-by: Yu Li <liyu.yukiteru@bytedance.com>
On x86-64, when the underlying hypervisor platform is KVM, no
instruction emulator is necessary. KVM handles instruction boundaries
internally.
This change allows to skip the iced-x86 dependency on KVM, improving
build times, prunes the dependency graph and reduces network traffic
during the initial build.
For Hyper-V, the emulator is still necessary on x86-64, so nothing
changes there.
Signed-off-by: Christian Blichmann <cblichmann@google.com>
Bump to the latest rust-vmm crates, including vm-memory, vfio,
vfio-bindings, vfio-user, virtio-bindings, virtio-queue, linux-loader,
vhost, and vhost-user-backend,
Signed-off-by: Bo Chen <chen.bo@intel.com>
Passing the CPUID leafs with the topology is integrated into the common
mechanism of setting and patching CPUID in Cloud Hypervisor. All the
CPUID values will be passed to the hypervisor through the register
intercept call.
Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
On KVM this is provided by an ioctl, on MSHV this is constant. Although
there is a HV_MAXIMUM_PROCESSORS constant the MSHV ioctl API is limited
to u8.
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
Having PMU in guests isn't critical, and not all hardware supports
it (e.g. Apple Silicon).
CpuManager::init_pmu already has a fallback for if PMU is not
supported by the VCPU, but we weren't getting that far, because we
would always try to initialise the VCPU with KVM_ARM_VCPU_PMU_V3, and
then bail when it returned with EINVAL.
Signed-off-by: Alyssa Ross <hi@alyssa.is>
Recently generated mshv-bindings has most of the registers
renamed. This patch renames some of the MSHV registers.
Signed-off-by: Muminul Islam <muislam@microsoft.com>
It seems like these examples were always intended to be doctests,
since there are lines marked with "#" so that they are excluded from
the generated documentation, but they were not recognised as doc tests
because they were not formatted correctly.
The code needed some adjustments so that it would actually compile and
run as doctests.
Signed-off-by: Alyssa Ross <hi@alyssa.is>
When doctests are built, the crate is built with itself as a
dependency via --extern. This causes a compiler error if using a
module with the name same as the crate, because it's ambiguous whether
it's referring to the module, or the extern version of the crate, so
it's necessary to disambiguate when using the hypervisor module here.
Fixes running cargo test --doc --workspace.
Signed-off-by: Alyssa Ross <hi@alyssa.is>
It was not possible to build just hypervisor with Cargo's -p flag,
because it was not properly specifying the features it requires from
vfio-ioctls.
Signed-off-by: Alyssa Ross <hi@alyssa.is>
This hypervisor leaf includes details of the TSC frequency if that is
available from KVM. This can be used to efficiently calculate time
passed when there is an invariant TSC.
TEST=Run `cpuid` in the guest and observe the frequency populated.
Fixes: #5178
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This is required for booting Linux:
From: https://lore.kernel.org/all/20221028141220.29217-3-kirill.shutemov@linux.intel.com/
"""
Virtualization Exceptions (#VE) are delivered to TDX guests due to
specific guest actions such as using specific instructions or accessing
a specific MSR.
Notable reason for #VE is access to specific guest physical addresses.
It requires special security considerations as it is not fully in
control of the guest kernel. VMM can remove a page from EPT page table
and trigger #VE on access.
The primary use-case for #VE on a memory access is MMIO: VMM removes
page from EPT to trigger exception in the guest which allows guest to
emulate MMIO with hypercalls.
MMIO only happens on shared memory. All conventional kernel memory is
private. This includes everything from kernel stacks to kernel text.
Handling exceptions on arbitrary accesses to kernel memory is
essentially impossible as handling #VE may require access to memory
that also triggers the exception.
TDX module provides mechanism to disable #VE delivery on access to
private memory. If SEPT_VE_DISABLE TD attribute is set, private EPT
violation will not be reflected to the guest as #VE, but will trigger
exit to VMM.
Make sure the attribute is set by VMM. Panic otherwise.
There's small window during the boot before the check where kernel has
early #VE handler. But the handler is only for port I/O and panic as
soon as it sees any other #VE reason.
SEPT_VE_DISABLE makes SEPT violation unrecoverable and terminating the
TD is the only option.
Kernel has no legitimate use-cases for #VE on private memory. It is
either a guest kernel bug (like access of unaccepted memory) or
malicious/buggy VMM that removes guest page that is still in use.
In both cases terminating TD is the right thing to do.
"""
With this change Cloud Hypervisor can boot the current Linux guest
kernel.
Reported-By: Jiaqi Gao <jiaqi.gao@intel.com
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
In order to comply with latest TDX version, we rely onto the branch
kvm-upstream-2022.08.07-v5.19-rc8 from https://github.com/intel/tdx
repository. Updates are based on changes that happened in
arch/x86/include/uapi/asm/kvm.h headers file.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
There was an unnecessary change in previous PR #5077.
This is the follow-up clean up patch.
Right now there is no use case of the drive of
Eq and PartialEq.
Signed-off-by: Muminul Islam <muislam@microsoft.com>
MSHV does not require to ensure MMIO/PIO exits complete
before pausing. This patch makes sure the above requirement
by checking the hypervisor type run-time.
Fixes#5037
Signed-off-by: Muminul Islam <muislam@microsoft.com>
With this bump there was a change in one of the externally exposed
variable. Thus, the use of that variable in CLH must be adjusted
accordingly.
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
The double underscore made it different from how other projects would
name this particular macro.
No functional change.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
In particular update to latest linux-loader release and point to latest
vfio repository for both crates hosted there.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
TDX functionality is not currently available on MSHV but we should not
preclude building a binary that can run on both.
Fixes: #4677
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The latest kvm-ioctls contains a breaking change to its API. Now Arm's
get/set_one_reg use u128 instead of u64.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
Multiple rust-vmm crates must be updated at once given the vm-memory one
has been updated and they all rely on vm-memory.
- vm-memory from 0.8.0 to 0.9.0
- vhost from 0.4.0 to 0.5.0
- virtio-queue from 0.5.0 to 0.6.0
- vhost-user-backend from 0.6.0 to 0.7.0
- linux-loader from 0.4.0 to 0.5.0
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Use VgicConfig to initialize Vgic.
Use Gic::create_default_config everywhere so we don't always recompute
redist/msi registers.
Add a helper create_test_vgic_config for tests in hypervisor crate.
Signed-off-by: Nuno Das Neves <nudasnev@microsoft.com>
VgicConfig structure will be used for initializing the Vgic.
Gic::create_default_config will be used everywhere we currently compute
redist/msi registers.
Signed-off-by: Nuno Das Neves <nudasnev@microsoft.com>
Set the maximum number of HW breakpoints according to the value returned
from `Hypervisor::get_guest_debug_hw_bps()`.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
Added `Hypervisor::get_guest_debug_hw_bps()` for fetching the number of
supported hardware breakpoints.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>