Place the 3 page TSS at an explicit location in the 32-bit address space
to avoid conflicting with the loaded raw firmware.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
When the synthetic interrupt controller is enabled, an extra set of MSRs
must be stored in case of migration. There was one MSR missing in the
list, HV_X64_MSR_SINT14 corresponding to the 15th interrupt source from
the synthetic interrupt controller.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Update the kvm-bindings dependency so that Cloud Hypervisor now depends
on the version 0.5.0, which is based on Linux kernel v5.13.0. We still
have to rely on a forked version to be able to serialize all the KVM
structures we need.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Definition of kvm_tdx_init_vm used by INIT_VM has been updated in latest
kernel, needing an update on the Cloud Hypervisor side as well.
Update structure TdxInitVm to fit this change and avoid -EINVAL to be
returned by the kernel.
Signed-off-by: Jiaqi Gao <jiaqi.gao@intel.com>
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Right now, get_dirty_log API has two parameters,
slot and memory_size.
MSHV needs gpa to retrieve the page states. GPA is
needed as MSHV returns the state base on PFN.
Signed-off-by: Muminul Islam <muislam@microsoft.com>
With the new beta version, clippy complains about redundant allocation
when using Arc<Box<dyn T>>, and suggests replacing it simply with
Arc<dyn T>.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Following KVM interfaces, the `hypervisor` crate now provides interfaces
to start/stop the dirty pages logging on a per region basis, and asks
its users (e.g. the `vmm` crate) to iterate over the regions that needs
dirty pages log. MSHV only has a global control to start/stop dirty
pages log on all regions at once.
This patch refactors related APIs from the `hypervisor` crate to provide
a global control to start/stop dirty pages log (following MSHV's
behaviors), and keeps tracking the regions need dirty pages log for
KVM. It avoids leaking hypervisor-specific behaviors out of the
`hypervisor` crate.
Signed-off-by: Bo Chen <chen.bo@intel.com>
This patch extends slightly the current live-migration code path with
the ability to dynamically start and stop logging dirty-pages, which
relies on two new methods added to the `hypervisor::vm::Vm` Trait. This
patch also contains a complete implementation of the two new methods
based on `kvm` and placeholders for `mshv` in the `hypervisor` crate.
Fixes: #2858
Signed-off-by: Bo Chen <chen.bo@intel.com>
We need a dedicated function to enable the SGX attribute capability
through the Hypervisor abstraction.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Previously the same function was used to both create and remove regions.
This worked on KVM because it uses size 0 to indicate removal.
MSHV has two calls -- one for creation and one for removal. It also
requires having the size field available because it is not slot based.
Split set_user_memory_region to {create/remove}_user_memory_region. For
KVM they still use set_user_memory_region underneath, but for MSHV they
map to different functions.
This fixes user memory region removal on MSHV.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
Issue from beta verion of clippy:
Error: --> vm-virtio/src/queue.rs:700:59
|
700 | if let Some(used_event) = self.get_used_event(&mem) {
| ^^^^ help: change this to: `mem`
|
= note: `-D clippy::needless-borrow` implied by `-D warnings`
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow
Signed-off-by: Bo Chen <chen.bo@intel.com>
Not all AArch64 platforms support IPAs up to 40 bits. Since the
kvm-ioctl crate now supports `get_host_ipa_limit` for AArch64,
when creating the VM, it is better to get the IPA size from the
host and use that to create the VM.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
This commit adds a helper `get_host_ipa_limit` to the AArch64
`KvmHypervisor` struct. This helper can be used to get the
`Host_IPA_Limit`, which is the maximum possible value for
IPA_Bits on the host and is dependent on the CPU capability
and the kernel configuration.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
error: name `FinalizeTDX` contains a capitalized acronym
--> vmm/src/vm.rs:274:5
|
274 | FinalizeTDX(hypervisor::HypervisorVmError),
| ^^^^^^^^^^^ help: consider making the acronym lowercase, except the initial letter: `FinalizeTdx`
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#upper_case_acronyms
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
warning: name `TranslateGVA` contains a capitalized acronym
--> hypervisor/src/arch/emulator/mod.rs:51:5
|
51 | TranslateGVA(#[source] anyhow::Error),
| ^^^^^^^^^^^^ help: consider making the acronym lowercase, except the initial letter: `TranslateGva`
|
= note: `#[warn(clippy::upper_case_acronyms)]` on by default
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#upper_case_acronyms
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
refactor vec_with_array_field to common hypervisor code
so that mshv can also make use of it.
Signed-off-by: Vineeth Pillai <viremana@linux.microsoft.com>
On AArch64, interrupt controller (GIC) is emulated by KVM. VMM need to
set IRQ routing for devices, including legacy ones.
Before this commit, IRQ routing was only set for MSI. Legacy routing
entries of type KVM_IRQ_ROUTING_IRQCHIP were missing. That is way legacy
devices (like serial device ttyS0) does not work.
The setting of X86 IRQ routing entries are not impacted.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
Add API to the hypervisor interface and implement for KVM to allow the
special TDX KVM ioctls on the VM and vCPU FDs.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
In particular update for the vmm-sys-util upgrade and all the other
dependent packages. This requires an updated forked version of
kvm-bindings (due to updated vfio-ioctls) but allowed the removal of our
forked version of kvm-ioctls.
The changes to the API from kvm-ioctls and vmm-sys-util required some
other minor changes to the code.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
error: field assignment outside of initializer for an instance created with Default::default()
Error: --> hypervisor/src/kvm/mod.rs:1239:9
|
1239 | state.mp_state = self.get_mp_state()?;
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: `-D clippy::field-reassign-with-default` implied by `-D warnings`
note: consider initializing the variable with `kvm::aarch64::VcpuKvmState { mp_state: self.get_mp_state()?, ..Default::default() }` and removing relevant reassignments
--> hypervisor/src/kvm/mod.rs:1237:9
|
1237 | let mut state = CpuState::default();
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#field_reassign_with_default
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
error: field assignment outside of initializer for an instance created with Default::default()
--> hypervisor/src/kvm/mod.rs:318:9
|
318 | cap.cap = KVM_CAP_SPLIT_IRQCHIP;
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: `-D clippy::field-reassign-with-default` implied by `-D warnings`
note: consider initializing the variable with `kvm_bindings::kvm_enable_cap { cap: KVM_CAP_SPLIT_IRQCHIP, ..Default::default() }` and removing relevant reassignments
--> hypervisor/src/kvm/mod.rs:317:9
|
317 | let mut cap: kvm_enable_cap = Default::default();
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#field_reassign_with_default
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
When a total ordering between multiple atomic variables is not required
then use Ordering::Acquire with atomic loads and Ordering::Release with
atomic stores.
This will improve performance as this does not require a memory fence
on x86_64 which Ordering::SeqCst will use.
Add a comment to the code in the vCPU handling code where it operates on
multiple atomics to explain why Ordering::SeqCst is required.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This interface is used by the vCPU thread to delegate responsibility for
handling MMIO/PIO operations and to support different approaches than a
VM exit.
During profiling I found that we were spending 13.75% of the boot CPU
uage acquiring access to the object holding the VmmOps via
ArcSwap::load_full()
13.75% 6.02% vcpu0 cloud-hypervisor [.] arc_swap::ArcSwapAny<T,S>::load_full
|
---arc_swap::ArcSwapAny<T,S>::load_full
|
--13.43%--<hypervisor::kvm::KvmVcpu as hypervisor::cpu::Vcpu>::run
std::sys_common::backtrace::__rust_begin_short_backtrace
core::ops::function::FnOnce::call_once{{vtable-shim}}
std::sys::unix:🧵:Thread:🆕:thread_start
However since the object implementing VmmOps does not need to be mutable
and it is only used from the vCPU side we can change the ownership to
being a simple Arc<> that is passed in when calling create_vcpu().
This completely removes the above CPU usage from subsequent profiles.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Cloning the ArcSwapOption (like the ArcSwap) does not act like a
.clone() on an Arc, instead an entirely new ArcSwap is created with the
same contents. To correctly share the ArcSwap needs to be placed inside
an Arc.
See: 2433d5719b (diff-6c6d94533c44c19bd1416ef17bad1a878e63dca6e98d59181228fbe8f967c62bR6)
Due to this being wrongly used ::clone() was removed from
ArcSwap/ArcSwapOption in 1.0.0.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The logic to handle AArch64 system event was: SHUTDOWN and RESET were
all treated as RESET.
Now we handle them differently:
- RESET event will trigger Vmm::vm_reboot(),
- SHUTDOWN event will trigger Vmm::vm_shutdown().
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
The snasphot/restore feature is not working because some CPU states are
not properly saved, which means they can't be restored later on.
First thing, we ensure the CPUID is stored so that it can be properly
restored later. The code is simplified and pushed down to the hypervisor
crate.
Second thing, we identify for each vCPU if the Hyper-V SynIC device is
emulated or not. In case it is, that means some specific MSRs will be
set by the guest. These MSRs must be saved in order to properly restore
the VM.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
"Using a stable sort consumes more memory and cpu cycles. Because values
which compare equal are identical, preserving their relative order (the
guarantee that a stable sort provides) means nothing, while the extra
costs still apply."
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The OneRegister literally means "one (arbitrary) register". Just call it
"Register" instead. There is no need to inherit KVM's naming scheme in
the hypervisor agnostic code.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
Run loop in hypervisor needs a callback mechanism to access resources
like guest memory, mmio, pio etc.
VmmOps trait is introduced here, which is implemented by vmm module.
While handling vcpuexits in run loop, this trait allows hypervisor
module access to the above mentioned resources via callbacks.
Signed-off-by: Praveen Paladugu <prapal@microsoft.com>
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This commit implements the `get_device_attr` method for the
`KVM_GET_DEVICE_ATTR` ioctl. This ioctl will be used in retrieving
the GIC status.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
This commit adds methods to save/restore AArch64 vCPU registers,
including:
1. The AArch64 `VcpuKvmState` structure.
2. Some `Vcpu` trait methods of the `KvmVcpu` structure to
enable the save/restore of the AArch64 vCPU states.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
This commit ports code from firecracker and refactors the existing
AArch64 code as the preparation for implementing save/restore
AArch64 vCPU, including:
1. Modification of `arm64_core_reg` macro to retrive the index of
arm64 core register and implemention of a helper to determine if
a register is a system register.
2. Move some macros and helpers in `arch` crate to the `hypervisor`
crate.
3. Added related unit tests for above functions and macros.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
Currently we don't need to do anything to service these exits but when
the synthetic interrupt controller is active an exit will be triggered
to notify the VMM of details of the synthetic interrupt page.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The new function already checks if the API version is compatible. There
is no need to expose the get_api_version function to code outside
hypervisor crate.
Signed-off-by: Wei Liu <liuwe@microsoft.com>