On AArch64, interrupt controller (GIC) is emulated by KVM. VMM need to
set IRQ routing for devices, including legacy ones.
Before this commit, IRQ routing was only set for MSI. Legacy routing
entries of type KVM_IRQ_ROUTING_IRQCHIP were missing. That is way legacy
devices (like serial device ttyS0) does not work.
The setting of X86 IRQ routing entries are not impacted.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
Add API to the hypervisor interface and implement for KVM to allow the
special TDX KVM ioctls on the VM and vCPU FDs.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
These need to be updated together as the kvm-ioctls depends upon a
strictly newer version of kvm-bindings which requires a rebase in the CH
fork.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
So far we've only had the need to emulate one instruction. There is no
need to use a HashMap when a simple tuple for the initial mapping will
do.
We can bring back the HashMap once more sophisticated use cases surface.
No functional change.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
This avoids code complexity down the line when we get around
implementing Windows support.
No functional change.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
In particular update for the vmm-sys-util upgrade and all the other
dependent packages. This requires an updated forked version of
kvm-bindings (due to updated vfio-ioctls) but allowed the removal of our
forked version of kvm-ioctls.
The changes to the API from kvm-ioctls and vmm-sys-util required some
other minor changes to the code.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
According to Intel's mnemonic (which is used by iced-x86) the first
argument is destination while the second is source.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
We don't have an easy way to figure out if a GPA points to normal memory
or device memory, but the guest's normal memory regions shouldn't
overlap with device regions. We can simply try to do a normal memory
read / write, and proceed to do device memory read / write if that
fails.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
OVMF would use string IO on those ports. String IO has not been
implemented, so that leads to panics.
Skip them explicitly in MSHV. Leave a long-ish comment in code to
explain the situation. We should properly implement string IO once it
becomes feasible / necessary.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
If the function can never return an error this is now a clippy failure:
error: this function's return value is unnecessarily wrapped by `Result`
--> virtio-devices/src/watchdog.rs:215:5
|
215 | / fn set_state(&mut self, state: &WatchdogState) -> io::Result<()> {
216 | | self.common.avail_features = state.avail_features;
217 | | self.common.acked_features = state.acked_features;
218 | | // When restoring enable the watchdog if it was previously enabled. We reset the timer
... |
223 | | Ok(())
224 | | }
| |_____^
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_wraps
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Swap the last two parameters of guest_mem_{read,write} to be consistent
with other read / write functions.
Use more descriptive parameter names.
No functional change.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
MOV R/RM is a special case of MOVZX, so we generalize the mov_r_rm macro
to make it support both instructions.
Fixes: #2227
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
error: field assignment outside of initializer for an instance created with Default::default()
Error: --> hypervisor/src/kvm/mod.rs:1239:9
|
1239 | state.mp_state = self.get_mp_state()?;
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: `-D clippy::field-reassign-with-default` implied by `-D warnings`
note: consider initializing the variable with `kvm::aarch64::VcpuKvmState { mp_state: self.get_mp_state()?, ..Default::default() }` and removing relevant reassignments
--> hypervisor/src/kvm/mod.rs:1237:9
|
1237 | let mut state = CpuState::default();
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#field_reassign_with_default
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
error: field assignment outside of initializer for an instance created with Default::default()
--> hypervisor/src/kvm/mod.rs:318:9
|
318 | cap.cap = KVM_CAP_SPLIT_IRQCHIP;
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: `-D clippy::field-reassign-with-default` implied by `-D warnings`
note: consider initializing the variable with `kvm_bindings::kvm_enable_cap { cap: KVM_CAP_SPLIT_IRQCHIP, ..Default::default() }` and removing relevant reassignments
--> hypervisor/src/kvm/mod.rs:317:9
|
317 | let mut cap: kvm_enable_cap = Default::default();
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#field_reassign_with_default
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
And fix related warnings: op_kind and op_register are being deprecated
as they might panic.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Currently these two macros(msr, msr_data) reside both on kvm and mshv
module. Definition is same for both module. Moving them to arch/x86
module eliminates redundancy and makes more sense.
Signed-off-by: Muminul Islam <muislam@microsoft.com>
Handle CPU exits, adding instruction emulations.
Keep CPU specific data inside vmm for later use.
Co-Developed-by: Nuno Das Neves <nudasnev@microsoft.com>
Signed-off-by: Nuno Das Neves <nudasnev@microsoft.com>
Co-Developed-by: Praveen Paladugu <prapal@microsoft.com>
Signed-off-by: Praveen Paladugu <prapal@microsoft.com>
Co-Developed-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Co-Developed-by: Wei Liu <liuwe@microsoft.com>
Signed-off-by: Wei Liu <liuwe@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
This patch adds the definition and implementation
MshvEmulatorContext which is platform emulation for Hyper-V.
Co-Developed-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Co-Developed-by: Wei Liu <liuwe@microsoft.com>
Signed-off-by: Wei Liu <liuwe@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
A software emulated TLB. This is mostly used by
the instruction emulator to cache gva to gpa
translations passed from the hypervisor.
Co-Developed-by: Nuno Das Neves <nudasnev@microsoft.com>
Signed-off-by: Nuno Das Neves <nudasnev@microsoft.com>
Co-Developed-by: Praveen Paladugu <prapal@microsoft.com>
Signed-off-by: Praveen Paladugu <prapal@microsoft.com>
Co-Developed-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Co-Developed-by: Wei Liu <liuwe@microsoft.com>
Signed-off-by: Wei Liu <liuwe@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
We don't have IrqFd and IOEventFd support in the kernel for now.
So an emulation layer is needed. In the future, we will be adding this
support in the kernel.
Co-Developed-by: Wei Liu <liuwe@microsoft.com>
Signed-off-by: Wei Liu <liuwe@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
vmmops trait object is needed to get access some
of the upper level vmm functionalities i.e guest
memory access, IO read write etc.
Co-Developed-by: Praveen Paladugu <prapal@microsoft.com>
Signed-off-by: Praveen Paladugu <prapal@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
Adding hv_state (hyperv state) to Vm and Vcpu struct for mshv.
This state is needed to keep some kernel data(for now hypercall page)
in the vmm.
Co-Developed-by: Praveen Paladugu <prapal@microsoft.com>
Signed-off-by: Praveen Paladugu <prapal@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
Implement hypervisor, Vm, Vcpu crate at a minimal
functionalities. Also adds the mshv feature gate,
separates out the functionalities between kvm and
mshv inside the vmm crate.
Co-Developed-by: Nuno Das Neves <nudasnev@microsoft.com>
Signed-off-by: Nuno Das Neves <nudasnev@microsoft.com>
Co-Developed-by: Praveen Paladugu <prapal@microsoft.com>
Signed-off-by: Praveen Paladugu <prapal@microsoft.com>
Co-Developed-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Co-Developed-by: Wei Liu <liuwe@microsoft.com>
Signed-off-by: Wei Liu <liuwe@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
This is the initial folder structure of the mshv module inside
the hypervisor crate. The aim of this module is to support Microsoft
Hyper-V as a supported Hypervisor.
Signed-off-by: Muminul Islam <muislam@microsoft.com>
There are some code base and function which are purely KVM specific for
now and we don't have those supports in mshv at the moment but we have plan
for the future. We are doing a feature guard with KVM. For example, KVM has
mp_state, cpu clock support, which we don't have for mshv. In order to build
those code we are making the code base for KVM specific compilation.
Signed-off-by: Muminul Islam <muislam@microsoft.com>
The customized hashmap macro can't be lifted to common MockVMM code.
MockVMM only needs a collection to iterate over to get initial register
states. A vector is just as good as a hashmap.
Switch to use a vector to store initial register states. This allows us
to drop the hashmap macro everywhere.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
Unfortunately Rust stable does not yet have inline ASM support the flag
calculation will have to be implemented in software.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
Unfortunately it seems patch entries are ignored when obtaining
dependencies from another workspace.
Remove the problematic kvm-ioctls and kvm-bindings patch entries and use
the forked repository directly.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
There is no need to have a pair of curly brackets for structures without
any member.
No functional change.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
The mapping between code and its handler is static. We can drop the
HashMap in favour of a static match expression.
This has two benefits:
1. No more memory allocation and deallocation for the HashMap.
2. Shorter look-up time.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
When a total ordering between multiple atomic variables is not required
then use Ordering::Acquire with atomic loads and Ordering::Release with
atomic stores.
This will improve performance as this does not require a memory fence
on x86_64 which Ordering::SeqCst will use.
Add a comment to the code in the vCPU handling code where it operates on
multiple atomics to explain why Ordering::SeqCst is required.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
When the x86 instruction decoder tells us about some missing bytes from
the instruction stream, we call into the platform fetch method and
emulate one last instruction.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
In preparation for the instruction fetching step, we modify the decoding
loop so that we can check what the last decoding error is.
We also switch to explictly using decode_out() which removes a 32 bytes
copy compared to decode().
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
In order to validate emulated memory accesses, we need to be able to get
all the segments descriptor attributes.
This is done by abstracting the SegmentRegister attributes through a
trait that each hypervisor will have to implement.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
We need to be able to build the CPU mode from its state in order to
start implementing mode related checks in the x86 emulator.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
The MockVMM platform will be used by other instructions emulation
implementations, but also by the emulator framework.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
The observation here is PlatformEmulator can be seen as the context for
emulation to take place. It should be rather easy to construct a context
that satisfies the lifetime constraints for instruction emulation.
The thread doing the emulation will have full ownership over the
context, so this removes the need to wrap PlatformEmulator in Arc and
Mutex, as well as the need for the context to be either Clone or Copy.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
The emulator gets a CPU state from a CpuStateManager instance, emulates
the passed instructions stream and returns the modified CPU state.
The emulator is a skeleton for now since it comes with an empty
instruction mnemonic map.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
And an InstructionMap helper structure to map x86 mnemonic codes
to instruction handlers.
Any instruction emulation implementation should then boil down with
implementing InstructionHandler for any supported mnemonic.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Minimal will be defined by the amount of emulated instructions.
Carrying all GPRs, all CRs, segment registers and table registers should
cover quite a few instructions.
Co-developed-by: Wei Liu <liuwe@microsoft.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
For efficiently emulating x86 instructions, we need to build and pass a
CPU state copy/reference to instruction emulation handlers. Those handlers
will typically modify the CPU state and let the caller commit those
changes back through the PlatformEmulator trait set_cpu_state method.
Hypervisors typically have internal CPU state structures, that maps back
to the correspinding kernel APIs. By implementing the CpuState trait,
instruction emulators will be able to directly work on CPU state
instances that are directly consumable by the underlying hypervisor and
its kernel APIs.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
In order to emulate instructions, we need a way to get access to some of
the guest resources. The PlatformEmulator interface provides guest
memory and CPU state access to emulator implementations.
Typically, an hypervisor will implement PlatformEmulator for architecture
specific instruction emulators to build their framework on top of.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
We will need the GDT API for the hypervisor's x86 instruction
emulator implementation, it's better if the arch crate depends on the
hypervisor one rather than the other way around.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
This interface is used by the vCPU thread to delegate responsibility for
handling MMIO/PIO operations and to support different approaches than a
VM exit.
During profiling I found that we were spending 13.75% of the boot CPU
uage acquiring access to the object holding the VmmOps via
ArcSwap::load_full()
13.75% 6.02% vcpu0 cloud-hypervisor [.] arc_swap::ArcSwapAny<T,S>::load_full
|
---arc_swap::ArcSwapAny<T,S>::load_full
|
--13.43%--<hypervisor::kvm::KvmVcpu as hypervisor::cpu::Vcpu>::run
std::sys_common::backtrace::__rust_begin_short_backtrace
core::ops::function::FnOnce::call_once{{vtable-shim}}
std::sys::unix:🧵:Thread:🆕:thread_start
However since the object implementing VmmOps does not need to be mutable
and it is only used from the vCPU side we can change the ownership to
being a simple Arc<> that is passed in when calling create_vcpu().
This completely removes the above CPU usage from subsequent profiles.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>