Commit Graph

210 Commits

Author SHA1 Message Date
Wei Liu
3e6b0a5eab vmm: unify TranslateVirtualAddress error for both x86_64 and aarch64
Using anyhow::Error should cover both architectures.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-08-22 09:37:21 -07:00
Michael Zhao
0522e40933 vmm: Implement translate_gva on AArch64
On AArch64, `translate_gva` API is not provided by KVM. We implemented
it in VMM by walking through translation tables.

Address translation is big topic, here we only focus the scenario that
happens in VMM while debugging kernel. This `translate_gva`
implementation is restricted to:
 - Exception Level 1
 - Translate high address range only (kernel space)

This implementation supports following Arm-v8a features related to
address translation:
 - FEAT_LPA
 - FEAT_LVA
 - FEAT_LPA2

The implementation supports page sizes of 4KiB, 16KiB and 64KiB.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-08-21 17:07:26 +08:00
Michael Zhao
5febdec81a vmm: Enable gdbstub on AArch64
The `gva_translate` function is still missing, it will be added with a
separate commit.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-08-21 17:07:26 +08:00
Nuno Das Neves
fdc8546eef vmm: aarch64: Use GIC_V3_* consts instead of magic numbers in create_madt()
Signed-off-by: Nuno Das Neves <nudasnev@microsoft.com>
2022-08-21 17:06:48 +08:00
Michael Zhao
7199119bb2 hypervisor: Remove Vcpu::read_mpidr() on AArch64
Replaced `read_mpidr()` with `get_sys_reg()`.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-07-29 11:45:12 +01:00
Michael Zhao
cd7f36a713 hypervisor: Remove get/set_reg() on AArch64
`Vcpu::get/set_reg()` were only invoked in Vcpu itself.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-07-29 11:45:12 +01:00
Michael Zhao
f7b6d99c2d hypervisor: Remove get/set_sys_regs() on AArch64
`hypervisor::Vcpu::get/set_sys_regs()` are only used in Vcpu internally.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-07-29 11:45:12 +01:00
Rob Bradford
857edc71a9 vmm: cpu: Remove now unused CpuManager::vcpus_paused()
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-07-26 09:22:25 +02:00
Rob Bradford
0e29379bcf vmm: Make gdb break/resuming more resilient
When starting the VM such that it is already on a breakpoint (via
stop_on_boot) when attached to gdb then start the vCPUs in a paused
state rather than starting the vCPUs later (upon resume).

Further, make the resumption/break of the VM more resilient by only
attempting to resume the vCPUs if were are already in a break point and
only attempting to pause/break if we were already running.

Fixes: #4354

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-07-26 09:22:25 +02:00
Wei Liu
ad33f7c5e6 vmm: return seccomp rules according to hypervisors
That requires stashing the hypervisor type into various places.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-22 12:50:12 +01:00
Wei Liu
f84ddedb1a hypervisor, vmm: introduce trait functions for aarch64 PMU
The original code uses kvm_device_attr directly outside of the
hyeprvisor crate. That leaks hypervisor details.

No functional change intended.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-21 23:37:53 +01:00
Wei Liu
f21fc1dcb6 hypervisor: x86: provide a generic MsrEntry structure
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-20 10:13:41 +01:00
Wei Liu
4d2cc3778f hypervisor: move away from MsrEntries type
It is a flexible array. Switch to vector and slice instead.

No functional change intended.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-20 10:13:41 +01:00
Wei Liu
05e5106b9b hypervisor x86: provide a generic LapicState structure
This requires making get/set_lapic_reg part of the type.

For the moment we cannot provide a default variant for the new type,
because picking one will be wrong for the other hypervisor, so I just
drop the test cases that requires LapicState::default().

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-19 09:38:38 +01:00
Wei Liu
6a8c0fc887 hypervisor: provide a generic FpuState structure
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-18 22:15:30 +01:00
Wei Liu
08135fa085 hypervisor: provide a generic CpudIdEntry structure
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-18 22:15:30 +01:00
Wei Liu
45fbf840db hypervisor, vmm: move away from CpuId type
CpuId is an alias type for the flexible array structure type over
CpuIdEntry. The type itself and the type of the element in the array
portion are tied to the underlying hypervisor.

Switch to using CpuIdEntry slice or vector directly. The construction of
CpuId type is left to hypervisors.

This allows us to decouple CpuIdEntry from hypervisors more easily.

No functional change intended.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-18 22:15:30 +01:00
Wei Liu
f1ab86fecb hypervisor: x86: provide a generic SpecialRegisters structure
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-15 10:21:43 +01:00
Wei Liu
75797827d5 hypervisor: x86: provide a generic SegmentRegister structure
And drop SegmentRegisterOps since it is no longer required.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-15 10:21:43 +01:00
Wei Liu
8b7781e267 hypervisor: x86: provide a generic StandardRegisters structure
We only need to do this for x86 since MSHV does not have aarch64 support
yet. This reduces unnecessary code churn.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-15 10:21:43 +01:00
Michael Zhao
2d8635f04a hypervisor: Refactor system_registers on AArch64
Function `system_registers` took mutable vector reference and modified
the vector content. Now change the definition to `get/set` style.
And rename to `get/set_sys_regs` to align with other functions.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-07-14 22:55:19 +08:00
Michael Zhao
c445513976 hypervisor: Refactor core_registers on AArch64
On AArch64, the function `core_registers` and `set_core_registers` are
the same thing of `get/set_regs` on x86_64. Now the names are aligned.
This will benefit supporting `gdb`.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-07-14 22:55:19 +08:00
Wei Liu
84bbaf06d1 hypervisor: turn boot_msr_entries into a trait method
This allows dispatching to either KVM or MSHV automatically.

No functional change.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-08 16:49:58 +01:00
Rob Bradford
93237f0106 vmm: Set MADT "Online Capable" flag
The Linux kernel now checks for this before marking CPUs as
hotpluggable:

commit aa06e20f1be628186f0c2dcec09ea0009eb69778
Author: Mario Limonciello <mario.limonciello@amd.com>
Date:   Wed Sep 8 16:41:46 2021 -0500

    x86/ACPI: Don't add CPUs that are not online capable

    A number of systems are showing "hotplug capable" CPUs when they
    are not really hotpluggable.  This is because the MADT has extra
    CPU entries to support different CPUs that may be inserted into
    the socket with different numbers of cores.

    Starting with ACPI 6.3 the spec has an Online Capable bit in the
    MADT used to determine whether or not a CPU is hotplug capable
    when the enabled bit is not set.

    Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/05_ACPI_Software_Programming_Model/ACPI_Software_Programming_Model.html?#local-apic-flags
    Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-07-01 18:45:05 +01:00
Rob Bradford
65ec6631fb vmm: cpu: Store the vCPU snapshots in ascending order
The snapshots are stored in a BTree which is ordered however as the ids
are strings lexical ordering places "11" ahead of "2". So encode the
vCPU id with zero padding so it is lexically sorted.

This fixes issues with CPU restore on aarch64.

See: #4239

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-06-27 16:20:57 +01:00
Rob Bradford
94fb9f817d vmm: Fix clippy issues under "guest_debug" feature
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-06-08 11:40:56 +01:00
Michael Zhao
a7a15d56dd aarch64: Move setup_regs to hypervisor
`setup_regs` of AArch64 calls KVM sepecific code. Now move it to
`hypervisor` crate.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-06-06 11:07:46 +01:00
Sebastien Boeuf
65dc1c83a9 vmm: cpu: Save and restore CPU states during snapshot/restore
Based on recent KVM host patches (merged in Linux 5.16), it's forbidden
to call into KVM_SET_CPUID2 after the first successful KVM_RUN returned.
That means saving CPU states during the pause sequence, and restoring
these states during the resume sequence will not work with the current
design starting with kernel version 5.16.

In order to solve this problem, let's simply move the save/restore logic
to the snapshot/restore sequences rather than the pause/resume ones.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-06-06 11:07:29 +01:00
Yi Wang
ccb604e1e1 vmm: add cpu segment note for coredump
The crash tool use a special note segment which named 'QEMU' to
analyze kaslr info and so on. If we don't add the 'QEMU' note
segment, crash tool can't find linux version to move on.

For now, the most convenient way is to add 'QEMU' note segment to
make crash tool happy.

Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
2022-05-30 13:41:40 +02:00
Rob Bradford
16a9882153 vmm: cpu: tdx: Don't use fd suffix for something not an FD
The hypervisor::Vcpu is the abstraction over the fd.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-13 15:39:22 +02:00
Rob Bradford
218be2642e hypervisor: Explicitly pub use at the hypervisor crate top-level
Explicitly re-export types from the hypervisor specific modules. This
makes it much clearer what the common functionality that is exposed is.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-13 15:39:22 +02:00
Rob Bradford
cd0df05808 vmm, arch: CpuId is x86_64 specific so import from the x86_64 module
It will be removed as a top-level export from the hypervisor crate.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-13 15:39:22 +02:00
Rob Bradford
d3f66f8702 hypervisor: Make vm module private
And thus only export what is necessary through a `pub use`. This is
consistent with some of the other modules and makes it easier to
understand what the external interface of the hypervisor crate is.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-13 15:39:22 +02:00
Rob Bradford
387d56879b vmm, hypervisor: Clean up nomenclature around offloading VM operations
The trait and functionality is about operations on the VM rather than
the VMM so should be named appropriately. This clashed with with
existing struct for the concrete implementation that was renamed
appropriately.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-10 13:10:01 +01:00
Rob Bradford
c47e3b8689 gdb: Do not use VmmOps for memory manipulation
We don't use the VmmOps trait directly for manipulating memory in the
core of the VMM as it's really designed for the MSHV crate to handle
instruction decoding. As I plan to make this trait MSHV specific to
allow reduced locking for MMIO and PIO handling when running on KVM this
use should be removed.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-04 11:33:02 -07:00
Rob Bradford
0270d697ab vmm: cpu: Improve Error reporting
Remove unused enum members, improve error messages and implement
thiserror::Error.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-04-22 17:46:41 +01:00
Michael Zhao
656425a328 aarch64: Align the data types in layout
Some addresses defined in `layout.rs` were of type `GuestAddress`, and
are `u64`. Now align the types of all the `*_START` definitions to
`GuestAddress`.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-04-08 11:08:43 -07:00
Rob Bradford
7fd76eff05 vmm: Don't error if live resizing is not possible
The introduction of a error if live resizing is not possible is a
regression compared to the original behaviour where the new size would
be stored in the config and reflected in the next boot. This behaviour
was also inconsistent with the effect of resizing with no VM booted.

Instead of generating an error allow the code to go ahead and update the
config so that the new size will be available upon the reboot.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-03-31 17:04:53 +01:00
Fabiano Fidêncio
2c8045343c vmm,cpu: Deny resizing only if the vcpu amount has changed
188078467d made clear that resize should
only happen when dealing with a "dynamic" CpuManager.  Although this is
very much correct, it causes a regression on Kata Containers (and on any
other consumer of Cloud Hypervisor) in cases where a resize would be
triggered but the vCPUs values wouldn't be changed.

There's no doubt Kata Containers could do better and do not call a
resize in such situations, and that's something that should **also** be
solved there.  However, we should also work this around on Cloud
Hypervisor side as it introduces a regression with the current Kata
Containers code.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-03-30 21:29:08 +01:00
Rob Bradford
7c0cf8cc23 arch, devices, vmm: Remove "acpi" feature gate
Compile this feature in by default as it's well supported on both
aarch64 and x86_64 and we only officially support using it (no non-acpi
binaries are available.)

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-03-28 09:18:29 -07:00
William Douglas
6b0df31e5d vmm: Add support for enabling AMX in vm guests
AMX is an x86 extension adding hardware units for matrix
operations (int and float dot products). The goal of the extension is
to provide performance enhancements for these common operations.

On Linux, AMX requires requesting the permission from the kernel prior
to use. Guests wanting to make use of the feature need to have the
request made prior to starting the vm.

This change then adds the first --cpus features option amx that when
passed will enable AMX usage for guests (needs a 5.17+ kernel) or
exits with failure.

The activation is done in the CpuManager of the VMM thread as it
allows migration and snapshot/restore to work fairly painlessly for
AMX enabled workloads.

Signed-off-by: William Douglas <william.douglas@intel.com>
2022-03-25 14:11:54 -07:00
Rob Bradford
f6dfb42a64 vmm: cpu: Don't place CpuManager on MMIO bus when non-dynamic
This is now consistent with not supplying the _CRS for the device when
CpuManager is not dynamic.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-03-24 13:17:39 +00:00
Rob Bradford
7324b0e514 vmm: cpu: Only include hotplug/unplug related AML code if dynamic
This will significantly reduce the size of the DSDT and the effort
required to parse them if there is no requirement to support
hotplug/unplug.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-03-17 13:46:21 +00:00
Rob Bradford
188078467d vmm: cpu: Deny resizing if CpuManager is not dynamic
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-03-17 13:46:21 +00:00
Rob Bradford
e5cb13588b vmm: cpu: Add concept of making CpuManager dynamic
If the CpuManager is dynamic it devices CPUs can be
hotplugged/unplugged.

Since TDX does not support CPU hotplug this is currently the only
determinator as to whether the CpuManager is dynamic.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-03-17 13:46:21 +00:00
Akira Moroo
2451c4d833 vmm: Implement GDB event handler to enable --gdb flag
This commit adds event fds and the event handler to send/receive
requests and responses from the GDB thread. It also adds `--gdb` flag to
enable GDB stub feature.

Signed-off-by: Akira Moroo <retrage01@gmail.com>
2022-02-23 11:16:09 +00:00
Akira Moroo
f1c4705638 vmm: Add Debuggable trait implementation
This commit adds initial gdb.rs implementation for `Debuggable` trait to
describe a debuggable component. Some part of the trait bound
implementations is based on the crosvm GDB stub code [1].

[1] https://github.com/google/crosvm/blob/main/src/gdb.rs

Signed-off-by: Akira Moroo <retrage01@gmail.com>
2022-02-23 11:16:09 +00:00
Michael Zhao
0fa31539eb vmm: Add default VCPU topology in PPTT on AArch64
When VCPU topology is not specified, fill the PPTT with default setting.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-02-22 09:21:00 +08:00
Sebastien Boeuf
0ac094c0d1 vmm: Handle TDX hypercalls with INVALID_OPERAND
Based on the helpers from the hypervisor crate, the VMM can identify
what type of hypercall has been issued through the KVM_EXIT_TDX reason.

For now, we only log warnings and set the status to INVALID_OPERAND
since these hypercalls aren't supported. The proper handling will be
implemented later.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-02-18 14:41:07 +01:00
Sebastien Boeuf
a3dfe726f8 vmm: cpu: Avoid useless cloning of Arc<Mutex<Vcpu>>
Since the object returned from CpuManager.create_vcpu() is never used,
we can avoid the cloning of this object.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-02-18 14:41:07 +01:00