Commit Graph

221 Commits

Author SHA1 Message Date
Rob Bradford
3888f57600 aarch64: Remove unnecessary casts (beta clippy check)
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-12-01 17:02:30 +00:00
Sebastien Boeuf
4487c8376b vmm: Move CpuManager and Vcpu to the new restore design
Every Vcpu is now created with the right state if there's an available
snapshot associated with it. This simplifies the restore logic.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-12-01 09:27:00 +01:00
Sebastien Boeuf
86e7f07485 vmm: cpu: Create vCPUs before the DeviceManager
Moving the creation of the vCPUs before the DeviceManager gets created
will allow for the aarch64 vGIC to be created before the DeviceManager
as well in a follow up patch. The end goal being to adopt the same
creation sequence for both x86_64 and aarch64, and keeping in mind that
the vGIC requires every vCPU to be created.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-11-23 11:49:57 +01:00
Sebastien Boeuf
578780ed0c vmm: cpu: Split vCPU creation
Split the vCPU creation into two distinct parts. On the one hand we
create the actual Vcpu object with the creation of the hypervisor::Vcpu.
And on the other hand, we configure the existing Vcpu, setting registers
to proper values (such as setting the entry point).

This will allow for further work to move the creation earlier in the
boot, so that the hypervisor::Vcpu will be already created when the
DeviceManager gets created.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-11-23 11:49:57 +01:00
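
The create/configure split described in 578780ed0c can be pictured with a minimal sketch. The types and fields below (`HypervisorVcpu`, `Vcpu::new`, `Vcpu::configure`, `pc`, `configured`) are hypothetical stand-ins for illustration, not the actual Cloud Hypervisor definitions.

```rust
/// Hypothetical stand-in for the hypervisor-level vCPU handle.
#[derive(Debug, Default)]
struct HypervisorVcpu {
    /// Program counter, used here only to show register setup.
    pc: u64,
}

/// Hypothetical VMM-level vCPU wrapper.
#[derive(Debug)]
struct Vcpu {
    id: u8,
    inner: HypervisorVcpu,
    configured: bool,
}

impl Vcpu {
    /// Step 1: create the object (and its hypervisor handle) early in boot.
    fn new(id: u8) -> Self {
        Vcpu {
            id,
            inner: HypervisorVcpu::default(),
            configured: false,
        }
    }

    /// Step 2: configure the existing vCPU later, e.g. set the entry point.
    fn configure(&mut self, entry_point: u64) {
        self.inner.pc = entry_point;
        self.configured = true;
    }
}
```

Keeping the two steps separate lets the object exist before other components are created, while register setup waits until the entry point is known.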
Sebastien Boeuf
ec01062ada vmm: Switch order between DeviceManager and CpuManager creation
The CpuManager is now created before the DeviceManager. This is required
as preliminary work for creating the vCPUs before the DeviceManager,
which is required to ensure both x86_64 and aarch64 follow the same
sequence.

It's important to note the optimization for faster PIO accesses on the
PCI config space had to be removed given the VmOps was required by the
CpuManager and by the Vcpu by extension. But given the PciConfigIo is
created as part of the DeviceManager, there was no proper way of moving
things around so that we could provide PciConfigIo early enough.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-11-23 11:49:57 +01:00
Wei Liu
d05586f520 vmm: modify or provide safety comments
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-11-18 12:50:01 +00:00
Bo Chen
a9ec0f33c0 misc: Fix clippy issues
Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-11-02 09:41:43 +01:00
Rob Bradford
06eb82d239 build: Consolidate "gdb" build feature into "guest_debug"
This simplifies the CI process and is also a logical fit with the existing
functionality under "guest_debug" (dumping guest memory).

Fixes: #4679

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-09-27 14:30:57 +01:00
Rob Bradford
1202b9a07a vmm: Add some tracing of boot sequence
Add tracing of the VM boot sequence from the point at which the request
to create a VM is received to the hand-off to the vCPU threads running.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-09-22 18:09:31 +01:00
Sebastien Boeuf
1849ffff31 vmm: Remove "amx" feature gate
Given the AMX x86 feature has been made available since kernel v5.17,
and given we don't have any test validating this feature, there's no
need to keep it behind a Rust feature gate.

Fixes #3996

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-09-16 15:03:31 +01:00
Nuno Das Neves
784a3aaf3c devices: gic: use VgicConfig everywhere
Use VgicConfig to initialize Vgic.
Use Gic::create_default_config everywhere so we don't always recompute
redist/msi registers.
Add a helper create_test_vgic_config for tests in hypervisor crate.

Signed-off-by: Nuno Das Neves <nudasnev@microsoft.com>
2022-08-31 08:33:05 +01:00
Wei Liu
3e6b0a5eab vmm: unify TranslateVirtualAddress error for both x86_64 and aarch64
Using anyhow::Error should cover both architectures.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-08-22 09:37:21 -07:00
Michael Zhao
0522e40933 vmm: Implement translate_gva on AArch64
On AArch64, `translate_gva` API is not provided by KVM. We implemented
it in VMM by walking through translation tables.

Address translation is a big topic; here we focus only on the scenario
that occurs in the VMM while debugging the kernel. This `translate_gva`
implementation is restricted to:
 - Exception Level 1
 - Translate high address range only (kernel space)

This implementation supports the following Armv8-A features related to
address translation:
 - FEAT_LPA
 - FEAT_LVA
 - FEAT_LPA2

The implementation supports page sizes of 4KiB, 16KiB and 64KiB.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-08-21 17:07:26 +08:00
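
As an illustration of what walking the translation tables involves in 0522e40933, here is a minimal sketch of how a virtual address splits into per-level table indices. It assumes the simplest configuration only (4KiB granule, 48-bit virtual addresses, four levels of 9 bits each) and does not cover the 16KiB/64KiB granules or the FEAT_LPA/FEAT_LVA/FEAT_LPA2 cases mentioned in the commit. The function name `table_indices_4k` is hypothetical.

```rust
/// Page offset width for a 4KiB granule.
const PAGE_SHIFT: u32 = 12;
/// Each table level resolves 9 bits of the address with a 4KiB granule.
const BITS_PER_LEVEL: u32 = 9;

/// Hypothetical helper: split a 48-bit virtual address into the four
/// table indices (level 0 to level 3) and the offset within the page.
fn table_indices_4k(va: u64) -> ([usize; 4], u64) {
    let mut indices = [0usize; 4];
    for level in 0..4u32 {
        // Level 0 uses bits [47:39], level 3 uses bits [20:12].
        let shift = PAGE_SHIFT + BITS_PER_LEVEL * (3 - level);
        indices[level as usize] = ((va >> shift) & 0x1ff) as usize;
    }
    (indices, va & 0xfff)
}
```

A real walk would read each descriptor from guest memory at the computed index and follow it to the next-level table; that part is omitted here.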
Michael Zhao
5febdec81a vmm: Enable gdbstub on AArch64
The `gva_translate` function is still missing; it will be added in a
separate commit.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-08-21 17:07:26 +08:00
Nuno Das Neves
fdc8546eef vmm: aarch64: Use GIC_V3_* consts instead of magic numbers in create_madt()
Signed-off-by: Nuno Das Neves <nudasnev@microsoft.com>
2022-08-21 17:06:48 +08:00
Michael Zhao
7199119bb2 hypervisor: Remove Vcpu::read_mpidr() on AArch64
Replaced `read_mpidr()` with `get_sys_reg()`.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-07-29 11:45:12 +01:00
Michael Zhao
cd7f36a713 hypervisor: Remove get/set_reg() on AArch64
`Vcpu::get/set_reg()` were only invoked in Vcpu itself.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-07-29 11:45:12 +01:00
Michael Zhao
f7b6d99c2d hypervisor: Remove get/set_sys_regs() on AArch64
`hypervisor::Vcpu::get/set_sys_regs()` are only used in Vcpu internally.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-07-29 11:45:12 +01:00
Rob Bradford
857edc71a9 vmm: cpu: Remove now unused CpuManager::vcpus_paused()
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-07-26 09:22:25 +02:00
Rob Bradford
0e29379bcf vmm: Make gdb break/resuming more resilient
When starting the VM such that it is already on a breakpoint (via
stop_on_boot) when attached to gdb then start the vCPUs in a paused
state rather than starting the vCPUs later (upon resume).

Further, make the resumption/break of the VM more resilient by only
attempting to resume the vCPUs if we are already at a breakpoint and
only attempting to pause/break if we were already running.

Fixes: #4354

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-07-26 09:22:25 +02:00
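
The guarded transitions described in 0e29379bcf can be sketched as a small state machine. The `VmState` and `Vm` types and the method names below are hypothetical and only illustrate the rule "resume only from a breakpoint, break only while running".

```rust
/// Hypothetical debug-related VM states.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum VmState {
    Running,
    BreakPoint,
}

/// Hypothetical VM model tracking whether the vCPUs are paused.
struct Vm {
    state: VmState,
    vcpus_paused: bool,
}

impl Vm {
    /// With stop_on_boot, the vCPUs are started in a paused state.
    fn boot(stop_on_boot: bool) -> Self {
        if stop_on_boot {
            Vm { state: VmState::BreakPoint, vcpus_paused: true }
        } else {
            Vm { state: VmState::Running, vcpus_paused: false }
        }
    }

    /// Resume only if currently at a breakpoint; returns whether anything changed.
    fn debug_resume(&mut self) -> bool {
        if self.state != VmState::BreakPoint {
            return false;
        }
        self.vcpus_paused = false;
        self.state = VmState::Running;
        true
    }

    /// Pause only if currently running; returns whether anything changed.
    fn debug_pause(&mut self) -> bool {
        if self.state != VmState::Running {
            return false;
        }
        self.vcpus_paused = true;
        self.state = VmState::BreakPoint;
        true
    }
}
```

Repeated resume or pause requests become harmless no-ops instead of acting on vCPUs that are already in the requested state.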
Wei Liu
ad33f7c5e6 vmm: return seccomp rules according to hypervisors
That requires stashing the hypervisor type into various places.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-22 12:50:12 +01:00
Wei Liu
f84ddedb1a hypervisor, vmm: introduce trait functions for aarch64 PMU
The original code uses kvm_device_attr directly outside of the
hypervisor crate. That leaks hypervisor details.

No functional change intended.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-21 23:37:53 +01:00
Wei Liu
f21fc1dcb6 hypervisor: x86: provide a generic MsrEntry structure
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-20 10:13:41 +01:00
Wei Liu
4d2cc3778f hypervisor: move away from MsrEntries type
It is a flexible array. Switch to vector and slice instead.

No functional change intended.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-20 10:13:41 +01:00
Wei Liu
05e5106b9b hypervisor x86: provide a generic LapicState structure
This requires making get/set_lapic_reg part of the type.

For the moment we cannot provide a default variant for the new type,
because picking one will be wrong for the other hypervisor, so I just
drop the test cases that require LapicState::default().

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-19 09:38:38 +01:00
Wei Liu
6a8c0fc887 hypervisor: provide a generic FpuState structure
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-18 22:15:30 +01:00
Wei Liu
08135fa085 hypervisor: provide a generic CpuIdEntry structure
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-18 22:15:30 +01:00
Wei Liu
45fbf840db hypervisor, vmm: move away from CpuId type
CpuId is an alias type for the flexible array structure type over
CpuIdEntry. The type itself and the type of the element in the array
portion are tied to the underlying hypervisor.

Switch to using CpuIdEntry slice or vector directly. The construction of
CpuId type is left to hypervisors.

This allows us to decouple CpuIdEntry from hypervisors more easily.

No functional change intended.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-18 22:15:30 +01:00
Wei Liu
f1ab86fecb hypervisor: x86: provide a generic SpecialRegisters structure
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-15 10:21:43 +01:00
Wei Liu
75797827d5 hypervisor: x86: provide a generic SegmentRegister structure
And drop SegmentRegisterOps since it is no longer required.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-15 10:21:43 +01:00
Wei Liu
8b7781e267 hypervisor: x86: provide a generic StandardRegisters structure
We only need to do this for x86 since MSHV does not have aarch64 support
yet. This reduces unnecessary code churn.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-15 10:21:43 +01:00
Michael Zhao
2d8635f04a hypervisor: Refactor system_registers on AArch64
The function `system_registers` took a mutable vector reference and
modified the vector content. Now change the definition to `get/set` style
and rename it to `get/set_sys_regs` to align with other functions.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-07-14 22:55:19 +08:00
Michael Zhao
c445513976 hypervisor: Refactor core_registers on AArch64
On AArch64, the functions `core_registers` and `set_core_registers` are
the equivalent of `get/set_regs` on x86_64. Now the names are aligned.
This will benefit supporting `gdb`.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-07-14 22:55:19 +08:00
Wei Liu
84bbaf06d1 hypervisor: turn boot_msr_entries into a trait method
This allows dispatching to either KVM or MSHV automatically.

No functional change.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-08 16:49:58 +01:00
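
A minimal sketch of the dispatch idea in 84bbaf06d1: once `boot_msr_entries` is a trait method, callers no longer pick a hypervisor-specific free function. The trait shape, the `MsrEntry` fields, the struct names and the MSR index values below are all placeholders chosen for illustration, not the real definitions or the real boot MSR lists.

```rust
/// Hypothetical, simplified MSR entry.
#[derive(Debug, Clone, PartialEq)]
struct MsrEntry {
    index: u32,
    data: u64,
}

/// Hypothetical trait carrying the method.
trait Vcpu {
    fn boot_msr_entries(&self) -> Vec<MsrEntry>;
}

struct KvmVcpu;
struct MshvVcpu;

impl Vcpu for KvmVcpu {
    fn boot_msr_entries(&self) -> Vec<MsrEntry> {
        // Placeholder index, not a real boot MSR list.
        vec![MsrEntry { index: 0x1, data: 0 }]
    }
}

impl Vcpu for MshvVcpu {
    fn boot_msr_entries(&self) -> Vec<MsrEntry> {
        // Placeholder index, not a real boot MSR list.
        vec![MsrEntry { index: 0x2, data: 0 }]
    }
}

/// The caller only sees the trait; the right implementation is
/// selected through dynamic dispatch.
fn boot_msrs(vcpu: &dyn Vcpu) -> Vec<MsrEntry> {
    vcpu.boot_msr_entries()
}
```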
Rob Bradford
93237f0106 vmm: Set MADT "Online Capable" flag
The Linux kernel now checks for this before marking CPUs as
hotpluggable:

commit aa06e20f1be628186f0c2dcec09ea0009eb69778
Author: Mario Limonciello <mario.limonciello@amd.com>
Date:   Wed Sep 8 16:41:46 2021 -0500

    x86/ACPI: Don't add CPUs that are not online capable

    A number of systems are showing "hotplug capable" CPUs when they
    are not really hotpluggable.  This is because the MADT has extra
    CPU entries to support different CPUs that may be inserted into
    the socket with different numbers of cores.

    Starting with ACPI 6.3 the spec has an Online Capable bit in the
    MADT used to determine whether or not a CPU is hotplug capable
    when the enabled bit is not set.

    Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/05_ACPI_Software_Programming_Model/ACPI_Software_Programming_Model.html?#local-apic-flags
    Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-07-01 18:45:05 +01:00
Rob Bradford
65ec6631fb vmm: cpu: Store the vCPU snapshots in ascending order
The snapshots are stored in a BTree, which is ordered. However, as the ids
are strings, lexical ordering places "11" ahead of "2". So encode the
vCPU id with zero padding so that it sorts correctly.

This fixes issues with CPU restore on aarch64.

See: #4239

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-06-27 16:20:57 +01:00
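
The ordering problem fixed in 65ec6631fb is easy to reproduce with a standard `BTreeMap`. The sketch below is illustrative only: the helper names and the padding width of three digits are assumptions, not the values used in the actual change.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: encode a vCPU id as a fixed-width, zero-padded key.
fn vcpu_snapshot_key(id: u8) -> String {
    format!("{id:03}")
}

/// Insert the ids into a BTreeMap keyed by string and return the keys
/// in the map's iteration (lexical) order.
fn ordered_keys(ids: &[u8], pad: bool) -> Vec<String> {
    let mut map = BTreeMap::new();
    for &id in ids {
        let key = if pad {
            vcpu_snapshot_key(id)
        } else {
            id.to_string()
        };
        map.insert(key, id);
    }
    map.into_keys().collect()
}
```

Without padding, ids 1, 2 and 11 iterate as "1", "11", "2"; with padding they iterate as "001", "002", "011", which matches numeric order.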
Rob Bradford
94fb9f817d vmm: Fix clippy issues under "guest_debug" feature
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-06-08 11:40:56 +01:00
Michael Zhao
a7a15d56dd aarch64: Move setup_regs to hypervisor
`setup_regs` of AArch64 calls KVM specific code. Now move it to
`hypervisor` crate.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-06-06 11:07:46 +01:00
Sebastien Boeuf
65dc1c83a9 vmm: cpu: Save and restore CPU states during snapshot/restore
Based on recent KVM host patches (merged in Linux 5.16), it's forbidden
to call into KVM_SET_CPUID2 after the first successful KVM_RUN returned.
That means saving CPU states during the pause sequence, and restoring
these states during the resume sequence will not work with the current
design starting with kernel version 5.16.

In order to solve this problem, let's simply move the save/restore logic
to the snapshot/restore sequences rather than the pause/resume ones.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-06-06 11:07:29 +01:00
Yi Wang
ccb604e1e1 vmm: add cpu segment note for coredump
The crash tool uses a special note segment named 'QEMU' to
analyze KASLR info and so on. If we don't add the 'QEMU' note
segment, the crash tool can't find the Linux version and cannot proceed.

For now, the most convenient way is to add 'QEMU' note segment to
make crash tool happy.

Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
2022-05-30 13:41:40 +02:00
Rob Bradford
16a9882153 vmm: cpu: tdx: Don't use fd suffix for something not an FD
The hypervisor::Vcpu is the abstraction over the fd.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-13 15:39:22 +02:00
Rob Bradford
218be2642e hypervisor: Explicitly pub use at the hypervisor crate top-level
Explicitly re-export types from the hypervisor specific modules. This
makes it much clearer what the common functionality that is exposed is.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-13 15:39:22 +02:00
Rob Bradford
cd0df05808 vmm, arch: CpuId is x86_64 specific so import from the x86_64 module
It will be removed as a top-level export from the hypervisor crate.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-13 15:39:22 +02:00
Rob Bradford
d3f66f8702 hypervisor: Make vm module private
And thus only export what is necessary through a `pub use`. This is
consistent with some of the other modules and makes it easier to
understand what the external interface of the hypervisor crate is.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-13 15:39:22 +02:00
Rob Bradford
387d56879b vmm, hypervisor: Clean up nomenclature around offloading VM operations
The trait and functionality are about operations on the VM rather than
the VMM, so they should be named appropriately. This clashed with the
existing struct for the concrete implementation, which was renamed
appropriately.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-10 13:10:01 +01:00
Rob Bradford
c47e3b8689 gdb: Do not use VmmOps for memory manipulation
We don't use the VmmOps trait directly for manipulating memory in the
core of the VMM as it's really designed for the MSHV crate to handle
instruction decoding. As I plan to make this trait MSHV specific to
allow reduced locking for MMIO and PIO handling when running on KVM this
use should be removed.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-04 11:33:02 -07:00
Rob Bradford
0270d697ab vmm: cpu: Improve Error reporting
Remove unused enum members, improve error messages and implement
thiserror::Error.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-04-22 17:46:41 +01:00
Michael Zhao
656425a328 aarch64: Align the data types in layout
Some addresses defined in `layout.rs` were of type `GuestAddress`, while
others were `u64`. Now align the types of all the `*_START` definitions to
`GuestAddress`.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-04-08 11:08:43 -07:00
Rob Bradford
7fd76eff05 vmm: Don't error if live resizing is not possible
The introduction of an error if live resizing is not possible is a
regression compared to the original behaviour where the new size would
be stored in the config and reflected in the next boot. This behaviour
was also inconsistent with the effect of resizing with no VM booted.

Instead of generating an error allow the code to go ahead and update the
config so that the new size will be available upon the reboot.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-03-31 17:04:53 +01:00
Fabiano Fidêncio
2c8045343c vmm,cpu: Deny resizing only if the vcpu amount has changed
188078467d made clear that resize should
only happen when dealing with a "dynamic" CpuManager.  Although this is
very much correct, it causes a regression on Kata Containers (and on any
other consumer of Cloud Hypervisor) in cases where a resize would be
triggered but the vCPUs values wouldn't be changed.

There's no doubt Kata Containers could do better and not call a
resize in such situations, and that's something that should **also** be
solved there.  However, we should also work around this on the Cloud
Hypervisor side, as it introduces a regression with the current Kata
Containers code.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-03-30 21:29:08 +01:00
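
The behaviour described in 2c8045343c can be sketched as a simple guard: a resize request on a non-dynamic CpuManager is rejected only when it would actually change the vCPU count. The struct, field and error names below are hypothetical and the logic is simplified to the one check the message discusses.

```rust
/// Hypothetical error for a resize that cannot be honoured.
#[derive(Debug, PartialEq)]
enum ResizeError {
    NotDynamic,
}

/// Hypothetical, reduced CpuManager model.
struct CpuManager {
    dynamic: bool,
    present_vcpus: u8,
}

impl CpuManager {
    /// Returns Ok(true) if the count changed, Ok(false) for a no-op,
    /// and an error only when a real change is requested on a
    /// non-dynamic manager.
    fn resize(&mut self, desired_vcpus: u8) -> Result<bool, ResizeError> {
        if desired_vcpus == self.present_vcpus {
            // Nothing to do, so there is nothing to deny.
            return Ok(false);
        }
        if !self.dynamic {
            return Err(ResizeError::NotDynamic);
        }
        self.present_vcpus = desired_vcpus;
        Ok(true)
    }
}
```

Checking for the no-op case first is what keeps callers that issue redundant resize requests from seeing an error.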