Commit Graph

230 Commits

Author SHA1 Message Date
Sebastien Boeuf
748018ace3 vm-migration: Don't store the id as part of Snapshot structure
The information about the identifier related to a Snapshot is only
relevant from the BTreeMap perspective, which is why we can get rid of
the duplicated identifier in every Snapshot structure.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-12-09 10:26:06 +01:00
Sebastien Boeuf
4517b76a23 vm-migration: Rename SnapshotDataSection into SnapshotData
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-12-09 10:26:06 +01:00
Sebastien Boeuf
5b3bcfa233 vm-migration: Snapshot should have a unique SnapshotDataSection
There's no reason to carry a HashMap of SnapshotDataSection per
Snapshot. And given we now provide at most one SnapshotDataSection per
Snapshot, there's no need to keep the id part of the SnapshotDataSection
structure.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-12-09 10:26:06 +01:00
Rob Bradford
cefbf6b4a3 vmm: guest_debug: Mark coredump functionality x86_64 only
The coredump functionality is only implemented for x86_64 so it should
only be compiled in there.

Fixes: #4964

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-12-05 17:23:52 +00:00
Rob Bradford
bfa31f9c56 vmm: Propagate vCPU configure error correctly
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-12-01 22:41:01 +00:00
Rob Bradford
725e388684 vmm: Seperate the CPUID setup from the CpuManager::new()
This allows the decoupling of CpuManager and MemoryManager.

See: #4761

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-12-01 22:41:01 +00:00
Rob Bradford
c5eac2e822 vmm: Don't store GuestMemoryMmap for "guest_debug" functionality
This removes the storage of the GuestMemoryMmap on the CpuManager
further allowing the decoupling of the CpuManager from the
MemoryManager.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-12-01 22:41:01 +00:00
Rob Bradford
c7b22156da aarch, vmm: Reduce requirement for guest memory to vCPU boot only
When configuring the vCPUs it is only necessary to provide the guest
memory when booting fresh (for populating the guest memory). As such
refactor the vCPU configuration to remove the use of the
GuestMemoryMmap stored on the CpuManager.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-12-01 22:41:01 +00:00
Rob Bradford
e5e5a89e65 vmm: cpu: Rename "vm_memory" parameter/member to "guest_memory"
This gives consistency across the file.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-12-01 22:41:01 +00:00
Rob Bradford
3888f57600 aarch64: Remove unnecessary casts (beta clippy check)
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-12-01 17:02:30 +00:00
Sebastien Boeuf
4487c8376b vmm: Move CpuManager and Vcpu to the new restore design
Every Vcpu is now created with the right state if there's an available
snapshot associated with it. This simplifies the restore logic.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-12-01 09:27:00 +01:00
Sebastien Boeuf
86e7f07485 vmm: cpu: Create vCPUs before the DeviceManager
Moving the creation of the vCPUs before the DeviceManager gets created
will allow for the aarch64 vGIC to be created before the DeviceManager
as well in a follow up patch. The end goal being to adopt the same
creation sequence for both x86_64 and aarch64, and keeping in mind that
the vGIC requires every vCPU to be created.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-11-23 11:49:57 +01:00
Sebastien Boeuf
578780ed0c vmm: cpu: Split vCPU creation
Split the vCPU creation into two distincts parts. On the one hand we
create the actual Vcpu object with the creation of the hypervisor::Vcpu.
And on the other hand, we configure the existing Vcpu, setting registers
to proper values (such as setting the entry point).

This will allow for further work to move the creation earlier in the
boot, so that the hypervisor::Vcpu will be already created when the
DeviceManager gets created.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-11-23 11:49:57 +01:00
Sebastien Boeuf
ec01062ada vmm: Switch order between DeviceManager and CpuManager creation
The CpuManager is now created before the DeviceManager. This is required
as preliminary work for creating the vCPUs before the DeviceManager,
which is required to ensure both x86_64 and aarch64 follow the same
sequence.

It's important to note the optimization for faster PIO accesses on the
PCI config space had to be removed given the VmOps was required by the
CpuManager and by the Vcpu by extension. But given the PciConfigIo is
created as part of the DeviceManager, there was no proper way of moving
things around so that we could provide PciConfigIo early enough.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-11-23 11:49:57 +01:00
Wei Liu
d05586f520 vmm: modify or provide safety comments
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-11-18 12:50:01 +00:00
Bo Chen
a9ec0f33c0 misc: Fix clippy issues
Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-11-02 09:41:43 +01:00
Rob Bradford
06eb82d239 build: Consolidate "gdb" build feature into "guest_debug"
This simplifies the CI process but also logical with the existing
functionality under "guest_debug" (dumping guest memory).

Fixes: #4679

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-09-27 14:30:57 +01:00
Rob Bradford
1202b9a07a vmm: Add some tracing of boot sequence
Add tracing of the VM boot sequence from the point at which the request
to create a VM is received to the hand-off to the vCPU threads running.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-09-22 18:09:31 +01:00
Sebastien Boeuf
1849ffff31 vmm: Remove "amx" feature gate
Given the AMX x86 feature has been made available since kernel v5.17,
and given we don't have any test validating this feature, there's no
need to keep it behing a Rust feature gate.

Fixes #3996

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-09-16 15:03:31 +01:00
Nuno Das Neves
784a3aaf3c devices: gic: use VgicConfig everywhere
Use VgicConfig to initialize Vgic.
Use Gic::create_default_config everywhere so we don't always recompute
redist/msi registers.
Add a helper create_test_vgic_config for tests in hypervisor crate.

Signed-off-by: Nuno Das Neves <nudasnev@microsoft.com>
2022-08-31 08:33:05 +01:00
Wei Liu
3e6b0a5eab vmm: unify TranslateVirtualAddress error for both x86_64 and aarch64
Using anyhow::Error should cover both architectures.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-08-22 09:37:21 -07:00
Michael Zhao
0522e40933 vmm: Implement translate_gva on AArch64
On AArch64, `translate_gva` API is not provided by KVM. We implemented
it in VMM by walking through translation tables.

Address translation is big topic, here we only focus the scenario that
happens in VMM while debugging kernel. This `translate_gva`
implementation is restricted to:
 - Exception Level 1
 - Translate high address range only (kernel space)

This implementation supports following Arm-v8a features related to
address translation:
 - FEAT_LPA
 - FEAT_LVA
 - FEAT_LPA2

The implementation supports page sizes of 4KiB, 16KiB and 64KiB.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-08-21 17:07:26 +08:00
Michael Zhao
5febdec81a vmm: Enable gdbstub on AArch64
The `gva_translate` function is still missing, it will be added with a
separate commit.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-08-21 17:07:26 +08:00
Nuno Das Neves
fdc8546eef vmm: aarch64: Use GIC_V3_* consts instead of magic numbers in create_madt()
Signed-off-by: Nuno Das Neves <nudasnev@microsoft.com>
2022-08-21 17:06:48 +08:00
Michael Zhao
7199119bb2 hypervisor: Remove Vcpu::read_mpidr() on AArch64
Replaced `read_mpidr()` with `get_sys_reg()`.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-07-29 11:45:12 +01:00
Michael Zhao
cd7f36a713 hypervisor: Remove get/set_reg() on AArch64
`Vcpu::get/set_reg()` were only invoked in Vcpu itself.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-07-29 11:45:12 +01:00
Michael Zhao
f7b6d99c2d hypervisor: Remove get/set_sys_regs() on AArch64
`hypervisor::Vcpu::get/set_sys_regs()` are only used in Vcpu internally.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-07-29 11:45:12 +01:00
Rob Bradford
857edc71a9 vmm: cpu: Remove now unused CpuManager::vcpus_paused()
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-07-26 09:22:25 +02:00
Rob Bradford
0e29379bcf vmm: Make gdb break/resuming more resilient
When starting the VM such that it is already on a breakpoint (via
stop_on_boot) when attached to gdb then start the vCPUs in a paused
state rather than starting the vCPUs later (upon resume).

Further, make the resumption/break of the VM more resilient by only
attempting to resume the vCPUs if were are already in a break point and
only attempting to pause/break if we were already running.

Fixes: #4354

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-07-26 09:22:25 +02:00
Wei Liu
ad33f7c5e6 vmm: return seccomp rules according to hypervisors
That requires stashing the hypervisor type into various places.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-22 12:50:12 +01:00
Wei Liu
f84ddedb1a hypervisor, vmm: introduce trait functions for aarch64 PMU
The original code uses kvm_device_attr directly outside of the
hyeprvisor crate. That leaks hypervisor details.

No functional change intended.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-21 23:37:53 +01:00
Wei Liu
f21fc1dcb6 hypervisor: x86: provide a generic MsrEntry structure
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-20 10:13:41 +01:00
Wei Liu
4d2cc3778f hypervisor: move away from MsrEntries type
It is a flexible array. Switch to vector and slice instead.

No functional change intended.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-20 10:13:41 +01:00
Wei Liu
05e5106b9b hypervisor x86: provide a generic LapicState structure
This requires making get/set_lapic_reg part of the type.

For the moment we cannot provide a default variant for the new type,
because picking one will be wrong for the other hypervisor, so I just
drop the test cases that requires LapicState::default().

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-19 09:38:38 +01:00
Wei Liu
6a8c0fc887 hypervisor: provide a generic FpuState structure
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-18 22:15:30 +01:00
Wei Liu
08135fa085 hypervisor: provide a generic CpudIdEntry structure
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-18 22:15:30 +01:00
Wei Liu
45fbf840db hypervisor, vmm: move away from CpuId type
CpuId is an alias type for the flexible array structure type over
CpuIdEntry. The type itself and the type of the element in the array
portion are tied to the underlying hypervisor.

Switch to using CpuIdEntry slice or vector directly. The construction of
CpuId type is left to hypervisors.

This allows us to decouple CpuIdEntry from hypervisors more easily.

No functional change intended.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-18 22:15:30 +01:00
Wei Liu
f1ab86fecb hypervisor: x86: provide a generic SpecialRegisters structure
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-15 10:21:43 +01:00
Wei Liu
75797827d5 hypervisor: x86: provide a generic SegmentRegister structure
And drop SegmentRegisterOps since it is no longer required.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-15 10:21:43 +01:00
Wei Liu
8b7781e267 hypervisor: x86: provide a generic StandardRegisters structure
We only need to do this for x86 since MSHV does not have aarch64 support
yet. This reduces unnecessary code churn.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-15 10:21:43 +01:00
Michael Zhao
2d8635f04a hypervisor: Refactor system_registers on AArch64
Function `system_registers` took mutable vector reference and modified
the vector content. Now change the definition to `get/set` style.
And rename to `get/set_sys_regs` to align with other functions.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-07-14 22:55:19 +08:00
Michael Zhao
c445513976 hypervisor: Refactor core_registers on AArch64
On AArch64, the function `core_registers` and `set_core_registers` are
the same thing of `get/set_regs` on x86_64. Now the names are aligned.
This will benefit supporting `gdb`.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-07-14 22:55:19 +08:00
Wei Liu
84bbaf06d1 hypervisor: turn boot_msr_entries into a trait method
This allows dispatching to either KVM or MSHV automatically.

No functional change.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-08 16:49:58 +01:00
Rob Bradford
93237f0106 vmm: Set MADT "Online Capable" flag
The Linux kernel now checks for this before marking CPUs as
hotpluggable:

commit aa06e20f1be628186f0c2dcec09ea0009eb69778
Author: Mario Limonciello <mario.limonciello@amd.com>
Date:   Wed Sep 8 16:41:46 2021 -0500

    x86/ACPI: Don't add CPUs that are not online capable

    A number of systems are showing "hotplug capable" CPUs when they
    are not really hotpluggable.  This is because the MADT has extra
    CPU entries to support different CPUs that may be inserted into
    the socket with different numbers of cores.

    Starting with ACPI 6.3 the spec has an Online Capable bit in the
    MADT used to determine whether or not a CPU is hotplug capable
    when the enabled bit is not set.

    Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/05_ACPI_Software_Programming_Model/ACPI_Software_Programming_Model.html?#local-apic-flags
    Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-07-01 18:45:05 +01:00
Rob Bradford
65ec6631fb vmm: cpu: Store the vCPU snapshots in ascending order
The snapshots are stored in a BTree which is ordered however as the ids
are strings lexical ordering places "11" ahead of "2". So encode the
vCPU id with zero padding so it is lexically sorted.

This fixes issues with CPU restore on aarch64.

See: #4239

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-06-27 16:20:57 +01:00
Rob Bradford
94fb9f817d vmm: Fix clippy issues under "guest_debug" feature
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-06-08 11:40:56 +01:00
Michael Zhao
a7a15d56dd aarch64: Move setup_regs to hypervisor
`setup_regs` of AArch64 calls KVM sepecific code. Now move it to
`hypervisor` crate.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-06-06 11:07:46 +01:00
Sebastien Boeuf
65dc1c83a9 vmm: cpu: Save and restore CPU states during snapshot/restore
Based on recent KVM host patches (merged in Linux 5.16), it's forbidden
to call into KVM_SET_CPUID2 after the first successful KVM_RUN returned.
That means saving CPU states during the pause sequence, and restoring
these states during the resume sequence will not work with the current
design starting with kernel version 5.16.

In order to solve this problem, let's simply move the save/restore logic
to the snapshot/restore sequences rather than the pause/resume ones.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-06-06 11:07:29 +01:00
Yi Wang
ccb604e1e1 vmm: add cpu segment note for coredump
The crash tool use a special note segment which named 'QEMU' to
analyze kaslr info and so on. If we don't add the 'QEMU' note
segment, crash tool can't find linux version to move on.

For now, the most convenient way is to add 'QEMU' note segment to
make crash tool happy.

Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
2022-05-30 13:41:40 +02:00
Rob Bradford
16a9882153 vmm: cpu: tdx: Don't use fd suffix for something not an FD
The hypervisor::Vcpu is the abstraction over the fd.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-13 15:39:22 +02:00