Commit Graph

135 Commits

Author SHA1 Message Date
Michael Zhao
3613b4c096 aarch64: Enable default build option
We have been building Cloud Hypervisor with command like:
`cargo build --no-default-features --features ...`.

After implementing ACPI, we donot have to use specify all features
explicitly. Default build command `cargo build` can work.

This commit fixed some build warnings with default build option and
changed github workflow correspondingly.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-06-24 13:13:27 +01:00
Bo Chen
585269ecb9 clippy: Address the issue 'field is never read'
Issue from beta verion of clippy:

error: field is never read: `type`
   --> vmm/src/cpu.rs:235:5
    |
235 |     pub r#type: u8,
    |     ^^^^^^^^^^^^^^
    |
    = note: `-D dead-code` implied by `-D warnings`

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-06-24 08:55:43 +02:00
Rob Bradford
b56e1217b6 vmm: tdx: Add KVM_FEATURE_STEAL_TIME_BIT to filtered bits
Filter out the KVM_FEATURE_STEAL_TIME_BIT when running with TDX.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-06-18 15:54:10 +01:00
Sebastien Boeuf
a36ac96444 vmm: cpu_manager: Add _PXM ACPI method to each vCPU
In order to allow a hotplugged vCPU to be assigned to the correct NUMA
node in the guest, the DSDT table must expose the _PXM method for each
vCPU. This method defines the proximity domain to which each vCPU should
be attached to.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-06-17 16:08:46 +02:00
Bo Chen
b5bcdbaf48 misc: Upgrade to use the vm-memory crate w/ dirty-page-tracking
As the first step to complete live-migration with tracking dirty-pages
written by the VMM, this commit patches the dependent vm-memory crate to
the upstream version with the dirty-page-tracking capability. Most
changes are due to the updated `GuestMemoryMmap`, `GuestRegionMmap`, and
`MmapRegion` structs which are taking an additional generic type
parameter to specify what 'bitmap backend' is used.

The above changes should be transparent to the rest of the code base,
e.g. all unit/integration tests should pass without additional changes.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-06-03 08:34:45 +01:00
Rob Bradford
c357adae44 vmm: tdx: Clear unsupported KVM PV features
This matches with the features that QEMU clears as they are not
supported with TDX.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-06-01 23:00:54 +02:00
Michael Zhao
e1ef141112 acpi: Enable DSDT for CpuManager on AArch64
Simplified definition block of CPU's on AArch64. It is not complete yet.
Guest boots. But more is to do in future:
- Fix the error in ACPI definition blocks (seen in boot messages)
- Implement CPU hot-plug controller

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-05-25 10:20:37 +02:00
Michael Zhao
cab103d172 acpi: Implement MADT on AArch64
Added following structures in MADT table:
- GICC: GIC CPU interface structure
- GICD: GIC Distributor structure
- GICR: GIC Redistributor Structure
- GICITS: GIC Interrupt Translation Service Structure

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-05-25 10:20:37 +02:00
Rob Bradford
2439625785 hypervisor: Cleanup unused Hypervisor trait members
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-05-19 17:11:30 +02:00
Rob Bradford
b282ff44d4 vmm: Enhance boot with info!() level messages
These messages are predominantly during the boot process but will also
occur during events such as hotplug.

These cover all the significant steps of the boot and can be helpful for
diagnosing performance and functionality issues during the boot.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-05-18 20:45:38 +02:00
Rob Bradford
b8f5911c4e misc: Remove unused errors from public interface
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-05-11 13:37:19 +02:00
Rob Bradford
da8136e49d arch, vmm: Remove support for LinuxBoot
By supporting just PVH boot on x86-64 we simplify our boot path
substatially.

Fixes: #2231

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-04-30 16:16:48 +02:00
Rob Bradford
a7c4483b8b vmm: Directly (de)serialise CpuManager, DeviceManager and MemoryManager state
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-04-20 18:58:37 +02:00
Sebastien Boeuf
b92fe648e9 vmm: cpu: Disable KVM_FEATURE_ASYNC_PF_INT in CPUID
By disabling this KVM feature, we prevent the guest from using APF
(Asynchronous Page Fault) mechanism. The kernel has recently switched to
using interrupts to notify about a page being ready, but for some
reasons, this is causing unexpected behavior with Cloud Hypervisor, as
it will make the vcpu thread spin at 100%.

While investigating the issue, it's better to disable the KVM feature to
prevent 100% CPU usage in some cases.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-04-15 10:08:45 +01:00
Anatol Belski
9e9aba7c0b CpuManager: Fix MMIO read handling
There are two parts:

- Unconditionally zero the output area. The length of the incoming
  vector has been seen from 1 to 4 bytes, even though just the first
  byte might need to be handled. But also, this ensures any possibly
  unhandled offset will return zeroed result to the caller. The former
  implementation used an I/O port which seems to behave differently from
  MMIO and wouldn't require explicit output zeroing.
- An access with zero offset still takes place and needs to be handled.

Fixes #2437.

Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
2021-03-29 13:51:31 +01:00
Gaelan Steele
b161a570ec vmm: use Option::map and Option::cloned
It's more concise, more idiomatic Rust, and satisfies nightly clippy.

Signed-off-by: Gaelan Steele <gbs@canishe.com>
2021-03-29 09:55:29 +02:00
Anatol Belski
5b168f54a6 hyperv: Fix CPU hotadd
The following is from the Hyper-V specification v6.0b.

Cpuid leaf 0x40000003 EDX:

Bit 3: Support for physical CPU dynamic partitioning events is
available.

When Windows determines to be running under a hypervisor, it will
require this cpuid bit to be set to support dynamic CPU operations.

Cpuid leaf 0x40000004 EAX:

Bit 5: Recommend using relaxed timing for this partition. If
used, the VM should disable any watchdog timeouts that
rely on the timely delivery of external interrupts.

This bit has been figured out as required after seeing guest BSOD
when CPU hotplug bit is enabled. Race conditions seem to arise after a
hotplug operation, when a system watchdog has expired.

Closes #1799.

Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
2021-03-26 14:06:51 +01:00
Rob Bradford
3b8d1f1411 vmm: Address Rust 1.51.0 clippy issue (vec_init_then_push)
warning: calls to `push` immediately after creation
   --> vmm/src/cpu.rs:630:9
    |
630 | /         let mut cpuid_patches = Vec::new();
631 | |
632 | |         // Patch tsc deadline timer bit
633 | |         cpuid_patches.push(CpuidPatch {
...   |
662 | |             edx_bit: Some(MTRR_EDX_BIT),
663 | |         });
    | |___________^ help: consider using the `vec![]` macro: `let mut cpuid_patches = vec![..];`
    |
    = note: `#[warn(clippy::vec_init_then_push)]` on by default
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#vec_init_then_push

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-03-26 11:32:09 +00:00
Rob Bradford
9762c8bc28 vmm: Address Rust 1.51.0 clippy issue (upper_case_acroynms)
warning: name `LocalAPIC` contains a capitalized acronym
   --> vmm/src/cpu.rs:197:8
    |
197 | struct LocalAPIC {
    |        ^^^^^^^^^ help: consider making the acronym lowercase, except the initial letter: `LocalApic`
    |
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#upper_case_acronyms

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-03-26 11:32:09 +00:00
Rob Bradford
db6516931d acpi_tables: Address Rust 1.51.0 clippy issue (upper_case_acronyms)
error: name `SDT` contains a capitalized acronym
  --> acpi_tables/src/sdt.rs:27:12
   |
27 | pub struct SDT {
   |            ^^^ help: consider making the acronym lowercase, except the initial letter: `Sdt`
   |
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#upper_case_acronyms

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-03-26 11:32:09 +00:00
Rob Bradford
57ce0986f7 vmm: cpu: Add functionality for enabling TDX for all vCPUs
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-03-08 18:30:00 +00:00
Rob Bradford
c8cad394b5 vmm: cpu: Expose the common/shared CPUID data for all vCPUs
This allows the CPUID data to be passed into the VM level ioctl used for
initalizing TDX.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-03-08 18:30:00 +00:00
Rob Bradford
c48e82915d vmm: Make kernel optional in VM internals
When booting with TDX no kernel is supplied as the TDFV is responsible
for loading the OS. The requirement to have the kernel is still
currently enforced at the validation entry point; this change merely
changes function prototypes and stored state to use Option<> to support.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-03-08 18:30:00 +00:00
Rob Bradford
f8875acec2 misc: Bulk upgrade dependencies
In particular update for the vmm-sys-util upgrade and all the other
dependent packages. This requires an updated forked version of
kvm-bindings (due to updated vfio-ioctls) but allowed the removal of our
forked version of kvm-ioctls.

The changes to the API from kvm-ioctls and vmm-sys-util required some
other minor changes to the code.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-02-26 11:31:08 +00:00
Rob Bradford
9c5be6f660 build: Remove unnecessary Result<> returns
If the function can never return an error this is now a clippy failure:

error: this function's return value is unnecessarily wrapped by `Result`
   --> virtio-devices/src/watchdog.rs:215:5
    |
215 | /     fn set_state(&mut self, state: &WatchdogState) -> io::Result<()> {
216 | |         self.common.avail_features = state.avail_features;
217 | |         self.common.acked_features = state.acked_features;
218 | |         // When restoring enable the watchdog if it was previously enabled. We reset the timer
...   |
223 | |         Ok(())
224 | |     }
    | |_____^
    |
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_wraps

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-02-11 18:18:44 +00:00
Rob Bradford
50a995b63d vmm: Rename patch_cpuid() to generate_common_cpuid()
This reflects that it generates CPUID state used across all vCPUs.
Further ensure that errors from this function get correctly propagated.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-02-09 16:02:25 +00:00
Rob Bradford
ccdea0274c vmm, arch: Move KVM HyperV emulation handling to shared CPUID code
Move the code for populating the CPUID with KVM HyperV emulation details from
the per-vCPU CPUID handling code to the shared CPUID handling code.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-02-09 16:02:25 +00:00
Rob Bradford
688ead51c6 vmm, arch: Move CPU identification handling to shared CPUID code
Move the code for populating the CPUID with details of the CPU
identification from the per-vCPU CPUID handling code to the shared CPUID
handling code.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-02-09 16:02:25 +00:00
Rob Bradford
9792c9aafa vmm, arch: Move max_phys_bits handling to shared CPUID code
Move the code for populating the CPUID with details of the maximum
address space from the per-vCPU CPUID handling code to the shared CPUID
handling code.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-02-09 16:02:25 +00:00
Rob Bradford
952f9bd3fe vmm: acpi: Remove incorrect return statement
_EJx built in should not return.

dsdt.dsl    813:                 Return (CEJ0 (0x00))
Warning  3104 -                            ^ Reserved method should not return a value (_EJ0)

dsdt.dsl    813:                 Return (CEJ0 (0x00))
Error    6080 -                            ^ Called method returns no value

Fixes: #2216

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-01-28 14:30:34 +01:00
Rob Bradford
c29caf2a85 vmm: acpi: Fix incorrect mutex timeout value
The mutex timeout should be 0xffff rather than 0xfff to disable the
timeout feature.

dsdt.dsl    745:             Acquire (\_SB.PRES.CPLK, 0x0FFF)
Warning  3130 -                                           ^ Result is not used, possible operator timeout will be missed

dsdt.dsl    767:             Acquire (\_SB.PRES.CPLK, 0x0FFF)
Warning  3130 -                                           ^ Result is not used, possible operator timeout will be missed

dsdt.dsl    775:             Acquire (\_SB.PRES.CPLK, 0x0FFF)
Warning  3130 -                                           ^ Result is not used, possible operator timeout will be missed

Fixes: #2216

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-01-28 14:30:34 +01:00
Rob Bradford
76e15a4240 vmm: acpi: Support compiling ACPI code on aarch64
This skeleton commit brings in the support for compiling aarch64 with
the "acpi" feature ready to the ACPI enabling. It builds on the work to
move the ACPI hotplug devices from I/O ports to MMIO and conditionalises
any code that is x86_64 only (i.e. because it uses an I/O port.)

Filling in the aarch64 specific details in tables such as the MADT it
out of the scope.

See: #2178

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-01-26 15:19:02 +08:00
Rob Bradford
55a3a38e14 vmm: acpi: Move CpuManager ACPI device to an MMIO address
Migrate the CpuManager from a fixed I/O port address to an allocated
MMIO address.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-01-22 16:08:41 +01:00
Rob Bradford
1fc6d50f3e misc: Make Bus::write() return an Option<Arc<Barrier>>
This can be uses to indicate to the caller that it should wait on the
barrier before returning as there is some asynchronous activity
triggered by the write which requires the KVM exit to block until it's
completed.

This is useful for having vCPU thread wait for the VMM thread to proceed
to activate the virtio devices.

See #1863

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-12-17 11:23:53 +00:00
Muminul Islam
9ce6c3b75c hypervisor, vmm: Feature guard KVM specific code
There are some code base and function which are purely KVM specific for
now and we don't have those supports in mshv at the moment but we have plan
for the future. We are doing a feature guard with KVM. For example, KVM has
mp_state, cpu clock support,  which we don't have for mshv. In order to build
those code we are making the code base for KVM specific compilation.

Signed-off-by: Muminul Islam <muislam@microsoft.com>
2020-12-09 14:55:20 +01:00
Rob Bradford
ffaab46934 misc: Use a more relaxed memory model when possible
When a total ordering between multiple atomic variables is not required
then use Ordering::Acquire with atomic loads and Ordering::Release with
atomic stores.

This will improve performance as this does not require a memory fence
on x86_64 which Ordering::SeqCst will use.

Add a comment to the code in the vCPU handling code where it operates on
multiple atomics to explain why Ordering::SeqCst is required.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-12-02 19:04:30 +01:00
Rob Bradford
b2608ca285 vmm: cpu: Fix clippy issues inside test
Found by:  cargo clippy --all-features --all --tests

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-11-26 09:32:46 +01:00
Rob Bradford
0fec326582 hypervisor, vmm: Remove shared ownership of VmmOps
This interface is used by the vCPU thread to delegate responsibility for
handling MMIO/PIO operations and to support different approaches than a
VM exit.

During profiling I found that we were spending 13.75% of the boot CPU
uage acquiring access to the object holding the VmmOps via
ArcSwap::load_full()

    13.75%     6.02%  vcpu0            cloud-hypervisor    [.] arc_swap::ArcSwapAny<T,S>::load_full
            |
            ---arc_swap::ArcSwapAny<T,S>::load_full
               |
                --13.43%--<hypervisor::kvm::KvmVcpu as hypervisor::cpu::Vcpu>::run
                          std::sys_common::backtrace::__rust_begin_short_backtrace
                          core::ops::function::FnOnce::call_once{{vtable-shim}}
                          std::sys::unix:🧵:Thread:🆕:thread_start

However since the object implementing VmmOps does not need to be mutable
and it is only used from the vCPU side we can change the ownership to
being a simple Arc<> that is passed in when calling create_vcpu().

This completely removes the above CPU usage from subsequent profiles.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-11-19 00:16:02 +01:00
Michael Zhao
093a581ee1 vmm: Implement VM rebooting on AArch64
The logic to handle AArch64 system event was: SHUTDOWN and RESET were
all treated as RESET.

Now we handle them differently:
- RESET event will trigger Vmm::vm_reboot(),
- SHUTDOWN event will trigger Vmm::vm_shutdown().

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2020-10-30 17:14:44 +00:00
Michael Zhao
69394c9c35 vmm: Handle hypervisor VCPU run result from Vcpu to VcpuManager
Now Vcpu::run() returns a boolean value to VcpuManager, indicating
whether the VM is going to reboot (false) or just continue (true).
Moving the handling of hypervisor VCPU run result from Vcpu to
VcpuManager gives us the flexibility to handle more scenarios like
shutting down on AArch64.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2020-10-30 17:14:44 +00:00
Sebastien Boeuf
28e12e9f3a vmm, hypervisor: Fix snapshot/restore for Windows guest
The snasphot/restore feature is not working because some CPU states are
not properly saved, which means they can't be restored later on.

First thing, we ensure the CPUID is stored so that it can be properly
restored later. The code is simplified and pushed down to the hypervisor
crate.

Second thing, we identify for each vCPU if the Hyper-V SynIC device is
emulated or not. In case it is, that means some specific MSRs will be
set by the guest. These MSRs must be saved in order to properly restore
the VM.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-10-21 19:11:03 +01:00
Wei Liu
d667ed0c70 vmm: don't call notify_guest_clock_paused when Hyper-V emulation is on
We turn on that emulation for Windows. Windows does not have KVM's PV
clock, so calling notify_guest_clock_paused results in an error.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-10-15 19:14:25 +02:00
Sebastien Boeuf
1b9890b807 vmm: cpu: Set CPU physical bits based on user input
If the user specified a maximum physical bits value through the
`max_phys_bits` option from `--cpus` parameter, the guest CPUID
will be patched accordingly to ensure the guest will find the
right amount of physical bits.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-10-13 18:58:36 +02:00
Wei Liu
ed1fdd1f7d hypervisor, arch: rename "OneRegister" and relevant code
The OneRegister literally means "one (arbitrary) register". Just call it
"Register" instead. There is no need to inherit KVM's naming scheme in
the hypervisor agnostic code.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-10-08 08:55:10 +02:00
Praveen Paladugu
71c435ce91 hypervisor, vmm: Introduce VmmOps trait
Run loop in hypervisor needs a callback mechanism to access resources
like guest memory, mmio, pio etc.

VmmOps trait is introduced here, which is implemented by vmm module.
While handling vcpuexits in run loop, this trait allows hypervisor
module access to the above mentioned resources via callbacks.

Signed-off-by: Praveen Paladugu <prapal@microsoft.com>
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-10-02 16:42:55 +01:00
Sebastien Boeuf
c85e396ce5 vmm: cpu: x86: Enable MTRR feature in CPUID
The MTRR feature was missing from the CPUID, which is causing the guest
to ignore the MTRR settings exposed through dedicated MSRs.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-09-25 15:03:52 +02:00
Henry Wang
c6b47d39e0 vmm: refactor vCPU save/restore code in restoring VM
Similarly as the VM booting process, on AArch64 systems,
the vCPUs should be created before the creation of GIC. This
commit refactors the vCPU save/restore code to achieve the
above-mentioned restoring order.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
970a5a410d vmm: decouple vCPU init from configure_vcpus
Since calling `KVM_GET_ONE_REG` before `KVM_VCPU_INIT` will
result in an error: Exec format error (os error 8). This commit
decouples the vCPU init process from `configure_vcpus`. Therefore
in the process of restoring the vCPUs, these vCPUs can be
initialized separately before started.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
47e65cd341 vmm: AArch64: add methods to get saved vCPU states
The construction of `GICR_TYPER` register will need vCPU states.
Therefore this commit adds methods to extract saved vCPU states
from the cpu manager.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
9dd188a8e8 tests: AArch64: Add unit test cases for vCPU save/restore
Adds 3 more unit test cases for AArch64:

*save_restore_core_regs
*save_restore_system_regs
*get_set_mpstate

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00