We now send not only the 'VmConfig' at the 'Command::Config' step of
live migration, but also send the 'common CPUID'. In this way, we can
check the compatibility of CPUID features between the source and
destination VMs, and abort live migration early if needed.
Signed-off-by: Bo Chen <chen.bo@intel.com>
This refactoring ensures all CPUID related operations are centralized in
`arch::x86_64` module, and exposes only two related public functions to
the vmm crate, e.g. `generate_common_cpuid` and `configure_vcpu`.
Signed-off-by: Bo Chen <chen.bo@intel.com>
In order to uniquely identify each SGX EPC section, we introduce a
mandatory option `id` to the `--sgx-epc` parameter.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This patch fixes a few things to support TDVF correctly.
The HOB memory resources must contain EFI_RESOURCE_ATTRIBUTE_ENCRYPTED
attribute.
Any section with a base address within the already allocated guest RAM
must not be allocated.
The list of TD_HOB memory resources should contain both TempMem and
TdHob sections as well.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
If GICR_CTLR is restored before GICR_PROPBASER and GICR_PENDBASER,
the restoring of the latter registers will fail, as the LPI enable
bit is already set in GICR_CTLR. Therefore, in this commit, the
order of restoring GICR registers is changed.
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
This commit implements the GicV3Its Snapshottable trait, including:
- GicV3Its state: GIC registers and ITS registers
- Save/restore logic of GicV3Its state
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
This commit implements two helper functions `gicv3_its_attr_access`
and `gicv3_its_tables_access` to access ITS device attributes and
ITS tables.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
In current code, the ITS device fd of GICv3 will be lost after the
creation of GIC. This commit adds a new `its_device` field for the
`GicV3Its` struct, which will be useful to save the ITS device fd.
This fd will be used in restoring the ITS device.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
UEFI need to be loaded to a flash area at the beginning of guest memory
address space. To simulate the flash, we take a piece of RAM and hide
it to the guest. As this is a temporary solution, the hiden RAM for UEFI
should be as little as possible. The size was 64 MiB, that's too much,
4 MiB is enough.
The down side of such simulation is that there is a gap (4 MiB) between
the memory size in VMM's view and that in guest's view. This is to be
fixed by implementing a flash device in future.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
Issue from beta verion of clippy:
Error: --> vm-virtio/src/queue.rs:700:59
|
700 | if let Some(used_event) = self.get_used_event(&mem) {
| ^^^^ help: change this to: `mem`
|
= note: `-D clippy::needless-borrow` implied by `-D warnings`
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow
Signed-off-by: Bo Chen <chen.bo@intel.com>
To debug the FDT (Flattened Device Tree), we usually need to modify
source code to save the generted DTB data to disk, and use 'dtc' command
to decode the binary file into a text file to analyze.
It would be ideal if the FDT content can be seen in log.
This commit makes it real by:
- Introducing 'fdt' crate for parsing FDT.
- Printing the content of the FDT in tree view.
The parsing and printing only happen when Debug level logging enabled.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
Fixed wrong MPIDR value setting for VCPUs in FDT.
The wrong setting made only 16 VCPUs can be enabled at most, all other
VCPUs were showing off-line.
The issue was introduced when we were migrating FDT-generating code to
vmm-fdt crate.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
With the ability of getting host IPA size in `hypervisor` crate,
we can query the host IPA size through ioctl instead of hardcoding
a maximum IPA size. Therefore this commit removes the hardcoded
maximum host IPA size.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
EDK2 requires the beginning of PCIe high space above 4G address.
In CLH the space follows the RAM. If the RAM space is small, the PCIe
high space could fall bellow 4G.
Here we put it above 512G in FDT to workaround the EDK2 check only when
ACPI is enabled, because EDK2 collects PCIe information from FDT.
The address written in ACPI is not impacted.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
Implemented an architecture specific function for loading UEFI binary.
Changed the logic of loading kernel image:
1. First try to load the image as kernel in PE format;
2. If failed, try again to load it as formatless UEFI binary.
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
As the first step to complete live-migration with tracking dirty-pages
written by the VMM, this commit patches the dependent vm-memory crate to
the upstream version with the dirty-page-tracking capability. Most
changes are due to the updated `GuestMemoryMmap`, `GuestRegionMmap`, and
`MmapRegion` structs which are taking an additional generic type
parameter to specify what 'bitmap backend' is used.
The above changes should be transparent to the rest of the code base,
e.g. all unit/integration tests should pass without additional changes.
Signed-off-by: Bo Chen <chen.bo@intel.com>
The function used to calculate "gicr-typer" value has nothing with
DeviceManager. Now it is moved to AArch64 specific files.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
On FDT, VMM can allocate IRQ from 0 for devices.
But on ACPI, the lowest range below 32 has to be avoided.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
Before this change, the FDT was loaded at the end of RAM. The address of
FDT was not fixed.
While UEFI (edk2 now) requires fixed address to find FDT and RSDP.
Now the FDT is moved to the beginning of RAM, which is a fixed address.
RSDP is wrote to 2 MiB after FDT, also a fixed address.
Kernel comes 2 MiB after RSDP.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
Now all crates use edition = "2018" then the majority of the "extern
crate" statements can be removed. Only those for importing macros need
to remain.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Remove unnecessary code for these structs. Moving this also allows the
removal of the arch_gen crate.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
For now, memory layout on arm64 is sparse and is conflict with uefi.
Here, we do some rearrangement to let it compact and compatible with
uefi support.
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Simplify snapshot & restore code by using generics to specify helper
functions that take / make a Serialize / Deserialize struct
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Fixes the current codebase so that every cargo clippy can be run with
the beta toolchain without any error.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
error: name `GPIOInterruptDisabled` contains a capitalized acronym
Error: --> devices/src/legacy/gpio_pl061.rs:46:5
|
46 | GPIOInterruptDisabled,
| ^^^^^^^^^^^^^^^^^^^^^ help: consider making the acronym lowercase, except the initial letter: `GpioInterruptDisabled`
|
= note: `-D clippy::upper-case-acronyms` implied by `-D warnings`
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#upper_case_acronyms
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
error: name `RSDPPastRamEnd` contains a capitalized acronym
--> arch/src/lib.rs:59:5
|
59 | RSDPPastRamEnd,
| ^^^^^^^^^^^^^^ help: consider making the acronym lowercase, except the initial letter: `RsdpPastRamEnd`
|
= note: `-D clippy::upper-case-acronyms` implied by `-D warnings`
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#upper_case_acronyms
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
With CONFIG_PVH in stable kernels for some time we should deprecate the
use of alternative boot methods since this will lead to a much simpler
boot flow and CI process.
See: #2231
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This commit switches the default serial device from 16550 to the
Arm dedicated UART controller PL011. The `ttyAMA0` can be enabled.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
On AArch64, interrupt controller (GIC) is emulated by KVM. VMM need to
set IRQ routing for devices, including legacy ones.
Before this commit, IRQ routing was only set for MSI. Legacy routing
entries of type KVM_IRQ_ROUTING_IRQCHIP were missing. That is way legacy
devices (like serial device ttyS0) does not work.
The setting of X86 IRQ routing entries are not impacted.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
Add support extracting the sections out for a TDVF file which can be
then used to load the TDVF and TD HOB data into their appropriate
locations.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Add the skeleton of the "tdx" feature with a module ready inside the
arch crate to store implementation details.
TEST=cargo build --features="tdx"
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
In particular update for the vmm-sys-util upgrade and all the other
dependent packages. This requires an updated forked version of
kvm-bindings (due to updated vfio-ioctls) but allowed the removal of our
forked version of kvm-ioctls.
The changes to the API from kvm-ioctls and vmm-sys-util required some
other minor changes to the code.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The interrupt tests were not being run as they were erroneously under a
feature guard that does not exist in arch.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
If the function can never return an error this is now a clippy failure:
error: this function's return value is unnecessarily wrapped by `Result`
--> virtio-devices/src/watchdog.rs:215:5
|
215 | / fn set_state(&mut self, state: &WatchdogState) -> io::Result<()> {
216 | | self.common.avail_features = state.avail_features;
217 | | self.common.acked_features = state.acked_features;
218 | | // When restoring enable the watchdog if it was previously enabled. We reset the timer
... |
223 | | Ok(())
224 | | }
| |_____^
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_wraps
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This reflects that it generates CPUID state used across all vCPUs.
Further ensure that errors from this function get correctly propagated.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Move the code for populating the CPUID with KVM HyperV emulation details from
the per-vCPU CPUID handling code to the shared CPUID handling code.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Move the code for populating the CPUID with details of the CPU
identification from the per-vCPU CPUID handling code to the shared CPUID
handling code.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Move the code for populating the CPUID with details of the maximum
address space from the per-vCPU CPUID handling code to the shared CPUID
handling code.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
We must explicitly mark these values as u8 as the function that consumes
them takes a T and needs to use the specific width.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
We will need the GDT API for the hypervisor's x86 instruction
emulator implementation, it's better if the arch crate depends on the
hypervisor one rather than the other way around.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Before Virtio-mmio was removed, we passed an optional PCI space address
parameter to AArch64 code for generating FDT. The address is none if the
transport is MMIO.
Now Virtio-PCI is the only option, the parameter is mandatory.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
Virtio-mmio is removed, now virtio-pci is the only option for virtio
transport layer. We use MSI for PCI device interrupt. While GICv2, the
legacy interrupt controller, doesn't support MSI. So GICv2 is not very
practical for Cloud-hypervisor, we can remove it.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
If the user specified a maximum physical bits value through the
`max_phys_bits` option from `--cpus` parameter, the guest CPUID
will be patched accordingly to ensure the guest will find the
right amount of physical bits.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The OneRegister literally means "one (arbitrary) register". Just call it
"Register" instead. There is no need to inherit KVM's naming scheme in
the hypervisor agnostic code.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
A new version of vm-memory was released upstream which resulted in some
components pulling in that new version. Update the version number used
to point to the latest version but continue to use our patched version
due to the fix for #1258
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
In order to speed up the Linux boot (so as to avoid it having to scan a
large number of pages) place the MP table directly after the SMBIOS
table if there is sufficient room. The start address of the SMBIOS table
is one of the three (and the largest) location that the MP table can
also be located at.
Before:
[ 0.000399] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WP UC- WT
[ 0.014945] check: Scanning 1 areas for low memory corruption
After:
[ 0.000284] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WP UC- WT
[ 0.000421] found SMP MP-table at [mem 0x000f0090-0x000f009f]
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The states of GIC should be part of the VM states. This commit
enables the AArch64 VM states save/restore by adding save/restore
of GIC states.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
Currently for AArch64, the GICv3-ITS is tried to be created first
when PCI is not needed, which is unnecessary. This commit fixes
the problem.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
Since calling `KVM_GET_ONE_REG` before `KVM_VCPU_INIT` will
result in an error: Exec format error (os error 8). This commit
decouples the vCPU init process from `configure_vcpus`. Therefore
in the process of restoring the vCPUs, these vCPUs can be
initialized separately before started.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
The value of GIC register `GICR_TYPER` is needed in restoring
the GIC states. This commit adds a field in the GIC device struct
and a method to construct its value.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
In AArch64 systems, the state of GIC device can only be
retrieved from `KVM_GET_DEVICE_ATTR` ioctl. Therefore to implement
saving/restoring the GIC states, we need to make sure that the
GIC object (either the file descriptor or the device itself) can
be extracted after the VM is started.
This commit refactors the code of GIC creation by adding a new
field `gic_device_entity` in device manager and methods to set/get
this field. The GIC object can be therefore saved in the device
manager after calling `arch::configure_system`.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
This commit adds a function which allows to save RDIST pending
tables to the guest RAM, as well as unit test case for it.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
This commit implements the `get_device_attr` method for the
`KVM_GET_DEVICE_ATTR` ioctl. This ioctl will be used in retrieving
the GIC status.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
This commit moves the GIC-related code to a separate module.
Therefore the implementation of GIC registers can be introduced
to the new module.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
This commit ports code from firecracker and refactors the existing
AArch64 code as the preparation for implementing save/restore
AArch64 vCPU, including:
1. Modification of `arm64_core_reg` macro to retrive the index of
arm64 core register and implemention of a helper to determine if
a register is a system register.
2. Move some macros and helpers in `arch` crate to the `hypervisor`
crate.
3. Added related unit tests for above functions and macros.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
Misspellings were identified by https://github.com/marketplace/actions/check-spelling
* Initial corrections suggested by Google Sheets
* Additional corrections by Google Chrome auto-suggest
* Some manual corrections
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
Inject CPUID leaves for advertising KVM HyperV support when the
"kvm_hyperv" toggle is enabled. Currently we only enable a selection of
features required to boot.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Shrink GICDevice trait to contain hypervisor agnostic API's only, which
are used in generating FDT.
Move all KVM specific logic into KvmGICDevice trait.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
This commit enables some mmio-related integration test cases on
AArch64, including:
* some vhost_user test cases
* virtio-blk test cases
* pmem test cases
Also this commit contains a bug fix in creating virtio-blk device.
Previously, when creating the FDT, the virtio-blk device was
labeled in the reverse order of address allocation.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
The retry order to create virtual GIC is GICv3-ITS, GICv3 and GICv2.
But there was not log message to show what was finally created.
The log messages also mute the warning for unused "log" crate.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
SGX expects the EPC region to be reported as "reserved" from the e820
table. This patch adds a new entry to the table if SGX is enabled.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The support for SGX is exposed to the guest through CPUID 0x12. KVM
passes static subleaves 0 and 1 from the host to the guest, without
needing any modification from the VMM itself.
But SGX also relies on dynamic subleaves 2 through N, used for
describing each EPC section. This is not handled by KVM, which means
the VMM is in charge of setting each subleaf starting from index 2
up to index N, depending on the number of EPC sections.
These subleaves 2 through N are not listed as part of the supported
CPUID entries from KVM. But it's important to set them as long as index
0 and 1 are present and indicate that SGX is supported.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Based on the presence of one or multiple SGX EPC sections from the VM
configuration, the MemoryManager will allocate a contiguous block of
guest address space to hold the entire EPC region. Within this EPC
region, each EPC section is memory mapped.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The type is now hypervisor::Vm. Switch from KVM specific name vm_fd to a
generic name just like 8186a8eee6 ("vmm: interrupt: Rename vm_fd").
No functional change.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
The OVMF firmware loops around looking for an entry marking the end of
the table. Without this entry processing the tables is an infinite loop.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Taken from crosvm: 44336b913126d73f9f8d6854f57aac92b5db809e and adapted
for Cloud Hypervisor.
This is basic and incomplete support but Linux correctly finds the DMI
data based on this:
root@clr-c6ed47bc1c9d473d9a3a8bddc50ee4cb ~ # dmesg | grep -i dmi
[ 0.000000] DMI: Cloud Hypervisor cloud-hypervisor, BIOS 0
root@clr-c6ed47bc1c9d473d9a3a8bddc50ee4cb ~ # dmesg | grep -i smbio
[ 0.000000] SMBIOS 3.2.0 present.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This commit fixes some warnings introduced in the previous
hyperviosr crate PR.Removed some unused variables from arch/aarch64
module.
Signed-off-by: Muminul Islam <muislam@microsoft.com>
Start moving the vmm, arch and pci crates to being hypervisor agnostic
by using the hypervisor trait and abstractions. This is not a complete
switch and there are still some remaining KVM dependencies.
Signed-off-by: Muminul Islam <muislam@microsoft.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
There are two CPUID leaves for handling CPU topology, 0xb and 0x1f. The
difference between the two is that the 0x1f leaf (Extended Topology
Leaf) supports exposing multiple die packages.
Fixes: #1284
Signed-off-by: Rob Bradford <robert.bradford@intel.com>