This commit rewrites the `create_pci_node` in the FDT creator to
create multiple PCI nodes based on the vector of `PciSpaceInfo`,
and each PCI node in FDT reflects a PCI segment.
- The PCI MMIO config space, the 32-bit PCI device space and the 64-bit
PCI device space are recalculated based on the `PciSpaceInfo` for
each PCI segment.
- A new FDT property `linux,pci-domain` is added.
- The virtio-iommu node is only created for the first PCI segment.
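The per-segment behaviour listed above can be sketched as follows. This is only an illustration under stated assumptions: the field names of `PciSpaceInfo`, the `PciNodePlan` type and the `plan_pci_nodes` helper are hypothetical and are not taken from the actual FDT creator.

```rust
/// Hypothetical stand-in for `PciSpaceInfo`; the field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct PciSpaceInfo {
    pub pci_segment_id: u16,
    pub mmio_config_address: u64,
    pub pci_device_space_start: u64,
    pub pci_device_space_size: u64,
}

/// Hypothetical summary of what one PCI node in the FDT would carry.
#[derive(Debug, PartialEq)]
pub struct PciNodePlan {
    pub node_name: String,
    /// Value for the `linux,pci-domain` property.
    pub pci_domain: u32,
    /// Whether a virtio-iommu node accompanies this segment.
    pub with_virtio_iommu: bool,
}

/// Plans one PCI node per segment. Only the first segment in the slice gets
/// a virtio-iommu node, and only when the IOMMU option is enabled.
pub fn plan_pci_nodes(segments: &[PciSpaceInfo], iommu_enabled: bool) -> Vec<PciNodePlan> {
    segments
        .iter()
        .enumerate()
        .map(|(index, segment)| PciNodePlan {
            // FDT node names carry the unit address in hexadecimal.
            node_name: format!("pci@{:x}", segment.mmio_config_address),
            pci_domain: u32::from(segment.pci_segment_id),
            with_virtio_iommu: iommu_enabled && index == 0,
        })
        .collect()
}
```

The addresses passed in are whatever the caller computed for each segment; the sketch does not model the recalculation of the device spaces.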
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
The constant `PCI_MMIO_CONFIG_SIZE` defined in `vmm/pci_segment.rs`
describes the MMIO configuration size for each PCI segment. However,
this name conflicts with the `PCI_MMCONFIG_SIZE` defined in `layout.rs`
in the `arch` crate, which describes the memory size of the PCI MMIO
configuration region.
Therefore, this commit renames the `PCI_MMIO_CONFIG_SIZE` to
`PCI_MMIO_CONFIG_SIZE_PER_SEGMENT` and moves this constant from `vmm`
crate to `arch` crate.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
Currently, a tuple containing PCI space start address and PCI space
size is used to pass the PCI space information to the FDT creator.
To support multiple PCI segments in the FDT, more information,
such as the PCI segment ID, must be passed to the FDT creator. Storing
this information in a tuple would hurt both the flexibility and the
readability of the code.
To address this issue, this commit replaces the tuple containing the
PCI space information with a structure `PciSpaceInfo` and uses a vector
of `PciSpaceInfo` to store the PCI space information for each segment, so
that the information for multiple PCI segments can be passed to the FDT
creator together.
Note that this commit only refactors the existing code. The actual
support for multiple PCI segments will come later in the series, and for
now `--platform num_pci_segments` should only be 1.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
To prevent the identity map region from conflicting with firmware that
may be placed in the last 4MiB of the 4GiB range, we must set
the address to a chosen location. It makes the most sense to place
this region right after the TSS region.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Place the 3-page TSS at an explicit location in the 32-bit address space
to avoid conflicting with the loaded raw firmware.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Reduce the size of the reserved 32-bit address space to the range used
by both the PCI MMIO config data and the 32-bit PCI device space.
This avoids issues when using firmware that is loaded into the very top
of the 32-bit address space as the RAM conflicts with the reserved
memory.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
If the provided binary isn't an ELF binary, assume that it is firmware
to be loaded directly. In this case we shouldn't program any of the
registers, as KVM starts in that state.
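A minimal sketch of this decision, assuming the check is done on the ELF magic bytes. The real loader may instead rely on the error returned by its ELF parser, and the names `Payload`, `classify_payload` and `should_program_registers` are hypothetical.

```rust
/// The four magic bytes at the start of every ELF file.
const ELF_MAGIC: [u8; 4] = [0x7f, b'E', b'L', b'F'];

#[derive(Debug, PartialEq)]
pub enum Payload {
    /// A kernel in ELF format.
    Elf,
    /// Anything else is assumed to be firmware that is loaded as-is.
    RawFirmware,
}

/// Classifies an image by looking at its first bytes.
pub fn classify_payload(image: &[u8]) -> Payload {
    if image.starts_with(&ELF_MAGIC) {
        Payload::Elf
    } else {
        Payload::RawFirmware
    }
}

/// Registers are only programmed for ELF payloads; raw firmware relies on
/// the state the vCPU starts in.
pub fn should_program_registers(image: &[u8]) -> bool {
    classify_payload(image) == Payload::Elf
}
```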
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This test generates an array of random numbers and then applies the same
trivial algorithm twice -- once in set_apic_delivery_mode and another
time in an anonymous function.
Its usefulness is limited. Drop it to remove one unsafe in code.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
Because anyhow version 1.0.46 has been yanked, let's move back to the
previous version 1.0.45.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This was causing some issues because two different versions of the
vm-memory crate were in use. We'll wait for all dependencies to be properly
resolved before we move to 0.7.0.
This reverts commit 76b6c62d07.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Instead of creating a MemoryManager from scratch, let's reuse the same
code path used by snapshot/restore, so that memory regions are created
identically to what they were on the source VM.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Store several pieces of data from the MemoryManager so that it can be
restored without creating everything from scratch.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Move the definition of MSI space to layout.rs, so other crates can
reference it. Now it is needed by virtio-iommu.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
Add a virtio-iommu node into FDT if iommu option is turned on. Now we
support only one virtio-iommu device.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
This commit implements the GIC (including both GICv3 and GICv3ITS)
Pausable trait. The pause of device manager will trigger a "pause"
of GIC, where we flush GIC pending tables and ITS tables to the
guest RAM.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
The rust-vmm crates we're pulling from git have renamed their main
branches. We need to update the branch names we're giving to Cargo,
or people who don't have these dependencies cached will get errors
like this when trying to build:
error: failed to get `vm-fdt` as a dependency of package `arch v0.1.0 (/home/src/cloud-hypervisor/arch)`
Caused by:
failed to load source for dependency `vm-fdt`
Caused by:
Unable to update https://github.com/rust-vmm/vm-fdt?branch=master#031572a6
Caused by:
object not found - no match for id (031572a6edc2f566a7278f1e17088fc5308d27ab); class=Odb (9); code=NotFound (-3)
Signed-off-by: Alyssa Ross <hi@alyssa.is>
Based on `--memory-zone` and `--numa` param in the Cloud Hypervisor
cmdline, the NUMA memory configuration is described. This commit
adds such NUMA memory configuration to the FDT memory node.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
For the purpose of identification, each NUMA node is associated
with a unique token known as a `numa-node-id`. For the purpose of
device tree binding, a `numa-node-id` is a 32-bit integer.
The CPU node is associated with a NUMA node by the presence of a
`numa-node-id` property which contains the node id of the device.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
The optional device tree node distance-map describes the relative
distance (memory latency) between all NUMA nodes.
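As an illustration, the NUMA device tree binding encodes the distances as a flat list of (node A, node B, distance) triples in a `distance-matrix` property. The sketch below builds that list from a square table of distances; the function name and the input shape are assumptions made for this example and do not come from the commit.

```rust
/// Flattens a table of NUMA distances into the cell layout used by a
/// `distance-matrix` property: one (from, to, distance) triple per pair.
/// `distances[a][b]` is the distance from node `a` to node `b`.
pub fn distance_matrix_cells(distances: &[Vec<u32>]) -> Vec<u32> {
    let mut cells = Vec::new();
    for (from, row) in distances.iter().enumerate() {
        for (to, &distance) in row.iter().enumerate() {
            cells.extend_from_slice(&[from as u32, to as u32, distance]);
        }
    }
    cells
}
```

For two nodes with a local distance of 10 and a remote distance of 20, this produces four triples, one per ordered pair of nodes.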
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
This is to make sure the NUMA node data structures can be accessed
both from the `vmm` crate and `arch` crate.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
This doesn't really affect the build as we ship a Cargo.lock with fixed
versions in. However for clarity it makes sense to use fixed versions
throughout and let dependabot update them.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The Arm CPU topology is defined within the `cpu-map` node, which is
a direct child of the cpus node and provides a container where the
actual topology nodes are listed.
This commit adds an optional cpu-map node in device tree, based on
the Cloud Hypervisor command line vCPU topology information.
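A minimal sketch of how a vCPU index could be mapped to its place in the `cpu-map` hierarchy of clusters, cores and threads. The helper name, the parameter names and the assumption that vCPUs are numbered thread-first are all hypothetical.

```rust
/// Returns the `cpu-map` path for a vCPU, assuming vCPUs are numbered
/// thread-first, then core, then cluster. Both topology parameters must
/// be non-zero.
pub fn cpu_map_path(cpu: u32, threads_per_core: u32, cores_per_cluster: u32) -> String {
    let thread = cpu % threads_per_core;
    let core = (cpu / threads_per_core) % cores_per_cluster;
    let cluster = cpu / (threads_per_core * cores_per_cluster);
    format!("cpu-map/cluster{}/core{}/thread{}", cluster, core, thread)
}
```

With 2 threads per core and 2 cores per cluster, vCPU 5 lands in `cluster1/core0/thread1`.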
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
In an Arm system, the hierarchy of CPUs is defined through three
entities that are used to describe the layout of physical CPUs in
the system:
- cluster
- core
- thread
All three of these entities have their own FDT node field. Therefore,
this commit adds an AArch64-specific helper to pass the config from
the Cloud Hypervisor command line to the `configure_system`, where
eventually the `create_fdt` is called.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
To support different CPUID entry semantics, we now allow specifying
the compatibility condition for each feature entry. Most entries
are considered compatible when they are a "bitwise subset", with a few
exceptions: 1. "equal", e.g. EBX/ECX/EDX of leaf `0x4000_0000` (KVM
CPUID SIGNATURE); 2. "smaller or equal as a number", e.g. EAX of leaf
`0x7` and leaf `0x4000_0000`.
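The three conditions can be sketched as below. The enum and function names are illustrative, and the direction of the comparison (source register value checked against the destination's) is an assumption made for this example.

```rust
/// Hypothetical names for the three compatibility conditions.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum CompatibleCheck {
    /// Every bit set on the source must also be set on the destination.
    BitwiseSubset,
    /// Both values must be identical.
    Equal,
    /// The source value, read as a number, must not exceed the destination's.
    NumNotGreater,
}

/// Checks one CPUID register value of the source against the destination.
pub fn is_compatible(check: CompatibleCheck, src: u32, dest: u32) -> bool {
    match check {
        CompatibleCheck::BitwiseSubset => src & !dest == 0,
        CompatibleCheck::Equal => src == dest,
        CompatibleCheck::NumNotGreater => src <= dest,
    }
}
```

A feature word of `0b0101` is a bitwise subset of `0b0111`, while `0b1000` is not, even though both are smaller as numbers than some larger destination value. This is why feature bits and numeric fields need different checks.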
Signed-off-by: Bo Chen <chen.bo@intel.com>
We now send not only the 'VmConfig' at the 'Command::Config' step of
live migration, but also send the 'common CPUID'. In this way, we can
check the compatibility of CPUID features between the source and
destination VMs, and abort live migration early if needed.
Signed-off-by: Bo Chen <chen.bo@intel.com>
This refactoring ensures all CPUID-related operations are centralized in
the `arch::x86_64` module, and exposes only two related public functions to
the vmm crate, namely `generate_common_cpuid` and `configure_vcpu`.
Signed-off-by: Bo Chen <chen.bo@intel.com>
In order to uniquely identify each SGX EPC section, we introduce a
mandatory option `id` to the `--sgx-epc` parameter.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This patch fixes a few things to support TDVF correctly.
The HOB memory resources must contain EFI_RESOURCE_ATTRIBUTE_ENCRYPTED
attribute.
Any section with a base address within the already allocated guest RAM
must not be allocated.
The list of TD_HOB memory resources should contain both TempMem and
TdHob sections as well.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
If GICR_CTLR is restored before GICR_PROPBASER and GICR_PENDBASER,
the restoring of the latter registers will fail, as the LPI enable
bit is already set in GICR_CTLR. Therefore, in this commit, the
order of restoring GICR registers is changed.
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
This commit implements the GicV3Its Snapshottable trait, including:
- GicV3Its state: GIC registers and ITS registers
- Save/restore logic of GicV3Its state
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
This commit implements two helper functions `gicv3_its_attr_access`
and `gicv3_its_tables_access` to access ITS device attributes and
ITS tables.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
In current code, the ITS device fd of GICv3 will be lost after the
creation of GIC. This commit adds a new `its_device` field for the
`GicV3Its` struct, which will be useful to save the ITS device fd.
This fd will be used in restoring the ITS device.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
UEFI needs to be loaded to a flash area at the beginning of the guest memory
address space. To simulate the flash, we take a piece of RAM and hide
it from the guest. As this is a temporary solution, the hidden RAM for UEFI
should be as small as possible. The size was 64 MiB, which is too much;
4 MiB is enough.
The downside of this simulation is that there is a gap (4 MiB) between
the memory size in the VMM's view and that in the guest's view. This is to be
fixed by implementing a flash device in the future.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
Issue from beta version of clippy:
Error: --> vm-virtio/src/queue.rs:700:59
|
700 | if let Some(used_event) = self.get_used_event(&mem) {
| ^^^^ help: change this to: `mem`
|
= note: `-D clippy::needless-borrow` implied by `-D warnings`
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow
Signed-off-by: Bo Chen <chen.bo@intel.com>
To debug the FDT (Flattened Device Tree), we usually need to modify the
source code to save the generated DTB data to disk, and use the 'dtc' command
to decode the binary file into a text file for analysis.
It would be ideal if the FDT content could be seen in the log.
This commit makes that possible by:
- Introducing the 'fdt' crate for parsing FDT.
- Printing the content of the FDT in tree view.
The parsing and printing only happen when Debug level logging is enabled.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
Fix the wrong MPIDR value set for VCPUs in the FDT.
With the wrong setting, at most 16 VCPUs could be enabled; all other
VCPUs showed as offline.
The issue was introduced when the FDT-generating code was migrated to
the vm-fdt crate.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
With the ability of getting host IPA size in `hypervisor` crate,
we can query the host IPA size through ioctl instead of hardcoding
a maximum IPA size. Therefore this commit removes the hardcoded
maximum host IPA size.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
EDK2 requires the PCIe high space to start above the 4G address.
In CLH the space follows the RAM. If the RAM space is small, the PCIe
high space could fall below 4G.
Here we put it above 512G in the FDT to work around the EDK2 check, but only
when ACPI is enabled, because EDK2 collects PCIe information from the FDT.
The address written in ACPI is not impacted.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
Implement an architecture-specific function for loading a UEFI binary.
Change the logic of loading the kernel image:
1. First try to load the image as a kernel in PE format;
2. If that fails, try again to load it as a formatless UEFI binary.
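The two-step fallback can be sketched as follows. Both loaders are placeholders invented for this example: the "MZ" signature check stands in for real PE parsing, and the names `Loaded`, `LoadError` and `load_image` are hypothetical.

```rust
#[derive(Debug, PartialEq)]
pub enum Loaded {
    PeKernel,
    RawUefi,
}

#[derive(Debug, PartialEq)]
pub enum LoadError {
    NotPe,
    Empty,
}

/// Placeholder PE loader: only checks for the "MZ" signature that PE
/// images start with.
fn load_pe_kernel(image: &[u8]) -> Result<Loaded, LoadError> {
    if image.starts_with(b"MZ") {
        Ok(Loaded::PeKernel)
    } else {
        Err(LoadError::NotPe)
    }
}

/// Placeholder raw loader: accepts any non-empty image.
fn load_raw_uefi(image: &[u8]) -> Result<Loaded, LoadError> {
    if image.is_empty() {
        Err(LoadError::Empty)
    } else {
        Ok(Loaded::RawUefi)
    }
}

/// Tries the PE kernel path first and falls back to a raw UEFI binary.
pub fn load_image(image: &[u8]) -> Result<Loaded, LoadError> {
    load_pe_kernel(image).or_else(|_| load_raw_uefi(image))
}
```

Using `or_else` keeps the error of the last attempt, so a caller sees why the fallback failed rather than why the image was not a PE kernel.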
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
As the first step to complete live-migration with tracking dirty-pages
written by the VMM, this commit patches the dependent vm-memory crate to
the upstream version with the dirty-page-tracking capability. Most
changes are due to the updated `GuestMemoryMmap`, `GuestRegionMmap`, and
`MmapRegion` structs which are taking an additional generic type
parameter to specify what 'bitmap backend' is used.
The above changes should be transparent to the rest of the code base,
e.g. all unit/integration tests should pass without additional changes.
Signed-off-by: Bo Chen <chen.bo@intel.com>
The function used to calculate the "gicr-typer" value has nothing to do with
DeviceManager. It is now moved to the AArch64-specific files.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
With FDT, the VMM can allocate IRQs starting from 0 for devices.
But with ACPI, the range below 32 has to be avoided.
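A minimal sketch of an allocator that picks its starting IRQ based on the boot method. The type, its methods and the constant name are hypothetical and only illustrate the rule described above.

```rust
/// First IRQ number that may be handed out when ACPI is in use.
pub const ACPI_IRQ_BASE: u32 = 32;

/// Hypothetical sequential IRQ allocator.
pub struct IrqAllocator {
    next: u32,
}

impl IrqAllocator {
    /// Starts at 0 for FDT, or at `ACPI_IRQ_BASE` when ACPI is used.
    pub fn new(acpi: bool) -> Self {
        let next = if acpi { ACPI_IRQ_BASE } else { 0 };
        IrqAllocator { next }
    }

    /// Returns the next free IRQ number.
    pub fn allocate(&mut self) -> u32 {
        let irq = self.next;
        self.next += 1;
        irq
    }
}
```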
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
Before this change, the FDT was loaded at the end of RAM, so the address of
the FDT was not fixed.
However, UEFI (edk2 for now) requires fixed addresses to find the FDT and RSDP.
Now the FDT is moved to the beginning of RAM, which is a fixed address.
The RSDP is written 2 MiB after the FDT, also a fixed address.
The kernel comes 2 MiB after the RSDP.
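The resulting layout can be worked out with simple arithmetic. In this sketch the struct and function names are hypothetical, and the RAM start address used in any call is up to the caller; only the 2 MiB spacing comes from the description above.

```rust
const MIB: u64 = 1 << 20;

/// Hypothetical summary of the fixed boot addresses.
#[derive(Debug, PartialEq)]
pub struct BootLayout {
    pub fdt_start: u64,
    pub rsdp_start: u64,
    pub kernel_start: u64,
}

/// Computes the fixed addresses from the start of RAM: the FDT sits at
/// the start of RAM, the RSDP 2 MiB after it, and the kernel 2 MiB after
/// the RSDP.
pub fn boot_layout(ram_start: u64) -> BootLayout {
    let fdt_start = ram_start;
    let rsdp_start = fdt_start + 2 * MIB;
    let kernel_start = rsdp_start + 2 * MIB;
    BootLayout {
        fdt_start,
        rsdp_start,
        kernel_start,
    }
}
```

For an example RAM start of `0x4000_0000`, this gives the RSDP at `0x4020_0000` and the kernel at `0x4040_0000`.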
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
Now that all crates use edition = "2018", the majority of the "extern
crate" statements can be removed. Only those for importing macros need
to remain.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Remove unnecessary code for these structs. Moving this also allows the
removal of the arch_gen crate.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Currently, the memory layout on arm64 is sparse and conflicts with UEFI.
Here, we rearrange it to make it compact and compatible with
UEFI support.
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Simplify snapshot & restore code by using generics to specify helper
functions that take / make a Serialize / Deserialize struct.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Fixes the current codebase so that every cargo clippy can be run with
the beta toolchain without any error.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
error: name `GPIOInterruptDisabled` contains a capitalized acronym
Error: --> devices/src/legacy/gpio_pl061.rs:46:5
|
46 | GPIOInterruptDisabled,
| ^^^^^^^^^^^^^^^^^^^^^ help: consider making the acronym lowercase, except the initial letter: `GpioInterruptDisabled`
|
= note: `-D clippy::upper-case-acronyms` implied by `-D warnings`
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#upper_case_acronyms
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
error: name `RSDPPastRamEnd` contains a capitalized acronym
--> arch/src/lib.rs:59:5
|
59 | RSDPPastRamEnd,
| ^^^^^^^^^^^^^^ help: consider making the acronym lowercase, except the initial letter: `RsdpPastRamEnd`
|
= note: `-D clippy::upper-case-acronyms` implied by `-D warnings`
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#upper_case_acronyms
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
With CONFIG_PVH available in stable kernels for some time, we should
deprecate the use of alternative boot methods since this will lead to a
much simpler boot flow and CI process.
See: #2231
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This commit switches the default serial device from 16550 to the
Arm dedicated UART controller PL011. The `ttyAMA0` can be enabled.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
On AArch64, the interrupt controller (GIC) is emulated by KVM. The VMM needs
to set IRQ routing for devices, including legacy ones.
Before this commit, IRQ routing was only set for MSI. Legacy routing
entries of type KVM_IRQ_ROUTING_IRQCHIP were missing. That is why legacy
devices (like the serial device ttyS0) did not work.
The setting of x86 IRQ routing entries is not impacted.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
Add support for extracting the sections from a TDVF file, which can
then be used to load the TDVF and TD HOB data into their appropriate
locations.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Add the skeleton of the "tdx" feature with a module ready inside the
arch crate to store implementation details.
TEST=cargo build --features="tdx"
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
In particular update for the vmm-sys-util upgrade and all the other
dependent packages. This requires an updated forked version of
kvm-bindings (due to updated vfio-ioctls) but allowed the removal of our
forked version of kvm-ioctls.
The changes to the API from kvm-ioctls and vmm-sys-util required some
other minor changes to the code.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The interrupt tests were not being run as they were erroneously under a
feature guard that does not exist in arch.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
If the function can never return an error this is now a clippy failure:
error: this function's return value is unnecessarily wrapped by `Result`
--> virtio-devices/src/watchdog.rs:215:5
|
215 | / fn set_state(&mut self, state: &WatchdogState) -> io::Result<()> {
216 | | self.common.avail_features = state.avail_features;
217 | | self.common.acked_features = state.acked_features;
218 | | // When restoring enable the watchdog if it was previously enabled. We reset the timer
... |
223 | | Ok(())
224 | | }
| |_____^
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_wraps
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This reflects that it generates CPUID state used across all vCPUs.
Further ensure that errors from this function get correctly propagated.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Move the code for populating the CPUID with KVM HyperV emulation details from
the per-vCPU CPUID handling code to the shared CPUID handling code.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Move the code for populating the CPUID with details of the CPU
identification from the per-vCPU CPUID handling code to the shared CPUID
handling code.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Move the code for populating the CPUID with details of the maximum
address space from the per-vCPU CPUID handling code to the shared CPUID
handling code.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
We must explicitly mark these values as u8 as the function that consumes
them takes a T and needs to use the specific width.
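The effect of the literal's type on a generic consumer can be sketched as follows. The `write_le` helper is hypothetical and is not the function touched by this commit; it only shows how the number of bytes written depends on `T`.

```rust
use std::mem::size_of;

/// Appends `value` to `buf` in little-endian order, using exactly as many
/// bytes as the type `T` occupies. Returns the number of bytes written.
pub fn write_le<T: Into<u64> + Copy>(buf: &mut Vec<u8>, value: T) -> usize {
    let width = size_of::<T>();
    let raw: u64 = value.into();
    buf.extend_from_slice(&raw.to_le_bytes()[..width]);
    width
}
```

Calling `write_le(&mut buf, 0x1fu8)` writes one byte, while `write_le(&mut buf, 0x1fu32)` writes four, so the suffix on the literal decides the width.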
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
We will need the GDT API for the hypervisor's x86 instruction
emulator implementation, it's better if the arch crate depends on the
hypervisor one rather than the other way around.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Before virtio-mmio was removed, we passed an optional PCI space address
parameter to the AArch64 code for generating the FDT. The address was none
if the transport was MMIO.
Now that virtio-pci is the only option, the parameter is mandatory.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
Virtio-mmio has been removed, so virtio-pci is now the only option for the
virtio transport layer. We use MSI for PCI device interrupts, but GICv2, the
legacy interrupt controller, doesn't support MSI. GICv2 is therefore not very
practical for Cloud Hypervisor, so we can remove it.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
If the user specified a maximum physical bits value through the
`max_phys_bits` option from `--cpus` parameter, the guest CPUID
will be patched accordingly to ensure the guest will find the
right amount of physical bits.
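As an illustration, the number of physical address bits is reported in bits 7:0 of EAX for CPUID leaf `0x8000_0008`. The helper below patches only that field; its name and the constant name are hypothetical and do not come from the commit.

```rust
/// CPUID leaf that reports the address sizes.
pub const ADDRESS_SIZE_LEAF: u32 = 0x8000_0008;

/// Replaces bits 7:0 of the leaf's EAX value (physical address bits) with
/// `phys_bits`, leaving all other bits untouched.
pub fn patch_phys_bits(eax: u32, phys_bits: u8) -> u32 {
    (eax & !0xff) | u32::from(phys_bits)
}
```

For an original EAX of `0x3030` (48 linear bits, 48 physical bits) and a requested value of 40, the patched EAX is `0x3028`.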
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The OneRegister literally means "one (arbitrary) register". Just call it
"Register" instead. There is no need to inherit KVM's naming scheme in
the hypervisor agnostic code.
Signed-off-by: Wei Liu <liuwe@microsoft.com>