218 Commits

Author SHA1 Message Date
Henry Wang
90df54a245 aarch64: fdt: Create MSI mapping for PCI nodes
Each PCI device under a root complex is uniquely identified by its
Requester ID (AKA RID). A Requester ID is a triplet of a Bus number,
Device number, and Function number.

MSIs may be distinguished in part through the use of sideband data
accompanying writes. In the case of PCI devices, this sideband data
may be derived from the Requester ID. A mechanism is required to
associate a device with both the MSI controllers it can address,
and the sideband data that will be associated with its writes to
those controllers.

This commit adds the `msi-map` property for PCI nodes, therefore
creating MSI mapping for each PCI device.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-12-06 09:29:49 +00:00
Henry Wang
ca9a42ece8 aarch64: fdt: Create multiple PCI nodes based on PciSpaceInfo
This commit rewrites the `create_pci_node` in the FDT creator to
create multiple PCI nodes based on the vector of `PciSpaceInfo`,
and each PCI node in FDT reflects a PCI segment.

- The PCI MMIO config space, 32 bits PCI device space and 64 bits
PCI device space is re-calculated based on the `PciSpaceInfo` for
each PCI segment.
- A new FDT property `linux,pci-domain` is added.
- The virtio-iommu node is only created for the first PCI segment.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-12-06 09:29:49 +00:00
Henry Wang
2f8540da70 vmm: Rename PCI_MMIO_CONFIG_SIZE and move it to arch
The constant `PCI_MMIO_CONFIG_SIZE` defined in `vmm/pci_segment.rs`
describes the MMIO configuation size for each PCI segment. However,
this name conflicts with the `PCI_MMCONFIG_SIZE` defined in `layout.rs`
in the `arch` crate, which describes the memory size of the PCI MMIO
configuration region.

Therefore, this commit renames the `PCI_MMIO_CONFIG_SIZE` to
`PCI_MMIO_CONFIG_SIZE_PER_SEGMENT` and moves this constant from `vmm`
crate to `arch` crate.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-12-06 09:29:49 +00:00
Henry Wang
07bef815cc aarch64: Introduce struct PciSpaceInfo for FDT
Currently, a tuple containing PCI space start address and PCI space
size is used to pass the PCI space information to the FDT creator.
In order to support the multiple PCI segment for FDT, more information
such as the PCI segment ID should be passed to the FDT creator. If we
still use a tuple to store these information, the code flexibility and
readablity will be harmed.

To address this issue, this commit replaces the tuple containing the
PCI space information to a structure `PciSpaceInfo` and uses a vector
of `PciSpaceInfo` to store PCI space information for each segment, so
that multiple PCI segment information can be passed to the FDT together.

Note that the scope of this commit will only contain the refactor of
original code, the actual multiple PCI segments support will be in
following series, and for now `--platform num_pci_segments` should only
be 1.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-12-06 09:29:49 +00:00
Sebastien Boeuf
03a606c7ec arch, vmm: Place KVM identity map region after TSS region
In order to avoid the identity map region to conflict with a possible
firmware being placed in the last 4MiB of the 4GiB range, we must set
the address to a chosen location. And it makes the most sense to have
this region placed right after the TSS region.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-12-04 19:33:34 +00:00
Rob Bradford
348def9dfb arch, hypervisor, vmm: Explicitly place the TSS in the 32-bit space
Place the 3 page TSS at an explicit location in the 32-bit address space
to avoid conflicting with the loaded raw firmware.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-12-03 16:53:56 +01:00
Rob Bradford
4f4eb3f6b8 arch: x86_64: Only reserve used 32-bit address space
Reduce the size of the reserved 32-bit address space to the range used
by both the PCI MMIO config data and the 32-bit PCI device space.

This avoids issues when using firmware that is loaded into the very top
of the 32-bit address space as the RAM conflicts with the reserved
memory.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-12-02 16:42:01 +01:00
Rob Bradford
82d06c0efa vmm: Add support for booting raw binary (e.g. firmware) on x86-64
If the provided binary isn't an ELF binary assume that it is a firmware
to be loaded in directly. In this case we shouldn't program any of the
registers as KVM starts in that state.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-30 13:39:36 +01:00
Wei Liu
03862314eb arch: x86_64: add safety comments for impl ByteValued
Group impl clauses together to avoid repetition.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2021-11-17 14:40:51 +00:00
Wei Liu
5813ce0307 arch: x86_64: drop test_apic_delivery_mode
This test generates an array of random numbers and then applies the same
trivial algorithm twice -- once in set_apic_delivery_mode and another
time in an anonymous function.

Its usefulness is limited. Drop it to remove one unsafe in code.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2021-11-16 13:49:36 +00:00
Wei Liu
a4d8616676 arch: x86_64: tdx: drop one unsafe call in code
Implement Default trait for TdvfDescriptor and drop one unsafe in code.

No functional change.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2021-11-10 17:46:21 +01:00
Sebastien Boeuf
58d8206e2b migration: Use MemoryManager restore code path
Instead of creating a MemoryManager from scratch, let's reuse the same
code path used by snapshot/restore, so that memory regions are created
identically to what they were on the source VM.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-10-06 18:35:49 -07:00
Sebastien Boeuf
5b177b205b arch, vmm: Extend the data being snapshot
Storing multiple data coming from the MemoryManager in order to be able
to restore without creating everything from scratch.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-10-06 18:35:49 -07:00
Sebastien Boeuf
84a741a3fa arch: x86_64: tdx: Add TD_VMM_DATA support
Adding the definitions and helpers to build TD_VMM_DATA regions as part
of the TD_HOB.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-30 06:35:55 -07:00
Rob Bradford
1a2d0e6dd8 build: bump linux-loader from 0.3.0 to 0.4.0
Requires manual change to command line loading.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-24 09:11:57 +00:00
Michael Zhao
b30ddc0837 aarch64: Refactor AArch64 GIC space definitions
Move the definition of MSI space to layout.rs, so other crates can
reference it. Now it is needed by virtio-iommu.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-09-17 12:19:46 +02:00
Michael Zhao
253c06d3ba arch/aarch64: Add virtio-iommu device in FDT
Add a virtio-iommu node into FDT if iommu option is turned on. Now we
support only one virtio-iommu device.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-09-17 12:19:46 +02:00
Henry Wang
46c60183cd arch, vmm: Implement GIC Pausable trait
This commit implements the GIC (including both GICv3 and GICv3ITS)
Pausable trait. The pause of device manager will trigger a "pause"
of GIC, where we flush GIC pending tables and ITS tables to the
guest RAM.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-09-02 15:18:41 +01:00
Henry Wang
c9cc97e9a0 arch: Add NUMA configuration to FDT memory node
Based on `--memory-zone` and `--numa` param in the Cloud Hypervisor
cmdline, the NUMA memory configuration is described. This commit
adds such NUMA memory configuration to the FDT memory node.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-08-12 10:49:02 +02:00
Henry Wang
f3197c3833 arch: Add numa-node-id property to CPU node
For the purpose of identification, each NUMA node is associated
with a unique token known as a `numa-node-id`. For the purpose of
device tree binding, a `numa-node-id` is a 32-bit integer.

The CPU node is associated with a NUMA node by the presence of a
`numa-node-id` property which contains the node id of the device.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-08-12 10:49:02 +02:00
Henry Wang
5a0a4bc505 arch: Add optional distance-map node to FDT
The optional device tree node distance-map describes the relative
distance (memory latency) between all NUMA nodes.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-08-12 10:49:02 +02:00
Henry Wang
165364e08b vmm: Move NUMA node data structures to arch
This is to make sure the NUMA node data structures can be accessed
both from the `vmm` crate and `arch` crate.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-08-12 10:49:02 +02:00
Henry Wang
447c986916 aarch64: Add optional cpu-map node in device tree
The Arm CPU topology is defined within the `cpu-map` node, which is
a direct child of the cpus node and provides a container where the
actual topology nodes are listed.

This commit adds an optional cpu-map node in device tree, based on
the Cloud Hypervisor command line vCPU topology information.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-08-05 21:19:16 +08:00
Henry Wang
7fb980f17b arch, vmm: Pass cpu topology configuation to FDT
In an Arm system, the hierarchy of CPUs is defined through three
entities that are used to describe the layout of physical CPUs in
the system:

- cluster
- core
- thread

All these three entities have their own FDT node field. Therefore,
This commit adds an AArch64-specific helper to pass the config from
the Cloud Hypervisor command line to the `configure_system`, where
eventually the `create_fdt` is called.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-08-05 21:19:16 +08:00
Bo Chen
2723995cfa arch: Support fine-grained CPUID compatibility check
To support different CPUID entry semantics, we now allow to
specify the compatible condition for each feature entry. Most entries
are considered compatible when they are "bitwise subset", with few
exceptions: 1. "equal", e.g. EBX/ECX/EDX of leaf `0x4000_0000` (KVM
CPUID SIGNATURE); 2. "smaller or equal as a number", e.g. EAX of leaf
`0x7` and leaf `0x4000_0000`;

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-07-28 09:26:02 +02:00
Bo Chen
6d9c1eb638 arch, vmm: Add CPUID check to the 'Config' step of live migration
We now send not only the 'VmConfig' at the 'Command::Config' step of
live migration, but also send the 'common CPUID'. In this way, we can
check the compatibility of CPUID features between the source and
destination VMs, and abort live migration early if needed.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-07-28 09:26:02 +02:00
Bo Chen
569be6e706 arch, vmm: Move "generate_common_cpuid" from "CpuManager" to "arch"
This refactoring ensures all CPUID related operations are centralized in
`arch::x86_64` module, and exposes only two related public functions to
the vmm crate, e.g. `generate_common_cpuid` and `configure_vcpu`.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-07-19 09:59:34 -07:00
Sebastien Boeuf
7f507dd77d arch: x86_64: tdx: Fix HobHandoffInfoTable
The handoff table was missing the boot_mode field.

Suggested-by: Jiaqi Gao <jiaqi.gao@intel.com>
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-07-09 14:56:28 -07:00
Sebastien Boeuf
9aedabe11e sgx: Add mandatory id field to SgxEpcConfig
In order to uniquely identify each SGX EPC section, we introduce a
mandatory option `id` to the `--sgx-epc` parameter.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-07-09 14:45:30 +02:00
Henry Wang
c46441c937 build: bump vm-fdt from bbfd1e7 to 02d1b8f
Bumps [vm-fdt](https://github.com/rust-vmm/vm-fdt) from `bbfd1e7` to `02d1b8f`.
- [Release notes](https://github.com/rust-vmm/vm-fdt/releases)
- [Commits](bbfd1e7719...02d1b8fde2)

---
updated-dependencies:
- dependency-name: vm-fdt
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-07-07 09:55:18 +02:00
Sebastien Boeuf
5b6d424a77 arch, vmm: Fix TDVF section handling
This patch fixes a few things to support TDVF correctly.

The HOB memory resources must contain EFI_RESOURCE_ATTRIBUTE_ENCRYPTED
attribute.

Any section with a base address within the already allocated guest RAM
must not be allocated.

The list of TD_HOB memory resources should contain both TempMem and
TdHob sections as well.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-07-06 11:47:43 +02:00
Jianyong Wu
8744162a0e arch: gic: Change restoring order of GICR register
If GICR_CTLR is restored before GICR_PROPBASER and GICR_PENDBASER,
the restoring of the latter registers will fail, as the LPI enable
bit is already set in GICR_CTLR. Therefore, in this commit, the
order of restoring GICR registers is changed.

Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2021-07-05 22:51:56 +02:00
Henry Wang
6dcf9f6588 arch: aarch64: Implement ITS Snapshottable trait
This commit implements the GicV3Its Snapshottable trait, including:

- GicV3Its state: GIC registers and ITS registers
- Save/restore logic of GicV3Its state

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-07-05 22:51:56 +02:00
Henry Wang
4440671739 arch: gic: Prepare helper functions to access ITS
This commit implements two helper functions `gicv3_its_attr_access`
and `gicv3_its_tables_access` to access ITS device attributes and
ITS tables.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-07-05 22:51:56 +02:00
Henry Wang
957d3deeea arch: gic: Extend GicV3Its with its_device field
In current code, the ITS device fd of GICv3 will be lost after the
creation of GIC. This commit adds a new `its_device` field for the
`GicV3Its` struct, which will be useful to save the ITS device fd.
This fd will be used in restoring the ITS device.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-07-05 22:51:56 +02:00
Michael Zhao
45c4d1a06e aarch64: Reduce UEFI space size to 4 MiB
UEFI need to be loaded to a flash area at the beginning of guest memory
address space. To simulate the flash, we take a piece of RAM and hide
it to the guest. As this is a temporary solution, the hiden RAM for UEFI
should be as little as possible. The size was 64 MiB, that's too much,
4 MiB is enough.

The down side of such simulation is that there is a gap (4 MiB) between
the memory size in VMM's view and that in guest's view. This is to be
fixed by implementing a flash device in future.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-06-24 13:13:27 +01:00
Bo Chen
5825ab2dd4 clippy: Address the issue 'needless-borrow'
Issue from beta verion of clippy:

Error:    --> vm-virtio/src/queue.rs:700:59
    |
700 |             if let Some(used_event) = self.get_used_event(&mem) {
    |                                                           ^^^^ help: change this to: `mem`
    |
    = note: `-D clippy::needless-borrow` implied by `-D warnings`
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-06-24 08:55:43 +02:00
Michael Zhao
a94fa77621 arch: Add logging for FDT debugging on AArch64
To debug the FDT (Flattened Device Tree), we usually need to modify
source code to save the generted DTB data to disk, and use 'dtc' command
to decode the binary file into a text file to analyze.

It would be ideal if the FDT content can be seen in log.

This commit makes it real by:
- Introducing 'fdt' crate for parsing FDT.
- Printing the content of the FDT in tree view.

The parsing and printing only happen when Debug level logging enabled.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-06-18 09:07:46 +01:00
Michael Zhao
14c0e8424b aarch64: Fix wrong MPIDR setting
Fixed wrong MPIDR value setting for VCPUs in FDT.
The wrong setting made only 16 VCPUs can be enabled at most, all other
VCPUs were showing off-line.

The issue was introduced when we were migrating FDT-generating code to
vmm-fdt crate.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-06-16 15:38:23 +02:00
Henry Wang
1eb8a4671f arch: aarch64: Remove hardcoded host IPA size
With the ability of getting host IPA size in `hypervisor` crate,
we can query the host IPA size through ioctl instead of hardcoding
a maximum IPA size. Therefore this commit removes the hardcoded
maximum host IPA size.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-06-10 12:06:17 +02:00
Michael Zhao
88fda7c305 aarch64, acpi: Change PCIe high space for EDK2
EDK2 requires the beginning of PCIe high space above 4G address.
In CLH the space follows the RAM. If the RAM space is small, the PCIe
high space could fall bellow 4G.
Here we put it above 512G in FDT to workaround the EDK2 check only when
ACPI is enabled, because EDK2 collects PCIe information from FDT.
The address written in ACPI is not impacted.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-06-09 18:36:59 +08:00
Jianyong Wu
b8b5dccfd8 aarch64: Enable UEFI image loading
Implemented an architecture specific function for loading UEFI binary.

Changed the logic of loading kernel image:
1. First try to load the image as kernel in PE format;
2. If failed, try again to load it as formatless UEFI binary.

Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2021-06-09 18:36:59 +08:00
Henry Wang
bcee2fbd2d build: bump vm-fdt from 956b5a5 to 2e4ebde
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-06-03 14:13:02 +02:00
Bo Chen
b5bcdbaf48 misc: Upgrade to use the vm-memory crate w/ dirty-page-tracking
As the first step to complete live-migration with tracking dirty-pages
written by the VMM, this commit patches the dependent vm-memory crate to
the upstream version with the dirty-page-tracking capability. Most
changes are due to the updated `GuestMemoryMmap`, `GuestRegionMmap`, and
`MmapRegion` structs which are taking an additional generic type
parameter to specify what 'bitmap backend' is used.

The above changes should be transparent to the rest of the code base,
e.g. all unit/integration tests should pass without additional changes.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-06-03 08:34:45 +01:00
Michael Zhao
9a5f3fc2a7 vmm: Remove "gicr" handling from DeviceManager
The function used to calculate "gicr-typer" value has nothing with
DeviceManager. Now it is moved to AArch64 specific files.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-06-01 16:56:43 +01:00
Michael Zhao
195eba188a vmm: Split create_gic() from configure_system()
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-06-01 16:56:43 +01:00
Michael Zhao
5e53bbf405 arch: Bump vm-fdt from 13ab882 to 956b5a5
Interface of vm-fdt changed.
Updated aarch64 code to adapt.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-05-28 10:53:55 +02:00
Rob Bradford
cacec04df6 arch: Remove serde usage
With the only struct using it now using Versionize then the serde
dependency can be removed.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-05-26 22:27:41 +02:00
Rob Bradford
72ec98b8a8 arch: aarch64: Versionize Gicv3State
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-05-26 22:27:41 +02:00
Michael Zhao
ff46fb69d0 aarch64: Fix IRQ number setting for ACPI
On FDT, VMM can allocate IRQ from 0 for devices.
But on ACPI, the lowest range below 32 has to be avoided.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-05-25 10:20:37 +02:00