Commit Graph

343 Commits

Author SHA1 Message Date
Jinank Jain
4c99aea6c4 hypervisor: Switch to use the new StandardRegisters
With this we are removing the CloudHypervisor definition of
StandardRegisters instead using an enum which contains different
variants of StandardRegisters coming from their bindigs crate.

Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
2024-08-19 21:41:22 +00:00
Jinank Jain
c1f18fa634 arch: x86_64: Don't expose TSC deadline timer for MSHV guests
HV APIC(i.e., synthetic APIC controller exposed by Microsoft Hypervisor)
does not support one-shot operation using a TSC deadline value. Due to
which we see the following backtrace inside the guest when running with
hypervisor-fw/OVMF:

[    0.560765] unchecked MSR access error: WRMSR to 0x832 (tried to
write 0x00000000000400ec) at rIP: 0xffffffff8f473594
(native_write_msr+0x4/0x30)
[    0.560765] Call Trace:
[    0.560765]  ? native_apic_msr_write+0x2b/0x30
[    0.560765]  __setup_APIC_LVTT+0xbc/0xe0
[    0.560765]  lapic_timer_set_oneshot+0x27/0x30
[    0.560765]  clockevents_switch_state+0xaf/0xf0
[    0.560765]  tick_setup_periodic+0x47/0x90
[    0.560765]  tick_setup_device.isra.0+0x7c/0x110
[    0.560765]  tick_check_new_device+0xce/0xf0
[    0.560765]  clockevents_register_device+0x82/0x170
[    0.560765]  clockevents_config_and_register+0x2f/0x40
[    0.560765]  setup_APIC_timer+0xe1/0xf0
[    0.560765]  setup_boot_APIC_clock+0x5f/0x66
[    0.560765]  native_smp_prepare_cpus+0x1d6/0x286
[    0.560765]  kernel_init_freeable+0xcf/0x255
[    0.560765]  ? rest_init+0xb0/0xb0
[    0.560765]  kernel_init+0xe/0x110
[    0.560765]  ret_from_fork+0x22/0x40

Also, if this feature is exposed guest would not finish booting and get
stuck right before unpacking the root filesystem.

Fixes: 06e8d1c40 ("hypervisor: mshv: fix topology for Intel HW on MSHV")
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
2024-07-17 17:10:08 +00:00
Sean Banko
2e50f5769f arch: x86_64: fall back to host cpuid l1 cache info if omitted by kvm
If the KVM version is older than v6.6, KVM_GET_SUPPORTED_CPUID will omit
the L1 cache information in CPUID function 0x8000_0005. Fall back to
the host L1 cache information if it is omitted by KVM.

Signed-off-by: Sean Banko <sbanko@crusoe.ai>
2024-06-09 07:37:17 +00:00
Josh Soref
42e9632c53 misc: Fix spelling issues
Misspellings were identified by:
  https://github.com/marketplace/actions/check-spelling

* Initial corrections based on forbidden patterns from the action
* Additional corrections by Google Chrome auto-suggest
* Some manual corrections
* Adding markdown bullets to readme credits section

Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2024-06-08 16:31:30 +00:00
Rob Bradford
eb4dd90532 arch: x86_64: Avoid calling cpuid for setting LAPIC ID
Rather than calling cpuid and then updating the APIC ID field - use the
existing common CPUID data which already includes CPUID data for eax=1
(aka function = 1).  This removes the need to call cpuid per vCPU thread
created. This has a positive impact on boot time with multiple vCPUs as
the cpuid instruction is serialising.

Fixes: #5646

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-05-30 17:03:03 +00:00
SamrutGadde
193c006669 arch: Use thiserror for errors
Added thiserror crate for missing files in the arch package

Signed-off-by: SamrutGadde <samrut.gadde@gmail.com>
2024-05-23 20:54:36 +00:00
Wei Liu
fe704bd44a arch: simplify MP table signature generation code
Clippy complained one use of the char_array macro.

Instead of fixing that use case, the observation is that the macro was
needed only because MP table types mixed c_char and c_uchar for no
particular reason.

Only use c_uchar in those types, and drop char_array.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2024-04-30 07:32:08 +00:00
Rob Bradford
10ab87d6a3 misc: Migrate away from versionize
Replace with serde instead.

Fixes: #6370

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-04-22 17:10:55 +00:00
Ruslan Mstoi
5e9886bba4 build: add REUSE Compliance Check
In accordance with reuse requirements:
- Place each license file in the LICENSES/ directory
- Add missing SPDX-License-Identifier to files.
- Add .reuse/dep5 to bulk-license files

Fixes: #5887

Signed-off-by: Ruslan Mstoi <ruslan.mstoi@intel.com>
2024-04-19 17:35:45 +00:00
Rob Bradford
39ab482c47 arch: aarch64: Remove import of TryInto
This is already provided by the prelude.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-03-19 18:36:22 +00:00
Rob Bradford
adb318f4cd misc: Remove redundant "use" imports
With the nightly toolchain (2024-02-18) cargo check will flag up
redundant imports either because they are pulled in by the prelude on
earlier match.

Remove those redundant imports.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-02-19 17:54:30 +00:00
Stefan Nuernberger
09cf8c3118 arch: x86_64: bring back bzImage support
Allow cloud-hypervisor to direct boot the bzImage kernel format using
the regular 32 bit entry point. This can share the memory and vcpu
setup with the regular PVH boot code, but requires the setup of the
'zero page'.

Signed-off-by: Stefan Nuernberger <stefan.nuernberger@cyberus-technology.de>
2024-02-19 17:07:50 +00:00
Thomas Barrett
ce7db3f7c3 arch: x86_64: allow more than 2 E820_RAM ranges
The 'generate_ram_ranges' function currently hardcodes the assumption
that there are only 2 E820 RAM entries. This is not flexible enough to
handle vendor specific memory holes. Returning a Vec is also more
convenient for users of this function.

Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>
2024-02-15 08:49:06 +00:00
Bo Chen
9b0b881351 arch: Remove unused wrapper data structure for linux_loader
The `ByteValued` trait implementations for the data structures from the
'linux_loader' crate are no longer needed, and hence their wrappers can
be removed.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2024-02-07 09:25:40 +00:00
Rob Bradford
4e0dc5203a arch: x86_64: Disable dead code detection for embedded struct
These structs directly embed another struct and then implement
ByteValued on that struct to implement ByteValued for the inner struct.
As such the inner struct is never directly accessed so to avoid the dead
code analysis mark this as allowed.

Beta clippy fix:

warning: field `0` is never read
   --> arch/src/x86_64/mod.rs:129:32
    |
129 | struct MemmapTableEntryWrapper(hvm_memmap_table_entry);
    |        ----------------------- ^^^^^^^^^^^^^^^^^^^^^^
    |        |
    |        field in this struct
    |
    = note: `MemmapTableEntryWrapper` has a derived impl for the trait `Clone`, but this is intentionally ignored during dead code analysis
    = note: `#[warn(dead_code)]` on by default
help: consider changing the field to be of unit type to suppress this warning while preserving the field numbering, or remove the field
    |
129 | struct MemmapTableEntryWrapper(());
    |

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-02-07 09:25:40 +00:00
Thomas Barrett
5ec47d4883 arch: x86_64: enable HTT flag
When the HTT flag CPUID.1.EDX[HTT] is 0, it indicates that there is
only a single logical processor in the package. When HTT is 1, it
indicates that CPUID.1.EBX[23:16] contains the number of logical
processors in the package.

When this information is not included in CPUID leaf 0x1, some cpu
topology enumeration software such as hwloc are known to crash.

Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>
2024-01-30 13:49:35 -08:00
Thomas Barrett
7bc764d4e0 arch: x86_64: enable nested virtualization on amd if supported
When using amd topology, the svm feature flag on cpuid leaf
0x8000_0001.ecx is overwritten. We update the amd cpu topology
logic to use the flag values that originated in
KVM_GET_SUPPORTED_CPUID ioctl and override as necessary.

Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>
2024-01-15 17:50:40 +00:00
Alyssa Ross
0fb44c074f arch: fix rustdoc warning
warning: this URL is not a hyperlink
	   --> arch/src/aarch64/layout.rs:114:58
	    |
	114 | ...in https://www.kernel.org/doc/Documentation/arm64/booting.txt.
	    |       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: use an automatic link instead: `<https://www.kernel.org/doc/Documentation/arm64/booting.txt.>`
	    |
	    = note: bare URLs are not automatically turned into clickable links
	    = note: `#[warn(rustdoc::bare_urls)]` on by default

I also noticed that it looks like this comment was supposed to be
applied to FDT_MAX_SIZE, not FDT_START, so I moved it.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2024-01-10 17:37:29 +00:00
Rob Bradford
e041defa67 fuzz: Fix warnings for unused arch code
Under the fuzzer this code appears dead:

error: field `0` is never read
   --> /home/rob/src/cloud-hypervisor/arch/src/x86_64/mod.rs:128:32
    |
128 | struct MemmapTableEntryWrapper(hvm_memmap_table_entry);
    |        ----------------------- ^^^^^^^^^^^^^^^^^^^^^^
    |        |
    |        field in this struct
    |
    = note: `MemmapTableEntryWrapper` has a derived impl for the trait `Clone`, but this is intentionally ignored during dead code analysis
    = note: `-D dead-code` implied by `-D warnings`
    = help: to override `-D warnings` add `#[allow(dead_code)]`
help: consider changing the field to be of unit type to suppress this warning while preserving the field numbering, or remove the field
    |
128 | struct MemmapTableEntryWrapper(());
    |                                ~~

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-01-08 17:39:05 +00:00
Yi Wang
3d6594a594 build: fix clippy ptr arg issue
CI reports errors:

error: writing `&Vec` instead of `&[_]` involves a new object where a slice will do
    --> arch/src/x86_64/mod.rs:1351:19
     |
1351 |     epc_sections: &Vec<SgxEpcSection>,
     |                   ^^^^^^^^^^^^^^^^^^^ help: change this to: `&[SgxEpcSection]`
     |
     = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#ptr_arg
     = note: `-D clippy::ptr-arg` implied by `-D warnings`
     = help: to override `-D warnings` add `#[allow(clippy::ptr_arg)]`

Signed-off-by: Yi Wang <foxywang@tencent.com>
2024-01-02 08:58:11 +00:00
Thomas Barrett
5c0b66529a arch: x86_64: handle npot CPU topology
This PR addresses a bug in which the cpu topology of a guest
with non power-of-two number of cores is incorrect. For example,
in some contexts, a virtual machine with 2-sockets and 12-cores
will incorrectly believe that 16 cores are on socket 1 and 8
cores are on socket 2. In other cases, common topology enumeration
software such as hwloc will crash.

The root of the problem was the way that cloud-hypervisor generates
apic_id. On x86_64, the (x2) apic_id embeds information about cpu
topology. The cpuid instruction is primarily used to discover the
number of sockets, dies, cores, threads, etc. Using this information,
the (x2) apic_id is masked to determine which {core, die, socket} the
cpu is on. When the cpu topology is not a power of two
(e.g. a 12-core machine), this requires non-contiguous (x2) apic_id.

Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>
2024-01-01 10:05:03 +00:00
Bo Chen
38a2808d85 arch: x86_64: Refactor the way to generate e820 RAM maps
This patch defines a new function 'generate_ram_ranges', to generate
usable physical memory ranges for the guest based on the existing guest
memory managed by VMM. This function is also made public, so that it can
be reused, say by the IGVM loader in the future [1].

No functional change.

See: #6020

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-12-14 07:11:53 -08:00
Bo Chen
d4892f41b3 misc: Stop using deprecated functions from vm-memory crate
See: https://github.com/rust-vmm/vm-memory/pull/247

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-11-14 09:17:42 +00:00
Jianyong Wu
2434e76ee0 aarch64: fdt: Use more appropriate default value for topology
Now, default values for vcpu topology are 0s, that is not correct and may
lead to bug. Fix it by setting default value to 1s. Also add check in
case one or more of these values are zero.

Fixes: #5892
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2023-11-01 21:29:08 +08:00
Thomas Barrett
3029fbeafd vmm: Allow assignment of PCI segments to NUMA node
Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>
2023-10-18 11:18:15 -07:00
Bo Chen
0b4c153d4d arch, vmm: Clear AMX CPUID bits if the feature is not enabled
Fixes: #5833

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-10-18 11:13:12 -07:00
Bo Chen
7dd260f82f arch, vmm: Add new struct CpuidConfig
This struct contains all configuration fields that controls the way how
we generate CPUID for the guest on x86_64. This allows cleaner extension
when adding new configuration fields.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-10-18 11:13:12 -07:00
Anatol Belski
b52966a12c cpu: Implement AMD compatible topology handling
cpu: Pass APIC id explicitly where needed
topology: Set subleaf number explicitly

Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
2023-10-17 18:43:22 +02:00
Julian Stecklina
0d9749282a vmm: simplify EntryPoint
EntryPoint had an optional entry_addr, but there is no usage of this
struct that makes it necessary that the address is optional.

Remove the Option to avoid being able to express things that are not
useful.

Signed-off-by: Julian Stecklina <julian.stecklina@cyberus-technology.de>
2023-09-09 10:46:51 +01:00
Philipp Schuster
7bf0cc1ed5 misc: Fix various spelling errors using typos
This fixes all typos found by the typos utility with respect to the config file.

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
2023-09-09 10:46:21 +01:00
zhongbingnan
5857d4851b aarch64/fdt: fix cache sharing issue for cache passthrough on aarch64
We fixed the L2 and L3 level cache sharing issues and confirmed that the
L2 level cache is independent, while the L3 level cache is shared per-socket.

See:#5505

Signed-off-by: zhongbingnan <zhongbingnan@bytedance.com>
2023-08-16 08:25:20 +08:00
Yi Wang
ba5af92984 arch: x86_64: re-enable KVM_FEATURE_ASYNC_PF_INT in CPUID
The commit b92fe648e9 (vmm: cpu: Disable KVM_FEATURE_ASYNC_PF_INT in
CPUID) disabled APF (Asynchronous Page Fault) mechanism to address
problem that makes vcpu thread spin 100%. As the actual issue is in
KVM, which has been merged in commit 2f15d027c05f (KVM: x86: Properly
handle APF vs disabled LAPIC situation) since 2021, so it's okay to
re-enable APF now.

Signed-off-by: Yi Wang <foxywang@tencent.com>
2023-07-31 17:05:12 +01:00
Yu Li
87d81dd2b1 arch: remove redundant closing paren in log
Signed-off-by: Yu Li <liyu.yukiteru@bytedance.com>
2023-07-14 09:36:27 -07:00
zhongbingnan
c1b33329db aarch64/fdt: Forward host cache layout to guest
Using the data from sysfs forward the host host cache layout to the
guest using the FDT tables.

TEST=The host cache layout (from sysfs) can be seen in inside the guest
using lscpu.

Signed-off-by: zhongbingnan <zhongbingnan@bytedance.com>
2023-06-20 15:45:15 +01:00
Bo Chen
b06ad85604 arch: Refactor the way of creating memory mapping
This patch clarifies the assumptions we have regarding the guest address
space layout while creating memory mapping in E820 on x86_64 and fdt on
aarch64. It also explicitly checks on these assumptions and report
errors if these assumptions do not hold.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-06-16 14:15:03 -07:00
Yu Li
55ee8eb482 arch: let arch_memory_regions return all available regions
The previous `arch_memory_regions` function will provide some memory
regions with the specified memory size and fill all the previous
regions before using the next one, but sometimes there may be no need
to fill up the previous one, e.g., the previous one should be aligned
with hugepage size.

This commit make `arch_memory_regions` function not take any
parameters and return the max available regions, the memory manager
can use them on demand.

Fixes: #5463

Signed-off-by: Yu Li <liyu.yukiteru@bytedance.com>
2023-06-16 14:15:03 -07:00
Yu Li
1017157bb6 arch: create memory mapping by the actual memory info
The original codes did not consider that the previous memory region
might not be full and always set it to the maximum size.

This commit fixes this problem by creating memory mappings based on
the actual memory details in both E820 on x86_64 and fdt on aarch64.

Fixes: #5463

Signed-off-by: Yu Li <liyu.yukiteru@bytedance.com>
2023-06-16 14:15:03 -07:00
Jianyong Wu
57fdaa3a39 arch: x86_64: Populate the APIC Id
Program the APIC ID (CPUID leaf 0x1 EBX) with the CPU id. This resolves
an issue where the EDKII firmware expects the APIC ID to vary per-CPU.

Fixes: #5475
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2023-06-15 13:50:38 -07:00
Anatol Belski
7bf2a2c382 vmm: arch: Make phys_bits functionality use CPU vendor API
Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
2023-05-31 23:54:33 +02:00
Rafael Mendonca
6379074264 misc: Remove unnecessary clippy directives
Clippy passes fine without these.

Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com>
2023-04-18 10:48:31 -07:00
Alyssa Ross
c1f555cde3 vmm: fall back if CLONE_CLEAR_SIGHAND unsupported
This will allow the SIGWINCH listener to run on kernels older than
5.5, although on those kernels it will have to make 64 syscalls to
reset all the signal handlers.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2023-04-05 11:23:06 +01:00
Alyssa Ross
95f83320b1 arch: use a non-doc comment for diagram
This doesn't need to be rendered in the HTML API documentation, and
wouldn't be formatted correctly if it was.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2023-04-04 17:38:21 -07:00
Wei Liu
de3ca97095 hypervisor: rename get_cpuid to get_supported_cpuid
To better reflect its nature and avoid confusion with get_cpuid2.

No functional change.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2023-02-23 13:03:12 +00:00
Rob Bradford
c22c4675b3 arch, hypervisor: Populate CPUID leaf 0x4000_0010 (TSC frequency)
This hypervisor leaf includes details of the TSC frequency if that is
available from KVM. This can be used to efficiently calculate time
passed when there is an invariant TSC.

TEST=Run `cpuid` in the guest and observe the frequency populated.

Fixes: #5178

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2023-02-09 18:32:21 +01:00
Michael Zhao
4a51a6615f arch: Fix AArch64 socket setting in CPU topology
Before Linux v6.0, AArch64 didn't support "socket" in "cpu-map"
(CPU topology) of FDT.

We found that clusters can be used in the same way of sockets. That is
the way we implemented the socket settings in Cloud Hypervisor. But in
fact it was a bug.

Linux commit 26a2b7 fixed the mistake. So the cluster nodes can no
longer act as sockets. And in a following commit dea8c0, sockets were
supported.

This patch fixed the way to configure sockets. In each socket, a default
cluster was added to contain all the cores, because cluster layer is
mandatory in CPU topology on AArch64.

This fix will break the socket settings on the guests where the kernel
version is lower than v6.0. In that case, if socket number is set to
more than 1, the kernel will treat that as FDT mistake and all the CPUs
will be put in single cluster of single socket.

The patch only impacts the case of using FDT, not ACPI.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2023-01-30 08:12:56 +00:00
Bo Chen
574576c8e9 misc: Automatically fix cargo clippy issues added in 1.68 (beta)
Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-01-26 08:58:37 -08:00
Rob Bradford
e661139e1e arch: Print details of host hypervisor status & address space size
e.g. on QEMU on KVM:

cloud-hypervisor: 17.079406ms: <vmm> INFO:arch/src/x86_64/mod.rs:565 -- Running under nested virtualisation. Hypervisor string: KVMKVMKVM

Or under Azure:

cloud-hypervisor: 3.881263ms: <vmm> INFO:arch/src/x86_64/mod.rs:565 -- Running under nested virtualisation. Hypervisor string: Microsoft Hv

Fixes: #5067

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2023-01-11 14:38:22 +00:00
Rob Bradford
5e52729453 misc: Automatically fix cargo clippy issues added in 1.65 (stable)
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-12-14 14:27:19 +00:00
Sebastien Boeuf
0489b6314e tdx: Support new way of declaring memory resources
Without breaking the former way of declaring them. This is simply based
on the presence of the GUID TDX Metadata offset. If not present, we
consider the firmware is quite old and therefore we fallback onto the
previous way to expose memory resources.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-12-08 10:13:12 -08:00
Sebastien Boeuf
4f3f36fe5f tdx: Add support for new method of TDVF descriptor discovery
The preferred way of retrieving the offset where to find the TDVF
descriptor structure is by going through a table of GUIDs that can be
found at a specific offset in the firmware file. If the expected GUIDs
can't be found, we can fallback onto the former way, which is to read
directly the value at a specific offset in the file.

This patch implements the new mechanism without breaking compatibility
for older firmwares as it keeps supporting the previous mechanism.

As a reference, here is the documentation from the EDK2 code, and
particularly from the OvmfPkg/ResetVector/Ia16/ResetVectorVtf0.asm file:

```
GUIDed structure.  To traverse this you should first verify the
presence of the table footer guid
(96b582de-1fb2-45f7-baea-a366c55a082d) at 0xffffffd0.  If that
is found, the two bytes at 0xffffffce are the entire table length.

The table is composed of structures with the form:

Data (arbitrary bytes identified by guid)
length from start of data to end of guid (2 bytes)
guid (16 bytes)

so work back from the footer using the length to traverse until you
either find the guid you're looking for or run off the beginning of
the table.
```

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-12-07 17:55:54 +00:00