Commit Graph

1708 Commits

Author SHA1 Message Date
Sebastien Boeuf
98f949d35d vmm: Add new I/O ports for ACPI shutdown and PM timer devices
Adding new I/O ports for both the ACPI shutdown and the ACPI PM timer
devices so they can be triggered from both addresses. The reason for
this change is that TDX expects only certain I/O ports to be enabled
based on what QEMU exposes. We follow this to avoid new ports from being
opened exclusively for Cloud Hypervisor.

We have to keep the former I/O ports available given all firmwares
haven't been updated yet. Once we reach a point where we know both Rust
Hypervisor Firmware, OVMF, TDVF and TDSHIM have been updated with the
new port values, we'll be able to remove the former ports.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-08-11 11:46:09 +01:00
Rob Bradford
8c22c03e1e vmm: openapi: Switch to describing new payload API
The old API remains usable, and will remain usable for two releases but
we should only advertise the new API.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-08-10 22:20:07 +01:00
Rob Bradford
51fdc48817 vmm: openapi: Fix OpenAPI YAML file formatting
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-08-10 22:20:07 +01:00
Rob Bradford
cef51a9de0 vmm: Encompass guest payload configuration in PayloadConfig
Introduce a new top level member of VmConfig called PayloadConfig that
(currently) encompasses the kernel, commandline and initramfs for the
guest to use.

In future this can be extended for firmware use. The existing
"--kernel", "--cmdline" and "initramfs" CLI parameters now fill the
PayloadConfig.

Any config supplied which uses the now deprecated config members have
those members mapped to the new version with a warning.

See: #4445

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-08-10 15:12:34 +01:00
Rob Bradford
6bc46ba9c1 vmm: config: Reject VFIO devices with the same path
By checking in the validation logic we get checking for both devices
specified in the initial config but also hotplug too.

Fixes: #4453

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-08-09 14:32:35 +02:00
Rob Bradford
ea58d2f68a vmm: config: Enhance test_cpu_parsing to add "affinity" parameter
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-08-08 16:23:00 +01:00
Rob Bradford
d295de4cd5 option_parser: Move test_option_parser to option_parser crate
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-08-08 16:23:00 +01:00
Wei Liu
53aecf9341 vmm: add oem_strings to openapi
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-08-08 08:59:19 +01:00
Wei Liu
57e9b80123 vmm: provide oem_strings option
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-08-08 08:59:19 +01:00
lizhaoxin1
65f42c1f62 vmm: openapi: Add uuid to PlatformConfig
Signed-off-by: lizhaoxin1 <Lxiaoyouling@163.com>
2022-08-04 09:20:06 +02:00
lizhaoxin1
bc3a276b43 arch, vmm: Expose platform uuid via SMBIOS
Parse and set uuid.

Signed-off-by: lizhaoxin1 <Lxiaoyouling@163.com>
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-08-04 09:20:06 +02:00
lizhaoxin1
3abc1e1e51 vmm: config: Add "uuid" option to "--platform"
The uuid indicates the unique ID of a virtual machine.
cloud-hypervisor takes the uuid passed by libvirt
and uses it to initialize cloud-init.

Signed-off-by: lizhaoxin1 <Lxiaoyouling@163.com>
2022-08-04 09:20:06 +02:00
Bo Chen
1125fd2667 vmm: api: Use 'BTreeMap' for 'HttpRoutes'
In this way, we get the values sorted by its key by default, which is
useful for the 'http_api' fuzzer.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-08-03 10:18:24 +01:00
Bo Chen
eb056d374a vmm: Make 'EpollContext::add_event()' public
So that it can be reused by other crate, e.g. from fuzz targets.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-08-03 10:18:24 +01:00
Sebastien Boeuf
4d74525bdc vmm: Remove unused "poll_queue" from DiskConfig
The parameter "poll_queue" was useful at the time Cloud Hypervisor was
responsible for spawning vhost-user backends, as it was carrying the
information the vhost-user-block backend should have this option enabled
or not.

It's been quite some time that we walked away from this design, as we
now expect a management layer to be responsible for running vhost-user
backends.

That's the reason why we can remove "poll_queue" from the DiskConfig
structure.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-08-02 15:10:11 +02:00
Michael Zhao
7199119bb2 hypervisor: Remove Vcpu::read_mpidr() on AArch64
Replaced `read_mpidr()` with `get_sys_reg()`.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-07-29 11:45:12 +01:00
Michael Zhao
cd7f36a713 hypervisor: Remove get/set_reg() on AArch64
`Vcpu::get/set_reg()` were only invoked in Vcpu itself.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-07-29 11:45:12 +01:00
Michael Zhao
f7b6d99c2d hypervisor: Remove get/set_sys_regs() on AArch64
`hypervisor::Vcpu::get/set_sys_regs()` are only used in Vcpu internally.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-07-29 11:45:12 +01:00
Rob Bradford
857edc71a9 vmm: cpu: Remove now unused CpuManager::vcpus_paused()
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-07-26 09:22:25 +02:00
Rob Bradford
0e29379bcf vmm: Make gdb break/resuming more resilient
When starting the VM such that it is already on a breakpoint (via
stop_on_boot) when attached to gdb then start the vCPUs in a paused
state rather than starting the vCPUs later (upon resume).

Further, make the resumption/break of the VM more resilient by only
attempting to resume the vCPUs if were are already in a break point and
only attempting to pause/break if we were already running.

Fixes: #4354

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-07-26 09:22:25 +02:00
Rob Bradford
a749182777 vmm: acpi: Use ACPI platform device addresses from DeviceManager
Remove the hardcoded addresses.

Also remove PM_TMR_BLK as spec compliant implementation will use
X_PM_TMR_BLK over this field.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-07-25 16:16:06 +01:00
Rob Bradford
2e8eb96ef6 vmm: device_manager: Store ACPI platform addresses for later use
These are ready for inclusion in the FACP table.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-07-25 16:16:06 +01:00
Wei Liu
ad33f7c5e6 vmm: return seccomp rules according to hypervisors
That requires stashing the hypervisor type into various places.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-22 12:50:12 +01:00
Wei Liu
a96a5d7816 hypervisor, vmm: use new vfio-ioctls
Use the new vfio-ioctls APIs. Drop Cloud Hypervisor's Device trait
since it is no longer needed.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-21 23:37:53 +01:00
Wei Liu
f84ddedb1a hypervisor, vmm: introduce trait functions for aarch64 PMU
The original code uses kvm_device_attr directly outside of the
hyeprvisor crate. That leaks hypervisor details.

No functional change intended.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-21 23:37:53 +01:00
Wei Liu
f21fc1dcb6 hypervisor: x86: provide a generic MsrEntry structure
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-20 10:13:41 +01:00
Wei Liu
4d2cc3778f hypervisor: move away from MsrEntries type
It is a flexible array. Switch to vector and slice instead.

No functional change intended.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-20 10:13:41 +01:00
Wei Liu
05e5106b9b hypervisor x86: provide a generic LapicState structure
This requires making get/set_lapic_reg part of the type.

For the moment we cannot provide a default variant for the new type,
because picking one will be wrong for the other hypervisor, so I just
drop the test cases that requires LapicState::default().

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-19 09:38:38 +01:00
Wei Liu
6a8c0fc887 hypervisor: provide a generic FpuState structure
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-18 22:15:30 +01:00
Wei Liu
08135fa085 hypervisor: provide a generic CpudIdEntry structure
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-18 22:15:30 +01:00
Wei Liu
45fbf840db hypervisor, vmm: move away from CpuId type
CpuId is an alias type for the flexible array structure type over
CpuIdEntry. The type itself and the type of the element in the array
portion are tied to the underlying hypervisor.

Switch to using CpuIdEntry slice or vector directly. The construction of
CpuId type is left to hypervisors.

This allows us to decouple CpuIdEntry from hypervisors more easily.

No functional change intended.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-18 22:15:30 +01:00
Wei Liu
f1ab86fecb hypervisor: x86: provide a generic SpecialRegisters structure
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-15 10:21:43 +01:00
Wei Liu
75797827d5 hypervisor: x86: provide a generic SegmentRegister structure
And drop SegmentRegisterOps since it is no longer required.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-15 10:21:43 +01:00
Wei Liu
8b7781e267 hypervisor: x86: provide a generic StandardRegisters structure
We only need to do this for x86 since MSHV does not have aarch64 support
yet. This reduces unnecessary code churn.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-15 10:21:43 +01:00
Wei Liu
4201bf4011 hypervisor: provide a generic ClockData structure
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-14 22:09:04 +01:00
Wei Liu
beb4f86b82 hypervisor, vmm: drop VmState and code
VmState was introduced to hold hypervisor specific VM state. KVM does
not need it and MSHV does not really use it yet.

Just drop the code. It can be easily revived once there is a need.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-14 22:09:04 +01:00
Alyssa Ross
a455917db5 vmm: fix missed API or debug events
Previously, we were assuming that every time an eventfd notified us,
there was only a single event waiting for us.  This meant that if,
while one API request was being processed, two more arrived, the
second one would not be processed (until the next one arrived, when it
would be processed instead of that event, and so on).  To fix this,
make sure we're processing the number of API and debug requests we've
been told have arrived, rather than just one.  This is easy to
demonstrate by sending lots of API events and adding some sleeps to
make sure multiple events can arrive while each is being processed.

For other uses of eventfd, like the exit event, this doesn't matter —
even if we've received multiple exit events in quick succession, we
only need to exit once.  So I've only made this change where receiving
an event is non-idempotent, i.e. where it matters that we process the
event the right number of times.

Technically, reset requests are also non-idempotent — there's an
observable difference between a VM resetting once, and a VM resetting
once and then immediately resetting again.  But I've left that alone
for now because two resets in immediate succession doesn't sound like
something anyone would ever want to me.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2022-07-14 17:44:11 +01:00
Michael Zhao
2d8635f04a hypervisor: Refactor system_registers on AArch64
Function `system_registers` took mutable vector reference and modified
the vector content. Now change the definition to `get/set` style.
And rename to `get/set_sys_regs` to align with other functions.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-07-14 22:55:19 +08:00
Michael Zhao
c445513976 hypervisor: Refactor core_registers on AArch64
On AArch64, the function `core_registers` and `set_core_registers` are
the same thing of `get/set_regs` on x86_64. Now the names are aligned.
This will benefit supporting `gdb`.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-07-14 22:55:19 +08:00
Wei Liu
0e8769d76a device_manager: assert passthrough_device has the correct type
There is a lot of unsafe code in such a small function. Add an assert
to help detect issues earlier.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-14 08:09:50 +01:00
Wei Liu
84bbaf06d1 hypervisor: turn boot_msr_entries into a trait method
This allows dispatching to either KVM or MSHV automatically.

No functional change.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-08 16:49:58 +01:00
Rob Bradford
121729a3b0 vmm: Split signal handling for VM and VMM signals
The VM specific signal (currently only SIGWINCH) should only be handled
when the VM is running.

The generic VMM signals (SIGINT and SIGTERM) need handling at all times.

Split the signal handling into two separate threads which have differing
lifetimes.

Tested by:
1.) Boot full VM and check resize handling (SIGWINCH) works & sending
    SIGTERM leads to cleanup (tested that API socket is removed.)
2.) Start without a VM and send SIGTERM/SIGINT and observe cleanup (API
    socket removed)
3.) Boot full VM, delete VM and observe 2.) holds.
4.) Boot full VM, delete VM, recreate VM and observe 1.) holds.

Fixes: #4269

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-07-08 15:15:46 +01:00
Rob Bradford
93237f0106 vmm: Set MADT "Online Capable" flag
The Linux kernel now checks for this before marking CPUs as
hotpluggable:

commit aa06e20f1be628186f0c2dcec09ea0009eb69778
Author: Mario Limonciello <mario.limonciello@amd.com>
Date:   Wed Sep 8 16:41:46 2021 -0500

    x86/ACPI: Don't add CPUs that are not online capable

    A number of systems are showing "hotplug capable" CPUs when they
    are not really hotpluggable.  This is because the MADT has extra
    CPU entries to support different CPUs that may be inserted into
    the socket with different numbers of cores.

    Starting with ACPI 6.3 the spec has an Online Capable bit in the
    MADT used to determine whether or not a CPU is hotplug capable
    when the enabled bit is not set.

    Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/05_ACPI_Software_Programming_Model/ACPI_Software_Programming_Model.html?#local-apic-flags
    Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-07-01 18:45:05 +01:00
Rob Bradford
adf5881757 build: #[allow(clippy::significant_drop_in_scrutinee) in some crates
This check is new in the beta version of clippy and exists to avoid
potential deadlocks by highlighting when the test in an if or for loop
is something that holds a lock. In many cases we would need to make
significant refactorings to be able to pass this check so disable in the
affected crates.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-06-30 20:50:45 +01:00
Rob Bradford
b57d7b258d build: Fix beta clippy issue (needless_return)
warning: unneeded `return` statement
   --> pci/src/vfio_user.rs:627:13
    |
627 | /             return Err(std::io::Error::new(
628 | |                 std::io::ErrorKind::Other,
629 | |                 format!("Region not found for 0x{:x}", gpa),
630 | |             ));
    | |_______________^
    |
    = note: `#[warn(clippy::needless_return)]` on by default
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_return
help: remove `return`
    |
627 ~             Err(std::io::Error::new(
628 +                 std::io::ErrorKind::Other,
629 +                 format!("Region not found for 0x{:x}", gpa),
630 +             ))
    |

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-06-30 20:50:45 +01:00
Rob Bradford
2716bc3311 build: Fix beta clippy issue (derive_partial_eq_without_eq)
warning: you are deriving `PartialEq` and can implement `Eq`
  --> vmm/src/serial_manager.rs:59:30
   |
59 | #[derive(Debug, Clone, Copy, PartialEq)]
   |                              ^^^^^^^^^ help: consider deriving `Eq` as well: `PartialEq, Eq`
   |
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#derive_partial_eq_without_eq

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-06-30 20:50:45 +01:00
Rob Bradford
2e664dca64 vmm: Always reset the console mode on VMM exit
Tested:

1. SIGTERM based
2. VM shutdown/poweroff
3. Injected VM boot failure after calling Vm::setup_tty()

Fixes: #4248

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-06-28 16:45:27 +01:00
Rob Bradford
65ec6631fb vmm: cpu: Store the vCPU snapshots in ascending order
The snapshots are stored in a BTree which is ordered however as the ids
are strings lexical ordering places "11" ahead of "2". So encode the
vCPU id with zero padding so it is lexically sorted.

This fixes issues with CPU restore on aarch64.

See: #4239

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-06-27 16:20:57 +01:00
Wei Liu
bccd7c7e48 vmm: drop Sync+Send bounds for EndpointHandler
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-06-20 23:28:57 +01:00
Wei Liu
8fa1098629 vmm: switch from lazy_static to once_cell
Once_cell does not require using macro and is slated to become part of
Rust std at some point.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-06-20 16:03:07 +01:00