1617 Commits

Author SHA1 Message Date
Sebastien Boeuf
54f39aa8cb vmm: Validate vhost-user-block/net are not configured with iommu=on
Extend the validate() function for both DiskConfig and NetConfig so that
we return an error if a vhost-user-block or vhost-user-net device is
expected to be placed behind the virtual IOMMU. Since these devices
don't support this feature, we can't allow iommu to be set to true in
these cases.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-05-05 13:08:41 +02:00
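
The check described in this commit can be sketched as follows. This is a minimal illustration only: the `DiskConfig` below is a hypothetical, reduced stand-in with just two fields, and `ValidationError` is an assumed error type, not the one used in the VMM.

```rust
// Hypothetical error type for this sketch; the real VMM uses its own.
#[derive(Debug, PartialEq)]
pub enum ValidationError {
    /// A vhost-user device was asked to sit behind the virtual IOMMU.
    IommuNotSupportedByVhostUser,
}

// Hypothetical, simplified stand-in for the real configuration struct.
#[derive(Debug, Default)]
pub struct DiskConfig {
    pub vhost_user: bool,
    pub iommu: bool,
}

impl DiskConfig {
    /// Reject the combination of a vhost-user backend and iommu=on.
    pub fn validate(&self) -> Result<(), ValidationError> {
        if self.vhost_user && self.iommu {
            return Err(ValidationError::IommuNotSupportedByVhostUser);
        }
        Ok(())
    }
}
```

The same shape of check would apply to a network configuration; only the disk case is shown here.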
Rob Bradford
707cea2182 vmm, devices: Move logging of 0x80 timestamp to its own device
This is a cleaner approach to handling the I/O port write to 0x80.
Whilst doing this also generate the timestamp at the start of the VM
creation. For consistency use the same timestamp for the ARM equivalent.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-04 23:02:53 +01:00
Rob Bradford
c47e3b8689 gdb: Do not use VmmOps for memory manipulation
We don't use the VmmOps trait directly for manipulating memory in the
core of the VMM as it's really designed for the MSHV crate to handle
instruction decoding. As I plan to make this trait MSHV specific to
allow reduced locking for MMIO and PIO handling when running on KVM this
use should be removed.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-04 11:33:02 -07:00
Bo Chen
7fe399598d vmm: device_manager: Map MMIO regions to the guest correctly
To correctly map MMIO regions to the guest, we will need to wait for valid
MMIO region information which is generated from 'PciDevice::allocate_bars()'
(as a part of 'DeviceManager::add_pci_device()').

Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-05-04 13:53:47 +02:00
Rob Bradford
1dfe4eda5c vmm: Prevent "internal" identifiers being used by user
For devices that cannot be named by the user use the "__" prefix to
identify them as internal devices. Check that any identifiers provided
in the config do not clash with those internal names. This prevents the
user from creating a disk such as "__serial" which would then cause a
failure in an unpredictable manner.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-04 12:34:11 +02:00
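
A rough sketch of the prefix check this commit describes, assuming a hypothetical constant `INTERNAL_PREFIX` and helper `check_user_identifier()`; neither name comes from the commit, and the error is a plain `String` for brevity.

```rust
/// Prefix reserved for devices the user cannot name (assumed constant name).
const INTERNAL_PREFIX: &str = "__";

/// Hypothetical helper: reject user-supplied identifiers that fall into
/// the namespace reserved for internal devices.
pub fn check_user_identifier(id: &str) -> Result<(), String> {
    if id.starts_with(INTERNAL_PREFIX) {
        return Err(format!(
            "identifier '{}' uses the reserved '{}' prefix",
            id, INTERNAL_PREFIX
        ));
    }
    Ok(())
}
```

With this in place, a disk named "__serial" would be rejected at validation time, while "test-disk" would pass.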
Sebastien Boeuf
6e101f479c vmm: Ensure hotplugged device identifier is unique
Whenever a device (virtio, vfio, vfio-user or vdpa) is hotplugged, we
must verify the provided identifier is unique, otherwise we must return
an error.

Particularly, this will prevent issues with identifiers for serial,
console, IOAPIC, balloon, rng, watchdog, iommu and gpio since all of
these are hardcoded by the VMM.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-05-03 18:34:24 +01:00
Rob Bradford
6d4862245d vmm: Generate event when device is removed
The new event contains the BDF and the device id:

{
  "timestamp": {
    "secs": 2,
    "nanos": 731073396
  },
  "source": "vm",
  "event": "device-removed",
  "properties": {
    "bdf": "0000:00:02.0",
    "id": "test-disk"
  }
}

Fixes: #4038

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-03 17:10:36 +02:00
Sebastien Boeuf
a5a2e591c9 vmm: Remove FsConfig from VmConfig when unplugging fs device
All hotpluggable devices were properly removed from the VmConfig when a
remove-device command was issued, except for the "fs" type. Fix this
lack of support as it is causing the integration tests to fail with the
recent addition of verifying that identifiers are unique.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-05-02 13:26:15 +02:00
Sebastien Boeuf
677c8831af vmm: Ensure uniqueness of generated identifiers
The device identifiers generated from the DeviceManager were not
guaranteed to be unique since they did not take into account the list of
identifiers provided through the configuration.

By returning the list of unique identifiers from the configuration, and
by providing it to the DeviceManager, the generation of new identifiers
can rely both on the DeviceTree and the list of IDs from the
configuration.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-05-02 13:26:15 +02:00
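
One way to picture the approach in this commit is a generator that consults both sets of identifiers before handing out a new name. The function name `next_device_id()` and the counter-based naming scheme are assumptions made for this sketch, not the DeviceManager's actual code.

```rust
use std::collections::BTreeSet;

/// Hypothetical helper: return the first "<prefix><n>" that is present in
/// neither the device tree nor the user-provided configuration.
pub fn next_device_id(
    prefix: &str,
    device_tree_ids: &BTreeSet<String>,
    config_ids: &BTreeSet<String>,
) -> String {
    let mut counter: usize = 0;
    loop {
        let candidate = format!("{}{}", prefix, counter);
        // A candidate is only acceptable if both sources agree it is free.
        if !device_tree_ids.contains(&candidate) && !config_ids.contains(&candidate) {
            return candidate;
        }
        counter += 1;
    }
}
```

Checking only the device tree would miss identifiers the user has reserved in the configuration for devices that have not been created yet, which is the gap the commit closes.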
Sebastien Boeuf
634c53ea50 vmm: config: Validate provided identifiers are unique
A valid configuration means we can only accept unique identifiers from
the user.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-05-02 13:26:15 +02:00
LiHui
ec0c1b01c4 vmm: api: Do not delete the API socket on API server creation
The socket will be safely deleted on shutdown and so it is not necessary to
delete the API socket when starting the HTTP server.

Fixes: #4026

Signed-off-by: LiHui <andrewli@kubesphere.io>
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-04-29 18:40:49 +01:00
Rob Bradford
f17aa3755f vmm: Add clarifying comment about Vm::entry_point()
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-04-29 11:03:38 +01:00
Rob Bradford
744a049007 vmm: Parallelise functionality with kernel loading
Move functionality earlier in the boot so as to run in parallel with the
loading of the kernel.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-04-29 11:03:38 +01:00
Rob Bradford
e70bd069b3 vmm: Load kernel asynchronously
Start loading the kernel as early as possible in the VM in a separate thread.
Whilst it is loading other work can be carried out such as initialising
the devices.

The biggest performance improvement is seen with a more complex set of
devices. If using e.g. four virtio-net devices then the time to start the
kernel improves by 20-30ms. With the simplest configuration the
improvement was of the order of 2-3ms.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-04-29 11:03:38 +01:00
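
The overlap described in this commit can be sketched with a plain `std::thread`. Everything below is a placeholder: `EntryPoint`, `load_kernel()`, `init_devices()` and `boot()` are hypothetical names, and the bodies do no real work.

```rust
use std::thread;

/// Hypothetical stand-in for the result of loading a kernel image.
#[derive(Debug, PartialEq)]
pub struct EntryPoint(pub u64);

/// Placeholder for the real (slow) kernel loading work.
fn load_kernel() -> EntryPoint {
    EntryPoint(0x10_0000)
}

/// Placeholder for device initialisation; returns how many devices were set up.
fn init_devices(count: usize) -> usize {
    count
}

pub fn boot(device_count: usize) -> (EntryPoint, usize) {
    // Kick off the kernel load first so it overlaps with the work below.
    let loader = thread::spawn(load_kernel);
    let devices = init_devices(device_count);
    // Block only at the point where the entry point is actually needed.
    let entry = loader.join().expect("kernel loading thread panicked");
    (entry, devices)
}
```

The more time device initialisation takes, the more of the kernel load it hides, which is consistent with the larger gain reported for the four virtio-net device case.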
Rob Bradford
bfeb3120f5 vmm: Refactor kernel loading to decouple from Vm struct
This will allow the kernel to be loaded from another thread.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-04-29 11:03:38 +01:00
Rob Bradford
ce6d88d187 vmm: Merge aarch64 use statements
These were in their own block and not organised lexically.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-04-29 11:03:38 +01:00
Rob Bradford
56fe4c61af vmm: Duplicate Vm::entry_point() across architectures
These will have very different implementations when asynchronously
loading the kernel.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-04-29 11:03:38 +01:00
Rob Bradford
1d1a087fc5 vmm: Refactor kernel command line generation
This allows the same code for generating the kernel command line to be
used on both aarch64 and x86_64 when the latter starts loading the
kernel asynchronously.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-04-29 11:03:38 +01:00
Rob Bradford
f1276c58d2 vmm: Commandline inject from devices is aarch64 specific
This is not required for x86_64 and maintains a tight coupling between
kernel loading and the DeviceManager.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-04-29 11:03:38 +01:00
Rob Bradford
da33eb5e8c vmm: device_manager: Remove extra whitespace lines
These originated from the removal of the acpi feature gate.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-04-29 11:03:38 +01:00
Fabiano Fidêncio
fdeb4f7c46 Revert "vmm, openapi: Token Bucket fields should be uint64"
This reverts commit 87eed369cd091db76ed8542750803659e729b239.

The reason we're reverting this is that OpenAPI Specification[0] doesn't
know how to deal with unsigned types. :-/

Right now the best thing to do is keep it as is, as an int64, and try to
fix OpenAPI, or even switch to swagger, as the latter knows how to properly
deal with those.  However, switching to swagger is far from being a 1:1
transition and will require time to experiment, thus reverting this for
now seems the best approach.

[0]: https://github.com/OAI/OpenAPI-Specification/blob/main/versions/3.1.0.md#data-types

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-28 09:26:38 +02:00
Fabiano Fidêncio
87eed369cd vmm, openapi: Token Bucket fields should be uint64
The Token Bucket fields are, on the Cloud Hypervisor side, u64.
However, we expose those as int64 in the OpenAPI YAML file.

With that in mind, let's adjust the yaml file to expose those as uint64.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-27 13:16:02 +02:00
Rob Bradford
79f4c2db01 vmm: Enable virtio-iommu in VmConfig::validate()
This means that the automatic enabling of the virtio-iommu will also be
applied to VMs created via the API as well as the CLI.

Fixes: #4016

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-04-26 12:27:00 +01:00
Rob Bradford
bf9f79081a vmm: Only create ACPI memory manager DSDT when resizable
If using the ACPI based hotplug, only memory can be added, so if the
hotplug RAM size is the same as the boot RAM size then do not include
the memory manager DSDT entries.

Also: this change simplifies the code marginally by making the
HotplugMethod enum Copyable.

This was identified from the following perf output:

     1.78%     0.00%  vmm              cloud-hypervisor      [.] <vmm::memory_manager::MemorySlots as acpi_tables::aml::Aml>::append_aml_bytes
            |
            ---<vmm::memory_manager::MemorySlots as acpi_tables::aml::Aml>::append_aml_bytes
               <vmm::memory_manager::MemorySlot as acpi_tables::aml::Aml>::append_aml_bytes
               acpi_tables::aml::Name::new
               <acpi_tables::aml::Path as acpi_tables::aml::Aml>::append_aml_bytes
               __libc_malloc

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-04-26 13:07:19 +02:00
Rob Bradford
62f17ccf8c vmm: Improve error handling for vmm::vm::Error
In particular implement thiserror::Error, cleanup wording and remove
unused errors.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-04-22 17:46:41 +01:00
Rob Bradford
cb03540ffd vmm: config: Derive thiserror::Error
No further changes are necessary than adding a #[derive(Error)] as there
is a manual implementation of Display.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-04-22 17:46:41 +01:00
Rob Bradford
0270d697ab vmm: cpu: Improve Error reporting
Remove unused enum members, improve error messages and implement
thiserror::Error.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-04-22 17:46:41 +01:00
Rob Bradford
47529796d0 arch: Improve arch::Error
Remove unused error enum entries, improve wording and derive
thiserror::Error.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-04-22 17:46:41 +01:00
Rob Bradford
1c786610b7 vmm: api: Don't use clashing struct name for Error
Import vmm::Error as VmmError to allow the use of thiserror::Error to
avoid clashing names.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-04-22 17:46:41 +01:00
Sebastien Boeuf
eb6daa2fc3 pci: Store MSI interrupt manager in VfioCommon
Extend VfioCommon structure to own the MSI interrupt manager. This will
be useful for implementing the restore code path.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-04-22 16:16:48 +02:00
Rob Bradford
adb3dcdc13 vmm: openapi: Add serial_number to PlatformConfig
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-04-21 17:17:08 +02:00
Rob Bradford
e972eb7c74 arch, vmm: Expose platform serial_number via SMBIOS
Fixes: #4002

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-04-21 17:17:08 +02:00
Rob Bradford
203dfdc156 vmm: config: Add "serial_number" option to "--platform"
This carries a string that is exposed via DMI/SMBIOS and is particularly
useful for cloud-init initialisation.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-04-21 17:17:08 +02:00
Rob Bradford
4a04d1f8f2 vmm: seccomp: Allow SYS_rseq as required by newer glibc
glibc 2.35 as shipped by Fedora 36 now uses the rseq syscall.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-04-21 13:02:51 +01:00
Rob Bradford
4ca066f077 vmm: api: Simplify error reporting from HTTP to internal API calls
Use a single enum member for representing errors from the internal API.
This avoids the ugly duplication of the API call name in the error
message:

e.g.

$ target/debug/ch-remote --api-socket /tmp/api resize --cpus 2
Error running command: Server responded with an error: InternalServerError: VmResize(VmResize(CpuManager(DesiredVCpuCountExceedsMax)))

Becomes:

$ target/debug/ch-remote --api-socket /tmp/api resize --cpus 2
Error running command: Server responded with an error: InternalServerError: ApiError(VmResize(CpuManager(DesiredVCpuCountExceedsMax)))

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-04-20 19:39:05 +01:00
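
The change in output above follows from how derived `Debug` prints nested enum variants. The sketch below uses simplified stand-in types whose variant names are taken from the quoted error message; the enum names `CpuManagerError`, `VmError` and `HttpError`, and the helper `describe()`, are assumptions for illustration.

```rust
// Simplified stand-ins; only the variant names come from the quoted output.
#[derive(Debug)]
pub enum CpuManagerError {
    DesiredVCpuCountExceedsMax,
}

#[derive(Debug)]
pub enum VmError {
    CpuManager(CpuManagerError),
}

#[derive(Debug)]
pub enum ApiError {
    VmResize(VmError),
}

#[derive(Debug)]
pub enum HttpError {
    /// Single wrapper for any failure coming from the internal API,
    /// instead of one HTTP-level variant per API call.
    ApiError(ApiError),
}

/// Render the error the way a derived Debug implementation would.
pub fn describe(e: &HttpError) -> String {
    format!("{:?}", e)
}
```

Because the outer layer no longer has a per-call variant, the call name appears only once, at the internal API level.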
Sebastien Boeuf
11e9f43305 vmm: Use new Resource type PciBar
Instead of defining some very generic resources as PioAddressRange or
MmioAddressRange for each PCI BAR, let's move to the new Resource type
PciBar in order to make things clearer. This makes the code
more readable, but also removes the need for hard assumptions about the
MMIO and PIO ranges. PioAddressRange and MmioAddressRange types can be
used to describe everything except PCI BARs. BARs are very special as
they can be relocated and have special information we want to carry
along with them.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-04-19 12:54:09 -07:00
Sebastien Boeuf
89218b6d1e pci: Replace BAR tuple with PciBarConfiguration
In order to make the code more consistent and easier to read, we remove
the former tuple that was used to describe a BAR, replacing it with the
existing structure PciBarConfiguration.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-04-19 12:54:09 -07:00
Sebastien Boeuf
1795afadb8 vmm: Factorize algorithm finding HOB memory resources
By factorizing the algorithm untangling TDVF sections from guest RAM
into a dedicated function, we can write some unit tests to validate it
properly achieves what we expect.

Adding the "tdx" feature to the unit tests, otherwise it wouldn't get
tested.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-04-19 15:23:12 +02:00
Sebastien Boeuf
5264d545dd pci, vmm: Extend PciDevice trait to support BAR relocation
By adding a new method id() to the PciDevice trait, we allow the caller
to retrieve a unique identifier. This is used in the context of BAR
relocation to identify the device being relocated, so that we can update
the DeviceTree resources for all PCI devices (and not only
VirtioPciDevice).

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-04-14 12:11:37 +02:00
Sebastien Boeuf
0c34846ef6 vmm: Return new PCI resources from add_pci_device()
By returning the new PCI resources from add_pci_device(), we allow the
factorization of the code translating the BARs into resources. This
allows VIRTIO, VFIO and vfio-user to add the resources to the DeviceTree
node.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-04-14 12:11:37 +02:00
Sebastien Boeuf
4f172ae4b6 vmm: Retrieve PCI resources for VFIO and vfio-user devices
Relying on the function introduced recently to get the PCI resources and
handle the restore case, both VFIO and vfio-user device creation paths
now have access to PCI resources, which can be provided to the function
add_pci_device().

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-04-14 12:11:37 +02:00
Sebastien Boeuf
0f12fe9b3b vmm: Factorize retrieval of PCI resources
Create a dedicated function for getting the PCI segment, b/d/f and
optional resources. This is meant for handling the potential case of a
restore.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-04-14 12:11:37 +02:00
Sebastien Boeuf
6e084572d4 pci, virtio: Make virtio-pci BAR restoration more generic
Updating the way of restoring BAR addresses for virtio-pci by providing
a more generic approach that will be reused for other PciDevice
implementations (i.e. VfioPciDevice and VfioUserPciDevice).

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-04-14 12:11:37 +02:00
Rob Bradford
b212f2823d vmm: Deprecate mergeable option from virtio-pmem
KSM would never merge the file backed pages so this option has no
effect.

See: #3968

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-04-12 07:12:25 -07:00
Rob Bradford
ed87e42e6f vm-device, pci, devices: Remove InterruptSourceGroup::{un}mask
The calls to these functions are always preceded by a call to
InterruptSourceGroup::update(). By adding a masked boolean to that
function call it is possible to remove 50% of the calls to the
KVM_SET_GSI_ROUTING ioctl as the update will correctly handle the
masked or unmasked case.

This causes the ioctl to disappear from the perf report for a boot of
the VM.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-04-11 22:56:48 +01:00
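
The saving described in this commit can be modelled with a toy type that counts routing updates. `InterruptGroup` and its `routing_updates` counter are hypothetical; the counter merely stands in for calls to the KVM_SET_GSI_ROUTING ioctl and no real ioctl is issued.

```rust
/// Hypothetical model of an interrupt source group that counts how many
/// routing updates it would issue.
#[derive(Debug, Default)]
pub struct InterruptGroup {
    pub masked: bool,
    pub routing_updates: u32,
}

impl InterruptGroup {
    /// Update the route and its masked state in a single step, so that no
    /// separate mask or unmask call (and second routing update) is needed.
    pub fn update(&mut self, masked: bool) {
        self.masked = masked;
        self.routing_updates += 1;
    }
}
```

With separate mask and unmask functions, each state change would cost an update call plus a mask or unmask call; folding the flag into `update()` halves that count in this model.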
Michael Zhao
d1b2a3fca9 aarch64: Add a memory-simulated flash for UEFI
EDK2 execution requires a flash device at address 0.

The newly added device is not a fully functional flash. It doesn't
implement any flash device spec. Instead, a piece of memory is simply
used to simulate the flash.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-04-11 09:51:34 +01:00
Michael Zhao
298a5580a9 aarch64: Remove unnecessary function definitions
This is a refactoring commit to simplify source code.
Removed some functions that only return a layout const.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-04-08 11:08:43 -07:00
Michael Zhao
656425a328 aarch64: Align the data types in layout
Some addresses defined in `layout.rs` were of type `GuestAddress`, and
others were `u64`. Now align the types of all the `*_START` definitions to
`GuestAddress`.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-04-08 11:08:43 -07:00
Michael Zhao
848d88c122 aarch64: Reserve a hole in 32-bit space
The reserved space is for devices.
Some devices (like TPM) require arbitrary addresses close to 4GiB.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-04-05 11:04:52 +08:00
Michael Zhao
a3dbc3b415 aarch64: Change RAM_START type to GuestAddress
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-04-05 11:04:52 +08:00