Commit Graph

677 Commits

Author SHA1 Message Date
Rob Bradford
e37ec26ccf vmm: Remove PCI PIO optimisation
This optimisation provided some peformance improvement when measured by
perf however when considered in terms of boot time peformance this
optimisation doesn't have any impact measurable using our
peformance-metrics tooling.

Removing this optimisation helps simplify the VMM internals as it allows
the reordering of the VM creation process permitting refactoring of the
restore code path.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-11-22 19:47:53 +00:00
Wei Liu
d05586f520 vmm: modify or provide safety comments
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-11-18 12:50:01 +00:00
Praveen K Paladugu
09e79a5e9b vmm: Add tpm device to mmio bus
Add tpm device to mmio bus if appropriate cmdline arguments were
passed.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
2022-11-15 16:42:21 +00:00
Praveen K Paladugu
af261f231c vmm: Add required acpi entries for vtpm device
Add an TPM2 entry to DSDT ACPI table. Add a TPM2 table to guest's ACPI.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Co-authored-by: Sean Yoo <t-seanyoo@microsoft.com>
2022-11-15 16:42:21 +00:00
Bo Chen
a9ec0f33c0 misc: Fix clippy issues
Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-11-02 09:41:43 +01:00
Jinrong Liang
cb171d4a23 device_manager: Avoid checking io_uring support when it's not needed
After testing, io_uring_is_supported() causes about 38ms of
overhead when creating virtio-blk. By modifying the position
of io_uring_is_supported(), the overhead of creating virtio-blk
is reduced to less than 1ms when we close io_uring.

Signed-off-by: Jinrong Liang <cloudliang@tencent.com>
2022-10-27 22:21:51 -07:00
Sebastien Boeuf
1f0e5eb66a vmm: virtio-devices: Restore every VirtioDevice upon creation
Following the new design proposal to improve the restore codepath when
migrating a VM, all virtio devices are supplied with an optional state
they can use to restore from. The restore() implementation every device
was providing has been removed in order to prevent from going through
the restoration twice.

Here is the list of devices now following the new restore design:

- Block (virtio-block)
- Net (virtio-net)
- Rng (virtio-rng)
- Fs (vhost-user-fs)
- Blk (vhost-user-block)
- Net (vhost-user-net)
- Pmem (virtio-pmem)
- Vsock (virtio-vsock)
- Mem (virtio-mem)
- Balloon (virtio-balloon)
- Watchdog (virtio-watchdog)
- Vdpa (vDPA)
- Console (virtio-console)
- Iommu (virtio-iommu)

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-10-24 14:17:08 +02:00
Sebastien Boeuf
099cdd2af8 virtio-devices, vmm: vdpa: Implement live migration support
Vdpa now implements the Migratable trait, which allows the device to be
added to the DeviceTree and therefore allows live migrating any vDPA
device that supports being suspended.

Given a vDPA device can't be resumed from a suspended state without
having to reset everything, we don't support pause/resume for a vDPA
device, as well as snapshot/restore (which requires resume to be
supported).

In order for the migration to work locally, reusing the same device on
the same host machine, the vhost-vdpa handler is dropped after the
snapshot has been performed, which allows the destination VM to open the
device without any conflict about the device being busy.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-10-13 10:03:23 +02:00
Rob Bradford
1bc63e7848 vmm: Remove legacy I/O ports for ACPI
These addresses have been superseded and replaced with other I/O ports.

Fixes: #4483

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-10-03 17:08:57 +01:00
Sebastien Boeuf
903c08f8a1 net: Don't override default TAP interface MTU
Adjust MTU logic such that:
1. Apply an MTU to the TAP interface if the user supplies it
2. Always query the TAP interface for the MTU and expose that.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-09-27 10:37:35 +01:00
Rob Bradford
b2d1dd65f3 build: Remove "fwdebug" and "common" feature flags
This simplifes the buld and checks with very little overhead and the
fwdebug device is I/O port device on 0x402 that can be used by edk2 as a
very simple character device.

See: #4679

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-09-26 10:16:33 -07:00
Rob Bradford
1202b9a07a vmm: Add some tracing of boot sequence
Add tracing of the VM boot sequence from the point at which the request
to create a VM is received to the hand-off to the vCPU threads running.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-09-22 18:09:31 +01:00
Sebastien Boeuf
76dbf85b79 net: Give the user the ability to set MTU
Add a new "mtu" parameter to the NetConfig structure and therefore to
the --net option. This allows Cloud Hypervisor's users to define the
Maximum Transmission Unit (MTU) they want to use for the network
interface that they create.

In details, there are two main aspects. On the one hand, the TAP
interface is created with the proper MTU if it is provided. And on the
other hand the guest is made aware of the MTU through the VIRTIO
configuration. That means the MTU is properly set on both the TAP on the
host and the network interface in the guest.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-09-21 16:20:57 +02:00
Sebastien Boeuf
f38056fc9e virtio-devices, vmm: Simplify virtio-mem resize operation
There's no need to delegate the resize operation to the virtio-mem
thread. This can come directly from the vmm thread which will use the
Mem object to update the VIRTIO configuration and trigger the interrupt
for the guest to be notified.

In order to achieve what's described above, the VirtioMemZone structure
now has a handle onto the Mem object directly. This avoids the need for
intermediate Resize and ResizeSender structures.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-09-20 13:43:40 +02:00
Nuno Das Neves
784a3aaf3c devices: gic: use VgicConfig everywhere
Use VgicConfig to initialize Vgic.
Use Gic::create_default_config everywhere so we don't always recompute
redist/msi registers.
Add a helper create_test_vgic_config for tests in hypervisor crate.

Signed-off-by: Nuno Das Neves <nudasnev@microsoft.com>
2022-08-31 08:33:05 +01:00
Michael Zhao
b65639fad3 vmm:AArch64: move uefi_flash to memory manager
uefi_flash is used when load firmware, that is load payload depends on
device manager. move uefi_flash to memory manager can eliminate the
dependency.

Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-08-31 08:32:08 +01:00
Sebastien Boeuf
8c02648ac9 vmm: device_manager: Update virtio-console for proper PTY support
Given the virtio-console is now able to buffer its output when no PTY is
connected on the other end, the device manager code is updated to enable
this. Moving the endpoint type from FilePair to PtyPair enables the
proper codepath in the virtio-console implementation, as well as
updating the PTY resize code, and forcing the PTY to always be
non-blocking.

The non-blocking behavior is required to avoid blocking the guest that
would be waiting on the virtio-console driver. When receiving an
EWOULDBLOCK error, the output will simply be redirected to the temporary
buffer so that it can be later flushed.

The PTY resize logic has been slightly modified to ensure the PTY file
descriptors are closed. It avoids the child process to keep a hold onto
the PTY device, which would have caused the PTY to believe something is
connected on the other end, which would have prevented the detection of
any new connection on the PTY.

Fixes #4521

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-08-30 13:47:51 +02:00
Sebastien Boeuf
cdcd4d259e vmm: serial: Wait for PTY to be available before writing to it
The goal of this patch is to provide a reliable way to detect when the
other end of the PTY is connected, and therefore be able to identify
when we can write to the PTY device. This is needed because writing to
the PTY device when the other end isn't connected causes the loss of
the written bytes.

The way to detect the connection on the other end of the PTY is by
knowing the other end is disconnected at first with the presence of the
EPOLLHUP event. Later on, when the connection happens, EPOLLHUP is not
triggered anymore, and that's when we can assume it's okay to write to
the PTY main device.

It's important to note we had to ensure the file descriptor for the
other end was closed, otherwise we would have never seen the EPOLLHUP
event. And we did so by removing the "sub" field from the PtyPair
structure as it was keeping the associated File opened.

Fixes #3170

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-08-19 14:39:06 +01:00
Sebastien Boeuf
98f949d35d vmm: Add new I/O ports for ACPI shutdown and PM timer devices
Adding new I/O ports for both the ACPI shutdown and the ACPI PM timer
devices so they can be triggered from both addresses. The reason for
this change is that TDX expects only certain I/O ports to be enabled
based on what QEMU exposes. We follow this to avoid new ports from being
opened exclusively for Cloud Hypervisor.

We have to keep the former I/O ports available given all firmwares
haven't been updated yet. Once we reach a point where we know both Rust
Hypervisor Firmware, OVMF, TDVF and TDSHIM have been updated with the
new port values, we'll be able to remove the former ports.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-08-11 11:46:09 +01:00
Rob Bradford
2e8eb96ef6 vmm: device_manager: Store ACPI platform addresses for later use
These are ready for inclusion in the FACP table.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-07-25 16:16:06 +01:00
Wei Liu
ad33f7c5e6 vmm: return seccomp rules according to hypervisors
That requires stashing the hypervisor type into various places.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-22 12:50:12 +01:00
Wei Liu
a96a5d7816 hypervisor, vmm: use new vfio-ioctls
Use the new vfio-ioctls APIs. Drop Cloud Hypervisor's Device trait
since it is no longer needed.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-21 23:37:53 +01:00
Wei Liu
0e8769d76a device_manager: assert passthrough_device has the correct type
There is a lot of unsafe code in such a small function. Add an assert
to help detect issues earlier.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-07-14 08:09:50 +01:00
Sebastien Boeuf
81ba70a497 pci, vmm: Defer mapping VFIO MMIO regions on restore
When restoring a VM, the restore codepath will take care of mapping the
MMIO regions based on the information from the snapshot, rather than
having the mapping being performed during device creation.

When the device is created, information such as which BARs contain the
MSI-X tables are missing, preventing to perform the mapping of the MMIO
regions.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-06-09 09:19:58 +02:00
Sebastien Boeuf
7df7061610 pci, vmm: Add migratable support to vfio-user devices
Based on recent changes to VfioUserPciDevice, the vfio-user devices can
now be migrated.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-06-09 09:19:58 +02:00
Sebastien Boeuf
c021dda267 pci, vmm: Add migratable support to VFIO devices
Based on recent changes to VfioPciDevice, the VFIO devices can now be
migrated.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-06-09 09:19:58 +02:00
Michael Zhao
957d3a7443 aarch64: Simplify GIC related structs definition
Combined the `GicDevice` struct in `arch` crate and the `Gic` struct in
`devices` crate.

After moving the KVM specific code for GIC in `arch`, a very thin wapper
layer `GicDevice` was left in `arch` crate. It is easy to combine it
with the `Gic` in `devices` crate.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-06-06 10:17:26 +08:00
Michael Zhao
04949755c0 arch: Switch to new GIC interface
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-06-06 10:17:26 +08:00
Rob Bradford
ade3a9c8f6 virtio-devices, vmm: Optimised async virtio device activation
In order to ensure that the virtio device thread is spawned from the vmm
thread we use an asynchronous activation mechanism for the virtio
devices. This change optimises that code so that we do not need to
iterate through all virtio devices on the platform in order to find the
one that requires activation. We solve this by creating a separate short
lived VirtioPciDeviceActivator that holds the required state for the
activation (e.g. the clones of the queues) this can then be stored onto
the device manager ready for asynchronous activation.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-06-01 09:42:02 +02:00
Rob Bradford
979797786d vmm: Remove DAX cache setup for virtio-fs devices
Remove the code from the DeviceManager that prepares the DAX cache since
the functionality has now been removed.

Fixes: #3889

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-27 09:47:13 +02:00
Michael Zhao
3fe20cc09a aarch64: Remove GicDevice trait
`GicDevice` trait was defined for the common part of GicV3 and ITS.
Now that the standalone GicV3 do not exist, `GicDevice` is not needed.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-05-27 10:57:50 +08:00
Rob Bradford
fa07d83565 Revert "virtio-devices, vmm: Optimised async virtio device activation"
This reverts commit f160572f9d.

There has been increased flakiness around the live migration tests since
this was merged. Speculatively reverting to see if there is increased
stability.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-21 21:27:33 +01:00
Rob Bradford
f160572f9d virtio-devices, vmm: Optimised async virtio device activation
In order to ensure that the virtio device thread is spawned from the vmm
thread we use an asynchronous activation mechanism for the virtio
devices. This change optimises that code so that we do not need to
iterate through all virtio devices on the platform in order to find the
one that requires activation. We solve this by creating a separate short
lived VirtioPciDeviceActivator that holds the required state for the
activation (e.g. the clones of the queues) this can then be stored onto
the device manager ready for asynchronous activation.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-20 17:07:13 +01:00
Maksym Pavlenko
3a0429c998 cargo: Clean up serde dependencies
There is no need to include serde_derive separately,
as it can be specified as serde feature instead.

Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2022-05-18 08:21:19 +02:00
Rob Bradford
d3f66f8702 hypervisor: Make vm module private
And thus only export what is necessary through a `pub use`. This is
consistent with some of the other modules and makes it easier to
understand what the external interface of the hypervisor crate is.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-13 15:39:22 +02:00
Rob Bradford
b1bd87df19 vmm: Simplify MsiInterruptManager generics
By taking advantage of the fact that IrqRoutingEntry is exported by the
hypervisor crate (that is typedef'ed to the hypervisor specific version)
then the code for handling the MsiInterruptManager can be simplified.

This is particularly useful if in this future it is not a typedef but
rather a wrapper type.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-11 11:19:14 +01:00
Rob Bradford
c2c813599d vmm: Don't use kvm_ioctls directly
The IoEventAddress is re-exported through the crate at the top-level.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-10 15:57:43 +01:00
Sebastien Boeuf
058a61148c vmm: Factorize net creation
Since both Net and vhost_user::Net implement the Migratable trait, we
can factorize the common part to simplify the code related to the net
creation.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-05-05 13:08:41 +02:00
Sebastien Boeuf
425902b296 vmm: Factorize disk creation
Since both Block and vhost_user::Blk implement the Migratable trait, we
can factorize the common part to simplify the code related to the disk
creation.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-05-05 13:08:41 +02:00
Rob Bradford
707cea2182 vmm, devices: Move logging of 0x80 timestamp to its own device
This is a cleaner approach to handling the I/O port write to 0x80.
Whilst doing this also use generate the timestamp at the start of the VM
creation. For consistency use the same timestamp for the ARM equivalent.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-04 23:02:53 +01:00
Bo Chen
7fe399598d vmm: device_manager: Map MMIO regions to the guest correctly
To correctly map MMIO regions to the guest, we will need to wait for valid
MMIO region information which is generated from 'PciDevice::allocate_bars()'
(as a part of 'DeviceManager::add_pci_device()').

Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-05-04 13:53:47 +02:00
Rob Bradford
1dfe4eda5c vmm: Prevent "internal" identifiers being used by user
For devices that cannot be named by the user use the "__" prefix to
identify them as internal devices. Check that any identifiers provided
in the config do not clash with those internal names. This prevents the
user from creating a disk such as "__serial" which would then cause a
failure in unpredictable manner.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-04 12:34:11 +02:00
Sebastien Boeuf
6e101f479c vmm: Ensure hotplugged device identifier is unique
Whenever a device (virtio, vfio, vfio-user or vdpa) is hotplugged, we
must verify the provided identifier is unique, otherwise we must return
an error.

Particularly, this will prevent issues with identifiers for serial,
console, IOAPIC, balloon, rng, watchdog, iommu and gpio since all of
these are hardcoded by the VMM.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-05-03 18:34:24 +01:00
Rob Bradford
6d4862245d vmm: Generate event when device is removed
The new event contains the BDF and the device id:

{
  "timestamp": {
    "secs": 2,
    "nanos": 731073396
  },
  "source": "vm",
  "event": "device-removed",
  "properties": {
    "bdf": "0000:00:02.0",
    "id": "test-disk"
  }
}

Fixes: #4038

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-03 17:10:36 +02:00
Sebastien Boeuf
677c8831af vmm: Ensure uniqueness of generated identifiers
The device identifiers generated from the DeviceManager were not
guaranteed to be unique since they were not taking the list of
identifiers provided through the configuration.

By returning the list of unique identifiers from the configuration, and
by providing it to the DeviceManager, the generation of new identifiers
can rely both on the DeviceTree and the list of IDs from the
configuration.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-05-02 13:26:15 +02:00
Rob Bradford
f1276c58d2 vmm: Commandline inject from devices is aarch64 specific
This is not required for x86_64 and maintains a tight coupling between
kernel loading and the DeviceManager.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-04-29 11:03:38 +01:00
Rob Bradford
da33eb5e8c vmm: device_manager: Remove extra whitespace lines
These originated from the removal of the acpi feature gate.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-04-29 11:03:38 +01:00
Sebastien Boeuf
eb6daa2fc3 pci: Store MSI interrupt manager in VfioCommon
Extend VfioCommon structure to own the MSI interrupt manager. This will
be useful for implementing the restore code path.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-04-22 16:16:48 +02:00
Sebastien Boeuf
11e9f43305 vmm: Use new Resource type PciBar
Instead of defining some very generic resources as PioAddressRange or
MmioAddressRange for each PCI BAR, let's move to the new Resource type
PciBar in order to make things clearer. This allows the code for being
more readable, but also removes the need for hard assumptions about the
MMIO and PIO ranges. PioAddressRange and MmioAddressRange types can be
used to describe everything except PCI BARs. BARs are very special as
they can be relocated and have special information we want to carry
along with them.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-04-19 12:54:09 -07:00
Sebastien Boeuf
89218b6d1e pci: Replace BAR tuple with PciBarConfiguration
In order to make the code more consistent and easier to read, we remove
the former tuple that was used to describe a BAR, replacing it with the
existing structure PciBarConfiguration.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-04-19 12:54:09 -07:00
Sebastien Boeuf
5264d545dd pci, vmm: Extend PciDevice trait to support BAR relocation
By adding a new method id() to the PciDevice trait, we allow the caller
to retrieve a unique identifier. This is used in the context of BAR
relocation to identify the device being relocated, so that we can update
the DeviceTree resources for all PCI devices (and not only
VirtioPciDevice).

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-04-14 12:11:37 +02:00
Sebastien Boeuf
0c34846ef6 vmm: Return new PCI resources from add_pci_device()
By returning the new PCI resources from add_pci_device(), we allow the
factorization of the code translating the BARs into resources. This
allows VIRTIO, VFIO and vfio-user to add the resources to the DeviceTree
node.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-04-14 12:11:37 +02:00
Sebastien Boeuf
4f172ae4b6 vmm: Retrieve PCI resources for VFIO and vfio-user devices
Relying on the function introduced recently to get the PCI resources and
handle the restore case, both VFIO and vfio-user device creation paths
now have access to PCI resources, which can be provided to the function
add_pci_device().

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-04-14 12:11:37 +02:00
Sebastien Boeuf
0f12fe9b3b vmm: Factorize retrieval of PCI resources
Create a dedicated function for getting the PCI segment, b/d/f and
optional resources. This is meant for handling the potential case of a
restore.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-04-14 12:11:37 +02:00
Sebastien Boeuf
6e084572d4 pci, virtio: Make virtio-pci BAR restoration more generic
Updating the way of restoring BAR addresses for virtio-pci by providing
a more generic approach that will be reused for other PciDevice
implementations (i.e VfioPcidevice and VfioUserPciDevice).

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-04-14 12:11:37 +02:00
Rob Bradford
b212f2823d vmm: Deprecate mergeable option from virtio-pmem
KSM would never merge the file backed pages so this option has no
effect.

See: #3968

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-04-12 07:12:25 -07:00
Michael Zhao
d1b2a3fca9 aarch64: Add a memory-simulated flash for UEFI
EDK2 execution requires a flash device at address 0.

The new added device is not a fully functional flash. It doesn't
implement any spec of a flash device. Instead, a piece of memory is used
to simulate the flash simply.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-04-11 09:51:34 +01:00
Michael Zhao
656425a328 aarch64: Align the data types in layout
Some addresses defined in `layout.rs` were of type `GuestAddress`, and
are `u64`. Now align the types of all the `*_START` definitions to
`GuestAddress`.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-04-08 11:08:43 -07:00
Sebastien Boeuf
e76a5969e8 vmm: Add iommu parameter to VdpaConfig
Add a new iommu parameter to VdpaConfig in order to place the vDPA
device behind a virtual IOMMU.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-04-05 00:09:52 +02:00
Sebastien Boeuf
3c973fa7ce virtio-devices: vhost-user: Add support for TDX
By enabling the VIRTIO feature VIRTIO_F_IOMMU_PLATFORM for all
vhost-user devices when needed, we force the guest to use the DMA API,
making these devices compatible with TDX. By using DMA API, the guest
triggers the TDX codepath to share some of the guest memory, in
particular the virtqueues and associated buffers so that the VMM and
vhost-user backends/processes can access this memory.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-03-30 10:32:23 +02:00
Rob Bradford
ca68b9e7a9 build: Remove "cmos" feature gate
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-03-29 15:20:58 +01:00
Rob Bradford
e0d3efec6e devices: cmos: Implement CMOS based reset
If EFI reset fails on the Linux kernel then it will fallthrough to CMOS
reset. Implement this as one of our reset solutions.

Fixes: #3912

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-03-29 15:20:58 +01:00
Rob Bradford
7c0cf8cc23 arch, devices, vmm: Remove "acpi" feature gate
Compile this feature in by default as it's well supported on both
aarch64 and x86_64 and we only officially support using it (no non-acpi
binaries are available.)

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-03-28 09:18:29 -07:00
Sebastien Boeuf
afd9f17b73 virtio-fs: Deprecate the DAX feature
Disable the DAX feature from the virtio-fs implementation as the feature
is still not stable. The feature is deprecated, meaning the 'dax'
parameter will be removed in about 2 releases cycles.

In the meantime, the parameter value is ignored and forced to be
disabled.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-03-24 10:39:11 -07:00
Rob Bradford
7a8061818e vmm: Don't expose MemoryManager ACPI functionality unless required
When running non-dynamic or with virtio-mem for hotplug the ACPI
functionality should not be included on the DSDT nor does the
MemoryManager need to be placed on the MMIO bus.

Fixes: #3883

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-03-24 13:17:51 +00:00
Rob Bradford
1756b23aea vmm: device_manager: Check IOMMU placed device hotplug
Rather than just printing a message return an error back through the API
if the user attempts to hotplug a device that supports being behind an
IOMMU where that device isn't placed on an IOMMU segment.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-03-22 11:13:20 +00:00
Rob Bradford
6d2224f1ba vmm: device_manager: Create IOMMU mapping for hotplugged virtio devices
Previously it was not possible to enable vIOMMU for a virtio device.
However with the ability to place an entire PCI segment behind the
IOMMU the IOMMU mapping needs to be setup for the virtio device if it is
behind the IOMMU.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-03-22 11:13:20 +00:00
Sebastien Boeuf
3fea5f5396 vmm: Add support for hotplugging a vDPA device
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-03-18 12:28:40 +01:00
Sebastien Boeuf
c73c6039c3 vmm: Enable vDPA support
Based on the newly added Vdpa device along with the new vdpa parameter,
this patch enables the support for vDPA devices.

It's important to note this the only virtio device for which we provide
an ExternalDmaMapping instance. This will allow for the right DMA ranges
to be mapped/unmapped.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-03-18 12:28:40 +01:00
Sebastien Boeuf
9d46890dc0 vmm: device_manager: Make virtio DMA mapping conditional on vIOMMU
In case the virtio device which requires DMA mapping is placed behind a
virtual IOMMU, we shouldn't map/unmap any region manually. Instead, we
provide the DMA handler to the virtio-iommu device so that it can
trigger the proper mappings.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-03-11 12:37:17 +01:00
Sebastien Boeuf
a4f742277b vmm: device_manager: Handle DMA mapping for virtio devices
If a virtio device is associated with a DMA handler, the DMA mapping and
unmapping is performed from the device manager through the handler.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-03-11 12:37:17 +01:00
Sebastien Boeuf
86bc313f38 virtio-devices, vmm: Register a DMA handler to VirtioPciDevice
Given that some virtio device might need some DMA handling, we provide a
way to store this through the VirtioPciDevice layer, so that it can be
accessed when the PCI device is removed.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-03-11 12:37:17 +01:00
Sebastien Boeuf
54d63e774c vmm: device_manager: Extend MetaVirtioDevice with a DMA handler
In anticipation for handling potential DMA mapping/unmapping operations for a
virtio device, we extend the MetaVirtioDevice with an additional field
that holds an optional DMA handler.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-03-11 12:37:17 +01:00
Sebastien Boeuf
f801b0fc72 vmm: device_manager: Factorize virtio device tuple into structure
The tuple of information related to each virtio device is too big, and
it's better to factorize it through a dedicated structure.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-03-11 12:37:17 +01:00
Sebastien Boeuf
80296b9497 vmm: device_manager: Remove typedef VirtioDeviceArc
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-03-11 12:37:17 +01:00
Wei Liu
4cf22e4ec7 arch: do not hardcode MMIO region length in MmioDeviceInfo
Add a field for its length and fix up users.

Things work just because all hardcoded values agree with each other.
This is prone to breakage.

No functional change.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-03-04 15:21:48 +08:00
Sebastien Boeuf
42b5d4a2f7 pci, vmm: Update DeviceNode to store PciBdf instead of u32
By having the DeviceNode storing a PciBdf, we simplify the internal code
as well as allow for custom Serialize/Deserialize implementation for the
PciBdf structure. These custom implementations let us display the PCI
s/b/d/f in a human readable format.

Fixes #3711

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-02-16 11:57:23 +00:00
Rob Bradford
20b9f95afd vmm: Attach all devices from specified segments to the IOMMU
Since the devices behind the IOMMU cannot be changed at runtime we offer
the ability to place all devices on user chosen segments behind the
IOMMU. This allows the hotplugging of devices behind the IOMMU provided
that they are assigned to a segment that is located behind the iommu.

Fixes: #911

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-02-11 11:20:04 +00:00
Sebastien Boeuf
052f38fa96 vmm: Enable guest to report free pages through virtio-balloon
Adding a new parameter free_page_reporting=on|off to the balloon device
so that we can enable the corresponding feature from virtio-balloon.

Running a VM with a balloon device where this feature is enabled allows
the guest to report pages that are free from guest's perspective. This
information is used by the VMM to release the corresponding pages on the
host.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-02-11 12:10:07 +01:00
lizhaoxin1
a45e458c50 vm-migration: Add start_migration() to Migratable trait
In order to clearly decouple when the migration is started compared to
when the dirty logging is started, we introduce a new method to the
Migratable trait. This clarifies the semantics as we don't end up using
start_dirty_log() for identifying when the migration has been started.
And similarly, we rely on the already existing complete_migration()
method to know when the migration has been ended.

A bug was reported when running a local migration with a vhost-user-net
device in server mode. The reason was because the migration_started
variable was never set to "true", since the start_dirty_log() function
was never invoked.

Signed-off-by: lizhaoxin1 <Lxiaoyouling@163.com>
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-02-03 13:33:26 +01:00
Sebastien Boeuf
8eed276d14 vm-virtio: Define AccessPlatform trait
Moving the whole codebase to rely on the AccessPlatform definition from
vm-virtio so that we can fully remove it from virtio-queue crate.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-01-27 10:00:20 +00:00
Anatol Belski
e2a8a1483f acpi: aarch64: Implement DBG2 table
This table is listed as required in the ARM Base Boot Requirements
document. The particular need arises to make the serial debugging of
Windows guest functional.

Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
2022-01-20 09:11:21 +08:00
Michael Zhao
1db7718589 pci, vmm: Pass PCI BDF to vfio and vfio_user
On AArch64, PCI BDF is used for devId in MSI-X routing entry.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-01-18 18:00:00 -08:00
Wei Liu
277cfd07ba device_manager: use if let to drop single match
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-01-18 17:23:27 -08:00
Rob Bradford
9ef1187f4a vmm, pci: Fix potential deadlock in PCI BAR allocation
The allocator is locked by both the BAR allocation code and the
interrupt allocation code. Resulting in a potential lock inversion
error.

WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock) (pid=26318)
  Cycle in lock order graph: M87 (0x7b0c00001e30) => M28 (0x7b0c00001830) => M87

  Mutex M28 acquired here while holding mutex M87 in thread T1:
    #0 pthread_mutex_lock /rustc/llvm/src/llvm-project/compiler-rt/lib/tsan/../sanitizer_common/sanitizer_common_interceptors.inc:4249:3 (cloud-hypervisor+0x9c368)
    #1 std::sys::unix::mutex::Mutex:🔒:hcd1b9aa06ff775d3 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/mutex.rs:63:17 (cloud-hypervisor+0x663954)
    #2 std::sys_common::mutex::MovableMutex::raw_lock::hff98d0b036469bca /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/mutex.rs:76:18 (cloud-hypervisor+0x491bae)
    #3 std::sync::mutex::Mutex$LT$T$GT$:🔒:hc61622e5536f5b72 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sync/mutex.rs:267:13 (cloud-hypervisor+0x36d07b)
    #4 _$LT$vmm..interrupt..MsiInterruptManager$LT$kvm_bindings..x86..bindings..kvm_irq_routing_entry$GT$$u20$as$u20$vm_device..interrupt..InterruptManager$GT$::create_group::hd412b5e1e8eeacc2 /home/rob/src/cloud-hypervisor/vmm/src/interrupt.rs:310:29 (cloud-hypervisor+0x6d1403)
    #5 virtio_devices::transport::pci_device::VirtioPciDevice:🆕:h3af603c3f00f4b3d /home/rob/src/cloud-hypervisor/virtio-devices/src/transport/pci_device.rs:376:38 (cloud-hypervisor+0x8e6137)
    #6 vmm::device_manager::DeviceManager::add_virtio_pci_device::h23608151d7668a1c /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:3333:37 (cloud-hypervisor+0x3b6339)
    #7 vmm::device_manager::DeviceManager::add_pci_devices::h136cc20cbeb6b977 /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:1236:30 (cloud-hypervisor+0x390aad)
    #8 vmm::device_manager::DeviceManager::create_devices::h29fc5b8a20e1aea5 /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:1155:9 (cloud-hypervisor+0x38f48c)
    #9 vmm::vm::Vm:🆕:h43efe7c6cd97ede5 /home/rob/src/cloud-hypervisor/vmm/src/vm.rs:799:9 (cloud-hypervisor+0x334641)
    #10 vmm::Vmm::vm_boot::h06bdf54b95d5e14f /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:379:26 (cloud-hypervisor+0x2e5ba8)
    #11 vmm::Vmm::control_loop::h40c9b48c7b800bed /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:1299:48 (cloud-hypervisor+0x2f44e0)
    #12 vmm::start_vmm_thread::_$u7b$$u7b$closure$u7d$$u7d$::h016d2f7cff698175 /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:263:17 (cloud-hypervisor+0x1dda20)
    #13 std::sys_common::backtrace::__rust_begin_short_backtrace::h7fd2df3e7cfba503 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/backtrace.rs:123:18 (cloud-hypervisor+0x659bca)
    #14 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h89880b05fe892d7e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:477:17 (cloud-hypervisor+0x44100e)
    #15 _$LT$core..panic..unwind_safe..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h487382524d80571f /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/panic/unwind_safe.rs:271:9 (cloud-hypervisor+0x6dda5e)
    #16 std::panicking::try::do_call::h1d9c2ccdc39f3322 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:406:40 (cloud-hypervisor+0x6d36d1)
    #17 __rust_try 3hkmq3dzyyv5ejsx (cloud-hypervisor+0x6d4718)
    #18 std::panicking::try::h251306df23d21913 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:370:19 (cloud-hypervisor+0x6d2459)
    #19 std::panic::catch_unwind::h2a9ac2fb12c3c64e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panic.rs:133:14 (cloud-hypervisor+0x57bdce)
    #20 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::h10f4c340611b55e4 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:476:30 (cloud-hypervisor+0x43f913)
    #21 core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hdd9b37241caf97b3 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/ops/function.rs:227:5 (cloud-hypervisor+0x3cf3f5)
    #22 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::ha5022a6bb7833f62 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d492)
    #23 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h481697829cbc6746 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d492)
    #24 std::sys::unix:🧵:Thread:🆕:thread_start::h6fad62c4c393bbe7 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/thread.rs:108:17 (cloud-hypervisor+0x119d492)

  Mutex M87 previously acquired by the same thread here:
    #0 pthread_mutex_lock /rustc/llvm/src/llvm-project/compiler-rt/lib/tsan/../sanitizer_common/sanitizer_common_interceptors.inc:4249:3 (cloud-hypervisor+0x9c368)
    #1 std::sys::unix::mutex::Mutex:🔒:h9a2d3e97e05c6430 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/mutex.rs:63:17 (cloud-hypervisor+0x9ea344)
    #2 std::sys_common::mutex::MovableMutex::raw_lock::h8abb3b5cf55c0264 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/mutex.rs:76:18 (cloud-hypervisor+0x96face)
    #3 std::sync::mutex::Mutex$LT$T$GT$:🔒:hecec128d40c6dd44 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sync/mutex.rs:267:13 (cloud-hypervisor+0x97120a)
    #4 virtio_devices::transport::pci_device::VirtioPciDevice:🆕:h3af603c3f00f4b3d /home/rob/src/cloud-hypervisor/virtio-devices/src/transport/pci_device.rs:356:29 (cloud-hypervisor+0x8e5c0e)
    #5 vmm::device_manager::DeviceManager::add_virtio_pci_device::h23608151d7668a1c /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:3333:37 (cloud-hypervisor+0x3b6339)
    #6 vmm::device_manager::DeviceManager::add_pci_devices::h136cc20cbeb6b977 /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:1236:30 (cloud-hypervisor+0x390aad)
    #7 vmm::device_manager::DeviceManager::create_devices::h29fc5b8a20e1aea5 /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:1155:9 (cloud-hypervisor+0x38f48c)
    #8 vmm::vm::Vm:🆕:h43efe7c6cd97ede5 /home/rob/src/cloud-hypervisor/vmm/src/vm.rs:799:9 (cloud-hypervisor+0x334641)
    #9 vmm::Vmm::vm_boot::h06bdf54b95d5e14f /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:379:26 (cloud-hypervisor+0x2e5ba8)
    #10 vmm::Vmm::control_loop::h40c9b48c7b800bed /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:1299:48 (cloud-hypervisor+0x2f44e0)
    #11 vmm::start_vmm_thread::_$u7b$$u7b$closure$u7d$$u7d$::h016d2f7cff698175 /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:263:17 (cloud-hypervisor+0x1dda20)
    #12 std::sys_common::backtrace::__rust_begin_short_backtrace::h7fd2df3e7cfba503 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/backtrace.rs:123:18 (cloud-hypervisor+0x659bca)
    #13 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h89880b05fe892d7e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:477:17 (cloud-hypervisor+0x44100e)
    #14 _$LT$core..panic..unwind_safe..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h487382524d80571f /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/panic/unwind_safe.rs:271:9 (cloud-hypervisor+0x6dda5e)
    #15 std::panicking::try::do_call::h1d9c2ccdc39f3322 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:406:40 (cloud-hypervisor+0x6d36d1)
    #16 __rust_try 3hkmq3dzyyv5ejsx (cloud-hypervisor+0x6d4718)
    #17 std::panicking::try::h251306df23d21913 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:370:19 (cloud-hypervisor+0x6d2459)
    #18 std::panic::catch_unwind::h2a9ac2fb12c3c64e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panic.rs:133:14 (cloud-hypervisor+0x57bdce)
    #19 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::h10f4c340611b55e4 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:476:30 (cloud-hypervisor+0x43f913)
    #20 core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hdd9b37241caf97b3 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/ops/function.rs:227:5 (cloud-hypervisor+0x3cf3f5)
    #21 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::ha5022a6bb7833f62 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d492)
    #22 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h481697829cbc6746 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d492)
    #23 std::sys::unix:🧵:Thread:🆕:thread_start::h6fad62c4c393bbe7 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/thread.rs:108:17 (cloud-hypervisor+0x119d492)

  Mutex M87 acquired here while holding mutex M28 in thread T1:
    #0 pthread_mutex_lock /rustc/llvm/src/llvm-project/compiler-rt/lib/tsan/../sanitizer_common/sanitizer_common_interceptors.inc:4249:3 (cloud-hypervisor+0x9c368)
    #1 std::sys::unix::mutex::Mutex:🔒:h9a2d3e97e05c6430 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/mutex.rs:63:17 (cloud-hypervisor+0x9ea344)
    #2 std::sys_common::mutex::MovableMutex::raw_lock::h8abb3b5cf55c0264 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/mutex.rs:76:18 (cloud-hypervisor+0x96face)
    #3 std::sync::mutex::Mutex$LT$T$GT$:🔒:hecec128d40c6dd44 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sync/mutex.rs:267:13 (cloud-hypervisor+0x97120a)
    #4 _$LT$virtio_devices..transport..pci_device..VirtioPciDevice$u20$as$u20$pci..device..PciDevice$GT$::allocate_bars::h39dc42b48fc8264c /home/rob/src/cloud-hypervisor/virtio-devices/src/transport/pci_device.rs:850:22 (cloud-hypervisor+0x8eb1a4)
    #5 vmm::device_manager::DeviceManager::add_pci_device::h561f6c8ed61db117 /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:3087:20 (cloud-hypervisor+0x3b0c62)
    #6 vmm::device_manager::DeviceManager::add_virtio_pci_device::h23608151d7668a1c /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:3359:20 (cloud-hypervisor+0x3b6707)
    #7 vmm::device_manager::DeviceManager::add_pci_devices::h136cc20cbeb6b977 /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:1236:30 (cloud-hypervisor+0x390aad)
    #8 vmm::device_manager::DeviceManager::create_devices::h29fc5b8a20e1aea5 /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:1155:9 (cloud-hypervisor+0x38f48c)
    #9 vmm::vm::Vm:🆕:h43efe7c6cd97ede5 /home/rob/src/cloud-hypervisor/vmm/src/vm.rs:799:9 (cloud-hypervisor+0x334641)
    #10 vmm::Vmm::vm_boot::h06bdf54b95d5e14f /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:379:26 (cloud-hypervisor+0x2e5ba8)
    #11 vmm::Vmm::control_loop::h40c9b48c7b800bed /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:1299:48 (cloud-hypervisor+0x2f44e0)
    #12 vmm::start_vmm_thread::_$u7b$$u7b$closure$u7d$$u7d$::h016d2f7cff698175 /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:263:17 (cloud-hypervisor+0x1dda20)
    #13 std::sys_common::backtrace::__rust_begin_short_backtrace::h7fd2df3e7cfba503 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/backtrace.rs:123:18 (cloud-hypervisor+0x659bca)
    #14 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h89880b05fe892d7e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:477:17 (cloud-hypervisor+0x44100e)
    #15 _$LT$core..panic..unwind_safe..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h487382524d80571f /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/panic/unwind_safe.rs:271:9 (cloud-hypervisor+0x6dda5e)
    #16 std::panicking::try::do_call::h1d9c2ccdc39f3322 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:406:40 (cloud-hypervisor+0x6d36d1)
    #17 __rust_try 3hkmq3dzyyv5ejsx (cloud-hypervisor+0x6d4718)
    #18 std::panicking::try::h251306df23d21913 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:370:19 (cloud-hypervisor+0x6d2459)
    #19 std::panic::catch_unwind::h2a9ac2fb12c3c64e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panic.rs:133:14 (cloud-hypervisor+0x57bdce)
    #20 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::h10f4c340611b55e4 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:476:30 (cloud-hypervisor+0x43f913)
    #21 core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hdd9b37241caf97b3 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/ops/function.rs:227:5 (cloud-hypervisor+0x3cf3f5)
    #22 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::ha5022a6bb7833f62 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d492)
    #23 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h481697829cbc6746 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d492)
    #24 std::sys::unix:🧵:Thread:🆕:thread_start::h6fad62c4c393bbe7 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/thread.rs:108:17 (cloud-hypervisor+0x119d492)

  Mutex M28 previously acquired by the same thread here:
    #0 pthread_mutex_lock /rustc/llvm/src/llvm-project/compiler-rt/lib/tsan/../sanitizer_common/sanitizer_common_interceptors.inc:4249:3 (cloud-hypervisor+0x9c368)
    #1 std::sys::unix::mutex::Mutex:🔒:hcd1b9aa06ff775d3 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/mutex.rs:63:17 (cloud-hypervisor+0x663954)
    #2 std::sys_common::mutex::MovableMutex::raw_lock::hff98d0b036469bca /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/mutex.rs:76:18 (cloud-hypervisor+0x491bae)
    #3 std::sync::mutex::Mutex$LT$T$GT$:🔒:hc61622e5536f5b72 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sync/mutex.rs:267:13 (cloud-hypervisor+0x36d07b)
    #4 vmm::device_manager::DeviceManager::add_pci_device::h561f6c8ed61db117 /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:3091:22 (cloud-hypervisor+0x3b0a95)
    #5 vmm::device_manager::DeviceManager::add_virtio_pci_device::h23608151d7668a1c /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:3359:20 (cloud-hypervisor+0x3b6707)
    #6 vmm::device_manager::DeviceManager::add_pci_devices::h136cc20cbeb6b977 /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:1236:30 (cloud-hypervisor+0x390aad)
    #7 vmm::device_manager::DeviceManager::create_devices::h29fc5b8a20e1aea5 /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:1155:9 (cloud-hypervisor+0x38f48c)
    #8 vmm::vm::Vm:🆕:h43efe7c6cd97ede5 /home/rob/src/cloud-hypervisor/vmm/src/vm.rs:799:9 (cloud-hypervisor+0x334641)
    #9 vmm::Vmm::vm_boot::h06bdf54b95d5e14f /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:379:26 (cloud-hypervisor+0x2e5ba8)
    #10 vmm::Vmm::control_loop::h40c9b48c7b800bed /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:1299:48 (cloud-hypervisor+0x2f44e0)
    #11 vmm::start_vmm_thread::_$u7b$$u7b$closure$u7d$$u7d$::h016d2f7cff698175 /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:263:17 (cloud-hypervisor+0x1dda20)
    #12 std::sys_common::backtrace::__rust_begin_short_backtrace::h7fd2df3e7cfba503 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/backtrace.rs:123:18 (cloud-hypervisor+0x659bca)
    #13 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h89880b05fe892d7e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:477:17 (cloud-hypervisor+0x44100e)
    #14 _$LT$core..panic..unwind_safe..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h487382524d80571f /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/panic/unwind_safe.rs:271:9 (cloud-hypervisor+0x6dda5e)
    #15 std::panicking::try::do_call::h1d9c2ccdc39f3322 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:406:40 (cloud-hypervisor+0x6d36d1)
    #16 __rust_try 3hkmq3dzyyv5ejsx (cloud-hypervisor+0x6d4718)
    #17 std::panicking::try::h251306df23d21913 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:370:19 (cloud-hypervisor+0x6d2459)
    #18 std::panic::catch_unwind::h2a9ac2fb12c3c64e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panic.rs:133:14 (cloud-hypervisor+0x57bdce)
    #19 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::h10f4c340611b55e4 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:476:30 (cloud-hypervisor+0x43f913)
    #20 core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hdd9b37241caf97b3 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/ops/function.rs:227:5 (cloud-hypervisor+0x3cf3f5)
    #21 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::ha5022a6bb7833f62 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d492)
    #22 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h481697829cbc6746 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d492)
    #23 std::sys::unix:🧵:Thread:🆕:thread_start::h6fad62c4c393bbe7 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/thread.rs:108:17 (cloud-hypervisor+0x119d492)

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-06 09:59:36 +01:00
Rob Bradford
a749063c8a vmm: Don't assume that resize_pipe is initialised
If the underlying kernel is old PTY resize is disabled and this is
represented by the use of None in the provided Option<File> type. In the
virtio-console PTY path don't blindly unwrap() the value that will be
preserved across a reboot.

Fixes: #3496

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-04 12:04:50 +00:00
Rob Bradford
afe386bc13 vmm: Only warn on error when setting up SIGWINCH handler
Setting up the SIGWINCH handler requires at least Linux 5.7. However
this functionality is not required for basic PTY operation.

Fixes: #3456

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-12-14 13:05:09 +01:00
Rob Bradford
50f5f43ae3 vmm: acpi: Make MBRD _CRS multi-segment aware
Advertise the PCI MMIO config spaces here so that the MMIO config space
is correctly recognised.

Tested by: --platform num_pci_segments=1 or 16 hotplug NVMe vfio-user device
works correctly with hypervisor-fw & OVMF and direct kernel boot.

Fixes: #3432

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-12-08 14:38:30 +00:00
Rob Bradford
e1c09b66ba vmm: Replace device tree value when restoring DeviceManager
When restoring replace the internal value of the device tree rather than
replacing the Arc<Mutex<DeviceTree>> itself. This is fixes an issue
where the AddressManager has a copy of the the original
Arc<Mutex<DeviceTree>> from when the DeviceManager was created. The
original restore path only replaced the DeviceManager's version of the
Arc<Mutex<DeviceTree>>. Instead replace the contents of the
Arc<Mutex<DeviceTree>> so all users see the updated version.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-12-06 15:58:37 +00:00
Henry Wang
07bef815cc aarch64: Introduce struct PciSpaceInfo for FDT
Currently, a tuple containing PCI space start address and PCI space
size is used to pass the PCI space information to the FDT creator.
In order to support the multiple PCI segment for FDT, more information
such as the PCI segment ID should be passed to the FDT creator. If we
still use a tuple to store these information, the code flexibility and
readablity will be harmed.

To address this issue, this commit replaces the tuple containing the
PCI space information to a structure `PciSpaceInfo` and uses a vector
of `PciSpaceInfo` to store PCI space information for each segment, so
that multiple PCI segment information can be passed to the FDT together.

Note that the scope of this commit will only contain the refactor of
original code, the actual multiple PCI segments support will be in
following series, and for now `--platform num_pci_segments` should only
be 1.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-12-06 09:29:49 +00:00
Ziye Yang
51cfffd24f vmm: Make the comments consistent in 'DeviceManager'
Change  "Failed xxing" to "Failed to xx", then
we can only we one style.

Signed-off-by: Ziye Yang <ziye.yang@intel.com>
2021-11-19 08:43:23 +00:00
Bo Chen
2a312cd4fe vmm: Fix a comment typo from 'DeviceManager'
Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-11-18 12:00:39 -08:00
Wei Liu
9b3cab8c72 device_manager: check return value of dup(2)
That function call can return -1 when it fails. Wrapping -1 into File
causes the code to panic when the File is dropped.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2021-11-17 23:12:11 +00:00
Wei Liu
84630aa0b5 device_manager: provide a few safety comments
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2021-11-17 23:12:11 +00:00
Alyssa Ross
ad8ed80eb1 vmm: use the tty raw mode implementation from libc
I encountered some trouble trying to use a virtio-console hooked up to
a PTY.  Reading from the PTY would produce stuff like this
"\n\nsh-5.1# \n\nsh-5.1# " (where I'm just pressing enter at a shell
prompt), and a terminal would render that like this:

----------------------------------------------------------------

sh-5.1#

       sh-5.1#
----------------------------------------------------------------

This was because we weren't disabling the ICRNL termios iflag, which
turns carriage returns (\r) into line feeds (\n).  Other raw mode
implementations (like QEMU's) set this flag, and don't have this
problem.

Instead of fixing our raw mode implementation to just disable ICRNL,
or copy the flags from QEMU's, though, here I've changed it to use the
raw mode implementation in libc.  It seems to work correctly in my
testing, and means we don't have to worry about what exactly raw mode
looks like under the hood any more.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2021-11-17 14:41:00 +00:00
Rob Bradford
3480e69ff5 vmm: Cache whether io_uring is supported in DeviceManager
Probing for whether the io_uring is supported is time consuming so cache
this value if it is known to reduce the cost for secondary block devices
that are added.

Before:

cloud-hypervisor: 3.988896ms: <vmm> INFO:vmm/src/device_manager.rs:1901 -- Creating virtio-block device: DiskConfig { path: Some("/home/rob/workloads/focal-server-cloudimg-amd64-custom-20210609-0.raw"), readonly: false, direct: false, iommu: false, num_queues: 1, queue_size: 128, vhost_user: false, vhost_socket: None, poll_queue: true, rate_limiter_config: None, id: Some("_disk0"), disable_io_uring: false, pci_segment: 0 }
cloud-hypervisor: 14.129591ms: <vmm> INFO:vmm/src/device_manager.rs:1983 -- Using asynchronous RAW disk file (io_uring)
cloud-hypervisor: 14.159853ms: <vmm> INFO:vmm/src/device_manager.rs:1901 -- Creating virtio-block device: DiskConfig { path: Some("/tmp/disk"), readonly: false, direct: false, iommu: false, num_queues: 1, queue_size: 128, vhost_user: false, vhost_socket: None, poll_queue: true, rate_limiter_config: None, id: Some("_disk1"), disable_io_uring: false, pci_segment: 0 }
cloud-hypervisor: 22.110281ms: <vmm> INFO:vmm/src/device_manager.rs:1983 -- Using asynchronous RAW disk file (io_uring)

After:

cloud-hypervisor: 4.880411ms: <vmm> INFO:vmm/src/device_manager.rs:1916 -- Creating virtio-block device: DiskConfig { path: Some("/home/rob/workloads/focal-server-cloudimg-amd64-custom-20210609-0.raw"), readonly: false, direct: false, iommu: false, num_queues: 1, queue_size: 128, vhost_user: false, vhost_socket: None, poll_queue: true, rate_limiter_config: None, id: Some("_disk0"), disable_io_uring: false, pci_segment: 0 }
cloud-hypervisor: 14.105123ms: <vmm> INFO:vmm/src/device_manager.rs:1998 -- Using asynchronous RAW disk file (io_uring)
cloud-hypervisor: 14.134837ms: <vmm> INFO:vmm/src/device_manager.rs:1916 -- Creating virtio-block device: DiskConfig { path: Some("/tmp/disk"), readonly: false, direct: false, iommu: false, num_queues: 1, queue_size: 128, vhost_user: false, vhost_socket: None, poll_queue: true, rate_limiter_config: None, id: Some("_disk1"), disable_io_uring: false, pci_segment: 0 }
cloud-hypervisor: 14.221869ms: <vmm> INFO:vmm/src/device_manager.rs:1998 -- Using asynchronous RAW disk file (io_uring)

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-12 18:09:55 +00:00
Rob Bradford
d96d98d88e vmm: Port DeviceManager to Aml::append_aml_bytes()
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-08 16:46:30 +00:00
Rob Bradford
def98faf37 vmm, vm-allocator: Introduce an allocator for platform devices
This allocator allocates 64-bit MMIO addresses for use with platform
devices e.g. ACPI control devices and ensures there is no overlap with
PCI address space ranges which can cause issues with PCI device
remapping.

Use this allocator the ACPI platform devices.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
1a5a89508b vmm: Remove segment_id from DeviceNode
With the segment id now encoded in the bdf it is not necessary to have
the separate field for it.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
ae83e3b383 vmm: Use PciBdf throughout in order to remove manual bit manipulation
In particular use the accessor for getting the device id from the bdf.
As a side effect the VIOT table is now segment aware.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00