The goal of this patch is to provide a reliable way to detect when the
other end of the PTY is connected, and therefore be able to identify
when we can write to the PTY device. This is needed because writing to
the PTY device when the other end isn't connected causes the loss of
the written bytes.
The way to detect the connection on the other end of the PTY is by
knowing the other end is disconnected at first with the presence of the
EPOLLHUP event. Later on, when the connection happens, EPOLLHUP is not
triggered anymore, and that's when we can assume it's okay to write to
the PTY main device.
It's important to note we had to ensure the file descriptor for the
other end was closed, otherwise we would have never seen the EPOLLHUP
event. And we did so by removing the "sub" field from the PtyPair
structure as it was keeping the associated File opened.
Fixes#3170
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Adding new I/O ports for both the ACPI shutdown and the ACPI PM timer
devices so they can be triggered from both addresses. The reason for
this change is that TDX expects only certain I/O ports to be enabled
based on what QEMU exposes. We follow this to avoid new ports from being
opened exclusively for Cloud Hypervisor.
We have to keep the former I/O ports available given all firmwares
haven't been updated yet. Once we reach a point where we know both Rust
Hypervisor Firmware, OVMF, TDVF and TDSHIM have been updated with the
new port values, we'll be able to remove the former ports.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
When restoring a VM, the restore codepath will take care of mapping the
MMIO regions based on the information from the snapshot, rather than
having the mapping being performed during device creation.
When the device is created, information such as which BARs contain the
MSI-X tables are missing, preventing to perform the mapping of the MMIO
regions.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Combined the `GicDevice` struct in `arch` crate and the `Gic` struct in
`devices` crate.
After moving the KVM specific code for GIC in `arch`, a very thin wapper
layer `GicDevice` was left in `arch` crate. It is easy to combine it
with the `Gic` in `devices` crate.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
In order to ensure that the virtio device thread is spawned from the vmm
thread we use an asynchronous activation mechanism for the virtio
devices. This change optimises that code so that we do not need to
iterate through all virtio devices on the platform in order to find the
one that requires activation. We solve this by creating a separate short
lived VirtioPciDeviceActivator that holds the required state for the
activation (e.g. the clones of the queues) this can then be stored onto
the device manager ready for asynchronous activation.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Remove the code from the DeviceManager that prepares the DAX cache since
the functionality has now been removed.
Fixes: #3889
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
`GicDevice` trait was defined for the common part of GicV3 and ITS.
Now that the standalone GicV3 do not exist, `GicDevice` is not needed.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
This reverts commit f160572f9d.
There has been increased flakiness around the live migration tests since
this was merged. Speculatively reverting to see if there is increased
stability.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
In order to ensure that the virtio device thread is spawned from the vmm
thread we use an asynchronous activation mechanism for the virtio
devices. This change optimises that code so that we do not need to
iterate through all virtio devices on the platform in order to find the
one that requires activation. We solve this by creating a separate short
lived VirtioPciDeviceActivator that holds the required state for the
activation (e.g. the clones of the queues) this can then be stored onto
the device manager ready for asynchronous activation.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
There is no need to include serde_derive separately,
as it can be specified as serde feature instead.
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
And thus only export what is necessary through a `pub use`. This is
consistent with some of the other modules and makes it easier to
understand what the external interface of the hypervisor crate is.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
By taking advantage of the fact that IrqRoutingEntry is exported by the
hypervisor crate (that is typedef'ed to the hypervisor specific version)
then the code for handling the MsiInterruptManager can be simplified.
This is particularly useful if in this future it is not a typedef but
rather a wrapper type.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Since both Net and vhost_user::Net implement the Migratable trait, we
can factorize the common part to simplify the code related to the net
creation.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Since both Block and vhost_user::Blk implement the Migratable trait, we
can factorize the common part to simplify the code related to the disk
creation.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This is a cleaner approach to handling the I/O port write to 0x80.
Whilst doing this also use generate the timestamp at the start of the VM
creation. For consistency use the same timestamp for the ARM equivalent.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
To correctly map MMIO regions to the guest, we will need to wait for valid
MMIO region information which is generated from 'PciDevice::allocate_bars()'
(as a part of 'DeviceManager::add_pci_device()').
Signed-off-by: Bo Chen <chen.bo@intel.com>
For devices that cannot be named by the user use the "__" prefix to
identify them as internal devices. Check that any identifiers provided
in the config do not clash with those internal names. This prevents the
user from creating a disk such as "__serial" which would then cause a
failure in unpredictable manner.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Whenever a device (virtio, vfio, vfio-user or vdpa) is hotplugged, we
must verify the provided identifier is unique, otherwise we must return
an error.
Particularly, this will prevent issues with identifiers for serial,
console, IOAPIC, balloon, rng, watchdog, iommu and gpio since all of
these are hardcoded by the VMM.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The device identifiers generated from the DeviceManager were not
guaranteed to be unique since they were not taking the list of
identifiers provided through the configuration.
By returning the list of unique identifiers from the configuration, and
by providing it to the DeviceManager, the generation of new identifiers
can rely both on the DeviceTree and the list of IDs from the
configuration.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This is not required for x86_64 and maintains a tight coupling between
kernel loading and the DeviceManager.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Extend VfioCommon structure to own the MSI interrupt manager. This will
be useful for implementing the restore code path.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Instead of defining some very generic resources as PioAddressRange or
MmioAddressRange for each PCI BAR, let's move to the new Resource type
PciBar in order to make things clearer. This allows the code for being
more readable, but also removes the need for hard assumptions about the
MMIO and PIO ranges. PioAddressRange and MmioAddressRange types can be
used to describe everything except PCI BARs. BARs are very special as
they can be relocated and have special information we want to carry
along with them.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In order to make the code more consistent and easier to read, we remove
the former tuple that was used to describe a BAR, replacing it with the
existing structure PciBarConfiguration.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
By adding a new method id() to the PciDevice trait, we allow the caller
to retrieve a unique identifier. This is used in the context of BAR
relocation to identify the device being relocated, so that we can update
the DeviceTree resources for all PCI devices (and not only
VirtioPciDevice).
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
By returning the new PCI resources from add_pci_device(), we allow the
factorization of the code translating the BARs into resources. This
allows VIRTIO, VFIO and vfio-user to add the resources to the DeviceTree
node.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Relying on the function introduced recently to get the PCI resources and
handle the restore case, both VFIO and vfio-user device creation paths
now have access to PCI resources, which can be provided to the function
add_pci_device().
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Create a dedicated function for getting the PCI segment, b/d/f and
optional resources. This is meant for handling the potential case of a
restore.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Updating the way of restoring BAR addresses for virtio-pci by providing
a more generic approach that will be reused for other PciDevice
implementations (i.e VfioPcidevice and VfioUserPciDevice).
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
EDK2 execution requires a flash device at address 0.
The new added device is not a fully functional flash. It doesn't
implement any spec of a flash device. Instead, a piece of memory is used
to simulate the flash simply.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
Some addresses defined in `layout.rs` were of type `GuestAddress`, and
are `u64`. Now align the types of all the `*_START` definitions to
`GuestAddress`.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
Add a new iommu parameter to VdpaConfig in order to place the vDPA
device behind a virtual IOMMU.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
By enabling the VIRTIO feature VIRTIO_F_IOMMU_PLATFORM for all
vhost-user devices when needed, we force the guest to use the DMA API,
making these devices compatible with TDX. By using DMA API, the guest
triggers the TDX codepath to share some of the guest memory, in
particular the virtqueues and associated buffers so that the VMM and
vhost-user backends/processes can access this memory.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
If EFI reset fails on the Linux kernel then it will fallthrough to CMOS
reset. Implement this as one of our reset solutions.
Fixes: #3912
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Compile this feature in by default as it's well supported on both
aarch64 and x86_64 and we only officially support using it (no non-acpi
binaries are available.)
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Disable the DAX feature from the virtio-fs implementation as the feature
is still not stable. The feature is deprecated, meaning the 'dax'
parameter will be removed in about 2 releases cycles.
In the meantime, the parameter value is ignored and forced to be
disabled.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
When running non-dynamic or with virtio-mem for hotplug the ACPI
functionality should not be included on the DSDT nor does the
MemoryManager need to be placed on the MMIO bus.
Fixes: #3883
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Rather than just printing a message return an error back through the API
if the user attempts to hotplug a device that supports being behind an
IOMMU where that device isn't placed on an IOMMU segment.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Previously it was not possible to enable vIOMMU for a virtio device.
However with the ability to place an entire PCI segment behind the
IOMMU the IOMMU mapping needs to be setup for the virtio device if it is
behind the IOMMU.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>