Advertise the PCI MMIO config spaces here so that the MMIO config space
is correctly recognised.
Tested by: --platform num_pci_segments=1 or 16 hotplug NVMe vfio-user device
works correctly with hypervisor-fw & OVMF and direct kernel boot.
Fixes: #3432
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
When restoring replace the internal value of the device tree rather than
replacing the Arc<Mutex<DeviceTree>> itself. This is fixes an issue
where the AddressManager has a copy of the the original
Arc<Mutex<DeviceTree>> from when the DeviceManager was created. The
original restore path only replaced the DeviceManager's version of the
Arc<Mutex<DeviceTree>>. Instead replace the contents of the
Arc<Mutex<DeviceTree>> so all users see the updated version.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The observation is that the code in question was used to bridge
synchronized and asynchronized code.
We can group the functions for that purpose under an adaptor trait. To
limit the scope of locking, the users of the trait are required to
implement a method to return a MutexGuard for the underlying file.
This then allows us to use concrete types (QcowFile and Vhdx) in code,
which is easier to read than a bunch of traits.
No functional change intended.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
Saving the KVM clock and restoring it is key for correct behaviour of
the VM when doing snapshot/restore or live migration. The clock is
restored to the KVM state as part of the Vm::resume() method prior to
that it must be extracted from the state object and stored for later use
by this method. This change simplifies the extraction and storage part
so that it is done in the same way for both snapshot/restore and live
migration.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This commit enhances the integration test for multiple PCI segments
by:
(1) Enables the `test_virtio_fs_multi_segment` on AArch64.
(2) Adds a new integration test case for both x86_64 and AArch64 using
the direct kernel boot to test virtio-disk multiple PCI segments.
The test case does:
- Start a VM using direct kernel boot with 16 PCI segments and assign
the last PCI segment with a virtio-disk device.
- Check if the number of PCI host bridges equals to 16 after VM boots.
- Mount the virtio-disk device on the last PCI segment to the rootfs
and write/read data to the virtio-disk device.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
Each PCI device under a root complex is uniquely identified by its
Requester ID (AKA RID). A Requester ID is a triplet of a Bus number,
Device number, and Function number.
MSIs may be distinguished in part through the use of sideband data
accompanying writes. In the case of PCI devices, this sideband data
may be derived from the Requester ID. A mechanism is required to
associate a device with both the MSI controllers it can address,
and the sideband data that will be associated with its writes to
those controllers.
This commit adds the `msi-map` property for PCI nodes, therefore
creating MSI mapping for each PCI device.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
This commit rewrites the `create_pci_node` in the FDT creator to
create multiple PCI nodes based on the vector of `PciSpaceInfo`,
and each PCI node in FDT reflects a PCI segment.
- The PCI MMIO config space, 32 bits PCI device space and 64 bits
PCI device space is re-calculated based on the `PciSpaceInfo` for
each PCI segment.
- A new FDT property `linux,pci-domain` is added.
- The virtio-iommu node is only created for the first PCI segment.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
The constant `PCI_MMIO_CONFIG_SIZE` defined in `vmm/pci_segment.rs`
describes the MMIO configuation size for each PCI segment. However,
this name conflicts with the `PCI_MMCONFIG_SIZE` defined in `layout.rs`
in the `arch` crate, which describes the memory size of the PCI MMIO
configuration region.
Therefore, this commit renames the `PCI_MMIO_CONFIG_SIZE` to
`PCI_MMIO_CONFIG_SIZE_PER_SEGMENT` and moves this constant from `vmm`
crate to `arch` crate.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
Currently, a tuple containing PCI space start address and PCI space
size is used to pass the PCI space information to the FDT creator.
In order to support the multiple PCI segment for FDT, more information
such as the PCI segment ID should be passed to the FDT creator. If we
still use a tuple to store these information, the code flexibility and
readablity will be harmed.
To address this issue, this commit replaces the tuple containing the
PCI space information to a structure `PciSpaceInfo` and uses a vector
of `PciSpaceInfo` to store PCI space information for each segment, so
that multiple PCI segment information can be passed to the FDT together.
Note that the scope of this commit will only contain the refactor of
original code, the actual multiple PCI segments support will be in
following series, and for now `--platform num_pci_segments` should only
be 1.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
Previously mutex (semaphore) and file were separated. The code needed to
create artificial scopes to use mutex to protect file.
Rewrite the code to be idiomatic. The file itself is turned into a trait
object and placed inside the mutex. This requires providing a new
ReadWriteSeekFile trait to unify all helper functions.
The rewrite further simplified vhdx_sync code. The original code
contained two mutex'es for no apparent reason.
No functional change intended.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
Extending the test_simple_launch() integration test to validate Cloud
Hypervisor boots correctly with both rust-hypervisor-fw and OVMF on
x86_64 platforms.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Bumping the OVMF binary version along with UEFI documentation to
reflect the latest set of patches on top of tianocore/edk2 'master'
branch, which can be found on the Cloud Hypervisor fork on 'ch' branch.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In order to avoid the identity map region to conflict with a possible
firmware being placed in the last 4MiB of the 4GiB range, we must set
the address to a chosen location. And it makes the most sense to have
this region placed right after the TSS region.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Extending the Vm trait with set_identity_map_address() in order to
expose this ioctl to the VMM.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Commit ac2517217634ca8b89e441f5769ab6b916868ee1 bumps the rust
version of virtiofsd named `virtiofsd-rs`, which causes a warning
```
warning: use of deprecated parameter '--socket':
Please use the '--socket-path' option instead.
```
This commit updates the cmdline parameter accordingly.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
Place the 3 page TSS at an explicit location in the 32-bit address space
to avoid conflicting with the loaded raw firmware.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Reduce the size of the reserved 32-bit address space to the range used
by both the PCI MMIO config data and the 32-bit PCI device space.
This avoids issues when using firmware that is loaded into the very top
of the 32-bit address space as the RAM conflicts with the reserved
memory.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Current `Getting Started` section only contains steps for the x86_64
platform, as we have a documentation doing the same thing for AArch64,
we can point users to the correct documentation.
Also, this commit modifies the `docs/arm64.md` to fit the documentation
style within the project.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
This test is flaky (#3400) while we are experiencing a bug of using the latest
SPDK/NVMe backend as VFIO user device (#3401). Let's disable this test
before we fix the above two issues.
Signed-off-by: Bo Chen <chen.bo@intel.com>
For now we only enable the vfio-user test on x86_64 platform, as we have
a known hanging issue to resovle on the aarch64 platform.
Fixes: #3098
Signed-off-by: Bo Chen <chen.bo@intel.com>
Enabling these configs can avoid systemd errors related to Device Mapper
multipath while guest booting. Especially, the guest can hang when being
used with an NVMe backend without these configs (#3352).
Signed-off-by: Bo Chen <chen.bo@intel.com>
This kernel config is needed to fix the observed guest hanging issue
cased by systemd crash while booting.
Fixes: #3352
Signed-off-by: Bo Chen <chen.bo@intel.com>
Added fields:
- `Memory address size limit`: the missing of this field triggered
warnings in guest kernel
- `Node ID`
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
After introducing multiple PCI segments, the `devid` value in
`kvm_irq_routing_entry` exceeds the maximum supported range on AArch64.
This commit restructed the `devid` to the allowed range.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
The register D has only one bit that is not reserved, and its purpose is
to report if the RTC/CMOS device is powered or not.
The OVMF firmware was failing to boot as it was getting the information
that the device was powered off from the register D.
The simple way to fix this issue is by always returning the bit 7 from
register D as 1, indicating the device is always powered.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
If the provided binary isn't an ELF binary assume that it is a firmware
to be loaded in directly. In this case we shouldn't program any of the
registers as KVM starts in that state.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>