The PCI buses should not declare the address space related to the MMIO
config space given it's already declared in the MCFG table and through
the motherboard device PNP0C02 in the DSDT table.
The PCI MMIO config region for the segment was being wrongly exposed as
part of the _CRS for the ACPI bus device (using Memory32Fixed). Exposing
it via this object was ineffectual as the equivalent entry in the
PNP0C02 (_SB_.MBRD) marked those ranges as not usable via the kernel.
Either way, with both devices used by the kernel, the kernel will not
try and use those memory ranges for the device BARs. However under
td-shim on TDX the PNP0C02 device is not on the permitted list of
devices so the the memory ranges were not marked as unusable resulting
in the kernel attempting to allocate BARs that collided with the PCI
MMIO configuration space.
This is based on the kernel documentation PCI/acpi-info.rst which relies
on ACPI and PCI Firmware specifications. And here are the interesting
quotes from this document:
"""
Prior to the addition of Extended Address Space descriptors, the failure
of Consumer/Producer meant there was no way to describe bridge registers
in the PNP0A03/PNP0A08 device itself. The workaround was to describe the
bridge registers (including ECAM space) in PNP0C02 catch-all devices.
With the exception of ECAM, the bridge register space is device-specific
anyway, so the generic PNP0A03/PNP0A08 driver (pci_root.c) has no need
to know about it.
PNP0C02 “motherboard” devices are basically a catch-all. There’s no
programming model for them other than “don’t use these resources for
anything else.” So a PNP0C02 _CRS should claim any address space that is
(1) not claimed by _CRS under any other device object in the ACPI
namespace and (2) should not be assigned by the OS to something else.
The address range reported in the MCFG table or by _CBA method (see
Section 4.1.3) must be reserved by declaring a motherboard resource. For
most systems, the motherboard resource would appear at the root of the
ACPI namespace (under _SB) in a node with a _HID of EISAID (PNP0C02),
and the resources in this case should not be claimed in the root PCI
bus’s _CRS. The resources can optionally be returned in Int15 E820 or
EFIGetMemoryMap as reserved memory but must always be reported through
ACPI as a motherboard resource.
"""
This change has been manually tested by running a VM with multiple
segments (4 segments), and by hotplugging an additional disk to the
segment number 2 (third segment).
From one shell:
"""
cloud-hypervisor \
--cpus boot=1 \
--memory size=1G \
--kernel vmlinux \
--cmdline "root=/dev/vda1 rw console=hvc0" \
--disk path=jammy-server-cloudimg.raw \
--api-socket /tmp/ch.sock \
--platform num_pci_segments=4
"""
From another shell (after the VM is booted):
"""
ch-remote \
--api-socket=/tmp/ch.sock \
add-disk \
path=test-disk.raw,id=disk2,pci_segment=2
"""
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
As 'handle_child_output()' may terminate the test on panic, we need to
cleanup ovs-dpdk setup in advance.
see: #4555
Signed-off-by: Bo Chen <chen.bo@intel.com>
Instead of always fuzzing virt-queues with default values (mostly 0s),
the fuzzer now initializes the virt-queue based on the fuzzed input
bytes, such as the tail position of the available ring, queue size
selected by driver, descriptor table address, available ring address,
used ring address, etc. In this way, the fuzzer can explore the
virtio-block code path with various virt-queue setup.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Use VgicConfig to initialize Vgic.
Use Gic::create_default_config everywhere so we don't always recompute
redist/msi registers.
Add a helper create_test_vgic_config for tests in hypervisor crate.
Signed-off-by: Nuno Das Neves <nudasnev@microsoft.com>
VgicConfig structure will be used for initializing the Vgic.
Gic::create_default_config will be used everywhere we currently compute
redist/msi registers.
Signed-off-by: Nuno Das Neves <nudasnev@microsoft.com>
AArch64 can share the same way of loading payload with x86_64. It makes
the payload loading more consistent between different arches.
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
uefi_flash is used when load firmware, that is load payload depends on
device manager. move uefi_flash to memory manager can eliminate the
dependency.
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
A new firmware item has been added into payload config, we need
extend ability to load standalone firmware on AArch64.
"load_kernel" method will be the entry of image loading work including
kernel and firmware.
This change is back compatible. So, we can either load firmware through
"-kernel" like before or "-firmware".
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Later, we will load standalone firmware. So, refactor load_kernel
by abstracting load_firmware method.
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
It provides fuzzer a reliable way to wait for a sequence of events
to complete for virtio-devices while not using a fixed timeout to
maintain the full speed of fuzzing.
Take virtio-block as an example, the 'queue event' with a valid
available queue setup can trigger a 'completion event'. This is a
meaningful virtio-block code path of processing guest inputs which is
our target for fuzzing virtio devices.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Fetch the whole git repository (not just the specific commit) and use
the github context instead of hardcoded branch.
Unfortunately now that we process the list of revisions correctly it
shows that the checks don't work on aarch64 due to cross limitations so
this has been removed.
Fixes: #4523
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Given the virtio-console is now able to buffer its output when no PTY is
connected on the other end, the device manager code is updated to enable
this. Moving the endpoint type from FilePair to PtyPair enables the
proper codepath in the virtio-console implementation, as well as
updating the PTY resize code, and forcing the PTY to always be
non-blocking.
The non-blocking behavior is required to avoid blocking the guest that
would be waiting on the virtio-console driver. When receiving an
EWOULDBLOCK error, the output will simply be redirected to the temporary
buffer so that it can be later flushed.
The PTY resize logic has been slightly modified to ensure the PTY file
descriptors are closed. It avoids the child process to keep a hold onto
the PTY device, which would have caused the PTY to believe something is
connected on the other end, which would have prevented the detection of
any new connection on the PTY.
Fixes#4521
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Through multiple changes, this patch aims at providing a reliable
solution for detecting the state of the PTY's connection. Being able to
find out when the other end of the PTY is connected is essential to
prevent the loss of data being output through the PTY. When the PTY
isn't connected, the output is buffered through the SerialBuffer, the
same solution that was created for the serial port initially.
Fixes#4521
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Extending and improving both the structure and the trait allows for more
flexibility regarding what can be achieved with the epoll loop. It
allows for a timeout to be configured instead of the default blocking
behavior. There is a new method in the trait to notify the caller that
the timeout has been reached. And there's a new knob to be notified with
the full list of events before the internal code will actually loop over
every event.
All of these new features are not affecting the previous behavior, and
using EpollHelper::run() should be unchanged.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
We want to be able to reuse the SerialBuffer from the virtio-devices
crate, particularly from the virtio-console implementation. That's why
we move the SerialBuffer definition to its own crate so that it can be
accessed from both vmm and virtio-devices crates, without creating any
cyclic dependency.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
If the epoll_wait() call returns EINTR, this only means a signal has
been delivered before any of the file descriptors registered triggered
an event or before the end of the timeout (if timeout isn't -1). For
that reason, we should simply try to listen on the epoll loop again.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
We must limit how much the buffer can grow, otherwise this could lead
the process to consume all the memory on the machine. This could happen
if the output from the guest was very important and nothing would
connect to the PTY for a long time.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Set the maximum number of HW breakpoints according to the value returned
from `Hypervisor::get_guest_debug_hw_bps()`.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
Added `Hypervisor::get_guest_debug_hw_bps()` for fetching the number of
supported hardware breakpoints.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>