Use the ACPI feature to control whether to build the mptable. This is
necessary as the mptable and ACPI RSDP table can easily overwrite each
other leading to it failing to boot.
TEST=Compile with default features and see that --cpus boot=48 now
works, try with --no-default-features --features "pci" and observe the
--cpus boot=48 also continues to work.
Fixes: #1132
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The setup_mptables() call which is not used on ACPI builds has a side
effect of testing whether there was enough RAM which one of the unit
tests was relying on. Add a similar check for the RSDP address.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Fill and write to guest memory the necessary boot module
structure to allow a guest using the PVH boot protocol
to load an initramfs image.
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Commit 2adddce2 reorganized the crate for a cleaner multi architecture
(x86_64 and aarch64) support.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
* load the initramfs File into the guest memory, aligned to page size
* finally setup the initramfs address and its size into the boot params
(in configure_64bit_boot)
Signed-off-by: Damjan Georgievski <gdamjan@gmail.com>
We set it to 0xff, which is for unregistered loaders.
The kernel checks that the bootloader ID is set when e.g. loading
ramdisks, so not setting it when we get a bootparams header from the
loader will prevent the kernel from loading ramdisks.
Fixes: #918
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Validate correct GDT entries, initial segment configuration, and control
register bits that are required by PVH boot protocol.
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Expand the unit tests to cover the configure_system() code when
using the PVH boot protocol. Verify the method for adding memory
map table entries in the format specified by PVH boot protocol.
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Fill the hvm_start_info and related memory map structures as
specified in the PVH boot protocol. Write the data structures
to guest memory at the GPA that will be stored in %rbx when
the guest starts.
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
In order to properly initialize the kvm regs/sregs structs for
the guest, the load_kernel() return type must specify which
boot protocol to use with the entry point address it returns.
Make load_kernel() return an EntryPoint struct containing the
required information. This structure will later be used
in the vCPU configuration methods to setup the appropriate
initial conditions for the guest.
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Create supporting definitions to use the hvm start info and memory
map table entry struct definitions from the linux-loader crate in
order to enable PVH boot protocol support
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Since the kvm crates now depend on vmm-sys-util, the bump must be
atomic.
The kvm-bindings and ioctls 0.2.0 and 0.4.0 crates come with a few API
changes, one of them being the use of a kvm_ioctls specific error type.
Porting our code to that type makes for a fairly large diff stat.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Remove ACPI table creation from arch crate to the vmm crate simplifying
arch::configure_system()
GuestAddress(0) is used to mean no RSDP table rather than adding
complexity with a conditional argument or an Option type as it will
evaluate to a zero value which would be the default anyway.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Add basic processor details to the DSDT table. The code has to be
slightly convoluted (with the second pass over the cpu_devices vector)
in order to keep the objects alive long enough in order to be able to
take their reference.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
We need to rely on the latest kvm-ioctls version to benefit from the
recent addition of unregister_ioevent(), allowing us to detach a
previously registered eventfd to a PIO or MMIO guest address.
Because of this update, we had to modify the current constraint we had
on the vmm-sys-util crate, using ">= 0.1.1" instead of being strictly
tied to "0.2.0".
Once the dependency conflict resolved, this commit took care of fixing
build issues caused by recent modification of kvm-ioctls relying on
EventFd reference instead of RawFd.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
To avoid a clash with to_bytes() for the unsigned integer types that is
coming in a future release.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This was verified by comparing the ASL from disassembling the DSDT
before and after. All the individual AML components themselves are also
unit tested.
Fixes: #352
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The virtual IOMMU exposed through virtio-iommu device has a dependency
on ACPI. It needs to expose the device ID of the virtio-iommu device,
and all the other devices attached to this virtual IOMMU. The IDs are
expressed from a PCI bus perspective, based on segment, bus, device and
function.
The guest relies on the topology description provided by the IORT table
to attach devices to the virtio-iommu device.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The PCI Express Firmware specification says that the region may
be included in the E820 tables (but it must always be in the ACPI
tables.)
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The PCI Express Firmware spec says that the region to be used for PCI
MMCONFIG should be reserved as part of the motherboard's resources in
the ACPI tables.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The PCI MMCONFIG area must be below 4GiB and must not be part of the
device space. Shrink the device area and put the PCI MMCONFIG region
above it.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Patch the table with the currently used constants. This will be relevant
when we want to adjust the size of the PCI device area to accomodate the
PCI MMCONFIG region.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
These are part of RAM and are used as the initial page table entries for
booting the OS and firmware (identity mapping.)
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Using the existing layout module start documenting the major regions of
RAM and those areas that are reserved. Some of the constants have also
been renamed to be more consistent and some functions that returned
constant variables have been replaced.
Future commits will move more constants into this file to make it the
canonical source of information about the memory layout.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The last byte was missing from the E820 RAM area. This was due to the
function using the last address relative to the first address in the
range to calculate the size. This incorrectly calculated the size by
one. This produced incorrect E820 tables like this:
[ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009ffff] usable
[ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000001ffffffe] usable
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
After the 32-bit gap the memory is shared between the devices and the
RAM. Ensure that the ACPI tables correctly indicate where the RAM ends
and the device area starts by patching the precompiled tables. We get
the following valid output now from the PCI bus probing (8GiB guest)
[ 0.317757] pci_bus 0000:00: resource 4 [io 0x0000-0x0cf7 window]
[ 0.319035] pci_bus 0000:00: resource 5 [io 0x0d00-0xffff window]
[ 0.320215] pci_bus 0000:00: resource 6 [mem 0x000a0000-0x000bffff window]
[ 0.321431] pci_bus 0000:00: resource 7 [mem 0xc0000000-0xfebfffff window]
[ 0.322613] pci_bus 0000:00: resource 8 [mem 0x240000000-0xfffffffff window]
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
There was an off-by-error in the result making the hole one byte too
big and ending at an address too high.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The starting length of the MCFG table was too long resulting in the
kernel trying to get extra MCFG entries from the table that weren't
there resulting in the following error message from the kernel:
PCI: no memory for MCFG entries
The MCFG table also has an 8 bytes of padding at the start before the
table begins.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The command "cargo build --no-default-features" does not recursively
disable the default features across the workspace. Instead add an acpi
feature at the top-level, making it default, and then make that feature
conditional on all the crate acpi features.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Put the ACPI support behind a feature and ensure that the code compiles
without that feature by adding an extra build to Travis.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Add an I/O port "device" to handle requests from the kernel to shutdown
or trigger a reboot, borrowing an I/O used for ACPI on the Q35 platform.
The details of this I/O port are included in the FADT
(SLEEP_STATUS_REG/SLEEP_CONTROL_REG/RESET_REG) with the details of the
value to write in the FADT for reset (RESET_VALUE) and in the DSDT for
shutdown (S5 -> 0x05)
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The DSDT must declare the interrupt used by the serial device. This
helps the guest kernel matching the right interrupt to the 8250 serial
device. This is mandatory in case the IRQ routing is handled by ACPI, as
we must let ACPI know what do do with pin based interrupts.
One thing to notice, if we were using acpi=noirq from the kernel command
line, this would mean ACPI is not in charge of the IRQ routing, and the
device COM1 declaration would not be needed.
One additional requirement is to provide the appropriate interrupt
source override for the legacy ISA interrupts (0-15), which will give
the right information to the guest kernel about how to allocate the
associated IRQs.
Because we want to keep the MADT as simple as possible, and given that
our only device requiring pin based interrupt is the serial device, we
choose to only define the pin 4.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>