Instead of passing the GuestMemoryMmap directly to the CpuManager upon
its creation, it's better to pass a reference to the MemoryManager. This
way we will be able to know if an SGX EPC region along with one or multiple
sections are present.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Based on the presence of one or multiple SGX EPC sections from the VM
configuration, the MemoryManager will allocate a contiguous block of
guest address space to hold the entire EPC region. Within this EPC
region, each EPC section is memory mapped.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This commit stores the balloon size in MemoryConfig.
After reboot, virtio-balloon can use this size to inflate back to
the size before reboot.
Signed-off-by: Hui Zhu <teawater@antfin.com>
In order to move the hypervisor specific parts of the VM exit handling
path, we're defining a generic, hypervisor agnostic VM exit enum.
This is what the hypervisor's Vcpu run() call should return when the VM
exit can not be completely handled through the hypervisor specific bits.
For KVM based hypervisors, this means directly forwarding the IO related
exits back to the VMM itself. For other hypervisors that e.g. rely on the
VMM to decode and emulate instructions, this means the decoding itself
would happen in the hypervisor crate exclusively, and the rest of the VM
exit handling would be handled through the VMM device model implementation.
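As an illustration only, the split described above could be sketched as
follows. The variant names, the Vcpu trait shape and the helper names are
assumptions made for this sketch, not the actual hypervisor crate API.

```rust
use std::collections::VecDeque;

/// Hypervisor-agnostic exit reasons (hypothetical variants).
#[derive(Debug, PartialEq)]
pub enum VmExit {
    IoOut { port: u16, data: Vec<u8> },
    MmioWrite { addr: u64, data: Vec<u8> },
    Ignore,
    Reset,
    Shutdown,
}

/// What a hypervisor-specific vCPU would expose to the VMM.
pub trait Vcpu {
    fn run(&mut self) -> Result<VmExit, String>;
}

/// VMM side: forward IO exits to the device model (here, a log).
/// Returns false when the vCPU loop should stop.
pub fn handle_exit(exit: VmExit, log: &mut Vec<String>) -> bool {
    match exit {
        VmExit::IoOut { port, data } => {
            log.push(format!("pio write port={:#x} len={}", port, data.len()));
            true
        }
        VmExit::MmioWrite { addr, data } => {
            log.push(format!("mmio write addr={:#x} len={}", addr, data.len()));
            true
        }
        VmExit::Ignore => true,
        VmExit::Reset | VmExit::Shutdown => false,
    }
}

/// Generic vCPU loop: it only ever sees VmExit values.
pub fn run_until_stop<V: Vcpu>(vcpu: &mut V, log: &mut Vec<String>) -> Result<usize, String> {
    let mut handled = 0;
    loop {
        let exit = vcpu.run()?;
        handled += 1;
        if !handle_exit(exit, log) {
            return Ok(handled);
        }
    }
}

/// Fake vCPU replaying a fixed list of exits, for demonstration.
pub struct ScriptedVcpu {
    pub exits: VecDeque<VmExit>,
}

impl Vcpu for ScriptedVcpu {
    fn run(&mut self) -> Result<VmExit, String> {
        self.exits.pop_front().ok_or_else(|| "no more exits".to_string())
    }
}
```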
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Fix test_vm unit test by using the new abstraction and dropping some
dead code.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
The fd naming is quite KVM specific. Since we're now using the
hypervisor crate abstractions, we can rename those into something more
readable and meaningful.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Split the generic virtio code (queues and device type) from the
VirtioDevice trait, transport and device implementations.
This also simplifies the feature handling in vhost_user_backend as the
vm-virtio crate no longer has any features.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Because we don't want the guest to miss any event triggered by the
emulation of devices, it is important to resume all vCPUs before we can
resume the DeviceManager with all its associated devices.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
When set_user_memory_region was moved to the hypervisor crate, it was
turned into a safe function that wrapped around an unsafe call. All but
one call site had the safety statements removed, but the safety statement
was not moved inside the wrapper function.
Add the safety statement back to help reasoning in the future. Also
remove the one last instance where the safety statement is not needed.
No functional change.
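A minimal sketch of the pattern, for illustration. raw_checksum is a
hypothetical stand-in for the unsafe call; the real wrapper issues an
ioctl and has a different signature.

```rust
/// Hypothetical unsafe primitive: the caller must guarantee that `ptr`
/// is valid for reads of `len` bytes.
unsafe fn raw_checksum(ptr: *const u8, len: usize) -> u64 {
    let mut sum = 0u64;
    for i in 0..len {
        // SAFETY: guaranteed by this function's own contract.
        sum += u64::from(unsafe { *ptr.add(i) });
    }
    sum
}

/// Safe wrapper: the safety statement lives here, next to the unsafe
/// call, so call sites do not need one.
pub fn checksum(region: &[u8]) -> u64 {
    // SAFETY: `region` is a valid slice, so its pointer is readable for
    // exactly `region.len()` bytes for the duration of the call.
    unsafe { raw_checksum(region.as_ptr(), region.len()) }
}
```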
Signed-off-by: Wei Liu <liuwe@microsoft.com>
That removes one more KVM-ism in VMM crate.
Note that there is more KVM specific code in those files to be split
out, but we're not at that stage yet.
No functional change.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
Collate the virtio device counters in DeviceManager for each device that
exposes any and expose them through the recently added HTTP API.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The counters are a hash of device name to hash of counter name to u64
value. Currently the API is only implemented with a stub that returns an
empty set of counters.
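The shape of the data can be sketched as below. The type alias and
function names are made up for this illustration.

```rust
use std::collections::HashMap;

/// device name -> (counter name -> value)
pub type DeviceCounters = HashMap<String, HashMap<&'static str, u64>>;

/// Stub matching the current behaviour: an empty set of counters.
pub fn counters_stub() -> DeviceCounters {
    HashMap::new()
}

/// Collate counters from devices, skipping devices that expose none.
pub fn collate(devices: &[(&str, Vec<(&'static str, u64)>)]) -> DeviceCounters {
    let mut all = DeviceCounters::new();
    for (name, counters) in devices {
        if counters.is_empty() {
            continue;
        }
        all.insert((*name).to_string(), counters.iter().cloned().collect());
    }
    all
}
```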
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
In order to maintain correct time when doing pause/resume and
snapshot/restore operations, this patch stores the clock value
on pause, and restores it on resume. Because snapshot/restore
expects a VM to be paused before the snapshot and paused after
the restore, this covers the migration use case too.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Because we need to pause the VM before it is snapshotted, it should be
restored in a paused state to keep the sequence symmetrical. That's the
reason why the state machine describing the valid VM state transitions
needed to be updated accordingly.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Start moving the vmm, arch and pci crates to being hypervisor agnostic
by using the hypervisor trait and abstractions. This is not a complete
switch and there are still some remaining KVM dependencies.
Signed-off-by: Muminul Islam <muislam@microsoft.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
This patch completes the series by connecting the dots between the HTTP
frontend and the device manager backend.
Any request to hotplug a VFIO, disk, fs, pmem, net, or vsock device will
now return a response including the device name and the place of the
device in the PCI topology.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Pass from the device manager to the calling code the information about
the PCI device that has just been hotplugged.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Now the flow of both architectures is aligned to:
1. load kernel
2. create VCPU's
3. configure system
4. start VCPU's
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
Between X86_64 and AArch64, there are some differences in booting a VM:
- X86_64 can set up the IOAPIC before creating any VCPU.
- AArch64 has to create VCPU's before creating the GIC.
The old process is:
1. load_kernel()
    load kernel binary
    configure system
2. activate_vcpus()
    create & start VCPU's
So we need to separate "activate_vcpus" into "create_vcpus" and
"activate_vcpus" (to start vcpus only). The GIC setup and the FDT
creation happen between the 2 steps.
The new procedure is:
1. load_kernel()
    load kernel binary
    (X86_64) configure system
2. create VCPU's
3. (AArch64) setup GIC
4. (AArch64) configure system
5. start VCPU's
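The ordering above can be summarised with the following sketch. It only
models the sequence of steps; the Arch enum and boot_sequence() are
illustrative names, not code from this series.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Arch {
    X86_64,
    AArch64,
}

/// Returns the boot steps in the order they run for `arch`.
pub fn boot_sequence(arch: Arch) -> Vec<&'static str> {
    let mut steps = vec!["load kernel binary"];
    if arch == Arch::X86_64 {
        // x86_64 can configure the system before any vCPU exists.
        steps.push("configure system");
    }
    steps.push("create vCPUs");
    if arch == Arch::AArch64 {
        // The GIC needs the vCPUs, and the system configuration
        // comes after the GIC setup.
        steps.push("setup GIC");
        steps.push("configure system");
    }
    steps.push("start vCPUs");
    steps
}
```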
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
This is a preparatory commit to build and test CH on AArch64. All build
issues were fixed, but no functionality was introduced.
For X86, the logic of the code was not changed at all.
For ARM, the architecture specific part is still empty, and we applied
some tricks to work around lint warnings. Such code will be replaced
later by other commits with real functionality.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
The ch branch has been rebased to incorporate the latest upstream code
requiring a small change to the unit tests.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
To ensure that the DeviceManager threads (such as those used for virtio
devices) are cleaned up it is necessary to unpark them so that they get
cleanly terminated as part of the shutdown.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
In case the VM is created from scratch, the devices should be created
after the DeviceManager has been created. But this should not affect the
restore codepath, as in this case the devices should be created as part
of the restore() function.
It's necessary to perform this differentiation as the restore must go
through the following steps:
- Create the DeviceManager
- Restore the DeviceManager with the right state
- Create the devices based on the restored DeviceManager's device tree
- Restore each device based on the restored DeviceManager's device tree
That's why this patch leverages the recent split of the DeviceManager's
creation to achieve what's needed.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This commit performs the split of the DeviceManager's creation into two
separate functions by moving anything related to device's creation after
the DeviceManager structure has been initialized.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
If the current state is paused, most of the thread handles have been
signalled through pthread_kill and the threads are parked. We need to
unpark those threads for the shutdown to work. Otherwise the shutdown
API hangs and the API stops responding afterwards. So before the
shutdown call we need to resume the VM to make it succeed.
Fixes: #817
Signed-off-by: Muminul Islam <muislam@microsoft.com>
Adds DeviceManager method `make_virtio_fs_device` which creates a single
device, and modifies `make_virtio_fs_devices` to use this method.
Implements the new `vm.add-fs` route.
Signed-off-by: Dean Sheather <dean@coder.com>
We can now allow guests that specify an initramfs to boot
using the PVH boot protocol.
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
When performing an API boot validate the configuration. For now only
some very basic validation is performed but in subsequent commits
the validation will be extended.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Now that the restore path uses RestoreConfig structure, we add a new
parameter called "prefault" to it. This will give the user the ability
to populate the pages corresponding to the mapped regions backed by the
snapshotted memory files.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The goal here is to move the restore parameters into a dedicated
structure that can be reused from the entire codebase, making the
addition or removal of a parameter easier.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
When CoW can be used, the VM restoration time is reduced, but the pages
are not populated. This can lead to some slowness from the guest when
accessing these pages.
Depending on the use case, we might prefer a slower boot time in exchange
for better guest runtime performance. The way to achieve this is to
prefault the pages in this case, using the MAP_POPULATE flag along with CoW.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Whenever a MemoryManager is restored from a snapshot, the memory regions
associated with it might need to directly back the mapped memory for
increased performance. If that's the case, a list of external regions
is provided and the MemoryManager should simply ignore what's coming
from the MemoryConfig.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The MemoryManager is somewhat of a special case, as its restore() function
was not implemented as part of the Snapshottable trait. Instead, and
because restoring memory regions relies both on vm.json and every memory
region snapshot file, the memory manager is restored at creation time.
This makes the restore path slightly different from CpuManager, Vcpu,
DeviceManager and Vm, but achieves the correct restoration of the
MemoryManager along with its memory regions filled with the correct
content.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
This is only implementing the send() function in order to store all Vm
states into a file.
This needs to be extended for live migration, by adding more transport
methods, and also the recv() function must be implemented.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
By aggregating snapshots from the CpuManager, the MemoryManager and the
DeviceManager, Vm implements the snapshot() function from the
Snapshottable trait.
And by restoring snapshots from the CpuManager, the MemoryManager and
the DeviceManager, Vm implements the restore() function from the
Snapshottable trait.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Implement the Snapshottable trait for Vcpu, and then implement it for
CpuManager. Note that CpuManager goes through the Snapshottable
implementation of Vcpu for every vCPU in order to implement the
Snapshottable trait for itself.
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
A Snapshottable component can snapshot itself and
provide a MigrationSnapshot payload as a result.
A MigrationSnapshot payload is a map of component IDs to a list of
migration sections (MigrationSection). As a component can be made of
several Migratable sub-components (e.g. the DeviceManager and its
device objects), a migration snapshot can itself be made of multiple
snapshots.
A snapshot is a list of migration sections, each section being a
component state snapshot. Having multiple sections allows for easier and
backward compatible migration payload extensions.
Once created, a migratable component snapshot may be transported and this
is what the Transportable trait defines, through 2 methods: send and recv.
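A rough sketch of the payload layout and of a parent component
aggregating its sub-components. Field names, the Device and Manager
types and the section contents are assumptions for illustration; the
Transportable side is left out.

```rust
use std::collections::HashMap;

/// One piece of serialized component state.
#[derive(Clone, Debug, PartialEq)]
pub struct MigrationSection {
    pub name: String,
    pub data: Vec<u8>,
}

/// Component ID -> list of migration sections for that component.
pub type MigrationSnapshot = HashMap<String, Vec<MigrationSection>>;

pub trait Snapshottable {
    fn id(&self) -> String;
    fn snapshot(&self) -> MigrationSnapshot;
}

/// Leaf component with a single byte of state.
pub struct Device {
    pub name: String,
    pub state: u8,
}

impl Snapshottable for Device {
    fn id(&self) -> String {
        self.name.clone()
    }

    fn snapshot(&self) -> MigrationSnapshot {
        let section = MigrationSection {
            name: format!("{}-state", self.name),
            data: vec![self.state],
        };
        HashMap::from([(self.id(), vec![section])])
    }
}

/// Parent component: its snapshot is made of its own sections plus
/// the snapshots of every sub-component.
pub struct Manager {
    pub devices: Vec<Device>,
}

impl Snapshottable for Manager {
    fn id(&self) -> String {
        "manager".to_string()
    }

    fn snapshot(&self) -> MigrationSnapshot {
        let mut all = MigrationSnapshot::new();
        let own = MigrationSection {
            name: "device-count".to_string(),
            data: vec![self.devices.len() as u8],
        };
        all.insert(self.id(), vec![own]);
        for dev in &self.devices {
            all.extend(dev.snapshot());
        }
        all
    }
}
```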
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Whenever the memory is resized, it's important to retrieve the new
region and pass it down to the device manager, so that it can decide
what to do with it.
Also, there's no need to use a boolean as we can instead use an Option
to carry the information about the region. In case of virtio-mem, there
will be no region since the whole memory has been reserved up front by
the VMM at boot. This means only the ACPI hotplug will return a region
and is the only method that requires the memory to be updated from the
device manager.
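To illustrate the Option-based return value, here is a simplified
sketch. The MemoryManager below is a toy model with hypothetical fields
and a hypothetical constructor, not the real structure.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Region {
    pub start: u64,
    pub size: u64,
}

#[derive(Clone, Copy)]
pub enum HotplugMethod {
    Acpi,
    VirtioMem,
}

pub struct MemoryManager {
    method: HotplugMethod,
    current_size: u64,
    next_hotplug_addr: u64,
}

impl MemoryManager {
    pub fn new(method: HotplugMethod, size: u64, hotplug_base: u64) -> Self {
        MemoryManager { method, current_size: size, next_hotplug_addr: hotplug_base }
    }

    /// Returns the newly added region, if any. None replaces the old
    /// boolean "nothing to update" answer.
    pub fn resize(&mut self, desired: u64) -> Option<Region> {
        if desired <= self.current_size {
            return None;
        }
        let added = desired - self.current_size;
        self.current_size = desired;
        match self.method {
            HotplugMethod::Acpi => {
                let region = Region { start: self.next_hotplug_addr, size: added };
                self.next_hotplug_addr += added;
                Some(region)
            }
            // The whole range was reserved at boot: no new region.
            HotplugMethod::VirtioMem => None,
        }
    }
}
```

The caller can then forward the region to the device manager only when
resize() returns Some.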
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Commit 2adddce2 reorganized the crate for a cleaner multi architecture
(x86_64 and aarch64) support.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
For now, the codebase does not support booting from initramfs with PVH
boot protocol, therefore we need to fall back to the legacy boot.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
* load the initramfs File into the guest memory, aligned to page size
* finally setup the initramfs address and its size into the boot params
(in configure_64bit_boot)
Signed-off-by: Damjan Georgievski <gdamjan@gmail.com>
The persistent memory will be hotplugged via DeviceManager and saved in
the config for later use.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This commit adds new option hotplug_method to memory config.
It can set the hotplug method to "acpi" or "virtio-mem".
Signed-off-by: Hui Zhu <teawater@antfin.com>
Whenever the VM memory is resized, DeviceManager needs to be notified
so that it can subsequently notify each virtio device about it.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This feature is stable and there is no need for this to be behind a
flag. This will also reduce the time needed to run the integration tests
as we will not be running them all again under the flag.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
I spent a few minutes trying to understand why we were unconditionally
updating the VM config memory size, even if the guest memory resizing
did not happen.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Fill the hvm_start_info and related memory map structures as
specified in the PVH boot protocol. Write the data structures
to guest memory at the GPA that will be stored in %rbx when
the guest starts.
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
In order to properly initialize the kvm regs/sregs structs for
the guest, the load_kernel() return type must specify which
boot protocol to use with the entry point address it returns.
Make load_kernel() return an EntryPoint struct containing the
required information. This structure will later be used
in the vCPU configuration methods to setup the appropriate
initial conditions for the guest.
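As a sketch of the idea (type and field names here are illustrative, and
Regs is a reduced stand-in for the KVM register structure): the boot
protocol carried by EntryPoint decides which register receives the
address of the boot information. PVH expects the hvm_start_info address
in %rbx, while the 64-bit Linux boot protocol expects the boot_params
address in %rsi.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum BootProtocol {
    LinuxBoot,
    PvhBoot,
}

/// What load_kernel() hands back: where to jump and how.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct EntryPoint {
    pub entry_addr: u64,
    pub protocol: BootProtocol,
}

/// Reduced register set, for illustration only.
#[derive(Debug, Default, PartialEq)]
pub struct Regs {
    pub rip: u64,
    pub rbx: u64,
    pub rsi: u64,
}

/// Pick the initial register state from the boot protocol.
pub fn initial_regs(entry: EntryPoint, boot_info_addr: u64) -> Regs {
    match entry.protocol {
        BootProtocol::PvhBoot => Regs {
            rip: entry.entry_addr,
            rbx: boot_info_addr,
            ..Default::default()
        },
        BootProtocol::LinuxBoot => Regs {
            rip: entry.entry_addr,
            rsi: boot_info_addr,
            ..Default::default()
        },
    }
}
```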
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Add a new id option to the VFIO hotplug command so that it matches the
VFIO coldplug semantic.
This is done by refactoring the existing code for VFIO hotplug, where
VmAddDeviceData structure is replaced by DeviceConfig. This structure is
the one used whenever a VFIO device is coldplugged, which is why it
makes sense to reuse it for the hotplug codepath.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This commit ensures that when a VFIO device is hot-unplugged from the
VM, it is also removed from the VmConfig. This prevents a potential
reboot from creating the device.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This commit introduces the new command "remove-device" that will let a
user hot-unplug a VFIO PCI device from an already running VM.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The Vm structure was used to store a strong reference to the IO bus.
This is not needed anymore since the AddressManager is logically the
one holding this strong reference. This has been made possible by the
introduction of Weak references on the Bus structure itself.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The method add_vfio_device() from the DeviceManager needs to be mutable
if we want later to be able to update some internal fields from the
DeviceManager from this same function.
This commit simply takes care of making the necessary changes to make
this function mutable.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
It's more logical to name the field referring to the DeviceManager as
"device_manager" instead of "devices".
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
By inserting the DeviceManager on the IO bus, we introduced some cyclic
dependency:
DeviceManager ---> AddressManager ---> Bus ---> BusDevice
      ^                                             |
      |                                             |
      +---------------------------------------------+
This cycle needs to be broken by inserting a Weak reference instead of
an Arc (considered as a strong reference).
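The effect of the Weak reference can be sketched with a toy Bus. This is
a simplified model with made-up method signatures, not the actual
vm-device code.

```rust
use std::sync::{Arc, Mutex, Weak};

pub trait BusDevice: Send {
    fn read(&mut self, offset: u64) -> u8;
}

/// The bus only holds Weak references, so it never keeps a device
/// (or the DeviceManager) alive on its own.
#[derive(Default)]
pub struct Bus {
    devices: Vec<(u64, Weak<Mutex<dyn BusDevice>>)>,
}

impl Bus {
    pub fn insert(&mut self, base: u64, dev: &Arc<Mutex<dyn BusDevice>>) {
        self.devices.push((base, Arc::downgrade(dev)));
    }

    /// Returns None if no device is mapped at `addr` or if the device
    /// has already been dropped.
    pub fn read(&self, addr: u64) -> Option<u8> {
        let (base, weak) = self.devices.iter().find(|(b, _)| *b == addr)?;
        let dev = weak.upgrade()?;
        let value = dev.lock().unwrap().read(addr - base);
        Some(value)
    }
}

/// Trivial device returning a fixed value.
pub struct Dummy(pub u8);

impl BusDevice for Dummy {
    fn read(&mut self, _offset: u64) -> u8 {
        self.0
    }
}
```

Since the bus does not bump the strong count, dropping the owner's Arc
frees the device even though it is still registered on the bus.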
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Ensures the configuration is updated after a new device has been
hotplugged. In the event of a reboot, this means the new VM will be
started with the new device that had been previously hotplugged.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This commit finalizes the VFIO PCI hotplug support, based on all the
previous commits preparing for it.
Note that this does not support vIOMMU yet. This means we can
hotplug VFIO PCI devices, but we cannot attach them to an existing or a
new virtio-iommu device.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Whenever the user wants to hotplug a new VFIO PCI device, the VMM will
have to trigger a hotplug notification through the GED device.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This commit introduces the new command "add-device" that will let a user
hotplug a VFIO PCI device to an already running VM.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In anticipation of the support for device hotplug, this commit moves the
DeviceManager object into an Arc<Mutex<>> when the DeviceManager is
being created. The reason is, we need the DeviceManager to implement the
BusDevice trait and then provide it to the IO bus, so that IO accesses
related to device hotplug can be handled correctly.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Relying on the latest vm-memory version, including the freshly
introduced structure GuestMemoryAtomic, this patch replaces every
occurrence of Arc<ArcSwap<GuestMemoryMmap>> with
GuestMemoryAtomic<GuestMemoryMmap>.
The point is to rely on the common RCU-like implementation from
vm-memory so that we don't have to do it from Cloud-Hypervisor.
Fixes #735
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
It is necessary to do this at the start of the VMM execution rather than
later as it must be done in the main thread in order to satisfy the
checks required by PTRACE_MODE_READ_FSCREDS (see proc(5) and
ptrace(2)).
The alternative is to run as CAP_SYS_PTRACE but that has its
disadvantages.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
If the ioctl syscall KVM_CREATE_VM gets interrupted while creating the
VM, it is expected that we should retry since EINTR should not be
considered a standard error.
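The retry logic can be sketched generically as below. retry_on_eintr is
a name chosen for this example, and the closure stands in for the
KVM_CREATE_VM call.

```rust
use std::io;

/// Re-run `op` for as long as it fails with EINTR
/// (io::ErrorKind::Interrupted); return any other result as-is.
pub fn retry_on_eintr<T, F>(mut op: F) -> io::Result<T>
where
    F: FnMut() -> io::Result<T>,
{
    loop {
        match op() {
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            other => return other,
        }
    }
}
```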
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The build is run against "--all-features", "pci,acpi", "pci" and "mmio"
separately. The clippy validation must be run against the same set of
features in order to validate the code is correct.
Because of these new checks, this commit includes multiple fixes
related to the errors generated when manually running the checks.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Use independent bits for storing whether there is a CPU or memory device
changed when reporting changes via ACPI GED interrupt. This prevents a
later notification squashing an earlier one and ensures that hotplugging
both CPU and memory at the same time succeeds.
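A small sketch of the idea. The bit values and the GedNotifications type
are assumptions made for this example.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

// One bit per notification type (values chosen for illustration).
pub const CPU_DEVICES_CHANGED: u8 = 0b01;
pub const MEMORY_DEVICES_CHANGED: u8 = 0b10;

#[derive(Default)]
pub struct GedNotifications(AtomicU8);

impl GedNotifications {
    /// OR the flag in, so an earlier pending notification is kept.
    pub fn notify(&self, flag: u8) {
        self.0.fetch_or(flag, Ordering::SeqCst);
    }

    /// Guest side: read all pending notifications and clear them.
    pub fn take(&self) -> u8 {
        self.0.swap(0, Ordering::SeqCst)
    }
}
```

With a single shared value instead of independent bits, the second
notify() would overwrite the first one.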
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
If a new amount of RAM is requested in the VmResize command, try to
hotplug it if it is an increase (MemoryManager::Resize() silently
ignores decreases).
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Generate and expose the DSDT table entries required to support memory
hotplug. The AML methods call into the MemoryManager via I/O ports
exposed as fields.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
For now the new memory size is only used after a reboot but support for
hotplugging memory will be added in a later commit.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This specifies how much address space should be reserved for hotplugging
of RAM. This space is reserved by moving the start of the device
area by the desired amount.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
In order to be able to support resizing either vCPUs or memory or both
make the fields in the resize command optional.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This allows us to change the memory map that is being used by the
devices via an atomic swap (by replacing the map with another one). The
ArcSwap provides the mechanism for atomically swapping from one map to
another whilst still giving good read performance. It is inside an Arc so
that we
can use a single ArcSwap for all users.
Not covered by this change is replacing the GuestMemoryMmap itself.
This change also removes some vertical whitespace from use blocks in the
files that this commit also changed. Vertical whitespace was being used
inconsistently and broke rustfmt's behaviour of ordering the imports as
it would only do it within the block.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This removes the need to handle a mutable integer and also centralises
the allocation of these slot numbers.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The memory manager is responsible for setting up the guest memory and in
the long term will also handle addition of guest memory.
In this commit move code for creating the backing memory and populating
the allocator into the new implementation trying to make as minimal
changes to other code as possible.
Follow on commits will further reduce some of the duplicated code.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Since the Snapshottable placeholder and Migratable traits are provided as
well, the DeviceManager object and all its objects are now Migratable.
All Migratable devices are tracked as Arc<Mutex<dyn Migratable>>
references.
Keeping track of all migratable devices allows for implementing the
Migratable trait for the DeviceManager structure, making the whole
device model potentially migratable.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Now that the GED device does not use a hardcoded IRQ number the starting
IRQ number can be restored (needed for the hardcoded serial port IRQ.)
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Move the code for handling the creation of the DSDT entries for devices
into the DeviceManager.
This will make it easier to handle device hotplug and also in the future
remove some hardcoded ACPI constants.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Move the code for generating the MADT (APIC) table and the DSDT
generation for CPU related functionality into the CpuManager.
There is no functional change just code rearrangement.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Previously the device setup code assumed that if no IOAPIC was passed in
then the device should be added to the kernel irqchip. As an earlier
change meant that there was always a userspace IOAPIC this kernel based
code can be removed.
The accessor still returns an Option type to leave scope for
implementing a situation without an IOAPIC (no serial or GED device).
This change does not add support for no-IOAPIC mode as the original code did
not either.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Update the configuration after a resize to ensure that after a reboot
the added vCPUs are preserved.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The tty remains in raw mode when cloud-hypervisor is terminated by
SIGTERM or SIGINT. The terminal is then unusable because echoing is
disabled, which is really annoying.
Signed-off-by: Qiu Wenbo <qiuwenbo@phytium.com.cn>
Some physical address bits may become reserved in the page tables when
SME is enabled on AMD platforms. The guest will trigger a reserved bit
violation page fault in this case because it writes 1 to these reserved
bits in the page tables. We need to account for the reserved bits to get
the right physical address range.
Signed-off-by: Qiu Wenbo <qiuwenbo@phytium.com.cn>
Add ability to notify via the GED device that there is some new hotplug
activity. This will be used by the CpuManager (and later DeviceManager
itself) to notify of new hotplug activity.
Currently it has a hardcoded IRQ of 5 as the ACPI tables also need to
refer to this IRQ and the IRQ allocation does not permit the allocation
of specific IRQs.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Currently only increasing the number of vCPUs is supported but in the
future it will be extended.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The MADT table contains the details of all the potential vCPUs and
whether they are present at boot (as indicated by the flags field.)
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
When initialising the ACPI tables and configuring the VM use the new
accessor on the CpuManager to get the number of boot vCPUs.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Since the kvm crates now depend on vmm-sys-util, the bump must be
atomic.
The kvm-bindings and ioctls 0.2.0 and 0.4.0 crates come with a few API
changes, one of them being the use of a kvm_ioctls specific error type.
Porting our code to that type makes for a fairly large diff stat.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
In case the VM is started with the flag "--memory mergeable=on", it
means the user expects the guest RAM pages to be marked as mergeable.
This commit relies on the madvise(MADV_MERGEABLE) system call to inform
the host kernel about these pages.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Move CpuManager, Vcpu and related functionality to its own module (and
file) inside the VMM crate.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Pull details of vCPU management (booting, pausing, resuming, shutdown)
into its own structure. This will ultimately enable this to be moved to
its own file and encapsulate all the vCPU handling for the VMM.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Move ACPI table creation from the arch crate to the vmm crate, simplifying
arch::configure_system().
GuestAddress(0) is used to mean no RSDP table rather than adding
complexity with a conditional argument or an Option type as it will
evaluate to a zero value which would be the default anyway.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
We need to rely on the latest kvm-ioctls version to benefit from the
recent addition of unregister_ioevent(), allowing us to detach a
previously registered eventfd to a PIO or MMIO guest address.
Because of this update, we had to modify the current constraint we had
on the vmm-sys-util crate, using ">= 0.1.1" instead of being strictly
tied to "0.2.0".
Once the dependency conflict was resolved, this commit took care of fixing
build issues caused by recent modification of kvm-ioctls relying on
EventFd reference instead of RawFd.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In order to reuse the SystemAllocator later at runtime, it is moved into
the new structure AddressManager. The goal is to hold onto the
SystemAllocator and both IO and MMIO buses so that we can use them
later.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
We should return an explicit error when the transition from one VM state
to another is invalid.
The valid_transition() routine for the VmState enum essentially
describes the VM state machine.
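For illustration, such a routine could look like the sketch below. The
exact set of allowed transitions is an assumption of this example
(Created to Paused is allowed here so that a restored VM can start
paused); the error type is hypothetical as well.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum VmState {
    Created,
    Running,
    Shutdown,
    Paused,
}

/// Explicit error carrying the rejected (from, to) pair.
#[derive(Debug, PartialEq)]
pub struct InvalidStateTransition(pub VmState, pub VmState);

impl VmState {
    pub fn valid_transition(self, new_state: VmState) -> Result<(), InvalidStateTransition> {
        use VmState::*;
        let ok = matches!(
            (self, new_state),
            (Created, Running)
                | (Created, Paused)
                | (Running, Paused)
                | (Running, Shutdown)
                | (Paused, Running)
                | (Paused, Shutdown)
                | (Shutdown, Running)
        );
        if ok {
            Ok(())
        } else {
            Err(InvalidStateTransition(self, new_state))
        }
    }
}
```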
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
In order to pause a VM, we signal all the vCPU threads to get them out
of vmx non-root. Once out, the vCPU thread will check an atomic
pause boolean. If it's set to true, then the thread will park until
being resumed.
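The park/unpark part of this mechanism can be sketched with plain
threads. There is no KVM or signal handling here; VcpuFlags and the
helper functions are hypothetical names for this example.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Flags shared between the control thread and a vCPU-like thread.
#[derive(Default)]
pub struct VcpuFlags {
    pub pause: AtomicBool,
    pub kill: AtomicBool,
    pub parked: AtomicBool,
}

pub fn spawn_vcpu(flags: Arc<VcpuFlags>) -> JoinHandle<()> {
    thread::spawn(move || loop {
        if flags.kill.load(Ordering::SeqCst) {
            break;
        }
        // Re-check the flag after every wakeup: park() may return
        // spuriously.
        while flags.pause.load(Ordering::SeqCst) {
            flags.parked.store(true, Ordering::SeqCst);
            thread::park();
        }
        flags.parked.store(false, Ordering::SeqCst);
        // A real vCPU thread would re-enter the guest here.
        thread::yield_now();
    })
}

/// A real VMM would also signal the thread to kick it out of the guest.
pub fn pause(flags: &VcpuFlags) {
    flags.pause.store(true, Ordering::SeqCst);
}

pub fn resume(flags: &VcpuFlags, handle: &JoinHandle<()>) {
    flags.pause.store(false, Ordering::SeqCst);
    handle.thread().unpark();
}

/// Test helper: wait until the thread has reached the parked state.
pub fn wait_until_parked(flags: &VcpuFlags) {
    while !flags.parked.load(Ordering::SeqCst) {
        thread::sleep(Duration::from_millis(1));
    }
}
```

A parked thread never sees the kill flag until it is unparked, which is
why a paused VM must be resumed before it can be shut down.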
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
So that we don't need to forward an ExitBehaviour up to the VMM thread.
This simplifies the control loop and the VMM thread even further.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
This commit is the glue between the virtio-pci devices attached to the
vIOMMU, and the IORT ACPI table exposing them to the guest as sitting
behind this vIOMMU.
An important thing is the trait implementation provided to the virtio
vrings for each device attached to the vIOMMU, as they need to perform
proper address translation before they can access the buffers.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The virtual IOMMU exposed through virtio-iommu device has a dependency
on ACPI. It needs to expose the device ID of the virtio-iommu device,
and all the other devices attached to this virtual IOMMU. The IDs are
expressed from a PCI bus perspective, based on segment, bus, device and
function.
The guest relies on the topology description provided by the IORT table
to attach devices to the virtio-iommu device.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
We used to have error definitions spread across vmm, vm, api,
and http.
We now have a cleaner separation: All API routines only return an
ApiResult. All VM operations, including the VMM wrappers, return a
VmResult. This makes it easier to carry errors up to the HTTP caller.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>