Expose the details of hotplug RAM slots via an I/O port. This will be
consumed by the ACPI DSDT tables to report the hotplug memory details to
the guest.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Add a "resize()" method on MemoryManager which will create a new memory
allocation based on the difference between the desired RAM amount and
the amount already in use. After allocating the added RAM using the same
backing method as the boot RAM store the details in a vector and update
the KVM map and create a new GuestMemoryMmap and replace all the users.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
For now the new memory size is only used after a reboot but support for
hotplugging memory will be added in a later commit.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
When the value is read from the I/O port via the ACPI AML functions to
determine what has been triggered the notifiction value is reset
preventing a second read from exposing the value. If we need support
multiple types of GED notification (such as memory hotplug) then we
should avoid reading the value multiple times.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This specifies how much address space should be reserved for hotplugging
of RAM. This space is reserved by adding move the start of the device
area by the desired amount.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
In order to be able to support resizing either vCPUs or memory or both
make the fields in the resize command optional.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Make the GuestMemoryMmap from a Vec<Arc<GuestRegionMmap>> by using this
method we can persist a set of regions in the MemoryManager and then
extend this set with a newly created region. Ultimately that will allow
the hotplug of memory.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
If neither PCI or MMIO are built in, we should not bother creating any
virtio devices at all.
When building a minimal VMM made of a kernel with an initramfs and a
serial console, the RNG virtio device is still created even though there
is no way it can ever get probed.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Because virtio-iommu is still evolving (as it's only partly upstream),
some pieces like the ACPI declaration of the different nodes and devices
attached to the virtual IOMMU are changing.
This patch introduces a new ACPI table called VIOT, standing as the high
level table overseeing the IORT table and associated subtables.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This allows us to change the memory map that is being used by the
devices via an atomic swap (by replacing the map with another one). The
ArcSwap provides the mechanism for atomically swapping from to another
whilst still giving good read performace. It is inside an Arc so that we
can use a single ArcSwap for all users.
Not covered by this change is replacing the GuestMemoryMmap itself.
This change also removes some vertical whitespace from use blocks in the
files that this commit also changed. Vertical whitespace was being used
inconsistently and broke rustfmt's behaviour of ordering the imports as
it would only do it within the block.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This function will be useful for other parts of the VMM that also
estabilish their own mappings.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This removes the need to handle a mutable integer and also centralises
the allocation of these slot numbers.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The memory manager is responsible for setting up the guest memory and in
the long term will also handle addition of guest memory.
In this commit move code for creating the backing memory and populating
the allocator into the new implementation trying to make as minimal
changes to other code as possible.
Follow on commits will further reduce some of the duplicated code.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
To reflect updated clippy rules:
error: `if` chain can be rewritten with `match`
--> vmm/src/device_manager.rs:1508:25
|
1508 | / if ret > 0 {
1509 | | debug!("MSI message successfully delivered");
1510 | | } else if ret == 0 {
1511 | | warn!("failed to deliver MSI message, blocked by guest");
1512 | | }
| |_________________________^
|
= note: `-D clippy::comparison-chain` implied by `-D warnings`
= help: Consider rewriting the `if` chain to use `cmp` and `match`.
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#comparison_chain
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Address updated clippy errors:
error: redundant clone
--> vmm/src/device_manager.rs:699:32
|
699 | .insert(acpi_device.clone(), 0x3c0, 0x4)
| ^^^^^^^^ help: remove this
|
= note: `-D clippy::redundant-clone` implied by `-D warnings`
note: this value is dropped without further use
--> vmm/src/device_manager.rs:699:21
|
699 | .insert(acpi_device.clone(), 0x3c0, 0x4)
| ^^^^^^^^^^^
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#redundant_clone
error: redundant clone
--> vmm/src/device_manager.rs:737:26
|
737 | .insert(i8042.clone(), 0x61, 0x4)
| ^^^^^^^^ help: remove this
|
note: this value is dropped without further use
--> vmm/src/device_manager.rs:737:21
|
737 | .insert(i8042.clone(), 0x61, 0x4)
| ^^^^^
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#redundant_clone
error: redundant clone
--> vmm/src/device_manager.rs:754:29
|
754 | .insert(cmos.clone(), 0x70, 0x2)
| ^^^^^^^^ help: remove this
|
note: this value is dropped without further use
--> vmm/src/device_manager.rs:754:25
|
754 | .insert(cmos.clone(), 0x70, 0x2)
| ^^^^
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#redundant_clone
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
When the running OS has been told that a CPU should be removed it will
shutdown the CPU and then signal to the hypervisor via the "_EJ0" method
on the device that ultimately writes into an I/O port than the vCPU
should be shutdown. Upon notification the hypervisor signals to the
individual thread that it should shutdown and waits for that thread to
end.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Allow the resizing of the number of vCPUs to less than the current
active vCPUs. This does not currently remove them from the system but
the kernel will take them offline.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
When we add a vCPU set an "inserting" boolean that is exposed as an ACPI
field that will be checked for and reset when the ACPI GED notification
for CPU devices happens.
This change is a precursor for CPU unplug.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Continue to notify on all vCPUs but instead separate the notification
functionality into two methods, CSCN that walks through all the CPUs
and CTFY which notifies based on the numerical CPU id. This is an
interim step towards only notifying on changed CPUs and ultimately CPU
removal.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
In anticipation for the writing of unit tests comparing two VmConfig
structures, this commit derives the PartialEq trait for VmConfig and
all embedded structures.
This patch also derives the Debug trait for the same set of structures
so that we can print them to facilitate debugging.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The OpenAPI should not have to provide a command line since the CLI
considers the command line as an empty string if nothing is provided.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This brings more modularity to the code, which will be helpful when we
will later test the CLI and OpenAPI generate the same VmConfig output.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Since the Snapshotable placeholder and Migratable traits are provided as
well, the DeviceManager object and all its objects are now Migratable.
All Migratable devices are tracked as Arc<Mutex<dyn Migratable>>
references.
Keeping track of all migratable devices allows for implementing the
Migratable trait for the DeviceManager structure, making the whole
device model potentially migratable.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Migratable devices can be virtio or legacy devices.
In any case, they can potentially be tracked through one of the IO bus
as an Arc<Mutex<dyn BusDevice>>. In order for the DeviceManager to also
keep track of such devices as Migratable trait objects, they must be
shared as mutable atomic references, i.e. Arc<Mutex<T>>. That forces all
Migratable objects to be tracked as Arc<Mutex<dyn Migratable>>.
Virtio devices are typically migratable, and thus for them to be
referenced by the DeviceManager, they now should be built as
Arc<Mutex<VirtioDevice>>.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
The FsConfig structure has been recently adjusted so that the default
value matches between OpenAPI and CLI. Unfortunately, with the current
description, there is no way from the OpenAPI to describe a cache_size
value "None", so that DAX does not get enabled. Usually, using a Rust
"Option" works because the default value is None. But in this case, the
default value is Some(8G), which means we cannot describe a None.
This commit tackles the problem, introducing an explicit parameter
"dax", and leaving "cache_size" as a simple u64 integer.
This way, the default value is dax=true and cache_size=8G, but it lets
the opportunity to disable DAX entirely with dax=false, which will
simply ignore the cache_size value.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In order to let the CLI and the HTTP API behave the same regarding the
VhostUserBlkConfig structure, this patch defines some default values
for num_queues, queue_size and wce.
num_queues is 1, queue_size is 128 and wce is true.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In order to let the CLI and the HTTP API behave the same regarding the
VhostUserNetConfig structure, this patch defines some default values
for num_queues, queue_size and mac.
num_queues is 2 since that's a pair of TX/RX queues, queue_size is 256
and mac is a randomly generated value.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
We want to set different default configurations for vhost-user-net and
vhost-user-blk, which is the reason why the common part corresponding to
the number of queues and the queue size cannot be embedded.
This prepares for the following commit, matching API and CLI behaviors.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
A simple patch making sure the field "file" is provisioned with the same
default value through CLI and OpenAPI.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Just making sure we have a serde default for the field "file" since it
is not a required field in the OpenAPI definition.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
All structures match between the OpenAPI definition and the internal
configuration code, that's why CpuConfig is being renamed into
CpusConfig.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The signal handling for vCPU signals has changed in the latest release
so switch to the new API.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
In order to let the CLI and the HTTP API behave the same regarding the
FsConfig structure, this patch defines some default values for
num_queues, queue_size and the cache_size.
num_queues is set to 1, queue_size is set to 1024, and cache_size is set
to Some(8G) which means that DAX is enabled by default with a shared
region of 8GiB.
Fixes#508
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In order to let the CLI and the HTTP API behave the same regarding the
NetConfig structure, this patch defines some default values for tap, ip,
mask, mac and iommu.
tap is None, ip is 192.168.249.1, mask is 255.255.255.0, mac is a
randomly generated value, and iommu is false.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Now that the GED device does not use a hardcoded IRQ number the starting
IRQ number can be restored (needed for the hardcoded serial port IRQ.)
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Remove the previously hardcoded IRQ number used for the GED device.
Instead allocate the IRQ using the allocator and use that value in the
definition in the ACPI device.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Move the code for handling the creation of the DSDT entries for devices
into the DeviceManager.
This will make it easier to handle device hotplug and also in the future
remove some hardcoded ACPI constants.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Move the code for generating the MADT (APIC) table and the DSDT
generation for CPU related functionality into the CpuManager.
There is no functional change just code rearrangement.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
When consumer of the HTTP API try to interact with cloud-hypervisor,
they have to provide the equivalent of the config structure related to
each component they need. Problem is, the Rust enum type "Option" cannot
be obtained from the OpenAPI YAML definition.
This patch intends to fix this inconsistency between what is possible
through the CLI and what's possible through the HTTP API by using simple
types bool and int64 instead of Option<u64>.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Previously the device setup code assumed that if no IOAPIC was passed in
then the device should be added to the kernel irqchip. As an earlier
change meant that there was always a userspace IOAPIC this kernel based
code can be removed.
The accessor still returns an Option type to leave scope for
implementing a situation without an IOAPIC (no serial or GED device).
This change does not add support no-IOAPIC mode as the original code did
not either.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Update the configuration after a resize to ensure that after a reboot
the added vCPUs are preserved.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This was causing issues when the kernel was trying to reset the
interrupt and making the reboot fail.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The tty mode remains raw mode when cloud-hypervisor is terminted by
SIGTERM or SIGINT. The terminal is unusable due to echoing is
disabled which is really annoying.
Signed-off-by: Qiu Wenbo <qiuwenbo@phytium.com.cn>
Some physical address bits may become reserved in page table when SME
is enabled on AMD platform. Guest will trigger a reserved bit
violation page fault in this case due to write these reserved bits to 1
in page table. We need reduce the reserved bits to get the right
physical address range.
Signed-off-by: Qiu Wenbo <qiuwenbo@phytium.com.cn>
The KVM_SET_GSI_ROUTING ioctl is very simple, it overrides the previous
routes configuration with the new ones being applied. This means the
caller, in this case cloud-hypervisor, needs to maintain the list of all
interrupts which needs to be active at all times. This allows to
correctly support multiple devices to be passed through the VM and being
functional at the same time.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Call the "CTFY" method that will itself call Notify() on the CPU objects
in the ACPI namespace.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Add ability to notify via the GED device that there is some new hotplug
activity. This will be used by the CpuManager (and later DeviceManager
itself) to notify of new hotplug activity.
Currently it has a hardcoded IRQ of 5 as the ACPI tables also need to
refer to this IRQ and the IRQ allocation does not permit the allocation
of specific IRQs.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Currently only increasing the number of vCPUs is supported but in the
future it will be extended.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
When configuring a processor after boot as a hotplug CPU we only
configure a subset of the CPU state. In particular we should not
configure the FPU, segment registers (or reconfigure the paging which is
a side-effect of that) nor the main registers. Achieve this by making
the function take an Option type for the start address.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Refactor the vCPU thread starting so that there is the possibility to
bring on extra vCPU threads.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Currently this just holds the thread handle but will be enlarged to
encompass details such as whether the vCPU is currently being inserted
or ejected.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The MADT table contains the details of all the potential vCPUs and
whether they are present at boot (as indicated by the flags field.)
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
When initialising the ACPI tables and configuring the VM use the new
accessor on the CpuManager to get the number of boot vCPUs.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Since the kvm crates now depend on vmm-sys-util, the bump must be
atomic.
The kvm-bindings and ioctls 0.2.0 and 0.4.0 crates come with a few API
changes, one of them being the use of a kvm_ioctls specific error type.
Porting our code to that type makes for a fairly large diff stat.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
context ID on vsock man defines a 32-bits value, openapi default integer
is a signed 32-bits value.
This could lead to miss one bit during castings for typed client
implmentations. Lets increase the range of valid values by requesting an
int64.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
In case the VM is started with the flag "--pmem mergeable=on", it means
the user expects the guest persistent memory pages to be marked as
mergeable. This commit relies on the madvise(MADV_MERGEABLE) system call
to inform the host kernel about these pages.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In order to let the user indicate if the persistent memory pages should
be marked as mergeable or not, a new option is being introduced.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In case the VM is started with the flag "--memory mergeable=on", it
means the user expects the guest RAM pages to be marked as mergeable.
This commit relies on the madvise(MADV_MERGEABLE) system call to inform
the host kernel about these pages.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In order to let the user indicate if the guest RAM pages should be
marked as mergeable or not, a new option is being introduced.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
When vmm.ping give a response, we expect get the version from
the VMM not the vmm create
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
vmm.ping will help to check if http API server is up and
running.
This also removes the vmm.info endpoint.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
The CPU manager uses an I/O port and to prevent potential clashes with
assignment for PCI devices ensure that it is allocated by the allocator.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Rather than hardcode the CPU status for all the CPUs instead query from
the CPU manager via the I/O port that is is on via the ACPI tables.
Each CPU device has a _STA method that calls into the CSTA method which
reads and writes the I/O ports via the PRST field which exposes the I/O
port through and OpRegion.
As we only support boot CPUS report that all the CPUs are enabled for
now.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The Linux kernel expects all CPUs, whether they be enabled or disabled
to have an _MAT entry containing the LAPIC details for this CPU with the
enabled bit set to 1 (in the flags.)
In the MADT table the same bit is used to determine if the CPU is
present at boot vs available later.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
When adding devices to the guest, and populating the device model, we
should prefix the routines with add_. When we're just creating the
device objects but not yet adding them we use make_.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
In order to reduce the DeviceManager's new() complexity, we can move the
MMIO devices creation code into its own routine.
Fixes: #441
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
In order to reduce the DeviceManager's new() complexity, we can move the
PCI devices creation code into its own routine.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
In order to reduce the DeviceManager's new() complexity, we can move the
ACPI device creation code into its own routine.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
In order to reduce the DeviceManager's new() complexity, we can move the
ACPI device creation code into its own routine.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
In order to reduce the DeviceManager's new() complexity, we can move the
legacy devices creation code into its own routine.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
In order to reduce the DeviceManager's new() complexity, we can move the
console creation code into its own routine.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Move CpuManager, Vcpu and related functionality to its own module (and
file) inside the VMM crate
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
In most cases we return a 204 (No Content) and not a 201.
In those cases, we do not send any HTTP body back at all.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
The new micro_http package provides a built-in HttpServer wrapper for
running a more robust HTTP server based on the package HTTP API.
Switching to this implementation allows us to, among other things,
handle HTTP requests that are larger than 1024 bytes.
Fixes: #423
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
The HTTP API responses are encoded in json
Suggested-by: Samuel Ortiz <sameo@linux.intel.com>
Tested-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Update micro_http create to allow set content type.
Suggested-by: Samuel Ortiz <sameo@linux.intel.com>
Tested-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Pull details of vCPU management (booting, pausing, resuming, shutdown)
into it's own structure. This will ultimately enable this to be moved to
its own file and encapsulate all the vCPU handling for the VMM.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Remove ACPI table creation from arch crate to the vmm crate simplifying
arch::configure_system()
GuestAddress(0) is used to mean no RSDP table rather than adding
complexity with a conditional argument or an Option type as it will
evaluate to a zero value which would be the default anyway.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
In vm_reboot, while build the new vm, the old one pointed by self.vm
is not released, that is, the tap opened by self.vm is not closed
either. As a result, the associated dev name slot in host kernel is
still in use state, which prevents the new build from picking it up as
the new opened tap's name, but to use the name in next slot finally.
Call self.vm_shutdown instead here since it has call take() on vm reference,
which could ensure the old vm is destructed before the new vm build.
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Ensure that we tell the allocator about all the I/O ports that we are
using for I/O bus attached devices (serial, i8042, ACPI device.)
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
In order to group together some functions that can be shared across
virtio transport layers, this commit introduces a new trait called
VirtioTransport.
The first function of this trait being ioeventfds() as it is needed from
both virtio-mmio and virtio-pci devices, represented by MmioDevice and
VirtioPciDevice structures respectively.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Now that kvm-ioctls has been updated, the function unregister_ioevent()
can be used to remove eventfd previously associated with some specific
PIO or MMIO guest address. Particularly, it is useful for the PCI BAR
reprogramming case, as we want to ensure the eventfd will only get
triggered by the new BAR address, and not the old one.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
We need to rely on the latest kvm-ioctls version to benefit from the
recent addition of unregister_ioevent(), allowing us to detach a
previously registered eventfd to a PIO or MMIO guest address.
Because of this update, we had to modify the current constraint we had
on the vmm-sys-util crate, using ">= 0.1.1" instead of being strictly
tied to "0.2.0".
Once the dependency conflict resolved, this commit took care of fixing
build issues caused by recent modification of kvm-ioctls relying on
EventFd reference instead of RawFd.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The specific part of PCI BAR reprogramming that happens for a virtio PCI
device is the update of the ioeventfds addresses KVM should listen to.
This should not be triggered for every BAR reprogramming associated with
the virtio device since a virtio PCI device might have multiple BARs.
The update of the ioeventfds addresses should only happen when the BAR
related to those addresses is being moved.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The PciDevice trait is supposed to describe only functions related to
PCI. The specific method ioeventfds() has nothing to do with PCI, but
instead would be more specific to virtio transport devices.
This commit removes the ioeventfds() method from the PciDevice trait,
adding some convenient helper as_any() to retrieve the Any trait from
the structure behing the PciDevice trait. This is the only way to keep
calling into ioeventfds() function from VirtioPciDevice, so that we can
still properly reprogram the PCI BAR.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Storing a strong reference to the AddressManager behind the
DeviceRelocation trait results in a cyclic reference count.
Use a weak reference to break that dependency.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Based on the value being written to the BAR, the implementation can
now detect if the BAR is being moved to another address. If that is the
case, it invokes move_bar() function from the DeviceRelocation trait.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In order to trigger the PCI BAR reprogramming from PciConfigIo and
PciConfigMmmio, we need the PciBus to have a hold onto the trait
implementation of DeviceRelocation.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
By implementing the DeviceRelocation trait for the AddressManager
structure, we now have a way to let the PCI BAR reprogramming happen.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In order to reuse the SystemAllocator later at runtime, it is moved into
the new structure AddressManager. The goal is to have a hold onto the
SystemAllocator and both IO and MMIO buses so that we can use them
later.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In case a VFIO devices is being attached behind a virtual IOMMU, we
should not automatically map the entire guest memory for the specific
device.
A VFIO device attached to the virtual IOMMU will be driven with IOVAs,
hence we should simply wait for the requests coming from the virtual
IOMMU.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
When VFIO devices are created and if the device is attached to the
virtual IOMMU, the ExternalDmaMapping trait implementation is created
and associated with the device. The idea is to build a hash map of
device IDs with their associated trait implementation.
This hash map is provided to the virtual IOMMU device so that it knows
how to properly trigger external mappings associated with VFIO devices.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
With this implementation of the trait ExternalDmaMapping, we now have
the tool to provide to the virtual IOMMU to trigger the map/unmap on
behalf of the guest.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The VFIO container is the object needed to update the VFIO mapping
associated with a VFIO device. This patch allows the device manager
to have access to the VFIO container.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This patch attaches VFIO devices to the virtual IOMMU if they are
identified as they should be, based on the option "iommu=on". This
simply takes care of adding the PCI device ID to the ACPI IORT table.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Having the virtual IOMMU created with --iommu is one thing, but we also
need a way to decide if a VFIO device should be attached to the virtual
IOMMU or not. That's why we introduce an extra option "iommu" with the
value "on" or "off". By default, the device is not attached, which means
"iommu=off".
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Fix invalid type for version:
- VmInfo.version.type string
Change Null value from enum as it has problems to build clients with
openapi tools.
- ConsoleConfig.mode.enum Null -> Nil
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
We should return an explicit error when the transition from on VM state
to another is invalid.
The valid_transition() routine for the VmState enum essentially
describes the VM state machine.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
In order to pause a VM, we signal all the vCPU threads to get them out
of vmx non-root. Once out, the vCPU thread will check for a an atomic
pause boolean. If it's set to true, then the thread will park until
being resumed.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
So that we don't need to forward an ExitBehaviour up to the VMM thread.
This simplifies the control loop and the VMM thread even further.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
This commit is the glue between the virtio-pci devices attached to the
vIOMMU, and the IORT ACPI table exposing them to the guest as sitting
behind this vIOMMU.
An important thing is the trait implementation provided to the virtio
vrings for each device attached to the vIOMMU, as they need to perform
proper address translation before they can access the buffers.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Having the virtual IOMMU created with --iommu is one thing, but we also
need a way to decide if a virtio-vsock device should be attached to this
virtual IOMMU or not. That's why we introduce an extra option "iommu"
with the value "on" or "off". By default, the device is not attached,
which means "iommu=off".
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Having the virtual IOMMU created with --iommu is one thing, but we also
need a way to decide if a virtio-console device should be attached to
this virtual IOMMU or not. That's why we introduce an extra option
"iommu" with the value "on" or "off". By default, the device is not
attached, which means "iommu=off".
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Having the virtual IOMMU created with --iommu is one thing, but we also
need a way to decide if a virtio-pmem device should be attached to this
virtual IOMMU or not. That's why we introduce an extra option "iommu"
with the value "on" or "off". By default, the device is not attached,
which means "iommu=off".
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Having the virtual IOMMU created with --iommu is one thing, but we also
need a way to decide if a virtio-rng device should be attached to this
virtual IOMMU or not. That's why we introduce an extra option "iommu"
with the value "on" or "off". By default, the device is not attached,
which means "iommu=off".
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Having the virtual IOMMU created with --iommu is one thing, but we also
need a way to decide if a virtio-net device should be attached to this
virtual IOMMU or not. That's why we introduce an extra option "iommu"
with the value "on" or "off". By default, the device is not attached,
which means "iommu=off".
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Having the virtual IOMMU created with --iommu is one thing, but we also
need a way to decide if a virtio-blk device should be attached to this
virtual IOMMU or not. That's why we introduce an extra option "iommu"
with the value "on" or "off". By default, the device is not attached,
which means "iommu=off".
One side effect of this new option is that we had to introduce a new
option for the disk path, simply called "path=".
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Adding a simple iommu boolean field to the VmConfig structure so that we
can later use it to create a virtio-iommu device for the current VM.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>