Add API to the hypervisor interface and implement for KVM to allow the
special TDX KVM ioctls on the VM and vCPU FDs.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Add the skeleton of the "tdx" feature with a module ready inside the
arch crate to store implementation details.
TEST=cargo build --features="tdx"
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
In particular update for the vmm-sys-util upgrade and all the other
dependent packages. This requires an updated forked version of
kvm-bindings (due to updated vfio-ioctls) but allowed the removal of our
forked version of kvm-ioctls.
The changes to the API from kvm-ioctls and vmm-sys-util required some
other minor changes to the code.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This commit moves both pci and vmm code from the internal vfio-ioctls
crate to the upstream one from the rust-vmm project.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This is the initial folder structure of the mshv module inside
the hypervisor crate. The aim of this module is to support Microsoft
Hyper-V as a supported Hypervisor.
Signed-off-by: Muminul Islam <muislam@microsoft.com>
This gives a nicer user experience and this error can now be used as the
source for other errors based off this.
See: #1910
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
A new version of vm-memory was released upstream which resulted in some
components pulling in that new version. Update the version number used
to point to the latest version but continue to use our patched version
due to the fix for #1258
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
virtio-mem device would use 'VIRTIO_MEM_F_ACPI_PXM' to add memory to NUMA
node, which MUST be existed, otherwise it will be assigned to node id 0,
even if user specify different node id.
According ACPI spec about Memory Affinity Structure, system hardware
supports hot-add memory region using 'Hot Pluggable | Enabled' flags.
Signed-off-by: Jiangbo Wu <jiangbo.wu@intel.com>
By adding a new io_uring feature gate, we let the user the possibility
to choose if he wants to enable the io_uring improvements or not.
Since the io_uring feature depends on the availability on recent host
kernels, it's better if we leave it off for now.
As soon as our CI will have support for a kernel 5.6 with all the
features needed from io_uring, we'll enable this feature gate
permanently.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In case the host supports io_uring and the specific io_uring options
needed, the VMM will choose the asynchronous version of virtio-blk.
This will enable better I/O performances compared to the default
synchronous version.
This is also important to note the VMM won't be able to use the
asynchronous version if the backend image is in QCOW format.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
It gets bubbled all the way up from hypervsior crate to top-level
Cargo.toml.
Cloud Hypervisor can't function without KVM at this point, so make it
a default feature.
Fix all scripts that use --no-default-features.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
Remove the vmm dependency from vhost_user_block and vhost_user_net where
it was existing to use config::OptionParser. By moving the OptionParser
to its own crate at the top-level we can remove the very heavy
dependency that these vhost-user backends had.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Split the generic virtio code (queues and device type) from the
VirtioDevice trait, transport and device implementations.
This also simplifies the feature handling in vhost_user_backend as the
vm-virtio crate is no longer has any features.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Start moving the vmm, arch and pci crates to being hypervisor agnostic
by using the hypervisor trait and abstractions. This is not a complete
switch and there are still some remaining KVM dependencies.
Signed-off-by: Muminul Islam <muislam@microsoft.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Use the ACPI feature to control whether to build the mptable. This is
necessary as the mptable and ACPI RSDP table can easily overwrite each
other leading to it failing to boot.
TEST=Compile with default features and see that --cpus boot=48 now
works, try with --no-default-features --features "pci" and observe the
--cpus boot=48 also continues to work.
Fixes: #1132
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
OVMF and other standard firmwares use I/O port 0x402 as a simple debug
port by writing ASCII characters to it. This is gated under a feature
that is not enabled by default.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This implements the send() function of the Transportable trait, so that
the guest memory regions can be saved into one file per region.
This will need to be extended for live migration, as it will require
other transport methods and the recv() function will need to be
implemented too.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
These two new helpers will be useful to capture a vCPU state and being
able to restore it at a later time.
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
We need the project to rely on kvm-bindings and kvm-ioctls branches
which include the serde derive to be able to serialize and deserialize
some KVM structures.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
A Snapshottable component can snapshot itself and
provide a MigrationSnapshot payload as a result.
A MigrationSnapshot payload is a map of component IDs to a list of
migration sections (MigrationSection). As component can be made of
several Migratable sub-components (e.g. the DeviceManager and its
device objects), a migration snapshot can be made of multiple snapshot
itself.
A snapshot is a list of migration sections, each section being a
component state snapshot. Having multiple sections allows for easier and
backward compatible migration payload extensions.
Once created, a migratable component snapshot may be transported and this
is what the Transportable trait defines, through 2 methods: send and recv.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
The seccomp crate from Firecracker is nicely implemented, documented and
tested, which is a good reason for relying on it to create and apply
seccomp filters.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This feature is stable and there is no need for this to be behind a
flag. This will also reduce the time needed to run the integration test
as we will not be running them all again under the flag.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Use a new feature called "pvh_boot" to enable using the PVH boot
protocol if the guest kernel supports it. The feature can be enabled
by building with:
cargo build [--release] --features "pvh_boot"
Once performance has been evaluated, this can be made part of the
default set of features so that any guest that supports it boots
using PVH as the preferred option as is the case in QEMU.
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Relying on the latest vm-memory version, including the freshly
introduced structure GuestMemoryAtomic, this patch replaces every
occurrence of Arc<ArcSwap<GuestMemoryMmap> with
GuestMemoryAtomic<GuestMemoryMmap>.
The point is to rely on the common RCU-like implementation from
vm-memory so that we don't have to do it from Cloud-Hypervisor.
Fixes#735
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
If no socket is supplied when enabling "vhost_user=true" on "--net"
follow the "exe" path in the /proc entry for this process and launch the
network backend (via the vmm_path field.)
Currently this only supports creating a new tap interface as the network
backend also only supports that.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This allows us to change the memory map that is being used by the
devices via an atomic swap (by replacing the map with another one). The
ArcSwap provides the mechanism for atomically swapping from to another
whilst still giving good read performace. It is inside an Arc so that we
can use a single ArcSwap for all users.
Not covered by this change is replacing the GuestMemoryMmap itself.
This change also removes some vertical whitespace from use blocks in the
files that this commit also changed. Vertical whitespace was being used
inconsistently and broke rustfmt's behaviour of ordering the imports as
it would only do it within the block.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This brings more modularity to the code, which will be helpful when we
will later test the CLI and OpenAPI generate the same VmConfig output.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Since the Snapshotable placeholder and Migratable traits are provided as
well, the DeviceManager object and all its objects are now Migratable.
All Migratable devices are tracked as Arc<Mutex<dyn Migratable>>
references.
Keeping track of all migratable devices allows for implementing the
Migratable trait for the DeviceManager structure, making the whole
device model potentially migratable.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
The signal handling for vCPU signals has changed in the latest release
so switch to the new API.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Since the kvm crates now depend on vmm-sys-util, the bump must be
atomic.
The kvm-bindings and ioctls 0.2.0 and 0.4.0 crates come with a few API
changes, one of them being the use of a kvm_ioctls specific error type.
Porting our code to that type makes for a fairly large diff stat.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
The new micro_http package provides a built-in HttpServer wrapper for
running a more robust HTTP server based on the package HTTP API.
Switching to this implementation allows us to, among other things,
handle HTTP requests that are larger than 1024 bytes.
Fixes: #423
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Update micro_http create to allow set content type.
Suggested-by: Samuel Ortiz <sameo@linux.intel.com>
Tested-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Remove ACPI table creation from arch crate to the vmm crate simplifying
arch::configure_system()
GuestAddress(0) is used to mean no RSDP table rather than adding
complexity with a conditional argument or an Option type as it will
evaluate to a zero value which would be the default anyway.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
We need to rely on the latest kvm-ioctls version to benefit from the
recent addition of unregister_ioevent(), allowing us to detach a
previously registered eventfd to a PIO or MMIO guest address.
Because of this update, we had to modify the current constraint we had
on the vmm-sys-util crate, using ">= 0.1.1" instead of being strictly
tied to "0.2.0".
Once the dependency conflict resolved, this commit took care of fixing
build issues caused by recent modification of kvm-ioctls relying on
EventFd reference instead of RawFd.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>