Implement most of the client side (i.e. the VMM side) of the vfio-user
protocol:
https://github.com/nutanix/libvfio-user/blob/master/docs/vfio-user.rst
Items that are not implemented (because they are optimisations or are
unnecessary given alternative solutions):
* VFIO_USER_DMA_READ/WRITE - this is a way for the server to read and
write guest memory when the client cannot share that memory by fd.
Since we do support sharing the memory by fd this is not required.
* VFIO_USER_GET_REGION_IO_FDS - an optimisation to bypass the VMM by
having KVM talk directly to the backend using ioregionfd.
* VFIO_USER_DIRTY_PAGES - for the implementation of live migration.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
As a first step towards completing live migration with tracking of
dirty pages written by the VMM, this commit patches the dependent
vm-memory crate to the upstream version with the dirty-page-tracking
capability. Most changes are due to the updated `GuestMemoryMmap`,
`GuestRegionMmap`, and `MmapRegion` structs, which now take an
additional generic type parameter to specify which 'bitmap backend' is
used.
The above changes should be transparent to the rest of the code base,
e.g. all unit/integration tests should pass without additional changes.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Remove unnecessary code for these structs. Moving this also allows the
removal of the arch_gen crate.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The Rust manifest requires a three-part version number; however, we
continue with our plan to use two parts, so the last part will not be
used.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
To support I/O throttling on virtio-net devices, we need to use the
'rate_limiter' module from the 'net_utils' crate. Given that the
'virtio-devices' crate depends on 'net_utils', we need to move the
'rate_limiter' module out of the 'virtio-devices' crate to avoid a
circular dependency. Considering that 'rate_limiter' is not virtio
specific and could be reused for non-virtio devices, we move it to its
own crate.
Signed-off-by: Bo Chen <chen.bo@intel.com>
It must be specified as excluded from the workspace as it must not be
built on non-test targets due to issues with the ssh2 dependency and the
musl toolchain.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Add the skeleton of the "tdx" feature with a module ready inside the
arch crate to store implementation details.
TEST=cargo build --features="tdx"
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
These need to be updated together as the kvm-ioctls depends upon a
strictly newer version of kvm-bindings which requires a rebase in the CH
fork.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
In particular update for the vmm-sys-util upgrade and all the other
dependent packages. This requires an updated forked version of
kvm-bindings (due to updated vfio-ioctls) but allowed the removal of our
forked version of kvm-ioctls.
The changes to the API from kvm-ioctls and vmm-sys-util required some
other minor changes to the code.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The vhost crate from rust-vmm is ready, which is why we do the switch
from the Cloud Hypervisor fork to the upstream crate.
At the same time, we rename the crate from vhost_rs to vhost.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This commit moves both pci and vmm code from the internal vfio-ioctls
crate to the upstream one from the rust-vmm project.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This is only used in the integration test and was erroneously included in
the main binary dependencies.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This removes the dependency on "tempdir" which in turn depends on the
large rand dependency chain.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This crate exposes the ability for the VMM to set a file that events
should be written to. The event!() macro provides an interface to report
those events allowing the specification of an event source, an event
type and optional extra data. This will be written to the provided file
descriptor as JSON data.
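
As a rough illustration of the idea, the sketch below writes one JSON
object per event to any writer. The `emit_event!` macro, the
`write_event` helper and the JSON field names are hypothetical and
chosen for this example only; the real event!() macro writes to the
file configured by the VMM rather than taking a sink argument.

```rust
use std::io::Write;
use std::time::{SystemTime, UNIX_EPOCH};

/// Escape the characters that would break a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Write one event as a single JSON object followed by a newline.
/// The field names here are assumptions, not the crate's actual schema.
fn write_event<W: Write>(
    sink: &mut W,
    source: &str,
    event: &str,
    extra: Option<&str>,
) -> std::io::Result<()> {
    let secs = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_secs())
        .unwrap_or(0);
    write!(
        sink,
        "{{\"timestamp\":{},\"source\":\"{}\",\"event\":\"{}\"",
        secs,
        json_escape(source),
        json_escape(event)
    )?;
    if let Some(extra) = extra {
        write!(sink, ",\"extra\":\"{}\"", json_escape(extra))?;
    }
    writeln!(sink, "}}")
}

/// Hypothetical stand-in for event!(): source, event type, optional extra data.
macro_rules! emit_event {
    ($sink:expr, $source:expr, $event:expr) => {
        write_event($sink, $source, $event, None)
    };
    ($sink:expr, $source:expr, $event:expr, $extra:expr) => {
        write_event($sink, $source, $event, Some($extra))
    };
}
```

Taking the sink as a parameter keeps the sketch testable with an
in-memory buffer; a global file handle would be closer to the
description above.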
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
If we receive SIGSYS and identify it as a seccomp violation then give
friendly instructions on how to debug further. We are unable to decode
the siginfo_t struct ourselves due to https://github.com/rust-lang/libc/issues/716
Fixes: #2139
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Since we can't test mshv and kvm at the same time, --all-features no
longer works.
We factor all non-hypervisor related features into a common set and
combine that with either mshv or kvm.
Co-Developed-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Co-Developed-by: Wei Liu <liuwe@microsoft.com>
Signed-off-by: Wei Liu <liuwe@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
This is the initial folder structure of the mshv module inside
the hypervisor crate. The aim of this module is to support Microsoft
Hyper-V as a supported Hypervisor.
Signed-off-by: Muminul Islam <muislam@microsoft.com>
Unfortunately it seems patch entries are ignored when obtaining
dependencies from another workspace.
Remove the problematic kvm-ioctls and kvm-bindings patch entries and use
the forked repository directly.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Instead of waiting blindly for a fixed amount of sleeping time, we can
use the `wait-timeout` crate to explicitly wait for VM shutdown (with a
timeout). This can reduce the execution time of some tests
substantially. Also, this patch increases the `shutdown` timeout for
'test_reboot', which should fix the recent sporadic failures in this
test.
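
The general shape of waiting with a timeout can be sketched with the
standard library alone. This is not the `wait-timeout` crate's API; it
is a polling approximation, and the helper names `poll_until` and
`wait_with_timeout` as well as the 50 ms poll interval are assumptions
made for this example.

```rust
use std::process::{Child, ExitStatus};
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Poll `check` until it yields a value or `timeout` elapses.
/// Returns `None` if the deadline passes without a value.
fn poll_until<T>(
    timeout: Duration,
    interval: Duration,
    mut check: impl FnMut() -> Option<T>,
) -> Option<T> {
    let deadline = Instant::now() + timeout;
    loop {
        if let Some(value) = check() {
            return Some(value);
        }
        if Instant::now() >= deadline {
            return None;
        }
        sleep(interval);
    }
}

/// Wait for `child` to exit, giving up after `timeout`.
/// `Ok(None)` means the process was still running at the deadline.
fn wait_with_timeout(
    child: &mut Child,
    timeout: Duration,
) -> std::io::Result<Option<ExitStatus>> {
    // try_wait() never blocks: Ok(None) means "still running".
    poll_until(timeout, Duration::from_millis(50), || {
        child.try_wait().transpose()
    })
    .transpose()
}
```

A test returns as soon as the VMM process exits instead of always
paying the full sleep, which is where the time saving comes from.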
Signed-off-by: Bo Chen <chen.bo@intel.com>
Instead of blindly waiting for 20-40s for the guest VM to boot, this
patch waits explicitly for a notification from the guest VM by using a
simple TcpListener on the host and a custom systemd service in the
guest.
This patch also ports a few tests to use this new mechanism, while more
tests are to be ported.
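
The host side of such a boot notification could look roughly like the
sketch below. The `GuestNotifier` type, its method names, the polling
interval and the read timeout are all assumptions for illustration; the
guest side (the systemd service that connects back) is not shown.

```rust
use std::io::{ErrorKind, Read};
use std::net::{SocketAddr, TcpListener};
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Host-side listener that a test blocks on until the guest reports in.
struct GuestNotifier {
    listener: TcpListener,
}

impl GuestNotifier {
    fn bind(addr: &str) -> std::io::Result<Self> {
        let listener = TcpListener::bind(addr)?;
        // Non-blocking accept lets us enforce our own deadline.
        listener.set_nonblocking(true)?;
        Ok(Self { listener })
    }

    fn local_addr(&self) -> std::io::Result<SocketAddr> {
        self.listener.local_addr()
    }

    /// Wait for one connection and return whatever the guest sent
    /// before closing it, or a TimedOut error after `timeout`.
    fn wait(&self, timeout: Duration) -> std::io::Result<String> {
        let deadline = Instant::now() + timeout;
        loop {
            match self.listener.accept() {
                Ok((mut stream, _peer)) => {
                    stream.set_nonblocking(false)?;
                    stream.set_read_timeout(Some(Duration::from_secs(5)))?;
                    let mut msg = String::new();
                    stream.read_to_string(&mut msg)?;
                    return Ok(msg);
                }
                Err(e) if e.kind() == ErrorKind::WouldBlock => {
                    if Instant::now() >= deadline {
                        return Err(std::io::Error::new(
                            ErrorKind::TimedOut,
                            "guest did not report boot in time",
                        ));
                    }
                    sleep(Duration::from_millis(20));
                }
                Err(e) => return Err(e),
            }
        }
    }
}
```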
Signed-off-by: Bo Chen <chen.bo@intel.com>
Use a Result<> type with an error to simplify the code in start_vmm().
This will also make it easier to add cleanup functionality.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Now that our CI is running with a kernel that is new enough to
support io_uring we can turn this feature on by default.
See: #1561
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Split out the HTTP request handling code from ch-remote into a new
crate which can be used in other places where talking to the API server
by HTTP is necessary.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Instead of having the hypervisor crate embedding Cloud-Hypervisor forks
from the rust-vmm project, it's more appropriate to leave the rust-vmm
references in the hypervisor crate, and have the root Cargo.toml being
patched.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The Cloud-Hypervisor fork of the vhost crate contains one small
additional patch compared to the rust-vmm upstream version, meant for
increasing the connection timeout.
This patch is intended to be merged in order to check whether it helps
fix the vhost-user-blk flakes that we've been observing recently in our
CI. If it does, we'll submit a similar patch upstream and switch back to
the upstream vhost crate; otherwise we'll simply switch back to the
upstream crate, discarding this patch.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In our build script (build.rs), we don't set the environment variable
'BUILD_VERSION' when the 'git describe' command fails (e.g. when the
current source tree does not contain git information). This patch adds
a fallback path where the default value of 'BUILD_VERSION' is based on
the 'cloud-hypervisor' crate version.
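
A minimal sketch of that fallback logic follows. The function names,
the exact `git describe` flags and the "v" prefix on the fallback value
are assumptions for this example and may differ from the actual
build.rs.

```rust
use std::process::Command;

/// Ask git for a version string. Returns None if git is missing, the
/// command fails (e.g. no .git directory), or it prints nothing.
fn git_describe() -> Option<String> {
    let output = Command::new("git")
        .args(&["describe", "--dirty"])
        .output()
        .ok()?;
    if !output.status.success() {
        return None;
    }
    let version = String::from_utf8(output.stdout).ok()?;
    let version = version.trim();
    if version.is_empty() {
        None
    } else {
        Some(version.to_string())
    }
}

/// Prefer the git-derived version, falling back to the crate version.
fn build_version(git: Option<String>, crate_version: &str) -> String {
    git.unwrap_or_else(|| format!("v{}", crate_version))
}

// In a build script this would be wired up roughly as:
//   let version = build_version(git_describe(), env!("CARGO_PKG_VERSION"));
//   println!("cargo:rustc-env=BUILD_VERSION={}", version);
```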
Fixes: #1669
Signed-off-by: Bo Chen <chen.bo@intel.com>
By adding a new io_uring feature gate, we let users choose whether to
enable the io_uring improvements.
Since the io_uring feature depends on the availability of recent host
kernels, it's better if we leave it off for now.
As soon as our CI has support for a 5.6 kernel with all the features
needed from io_uring, we'll enable this feature gate permanently.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Extract the code that is used by vhost_user_block from the
virtio-devices crate to remove the dependencies on unrequired
functionality such as the virtio transports.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
It gets bubbled all the way up from the hypervisor crate to the
top-level Cargo.toml.
Cloud Hypervisor can't function without KVM at this point, so make it
a default feature.
Fix all scripts that use --no-default-features.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
Remove the vmm dependency from vhost_user_block and vhost_user_net, where
it existed only to use config::OptionParser. By moving the OptionParser
to its own crate at the top-level we can remove the very heavy
dependency that these vhost-user backends had.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
There are several dependencies that need updating so update them
manually rather than relying on dependabot. This will reduce the load on
the CI.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
With vhost_user_fs binary moved to its own crate the dependencies in the
top level can be trimmed significantly.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Split the generic virtio code (queues and device type) from the
VirtioDevice trait, transport and device implementations.
This also simplifies the feature handling in vhost_user_backend as the
vm-virtio crate no longer has any features.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The purpose of this trait is to add support for hypervisors other than
KVM, e.g. Microsoft Hyper-V.
Further commits will define additional hypervisor related traits like
Vcpu and Vm. Each supported hypervisor will need to implement all the
traits defined in the hypervisor crate.
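
The layering described here can be sketched as three traits with one
backend implementing all of them. Every method name and signature
below, and the `FakeHypervisor` backend, are hypothetical and only
illustrate the shape; they are not the hypervisor crate's actual
definitions.

```rust
use std::sync::Arc;

/// Hypothetical per-vCPU interface.
trait Vcpu: Send + Sync {
    fn id(&self) -> u8;
}

/// Hypothetical per-VM interface.
trait Vm: Send + Sync {
    fn create_vcpu(&self, id: u8) -> Result<Arc<dyn Vcpu>, String>;
}

/// Hypothetical top-level interface each backend must provide.
trait Hypervisor: Send + Sync {
    fn name(&self) -> &'static str;
    fn create_vm(&self) -> Result<Arc<dyn Vm>, String>;
}

// A stand-in backend; a real one would wrap KVM or Hyper-V handles.
struct FakeHypervisor;
struct FakeVm;
struct FakeVcpu(u8);

impl Vcpu for FakeVcpu {
    fn id(&self) -> u8 {
        self.0
    }
}

impl Vm for FakeVm {
    fn create_vcpu(&self, id: u8) -> Result<Arc<dyn Vcpu>, String> {
        Ok(Arc::new(FakeVcpu(id)))
    }
}

impl Hypervisor for FakeHypervisor {
    fn name(&self) -> &'static str {
        "fake"
    }
    fn create_vm(&self) -> Result<Arc<dyn Vm>, String> {
        Ok(Arc::new(FakeVm))
    }
}

/// VMM code written against the traits works with any backend.
fn boot(hv: &dyn Hypervisor, vcpus: u8) -> Result<Vec<Arc<dyn Vcpu>>, String> {
    let vm = hv.create_vm()?;
    (0..vcpus).map(|id| vm.create_vcpu(id)).collect()
}
```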
Signed-off-by: Muminul Islam <muislam@microsoft.com>
The currently released vm-memory uses aligned and volatile copying for
all data. The version in the fork only uses the assured (and slower)
path for data up to the natural data width.
Fixes: #1258
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Extend the set of tests we have for virtio-net and vhost-user-net to
check for host MAC address setting.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
OVMF and other standard firmwares use I/O port 0x402 as a simple debug
port by writing ASCII characters to it. This is gated under a feature
that is not enabled by default.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
We need the project to rely on kvm-bindings and kvm-ioctls branches
which include the serde derive to be able to serialize and deserialize
some KVM structures.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
A Snapshottable component can snapshot itself and
provide a MigrationSnapshot payload as a result.
A MigrationSnapshot payload is a map of component IDs to a list of
migration sections (MigrationSection). As a component can be made of
several Migratable sub-components (e.g. the DeviceManager and its
device objects), a migration snapshot can itself be made of multiple
snapshots.
A snapshot is a list of migration sections, each section being a
component state snapshot. Having multiple sections allows for easier and
backward compatible migration payload extensions.
Once created, a migratable component snapshot may be transported and this
is what the Transportable trait defines, through 2 methods: send and recv.
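
The data model described above can be sketched as follows. The field
names, method signatures and the example `Device` and `DeviceManager`
types are assumptions for illustration; only the trait and payload
names come from this commit message.

```rust
use std::collections::HashMap;

/// One piece of component state.
#[derive(Clone, Debug, PartialEq)]
struct MigrationSection {
    id: String,
    data: Vec<u8>,
}

/// Component ID -> list of migration sections.
type MigrationSnapshot = HashMap<String, Vec<MigrationSection>>;

trait Snapshottable {
    fn id(&self) -> String;
    fn snapshot(&self) -> MigrationSnapshot;
}

/// Hypothetical signatures for the two transport methods.
trait Transportable: Snapshottable {
    fn send(&self, snapshot: &MigrationSnapshot, destination_url: &str) -> Result<(), String>;
    fn recv(&self, source_url: &str) -> Result<MigrationSnapshot, String>;
}

struct Device {
    name: String,
    state: u32,
}

impl Snapshottable for Device {
    fn id(&self) -> String {
        self.name.clone()
    }
    fn snapshot(&self) -> MigrationSnapshot {
        let section = MigrationSection {
            id: format!("{}-state", self.name),
            data: self.state.to_le_bytes().to_vec(),
        };
        let mut snap = MigrationSnapshot::new();
        snap.insert(self.id(), vec![section]);
        snap
    }
}

struct DeviceManager {
    devices: Vec<Device>,
}

impl Snapshottable for DeviceManager {
    fn id(&self) -> String {
        "device-manager".to_string()
    }
    fn snapshot(&self) -> MigrationSnapshot {
        let own = MigrationSection {
            id: "device-count".to_string(),
            data: (self.devices.len() as u32).to_le_bytes().to_vec(),
        };
        let mut snap = MigrationSnapshot::new();
        snap.insert(self.id(), vec![own]);
        // Fold each sub-component's snapshot into the parent's map.
        for device in &self.devices {
            snap.extend(device.snapshot());
        }
        snap
    }
}
```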
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
This commit introduces the application of the seccomp filter to the VMM
thread. The filter is empty for now (SeccompLevel::None).
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Rather than using a raw OS disk image. This will be useful when the test
is extended to doing I/O on the image.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This feature is stable and there is no need for this to be behind a
flag. This will also reduce the time needed to run the integration test
as we will not be running them all again under the flag.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Use a new feature called "pvh_boot" to enable using the PVH boot
protocol if the guest kernel supports it. The feature can be enabled
by building with:
cargo build [--release] --features "pvh_boot"
Once performance has been evaluated, this can be made part of the
default set of features so that any guest that supports it boots
using PVH as the preferred option as is the case in QEMU.
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
@dependabot bumped the dependency to 0.4.10 but this is no longer a
valid version so downgrade appropriately.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This change enables vhost_user_fs to process multiple requests in
parallel by scheduling them into a ThreadPool (from the Futures
crate).
Parallelism on a single file is limited by the nature of the operation
executed on it. A recent commit replaced the Mutex that protects the
File within HandleData with a RwLock, to allow some operations (at
this moment, only "read" and "write") to proceed in parallel by
acquiring a read lock.
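
The locking pattern can be sketched as below, assuming positional I/O
so that reads and writes only need a shared reference to the File. The
struct layout and method names are hypothetical, and using the write
lock for `set_len` is an assumption about which operations need
exclusivity.

```rust
use std::fs::File;
use std::io;
use std::os::unix::fs::FileExt;
use std::sync::RwLock;

/// Simplified stand-in for the handle wrapper described above.
struct HandleData {
    file: RwLock<File>,
}

impl HandleData {
    /// Positional reads take the read lock, so they can overlap.
    fn read_at(&self, buf: &mut [u8], offset: u64) -> io::Result<usize> {
        self.file.read().unwrap().read_at(buf, offset)
    }

    /// Positional writes also take only the read lock: the lock
    /// protects the handle, not the file contents.
    fn write_at(&self, buf: &[u8], offset: u64) -> io::Result<usize> {
        self.file.read().unwrap().write_at(buf, offset)
    }

    /// Operations that must not overlap with others take the write lock.
    fn set_len(&self, len: u64) -> io::Result<()> {
        self.file.write().unwrap().set_len(len)
    }
}
```

With a Mutex every one of these calls would serialize; with the RwLock
only the exclusive operations do.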
A more complex approach was also implemented [1], involving
instrumentation through vhost_user_backend to be able to serialize
completions, reducing the pressure on the vring RwLock. This strategy
improved the performance on some corner cases, while making it worse
on other, more common ones. This fact, in addition to it requiring
wider changes through the source code, prompted me to drop it in favor
of this one.
[1] https://github.com/slp/cloud-hypervisor/tree/vuf_async
Signed-off-by: Sergio Lopez <slp@redhat.com>
This prevents the output being wrapped at 120 characters and giving
strange results.
Fixes: #899
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
As cloud-hypervisor/vhost crate (dragonball branch) is ready to be used,
switch vhost_rs from internal crate to the external one.
Signed-off-by: Eryu Guan <eguan@linux.alibaba.com>
Add a build-script to propagate the git commit hash to other crates at
compile time through environment variables, and display the hash along
with the '--version' option.
Fixes: #729
Signed-off-by: Bo Chen <chen.bo@intel.com>
Extract the majority of the code that provides the vhost-user-block
backend into its own crate and port the binary to use it.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Extract the majority of the code that provides the vhost-user-net
backend into its own crate and port the binary to use it.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Make all the crates members of the workspace so that "cargo test
--workspace" will find them all and test them with the features enabled
that we use.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This allows us to change the memory map that is being used by the
devices via an atomic swap (by replacing the map with another one). The
ArcSwap provides the mechanism for atomically swapping from one to another
whilst still giving good read performance. It is inside an Arc so that we
can use a single ArcSwap for all users.
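
The sharing pattern can be illustrated with the standard library only.
This sketch deliberately uses `RwLock<Arc<T>>` as a stand-in and is not
the ArcSwap API; ArcSwap avoids taking a lock on the read path, which
this approximation does not. The `Swappable` type is hypothetical.

```rust
use std::sync::{Arc, RwLock};

/// Stand-in for a swappable pointer: readers get a cheap Arc clone of
/// the current value, writers replace the whole value at once.
struct Swappable<T> {
    inner: RwLock<Arc<T>>,
}

impl<T> Swappable<T> {
    fn new(value: T) -> Self {
        Self {
            inner: RwLock::new(Arc::new(value)),
        }
    }

    /// Snapshot of the current value; stays valid after later swaps.
    fn load(&self) -> Arc<T> {
        let guard = self.inner.read().unwrap();
        Arc::clone(&*guard)
    }

    /// Replace the value seen by all subsequent load() calls.
    fn store(&self, value: T) {
        *self.inner.write().unwrap() = Arc::new(value);
    }
}
```

Wrapping the whole thing in one outer Arc, as the message describes,
means every device observes the same swap rather than holding its own
copy.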
Not covered by this change is replacing the GuestMemoryMmap itself.
This change also removes some vertical whitespace from use blocks in the
files that this commit also changed. Vertical whitespace was being used
inconsistently and broke rustfmt's behaviour of ordering the imports as
it would only do it within the block.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The goal here is to ensure that CLI and OpenAPI both behave as closely
as possible, and also that they behave as expected.
Leveraging the reorganization of the code, we can now compare two
VmConfig structures generated from one CLI entry on one side, and from
an OpenAPI entry (JSON payload) on the other side.
Fixes: #535
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The signal handling for vCPU signals has changed in the latest release
so switch to the new API.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Since the kvm crates now depend on vmm-sys-util, the bump must be
atomic.
The kvm-bindings and ioctls 0.2.0 and 0.4.0 crates come with a few API
changes, one of them being the use of a kvm_ioctls specific error type.
Porting our code to that type makes for a fairly large diff stat.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
This new crate will be dedicated to vhost_user_fs specific code that can
be used as a library from the vhost-user-fs daemon.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Update the micro_http crate to allow setting the content type.
Suggested-by: Samuel Ortiz <sameo@linux.intel.com>
Tested-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Create a vhost-user-blk backend using vhost-user-backend and following
the conventions established by the existing vhost-user-net
implementation.
This backend is based on https://github.com/slp/vhost-user-backend,
but a bit simplified, making it closer to the original implementation
in Firecracker. The main features missing are EVENT_IDX, support for
asynchronous I/O and multiqueue, but it's still fully functional and
provides a good starting point for evolving it into a more complete
implementation.
Signed-off-by: Sergio Lopez <slp@redhat.com>
We need to rely on the latest kvm-ioctls version to benefit from the
recent addition of unregister_ioevent(), allowing us to detach a
previously registered eventfd to a PIO or MMIO guest address.
Because of this update, we had to modify the current constraint we had
on the vmm-sys-util crate, using ">= 0.1.1" instead of being strictly
tied to "0.2.0".
Once the dependency conflict was resolved, this commit took care of fixing
build issues caused by a recent modification of kvm-ioctls, which now
relies on an EventFd reference instead of a RawFd.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The new crate vm-device is created here to host the definitions of
traits not meant to be tied to virtio or VFIO specifically. We need to
add a new trait to update external DMA mappings for devices, which is
why the vm-device crate is the right fit for this.
We can expect this crate to be extended later once the design gets
approved from a rust-vmm perspective.
In this specific use case, some devices, such as VFIO or vhost-user
ones, need to be notified about mapping updates. This new trait
ExternalDmaMapping will allow such devices to implement their own way
of handling such events.
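
A sketch of what such a trait and one implementer might look like is
given below. The `map`/`unmap` method names, their parameters, the
error type and the `RecordingMapping` device are assumptions made for
this example; only the trait name comes from this commit message.

```rust
use std::collections::BTreeMap;
use std::sync::Mutex;

/// Callback interface for devices that keep their own DMA mappings.
/// Method names and parameters are hypothetical.
trait ExternalDmaMapping: Send + Sync {
    /// A range of guest physical memory became visible at `iova`.
    fn map(&self, iova: u64, gpa: u64, size: u64) -> Result<(), String>;
    /// A previously mapped range at `iova` went away.
    fn unmap(&self, iova: u64, size: u64) -> Result<(), String>;
}

/// Example implementer that just records mappings in a table; a VFIO
/// or vhost-user device would update its backend here instead.
#[derive(Default)]
struct RecordingMapping {
    table: Mutex<BTreeMap<u64, (u64, u64)>>,
}

impl ExternalDmaMapping for RecordingMapping {
    fn map(&self, iova: u64, gpa: u64, size: u64) -> Result<(), String> {
        if size == 0 {
            return Err("empty mapping".to_string());
        }
        self.table.lock().unwrap().insert(iova, (gpa, size));
        Ok(())
    }

    fn unmap(&self, iova: u64, size: u64) -> Result<(), String> {
        let mut table = self.table.lock().unwrap();
        let matches = table
            .get(&iova)
            .map_or(false, |&(_, mapped)| mapped == size);
        if matches {
            table.remove(&iova);
            Ok(())
        } else {
            Err(format!("no mapping of size {:#x} at iova {:#x}", size, iova))
        }
    }
}
```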
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Based off of crosvm revision b5237bbcf074eb30cf368a138c0835081e747d71
add a CMOS device. This is for environments that can't use the KVM clock
to get the current time (e.g. Windows and EFI).
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Create vhost-user-net backend with Tap interface, to offload network
transaction from cloud-hypervisor. The goal is to provide flexibility
about the backend being in use, but also more security as it will allow
users to isolate the backend with different security profiles since it
will run as a dedicated process on the host.
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Remove workspace from vhost_user_backend/Cargo.toml to have
vhost-user-backend compiled in cloud-hypervisor. Add workspace in
Cargo.toml to have vhost-user-backend consumed by vhost-user-net.
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
We now start the main VMM thread, which will be listening for VM and IPC
related events.
In order to start the configured VM, we no longer directly call the VM
API but we use the IPC instead, to first create and then start a VM.
Fixes: #303
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Add (non-default) support for using MMIO for virtio devices. This can be
tested by:
cargo build --no-default-features --features "mmio"
All necessary options will be injected into the kernel
command line.
Fixes: #243
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The command "cargo build --no-default-features" does not recursively
disable the default features across the workspace. Instead add an acpi
feature at the top-level, making it default, and then make that feature
conditional on all the crate acpi features.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This makes the log macros (error!, warn!, info!, etc) in the code work.
It currently defaults to showing only error! messages, but by passing an
increasing number of "-v"s on the command line the verbosity can be
increased.
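
The mapping from the number of "-v" flags to a log level can be
sketched as follows. The `Level` enum stands in for the log crate's
level type, and the exact level chosen for each count beyond the
default is an assumption.

```rust
/// Stand-in for a logging level, ordered from least to most verbose.
#[derive(Clone, Copy, Debug, PartialEq, PartialOrd)]
enum Level {
    Error,
    Warn,
    Info,
    Debug,
    Trace,
}

/// Count verbosity flags such as "-v" or "-vv" in an argument list.
fn count_verbose_flags(args: &[&str]) -> usize {
    args.iter()
        .filter_map(|arg| arg.strip_prefix('-'))
        .filter(|rest| !rest.is_empty() && rest.chars().all(|c| c == 'v'))
        .map(|rest| rest.len())
        .sum()
}

/// No flags shows only errors; each extra "v" adds one level.
fn level_from_verbosity(count: usize) -> Level {
    match count {
        0 => Level::Error,
        1 => Level::Warn,
        2 => Level::Info,
        3 => Level::Debug,
        _ => Level::Trace,
    }
}
```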
By default log output goes onto stderr but it can also be sent to a
file.
Fixes: #121
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Update all dependencies with "cargo upgrade" with the exception of
vmm-sys-util which needs some extra porting work.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Use the tempdir crate to create a temporary directory that is deleted
when the structure goes out of scope.
Use this temporary directory for all temporary test files created by the
tests. The cloud init file is still in /tmp as that is created by the
test wrapper code.
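
The delete-on-drop behaviour relied on here can be sketched with the
standard library. This is not the tempdir crate's API; the `TempDir`
type below, its naming scheme and its methods are a simplified
assumption for illustration.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::sync::atomic::{AtomicU64, Ordering};

/// Directory that is removed, with its contents, when dropped.
struct TempDir {
    path: PathBuf,
}

impl TempDir {
    fn new(prefix: &str) -> io::Result<Self> {
        // Process id plus a counter keeps names unique within a run.
        static COUNTER: AtomicU64 = AtomicU64::new(0);
        let n = COUNTER.fetch_add(1, Ordering::Relaxed);
        let name = format!("{}-{}-{}", prefix, std::process::id(), n);
        let path = std::env::temp_dir().join(name);
        fs::create_dir(&path)?;
        Ok(Self { path })
    }

    fn path(&self) -> &Path {
        &self.path
    }
}

impl Drop for TempDir {
    fn drop(&mut self) {
        // Best effort: a failed cleanup must not panic during unwinding.
        let _ = fs::remove_dir_all(&self.path);
    }
}
```

Because cleanup is tied to scope, each test gets its own directory and
leaves nothing behind even when an assertion fails, which is what makes
running tests in parallel practical.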
This is the first stage towards being able to run the integration tests
in parallel.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The addition of [workspace] to the top level Cargo.toml is necessary to
have the binaries colocated together.
The Cargo.lock files have also been refreshed by the change to the
Cargo.toml.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Launch the test binary by command rather than using the vmm layer.
This makes it easier to manage the running VM as you can explicitly kill
it.
Also switch to using credibility for the tests which catches assertions
and continues with subsequent commands and reports the issues at the
end. This means it is possible to cleanup even on failed test runs.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Add basic integration testing of the hypervisor using a cloud-init to
configure the VM at boot and SSH to control it at runtime.
The initial test just boots the VM up, checks some basic resources and
reboots. A second test calls into the first to check that subsequent
tests work correctly.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The cargo interaction with the .cargo/config does not meet our
requirements.
Regardless of .cargo/config explicitly replacing our external sources
with vendored ones, cargo build will rely first on Cargo.lock to update
its local source cache. If a dependency has been force-pushed, the build
fails because of our top level Cargo.toml description.
This prevents us from actually pinning dependencies, which defeats the
vendoring purpose.
We're removing vendoring for now, until we understand it better.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
We use cargo vendor to generate a .cargo/config file and the vendor
directory. Vendoring allows us to lock our dependencies and to modify
them easily from the top level Cargo.toml.
We vendor all dependencies, including the crates.io ones, which allows
for network isolated builds.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>