Relying on the vm-virtio/virtio-queue crate from rust-vmm which has been
copied inside the Cloud Hypervisor tree, the entire codebase is moved to
the new definition of a Queue and other related structures.
The reason for this move is to follow the upstream until we get some
agreement for the patches that we need on top of that to make it
properly work with Cloud Hypervisor.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This dependency bump needed some manual handling since the API changed
quite a lot regarding some RawFd being changed into either File or
AsRawFd traits.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The vhost-user INFLIGHT_SHMFD protocol feature supports inflight I/O
tracking. This commit implements the vhost-user device (master) support
for the feature. As of this commit, the specific vhost-user devices
(blk, fs, or net) have not enabled this feature.
Signed-off-by: Jiachen Zhang <zhangjiachen.jaycee@bytedance.com>
As the first step to complete live-migration with tracking dirty-pages
written by the VMM, this commit patches the dependent vm-memory crate to
the upstream version with the dirty-page-tracking capability. Most
changes are due to the updated `GuestMemoryMmap`, `GuestRegionMmap`, and
`MmapRegion` structs which are taking an additional generic type
parameter to specify what 'bitmap backend' is used.
The above changes should be transparent to the rest of the code base,
e.g. all unit/integration tests should pass without additional changes.
Signed-off-by: Bo Chen <chen.bo@intel.com>
The vhost crate does not support the need_reply flag yet, meaning we
can't be sure the backend is properly set up before the guest goes on.
One can run into a race condition where the VMM enables the vring but
never gets any acknowledgement, so it assumes everything went well and
finalizes the virtio device activation. Once the device is seen as
ready by the guest, the guest keeps going by sending some messages
through the virtqueues. The problem is, if it took some time for the
backend to enable the queue, one of the backend threads might receive a
kick from the guest while the corresponding queue is not enabled. This
leads to the loss of the event, as it is discarded because the queue is
not enabled.
Until the vhost crate allows for requests with ACK, the way to mitigate
this issue is by ignoring an event coming up on a queue that has not
been enabled.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
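A minimal sketch of the mitigation described in the commit above. The
`Vring` struct, the `KickOutcome` enum and the `handle_kick` helper are
hypothetical names used only for illustration, not the crate's actual
types: a kick arriving on a queue that has not been enabled is simply
ignored instead of being processed.

```rust
/// Hypothetical stand-in for a vring; only the enabled flag matters here.
struct Vring {
    enabled: bool,
    processed: u32,
}

/// What happened to a kick received from the guest.
#[derive(Debug, PartialEq)]
enum KickOutcome {
    Processed,
    Ignored,
}

/// Process a kick only if the queue has already been enabled.
fn handle_kick(vring: &mut Vring) -> KickOutcome {
    if !vring.enabled {
        // The backend has not enabled this queue yet: ignore the event.
        return KickOutcome::Ignored;
    }
    vring.processed += 1;
    KickOutcome::Processed
}
```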
Introduce some new code to support running a vhost-user backend as a
client instead of the default server mode.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The point being to identify clearly that we're running the backend as a
server. This anticipates the addition of a new function for running the
backend as a client.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Fixes the current codebase so that every cargo clippy can be run with
the beta toolchain without any error.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The latest vhost version adds the support for the new commands
get_max_mem_slots(), add_mem_region() and remove_mem_region(), all
related to the new vhost-user protocol feature CONFIGURE_MEM_SLOTS.
The vhost_user_backend crate is updated accordingly in order to support
these new commands, mostly related to the capability of updating the
guest memory mappings with a finer control than set_mem_table() command.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Moving to the latest version of the rust-vmm/vhost crate, before it gets
published on crates.io.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The vhost crate from rust-vmm is ready, which is why we do the switch
from the Cloud Hypervisor fork to the upstream crate.
At the same time, we rename the crate from vhost_rs to vhost.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The logic for handling the networking queues can now be shared between
the version running in vhost-user-net and vm-virtio.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Prior to adding a trait method to inform the backends of the acked
features, backends could use features that the guest had not enabled,
which could lead to unpredictable results.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
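A rough sketch of the idea in the commit above, under the assumption
that features are tracked as a u64 bitmask. The `Backend` struct, the
`FEATURE_X` bit and the `can_use` helper are hypothetical and only
illustrate how a backend might gate its behaviour on what the guest
acked.

```rust
/// Hypothetical feature bit, chosen only for illustration.
const FEATURE_X: u64 = 1 << 5;

/// Hypothetical backend that remembers which features the guest acked.
struct Backend {
    acked_features: u64,
}

impl Backend {
    /// Assumed to be called once the guest has acked its features.
    fn acked_features(&mut self, features: u64) {
        self.acked_features = features;
    }

    /// Only rely on a feature if the guest actually enabled it.
    fn can_use(&self, feature: u64) -> bool {
        (self.acked_features & feature) == feature
    }
}
```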
Move the method that is used to decide whether the guest should be
signalled into the Queue implementation from vm-virtio. This removes
duplicated code between vhost_user_backend and the vm-virtio block
implementation.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
An explicit call to 'close()' is required on file descriptors allocated
from 'epoll::create()', which is missing for the 'EpollContext' and
'VringWorker'. This patch ensures the file descriptors are closed by
reusing the Drop implementation of the 'File' struct.
Signed-off-by: Bo Chen <chen.bo@intel.com>
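The technique from the commit above can be sketched with the standard
library only. This is an illustration, not the crate's code: the
`OwnedFdContext` wrapper is hypothetical, and a UNIX socket pair stands
in for the epoll file descriptor so that the close can be observed from
the peer side.

```rust
use std::fs::File;
use std::io::{Read, Write};
use std::os::unix::io::{FromRawFd, IntoRawFd};
use std::os::unix::net::UnixStream;

/// Hypothetical wrapper: the raw fd is owned through `File`, so dropping
/// the wrapper closes the fd without an explicit close() call.
struct OwnedFdContext {
    file: File,
}

impl OwnedFdContext {
    fn from_stream(stream: UnixStream) -> Self {
        // into_raw_fd() gives up ownership, so the fd has exactly one owner.
        let fd = stream.into_raw_fd();
        OwnedFdContext {
            file: unsafe { File::from_raw_fd(fd) },
        }
    }
}

/// Returns true if the peer reads the written byte and then sees EOF,
/// which only happens once the wrapped fd has been closed.
fn peer_sees_eof_after_drop() -> bool {
    let (a, mut b) = UnixStream::pair().expect("socketpair failed");
    let mut ctx = OwnedFdContext::from_stream(a);
    ctx.file.write_all(b"x").expect("write failed");
    drop(ctx); // File's Drop closes the fd here.
    let mut buf = Vec::new();
    b.read_to_end(&mut buf).expect("read failed");
    buf == vec![b'x']
}
```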
Changes in the vhost crate require VhostUserDaemon users to create and
provide a vhost::Listener in advance. This allows us to adopt
sandboxing strategies in the future, by being able to create the UNIX
socket before switching to a restricted namespace.
Also update the reference to the vhost crate in Cargo.lock to point to
the latest commit from the dragonball branch.
Signed-off-by: Sergio Lopez <slp@redhat.com>
By adding a "thread_id" parameter to handle_event(), the backend crate
can now indicate to the backend implementation which thread triggered
the processing of some events.
This is applied to the vhost-user-net backend and allows for
simplifying the code a lot, since each thread is identical.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
By adding the "thread_index" parameter to the function exit_event() from
the VhostUserBackend trait, the backend crate now has the ability to ask
the backend implementation about the exit event related to a specific
thread.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Now that multiple worker threads can be run from the backend crate, it
is important that each backend implementation can access every worker
thread.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In order to support multiple queues running on multiple threads to
increase I/O performance, this commit introduces a new function
queues_per_thread() to the VhostUserBackend trait.
This gives each backend implementation the opportunity to define which
queues belong to which thread.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
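One possible way to express such a queue-to-thread assignment, assuming
(this is not stated in the commit above) that each thread's queues are
described by a u64 bitmask where bit N stands for queue N. The
`queue_masks` helper is hypothetical.

```rust
/// Split `num_queues` queues into consecutive groups of `per_thread`
/// queues and return one bitmask per thread (bit N set = queue N).
fn queue_masks(num_queues: usize, per_thread: usize) -> Vec<u64> {
    assert!(per_thread > 0, "each thread needs at least one queue");
    assert!(num_queues <= 64, "a u64 mask can describe at most 64 queues");
    (0..num_queues)
        .step_by(per_thread)
        .map(|start| {
            let end = (start + per_thread).min(num_queues);
            (start..end).fold(0u64, |mask, q| mask | (1u64 << q))
        })
        .collect()
}
```

With four queues and two queues per thread, the first thread would own
queues 0 and 1 and the second thread queues 2 and 3.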
By changing the mutability of this function, after adapting all
backends, we should be able to implement multithreading with
multiqueue support without hitting a bottleneck on the backend
locking.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This change, combined with the compiler hint to inline get_used_event,
shortens the window between the memory read and the actual check by
calling get_used_event from needs_notification.
Without it, when putting enough pressure on the vring, it's possible
that a notification is wrongly omitted, causing the queue to stall.
Signed-off-by: Sergio Lopez <slp@redhat.com>
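For context, the check itself can be sketched as follows. It mirrors the
vring_need_event() comparison from the virtio specification. The
function signature is an assumption made for illustration: in the crate
the used_event value is read from guest memory rather than passed in,
and the point of the commit above is to keep that read as close as
possible to this comparison.

```rust
/// EVENT_IDX check: notify only if the index the driver asked to be
/// told about (`used_event`) lies in the range (old_idx, new_idx].
/// All arithmetic wraps, since ring indices are free-running u16 values.
#[inline]
fn needs_notification(used_event: u16, new_idx: u16, old_idx: u16) -> bool {
    new_idx.wrapping_sub(used_event).wrapping_sub(1) < new_idx.wrapping_sub(old_idx)
}
```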
This is a perfectly acceptable situation as it causes the backend to
exit because the VMM has closed the connection. This addresses the
rather ugly reporting of errors from the backend that appears
interleaved with the output from the VMM.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Add helpers to Vring and VhostUserSlaveReqHandler for EVENT_IDX, so
consumers of this crate can make use of this feature.
Signed-off-by: Sergio Lopez <slp@redhat.com>
This adds the missing part of supporting virtiofs dax on the slave end,
that is, receiving a socket pair fd from the master end to set up a
communication channel for sending setupmapping & removemapping messages.
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
All backends currently provide their own implementation for triggering
the worker thread to exit via an EventFd. Modify the VhostUserBackend
trait to allow a common implementation strategy that backends can use to
provide an EventFd (and optional id) that can be used to trigger the
worker to exit.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Check the return value from the worker thread by saving the thread
handle and waiting for it to return.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The main thread returns a Result with any errors from it. Although the
error from the join itself was being returned, the real error from the
thread was being ignored, so ensure that it is forwarded.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
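The two layers of errors mentioned in the commit above can be
illustrated with the standard library. The `Error` enum and the
`run_worker` function are hypothetical: the outer result comes from
join() itself, the inner one is what the worker thread returned.

```rust
use std::thread;

#[derive(Debug, PartialEq)]
enum Error {
    /// The worker thread panicked, so join() itself failed.
    Join,
    /// The worker thread ran to completion but reported an error.
    Worker(String),
}

/// Spawn a worker, wait for it, and forward both kinds of failure.
fn run_worker(fail: bool) -> Result<(), Error> {
    let handle = thread::spawn(move || -> Result<(), String> {
        if fail {
            Err("worker failed".to_string())
        } else {
            Ok(())
        }
    });
    // First unwrap the join result, then forward the thread's own result.
    handle.join().map_err(|_| Error::Join)?.map_err(Error::Worker)
}
```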
This reverts commit 4a1af7f63c755c54db30b9cc47b2cb86608899ff.
This change erroneously ignored the return value for the result which
meant that requests to break out of the loop due to a kill event were
lost.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The way in which offsets are currently used in memory regions is
derived from QEMU's contrib/libvhost-user, but while that one works
mainly by translating vmm va's to local va's, vm-memory expects us to
use proper guest addresses and thus define memory regions that
actually match the guest's memory layout.
With this change, we create the memory regions with the proper length
and offsets, extend AddrMapping to store the guest physical address,
and use the latter instead of offset in vmm_va_to_gpa().
Signed-off-by: Sergio Lopez <slp@redhat.com>
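A sketch of the translation described in the commit above. The field
names of `AddrMapping` and the exact signature of vmm_va_to_gpa() are
assumptions. Only the idea is taken from the commit message: each
mapping stores the guest physical address, and that address is used in
place of the offset.

```rust
/// Assumed layout: one entry per memory region announced by the VMM.
struct AddrMapping {
    /// Start of the region in the VMM's virtual address space.
    vmm_addr: u64,
    /// Length of the region in bytes.
    size: u64,
    /// Guest physical address where the region starts.
    gpa_base: u64,
}

/// Translate a VMM virtual address to a guest physical address, or
/// return None if no region contains it.
fn vmm_va_to_gpa(mappings: &[AddrMapping], vmm_va: u64) -> Option<u64> {
    mappings.iter().find_map(|m| {
        if vmm_va >= m.vmm_addr && vmm_va - m.vmm_addr < m.size {
            Some(vmm_va - m.vmm_addr + m.gpa_base)
        } else {
            None
        }
    })
}
```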
set_features() fails with InvalidOperation if !self.owned. I don't see
this as a requirement in the specification and, in fact, vm-virtio
implementation for resetting the device calls SET_FEATURES just after
RESET_OWNER.
Signed-off-by: Sergio Lopez <slp@redhat.com>
Extend VhostUserBackend trait with protocol_features(), so device
backend implementations can freely define which protocol features they
want to support.
Signed-off-by: Sergio Lopez <slp@redhat.com>
The VhostUserConfig carries a message with a payload, the contents of
which depend on the kind of device being emulated.
With this change, we calculate the offset of the payload within the
message, check its size corresponds to the expected one, and pass it
to the backend as a reference to a slice adjusted to the payload
dimensions.
The backend will be responsible for validating the payload, as it's
the one aware of its expected contents.
Signed-off-by: Sergio Lopez <slp@redhat.com>
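The payload handling described in the commit above could look roughly
like this. The `config_payload` helper and its parameters are
hypothetical: the real message layout comes from the vhost crate and is
not reproduced here. The sketch only shows the offset calculation, the
size check and the slice handed to the backend.

```rust
/// Return the payload that follows a header of `header_len` bytes, but
/// only if the message has exactly the expected payload size.
fn config_payload(msg: &[u8], header_len: usize, expected_len: usize) -> Option<&[u8]> {
    let end = header_len.checked_add(expected_len)?;
    if msg.len() != end {
        // Size mismatch: do not hand anything to the backend.
        return None;
    }
    // Slice adjusted to the payload dimensions.
    Some(&msg[header_len..end])
}
```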
This commit fixes all the remaining issues that were found as part of
the integration with vhost-user-net.
It fixes the way to notify that a vring is used, by using the proper
EventFd.
It removes the process_queue() function from the trait, since the
complexity it was introducing was leading to deadlocks with mutexes.
It moves the register/unregister functions for registering custom events
from the backend, from the VringEpollHandler to the VringWorker. This
allows for a lot of simplification and solves a deadlock issue.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The original logic has no problem when there is no offset, since the
current offset is zero. However, if the offset is not zero, it needs to
be taken into account when converting a vmm address to a backend
process address.
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
The error handling that triggers the break out of the epoll loop does
not seem correct: the loop ends as soon as one event is handled,
whether it was handled successfully or not. Fix it.
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
The vhost-user protocol does not state that set_features cannot be
issued more than once, so the check is not needed at all, and it
prevents communication between master and slave. Remove it to fix the
issue.
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
This patch modifies the library so that a consumer can update the
backend after it's been passed to the daemon.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
By letting the consumer of this crate get access to the vring
handler, we will be able to let it perform several actions without
producing a deadlock.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
We cannot expect every backend to support GET_CONFIG and SET_CONFIG
commands. That's why this patch adds some default implementations for
the trait VhostUserBackend regarding both get_config() and set_config()
functions.
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
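Default trait methods are what make this possible. The trait below is a
reduced, hypothetical version named `ConfigBackend`, and the method
signatures are assumptions rather than the crate's actual definitions:
a backend that does not care about the device configuration space
simply does not override anything.

```rust
use std::io;

/// Reduced, hypothetical version of a backend trait.
trait ConfigBackend {
    /// Default: no configuration space, so return an empty buffer.
    fn get_config(&self, _offset: u32, _size: u32) -> Vec<u8> {
        Vec::new()
    }

    /// Default: accept and ignore the write.
    fn set_config(&mut self, _offset: u32, _buf: &[u8]) -> Result<(), io::Error> {
        Ok(())
    }
}

/// A backend that relies entirely on the defaults.
struct NoConfigBackend;

impl ConfigBackend for NoConfigBackend {}
```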
The code needs to initialize a listener to accept a connection from the
VMM, which is the client in this case.
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Every time an event is triggered, it needs to be read, but whether the
queue needs to be processed depends on the status of the vring (enabled
or not).
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The Queue structure already contains a field "ready" that can be used to
track the status of the vrings.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Let's be realistic, the trait VhostUserBackend will need to have mutable
self for some functions like handle_event, process_queue and set_config,
which is the reason why this commit needs to introduce a RwLock on the
backend instance that was passed around as a simple Arc.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
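A minimal sketch of that pattern with a hypothetical `Backend` type:
wrapping the shared instance in Arc<RwLock<...>> lets several threads
call methods that take a mutable self.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical backend whose event handler needs mutable access.
struct Backend {
    events: u32,
}

impl Backend {
    fn handle_event(&mut self) {
        self.events += 1;
    }
}

/// Spawn `n` workers that each handle one event on the shared backend
/// and return the total number of events handled.
fn run_workers(n: usize) -> u32 {
    let backend = Arc::new(RwLock::new(Backend { events: 0 }));
    let handles: Vec<_> = (0..n)
        .map(|_| {
            let backend = Arc::clone(&backend);
            thread::spawn(move || {
                // The write lock provides the &mut self that is needed.
                backend.write().unwrap().handle_event();
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let events = backend.read().unwrap().events;
    events
}
```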
Instead of locking every queue whenever something needs to be updated,
this patch modifies the code design to lock each Vring independently.
This allows for much finer granularity, and will allow multiple queues
to be handled at the same time.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
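The difference in granularity can be sketched as follows. The `Vring`
struct is a hypothetical stand-in and the choice of Mutex as the lock
type is an assumption. The point is only that each vring has its own
lock, so threads working on different queues never contend.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for a vring.
struct Vring {
    kicks: u32,
}

/// One lock per vring: each thread only ever takes the lock of the
/// queue it is responsible for.
fn kick_each(num_queues: usize, kicks: u32) -> Vec<u32> {
    let vrings: Vec<Arc<Mutex<Vring>>> = (0..num_queues)
        .map(|_| Arc::new(Mutex::new(Vring { kicks: 0 })))
        .collect();
    let handles: Vec<_> = vrings
        .iter()
        .map(|vring| {
            let vring = Arc::clone(vring);
            thread::spawn(move || {
                for _ in 0..kicks {
                    vring.lock().unwrap().kicks += 1;
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let counts: Vec<u32> = vrings.iter().map(|v| v.lock().unwrap().kicks).collect();
    counts
}
```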
The purpose of this new crate is to provide a common library to all
vhost-user backend implementations. The more is handled by this library,
the less duplication will need to happen in each vhost-user daemon.
This crate relies a lot on vhost_rs, vm-memory and vm-virtio crates.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>