Add the support for reconnecting the backend request handler after a
disconnection/crash happened.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Since the slave request handler is common to all vhost-user devices, the
same way the reconnection is, it makes sense to handle the requests from
the backend through the same thread.
The reconnection thread now handles both a reconnection as well as any
request coming from the backend.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This commit enables socket reconnection for vhost-user-fs backends. Note
that, till this commit:
- The re-establish of the slave communication channel is no supported. So
the socket reconnection does not support virtiofsd with DAX enabled.
- Inflight I/O tracking and restoring is not supported. Therefore, only
virtio-fs daemons that are not processing inflight requests can work
normally after reconnection.
- To make the restarted virtiofsd work normally after reconnection, the
internal status of virtiofsd should also be recovered. This is not the
work of cloud-hypervisor. If the virtio-fs daemon does not support
saving or restoring its internal status, then a re-mount in guest after
socket reconnection should be performed.
Signed-off-by: Jiachen Zhang <zhangjiachen.jaycee@bytedance.com>
This commit enables socket reconnection for vhost-user-blk backends.
Note that, till this commit, inflight I/O trakcing and restoring is not
supported. Therefore, only vhost-user-blk backend that are not processing
inflight requests can work normally after reconnection.
Signed-off-by: Jiachen Zhang <zhangjiachen.jaycee@bytedance.com>
We should try to read the last avail index from the vring memory aera. This
is necessary when handling vhost-user socket reconnection.
Signed-off-by: Jiachen Zhang <zhangjiachen.jaycee@bytedance.com>
Function "GuestMemory::with_regions(_mut)" were mainly temporary methods
to access the regions in `GuestMemory` as the lack of iterator-based
access, and hence they are deprecated in the upstream vm-memory crate [1].
[1] https://github.com/rust-vmm/vm-memory/issues/133
Signed-off-by: Bo Chen <chen.bo@intel.com>
As the first step to complete live-migration with tracking dirty-pages
written by the VMM, this commit patches the dependent vm-memory crate to
the upstream version with the dirty-page-tracking capability. Most
changes are due to the updated `GuestMemoryMmap`, `GuestRegionMmap`, and
`MmapRegion` structs which are taking an additional generic type
parameter to specify what 'bitmap backend' is used.
The above changes should be transparent to the rest of the code base,
e.g. all unit/integration tests should pass without additional changes.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Add a helper to VirtioCommon which returns duplicates of the EventFds
for kill and pause event.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
When in client mode, the VMM will retry connecting the backend for a
minute before giving up.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The reconnection code is moved to the vhost-user module as it is a
common place to be shared across all vhost-user devices.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This commit implements the reconnection feature for vhost-user-net in
case the connection with the backend is shutdown.
The guest has no knowledge about what happens when a disconnection
occurs, which simplifies the code to handle the reconnection. The
reconnection happens with the backend, and the VMM goes through the
vhost-user setup once again.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
We thought we could move the control queue to the backend as it was
making some good sense. Unfortunately, doing so was a wrong design
decision as it broke the compatibility with OVS-DPDK backend.
This is why this commit moves the control queue back to the VMM side,
meaning an additional thread is being run for handling the communication
with the guest.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
A lot of the VIRTIO reserved features should be supported or not by the
vhost-user backend. That means on the VMM side, these features should be
available, so that they don't get lost during the negotiation.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The VIRTIO features should not be set before they are acked from the
guest. This code was only present to overcome a vhost crate limitation
that was expecting the VIRTIO features to be set before we could fetch
and set the protocol features.
The vhost crate has been recently fixed by removing the limitation,
therefore there's no need for this workaround in the Cloud Hypervisor
codebase anymore.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Moving helpers to the net_util crate since we don't want virtio-net
common code to be split between two places. The net_util crate should be
the only place to host virtio-net common code.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Factorize the virtio features and vhost-user protocol features
negotiation through a common function that blk, fs and net
implementations can directly rely on.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Make sure the virtio features are set upon device activation. At the
time the device is activated, we know the guest acknowledged the
features, which mean it's safe to set them back to the backend.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The virtio features are negotiated and set at the time the device is
created, hence there's no need to set the features again while going
through the vhost-user setup that is performed upon queue activation.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Now that the control queue is correctly handled by the backend, there's
no need to handle it as well from the VMM.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Some refactoring is performed in order to always expect the irqfd to be
provided by VirtioInterrupt trait. In case no irqfd is available, we
simply fail initializing the vhost-user device. This allows for further
simplification since we can assume the interrupt will always be
triggered directly by the vhost-user backend without proxying through
the VMM. This allows for complete removal of the dedicated thread for
both block and net.
vhost-user-fs is a bit more complex as it requires the slave request
protocol feature in order to support DAX. That's why we still need the
VMM to interfere and therefore run a dedicated thread for it.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Now all crates use edition = "2018" then the majority of the "extern
crate" statements can be removed. Only those for importing macros need
to remain.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
error: reference to packed field is unaligned
--> virtio-devices/src/vhost_user/fs.rs:85:21
|
85 | fs.flags[i].bits() as i32,
| ^^^^^^^^^^^
|
= note: `-D unaligned-references` implied by `-D warnings`
= warning: this was previously accepted by the compiler but is being
phased out; it will become a hard error in a future release!
= note: for more information, see issue #82523
<https://github.com/rust-lang/rust/issues/82523>
= note: fields of packed structs are not properly aligned, and
creating a misaligned reference is undefined behavior (even if that
reference is never dereferenced)
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Adding the support for an OVS vhost-user backend to connect as the
vhost-user client. This means we introduce with this patch a new
option to our `--net` parameter. This option is called 'server' in order
to ask the VMM to run as the server for the vhost-user socket.
Fixes#1745
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This allows the guest to reprogram the offload settings and mitigates
issues where the Linux kernel tries to reprogram the queues even when
the feature is not advertised.
Fixes: #2528
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Cleanup the control queue handling in preparation for supporting
alternative commands.
Note that this change does not make the MQ handling spec compliant.
According to the specification MQ should only be enabled once the number
of queue pairs the guest would like to use has been specified. The only
improvement towards the specication in this change is correct error
handling if the guest specifies an inappropriate number of queues (out
of range.)
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
error: name `TYPE_UNKNOWN` contains a capitalized acronym
--> vm-virtio/src/lib.rs:48:5
|
48 | TYPE_UNKNOWN = 0xFF,
| ^^^^^^^^^^^^ help: consider making the acronym lowercase, except the initial letter: `Type_Unknown`
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#upper_case_acronyms
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Now that virtio devices can be updated with add_memory_region(), there's
no need to keep update_memory() around.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Assuming vhost-user devices support CONFIGURE_MEM_SLOTS protocol
feature, we introduce a new method to the VirtioDevice trait in order to
update one single memory at a time.
In case CONFIGURE_MEM_SLOTS is not supported by the backend (feature not
acked), we fallback onto the current way of updating the memory
mappings, that is with SET_MEM_TABLE.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
There is no need to get the vring base when resetting the vhost-user
device. This was mostly ignored, but in some cases, it was causing some
actual errors.
A reset must simply be a combination of disabling the vrings along with
the reset of the owner.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Originally, VhostUserSetup is only used by vhost-user-fs. While
vhost-user-blk and vhost-user-net have their own error messages,
we rename VhostUserSetup to VhostUserFsSetup.
Signed-off-by: Jiachen Zhang <zhangjiachen.jaycee@bytedance.com>
Moving to the latest version of the rust-vmm/vhost crate, before it gets
published on crates.io.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The vhost crate from rust-vmm is ready, which is why we do the switch
from the Cloud Hypervisor fork to the upstream crate.
At the same time, we rename the crate from vhost_rs to vhost.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Now that ExternalDmaMapping is defined in vm-device, let's use it from
there.
This commit also defines the function get_host_address_range() to move
away from the vfio-ioctls dependency.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This thread is virtio-net specific, so it is not handled in the common
virtio device code.
The non-vhost implementation resumes the thread itself. Do the same
thing for vhost-user-net.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
Even though the driver can provide fewer queues than those advertised
for some device types their is a minimum number that is required for
operation.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Rather than having to give and return the ioeventfd used for a device
clone them each time. This will make it simpler when we start handling
the driver enabling fewer queues than advertised by the device.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
In order to make the thread naming more useful derive their name from
the device id (which can be supplied by the user) and a device specific
suffix that has details of the individual queue (or queue pair.)
e.g.
rob@artemis:~$ pstree -p -c -l -t `pidof cloud-hypervisor`
cloud-hyperviso(27501)─┬─{_console}(27525)
├─{_disk0_q0}(27529)
├─{_disk0_q1}(27532)
├─{_net1_ctrl}(27533)
├─{_net1_qp0}(27534)
├─{_net1_qp1}(27535)
├─{_rng}(27526)
├─{http-server}(27504)
├─{seccomp_signal_}(27502)
├─{signal_handler}(27523)
├─{vcpu0}(27520)
├─{vcpu1}(27522)
└─{vmm}(27503)
Fixes: #2077
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
In order to simplify the transition to VirtioCommon and to avoid needing
to set empty fields derive Default for VirtioCommon.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
As we never join the spawned virtio-devices worker threads, the error
returned from each worker thread is lost. For now, we simply print out
the error from each worker thread.
Fixes: #1551
Signed-off-by: Bo Chen <chen.bo@intel.com>
Using the Rust Barrier mechanism, this patch forces each virtio device
to acknowledge they've been correctly paused before going further.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Migrate virtio-net and vhost-user-net control queue to EpollHelper so
as to remove code duplication.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Migrate all vhost-user devices to EpollHelper so as to remove code that
is duplicated between multiple virtio devices.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The only driver writable field in the virtio-block specification is the
writeback one. Check that the offset being written to is for that field
and update it.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Extract the code that is used by vhost_user_block from the
virtio-devices crate to remove the dependencies on unrequired
functionality such as the virtio transports.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Split the generic virtio code (queues and device type) from the
VirtioDevice trait, transport and device implementations.
This also simplifies the feature handling in vhost_user_backend as the
vm-virtio crate is no longer has any features.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>