Now that KVM specific interrupts are handled through InterruptManager
trait implementation, the vm-virtio crate does not need to rely on
kvm_ioctls and kvm_bindings crates.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Based on all the previous changes, we can at this point replace the
entire interrupt management with the implementation of InterruptManager
and InterruptSourceGroup traits.
By using KvmInterruptManager from the DeviceManager, we can provide both
VirtioPciDevice and VfioPciDevice a way to pick the kind of
InterruptSourceGroup they want to create. Because they choose the type
of interrupt to be MSI/MSI-X, they will be given a MsiInterruptGroup.
Both MsixConfig and MsiConfig are responsible for the update of the GSI
routes, which is why, by passing the MsiInterruptGroup to them, they can
still perform the GSI route management without knowing implementation
details. That's where the InterruptSourceGroup is powerful, as it
provides a generic way to manage interrupt, no matter the type of
interrupt and no matter which hypervisor might be in use.
Once the full replacement has been achieved, both SystemAllocator and
KVM specific dependencies can be removed.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Thanks to the recently introduced function notifier() in the
VirtioInterrupt trait, all vhost-user devices can now bypass
listening onto an intermediate event fd as they can provide the
actual fd responsible for triggering the interrupt directly to
the vhost-user backend.
In case the notifier does not provide the event fd, the code falls
back onto the creation of an intermediate event fd it needs to listen
to, so that it can trigger the interrupt on behalf of the backend.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The point is to be able to retrieve directly the event fd related to
the interrupt, as this might optimize the way VirtioDevice devices are
implemented.
For instance, this can be used by vhost-user devices to provide
vhost-user backends directly with the event fd triggering the
interrupt related to a virtqueue.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Callbacks are not the most Rust idiomatic way of programming. The right
way is to use a Trait to provide multiple implementation of the same
interface.
Additionally, a Trait will allow for multiple functions to be defined
while using callbacks means that a new callback must be introduced for
each new function we want to add.
For these two reasons, the current commit modifies the existing
VirtioInterrupt callback into a Trait of the same name.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Now that MsixConfig has access to the irq_fd descriptors associated with
each vector, it can directly write to it anytime it needs to trigger an
interrupt.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Because MsixConfig will be responsible for updating KVM GSI routes at
some point, it is necessary that it can access the list of routes
contained by gsi_msi_routes.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Because MsixConfig will be responsible for updating the KVM GSI routes
at some point, it must have access to the VmFd to invoke the KVM ioctl
KVM_SET_GSI_ROUTING.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The point here is to let MsixConfig take care of the GSI allocation,
which means the SystemAllocator must be passed from the vmm crate all
the way down to the pci crate.
Once this is done, the GSI allocation and irq_fd creation is performed
by MsixConfig directly.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Doing I/O on an image opened with O_DIRECT requires to adhere to
certain restrictions, requiring the following elements to be aligned:
- Address of the source/destination memory buffer.
- File offset.
- Length of the data to be read/written.
The actual alignment value depends on various elements, and according
to open(2) "(...) there is currently no filesystem-independent
interface for an application to discover these restrictions (...)".
To discover such value, we iterate through a list of alignments
(currently, 512 and 4096) calling pread() with each one and checking
if the operation succeeded.
We also extend RawFile so it can be used as a backend for QcowFile,
so the later can be easily adapted to support O_DIRECT too.
Signed-off-by: Sergio Lopez <slp@redhat.com>
Update the common part in net_util.rs under vm-virtio to add mq
support, meanwhile enable mq for virtio-net device, vhost-user-net
device and vhost-user-net backend. Multiple threads will be created,
one thread will be responsible to handle one queue pair separately.
To gain the better performance, it requires to have the same amount
of vcpus as queue pair numbers defined for the net device, due to
the cpu affinity.
Multiple thread support is not added for vhost-user-net backend
currently, it will be added in future.
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Add support to allow VMMs to open the same tap device many times, it will
create multiple file descriptors meanwhile.
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Current guest kernel will check the oneline cpu count, in principle,
if the online cpu count is not smaller than the number of queue pairs
VMM reported, the net packets could be put/get to all the virtqueues,
otherwise, only the number of queue pairs that match the oneline cpu
count will have packets work with. guest kernel will send command
through control queue to tell VMMs the actual queue pair numbers which
it could currently play with. Add mq process in control queue handling
to get the queue pair number, VMM will verify if it is in a valid range,
nothing else but this.
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
While feature VIRTIO_NET_F_CTRL_VQ is negotiated, control queue
will exits besides the Tx/Rx virtqueues, an epoll handler should
be started to monitor and handle the control queue event.
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
As virtio spec 1.1 said, the driver uses the control queue
to send commands to manipulate various features of the devices,
such as VIRTIO_NET_F_MQ which is required by multiple queue
support. Here add the control queue handling process.
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Since the common parts are put into net_util.rs under vm-virtio,
refactoring code for virtio-net device, vhost-user-net device
and backend to shrink the code size and improve readability
meanwhile.
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
There are some common logic shared among virtio-net device, vhost-user-net
device and vhost-user-net backend, abstract those parts into net_util.rs
to improve code maintainability and readability.
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
According to virtio spec, for used buffer notifications, if
MSI-X capability is enabled, and queue msix vector is
VIRTIO_MSI_NO_VECTOR 0xffff, the device must not deliver an
interrupt for that virtqueue.
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
The From and Display traits were not handling some of the enum
definitions. We no longer have a default case for Display so any future
misses will fail at build time.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
The way the code is currently implemented, only by writing to STDIN a
user can trigger some input to reach the VM through virtio-console. But
in case, there were not enough virtio descriptors to process what was
retrieved from STDIN, the remaining bits would be transferred only if
STDIN was triggered again. The missing part is that when some
descriptors are made available from the guest, the virtio-console device
should try to send any possible remaining bits.
By triggering the function process_input_queue() whenever the guest
notifies the host that some new descriptors are ready for the receive
queue, this patch allows to fill the implementation void that was left.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In case the virtio descriptor is pulled out of the Queue iterator, it
is important to fill it and tag it as used. This is already done from
the successful code path, but in case there's an error during the
filling, we should make sure to put the descriptor back in the list of
available descriptors. This way, when the error occurs, we don't loose
a descriptor, and it could be used later.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The existing code was a bit too complex and it was introducing a bug
when trying to paste long lines directly to the console. By simplifying
the code, and by doing proper usage of the drain() function, the bug is
fixed by this commit.
Here is the similar output one could have gotten from time to time, when
pasting important amounts of bytes:
ERROR:vm-virtio/src/console.rs:104 -- Failed to write slice:
InvalidGuestAddress(GuestAddress(1040617472))
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The virtio-fs messages coming from the slave can contain multiple
mappings (up to 8) through one single request. By implementing such
feature, the virtio-fs implementation of cloud-hypervisor is optimal and
fully functional as it resolves a bug that was seen when running fio
testing without this patch.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
By implementing this virtio feature, we let the virtio-iommu driver call
the device backend so that it can probe each device that gets attached.
Through this probing, the device provides a range of reserved memory
related to MSI. This is mandatory for x86 architecture as we want to
avoid the default MSI range assigned by the virtio-iommu driver if no
range is provided at all. The default range is 0x8000000-0x80FFFFF but
it only makes sense for ARM architectures.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The following commit broke this unit test:
"""
vmm: Convert virtio devices to Arc<Mutex<T>>
Migratable devices can be virtio or legacy devices.
In any case, they can potentially be tracked through one of the IO bus
as an Arc<Mutex<dyn BusDevice>>. In order for the DeviceManager to also
keep track of such devices as Migratable trait objects, they must be
shared as mutable atomic references, i.e. Arc<Mutex<T>>. That forces all
Migratable objects to be tracked as Arc<Mutex<dyn Migratable>>.
Virtio devices are typically migratable, and thus for them to be
referenced by the DeviceManager, they now should be built as
Arc<Mutex<VirtioDevice>>.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
"""
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This allows us to change the memory map that is being used by the
devices via an atomic swap (by replacing the map with another one). The
ArcSwap provides the mechanism for atomically swapping from to another
whilst still giving good read performace. It is inside an Arc so that we
can use a single ArcSwap for all users.
Not covered by this change is replacing the GuestMemoryMmap itself.
This change also removes some vertical whitespace from use blocks in the
files that this commit also changed. Vertical whitespace was being used
inconsistently and broke rustfmt's behaviour of ordering the imports as
it would only do it within the block.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This patch has been cherry-picked from the Firecracker tree. The
reference commit is 1db04ccc69862f30b7814f30024d112d1b86b80e.
Changed the host-initiated vsock connection protocol to include a
trivial handshake.
The new protocol looks like this:
- [host] CONNECT <port><LF>
- [guest/success] OK <assigned_host_port><LF>
On connection failure, the host host connection is reset without any
accompanying message, as before.
This allows host software to more easily detect connection failures, for
instance when attempting to connect to a guest server that may have not
yet started listening for client connections.
Signed-off-by: Dan Horobeanu <dhr@amazon.com>
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The vsock packets that we're building are resolving guest addresses to
host ones and use the latter as raw pointers.
If the corresponding guest mapped buffer spans across several regions in
the guest, they will do so in the host as well. Since we have no
guarantees that host regions are contiguous, it may lead the VMM into
trying to access memory outside of its memory space.
For now we fix that by ensuring that the guest buffers do not span
across several regions. If they do, we error out.
Ideally, we should enhance the rust-vmm memory model to support safe
acces across host regions.
Fixes CVE-2019-18960
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Due to the amount of code currently duplicated across vhost-user devices,
the stats for this commit is on the large side but it's mostly more
duplicated code, unfortunately.
Migratable and Snapshotable placeholder implementations are provided as
well, making all vhost-user devices Migratable.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Due to the amount of code currently duplicated across virtio devices,
the stats for this commit is on the large side but it's mostly more
duplicated code, unfortunately.
Migratable and Snapshotable placeholder implementations are provided as
well, making all virtio devices Migratable.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Migratable devices can be virtio or legacy devices.
In any case, they can potentially be tracked through one of the IO bus
as an Arc<Mutex<dyn BusDevice>>. In order for the DeviceManager to also
keep track of such devices as Migratable trait objects, they must be
shared as mutable atomic references, i.e. Arc<Mutex<T>>. That forces all
Migratable objects to be tracked as Arc<Mutex<dyn Migratable>>.
Virtio devices are typically migratable, and thus for them to be
referenced by the DeviceManager, they now should be built as
Arc<Mutex<VirtioDevice>>.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
The signal handling for vCPU signals has changed in the latest release
so switch to the new API.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Since the kvm crates now depend on vmm-sys-util, the bump must be
atomic.
The kvm-bindings and ioctls 0.2.0 and 0.4.0 crates come with a few API
changes, one of them being the use of a kvm_ioctls specific error type.
Porting our code to that type makes for a fairly large diff stat.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
In order to iterate over a chain of descriptor chains, this code has
been ported over from crosvm, based on the commit
961461350c0b6824e5f20655031bf6c6bf6b7c30.
The main modification compared to the original code is the way the
sorting between readable and writable descriptors happens.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Update micro_http create to allow set content type.
Suggested-by: Samuel Ortiz <sameo@linux.intel.com>
Tested-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
In order to group together some functions that can be shared across
virtio transport layers, this commit introduces a new trait called
VirtioTransport.
The first function of this trait being ioeventfds() as it is needed from
both virtio-mmio and virtio-pci devices, represented by MmioDevice and
VirtioPciDevice structures respectively.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
We need to rely on the latest kvm-ioctls version to benefit from the
recent addition of unregister_ioevent(), allowing us to detach a
previously registered eventfd to a PIO or MMIO guest address.
Because of this update, we had to modify the current constraint we had
on the vmm-sys-util crate, using ">= 0.1.1" instead of being strictly
tied to "0.2.0".
Once the dependency conflict resolved, this commit took care of fixing
build issues caused by recent modification of kvm-ioctls relying on
EventFd reference instead of RawFd.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The specific part of PCI BAR reprogramming that happens for a virtio PCI
device is the update of the ioeventfds addresses KVM should listen to.
This should not be triggered for every BAR reprogramming associated with
the virtio device since a virtio PCI device might have multiple BARs.
The update of the ioeventfds addresses should only happen when the BAR
related to those addresses is being moved.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The PciDevice trait is supposed to describe only functions related to
PCI. The specific method ioeventfds() has nothing to do with PCI, but
instead would be more specific to virtio transport devices.
This commit removes the ioeventfds() method from the PciDevice trait,
adding some convenient helper as_any() to retrieve the Any trait from
the structure behing the PciDevice trait. This is the only way to keep
calling into ioeventfds() function from VirtioPciDevice, so that we can
still properly reprogram the PCI BAR.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Based on the value being written to the BAR, the implementation can
now detect if the BAR is being moved to another address. If that is the
case, it invokes move_bar() function from the DeviceRelocation trait.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Based on the type of BAR, we can now provide the correct address related
to a BAR index provided by the caller.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In order to speed up the boot time and reduce the amount of mappings,
this patch exposes the virtio-iommu device as supporting both 2M and 4k
page sizes.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This patch relies on the trait implementation provided for each device
which requires some sort of external update based on a map or unmap.
Whenever a MAP or UNMAP request comes through the virtqueues, it
triggers a call to the external mapping trait with map()/unmap()
functions being invoked.
Those external mappings are meant to be used from VFIO and vhost-user
devices as they need to update their own mappings. In case of VFIO, the
goal is to update the DMAR table in the physical IOMMU, while vhost-user
devices needs to update their internal representation of the virtqueues.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This patch introduces the first implementation of the virtio-iommu
device. This device emulates an IOMMU for the guest, which allows
special use cases like nesting passed through devices, or even using
IOVAs from the guest.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In case some virtio devices are attached to the virtual IOMMU, their
vring addresses need to be translated from IOVA into GPA. Otherwise it
makes no sense to try to access them, and they would cause out of range
errors.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Adding virtio feature VIRTIO_F_IOMMU_PLATFORM when explicitly asked by
the user. The need for this feature is to be able to attach the virtio
device to a virtual IOMMU.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Adding virtio feature VIRTIO_F_IOMMU_PLATFORM when explicitly asked by
the user. The need for this feature is to be able to attach the virtio
device to a virtual IOMMU.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Adding virtio feature VIRTIO_F_IOMMU_PLATFORM when explicitly asked by
the user. The need for this feature is to be able to attach the virtio
device to a virtual IOMMU.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Adding virtio feature VIRTIO_F_IOMMU_PLATFORM when explicitly asked by
the user. The need for this feature is to be able to attach the virtio
device to a virtual IOMMU.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Adding virtio feature VIRTIO_F_IOMMU_PLATFORM when explicitly asked by
the user. The need for this feature is to be able to attach the virtio
device to a virtual IOMMU.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Adding virtio feature VIRTIO_F_IOMMU_PLATFORM when explicitly asked by
the user. The need for this feature is to be able to attach the virtio
device to a virtual IOMMU.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The virtio specification defines a device can be reset, which was not
supported by this vhost-user-fs implementation. The reason it is needed
is to support unbinding this device from the guest driver, and rebind it
to vfio-pci driver.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The virtio specification defines a device can be reset, which was not
supported by this vhost-user-net implementation. The reason it is needed
is to support unbinding this device from the guest driver, and rebind it
to vfio-pci driver.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The virtio specification defines a device can be reset, which was not
supported by this virtio-console implementation. The reason it is needed
is to support unbinding this device from the guest driver, and rebind it
to vfio-pci driver.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The virtio specification defines a device can be reset, which was not
supported by this virtio-vsock implementation. The reason it is needed
is to support unbinding this device from the guest driver, and rebind
it to vfio-pci driver.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The virtio specification defines a device can be reset, which was not
supported by this virtio-pmem implementation. The reason it is needed
is to support unbinding this device from the guest driver, and rebind
it to vfio-pci driver.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The virtio specification defines a device can be reset, which was not
supported by this virtio-rng implementation. The reason it is needed
is to support unbinding this device from the guest driver, and rebind
it to vfio-pci driver.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The virtio specification defines a device can be reset, which was not
supported by this virtio-net implementation. The reason it is needed is
to support unbinding this device from the guest driver, and rebind it to
vfio-pci driver.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
While implement vhost-user-net backend with Tap interface, it keeps
failed to enable the tx vring, since there is a checking in
slave_req_handler.rs to require acked_protocol_features to be setup
as a pre-requirement, which is filled by set_protocol_features call.
Add this call in vhost-user-net device implementation to address the issue.
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
The purpose of this new crate is to provide a common library to all
vhost-user backend implementations. The more is handled by this library,
the less duplication will need to happen in each vhost-user daemon.
This crate relies a lot on vhost_rs, vm-memory and vm-virtio crates.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The vhost-user implementation was always passing the maximum size
supported by the virtqueues to the backend, but this is obviously wrong
as it must pass the size being set by the driver running in the guest.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The 32 or 64 bits type for the memory BAR was not set correctly. This
patch ensure the right type is applied to the BAR.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
If we expect the vhost-user-blk device to be used for booting a VMM
along with the firmware, then need the device to support being reset.
In the vhost-user context, this means the backend needs to be informed
the vrings are disabled and stopped, and the owner needs to be reset
too.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In vhost-user-blk, only WCE value can be set back to device in
guest kernel like
echo "write through" > /sys/block/vda/cache_type
So write_config() will only set WCE value from guest kernel to
vhost user side.
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Since config space in vhost-user-blk are mostly from backend
device, this change will get config space info from backend
by vhost-user protocol.
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
vhost-user-blk has better performance than virtio-blk, so we need
add vhost-user-blk support with SPDK in Rust-based VMMs.
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Now that virtio-bindings is a crate part of the rust-vmm project, we
want to rely on this one instead of the local one we had so far.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Currently all devices and guest memory share the same 64GiB
allocation. With guest memory working upwards and devices working
downwards. This creates issues if you want to either have a VM with a
large amount of memory or want to have devices with a large allocation
(e.g. virtio-pmem.)
As it is possible for the hypervisor to place devices anywhere in its
address range it is required for simplistic users like the firmware to
set up an identity page table mapping across the full range. Currently
the hypervisor sets up an identify mapping of 1GiB which the firmware
extends to 64GiB to match the current address space size of the
hypervisor.
A simpler solution is to place the device needed for booting with the
firmware (virtio-block) inside the 32-bit memory hole. This allows the
firmware to easily access the block device and paves the way for
increasing the address space beyond the current 64GiB limit.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Derived from the crosvm code at 5656c124af2bb956dba19e409a269ca588c685e3
and adapted to work within cloud-hypervisor:
Main differences:
* Interrupt handling is done via a VirtioInterrupt turned into a
devices::Interrupt
* GuestMemory -> GuestMemoryMmap
* Differences in read/write for BusDevice
* Different crates for EventFd and GuestAddress
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This unit testing porting effort is based off of Firecracker commit
1e1cb6f8f8003e0bdce11d265f0feb23249a03f6
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This is the last step connecting the dots between the virtio-vsock
device and the bulk of the logic hosted in the unix and csm modules.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This commit relies on the new vsock::unix module to create the backend
that will be used from the virtio-vsock device.
The concept of backend is interesting here as it would allow for a vhost
kernel backend to be plugged if that was needed someday.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This code porting is based off of Firecracker commit
1e1cb6f8f8003e0bdce11d265f0feb23249a03f6
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This code porting is based off of Firecracker commit
1e1cb6f8f8003e0bdce11d265f0feb23249a03f6
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
There is a lot of code related to this virtio-vsock hybrid
implementation, that's why it's better to keep it under its
own module.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This is the first commit introducing the support for virtio-vsock.
This is based off of Firecracker commit
1e1cb6f8f8003e0bdce11d265f0feb23249a03f6
Fixes#102
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Regarding vhost-user-net, there are features in avail_features
and acked_features, like VIRTIO_NET_F_MAC which is required by
driver and device to transfer mac address through config space,
but not needed by backend, like ovs+dpdk, so it's necessary to
adjust backend_features based on acked_features before calling
set_features() API.
This fix is to record backend_features in vhost-user-net to avoid
requesting it twice.
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
New event is added in VhostUserEpollHandler for vhost-user fs,
but the total event count is not update accordingly. Fix the
issue and refactor the event data setting for new event
expansion in the future.
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
At this point in the code, the acked features have been provided by the
guest and they can be set back to the backend. There's no need to
retrieve one more time the backend features for this purpose.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
As mentioned in the vhost-user specification, each ring is initialized
in a stopped state. This means each ring should be enabled only after
it has been correctly initialized.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The available features are masked with the backend features, therefore
the available features should be the one used when calling into
set_features() API.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In order to factorize the code between vhost-user-net and virtio-fs one
step further, this patch extends the vhost-user handler implementation
to support slave requests.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This patch factorizes the existing virtio-fs code by relying onto the
common code part of the vhost_user module in the vm-virtio crate.
In details, it factorizes the vhost-user setup, and reuses the error
types defined by the module instead of defining its own types.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
vhost-user-net introduced a new module vhost_user inside the vm-virtio
crate. Because virtio-fs is actually vhost-user-fs, it belongs to this
new module and needs to be moved there.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
vhost-user framwork could provide good performance in data intensive
scenario due to the memory sharing mechanism. Implement vhost-user-net
device to get the benefit for Rust-based VMMs network.
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
By making the registration functions immutable, this patch prevents from
self borrowing issues with the RwLock on self.mem.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Following the refactoring of the code allowing multiple threads to
access the same instance of the guest memory, this patch goes one step
further by adding RwLock to it. This anticipates the future need for
being able to modify the content of the guest memory at runtime.
The reasons for adding regions to an existing guest memory could be:
- Add virtio-pmem and virtio-fs regions after the guest memory was
created.
- Support future hotplug of devices, memory, or anything that would
require more memory at runtime.
Because most of the time, the lock will be taken as read only, using
RwLock instead of Mutex is the right approach.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The VMM guest memory was cloned (copied) everywhere the code needed to
have ownership of it. In order to clean the code, and in anticipation
for future support of modifying this guest memory instance at runtime,
it is important that every part of the code share the same instance.
Because VirtioDevice implementations need to have access to it from
different threads, that's why Arc must be used in this case.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
When there are no available descriptors in the queue (observed when the
network interface hasn't been brought up by the kernel) stop waiting for
notifications that the TAP fd should be read from.
This avoids a situation where the TAP device has data avaiable and wakes
up the virtio-net thread only for the virtio-net thread not read that
data as it has nowhere to put it.
When there are descriptors available in the queue then we resume waiting
for the epoll event on the TAP fd.
This bug demonstrated itself as 100% CPU usage for cloud-hypervisor
binary prior to the guest network interface being brought up. The
solution was inspired by the Firecracker virtio-net code.
Fixes: #208
Signed-off-by: Rob Bradford <robert.bradford@intel.com>