On the CI we are seeing issues with the activation barriers not being released:
cloud-hypervisor: 12.452434193s: INFO:vmm/src/vm.rs:413 -- Waiting for barrier
cloud-hypervisor: 12.452499794s: INFO:virtio-devices/src/block.rs:382 -- Changing cache mode to writeback
cloud-hypervisor: 12.452605195s: INFO:vmm/src/vm.rs:413 -- Waiting for barrier
cloud-hypervisor: 12.452684596s: INFO:virtio-devices/src/transport/pci_device.rs:671 -- Waiting for barrier
cloud-hypervisor: 12.452708196s: INFO:virtio-devices/src/transport/pci_device.rs:673 -- Barrier released
cloud-hypervisor: 12.452717596s: INFO:vmm/src/vm.rs:415 -- Barrier released
Add some debugging to try and identify the vause of this issue.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Add support for creating virtio-net device from existing TAP fd.
Currently only a single fd and thus no-more than 2 queues (one pair) is
suppored.
Fixes: #2052
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
When a device is ready to be activated signal to the VMM thread via an
EventFd that there is a device to be activated. When the VMM receives a
notification on the EventFd that there is a device to be activated
notify the device manager to attempt to activate any devices that have
not been activated.
As a side effect the VMM thread will create the virtio device threads.
Fixes: #1863
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
We need to be able to return the barrier from the code that prepares to
activate the virtio device. This triggered by a write to the
configuration fields stored in the PCI BAR. Since bars can be accessed
by both memory mapping and through PCI config I/O several prototypes
must be changed.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This can be uses to indicate to the caller that it should wait on the
barrier before returning as there is some asynchronous activity
triggered by the write which requires the KVM exit to block until it's
completed.
This is useful for having vCPU thread wait for the VMM thread to proceed
to activate the virtio devices.
See #1863
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
When a total ordering between multiple atomic variables is not required
then use Ordering::Acquire with atomic loads and Ordering::Release with
atomic stores.
This will improve performance as this does not require a memory fence
on x86_64 which Ordering::SeqCst will use.
Add a comment to the code in the vCPU handling code where it operates on
multiple atomics to explain why Ordering::SeqCst is required.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This device operates a single virtq. When the driver offers a descriptor
to the device it is interpreted as a "ping" to indicate that the guest
is alive. A periodic timer fires and if when the timer is fired there
has not been a "ping" from the guest then the device will reset the VM.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
While using the virtio-iommu device involving L2 scenario, and tearing
things down all the way from L2 back to L0 exposed some bad syscalls
that were not part of the authorized list.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The goal here is to replace anywhere possible a virtio structure
with a "C, packed" representation by a "C" representation. Some
virtio structures are not expected to be packed, therefore there's
no reason for using the more restrictive "C, packed" representation.
This is important since "packed" representation can still cause
undefined behaviors with Rust 2018.
By removing the need for "packed" representation, we can simplify a
bit of code by deriving the Serialize trait without writing the
implementation ourselves.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Small patch creating a dedicated `block_io_uring_is_supported()`
function for the non-io_uring case, so that we can simplify the
code in the DeviceManager.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The virtio-balloon change the memory size is asynchronous.
VirtioBalloonConfig.actual of balloon device show current balloon size.
This commit add memory_actual_size to vm.info to show memory actual size.
Signed-off-by: Hui Zhu <teawater@antfin.com>
Misspellings were identified by https://github.com/marketplace/actions/check-spelling
* Initial corrections suggested by Google Sheets
* Additional corrections by Google Chrome auto-suggest
* Some manual corrections
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
The missing syscall rt_sigprocmask(2) was triggered for the musl build
upon rebooting the VM, and was causing the VM to be killed.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This commit gives the possibility to create a virtio-mem device with
some memory already plugged into it. This is preliminary work to be
able to reboot a VM with the virtio-mem region being already resized.
Signed-off-by: Hui Zhu <teawater@antfin.com>
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The virtio-mem driver is generating some warnings regarding both size
and alignment of the virtio-mem region if not based on 128MiB:
The alignment of the physical start address can make some memory
unusable.
The alignment of the physical end address can make some memory
unusable.
For these reasons, the current patch enforces virtio-mem regions to be
128MiB aligned and checks the size provided by the user is a multiple of
128MiB.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Implement support for associating a virtio-mem device with a specific
guest NUMA node, based on the ACPI proximity domain identifier.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
By testing manually the memory resizing through virtio-mem, several
missing syscalls have been identified.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The Windows virtio block driver puts multiple data descriptors between
the header and the status footer. To handle this when parsing iterate
over the descriptor chain until the end is reached accumulating the
address and length pairs in a vector. For execution iterate over the
vector and make sequential reads from the disk for each data descriptor.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
We observed CI instability for the past couple of days. This
instability is confirmed to be a result of incomplete seccomp
filters. Given the filter on 'virtio_vsock' is recently added and
is missing 'brk', it is likely to be the root cause of the
instability.
Signed-off-by: Bo Chen <chen.bo@intel.com>
This removes the dependency of the pci crate on the devices crate which
now only contains the device implementations themselves.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Split the block device implementation into code that be used in common
between multiple different virtio device implementations.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
In order to simplify the transition to VirtioCommon and to avoid needing
to set empty fields derive Default for VirtioCommon.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Rearrange the code to match other devices which makes it easier to prep
for sharing this between other devices.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Move the if-let for the taps later which makes the earlier activation
code identical to other devices.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Introduce VirtioCommon to help remove duplicated functionality and state
between implementations of VirtioDevice. Initially it is only handling
feature acknowledgement and testing.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
There will be some cases where the implementation of the snapshot()
function from the Snapshottable trait will require to modify some
internal data, therefore we make this possible by updating the trait
definition with snapshot(&mut self).
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
"debug!" marco is used in virtio-devices/src/epoll_helper.rs. When"-vvv"
and "--log-file" option was specified, the missing "SYS_write" rule
caused a "bad system call" crash.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
As we never join the spawned virtio-devices worker threads, the error
returned from each worker thread is lost. For now, we simply print out
the error from each worker thread.
Fixes: #1551
Signed-off-by: Bo Chen <chen.bo@intel.com>
From the experiments of running integration tests on my local machine,
auditd occationally reported the 'brk' syscall is needed for the
'virtio-rng' worker thread.
Signed-off-by: Bo Chen <chen.bo@intel.com>
This patch adds the seccomp filter list for the virtio_net thread, while
the list was already added for the virtio_net_ctl thread.
Partially fixes: #925
Signed-off-by: Bo Chen <chen.bo@intel.com>
The current seccomp filter for virtio-net is actually for the worker
thread 'virtio_net_ctl' (not the actual worker thread
'virtio_net'). This patch introduces changes to distinguish those two
worker threads and seccomp filters.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Using the Rust Barrier mechanism, this patch forces each virtio device
to acknowledge they've been correctly paused before going further.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The seccomp filters specific to the virtio-net threads must contain
dup() syscall now that we ported the epoll code to the EpollHelper.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Migrate virtio-net and vhost-user-net control queue to EpollHelper so
as to remove code duplication.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Migrate all vhost-user devices to EpollHelper so as to remove code that
is duplicated between multiple virtio devices.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Migrate to EpollHelper so as to remove code that is duplicated between
multiple virtio devices.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Instead of passing only the event type through the handle_event()
callback, we make the trait slightly more generic by providing the
epoll event to each virtio device implementation.
This is particularly useful for vsock as it will need the event set.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Migrate to EpollHelper so as to remove code that is duplicated between
multiple virtio devices.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The seccomp filters specific to the virtio-console thread must contain
dup syscall now that we ported the epoll code to the EpollHelper.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Migrate to EpollHelper so as to remove code that is duplicated between
multiple virtio devices.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Migrate to EpollHelper so as to remove code that is duplicated between
multiple virtio devices.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Migrate to EpollHelper so as to remove code that is duplicated between
multiple virtio devices.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Currently any messages generated during the worker thread are not
shown anywhere as the thread is never join()ed on. Instead output the
error immediately.
For now only cover the subset where the work to port to EpollHandler
clashed with the seccomp filtering for virtio devices.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Migrate to EpollHelper so as to remove code that is duplicated between
multiple virtio devices.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Migrate to EpollHelper so as to remove code that is duplicated between
multiple virtio devices.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This patch added the seccomp_filter module to the virtio-devices crate
by taking reference code from the vmm crate. This patch also adds
allowed-list for the virtio-block worker thread.
Partially fixes: #925
Signed-off-by: Bo Chen <chen.bo@intel.com>
By adding a new io_uring feature gate, we let the user the possibility
to choose if he wants to enable the io_uring improvements or not.
Since the io_uring feature depends on the availability on recent host
kernels, it's better if we leave it off for now.
As soon as our CI will have support for a kernel 5.6 with all the
features needed from io_uring, we'll enable this feature gate
permanently.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Instead of just logging error messages but continue the processing of
the queues, this patch returns errors right away. This allows for a
quicker detection of an error happening on the virtqueue.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This introduces a new version of virtio-blk device. The default
virtio-blk provides synchronous processing of the queues, while this
new version relies on io_uring from the host kernel to provide an
asynchronous processing of the queues.
This new asynchronous version provides a huge performance improvement
compared to the default synchronous version.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
EpollHelper allows the removal of much duplicated loop handling code and
instead the device specific even handling is delegated via an
implementation of EpollHelperHandler.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This is required for implementing virtio-net as the epoll RawFd must be
assigned into the NetQueuePair.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Port virtio-block device to the new EpollHelper. This required moving
the queue EventFd ownership to BlockEpollHandler.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This is a helper for implementing the worker thread for virtio devices
and in particular handles special behaviour for pause and kill events.
The device specific event handling (for the queues themselves) is
delegated to a method invoked on a new EpollHelperHandler trait. This
method is passed the event as well as the EpollHelper so that it can
operate on the handler in order to manage events itself (required for
virtio-net.)
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The only driver writable field in the virtio-block specification is the
writeback one. Check that the offset being written to is for that field
and update it.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The only driver writable field in the virtio-block specification is the
writeback one. Check that the offset being written to is for that field
and update it.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Add a helper function to share code between implementations that can use
a slice accessible data structure for configuration data.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Remove the write_config() implementations that only generate a warning
as that is now done at the VirtioDevice level.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Not every virtio device has any config fields that can be read and most
have none that can be written to.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Extract the code that is used by vhost_user_block from the
virtio-devices crate to remove the dependencies on unrequired
functionality such as the virtio transports.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
In this commit we saved the BDF of a PCI device and set it to "devid"
in GSI routing entry, because this field is mandatory for GICv3-ITS.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
The virtio configuration structures have been slightly modified between
5.6-rc4 and 5.8-rc4, forcing the virtio-iommu device to be updated
accordingly.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Move the definition of RawFile from virtio-devices crate into qcow
crate. All the code that consumes RawFile also already depends on the
qcow crate for image file type detection so this change breaks the
need for the qcow crate to depend on the very large virtio-devices
crate.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Move NetQueuePair and the related NetCounters into the net_util crate.
This means that the vhost_user_net crate now no longer depends on
virtio-devices and so does not depend on the pci, qcow or other similar
crates. This significantly simplifies the build chain for this backend.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
By moving the code for opening the two RX and TX queues into a shared
location we are starting to remove the requirement for the
vhost-user-net backend to depend on the virtio-devices crate which in of
itself depends on many other crates that are not necessary for the
backend to function.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
By moving the code for opening the TAP device into a shared location we
are starting to remove the requirement for the vhost-user-net backend to
depend on the virtio-devices crate which in of itself depends on many
other crates that are not necessary for the backend to function.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This commit store balloon size to MemoryConfig.
After reboot, virtio-balloon can use this size to inflate back to
the size before reboot.
Signed-off-by: Hui Zhu <teawater@antfin.com>
The implementation of the virtio-balloon was slightly wrong as it was
generating the GPA (Guest Physical Address) from the PFN (Page Frame
Number) which was a u32. That means the GPA was created as a u32, and
later a cast was done to extend it to a u64 type. Unfortunately, by
doing so, the GPA was wrong if the value was supposedly more than 32
bits.
That's why the PFN is casted into a u64 before the GPA is generated,
which creates the GPA on 64 bits directly.
Additionally, this patch simplifies the process_queue() function,
relying on multiple vm-memory helpers.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Split the generic virtio code (queues and device type) from the
VirtioDevice trait, transport and device implementations.
This also simplifies the feature handling in vhost_user_backend as the
vm-virtio crate is no longer has any features.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>