Commit Graph

51 Commits

Author SHA1 Message Date
Li Hangjing
3fed07419a virtio-balloon: use fallocate when balloon inflated
For vhost-user devices, memory should be shared between CLH and
vhost-user backend. However, madvise DONTNEED doesn't working in
this case. So, let's use fallocate PUNCH_HOLE to discard those
memory regions instead.

Signed-off-by: Li Hangjing <lihangjing@bytedance.com>
2021-06-16 09:55:22 +02:00
Sebastien Boeuf
acec7e34fc virtio-devices: vhost_user: Factorize slave request handler
Since the slave request handler is common to all vhost-user devices, the
same way the reconnection is, it makes sense to handle the requests from
the backend through the same thread.

The reconnection thread now handles both a reconnection as well as any
request coming from the backend.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-06-10 16:17:23 +02:00
Sebastien Boeuf
e62aafdf58 virtio-devices: Update seccomp filters for vhost-user-net control queue
The control queue was missing rt_sigprocmask syscall, which was causing
a crash when the VM was shutdown.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-06-02 17:31:18 +02:00
Sebastien Boeuf
e9cc23ea94 virtio-devices: vhost_user: net: Move control queue back
We thought we could move the control queue to the backend as it was
making some good sense. Unfortunately, doing so was a wrong design
decision as it broke the compatibility with OVS-DPDK backend.

This is why this commit moves the control queue back to the VMM side,
meaning an additional thread is being run for handling the communication
with the guest.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-05-26 16:09:32 +01:00
Rob Bradford
c05010887f virtio-devices: seccomp: Cleanup unused seccomp filter entries
The threads in question are no longer created and so no longer need
seccomp rules for them.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-05-19 18:21:47 +02:00
Rob Bradford
51a93bc635 virtio-devices: net: Add support for VIRTIO_NET_F_CTRL_GUEST_OFFLOADS
This allows the guest to reprogram the offload settings and mitigates
issues where the Linux kernel tries to reprogram the queues even when
the feature is not advertised.

Fixes: #2528

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-04-29 10:02:10 +02:00
Rob Bradford
4806357f52 virtio-devices: net: Cleanup the MQ handling in the control queue
Cleanup the control queue handling in preparation for supporting
alternative commands.

Note that this change does not make the MQ handling spec compliant.
According to the specification MQ should only be enabled once the number
of queue pairs the guest would like to use has been specified. The only
improvement towards the specication in this change is correct error
handling if the guest specifies an inappropriate number of queues (out
of range.)

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-04-29 10:02:10 +02:00
Bo Chen
b176ddfe2a virtio-devices, vmm: Add rate limiter for the TX queue of virtio-net
Partially fixes: #1286

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-03-30 19:47:43 +02:00
Rob Bradford
7c9a83f6e1 virtio-devices: Address Rust 1.51.0 clippy issue (needless_question_mark)
warning: Question mark operator is useless here
   --> virtio-devices/src/seccomp_filters.rs:485:5
    |
485 | /     Ok(SeccompFilter::new(
486 | |         rules.into_iter().collect(),
487 | |         SeccompAction::Log,
488 | |     )?)
    | |_______^
    |
    = note: `#[warn(clippy::needless_question_mark)]` on by default
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_question_mark
help: try
    |
485 |     SeccompFilter::new(
486 |         rules.into_iter().collect(),
487 |         SeccompAction::Log,
488 |     )
    |

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-03-26 11:32:09 +00:00
Bo Chen
6307db5699 virtio-devices: seccomp: Add 'timerfd_settime' to block device
The `timerfd_settime` syscall is required when I/O throttling is
enabled.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-03-15 20:19:33 +00:00
Sebastien Boeuf
e3a8d6c13c virtio-devices: vhost-user: net: Fix seccomp filters
On x86_64 architecture, multiple syscalls were missing when shutting
down the vhost-user-net device along with the VM. This was causing the
usual crash related to seccomp filters.

This commit adds these missing syscalls to fix the issue.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-03-11 19:04:21 +01:00
Sebastien Boeuf
61f9a4ec6c virtio-devices: mem: Accept handlers to update DMA mappings
Create two functions for registering/unregistering DMA mapping handlers,
each handler being associated with a VFIO device.

Whenever the plugged_size is modified (which means triggered by the
virtio-mem driver in the guest), the virtio-mem backend is responsible
for updating the DMA mappings related to every VFIO device through the
handler previously provided.

It's important to update the map when the handler is either registered
or unregistered as well, as we don't want to miss some plugged memory
that would have been added before the VFIO device is added to the VM.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-03-05 10:38:42 +01:00
Rob Bradford
f8875acec2 misc: Bulk upgrade dependencies
In particular update for the vmm-sys-util upgrade and all the other
dependent packages. This requires an updated forked version of
kvm-bindings (due to updated vfio-ioctls) but allowed the removal of our
forked version of kvm-ioctls.

The changes to the API from kvm-ioctls and vmm-sys-util required some
other minor changes to the code.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-02-26 11:31:08 +00:00
Sebastien Boeuf
4ed0e1a3c8 net_util: Simplify TX/RX queue handling
The main idea behind this commit is to remove all the complexity
associated with TX/RX handling for virtio-net. By using writev() and
readv() syscalls, we could get rid of intermediate buffers for both
queues.

The complexity regarding the TAP registration has been simplified as
well. The RX queue is only processed when some data are ready to be
read from TAP. The event related to the RX queue getting more
descriptors only serves the purpose to register the TAP file if it's not
already.

With all these simplifications, the code is more readable but more
performant as well. We can see an improvement of 10% for a single
queue device.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-02-22 10:39:23 +00:00
Rob Bradford
9c5be6f660 build: Remove unnecessary Result<> returns
If the function can never return an error this is now a clippy failure:

error: this function's return value is unnecessarily wrapped by `Result`
   --> virtio-devices/src/watchdog.rs:215:5
    |
215 | /     fn set_state(&mut self, state: &WatchdogState) -> io::Result<()> {
216 | |         self.common.avail_features = state.avail_features;
217 | |         self.common.acked_features = state.acked_features;
218 | |         // When restoring enable the watchdog if it was previously enabled. We reset the timer
...   |
223 | |         Ok(())
224 | |     }
    | |_____^
    |
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_wraps

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-02-11 18:18:44 +00:00
Sebastien Boeuf
c6854c5a97 block_util: Simplify RAW synchronous implementation
Using directly preadv and pwritev, we can simply use a RawFd instead of
a file, and we don't need to use the more complex implementation from
the qcow crate.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-02-01 13:45:08 +00:00
Sebastien Boeuf
b2e5dbaecb block_util, vmm: Add fixed VHD asynchronous implementation
This commit adds the asynchronous support for fixed VHD disk files.

It introduces FixedVhd as a new ImageType, moving the image type
detection to the block_util crate (instead of qcow crate).

It creates a new vhd module in the block_util crate in order to handle
VHD footer, following the VHD specification.

It creates a new fixed_vhd_async module in the block_util crate to
implement the asynchronous version of fixed VHD disk file. It relies on
io_uring.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-02-01 13:45:08 +00:00
Sebastien Boeuf
2824642e80 virtio-devices: Rename BlockIoUring to Block
Now that BlockIoUring is the only implementation of virtio-block,
handling both synchronous and asynchronous backends based on the
AsyncIo trait, we can rename it to Block.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-01-22 16:10:34 +00:00
Sebastien Boeuf
41cfdb50cd virtio-devices: Remove virtio-block synchronous implementation
Now that both synchronous and asynchronous backends rely on the
asynchronous version of virtio-block (namely BlockIoUring), we can
get rid of the synchronous version (namely Block).

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-01-22 16:10:34 +00:00
Sebastien Boeuf
12e20effd7 block_util: Port synchronous QCOW file to AsyncIo trait
Based on the synchronous QCOW file implementation present in the qcow
crate, we created a new qcow_sync module in block_util that ports this
synchronous implementation to the AsyncIo trait.

The point is to reuse virtio-blk asynchronous implementation for both
synchronous and asynchronous backends.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-01-22 16:10:34 +00:00
Sebastien Boeuf
9fc86a91e2 block_util: Port synchronous RAW file to AsyncIo trait
Based on the synchronous RAW file implementation present in the qcow
crate, we created a new raw_sync module in block_util that ports this
synchronous implementation to the AsyncIo trait.

The point is to reuse virtio-blk asynchronous implementation for both
synchronous and asynchronous backends.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-01-22 16:10:34 +00:00
Sebastien Boeuf
f70852c04b virtio-devices: Update seccomp filters for virtio-net thread
On aarch64, the openat() syscall was missing from the seccomp filters
list, preventing the test_watchdog from running properly.

Fixes #2103

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-01-08 12:37:32 +00:00
Rob Bradford
6d4656c68f virtio-devices: seccomp_filters: Add fsync to block io_uring filter
This is required when booting with hypervisor-fw.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-10-27 10:53:54 +00:00
Rob Bradford
d2c7645731 virtio-devices: Add simple virtio-watchdog device
This device operates a single virtq. When the driver offers a descriptor
to the device it is interpreted as a "ping" to indicate that the guest
is alive. A periodic timer fires and if when the timer is fired there
has not been a "ping" from the guest then the device will reset the VM.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-10-21 16:02:39 +01:00
Sebastien Boeuf
0c967e1aa0 virtio-devices: iommu: Update the list of seccomp filters
While using the virtio-iommu device involving L2 scenario, and tearing
things down all the way from L2 back to L0 exposed some bad syscalls
that were not part of the authorized list.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-10-14 19:15:09 +02:00
Sebastien Boeuf
46d972e402 virtio-devices: mem: Add missing syscall to seccomp filters
The missing syscall rt_sigprocmask(2) was triggered for the musl build
upon rebooting the VM, and was causing the VM to be killed.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-09-16 19:20:04 +02:00
Sebastien Boeuf
bc1bbb6dc4 virtio-devices: virtio-mem: Add missing syscalls
By testing manually the memory resizing through virtio-mem, several
missing syscalls have been identified.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-09-16 19:20:04 +02:00
Bo Chen
b4f6db5f31 virtio-devices: vsock: Add 'brk' to the seccomp list
We observed CI instability for the past couple of days. This
instability is confirmed to be a result of incomplete seccomp
filters. Given the filter on 'virtio_vsock' is recently added and
is missing 'brk', it is likely to be the root cause of the
instability.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-09-11 07:56:52 +02:00
Bo Chen
3c923f0727 virtio-devices: seccomp: Add seccomp filters for virtio_vsock thread
This patch enables the seccomp filters for the virtio_vsock worker
thread.

Partially fixes: #925

Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-09-09 17:04:39 +01:00
Bo Chen
1175fa2bc7 virtio-devices: seccomp: Add seccomp filters for blk_io_uring thread
This patch enables the seccomp filters for the block_io_uring worker
thread.

Partially fixes: #925

Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-09-09 17:04:39 +01:00
Michael Zhao
23e5a726ec virtio-devices: Add seccomp rules for vhost-user backend
The missing rules caused failures when guest powered off.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2020-08-31 08:19:23 +02:00
Michael Zhao
cd0b8ed8f8 virtio-devices: Allowing SYS_write syscall for virtio-net-ctl thread
"debug!" marco is used in virtio-devices/src/epoll_helper.rs. When"-vvv"
and "--log-file" option was specified, the missing "SYS_write" rule
caused a "bad system call" crash.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2020-08-19 14:26:07 +02:00
Bo Chen
02d87833f0 virtio-devices: seccomp: Add seccomp filters for vhost_blk thread
This patch enables the seccomp filters for the vhost_blk worker thread.

Partially fixes: #925

Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-08-19 08:33:58 +02:00
Bo Chen
4e0ea15075 virtio-devices: seccomp: Add seccomp filter for vhost_net thread
This patch enables the seccomp filters for the vhost_net worker thread.

Partially fixes: #925

Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-08-19 08:33:58 +02:00
Bo Chen
896b9a1d4b virtio-devices: seccomp: Add seccomp filter for vhost_net_ctl thread
This patch enables the seccomp filters for the vhost_net_ctl worker thread.

Partially fixes: #925

Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-08-19 08:33:58 +02:00
Bo Chen
02d63149fe virtio-devices: seccomp: Add seccomp filters for vhost_fs thread
This patch enables the seccomp filters for the vhost_fs worker thread.

Partially fixes: #925

Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-08-19 08:33:58 +02:00
Bo Chen
c82ded8afa virtio-devices: seccomp: Add seccomp filters for balloon thread
This patch enables the seccomp filters for the balloon worker thread.

Partially fixes: #925

Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-08-19 08:33:58 +02:00
Bo Chen
c460178723 virtio-devices: seccomp: Add seccomp filters for mem thread
This patch enables the seccomp filters for the mem worker thread.

Partially fixes: #925

Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-08-19 08:33:58 +02:00
Bo Chen
aaa02a0d78 virtio-devices: seccomp: Add 'brk' syscall to all worker threads
To prevent potential failures, this patch adds 'brk' syscall to all
virtio-devices worker threads.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-08-17 21:08:49 +02:00
Bo Chen
c90a71e329 virtio-devices: seccomp: Add 'brk' syscall to the rng thread
From the experiments of running integration tests on my local machine,
auditd occationally reported the 'brk' syscall is needed for the
'virtio-rng' worker thread.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-08-17 21:08:49 +02:00
Bo Chen
c70ad27247 virtio-devices: Add seccomp filter list for net worker thread
This patch adds the seccomp filter list for the virtio_net thread, while
the list was already added for the virtio_net_ctl thread.

Partially fixes: #925

Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-08-17 21:08:49 +02:00
Bo Chen
1bf7817c40 virtio-devices: seccomp: Distinguish viritio-net-ctl from virtio-net
The current seccomp filter for virtio-net is actually for the worker
thread 'virtio_net_ctl' (not the actual worker thread
'virtio_net'). This patch introduces changes to distinguish those two
worker threads and seccomp filters.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-08-17 21:08:49 +02:00
Bo Chen
4539236690 virtio-devices: seccomp: Add seccomp filters for iommu thread
This patch enables the seccomp filters for the iommu worker thread.

Partially fixes: #925

Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-08-17 21:08:49 +02:00
Sebastien Boeuf
fca46fd00e virtio-devices: net: Add dup syscall to seccomp filters
The seccomp filters specific to the virtio-net threads must contain
dup() syscall now that we ported the epoll code to the EpollHelper.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-08-12 17:34:02 +02:00
Sebastien Boeuf
e8f0bdb6f2 virtio-devices: console: Add dup syscall to seccomp filters
The seccomp filters specific to the virtio-console thread must contain
dup syscall now that we ported the epoll code to the EpollHelper.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-08-11 19:17:50 +02:00
Rob Bradford
55c16fecbf virtio-devices: seccomp: Add missing dup() syscalls
The refactoring to use EpollHelper added a requirement on this system
call.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-08-05 11:32:31 +02:00
Bo Chen
dc71d2765a virtio-devices: seccomp: Add seccomp filters for pmem thread
This patch enables the seccomp filters for the pmem worker thread.

Partially fixes: #925

Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-08-05 08:13:31 +01:00
Bo Chen
d77977536d virtio-devices: seccomp: Add seccomp filters for net thread
This patch enables the seccomp filters for the net worker thread.

Partially fixes: #925

Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-08-05 08:13:31 +01:00
Bo Chen
276df6b71c virtio-devices: seccomp: Add seccomp filters for console thread
This patch enables the seccomp filters for the console worker thread.

Partially fixes: #925

Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-08-05 08:13:31 +01:00
Bo Chen
a426221167 virtio-devices: seccomp: Add seccomp filters for rng thread
This patch enables the seccomp filters for the rng worker thread.

Partially fixes: #925

Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-08-05 08:13:31 +01:00