Commit Graph

473 Commits

Author SHA1 Message Date
Rob Bradford
ce51755109 block_util: Avoid intermediate completion queue allocation
Rather than aggregate the completion list into an intermediate vector
instead adjust the API to provide one completion item at a time.

With DHAT this shows the number of heap allocations has decreased.

Before:

    dhat: Total:     623,852 bytes in 8,157 blocks

After:

    dhat: Total:     380,444 bytes in 3,469 blocks

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2023-01-10 17:30:25 +00:00
Rob Bradford
ba9554389b virtio-devices: block: Replace use of HashMap for inflight requests
During analysis of the asynchrous block I/O handling it was observed
that the majority of the time the completion events occur in the same
order as submissions. Further the maximum number of inflight requests
during the boot time is much lower than the size of the queue.

Through the use of a double ended queue (VecDequeue) with a reasonable
pre-allocation capacity we can have O(1) allocation free addition of
items to the list of inflight requests and mostly O(1) matching of
completed requests to submissions.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2023-01-10 10:41:24 +00:00
Hao Xu
1b0f35e42d virtio-devices: block: Remove duplicated code in handle_event()
There is duplicated code when handlin queue events in handle_event()
refactor and introduce a new helper function.

Signed-off-by: Hao Xu <howeyxu@tencent.com>
2022-12-16 14:52:48 +00:00
Rob Bradford
5e52729453 misc: Automatically fix cargo clippy issues added in 1.65 (stable)
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-12-14 14:27:19 +00:00
Sebastien Boeuf
748018ace3 vm-migration: Don't store the id as part of Snapshot structure
The information about the identifier related to a Snapshot is only
relevant from the BTreeMap perspective, which is why we can get rid of
the duplicated identifier in every Snapshot structure.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-12-09 10:26:06 +01:00
Sebastien Boeuf
5b3bcfa233 vm-migration: Snapshot should have a unique SnapshotDataSection
There's no reason to carry a HashMap of SnapshotDataSection per
Snapshot. And given we now provide at most one SnapshotDataSection per
Snapshot, there's no need to keep the id part of the SnapshotDataSection
structure.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-12-09 10:26:06 +01:00
Rob Bradford
4b08142117 misc: Remove #![allow(clippy::significant_drop_in_scrutinee)]
This isn't supported by clippy on Rust 1.60 but also no longer seems to
be required.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-12-07 17:50:48 +00:00
Sebastien Boeuf
b62a40efae virtio-devices, vmm: Always restore virtio devices in paused state
Following the new restore design, it is not appropriate to set every
virtio device threads into a paused state after they've been started.

This is why we remove the line of code pausing the devices only after
they've been restored, and replace it with a small patch in every virtio
device implementation. When a virtio device is created as part of a
restored VM, the associated "paused" boolean is set to true. This
ensures the corresponding thread will be directly parked when being
started, avoiding the thread to be in a different state than the one it
was on the source VM during the snapshot.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-12-01 09:27:00 +01:00
Bo Chen
83ab5ea528 virtio-devices: net: Provide custom functions for fuzzing
Three functions are added:
* 'Tap::new_for_fuzzing()' a custom constructor that creates a dummy
`Tap` interface directly from `File` backed by Unix domain socket;
* 'Tap::mtu()' a custom function that returns hard-coded mtu;
* 'Net::wait_for_epoll_threads()'.

Two functions are reused with modifications to work with the dummy 'Tap'
interface:
* 'Net::new_with_tap()' is made public for fuzzing;
* 'Net::activate()' is modified to not call into 'Tap::set_offload()'
for fuzzing.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-11-30 12:13:14 +00:00
Sebastien Boeuf
a50b3784fe virtio-devices: Create a proper result type for VirtioPciDevice
Creating a dedicated Result type for VirtioPciDevice, associated with
the new VirtioPciDeviceError enum. This allows for a clearer handling of
the errors generated through VirtioPciDevice::new().

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-11-23 18:37:40 +00:00
Sebastien Boeuf
eae8043890 pci, virtio-devices: Move VirtioPciDevice to the new restore design
The code for restoring a VirtioPciDevice has been updated, including the
dependencies VirtioPciCommonConfig, MsixConfig and PciConfiguration.

It's important to note that both PciConfiguration and MsixConfig still
have restore() implementations because Vfio and VfioUser devices still
rely on the old way for restore.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-11-23 18:37:40 +00:00
Wei Liu
c45d24df16 virtio-devices: modify or provide safety comments
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-11-18 12:50:01 +00:00
Rob Bradford
149e424b6e virtio-devices: block: Return error to driver on writes if read-only
TEST=Boot `--disk readonly=on` along with a guest that tries to write
(unmodified hypervisor-fw) and observe that the virtio device thread no
longer panics.

Fixes: #4888

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-11-14 15:28:30 +00:00
Wei Liu
b07d471d4f virtio-devices: show the failed block request to help debugging
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-11-14 14:19:17 +00:00
Rob Bradford
f30d460fa3 virtio-devices: seccomp: Move mprotect() to virtio common rules
It's perfectly reasonable to expect if that some virtio threads trigger
libc behaviour that needs mprotect() that all virtio threads would do
the same.

Fixes: #4874

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-11-12 08:29:47 +00:00
Rob Bradford
57508a4b1c virtio-net: net: Wait for threads to exit on Drop
It is required to close all file descriptors pointing to an opened TAP
device prior to reopening the TAP device; otherwise it will return
-EBUSY as the device can only be opened once (excluding MQ use cases.)

When rebooting the VM the virtio-net threads would still be running and
so the TAP file descriptor may not have been closed. To ensure that the
TAP FD is closed wait for all the epoll threads to exit after receiving the
KILL_EVENT.

Fixes: #4868

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-11-10 07:46:16 -08:00
Bo Chen
b37e2ed378 virtio-devices: mem: Handle integer overflow properly
An integer overflow from our virtio-mem device can be triggered
from (misbehaved) guest driver with malicious requests. This patch
handles this integer overflow explicitly and treats it as an invalid
request.

Note: this bug was detected by our virtio-mem fuzzer through 'oss-fuzz'.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-11-04 09:33:21 +00:00
Bo Chen
cfafc85b9c virtio-devices: Custom 'EpollHelper::run_with_timeout' for fuzz
To support all virtio-devices, this patch replaces the customized
EpollHelper::run` with customized `EpollHelper::run_with_timeout` for
fuzzing.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-11-03 09:10:41 -07:00
Bo Chen
683491a955 virtio-devices: console: Provide 'wait_for_epoll_threads'
Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-11-03 09:10:41 -07:00
Bo Chen
a9ec0f33c0 misc: Fix clippy issues
Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-11-02 09:41:43 +01:00
Bo Chen
078c0408b3 virtio-devices: console: Remove obsoleted 'INPUT_EVENT'
Since the processing of the console inputs was moved from the VMM thread
to the virtio-console thread (#3061), we have been using the 'FILE_EVENT'
to handle input from stdin/pty/file, which made 'INPUT_EVENT' obsoleted.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-10-27 09:27:12 +02:00
Bo Chen
a5d0ff7039 virtio-devices: console: Propagate GuestMemory errors properly
Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-10-26 09:39:30 +02:00
Bo Chen
da1ab77848 virtio-devices: console: Report error instead of panic
Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-10-26 09:39:30 +02:00
Sebastien Boeuf
1f0e5eb66a vmm: virtio-devices: Restore every VirtioDevice upon creation
Following the new design proposal to improve the restore codepath when
migrating a VM, all virtio devices are supplied with an optional state
they can use to restore from. The restore() implementation every device
was providing has been removed in order to prevent from going through
the restoration twice.

Here is the list of devices now following the new restore design:

- Block (virtio-block)
- Net (virtio-net)
- Rng (virtio-rng)
- Fs (vhost-user-fs)
- Blk (vhost-user-block)
- Net (vhost-user-net)
- Pmem (virtio-pmem)
- Vsock (virtio-vsock)
- Mem (virtio-mem)
- Balloon (virtio-balloon)
- Watchdog (virtio-watchdog)
- Vdpa (vDPA)
- Console (virtio-console)
- Iommu (virtio-iommu)

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-10-24 14:17:08 +02:00
Bo Chen
fdecd94b20 virtio-devices: iommu: Provide 'wait_for_epoll_threads()'
Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-10-21 14:21:42 +01:00
Bo Chen
2af2cc539f misc: Unify error message punctuation
Considering error messages will be mostly nested, ensuring no
punctuation at the end will make the error log more readable.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-10-21 12:19:07 +02:00
Bo Chen
9c658e21a5 virtio-devices: iommu: Remove trivial handling of 'EVENT_Q_EVENT'
Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-10-20 10:32:23 -07:00
Bo Chen
38620eaea8 virtio-devices: net: Avoid using vector and direct indexing
With known number of queues and queue events, we can make each of them
more explicit and avoid using vector/direct indexing, which is cleaner
and slightly more efficient.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-10-20 07:45:36 -07:00
Bo Chen
a388d76228 virtio-devices: console: Avoid using vector and direct indexing
The the number of queues and associated events is known and fixed. We
can define and use each of them explicitly and avoid using vector (and
hence direct indexing), which is cleaner and slightly more efficient.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-10-20 07:45:36 -07:00
Bo Chen
710a860e9b virtio-devices: iommu: Avoid using vector and direct indexing
The the number of queues and associated events is known and fixed. We
can define and use each of them explicitly and avoid using vector (and
hence direct indexing), which is cleaner and slightly more efficient.
Also, this refactoring makes it clearer that we are not handling "event
queue" events (as "_event_queue" is not being used intentionally).

Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-10-20 07:45:36 -07:00
Bo Chen
83f22ac779 virtio-devices: iommu: Specify minimum number of queues to avoid OOB
In this way, the virtio-iommu code can properly report an error when
a wrong number of queues is provided, instead of triggering an
out-of-bound error.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-10-20 07:45:00 -07:00
Bo Chen
5b706422e8 virtio-devices: iommu: Propagate errors of processing request queue
Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-10-14 11:28:31 +01:00
Bo Chen
84105992b7 virtio-devices: iommu: Switch to use 'thiserror'
Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-10-14 11:28:31 +01:00
Bo Chen
0235ed3388 virtio-devices: mem: Report error instead of panic
Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-10-14 11:28:31 +01:00
Sebastien Boeuf
099cdd2af8 virtio-devices, vmm: vdpa: Implement live migration support
Vdpa now implements the Migratable trait, which allows the device to be
added to the DeviceTree and therefore allows live migrating any vDPA
device that supports being suspended.

Given a vDPA device can't be resumed from a suspended state without
having to reset everything, we don't support pause/resume for a vDPA
device, as well as snapshot/restore (which requires resume to be
supported).

In order for the migration to work locally, reusing the same device on
the same host machine, the vhost-vdpa handler is dropped after the
snapshot has been performed, which allows the destination VM to open the
device without any conflict about the device being busy.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-10-13 10:03:23 +02:00
Sebastien Boeuf
340fd6571a virtio-devices: vdpa: Make vhost-vdpa handler optional
In order to anticipate for migration support, we need to be able to
create a Vdpa object without VhostKernVdpa object associated with it.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-10-13 10:03:23 +02:00
Sebastien Boeuf
02f951a9c3 virtio-devices: vdpa: Simplify vring enabling
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-10-13 10:03:23 +02:00
Sebastien Boeuf
2b150ac2ea pci, virtio-devices: Restore proper BAR type
When restoring a VM, the BAR type can be found directly from the
snapshot resources. It is more reliable than the previous method which
was using self.use_64bit_bar from VirtioPciDevice because at the time
the BARs are allocated, the VirtioDevice hasn't been restored yet,
meaning the way to determine the value of use_64bit_bar is wrong for a
device like vDPA. At this time, the device type is not known and relying
on the stored resources is the only reliable way.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-10-13 10:03:23 +02:00
Bo Chen
fd9fa2a681 virtio-mem: mem: Simplify 'process_queue'
No functional change.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-10-07 07:57:08 -07:00
Bo Chen
756aebafda virtio-devices: mem: Handle and propagate errors properly
Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-10-07 07:57:08 -07:00
Rob Bradford
31ca22d4b6 virtio-devices: rng: Fix error message
The RNG device never reads from the guest memory it reads from a file
and writes to the guest memory.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-10-04 16:38:41 +01:00
Rob Bradford
cf995451a2 virtio-devices: watchdog: Generate error on invalid queue descriptor
Don't silently ignore the descriptors provided by the guest. This is
consistent with other devices.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-10-04 16:38:29 +01:00
Bo Chen
f0c55f5245 virtio-devices: rng: Error out of queue execution on invalid requests
With the virtio-rng device the descriptors that are provided by the
guest must be writable and of non-zero length. Also propagate an error
if writing to the guest memory fails.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-09-28 10:07:44 +01:00
Sebastien Boeuf
903c08f8a1 net: Don't override default TAP interface MTU
Adjust MTU logic such that:
1. Apply an MTU to the TAP interface if the user supplies it
2. Always query the TAP interface for the MTU and expose that.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-09-27 10:37:35 +01:00
Rob Bradford
194b59f44b fuzz: Don't overload meaning of reset()
This function is for really for the transport layer to trigger a device
reset. Instead name it appropriately for the fuzzing specific use case.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-09-22 11:01:41 -07:00
Sebastien Boeuf
76dbf85b79 net: Give the user the ability to set MTU
Add a new "mtu" parameter to the NetConfig structure and therefore to
the --net option. This allows Cloud Hypervisor's users to define the
Maximum Transmission Unit (MTU) they want to use for the network
interface that they create.

In details, there are two main aspects. On the one hand, the TAP
interface is created with the proper MTU if it is provided. And on the
other hand the guest is made aware of the MTU through the VIRTIO
configuration. That means the MTU is properly set on both the TAP on the
host and the network interface in the guest.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-09-21 16:20:57 +02:00
Sebastien Boeuf
f38056fc9e virtio-devices, vmm: Simplify virtio-mem resize operation
There's no need to delegate the resize operation to the virtio-mem
thread. This can come directly from the vmm thread which will use the
Mem object to update the VIRTIO configuration and trigger the interrupt
for the guest to be notified.

In order to achieve what's described above, the VirtioMemZone structure
now has a handle onto the Mem object directly. This avoids the need for
intermediate Resize and ResizeSender structures.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-09-20 13:43:40 +02:00
Rob Bradford
a375e230b8 misc: Manual beta clippy fixes (boolean to int conversion using if)
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-09-20 10:59:48 +01:00
Sebastien Boeuf
1a2cd44e44 virtio-devices: balloon: Simplify the resize operation
There's no need to delegate the resize operation to the virtio-balloon
thread. This can come directly from the vmm thread which will use the
Balloon object to update the VIRTIO configuration and trigger the
interrupt for the guest to be notified.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-09-16 17:21:04 +02:00
Sebastien Boeuf
a5f655a032 virtio-devices: watchdog: Update process queue design
Update the implementation of the process_queue() function to match all
other virtio devices implementations. This solves some issue related to
potential out-of-bound accesses to the former used_desc_heads list.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-09-16 10:04:24 +01:00