91 Commits

Author SHA1 Message Date
Wei Liu
32482f6634 block: make available VIRTIO_BLK_F_SEG_MAX
This allows the guest to put in more than one segment per request. It
can improve the throughput of the system.

Introduce a new check to make sure the queue size configured by the user
is large enough to hold at least one segment.
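
A minimal sketch of such a check, assuming each request uses one descriptor for the header and one for the status byte, so a queue needs at least three descriptors to carry one data segment. The helper name and the constant are hypothetical and not taken from the Cloud Hypervisor source.

```rust
/// Assumed number of descriptors a request needs besides its data segments:
/// one for the request header and one for the status byte.
const NON_DATA_DESCRIPTORS: u16 = 2;

/// Hypothetical helper: returns the seg_max value to advertise, or an error
/// when the queue cannot hold even one data segment.
fn seg_max_for_queue(queue_size: u16) -> Result<u32, String> {
    if queue_size <= NON_DATA_DESCRIPTORS {
        return Err(format!(
            "queue size {queue_size} is too small: need at least {}",
            NON_DATA_DESCRIPTORS + 1
        ));
    }
    Ok(u32::from(queue_size - NON_DATA_DESCRIPTORS))
}
```

Under this assumption a queue of 256 descriptors would advertise 254 segments, and a queue of 2 would be rejected.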

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2025-01-01 18:50:39 +00:00
Ruoqing He
ab7b294688 misc: Replace map_or on false with is_some_and
Replace `map_or()` on false condition with `is_some_and` to provide
better readability, as suggested by v1.84.0-beta.1 `cargo clippy`.
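
For illustration, the two forms are equivalent. The function names and the threshold below are made up for this example.

```rust
/// Older style: `map_or` with a `false` default.
fn is_large_map_or(len: Option<usize>) -> bool {
    len.map_or(false, |n| n > 4096)
}

/// Equivalent and easier to read: `is_some_and`.
fn is_large_is_some_and(len: Option<usize>) -> bool {
    len.is_some_and(|n| n > 4096)
}
```

Both return `false` for `None` and apply the predicate only when a value is present.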

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2024-11-29 12:44:33 +00:00
Rob Bradford
df1d6eaaee virtio-devices: Enable VIRTIO_RING_F_INDIRECT_DESC
This improves sequential write performance using fio (2888MiB/s ->
3293MiB/s)

VM config: cloud-hypervisor --disk path=~/workloads/jammy.raw,direct=on path=~/workloads/big-disk.img,direct=on --cpus boot=1 --memory size=2G,shared=on --serial tty --console off --seccomp log --kernel ~/workloads/hypervisor-fw

Host: fio --filename=big-disk.img --direct=1 --rw=write --bs=256k --ioengine=libaio --iodepth=64 --runtime=120 --numjobs=1 --time_based --group_reporting --name=throughput-test-job --eta-newline=1

VM:  fio --filename=/dev/vdb --direct=1 --rw=write --bs=256k --ioengine=libaio --iodepth=64 --runtime=120 --numjobs=1 --time_based --group_reporting --name=throughput-test-job --eta-newline=1

Baseline (file on filesystem on host used as backing store for block
device):

throughput-test-job: (groupid=0, jobs=1): err= 0: pid=10169: Tue Nov  5 09:31:55 2024
  write: IOPS=13.5k, BW=3385MiB/s (3549MB/s)(397GiB/120008msec); 0 zone resets
    slat (usec): min=4, max=10222, avg=20.25, stdev=29.01
    clat (usec): min=984, max=45599, avg=4706.01, stdev=2278.11
     lat (usec): min=1002, max=45610, avg=4726.27, stdev=2278.77
    clat percentiles (usec):
     |  1.00th=[ 3195],  5.00th=[ 3228], 10.00th=[ 3261], 20.00th=[ 3261],
     | 30.00th=[ 3261], 40.00th=[ 3261], 50.00th=[ 3294], 60.00th=[ 3916],
     | 70.00th=[ 5014], 80.00th=[ 7308], 90.00th=[ 7635], 95.00th=[ 7898],
     | 99.00th=[ 8586], 99.50th=[ 8979], 99.90th=[36439], 99.95th=[36963],
     | 99.99th=[43779]
   bw (  MiB/s): min= 1934, max= 4821, per=100.00%, avg=3391.67, stdev=1266.42, samples=239
   iops        : min= 7738, max=19286, avg=13566.67, stdev=5065.65, samples=239
  lat (usec)   : 1000=0.01%
  lat (msec)   : 2=0.03%, 4=61.10%, 10=38.62%, 20=0.11%, 50=0.15%
  cpu          : usr=17.13%, sys=14.38%, ctx=1352501, majf=0, minf=11
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=0,1624829,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
  WRITE: bw=3385MiB/s (3549MB/s), 3385MiB/s-3385MiB/s (3549MB/s-3549MB/s), io=397GiB (426GB), run=120008-120008msec

Disk stats (read/write):
    dm-2: ios=129/1624787, sectors=1872/831364040, merge=0/0, ticks=185/6960387, in_queue=6960572, util=100.00%, aggrios=130/1626025, aggsectors=1880/831915888, aggrmerge=0/0, aggrticks=194/6967818, aggrin_queue=6968012, aggrutil=99.97%
    dm-0: ios=130/1626025, sectors=1880/831915888, merge=0/0, ticks=194/6967818, in_queue=6968012, util=99.97%, aggrios=130/1606095, aggsectors=1880/831915888, aggrmerge=0/19930, aggrticks=204/6634488, aggrin_queue=6635288, aggrutil=58.59%
  nvme0n1: ios=130/1606095, sectors=1880/831915888, merge=0/19930, ticks=204/6634488, in_queue=6635288, util=58.59%

On block device in VM:

throughput-test-job: (groupid=0, jobs=1): err= 0: pid=667: Tue Nov  5 09:53:19 2024
  write: IOPS=13.2k, BW=3293MiB/s (3453MB/s)(386GiB/120008msec); 0 zone resets
    slat (usec): min=4, max=3518, avg=27.77, stdev=35.32
    clat (usec): min=723, max=44252, avg=4829.82, stdev=2222.41
     lat (usec): min=735, max=44270, avg=4857.85, stdev=2223.45
    clat percentiles (usec):
     |  1.00th=[ 3097],  5.00th=[ 3195], 10.00th=[ 3195], 20.00th=[ 3228],
     | 30.00th=[ 3261], 40.00th=[ 3294], 50.00th=[ 3621], 60.00th=[ 4555],
     | 70.00th=[ 5997], 80.00th=[ 7242], 90.00th=[ 7570], 95.00th=[ 7898],
     | 99.00th=[ 8586], 99.50th=[ 8848], 99.90th=[36439], 99.95th=[36963],
     | 99.99th=[40633]
   bw (  MiB/s): min= 1914, max= 4857, per=100.00%, avg=3299.46, stdev=1180.81, samples=239
   iops        : min= 7658, max=19430, avg=13197.77, stdev=4723.22, samples=239
  lat (usec)   : 750=0.01%, 1000=0.01%
  lat (msec)   : 2=0.01%, 4=52.79%, 10=46.95%, 20=0.10%, 50=0.14%
  cpu          : usr=25.95%, sys=16.71%, ctx=1111821, majf=0, minf=10
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=0,1580693,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
  WRITE: bw=3293MiB/s (3453MB/s), 3293MiB/s-3293MiB/s (3453MB/s-3453MB/s), io=386GiB (414GB), run=120008-120008msec

Disk stats (read/write):
  vdb: ios=60/1953213, merge=0/0, ticks=14/8229134, in_queue=8229149, util=100.00%

Prior to change:

throughput-test-job: (groupid=0, jobs=1): err= 0: pid=667: Tue Nov  5 09:37:45 2024
  write: IOPS=11.6k, BW=2888MiB/s (3028MB/s)(338GiB/120008msec); 0 zone resets
    slat (usec): min=3, max=3200, avg=18.48, stdev=24.54
    clat (usec): min=1237, max=46575, avg=5521.41, stdev=2641.99
     lat (usec): min=1249, max=46591, avg=5540.06, stdev=2643.54
    clat percentiles (usec):
     |  1.00th=[ 2999],  5.00th=[ 3163], 10.00th=[ 3195], 20.00th=[ 3261],
     | 30.00th=[ 3294], 40.00th=[ 3359], 50.00th=[ 6063], 60.00th=[ 7111],
     | 70.00th=[ 7373], 80.00th=[ 7570], 90.00th=[ 7832], 95.00th=[ 8094],
     | 99.00th=[ 8717], 99.50th=[ 9241], 99.90th=[36963], 99.95th=[37487],
     | 99.99th=[41157]
   bw (  MiB/s): min= 1936, max= 4826, per=100.00%, avg=2892.43, stdev=1202.99, samples=239
   iops        : min= 7746, max=19306, avg=11569.68, stdev=4811.98, samples=239
  lat (msec)   : 2=0.01%, 4=46.26%, 10=53.38%, 20=0.09%, 50=0.26%
  cpu          : usr=14.20%, sys=8.59%, ctx=1246257, majf=0, minf=12
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=0,1386102,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
  WRITE: bw=2888MiB/s (3028MB/s), 2888MiB/s-2888MiB/s (3028MB/s-3028MB/s), io=338GiB (363GB), run=120008-120008msec

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-11-05 15:44:41 +00:00
Ruoqing He
61e57e1cb1 misc: Further improve imports styling
Introduce the `imports_granularity="Module"` format strategy, which
groups imports from the same module into one line or block,
improving maintainability and readability.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2024-09-29 16:13:48 +00:00
Rob Bradford
88a9f79944 misc: Adapt consistent import style formatting
Historically the Cloud Hypervisor coding style has been to ensure that
all imports are ordered and placed in a single group. Unfortunately
cargo fmt has no support for ensuring that all imports are in a single
group so if whitespace lines were added as part of the import statements
then they would only be ordered correctly within each group.

By adopting `group_imports="StdExternalCrate"` we can enforce a style
where imports are placed in at most three groups for std, external
crates and the crate itself. Choosing a style enforceable by the tooling
reduces the reviewer burden.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-09-29 13:08:12 +01:00
wuxinyue
6956306604 virtio-devices: block: Reduce notification latency when rate limited
When the rate limit was reached it was possible for the notification to
the guest to be lost since the logic to handle the notification was
tightly coupled with processing the queue. The notification would
eventually be triggered when the rate limit pool was refilled but this
could add significant latency.

Address this by refactoring the code to separate processing queue and
signalling - the processing of the queue is suspended when the rate
limit is reached but the signalling will still be attempted if needed
(i.e. VIRTIO_F_EVENT_IDX is still considered.)

Signed-off-by: wuxinyue <wuxinyue.wxy@antgroup.com>
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-08-31 20:54:13 +00:00
wuxinyue
a2438700e4 virtio-devices: support event idx for virtio-blk
Support event idx feature for virtio-blk device.
This feature could improve disk IO performance by suppressing
notifications from guest to host and interrupts from host to
guest, which has been already supported in virtio-net and
vhost-user devices.

To achieve this, virtqueue's event-idx-related API is
leveraged for avail_event field update and needs_notification
check.

Fixes: #6580
Signed-off-by: wuxinyue <wuxinyue.wxy@antgroup.com>
2024-07-23 14:16:34 +00:00
Changyuan Lyu
bc6acb842f block: fix status value size
As per VirtIO spec 1.2 section 5.2.6, the `status` field is a byte, not
u32. cloud-hypervisor writes a `u32` to guest memory, which
accidentally zeros out the following 3 bytes, and may corrupt guest OS
internal state.
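
The effect can be shown on a plain byte buffer standing in for guest memory. This is a sketch only; the function names are hypothetical and the real code goes through the guest memory abstraction.

```rust
/// Buggy variant: stores the status as a little-endian u32, overwriting
/// four bytes starting at `offset`.
fn write_status_u32(mem: &mut [u8], offset: usize, status: u8) {
    mem[offset..offset + 4].copy_from_slice(&u32::from(status).to_le_bytes());
}

/// Fixed variant: stores exactly one byte, matching the one-byte status field.
fn write_status_u8(mem: &mut [u8], offset: usize, status: u8) {
    mem[offset] = status;
}
```

With the u32 variant, the three bytes after the status byte are zeroed even though they belong to whatever the guest placed there.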

Signed-off-by: Changyuan Lyu <changyuanl@google.com>
2024-07-14 19:23:06 +00:00
Josh Soref
42e9632c53 misc: Fix spelling issues
Misspellings were identified by:
  https://github.com/marketplace/actions/check-spelling

* Initial corrections based on forbidden patterns from the action
* Additional corrections by Google Chrome auto-suggest
* Some manual corrections
* Adding markdown bullets to readme credits section

Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2024-06-08 16:31:30 +00:00
Rob Bradford
10ab87d6a3 misc: Migrate away from versionize
Replace with serde instead.

Fixes: #6370

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-04-22 17:10:55 +00:00
Rob Bradford
adb318f4cd misc: Remove redundant "use" imports
With the nightly toolchain (2024-02-18) cargo check will flag up
redundant imports, either because they are pulled in by the prelude or
by an earlier match.

Remove those redundant imports.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-02-19 17:54:30 +00:00
acarp
035c4b20fb block: Set an option to pin virtio block threads to host cpus
Currently the only way to set the affinity for virtio block threads is
to boot the VM, search for the tid of each of the virtio block threads,
then set the affinity manually. This commit adds an option to pin virtio
block queues to specific host cpus (similar to pinning vcpus to host
cpus). A queue_affinity option has been added to the disk flag in
the cli to specify a mapping of queue indices to host cpus.

Signed-off-by: acarp <acarp@crusoeenergy.com>
2024-02-13 09:05:57 +00:00
Thomas Barrett
c297d8d796 vmm: use RateLimiterGroup for virtio-blk devices
Add a 'rate_limit_groups' field to VmConfig that defines a set of
named RateLimiterGroups.

When the 'rate_limit_group' field of DiskConfig is defined, all
virtio-blk queues will be rate-limited by a shared RateLimiterGroup.
The lifecycle of all RateLimiterGroups is tied to the Vm.
A RateLimiterGroup may exist even if no Disks are configured to use
the RateLimiterGroup. Disks may be hot-added or hot-removed from the
RateLimiterGroup.

When the 'rate_limiter' field of DiskConfig is defined, we construct
an anonymous RateLimiterGroup whose lifecycle is tied to the Disk.
This is primarily done for API backwards compatibility. Importantly,
the behavior is not the same! This implementation rate_limits the
aggregate bandwidth / iops of an individual disk rather than the
bandwidth / iops of an individual queue of a disk.

When neither the 'rate_limit_group' nor the 'rate_limiter' field of
DiskConfig is defined, the Disk is not rate-limited.

Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>
2024-01-03 10:21:06 -08:00
Muminul Islam
274f1aa2e7 virtio-devices,vm-allocator: Fix clippy warnings
Signed-off-by: Muminul Islam <muislam@microsoft.com>
2023-10-19 08:42:17 +01:00
Bo Chen
9c994f882a virtio-devices: block: Fix the latency counter for max read/write
See: #5712

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-09-11 23:08:23 +01:00
Bo Chen
b76d0e8b50 virtio-devices: block: Fix latency counter for average read/write
The cumulative average formula [1] requires the use of signed integers
for proper calculations, while the calculated result (e.g. cumulative
average) is always positive. This patch reflects the above requirements
in our code.
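
A minimal sketch of the update step, using the cumulative average recurrence avg_{n+1} = avg_n + (x_{n+1} - avg_n) / (n + 1). The function name and signature are hypothetical; `count_after` is the number of samples including the new one and must be at least 1.

```rust
/// Cumulative average update. The difference `sample - avg` is negative
/// whenever the new sample is below the running average, so the arithmetic
/// is done in i64 and only the final result is converted back to u64.
fn update_cumulative_avg(avg: u64, sample: u64, count_after: u64) -> u64 {
    let delta = (sample as i64 - avg as i64) / count_after as i64;
    (avg as i64 + delta) as u64
}
```

Doing the subtraction directly on unsigned values would underflow for any sample smaller than the current average.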

[1] https://en.wikipedia.org/wiki/Moving_average#Cumulative_average

Fixes: #5745

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-09-11 23:08:15 +01:00
Thomas Barrett
c4e8e653ac block: Add support for user specified ID_SERIAL
Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>
2023-09-11 12:50:41 +01:00
Bo Chen
e2db476f6e virtio-devices: block: Correct the latency for the first op
There is a "LATENCY_SCALE" being used for calculating cumulative average
latency, so it should also be used for the latency of the first op.

See: #5712

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-09-07 23:00:47 +01:00
Rob Bradford
b5766028a8 virtio-devices: block: Make latency infinite before first op
Logically until we have handled the first operation the latency is
infinite; this logic was applied to the minimum latency originally but
this patch extends that logic to the maximum and average latency.

To prevent the initial average latency being skewed by the inclusion of
infinity the average value is initially seeded with the first measured
latency.
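
The idea can be sketched as follows, with `u64::MAX` standing in for "infinite". The struct, its fields, and the choice to seed the minimum and maximum together with the average are assumptions made for this illustration, not the actual counters.

```rust
/// Hypothetical latency tracker; names are illustrative only.
struct LatencyStats {
    min: u64,
    max: u64,
    avg: u64,
    count: u64,
}

impl LatencyStats {
    fn new() -> Self {
        // u64::MAX represents "infinite" until the first operation completes.
        Self { min: u64::MAX, max: u64::MAX, avg: u64::MAX, count: 0 }
    }

    fn record(&mut self, latency: u64) {
        self.count += 1;
        if self.count == 1 {
            // Seed every counter with the first measurement so that the
            // "infinite" placeholder never enters the calculations.
            self.min = latency;
            self.max = latency;
            self.avg = latency;
            return;
        }
        self.min = self.min.min(latency);
        self.max = self.max.max(latency);
        // Cumulative average update, done in signed arithmetic.
        let delta = (latency as i64 - self.avg as i64) / self.count as i64;
        self.avg = (self.avg as i64 + delta) as u64;
    }
}
```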

Fixes: #5704

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2023-08-31 08:22:07 -07:00
Yu Li
447cad3861 block: merge qcow, vhdx and block_util into block crate
This commit merges crates `qcow`, `vhdx` and `block_util` into the
crate `block`, which can allow `qcow` to use functions from `block_util`
without introducing a circular crate dependency.

This commit is based on crosvm implementation:
f2eecc4152

Signed-off-by: Yu Li <liyu.yukiteru@bytedance.com>
2023-07-19 13:52:43 +01:00
dependabot[bot]
1d55de9c74 build: Bump virtio-bindings from 0.1.0 to 0.2.0
Bumps [virtio-bindings](https://github.com/rust-vmm/vm-virtio) from 0.1.0 to 0.2.0.
- [Release notes](https://github.com/rust-vmm/vm-virtio/releases)
- [Commits](https://github.com/rust-vmm/vm-virtio/compare/virtio-queue-v0.1.0...virtio-bindings-v0.2.0)

---
updated-dependencies:
- dependency-name: virtio-bindings
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-02-23 00:59:32 +00:00
Philipp Schuster
ad6c0ee52b virtio-devices: properly join all threads on Drop
This change is important to do a proper resource cleanup. We decided
to do this repetitive approach as VirtioCommon can't implement Drop
without major changes to the corresponding code. Also, devices such as
Net can't easily use the epoll_threads-abstraction from VirtioCommon as
it has multiple threads with different semantics.

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
2023-01-12 18:03:33 +00:00
Yong He
0dc122a9a9 virto-device: add latency account for virtio-block
Add new latency counters for virtio-block device, including
minimal latency, maximal latency, and average latency for block
read and write.

The average latency is calculated based on cumulative average.

Signed-off-by: Yong He <alexyonghe@tencent.com>
2023-01-11 17:38:42 +00:00
Rob Bradford
ce51755109 block_util: Avoid intermediate completion queue allocation
Rather than aggregate the completion list into an intermediate vector
instead adjust the API to provide one completion item at a time.

With DHAT this shows the number of heap allocations has decreased.

Before:

    dhat: Total:     623,852 bytes in 8,157 blocks

After:

    dhat: Total:     380,444 bytes in 3,469 blocks

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2023-01-10 17:30:25 +00:00
Rob Bradford
ba9554389b virtio-devices: block: Replace use of HashMap for inflight requests
During analysis of the asynchronous block I/O handling it was observed
that the majority of the time the completion events occur in the same
order as submissions. Further the maximum number of inflight requests
during the boot time is much lower than the size of the queue.

Through the use of a double ended queue (VecDeque) with a reasonable
pre-allocation capacity we can have O(1) allocation free addition of
items to the list of inflight requests and mostly O(1) matching of
completed requests to submissions.
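
A rough sketch of this approach, assuming requests are identified by a `u16` id. The `Inflight` type and its methods are hypothetical and only illustrate the fast path for in-order completions.

```rust
use std::collections::VecDeque;

/// Hypothetical inflight tracker: (request id, request) pairs kept in
/// submission order.
struct Inflight<T> {
    requests: VecDeque<(u16, T)>,
}

impl<T> Inflight<T> {
    fn with_capacity(capacity: usize) -> Self {
        Self { requests: VecDeque::with_capacity(capacity) }
    }

    /// O(1), and allocation free while the preallocated capacity suffices.
    fn submit(&mut self, id: u16, request: T) {
        self.requests.push_back((id, request));
    }

    /// Fast path: the completion matches the oldest submission, which is
    /// O(1). Otherwise fall back to a linear scan.
    fn complete(&mut self, id: u16) -> Option<T> {
        if self.requests.front().map(|(front_id, _)| *front_id) == Some(id) {
            return self.requests.pop_front().map(|(_, request)| request);
        }
        let pos = self.requests.iter().position(|(i, _)| *i == id)?;
        self.requests.remove(pos).map(|(_, request)| request)
    }
}
```

When completions arrive in submission order, every lookup hits the front of the queue, so no hashing or scanning is needed.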

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2023-01-10 10:41:24 +00:00
Hao Xu
1b0f35e42d virtio-devices: block: Remove duplicated code in handle_event()
There is duplicated code when handling queue events in handle_event();
refactor it and introduce a new helper function.

Signed-off-by: Hao Xu <howeyxu@tencent.com>
2022-12-16 14:52:48 +00:00
Rob Bradford
5e52729453 misc: Automatically fix cargo clippy issues added in 1.65 (stable)
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-12-14 14:27:19 +00:00
Sebastien Boeuf
748018ace3 vm-migration: Don't store the id as part of Snapshot structure
The information about the identifier related to a Snapshot is only
relevant from the BTreeMap perspective, which is why we can get rid of
the duplicated identifier in every Snapshot structure.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-12-09 10:26:06 +01:00
Sebastien Boeuf
b62a40efae virtio-devices, vmm: Always restore virtio devices in paused state
Following the new restore design, it is not appropriate to set every
virtio device thread into a paused state after they've been started.

This is why we remove the line of code pausing the devices only after
they've been restored, and replace it with a small patch in every virtio
device implementation. When a virtio device is created as part of a
restored VM, the associated "paused" boolean is set to true. This
ensures the corresponding thread will be directly parked when being
started, preventing the thread from being in a different state than the
one it was in on the source VM during the snapshot.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-12-01 09:27:00 +01:00
Rob Bradford
149e424b6e virtio-devices: block: Return error to driver on writes if read-only
TEST=Boot `--disk readonly=on` along with a guest that tries to write
(unmodified hypervisor-fw) and observe that the virtio device thread no
longer panics.

Fixes: #4888

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-11-14 15:28:30 +00:00
Wei Liu
b07d471d4f virtio-devices: show the failed block request to help debugging
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-11-14 14:19:17 +00:00
Bo Chen
a9ec0f33c0 misc: Fix clippy issues
Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-11-02 09:41:43 +01:00
Sebastien Boeuf
1f0e5eb66a vmm: virtio-devices: Restore every VirtioDevice upon creation
Following the new design proposal to improve the restore codepath when
migrating a VM, all virtio devices are supplied with an optional state
they can use to restore from. The restore() implementation every device
was providing has been removed to avoid going through the restoration
twice.

Here is the list of devices now following the new restore design:

- Block (virtio-block)
- Net (virtio-net)
- Rng (virtio-rng)
- Fs (vhost-user-fs)
- Blk (vhost-user-block)
- Net (vhost-user-net)
- Pmem (virtio-pmem)
- Vsock (virtio-vsock)
- Mem (virtio-mem)
- Balloon (virtio-balloon)
- Watchdog (virtio-watchdog)
- Vdpa (vDPA)
- Console (virtio-console)
- Iommu (virtio-iommu)

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-10-24 14:17:08 +02:00
Rob Bradford
194b59f44b fuzz: Don't overload meaning of reset()
This function is really for the transport layer to trigger a device
reset. Instead name it appropriately for the fuzzing specific use case.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-09-22 11:01:41 -07:00
Bo Chen
b4fe41ad0c virtio-devices: block: Refactor 'handle_event' for readability
Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-08-19 08:54:25 +02:00
Bo Chen
df5b803a63 virtio-devices: Shutdown VMM upon worker thread errors
Fixes: #4462

Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-08-16 11:45:44 +01:00
Bo Chen
b1752994d5 virtio-devices: Report errors from EpollHelperHandler::handle_event
Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-08-16 11:45:44 +01:00
Bo Chen
f9b36a3412 virtio-devices: block: Derive thiserror::Error
Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-08-16 11:45:44 +01:00
Bo Chen
c5fdc47918 virtio-devices: block: Avoid panic with invalid guest address
Remove the use of 'unwrap()' that assumes the guest address for request
status is always valid, which avoids a virtio-block thread panic on
malformed descriptors from the guest.
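
The pattern can be sketched with a flat byte slice standing in for guest memory. The error type and function below are hypothetical; the point is that an out-of-range address becomes an error the caller can handle instead of a panic.

```rust
#[derive(Debug, PartialEq)]
enum StatusError {
    /// The guest supplied a status address outside of guest memory.
    InvalidAddress(u64),
}

/// Writes the one-byte request status, returning an error rather than
/// panicking when the guest-provided address is not valid.
fn write_status(mem: &mut [u8], addr: u64, status: u8) -> Result<(), StatusError> {
    let index = usize::try_from(addr).map_err(|_| StatusError::InvalidAddress(addr))?;
    let slot = mem.get_mut(index).ok_or(StatusError::InvalidAddress(addr))?;
    *slot = status;
    Ok(())
}
```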

Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-08-09 08:06:53 -07:00
Sebastien Boeuf
a4859ffe85 virtio-devices: Optimize add_used() usage
Now that we rely on pop_descriptor_chain() rather than iter() to iterate
over a queue, there's no more borrow on the queue itself, meaning we can
invoke add_used() directly for the iteration loop. This simplifies the
processing of the queues for each virtio device, and brings some possible
performance improvement given we don't have to iterate twice over the
list of descriptors to invoke add_used().

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-07-29 17:41:32 +01:00
Sebastien Boeuf
87f57f7c1e virtio-devices: Improve queue handling with pop_descriptor_chain()
Using pop_descriptor_chain() is much more appropriate than iter() since
it recreates the iterator every time, so the queue is not borrowed,
which allows the virtio-net implementation to match all the other ones.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-07-29 17:41:32 +01:00
Sebastien Boeuf
a423bf13ad virtio: Port codebase to the latest virtio-queue version
The new virtio-queue version introduced some breaking changes which need
to be addressed so that Cloud Hypervisor can still work with this
version.

The most important change is about removing a handle to the guest memory
from the Queue, meaning the caller has to provide the guest memory
handle for multiple methods from the QueueT trait.

One interesting aspect is that QueueT has been widely extended to
provide every getter and setter we need to access and update the Queue
structure without having direct access to its internal fields.

This patch ports all the virtio and vhost-user devices to this new crate
definition. It also updates both vhost-user-block and vhost-user-net
backends based on the updated vhost-user-backend crate. It also updates
the fuzz directory.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-07-29 17:41:32 +01:00
Sebastien Boeuf
3f62a172b2 virtio-devices: Pass a list of tuples for virtqueues
Instead of passing separately a list of Queues and the equivalent list
of EventFds, we consolidate these two through a tuple along with the
queue index.

The queue index can be very useful if looking for the actual index
related to the queue, no matter if other queues have been enabled or
not.

It's also convenient to have the EventFd associated with the Queue so
that we don't have to carry two lists with the same amount of items.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-07-21 14:28:41 +02:00
Sebastien Boeuf
8eed276d14 vm-virtio: Define AccessPlatform trait
Moving the whole codebase to rely on the AccessPlatform definition from
vm-virtio so that we can fully remove it from virtio-queue crate.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-01-27 10:00:20 +00:00
Sebastien Boeuf
3e1ce98d1a virtio-devices: block: Handle descriptor address translation
Since we're trying to move away from the translation happening in the
virtio-queue crate, the device itself is performing the address
translation when needed.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-01-27 10:00:20 +00:00
Sebastien Boeuf
de3e003e3e virtio-devices: Handle virtio queues interrupts from transport layer
Instead of relying on the virtio-queue crate to store the information
about the MSI-X vectors for each queue, we handle this directly from the
PCI transport layer.

This is the first step in getting closer to the upstream version of
virtio-queue so that we can eventually move fully to the upstream
version.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-01-25 12:01:12 +01:00
Sebastien Boeuf
85bbf75fe8 block_util: Align buffers for O_DIRECT
Whenever the backing file of our virtio-block device is opened with
O_DIRECT, there's a requirement about the buffer address and size to be
aligned to the sector size.

We know virtio-block requests are sector aligned in terms of size, but
we must still check if the buffer address is. In case it's not, we
create an intermediate buffer that will be passed through the system
call. In case of a write operation, the content of the non-aligned
buffer must be copied beforehand, and in case of a read operation, the
content of the aligned buffer must be copied to the non-aligned one
after the operation has been completed.
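
A simplified sketch of the write path, assuming a 512-byte sector size. The `AlignedBuf` type and `prepare_write` helper are hypothetical and over-allocate a `Vec` to find an aligned region; the actual implementation may allocate aligned memory differently.

```rust
const SECTOR_SIZE: usize = 512;

fn is_aligned(addr: usize, align: usize) -> bool {
    addr % align == 0
}

/// Heap buffer whose usable region starts at an `align`-aligned address.
struct AlignedBuf {
    storage: Vec<u8>,
    offset: usize,
    len: usize,
}

impl AlignedBuf {
    fn new(len: usize, align: usize) -> Self {
        // Over-allocate by `align` bytes so an aligned start always fits.
        let storage = vec![0u8; len + align];
        let addr = storage.as_ptr() as usize;
        let offset = (align - addr % align) % align;
        Self { storage, offset, len }
    }

    fn as_slice(&self) -> &[u8] {
        &self.storage[self.offset..self.offset + self.len]
    }

    fn as_mut_slice(&mut self) -> &mut [u8] {
        &mut self.storage[self.offset..self.offset + self.len]
    }
}

/// For a write: if the guest buffer address is not sector aligned, copy its
/// content into an aligned bounce buffer to hand to the system call.
/// Returns None when the guest buffer can be used directly.
fn prepare_write(guest: &[u8]) -> Option<AlignedBuf> {
    if is_aligned(guest.as_ptr() as usize, SECTOR_SIZE) {
        return None;
    }
    let mut bounce = AlignedBuf::new(guest.len(), SECTOR_SIZE);
    bounce.as_mut_slice().copy_from_slice(guest);
    Some(bounce)
}
```

A read would do the reverse: issue the system call on the bounce buffer, then copy its content back into the unaligned guest buffer once the operation completes.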

Fixes #3587

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-01-20 11:49:02 +00:00
Rob Bradford
4773e23c77 virtio-devices: block: Expose device topology
If the disk is backed by a block device on the host a non-default
topology will be available and that topology can be advertised by virtio
block.

Fixes: #3262

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-12-17 12:42:10 +01:00
Sebastien Boeuf
0249e8641a Move Cloud Hypervisor to virtio-queue crate
Relying on the vm-virtio/virtio-queue crate from rust-vmm which has been
copied inside the Cloud Hypervisor tree, the entire codebase is moved to
the new definition of a Queue and other related structures.

The reason for this move is to follow the upstream until we get some
agreement for the patches that we need on top of that to make it
properly work with Cloud Hypervisor.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-10-22 11:38:55 +02:00
Rob Bradford
687d646c60 virtio-devices, vmm: Shutdown VMM on virtio thread panic
Shut down the VMM if the virtio (or VMM side of vhost-user) thread
panics.

See: #3031

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-08 09:40:36 +01:00