virtio-devices: Enable VIRTIO_RING_F_INDIRECT_DESC

This improves sequential write throughput, as measured with fio on a
virtio-block device inside the VM, from 2888MiB/s to 3293MiB/s.

VM config: cloud-hypervisor --disk path=~/workloads/jammy.raw,direct=on path=~/workloads/big-disk.img,direct=on --cpus boot=1 --memory size=2G,shared=on --serial tty --console off --seccomp log --kernel ~/workloads/hypervisor-fw

Host: fio --filename=big-disk.img --direct=1 --rw=write --bs=256k --ioengine=libaio --iodepth=64 --runtime=120 --numjobs=1 --time_based --group_reporting --name=throughput-test-job --eta-newline=1

VM:  fio --filename=/dev/vdb --direct=1 --rw=write --bs=256k --ioengine=libaio --iodepth=64 --runtime=120 --numjobs=1 --time_based --group_reporting --name=throughput-test-job --eta-newline=1

Baseline on the host (fio writing directly to the file on the host
filesystem that is used as the backing store for the block device):

throughput-test-job: (groupid=0, jobs=1): err= 0: pid=10169: Tue Nov  5 09:31:55 2024
  write: IOPS=13.5k, BW=3385MiB/s (3549MB/s)(397GiB/120008msec); 0 zone resets
    slat (usec): min=4, max=10222, avg=20.25, stdev=29.01
    clat (usec): min=984, max=45599, avg=4706.01, stdev=2278.11
     lat (usec): min=1002, max=45610, avg=4726.27, stdev=2278.77
    clat percentiles (usec):
     |  1.00th=[ 3195],  5.00th=[ 3228], 10.00th=[ 3261], 20.00th=[ 3261],
     | 30.00th=[ 3261], 40.00th=[ 3261], 50.00th=[ 3294], 60.00th=[ 3916],
     | 70.00th=[ 5014], 80.00th=[ 7308], 90.00th=[ 7635], 95.00th=[ 7898],
     | 99.00th=[ 8586], 99.50th=[ 8979], 99.90th=[36439], 99.95th=[36963],
     | 99.99th=[43779]
   bw (  MiB/s): min= 1934, max= 4821, per=100.00%, avg=3391.67, stdev=1266.42, samples=239
   iops        : min= 7738, max=19286, avg=13566.67, stdev=5065.65, samples=239
  lat (usec)   : 1000=0.01%
  lat (msec)   : 2=0.03%, 4=61.10%, 10=38.62%, 20=0.11%, 50=0.15%
  cpu          : usr=17.13%, sys=14.38%, ctx=1352501, majf=0, minf=11
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=0,1624829,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
  WRITE: bw=3385MiB/s (3549MB/s), 3385MiB/s-3385MiB/s (3549MB/s-3549MB/s), io=397GiB (426GB), run=120008-120008msec

Disk stats (read/write):
    dm-2: ios=129/1624787, sectors=1872/831364040, merge=0/0, ticks=185/6960387, in_queue=6960572, util=100.00%, aggrios=130/1626025, aggsectors=1880/831915888, aggrmerge=0/0, aggrticks=194/6967818, aggrin_queue=6968012, aggrutil=99.97%
    dm-0: ios=130/1626025, sectors=1880/831915888, merge=0/0, ticks=194/6967818, in_queue=6968012, util=99.97%, aggrios=130/1606095, aggsectors=1880/831915888, aggrmerge=0/19930, aggrticks=204/6634488, aggrin_queue=6635288, aggrutil=58.59%
  nvme0n1: ios=130/1606095, sectors=1880/831915888, merge=0/19930, ticks=204/6634488, in_queue=6635288, util=58.59%

With this change, on the block device in the VM:

throughput-test-job: (groupid=0, jobs=1): err= 0: pid=667: Tue Nov  5 09:53:19 2024
  write: IOPS=13.2k, BW=3293MiB/s (3453MB/s)(386GiB/120008msec); 0 zone resets
    slat (usec): min=4, max=3518, avg=27.77, stdev=35.32
    clat (usec): min=723, max=44252, avg=4829.82, stdev=2222.41
     lat (usec): min=735, max=44270, avg=4857.85, stdev=2223.45
    clat percentiles (usec):
     |  1.00th=[ 3097],  5.00th=[ 3195], 10.00th=[ 3195], 20.00th=[ 3228],
     | 30.00th=[ 3261], 40.00th=[ 3294], 50.00th=[ 3621], 60.00th=[ 4555],
     | 70.00th=[ 5997], 80.00th=[ 7242], 90.00th=[ 7570], 95.00th=[ 7898],
     | 99.00th=[ 8586], 99.50th=[ 8848], 99.90th=[36439], 99.95th=[36963],
     | 99.99th=[40633]
   bw (  MiB/s): min= 1914, max= 4857, per=100.00%, avg=3299.46, stdev=1180.81, samples=239
   iops        : min= 7658, max=19430, avg=13197.77, stdev=4723.22, samples=239
  lat (usec)   : 750=0.01%, 1000=0.01%
  lat (msec)   : 2=0.01%, 4=52.79%, 10=46.95%, 20=0.10%, 50=0.14%
  cpu          : usr=25.95%, sys=16.71%, ctx=1111821, majf=0, minf=10
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=0,1580693,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
  WRITE: bw=3293MiB/s (3453MB/s), 3293MiB/s-3293MiB/s (3453MB/s-3453MB/s), io=386GiB (414GB), run=120008-120008msec

Disk stats (read/write):
  vdb: ios=60/1953213, merge=0/0, ticks=14/8229134, in_queue=8229149, util=100.00%

Prior to this change, on the block device in the VM:

throughput-test-job: (groupid=0, jobs=1): err= 0: pid=667: Tue Nov  5 09:37:45 2024
  write: IOPS=11.6k, BW=2888MiB/s (3028MB/s)(338GiB/120008msec); 0 zone resets
    slat (usec): min=3, max=3200, avg=18.48, stdev=24.54
    clat (usec): min=1237, max=46575, avg=5521.41, stdev=2641.99
     lat (usec): min=1249, max=46591, avg=5540.06, stdev=2643.54
    clat percentiles (usec):
     |  1.00th=[ 2999],  5.00th=[ 3163], 10.00th=[ 3195], 20.00th=[ 3261],
     | 30.00th=[ 3294], 40.00th=[ 3359], 50.00th=[ 6063], 60.00th=[ 7111],
     | 70.00th=[ 7373], 80.00th=[ 7570], 90.00th=[ 7832], 95.00th=[ 8094],
     | 99.00th=[ 8717], 99.50th=[ 9241], 99.90th=[36963], 99.95th=[37487],
     | 99.99th=[41157]
   bw (  MiB/s): min= 1936, max= 4826, per=100.00%, avg=2892.43, stdev=1202.99, samples=239
   iops        : min= 7746, max=19306, avg=11569.68, stdev=4811.98, samples=239
  lat (msec)   : 2=0.01%, 4=46.26%, 10=53.38%, 20=0.09%, 50=0.26%
  cpu          : usr=14.20%, sys=8.59%, ctx=1246257, majf=0, minf=12
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=0,1386102,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
  WRITE: bw=2888MiB/s (3028MB/s), 2888MiB/s-2888MiB/s (3028MB/s-3028MB/s), io=338GiB (363GB), run=120008-120008msec
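
For a quick comparison of the three runs, the relative figures can be derived from the reported bandwidths. The following is a minimal sketch; the helper names `percent_gain` and `percent_of` are hypothetical, and the inputs are the BW values copied from the fio summaries above.

```rust
/// Relative change from `before` to `after`, in percent.
/// (Hypothetical helper, for illustration only.)
fn percent_gain(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}

/// `value` expressed as a percentage of `reference`.
/// (Hypothetical helper, for illustration only.)
fn percent_of(value: f64, reference: f64) -> f64 {
    value / reference * 100.0
}

fn main() {
    // Bandwidth in MiB/s, taken from the fio "write:" lines above.
    let host_baseline = 3385.0;
    let vm_before = 2888.0;
    let vm_after = 3293.0;

    println!("VM gain from change: {:.1}%", percent_gain(vm_before, vm_after));
    println!("VM before, vs host:  {:.1}%", percent_of(vm_before, host_baseline));
    println!("VM after, vs host:   {:.1}%", percent_of(vm_after, host_baseline));
}
```

Under these numbers the change is worth roughly 14% in the VM and brings the guest to about 97% of the host baseline, up from about 85%. These are single 120-second runs, so the figures are indicative only.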

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
Rob Bradford 2024-11-05 10:08:20 +00:00
parent a6a3d247da
commit df1d6eaaee

@@ -27,7 +27,7 @@ use serde::{Deserialize, Serialize};
 use thiserror::Error;
 use virtio_bindings::virtio_blk::*;
 use virtio_bindings::virtio_config::*;
-use virtio_bindings::virtio_ring::VIRTIO_RING_F_EVENT_IDX;
+use virtio_bindings::virtio_ring::{VIRTIO_RING_F_EVENT_IDX, VIRTIO_RING_F_INDIRECT_DESC};
 use virtio_queue::{Queue, QueueOwnedT, QueueT};
 use vm_memory::{ByteValued, Bytes, GuestAddressSpace, GuestMemoryAtomic, GuestMemoryError};
 use vm_migration::{Migratable, MigratableError, Pausable, Snapshot, Snapshottable, Transportable};
@@ -627,7 +627,8 @@ impl Block {
 | (1u64 << VIRTIO_BLK_F_CONFIG_WCE)
 | (1u64 << VIRTIO_BLK_F_BLK_SIZE)
 | (1u64 << VIRTIO_BLK_F_TOPOLOGY)
-| (1u64 << VIRTIO_RING_F_EVENT_IDX);
+| (1u64 << VIRTIO_RING_F_EVENT_IDX)
+| (1u64 << VIRTIO_RING_F_INDIRECT_DESC);
 if iommu {
 avail_features |= 1u64 << VIRTIO_F_IOMMU_PLATFORM;
 }
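
The change above amounts to setting one more bit in the 64-bit feature mask that the device offers to the guest driver. The following standalone sketch shows that mechanism in isolation. It is not the code from this commit: the helpers `offered_features` and `has_feature` are hypothetical, only the ring-level features are included, and the constants are redefined locally with the bit positions from the virtio specification (the real code takes them from the `virtio_bindings` crate).

```rust
// Feature bit positions as assigned by the virtio specification.
// Redefined here so the sketch is self-contained.
const VIRTIO_RING_F_INDIRECT_DESC: u32 = 28;
const VIRTIO_RING_F_EVENT_IDX: u32 = 29;
const VIRTIO_F_IOMMU_PLATFORM: u32 = 33;

/// Hypothetical helper: builds the mask of ring-level features a device
/// offers, mirroring the shape of the expression in the diff above.
fn offered_features(iommu: bool) -> u64 {
    let mut avail_features =
        (1u64 << VIRTIO_RING_F_EVENT_IDX) | (1u64 << VIRTIO_RING_F_INDIRECT_DESC);
    if iommu {
        avail_features |= 1u64 << VIRTIO_F_IOMMU_PLATFORM;
    }
    avail_features
}

/// Hypothetical helper: reports whether `bit` is set in `features`.
fn has_feature(features: u64, bit: u32) -> bool {
    (features & (1u64 << bit)) != 0
}

fn main() {
    let features = offered_features(false);
    println!("offered features: {:#x}", features);
    println!(
        "indirect descriptors offered: {}",
        has_feature(features, VIRTIO_RING_F_INDIRECT_DESC)
    );
}
```

Offering the bit only advertises the capability. The guest driver still has to acknowledge it during feature negotiation before it places indirect descriptor tables in the queue.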