Commit Graph

14 Commits

Author SHA1 Message Date
Rob Bradford
f3f398eb44 vhost_user_block: Consolidate the vhost-user-block backend syntax
Rather than repeat syntax for the vhost-user-block backend in multiple
places store it in one place and reference it from the required places.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-05-11 09:40:40 +02:00
Rob Bradford
d5bfa2dfc8 vmm, vhost_user_block: Make parameter names match --disk
Make the --block-backend parameters match the --disk parameters.

Fixes: #898

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-30 15:20:55 +02:00
Rob Bradford
ca3b39c0be bin: Fix wrapping in help strings
Some of the help strings had extra newlines in them or otherwise strange
wrapping. The strings were rewrapped with the nightly version of rustfmt
that supports string formatting.

Fixes: #899

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-12 18:03:18 +00:00
Sergio Lopez
5c06b7f862 vhost_user_block: Implement optional static polling
Actively polling the virtqueue significantly reduces the latency of
each I/O operation, at the expense of using more CPU time. This
features is specially useful when using low-latency devices (SSD,
NVMe) as the backend.

This change implements static polling. When a request arrives after
being idle, vhost_user_block will keep checking the virtqueue for new
requests, until POLL_QUEUE_US (50us) has passed without finding one.

POLL_QUEUE_US is defined to be 50us, based on the current latency of
enterprise SSDs (< 30us) and the overhead of the emulation.

This feature is enabled by default, and can be disabled by using the
"poll_queue" parameter of "block-backend".

This is a test using null_blk as a backend for the image, with the
following parameters:

 - null_blk gb=20 nr_devices=1 irqmode=2 completion_nsec=0 no_sched=1

With "poll_queue=false":

fio --ioengine=sync --bs=4k --rw randread --name randread --direct=1
--filename=/dev/vdb --time_based --runtime=10

randread: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=sync, iodepth=1
fio-3.14
Starting 1 process
Jobs: 1 (f=1): [r(1)][100.0%][r=169MiB/s][r=43.2k IOPS][eta 00m:00s]
randread: (groupid=0, jobs=1): err= 0: pid=433: Tue Feb 18 11:12:59 2020
  read: IOPS=43.2k, BW=169MiB/s (177MB/s)(1688MiB/10001msec)
    clat (usec): min=17, max=836, avg=21.64, stdev= 3.81
     lat (usec): min=17, max=836, avg=21.77, stdev= 3.81
    clat percentiles (nsec):
     |  1.00th=[19328],  5.00th=[19840], 10.00th=[20352], 20.00th=[21120],
     | 30.00th=[21376], 40.00th=[21376], 50.00th=[21376], 60.00th=[21632],
     | 70.00th=[21632], 80.00th=[21888], 90.00th=[22144], 95.00th=[22912],
     | 99.00th=[28544], 99.50th=[30336], 99.90th=[39168], 99.95th=[42752],
     | 99.99th=[71168]
   bw (  KiB/s): min=168440, max=188496, per=100.00%, avg=172912.00, stdev=3975.63, samples=19
   iops        : min=42110, max=47124, avg=43228.00, stdev=993.91, samples=19
  lat (usec)   : 20=5.90%, 50=94.08%, 100=0.02%, 250=0.01%, 500=0.01%
  lat (usec)   : 750=0.01%, 1000=0.01%
  cpu          : usr=10.35%, sys=25.82%, ctx=432417, majf=0, minf=10
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=432220,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=169MiB/s (177MB/s), 169MiB/s-169MiB/s (177MB/s-177MB/s), io=1688MiB (1770MB), run=10001-10001msec

Disk stats (read/write):
  vdb: ios=427867/0, merge=0/0, ticks=7346/0, in_queue=0, util=99.04%

With "poll_queue=true" (default):

fio --ioengine=sync --bs=4k --rw randread --name randread --direct=1
--filename=/dev/vdb --time_based --runtime=10

randread: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=sync, iodepth=1
fio-3.14
Starting 1 process
Jobs: 1 (f=1): [r(1)][100.0%][r=260MiB/s][r=66.7k IOPS][eta 00m:00s]
randread: (groupid=0, jobs=1): err= 0: pid=422: Tue Feb 18 11:14:47 2020
  read: IOPS=68.5k, BW=267MiB/s (280MB/s)(2674MiB/10001msec)
    clat (usec): min=10, max=966, avg=13.60, stdev= 3.49
     lat (usec): min=10, max=966, avg=13.70, stdev= 3.50
    clat percentiles (nsec):
     |  1.00th=[11200],  5.00th=[11968], 10.00th=[11968], 20.00th=[12224],
     | 30.00th=[12992], 40.00th=[13504], 50.00th=[13760], 60.00th=[13888],
     | 70.00th=[14016], 80.00th=[14144], 90.00th=[14272], 95.00th=[14656],
     | 99.00th=[20352], 99.50th=[23936], 99.90th=[35072], 99.95th=[36096],
     | 99.99th=[47872]
   bw (  KiB/s): min=265456, max=296456, per=100.00%, avg=274229.05, stdev=13048.14, samples=19
   iops        : min=66364, max=74114, avg=68557.26, stdev=3262.03, samples=19
  lat (usec)   : 20=98.84%, 50=1.15%, 100=0.01%, 250=0.01%, 500=0.01%
  lat (usec)   : 750=0.01%, 1000=0.01%
  cpu          : usr=8.24%, sys=21.15%, ctx=684669, majf=0, minf=10
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=684611,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=267MiB/s (280MB/s), 267MiB/s-267MiB/s (280MB/s-280MB/s), io=2674MiB (2804MB), run=10001-10001msec

Disk stats (read/write):
  vdb: ios=677855/0, merge=0/0, ticks=7026/0, in_queue=0, util=99.04%

Signed-off-by: Sergio Lopez <slp@redhat.com>
2020-02-19 17:13:47 +00:00
Yang Zhong
99da1dff90 vhost-user-blk: Add MQ support in backend
Adding the num_queues parameter for vhost-user-blk backend, which
can enable MQ support in the backend.

This patch has enabled the MQ support from handle_event, and the
vhost-user-backend crate will enable multiple threads to call this
handle_event to handle read/write operations.

Signed-off-by: Yang Zhong <yang.zhong@intel.com>
2020-02-03 09:49:27 +01:00
Rob Bradford
7f73eebbdb vhost_user_block: Split launching backend into its own function
Split the basic launching functionality into its own function in the
newly added vhost_user_block crate.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-01-23 10:30:06 +00:00
Rob Bradford
1dd2451895 vhost_user_block: Refactor vhost_user_block backend code into a new crate
Extract the majority of the code that provides the vhost-user-block
backend into its own crate and port the binary to use it.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-01-23 10:30:06 +00:00
Rob Bradford
3ede2dc53a bin: vhost_user_blk: Rename "--backend" to "--block-backend"
This will prevent it from conflicting when it is aggregated into the
cloud-hypervisor binary.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-01-23 10:30:06 +00:00
Sergio Lopez
e0a8da2f46 vhost_user_blk: Add missing WCE property support
Add missing WCE (write-cache enable) property support. This not only
an enhancement, but also a fix for a bug.

Right now, when vhost_user_blk uses a qcow2 image, it doesn't write
the QCOW2 metadata until the guest explicitly requests a flush. In
practice, this is equivalent to the write back semantic.

Without WCE, the guest assumes write through for the virtio_blk
device, and doesn't send those flush requests. Adding support for WCE,
and enabling it by default, we ensure the guest does send said
requests.

Supporting "WCE = false" would require updating our qcow2
implementation to ensure that, when required, it honors the write
through semantics by not deferring the updates to QCOW2 metadata.

Signed-off-by: Sergio Lopez <slp@redhat.com>
2020-01-17 17:28:44 +00:00
Sergio Lopez
c7e9056c1e vhost_user_blk: implement support for direct (O_DIRECT) mode
Add support for opening the disk images with O_DIRECT. This allows
bypassing the host's file system cache, which is useful to avoid
polluting its cache and for better data integrity.

This mode of operation can be enabled by adding the "direct=<bool>"
parameter to the "backend" argument:

./target/debug/vhost_user_blk --backend image=test.raw,sock=/tmp/vhostblk,direct=true

The "direct" parameter defaults to "false", to preserve the original
behavior.

Signed-off-by: Sergio Lopez <slp@redhat.com>
2020-01-17 17:28:44 +00:00
Sergio Lopez
a14aee9213 qcow: Use RawFile as backend instead of File
Use RawFile as backend instead of File. This allows us to abstract
the access to the actual image with a specialized layer, so we have a
place where we can deal with the low-level peculiarities.

Signed-off-by: Sergio Lopez <slp@redhat.com>
2020-01-17 17:28:44 +00:00
Sergio Lopez
c5a656c9dc vm-virtio: block: Add support for alignment restrictions
Doing I/O on an image opened with O_DIRECT requires to adhere to
certain restrictions, requiring the following elements to be aligned:

 - Address of the source/destination memory buffer.
 - File offset.
 - Length of the data to be read/written.

The actual alignment value depends on various elements, and according
to open(2) "(...) there is currently no filesystem-independent
interface for an application to discover these restrictions (...)".

To discover such value, we iterate through a list of alignments
(currently, 512 and 4096) calling pread() with each one and checking
if the operation succeeded.

We also extend RawFile so it can be used as a backend for QcowFile,
so the later can be easily adapted to support O_DIRECT too.

Signed-off-by: Sergio Lopez <slp@redhat.com>
2020-01-17 17:28:44 +00:00
Yang Zhong
cee01edb97 vhost-user-blk backend: add readonly support
The current backend only support rw, and we also need
add readonly support.

The new command:
vhost_user_blk \
  --backend "image=/home/test.img, \
            sock=/home/path/vhost.socket, \
            readonly=true"

Signed-off-by: Yang Zhong <yang.zhong@intel.com>
2019-12-18 09:45:11 +01:00
Sergio Lopez
5870452d25 src: add vhost-user-blk backend
Create a vhost-user-blk backend using vhost-user-backend and following
the conventions established by the existing vhost-user-net
implementation.

This backend is based on https://github.com/slp/vhost-user-backend,
but a bit simplified, making it closer to the original implementation
in Firecracker. The main features missing are EVENT_IDX, support for
asynchronous I/O and multiqueue, but it's still fully functional and
provides a good starting point for evolving it into a more complete
implementation.

Signed-off-by: Sergio Lopez <slp@redhat.com>
2019-11-07 10:36:30 +00:00