Commit Graph

1820 Commits

Author SHA1 Message Date
Rob Bradford
c98949bdd3 tests: Wait for VMM to exit in test_serial_file/test_console_file
This should address any flakiness as the VMM process will have
completely terminated and all files closed.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-27 17:08:49 +00:00
Rob Bradford
2f58fb8307 tests: Test rebooting works for block self spawn test
Being able to reboot the VM is an important aspect of the self spawning
behaviour.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-27 17:08:49 +00:00
Rob Bradford
e817aa6824 tests: Improve VM shutdown behaviour
Many of the tests attempted to SSH into the VM and then run "shutdown"
but don't actually check that the VM has shutdown correctly and proceed
to kill the child process. Remove the associated SSH commands and sleeps
from those tests that are not explicitly checking the shutdown
behaviour.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-27 17:08:49 +00:00
Rob Bradford
559b70cf0a tests: Make output capture optional
Only some tests require the output for the tests to be captured so
default to not capturing the output to a pipe and instead make it
controllable.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-27 17:08:49 +00:00
Rob Bradford
dae7608b1e tests: Remove duplicated network configuration
Many of the tests use an identical network configuration. Add a
GuestCommand::default_net() to generate this configuration and use it
wherever possible.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-27 17:08:49 +00:00
Rob Bradford
6466ad2112 tests: Remove duplicated disk configuration
Many of the tests use an identical disk configuration. Add a
GuestCommand::default_disks() to generate this configuration and use it
wherever possible.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-27 17:08:49 +00:00
Rob Bradford
9f1ac24bed tests: Make the GuestCommand take a reference to the guest
This will be used fill out default parts of the guest command.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-27 17:08:49 +00:00
Rob Bradford
49e70c6203 tests: Port integration tests over to GuestCommand
Rather than creating the command itself delegate to the GuestCommand.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-27 17:08:49 +00:00
Rob Bradford
67a5882415 tests: Introduce new GuestCommand to handle launching the guest
This is a thin wrapper over std::process:Command which currently only
specifies the default binary but in future will handle more default
behaviour.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-27 17:08:49 +00:00
Sebastien Boeuf
8142c823ed vmm: Move DeviceManager into an Arc<Mutex<>>
In anticipation of the support for device hotplug, this commit moves the
DeviceManager object into an Arc<Mutex<>> when the DeviceManager is
being created. The reason is, we need the DeviceManager to implement the
BusDevice trait and then provide it to the IO bus, so that IO accesses
related to device hotplug can be handled correctly.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-02-27 11:12:31 +01:00
Sergio Lopez
531f4ff6b0 vhost_user_fs: Remove an unneeded unwrap in handle_event
Remove an unneeded unwrap in handle_event.

Signed-off-by: Sergio Lopez <slp@redhat.com>
2020-02-26 12:08:12 +01:00
Sergio Lopez
e52129efb4 vhost_user_fs: Process events from HIPRIO queue
We weren't processing events arriving at the HIPRIO queue, which
implied ignoring FUSE_INTERRUPT, FUSE_FORGET, and FUSE_BATCH_FORGET
requests.

One effect of this issue was that file descriptors weren't closed on
the server, so it eventually hits RLIMIT_NOFILE. Additionally, the
guest OS may hang while attempting to unmount the filesystem.

Signed-off-by: Sergio Lopez <slp@redhat.com>
2020-02-26 12:08:12 +01:00
dependabot-preview[bot]
0c5c470247 build(deps): bump micro_http from b85757e to 8d48e73
Bumps [micro_http](https://github.com/firecracker-microvm/firecracker) from `b85757e` to `8d48e73`.
- [Release notes](https://github.com/firecracker-microvm/firecracker/releases)
- [Commits](b85757ec00...8d48e730f2)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-02-26 07:06:24 +00:00
Sebastien Boeuf
5b96dd5f70 ci: Don't give special capabilities to Rust vhost-user-fs backend
There is no reason to give some special capabilities to the Rust version
of virtiofsd since it behaves slightly differently and does not require
neither DAC_OVERRIDE nor SYS_ADMIN.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-02-25 15:05:05 +00:00
Sergio Lopez
d8d790bb7b vhost_rs: Don't check for SLAVE_SEND_FD on SET_SLAVE_REQ_FD
SLAVE_SEND_FD and SLAVE_REQ are different protocol features. Check for
SLAVE_REQ instead.

Signed-off-by: Sergio Lopez <slp@redhat.com>
2020-02-25 11:12:50 +00:00
Sergio Lopez
1c5562b656 vhost_user_fs: Add support for EVENT_IDX
Now that Queue supports EVENT_IDX, expose the feature and add support
for it in vhost_user_fs.

Signed-off-by: Sergio Lopez <slp@redhat.com>
2020-02-25 11:12:50 +00:00
Sergio Lopez
eae4f1d249 vhost_user_fs: Add support for indirect descriptors
Now that Queue supports indirect descriptors, expose the feature and
support them in vhost_user_fs too.

Signed-off-by: Sergio Lopez <slp@redhat.com>
2020-02-25 11:12:50 +00:00
Sergio Lopez
ea0bc240fd vhost_user_fs: Be honest about protocol supported features
vhost_user_fs doesn't really support all vhost protocol features, just
MQ and SLAVE_REQ, so return that in protocol_features().

Signed-off-by: Sergio Lopez <slp@redhat.com>
2020-02-25 11:12:50 +00:00
Sergio Lopez
42937c9754 vm-virtio: Add support for indirect descriptors
Indirect descriptors is a virtio feature that allows the driver to
store a table of descriptors anywhere in memory, pointing to it from a
virtqueue ring's descriptor with a particular flag.

We can't seamlessly transition from an iterator over a conventional
descriptor chain to an indirect chain, so Queue users need to
explicitly support this feature by calling Queue::is_indirect() and
Queue::new_from_indirect().

Signed-off-by: Sergio Lopez <slp@redhat.com>
2020-02-25 11:12:50 +00:00
Rob Bradford
d7b0b9842d tests: Move integration tests to their own directory
Simplify main.rs by moving the integration tests to their own directory.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-25 10:42:54 +00:00
Eryu Guan
3cb4513077 vhost_rs: control SlaveFsCacheReq with vhost-user-slave feature
We import slave_fs_cache mod under vhost-user-slave feature control,
but not the self::slave_fs_cache::SlaveFsCacheReq import.

Signed-off-by: Eryu Guan <eguan@linux.alibaba.com>
2020-02-25 09:48:04 +00:00
Qiu Wenbo
9de3ace8c7 devices: implement Aml trait for GED device
Fixes: #657

Signed-off-by: Qiu Wenbo <qiuwenbo@phytium.com.cn>
2020-02-25 08:32:16 +00:00
Sebastien Boeuf
b77fdeba2d msi/msi-x: Prevent from losing masked interrupts
We want to prevent from losing interrupts while they are masked. The
way they can be lost is due to the internals of how they are connected
through KVM. An eventfd is registered to a specific GSI, and then a
route is associated with this same GSI.

The current code adds/removes a route whenever a mask/unmask action
happens. Problem with this approach, KVM will consume the eventfd but
it won't be able to find an associated route and eventually it won't
be able to deliver the interrupt.

That's why this patch introduces a different way of masking/unmasking
the interrupts, simply by registering/unregistering the eventfd with the
GSI. This way, when the vector is masked, the eventfd is going to be
written but nothing will happen because KVM won't consume the event.
Whenever the unmask happens, the eventfd will be registered with a
specific GSI, and if there's some pending events, KVM will trigger them,
based on the route associated with the GSI.

Suggested-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-02-25 08:31:14 +00:00
dependabot-preview[bot]
8423c0897a build(deps): bump proc-macro2 from 1.0.8 to 1.0.9
Bumps [proc-macro2](https://github.com/alexcrichton/proc-macro2) from 1.0.8 to 1.0.9.
- [Release notes](https://github.com/alexcrichton/proc-macro2/releases)
- [Commits](https://github.com/alexcrichton/proc-macro2/compare/1.0.8...1.0.9)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-02-25 07:51:48 +00:00
dependabot-preview[bot]
6315f16ca7 build(deps): bump syn from 1.0.15 to 1.0.16
Bumps [syn](https://github.com/dtolnay/syn) from 1.0.15 to 1.0.16.
- [Release notes](https://github.com/dtolnay/syn/releases)
- [Commits](https://github.com/dtolnay/syn/compare/1.0.15...1.0.16)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-02-25 07:19:05 +00:00
Qiu Wenbo
4cf89d373d pci: handle extended configuration space properly
This is critical to support extended capabilitiy list.

Signed-off-by: Qiu Wenbo <qiuwenbo@phytium.com.cn>
2020-02-24 17:05:09 +01:00
Qiu Wenbo
f6b9445be7 pci: fix pci MMCONFIG address parsing
We should not assume the offset produced by ECAM is identical to the
CONFIG_ADDRESS register of legacy PCI port io enumeration.

Signed-off-by: Qiu Wenbo <qiuwenbo@phytium.com.cn>
2020-02-24 17:05:09 +01:00
Rob Bradford
77ee331be0 resources: Enable KASLR in kernel config
This option improves the security of the guest by randomising the start
address of the kernel in physical memory. We should turn this on so as
to ensure all our functionality such as memory hotplug and kernel
loading works as this is an option used widely in production.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-24 15:56:37 +00:00
Rob Bradford
bba5ef3a59 vmm: Remove deprecated CPU syntax
Remove the old way of specifying the number of vCPUs to use.

Fixes: #678

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-24 07:26:31 +01:00
Rob Bradford
374ac77c63 main, vmm: Remove deprecated --vhost-user-net
This has been superseded by using --net with vhost_user=true and
socket=<socket>

Fixes: #678

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-24 07:26:31 +01:00
Rob Bradford
ffd816ebfa main, vmm: Remove deprecated --vhost-user-blk
This has been superseded by using --disk with vhost_user=true and
socket=<socket>

Fixes: #678

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-24 07:26:31 +01:00
dependabot-preview[bot]
d04e0dc9e1 build(deps): bump crossbeam-utils from 0.7.0 to 0.7.2
Bumps [crossbeam-utils](https://github.com/crossbeam-rs/crossbeam) from 0.7.0 to 0.7.2.
- [Release notes](https://github.com/crossbeam-rs/crossbeam/releases)
- [Changelog](https://github.com/crossbeam-rs/crossbeam/blob/master/CHANGELOG.md)
- [Commits](https://github.com/crossbeam-rs/crossbeam/compare/crossbeam-utils-0.7.0...crossbeam-0.7.2)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-02-24 06:25:32 +00:00
dependabot-preview[bot]
7da5b531a0 build(deps): bump ssh2 from 0.7.1 to 0.8.0
Bumps [ssh2](https://github.com/alexcrichton/ssh2-rs) from 0.7.1 to 0.8.0.
- [Release notes](https://github.com/alexcrichton/ssh2-rs/releases)
- [Commits](https://github.com/alexcrichton/ssh2-rs/compare/0.7.1...0.8.0)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-02-24 06:24:52 +00:00
dependabot-preview[bot]
109c7f731d build(deps): bump hermit-abi from 0.1.7 to 0.1.8
Bumps [hermit-abi](https://github.com/hermitcore/rusty-hermit) from 0.1.7 to 0.1.8.
- [Release notes](https://github.com/hermitcore/rusty-hermit/releases)
- [Commits](https://github.com/hermitcore/rusty-hermit/commits)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-02-24 06:24:31 +00:00
dependabot-preview[bot]
812a6b97d3 build(deps): bump syn from 1.0.14 to 1.0.15
Bumps [syn](https://github.com/dtolnay/syn) from 1.0.14 to 1.0.15.
- [Release notes](https://github.com/dtolnay/syn/releases)
- [Commits](https://github.com/dtolnay/syn/compare/1.0.14...1.0.15)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-02-22 09:52:59 +00:00
dependabot-preview[bot]
ad307912ab build(deps): bump memchr from 2.3.2 to 2.3.3
Bumps [memchr](https://github.com/BurntSushi/rust-memchr) from 2.3.2 to 2.3.3.
- [Release notes](https://github.com/BurntSushi/rust-memchr/releases)
- [Commits](https://github.com/BurntSushi/rust-memchr/compare/2.3.2...2.3.3)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-02-22 09:52:49 +00:00
Rob Bradford
94f2fc3308 release-notes: Update for v0.5.1 bug fix release
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-21 14:31:39 +01:00
dependabot-preview[bot]
f190cb05b5 build(deps): bump libc from 0.2.66 to 0.2.67
Bumps [libc](https://github.com/rust-lang/libc) from 0.2.66 to 0.2.67.
- [Release notes](https://github.com/rust-lang/libc/releases)
- [Commits](https://github.com/rust-lang/libc/compare/0.2.66...0.2.67)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-02-21 08:03:30 +00:00
dependabot-preview[bot]
299eb28453 build(deps): bump micro_http from 6fd1545 to b85757e
Bumps [micro_http](https://github.com/firecracker-microvm/firecracker) from `6fd1545` to `b85757e`.
- [Release notes](https://github.com/firecracker-microvm/firecracker/releases)
- [Commits](6fd1545222...b85757ec00)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-02-20 17:12:23 +00:00
Sergio Lopez
d2f1749edb vmm: config: Add poll_queue property to DiskConfig
Recently, vhost_user_block gained the ability of actively polling the
queue, a feature that can be disabled with the poll_queue property.

This change adds this property to DiskConfig, so it can be used
through the "disk" argument.

For the moment, it can only be used when vhost_user=true, but this
will change once virtio-block gets the poll_queue feature too.

Fixes: #787

Signed-off-by: Sergio Lopez <slp@redhat.com>
2020-02-20 18:06:54 +01:00
Sergio Lopez
378dd81204 vmm: openapi: Add missing "direct" knob to DiskConfig
Add missing "direct" knob that should be exposed through the REST API.

Signed-off-by: Sergio Lopez <slp@redhat.com>
2020-02-20 18:06:54 +01:00
Sergio Lopez
056f5481ac vmm: openapi: Fix "readonly" and "wce" defaults in DiskConfig
Fix "readonly" and "wce" defaults in cloud-hypervisor.yaml to match
their respective defaults in config.rs:DiskConfig.

Signed-off-by: Sergio Lopez <slp@redhat.com>
2020-02-20 18:06:54 +01:00
Rob Bradford
4ebf01b344 vhost_user_backend: Don't report out socket broken errors
This is a perfectly acceptable situation as it causes the backend to
exit because the VMM has closed the connection. This addresses the
rather ugly reporting of errors from the backend that appears
interleaved with the output from the VMM.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-20 14:30:27 +00:00
Rob Bradford
b5755e9c33 vhost_rs: vhost_user: Return error when connection broken
Return an error wen recvmsg() returns without a message using the
libc::ECONNRESET error so that the upper levels will correctly
interpret this as the connection being broken.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-20 14:30:27 +00:00
Samuel Ortiz
c49e31a6d9 vmm: api: Return a resize error when resize fails
And not a VmCreate one.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-02-20 12:26:12 +01:00
Samuel Ortiz
ebc6391bea vmm: api: Fix resize command typos
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-02-20 12:26:12 +01:00
Samuel Ortiz
9de755334d vmm: openapi: Update DiskConfig
It's missing a few knobs (readonly, vhost, wce) that should be exposed
through the rest API.

Fixes: #790

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-02-20 12:17:50 +01:00
Rob Bradford
ed1e7817cc vmm: Workaround double reboot triggered by the kernel
The kernel does not adhere to the ACPI specification (probably to work
around broken hardware) and rather than busy looping after requesting an
ACPI reset it will attempt to reset by other mechanisms (such as i8042
reset.)

In order to trigger a reset the devices write to an EventFd (called
reset_evt.) This is used by the VMM to identify if a reset is requested
and make the VM reboot. As the reset_evt is part of the VMM and reused
for both the old and new VM it is possible for the newly booted VM to
immediately get reset as there is an old event sitting in the EventFd.

The simplest solution is to "drain" the reset_evt EventFd on reboot to
make sure that there is no spurious events in the EventFd.

Fixes: #783

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-19 18:51:14 +01:00
Sergio Lopez
5c06b7f862 vhost_user_block: Implement optional static polling
Actively polling the virtqueue significantly reduces the latency of
each I/O operation, at the expense of using more CPU time. This
features is specially useful when using low-latency devices (SSD,
NVMe) as the backend.

This change implements static polling. When a request arrives after
being idle, vhost_user_block will keep checking the virtqueue for new
requests, until POLL_QUEUE_US (50us) has passed without finding one.

POLL_QUEUE_US is defined to be 50us, based on the current latency of
enterprise SSDs (< 30us) and the overhead of the emulation.

This feature is enabled by default, and can be disabled by using the
"poll_queue" parameter of "block-backend".

This is a test using null_blk as a backend for the image, with the
following parameters:

 - null_blk gb=20 nr_devices=1 irqmode=2 completion_nsec=0 no_sched=1

With "poll_queue=false":

fio --ioengine=sync --bs=4k --rw randread --name randread --direct=1
--filename=/dev/vdb --time_based --runtime=10

randread: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=sync, iodepth=1
fio-3.14
Starting 1 process
Jobs: 1 (f=1): [r(1)][100.0%][r=169MiB/s][r=43.2k IOPS][eta 00m:00s]
randread: (groupid=0, jobs=1): err= 0: pid=433: Tue Feb 18 11:12:59 2020
  read: IOPS=43.2k, BW=169MiB/s (177MB/s)(1688MiB/10001msec)
    clat (usec): min=17, max=836, avg=21.64, stdev= 3.81
     lat (usec): min=17, max=836, avg=21.77, stdev= 3.81
    clat percentiles (nsec):
     |  1.00th=[19328],  5.00th=[19840], 10.00th=[20352], 20.00th=[21120],
     | 30.00th=[21376], 40.00th=[21376], 50.00th=[21376], 60.00th=[21632],
     | 70.00th=[21632], 80.00th=[21888], 90.00th=[22144], 95.00th=[22912],
     | 99.00th=[28544], 99.50th=[30336], 99.90th=[39168], 99.95th=[42752],
     | 99.99th=[71168]
   bw (  KiB/s): min=168440, max=188496, per=100.00%, avg=172912.00, stdev=3975.63, samples=19
   iops        : min=42110, max=47124, avg=43228.00, stdev=993.91, samples=19
  lat (usec)   : 20=5.90%, 50=94.08%, 100=0.02%, 250=0.01%, 500=0.01%
  lat (usec)   : 750=0.01%, 1000=0.01%
  cpu          : usr=10.35%, sys=25.82%, ctx=432417, majf=0, minf=10
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=432220,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=169MiB/s (177MB/s), 169MiB/s-169MiB/s (177MB/s-177MB/s), io=1688MiB (1770MB), run=10001-10001msec

Disk stats (read/write):
  vdb: ios=427867/0, merge=0/0, ticks=7346/0, in_queue=0, util=99.04%

With "poll_queue=true" (default):

fio --ioengine=sync --bs=4k --rw randread --name randread --direct=1
--filename=/dev/vdb --time_based --runtime=10

randread: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=sync, iodepth=1
fio-3.14
Starting 1 process
Jobs: 1 (f=1): [r(1)][100.0%][r=260MiB/s][r=66.7k IOPS][eta 00m:00s]
randread: (groupid=0, jobs=1): err= 0: pid=422: Tue Feb 18 11:14:47 2020
  read: IOPS=68.5k, BW=267MiB/s (280MB/s)(2674MiB/10001msec)
    clat (usec): min=10, max=966, avg=13.60, stdev= 3.49
     lat (usec): min=10, max=966, avg=13.70, stdev= 3.50
    clat percentiles (nsec):
     |  1.00th=[11200],  5.00th=[11968], 10.00th=[11968], 20.00th=[12224],
     | 30.00th=[12992], 40.00th=[13504], 50.00th=[13760], 60.00th=[13888],
     | 70.00th=[14016], 80.00th=[14144], 90.00th=[14272], 95.00th=[14656],
     | 99.00th=[20352], 99.50th=[23936], 99.90th=[35072], 99.95th=[36096],
     | 99.99th=[47872]
   bw (  KiB/s): min=265456, max=296456, per=100.00%, avg=274229.05, stdev=13048.14, samples=19
   iops        : min=66364, max=74114, avg=68557.26, stdev=3262.03, samples=19
  lat (usec)   : 20=98.84%, 50=1.15%, 100=0.01%, 250=0.01%, 500=0.01%
  lat (usec)   : 750=0.01%, 1000=0.01%
  cpu          : usr=8.24%, sys=21.15%, ctx=684669, majf=0, minf=10
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=684611,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=267MiB/s (280MB/s), 267MiB/s-267MiB/s (280MB/s-280MB/s), io=2674MiB (2804MB), run=10001-10001msec

Disk stats (read/write):
  vdb: ios=677855/0, merge=0/0, ticks=7026/0, in_queue=0, util=99.04%

Signed-off-by: Sergio Lopez <slp@redhat.com>
2020-02-19 17:13:47 +00:00
Sergio Lopez
0e4e27ea9d vhost_user_block: Make use of the EVENT_IDX feature
Now that vhost_user_backend and vm-virtio do support EVENT_IDX, use it
in vhost_user_block to reduce the number of notifications sent between
the driver and the device.

This is specially useful when using active polling on the virtqueue,
as it'll be implemented by a future patch.

This is a snapshot of kvm_stat while generating ~60K IOPS with fio on
the guest without EVENT_IDX:

 Event                                         Total %Total CurAvg/s
 kvm_entry                                    393454   20.3    62494
 kvm_exit                                     393446   20.3    62494
 kvm_apic_accept_irq                          378146   19.5    60268
 kvm_msi_set_irq                              369720   19.0    58881
 kvm_fast_mmio                                370497   19.1    58817
 kvm_hv_timer_state                            10197    0.5     1715
 kvm_msr                                        8770    0.5     1443
 kvm_wait_lapic_expire                          7018    0.4     1118
 kvm_apic                                       2768    0.1      538
 kvm_pv_tlb_flush                               2028    0.1      360
 kvm_vcpu_wakeup                                1453    0.1      278
 kvm_apic_ipi                                   1384    0.1      269
 kvm_fpu                                        1148    0.1      164
 kvm_pio                                         574    0.0	  82
 kvm_userspace_exit                              574    0.0	  82
 kvm_halt_poll_ns                                 24    0.0	   3

And this is the snapshot while doing the same thing with EVENT_IDX:

 Event                                         Total %Total CurAvg/s
 kvm_entry                                     35506   26.0     3873
 kvm_exit                                      35499   26.0     3873
 kvm_hv_timer_state                            14740   10.8     1672
 kvm_apic_accept_irq                           13017    9.5     1438
 kvm_msr                                       12845    9.4     1421
 kvm_wait_lapic_expire                         10422    7.6     1118
 kvm_apic                                       3788    2.8      502
 kvm_pv_tlb_flush                               2708    2.0      340
 kvm_vcpu_wakeup                                1992    1.5      258
 kvm_apic_ipi                                   1894    1.4      251
 kvm_fpu                                        1476    1.1      164
 kvm_pio                                         738    0.5       82
 kvm_userspace_exit                              738    0.5	  82
 kvm_msi_set_irq                                 701    0.5	  69
 kvm_fast_mmio                                   238    0.2        4
 kvm_halt_poll_ns                                 50    0.0        1
 kvm_ple_window_update                            28    0.0        0
 kvm_page_fault                                    4    0.0        0

It can be clearly appreciated how the number of vm exits per second,
specially the ones related to notifications (kvm_fast_mmio and
kvm_msi_set_irq) is drastically lower.

Signed-off-by: Sergio Lopez <slp@redhat.com>
2020-02-19 17:13:47 +00:00