Commit Graph

68 Commits

Author SHA1 Message Date
Wei Liu
427a2bacf5 block_util: convert aligned_operations to SmallVec
The number of aligned operations can not be larger than the number of
descriptors. Initializing the capacity to 1 is good enough per the
observation that most of time there is only one data descriptor in a
given request.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2023-01-30 08:13:40 +00:00
Wei Liu
1325c76525 block_util: use SmallVec in async adaptor
Also fix a comment while at it.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2023-01-30 08:13:40 +00:00
Yong He
e2ea9e2821 block_util: add timestamp in request
Add timestamp per request, so that we could account
block process latency base request.

Signed-off-by: Yong He <alexyonghe@tencent.com>
2023-01-11 17:38:42 +00:00
Rob Bradford
db5582f7d1 block_util: Preserve ordering of sync file backend completions
For the synchronous backends efficiently preserve the order for
completion requests through the use of VecDequeue. Preserving the order
is not required but is beneficial as it matches the existing
optimisation that looks to match completions and requests.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2023-01-10 17:30:25 +00:00
Rob Bradford
ce51755109 block_util: Avoid intermediate completion queue allocation
Rather than aggregate the completion list into an intermediate vector
instead adjust the API to provide one completion item at a time.

With DHAT this shows the number of heap allocations has decreased.

Before:

    dhat: Total:     623,852 bytes in 8,157 blocks

After:

    dhat: Total:     380,444 bytes in 3,469 blocks

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2023-01-10 17:30:25 +00:00
Rob Bradford
67ff0d7819 block_util: Use SmallVec for I/O requests
The number of buffers in the request is usually one so by using SmallVec
the number of heap allocations can be significantly reduced.

DHAT reports:

Before:

dhat: Total:     1,166,412 bytes in 40,383 blocks

After:

dhat: Total:     623,852 bytes in 8,157 blocks

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2023-01-10 17:30:25 +00:00
Rob Bradford
e7a51cde78 block_util: Pass a reference to the slice of iovecs
Rather than passing the vector of iovecs for the I/O to act on pass a
reference to the slice of values inside them. This removes the explicit
container type from the API allowing the use of e.g. SmallVec.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2023-01-10 17:30:25 +00:00
Rob Bradford
eb87538100 block_util: Avoid reallocation of Vec on push
When using DHAT[1] for analysis the amount heap allocations change from:

dhat: Total:     3,186,536 bytes in 41,452 blocks

 to

dhat: Total:     1,059,816 bytes in 34,747 blocks

When running against virtio-block; this still more allocations than
virtio-pmem but a significant improvement without any cost.

[1] https://docs.rs/dhat/latest/dhat/

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2023-01-09 10:41:56 -08:00
Rob Bradford
ab8e559343 block_util: Use anonymous case to handle ioctl signature difference
Between musl and glibc there is a difference in the signature of the
ioctl libc function. Use an anonymous cast to force the type coversion.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-12-13 18:10:42 +00:00
Wei Liu
f110029acf block_util: modify or provide safety comments
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-11-18 12:50:01 +00:00
Wei Liu
3edf12accf block_util: modify and add safety comments
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-11-18 12:50:01 +00:00
Bo Chen
a9ec0f33c0 misc: Fix clippy issues
Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-11-02 09:41:43 +01:00
Rob Bradford
86c19816c6 block_util: Probe for used io_uring opcodes
The probing logic wasn't updated to reflect the actual opcodes in use
for io_uring which are vectored read/writes not the unvectored versions.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-10-31 08:53:54 +01:00
Bo Chen
2af2cc539f misc: Unify error message punctuation
Considering error messages will be mostly nested, ensuring no
punctuation at the end will make the error log more readable.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-10-21 12:19:07 +02:00
Bo Chen
6a1e637a46 block_util: Derive thiserror::Error
Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-08-16 11:45:44 +01:00
dependabot[bot]
b440cb7d23 build: bump vmm-sys-util from 0.9.0 to 0.10.0
This patch requires the vhost-user-backend crate to be bumped from 0.5.0
to 0.5.1.

Bumps [vmm-sys-util](https://github.com/rust-vmm/vmm-sys-util) from 0.9.0 to 0.10.0.
- [Release notes](https://github.com/rust-vmm/vmm-sys-util/releases)
- [Changelog](https://github.com/rust-vmm/vmm-sys-util/blob/main/CHANGELOG.md)
- [Commits](https://github.com/rust-vmm/vmm-sys-util/compare/v0.9.0...v0.10.0)

---
updated-dependencies:
- dependency-name: vmm-sys-util
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-07-20 09:40:28 +00:00
Rob Bradford
2716bc3311 build: Fix beta clippy issue (derive_partial_eq_without_eq)
warning: you are deriving `PartialEq` and can implement `Eq`
  --> vmm/src/serial_manager.rs:59:30
   |
59 | #[derive(Debug, Clone, Copy, PartialEq)]
   |                              ^^^^^^^^^ help: consider deriving `Eq` as well: `PartialEq, Eq`
   |
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#derive_partial_eq_without_eq

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-06-30 20:50:45 +01:00
Wei Liu
9019086625 block_util: drop Sync bound for DiskFile and AsyncIo
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-06-20 23:28:57 +01:00
Sebastien Boeuf
059e787cb5 virtio-devices: Rename address translation function for more clarity
Renaming translate() to translate_gva() to clarify we want to translate
a GVA address into a GPA.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-04-05 00:09:52 +02:00
Sebastien Boeuf
77df4e6773 vm-virtio: Define and implement Translatable trait
This new trait simplifies the address translation of a GuestAddress by
having GuestAddress implementing it.

The three crates virtio-devices, block_util and net_util have been
updated accordingly to rely on this new trait, helping with code
readability and limiting the amount of duplicated code.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-01-27 10:00:20 +00:00
Sebastien Boeuf
8eed276d14 vm-virtio: Define AccessPlatform trait
Moving the whole codebase to rely on the AccessPlatform definition from
vm-virtio so that we can fully remove it from virtio-queue crate.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-01-27 10:00:20 +00:00
Sebastien Boeuf
3e1ce98d1a virtio-devices: block: Handle descriptor address translation
Since we're trying to move away from the translation happening in the
virtio-queue crate, the device itself is performing the address
translation when needed.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-01-27 10:00:20 +00:00
Sebastien Boeuf
85bbf75fe8 block_util: Align buffers for O_DIRECT
Whenever the backing file of our virtio-block device is opened with
O_DIRECT, there's a requirement about the buffer address and size to be
aligned to the sector size.

We know virtio-block requests are sector aligned in terms of size, but
we must still check if the buffer address is. In case it's not, we
create an intermediate buffer that will be passed through the system
call. In case of a write operation, the content of the non-aligned
buffer must be copied beforehand, and in case of a read operation, the
content of the aligned buffer must be copied to the non-aligned one
after the operation has been completed.

Fixes #3587

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-01-20 11:49:02 +00:00
Wei Liu
ea2685e928 block_util: rewrite code and drop allow(clippy::ptr_arg)
The code can be written in a better form and the clippy warning
suppression can be dropped.

No functional change.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-01-18 16:22:21 +01:00
Sebastien Boeuf
0162d73ed8 virtio-queue: Update crate based on latest rust-vmm/vm-virtio
This crate contains up to date definition of the Queue, AvailIter,
DescriptorChain and Descriptor structures forked from the upstream
crate rust-vmm/vm-virtio 27b18af01ee2d9564626e084a758a2b496d2c618.

The following patches have been applied on top of this base in order to
make it work correctly with Cloud Hypervisor requirements:

- Add MSI vector field to the Queue

  In order to help with MSI/MSI-X support, it is convenient to store the
  value of the interrupt vector inside the Queue directly.

- Handle address translations

  For devices with access to data in memory being translated, we add to
  the Queue the ability to translate the address stored in the
  descriptor.
  It is very helpful as it performs the translation right after the
  untranslated address is read from memory, avoiding any errors from
  happening from the consumer's crate perspective. It also allows the
  consumer to reduce greatly the amount of duplicated code for applying
  the translation in many different places.

- Add helpers for Queue structure

  They are meant to help crate's consumers getting/setting information
  about the Queue.

These patches can be found on the 'ch' branch from the Cloud Hypervisor
fork: https://github.com/cloud-hypervisor/vm-virtio.git

This patch takes care of updating the Cloud Hypervisor code in
virtio-devices and vm-virtio to build correctly with the latest version
of virtio-queue.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-01-06 10:02:40 +00:00
Rob Bradford
7bb828ecf2 build: Remove io_uring feature flag
This has been part of the default features for a long time and is widely
tested.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-12-17 13:34:17 +01:00
Rob Bradford
8813456aef block_util: Move block device detection into it's own function
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-12-17 12:42:10 +01:00
Rob Bradford
6c8bd1f476 block_util: Remove duplicated logic for block size ioctls
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-12-17 12:42:10 +01:00
Rob Bradford
3b0d278ba3 block_util: Implement DiskFile::topology() for raw file types
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-12-17 12:42:10 +01:00
Rob Bradford
443b64b04f block_util: Implement DiskTopology::probe()
This method detects the underlying block topology if the disk is backed
by a block device.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-12-17 12:42:10 +01:00
Rob Bradford
ccccc94c8a block_util: Add ability to get block topology from a DiskFile
For simplicity this trait implements a default version that has a
topology with 512 byte (i.e. sector) recommended sizes.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-12-17 12:42:10 +01:00
Wei Liu
af770c814b block_util: provide and use AsyncAdaptor trait
The observation is that the code in question was used to bridge
synchronized and asynchronized code.

We can group the functions for that purpose under an adaptor trait. To
limit the scope of locking, the users of the trait are required to
implement a method to return a MutexGuard for the underlying file.

This then allows us to use concrete types (QcowFile and Vhdx) in code,
which is easier to read than a bunch of traits.

No functional change intended.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2021-12-06 15:20:29 +01:00
Wei Liu
e1151482fc block_util: handle synchronized read/write/fsync idiomatically
Previously mutex (semaphore) and file were separated. The code needed to
create artificial scopes to use mutex to protect file.

Rewrite the code to be idiomatic. The file itself is turned into a trait
object and placed inside the mutex. This requires providing a new
ReadWriteSeekFile trait to unify all helper functions.

The rewrite further simplified vhdx_sync code. The original code
contained two mutex'es for no apparent reason.

No functional change intended.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2021-12-06 09:23:51 +00:00
Wei Liu
3e536f91eb block_util: drop disk_size
It is only used by qcow_sync code. Merge it to its caller.

No functional change.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2021-12-06 09:23:51 +00:00
Wei Liu
6f49cb2860 block_util: add safety comments for impl ByteValued
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2021-11-17 14:40:51 +00:00
Rob Bradford
cfdf643237 block_util: Remove time consuming EventFd check
As part of checking if io_uring is supported various functionality is
tested. The test for whether io_uring supports EventFds is very time
consuming (~10ms) however this test can be removed as a later test will
test for functionality added after this one.

The support for register_eventfd() was released in Linux 5.1 but the
support for register_probe() was released in Linux 5.4. So if the latter
is present the former also is.

Before:

cloud-hypervisor: 4.880411ms: <vmm> INFO:vmm/src/device_manager.rs:1916 -- Creating virtio-block device: DiskConfig { path: Some("/home/rob/workloads/focal-server-cloudimg-amd64-custom-20210609-0.raw"), readonly: false, direct: false, iommu: false, num_queues: 1, queue_size: 128, vhost_user: false, vhost_socket: None, poll_queue: true, rate_limiter_config: None, id: Some("_disk0"), disable_io_uring: false, pci_segment: 0 }
cloud-hypervisor: 14.105123ms: <vmm> INFO:vmm/src/device_manager.rs:1998 -- Using asynchronous RAW disk file (io_uring)
cloud-hypervisor: 14.134837ms: <vmm> INFO:vmm/src/device_manager.rs:1916 -- Creating virtio-block device: DiskConfig { path: Some("/tmp/disk"), readonly: false, direct: false, iommu: false, num_queues: 1, queue_size: 128, vhost_user: false, vhost_socket: None, poll_queue: true, rate_limiter_config: None, id: Some("_disk1"), disable_io_uring: false, pci_segment: 0 }
cloud-hypervisor: 14.221869ms: <vmm> INFO:vmm/src/device_manager.rs:1998 -- Using asynchronous RAW disk file (io_uring)

After:

cloud-hypervisor: 3.140716ms: <vmm> INFO:vmm/src/device_manager.rs:1916 -- Creating virtio-block device: DiskConfig { path: Some("/home/rob/workloads/focal-server-cloudimg-amd64-custom-20210609-0.raw"), readonly: false, direct: false, iommu: false, num_queues: 1, queue_size: 128, vhost_user: false, vhost_socket: None, poll_queue: true, rate_limiter_config: None, id: Some("_disk0"), disable_io_uring: false, pci_segment: 0 }
cloud-hypervisor: 3.376027ms: <vmm> INFO:vmm/src/device_manager.rs:1998 -- Using asynchronous RAW disk file (io_uring)
cloud-hypervisor: 3.40446ms: <vmm> INFO:vmm/src/device_manager.rs:1916 -- Creating virtio-block device: DiskConfig { path: Some("/tmp/disk"), readonly: false, direct: false, iommu: false, num_queues: 1, queue_size: 128, vhost_user: false, vhost_socket: None, poll_queue: true, rate_limiter_config: None, id: Some("_disk1"), disable_io_uring: false, pci_segment: 0 }
cloud-hypervisor: 3.513969ms: <vmm> INFO:vmm/src/device_manager.rs:1998 -- Using asynchronous RAW disk file (io_uring)

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-12 18:09:55 +00:00
Sebastien Boeuf
0249e8641a Move Cloud Hypervisor to virtio-queue crate
Relying on the vm-virtio/virtio-queue crate from rust-vmm which has been
copied inside the Cloud Hypervisor tree, the entire codebase is moved to
the new definition of a Queue and other related structures.

The reason for this move is to follow the upstream until we get some
agreement for the patches that we need on top of that to make it
properly work with Cloud Hypervisor.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-10-22 11:38:55 +02:00
Fazla Mehrab
5db4dede28 block_util, vhdx: vhdx crate integration with the cloud hypervisor
vhdx_sync.rs in block_util implements traits to represent the vhdx
crate as a supported block device in the cloud hypervisor. The vhdx
is added to the block device list in device_manager.rs at the vmm
crate so that it can automatically detect a vhdx disk and invoke the
corresponding crate.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: Fazla Mehrab <akm.fazla.mehrab@intel.com>
2021-08-19 11:43:19 +02:00
Sebastien Boeuf
4918c1ca7f block_util, vmm: Propagate error on QcowDiskSync creation
Instead of panicking with an expect() function, the QcowDiskSync::new
function now propagates the error properly. This ensures the VMM will
not panic, which might be the source of weird errors if only one thread
exits while the VMM continues to run.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-08-11 16:44:28 -07:00
Bo Chen
5825ab2dd4 clippy: Address the issue 'needless-borrow'
Issue from beta verion of clippy:

Error:    --> vm-virtio/src/queue.rs:700:59
    |
700 |             if let Some(used_event) = self.get_used_event(&mem) {
    |                                                           ^^^^ help: change this to: `mem`
    |
    = note: `-D clippy::needless-borrow` implied by `-D warnings`
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-06-24 08:55:43 +02:00
Bo Chen
1b87f332b0 block_util: Mark dirty pages manually with "VolatileSlice::as_ptr()"
As discussed in the working PR in the upstream vm-memory crate repo,
some special functions (e.g. return raw pointers to the wrapped guest
memory) require manual dirty page tracking from their users (e.g.the
VMM). One of the special functions is `VolatileSlice::as_ptr(), which is
used in our code base for supporting async block I/O. This patch
manually mark dirty for guest pages touched while reading from block
devices.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-06-03 08:34:45 +01:00
Bo Chen
b5bcdbaf48 misc: Upgrade to use the vm-memory crate w/ dirty-page-tracking
As the first step to complete live-migration with tracking dirty-pages
written by the VMM, this commit patches the dependent vm-memory crate to
the upstream version with the dirty-page-tracking capability. Most
changes are due to the updated `GuestMemoryMmap`, `GuestRegionMmap`, and
`MmapRegion` structs which are taking an additional generic type
parameter to specify what 'bitmap backend' is used.

The above changes should be transparent to the rest of the code base,
e.g. all unit/integration tests should pass without additional changes.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-06-03 08:34:45 +01:00
Rob Bradford
719e36049b block_util: Remove unrequired serde usage from block_util
These structs are now versioned instead.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-05-21 15:12:23 +02:00
Rob Bradford
c400702272 virtio-devices: Version state structures
Version the state for device state for the virtio devices.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-05-10 14:40:27 +01:00
Rob Bradford
0c1c8881ef virtio-devices, block_util: Automatically serialized packed structs
With current serde_derive it is possible to #[derive(Serialize)] on
packed structures if they implement Copy. This allows the removal of the
manual equivalent code.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-04-16 13:27:03 +01:00
Rob Bradford
80e48b545d block_util: Address Rust 1.51.0 clippy issue (ptr-arg)
error: writing `&PathBuf` instead of `&Path` involves a new object where a slice will do.
  --> block_util/src/lib.rs:68:31
   |
68 | fn build_device_id(disk_path: &PathBuf) -> result::Result<String, Error> {
   |                               ^^^^^^^^ help: change this to: `&Path`
   |
   = note: `-D clippy::ptr-arg` implied by `-D warnings`
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#ptr_arg

error: writing `&PathBuf` instead of `&Path` involves a new object where a slice will do.
  --> block_util/src/lib.rs:83:39
   |
83 | pub fn build_disk_image_id(disk_path: &PathBuf) -> Vec<u8> {
   |                                       ^^^^^^^^ help: change this to: `&Path`
   |
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#ptr_arg

error: writing `&PathBuf` instead of `&Path` involves a new object where a slice will do.
  --> block_util/src/lib.rs:68:31
   |
68 | fn build_device_id(disk_path: &PathBuf) -> result::Result<String, Error> {
   |                               ^^^^^^^^ help: change this to: `&Path`
   |
   = note: `-D clippy::ptr-arg` implied by `-D warnings`
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#ptr_arg

error: writing `&PathBuf` instead of `&Path` involves a new object where a slice will do.
  --> block_util/src/lib.rs:83:39
   |
83 | pub fn build_disk_image_id(disk_path: &PathBuf) -> Vec<u8> {
   |                                       ^^^^^^^^ help: change this to: `&Path`
   |
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#ptr_arg

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-03-26 11:32:09 +00:00
Rob Bradford
6dc3d60b2d block_util: Address Rust 1.51.0 clippy issue (upper_case_acronyms)
error: name `TYPE_UNKNOWN` contains a capitalized acronym
  --> vm-virtio/src/lib.rs:48:5
   |
48 |     TYPE_UNKNOWN = 0xFF,
   |     ^^^^^^^^^^^^ help: consider making the acronym lowercase, except the initial letter: `Type_Unknown`
   |
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#upper_case_acronyms

error: name `GetDeviceID` contains a capitalized acronym
   --> block_util/src/lib.rs:138:5
    |
138 |     GetDeviceID,
    |     ^^^^^^^^^^^ help: consider making the acronym lowercase, except the initial letter: `GetDeviceId`
    |
    = note: `-D clippy::upper-case-acronyms` implied by `-D warnings`
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#upper_case_acronyms

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-03-26 11:32:09 +00:00
Rob Bradford
c4564f3ba8 block_util: Enhance error reporting for virtio-block usage
There are multiple reports of DescriptorChainTooShort errors and so add
some extra debugging to aid the debugging of this issue.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-03-25 13:41:54 +01:00
dependabot-preview[bot]
30cd3cb764 deps: bump io-uring from 0.4.0 to 0.5.0
Bumps [io-uring](https://github.com/tokio-rs/io-uring) from 0.4.0 to 0.5.0.
- [Release notes](https://github.com/tokio-rs/io-uring/releases)
- [Commits](https://github.com/tokio-rs/io-uring/commits)

The API was changed, hence some changes were needed to keep the code
building and functional.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-03-01 11:08:25 +00:00
Rob Bradford
cf7a05ecb5 block_util: Use vmm_sys_util::tempfile::Tempfile
This removes the requirement for an extra crate.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-02-22 14:29:53 +01:00