Commit Graph

484 Commits

Author SHA1 Message Date
Sebastien Boeuf
7731d2f1be virtio-devices: pmem: Handle descriptor address translation
Since we're trying to move away from the translation happening in the
virtio-queue crate, the device itself is performing the address
translation when needed.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-01-27 10:00:20 +00:00
Sebastien Boeuf
4becb11a44 virtio-devices: net: Handle descriptor address translation
Since we're trying to move away from the translation happening in the
virtio-queue crate, the device itself is performing the address
translation when needed.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-01-27 10:00:20 +00:00
Sebastien Boeuf
ce984b73f5 virtio-devices: console: Handle descriptor address translation
Since we're trying to move away from the translation happening in the
virtio-queue crate, the device itself is performing the address
translation when needed.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-01-27 10:00:20 +00:00
Sebastien Boeuf
3e1ce98d1a virtio-devices: block: Handle descriptor address translation
Since we're trying to move away from the translation happening in the
virtio-queue crate, the device itself is performing the address
translation when needed.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-01-27 10:00:20 +00:00
Sebastien Boeuf
75b9e70ec8 virtio-devices: Set AccessPlatform trait through VirtioDevice
Add a new method set_access_platform() to the VirtioDevice trait in
order to allow an AccessPlatform trait to be setup on any virtio device.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-01-27 10:00:20 +00:00
Sebastien Boeuf
7d09df468d virtio-devices: Remove unused method from VirtioDevice trait
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-01-27 10:00:20 +00:00
Sebastien Boeuf
ce6446501d virtio-devices: Handle queue addresses translation
Upon the enablement of the queue by the guest, we perform a translation
of the descriptor table, the available ring and used ring addresses
prior to enabling the device itself. This only applies to the case where
the device is placed behind a vIOMMU, which is the reason why the
translation is needed. Indeed, the addresses allocated by the guest are
IOVAs which must be translated into GPAs before we can access the queue.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-01-27 10:00:20 +00:00
Sebastien Boeuf
de3e003e3e virtio-devices: Handle virtio queues interrupts from transport layer
Instead of relying on the virtio-queue crate to store the information
about the MSI-X vectors for each queue, we handle this directly from the
PCI transport layer.

This is the first step in getting closer to the upstream version of
virtio-queue so that we can eventually move fully to the upstream
version.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-01-25 12:01:12 +01:00
Rob Bradford
53caa565bb virtio-devices: Add openat() syscall to seccomp filter
When freeing memory sometimes glibc will attempt to read
"/proc/sys/vm/overcommit_memory" to find out how it should release the
blocks. This happens sporadically with Cloud Hypervisor but has been
seen in use. It is not necessary to add the read() syscall to the list
as it is already included in the virtio devices common set. Similarly
the vCPU and vmm threads already have both these in the allowed list.

Fixes: #3609

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-21 17:58:15 +00:00
Sebastien Boeuf
85bbf75fe8 block_util: Align buffers for O_DIRECT
Whenever the backing file of our virtio-block device is opened with
O_DIRECT, there's a requirement about the buffer address and size to be
aligned to the sector size.

We know virtio-block requests are sector aligned in terms of size, but
we must still check if the buffer address is. In case it's not, we
create an intermediate buffer that will be passed through the system
call. In case of a write operation, the content of the non-aligned
buffer must be copied beforehand, and in case of a read operation, the
content of the aligned buffer must be copied to the non-aligned one
after the operation has been completed.

Fixes #3587

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-01-20 11:49:02 +00:00
Wei Liu
4f05b8463c virtio-devices: fix clippy::needless_range_loop
Use iterator instead.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-01-18 17:23:27 -08:00
Sebastien Boeuf
0162d73ed8 virtio-queue: Update crate based on latest rust-vmm/vm-virtio
This crate contains up to date definition of the Queue, AvailIter,
DescriptorChain and Descriptor structures forked from the upstream
crate rust-vmm/vm-virtio 27b18af01ee2d9564626e084a758a2b496d2c618.

The following patches have been applied on top of this base in order to
make it work correctly with Cloud Hypervisor requirements:

- Add MSI vector field to the Queue

  In order to help with MSI/MSI-X support, it is convenient to store the
  value of the interrupt vector inside the Queue directly.

- Handle address translations

  For devices with access to data in memory being translated, we add to
  the Queue the ability to translate the address stored in the
  descriptor.
  It is very helpful as it performs the translation right after the
  untranslated address is read from memory, avoiding any errors from
  happening from the consumer's crate perspective. It also allows the
  consumer to reduce greatly the amount of duplicated code for applying
  the translation in many different places.

- Add helpers for Queue structure

  They are meant to help crate's consumers getting/setting information
  about the Queue.

These patches can be found on the 'ch' branch from the Cloud Hypervisor
fork: https://github.com/cloud-hypervisor/vm-virtio.git

This patch takes care of updating the Cloud Hypervisor code in
virtio-devices and vm-virtio to build correctly with the latest version
of virtio-queue.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-01-06 10:02:40 +00:00
Rob Bradford
9ef1187f4a vmm, pci: Fix potential deadlock in PCI BAR allocation
The allocator is locked by both the BAR allocation code and the
interrupt allocation code. Resulting in a potential lock inversion
error.

WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock) (pid=26318)
  Cycle in lock order graph: M87 (0x7b0c00001e30) => M28 (0x7b0c00001830) => M87

  Mutex M28 acquired here while holding mutex M87 in thread T1:
    #0 pthread_mutex_lock /rustc/llvm/src/llvm-project/compiler-rt/lib/tsan/../sanitizer_common/sanitizer_common_interceptors.inc:4249:3 (cloud-hypervisor+0x9c368)
    #1 std::sys::unix::mutex::Mutex:🔒:hcd1b9aa06ff775d3 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/mutex.rs:63:17 (cloud-hypervisor+0x663954)
    #2 std::sys_common::mutex::MovableMutex::raw_lock::hff98d0b036469bca /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/mutex.rs:76:18 (cloud-hypervisor+0x491bae)
    #3 std::sync::mutex::Mutex$LT$T$GT$:🔒:hc61622e5536f5b72 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sync/mutex.rs:267:13 (cloud-hypervisor+0x36d07b)
    #4 _$LT$vmm..interrupt..MsiInterruptManager$LT$kvm_bindings..x86..bindings..kvm_irq_routing_entry$GT$$u20$as$u20$vm_device..interrupt..InterruptManager$GT$::create_group::hd412b5e1e8eeacc2 /home/rob/src/cloud-hypervisor/vmm/src/interrupt.rs:310:29 (cloud-hypervisor+0x6d1403)
    #5 virtio_devices::transport::pci_device::VirtioPciDevice:🆕:h3af603c3f00f4b3d /home/rob/src/cloud-hypervisor/virtio-devices/src/transport/pci_device.rs:376:38 (cloud-hypervisor+0x8e6137)
    #6 vmm::device_manager::DeviceManager::add_virtio_pci_device::h23608151d7668a1c /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:3333:37 (cloud-hypervisor+0x3b6339)
    #7 vmm::device_manager::DeviceManager::add_pci_devices::h136cc20cbeb6b977 /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:1236:30 (cloud-hypervisor+0x390aad)
    #8 vmm::device_manager::DeviceManager::create_devices::h29fc5b8a20e1aea5 /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:1155:9 (cloud-hypervisor+0x38f48c)
    #9 vmm::vm::Vm:🆕:h43efe7c6cd97ede5 /home/rob/src/cloud-hypervisor/vmm/src/vm.rs:799:9 (cloud-hypervisor+0x334641)
    #10 vmm::Vmm::vm_boot::h06bdf54b95d5e14f /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:379:26 (cloud-hypervisor+0x2e5ba8)
    #11 vmm::Vmm::control_loop::h40c9b48c7b800bed /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:1299:48 (cloud-hypervisor+0x2f44e0)
    #12 vmm::start_vmm_thread::_$u7b$$u7b$closure$u7d$$u7d$::h016d2f7cff698175 /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:263:17 (cloud-hypervisor+0x1dda20)
    #13 std::sys_common::backtrace::__rust_begin_short_backtrace::h7fd2df3e7cfba503 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/backtrace.rs:123:18 (cloud-hypervisor+0x659bca)
    #14 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h89880b05fe892d7e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:477:17 (cloud-hypervisor+0x44100e)
    #15 _$LT$core..panic..unwind_safe..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h487382524d80571f /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/panic/unwind_safe.rs:271:9 (cloud-hypervisor+0x6dda5e)
    #16 std::panicking::try::do_call::h1d9c2ccdc39f3322 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:406:40 (cloud-hypervisor+0x6d36d1)
    #17 __rust_try 3hkmq3dzyyv5ejsx (cloud-hypervisor+0x6d4718)
    #18 std::panicking::try::h251306df23d21913 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:370:19 (cloud-hypervisor+0x6d2459)
    #19 std::panic::catch_unwind::h2a9ac2fb12c3c64e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panic.rs:133:14 (cloud-hypervisor+0x57bdce)
    #20 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::h10f4c340611b55e4 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:476:30 (cloud-hypervisor+0x43f913)
    #21 core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hdd9b37241caf97b3 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/ops/function.rs:227:5 (cloud-hypervisor+0x3cf3f5)
    #22 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::ha5022a6bb7833f62 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d492)
    #23 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h481697829cbc6746 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d492)
    #24 std::sys::unix:🧵:Thread:🆕:thread_start::h6fad62c4c393bbe7 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/thread.rs:108:17 (cloud-hypervisor+0x119d492)

  Mutex M87 previously acquired by the same thread here:
    #0 pthread_mutex_lock /rustc/llvm/src/llvm-project/compiler-rt/lib/tsan/../sanitizer_common/sanitizer_common_interceptors.inc:4249:3 (cloud-hypervisor+0x9c368)
    #1 std::sys::unix::mutex::Mutex:🔒:h9a2d3e97e05c6430 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/mutex.rs:63:17 (cloud-hypervisor+0x9ea344)
    #2 std::sys_common::mutex::MovableMutex::raw_lock::h8abb3b5cf55c0264 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/mutex.rs:76:18 (cloud-hypervisor+0x96face)
    #3 std::sync::mutex::Mutex$LT$T$GT$:🔒:hecec128d40c6dd44 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sync/mutex.rs:267:13 (cloud-hypervisor+0x97120a)
    #4 virtio_devices::transport::pci_device::VirtioPciDevice:🆕:h3af603c3f00f4b3d /home/rob/src/cloud-hypervisor/virtio-devices/src/transport/pci_device.rs:356:29 (cloud-hypervisor+0x8e5c0e)
    #5 vmm::device_manager::DeviceManager::add_virtio_pci_device::h23608151d7668a1c /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:3333:37 (cloud-hypervisor+0x3b6339)
    #6 vmm::device_manager::DeviceManager::add_pci_devices::h136cc20cbeb6b977 /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:1236:30 (cloud-hypervisor+0x390aad)
    #7 vmm::device_manager::DeviceManager::create_devices::h29fc5b8a20e1aea5 /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:1155:9 (cloud-hypervisor+0x38f48c)
    #8 vmm::vm::Vm:🆕:h43efe7c6cd97ede5 /home/rob/src/cloud-hypervisor/vmm/src/vm.rs:799:9 (cloud-hypervisor+0x334641)
    #9 vmm::Vmm::vm_boot::h06bdf54b95d5e14f /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:379:26 (cloud-hypervisor+0x2e5ba8)
    #10 vmm::Vmm::control_loop::h40c9b48c7b800bed /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:1299:48 (cloud-hypervisor+0x2f44e0)
    #11 vmm::start_vmm_thread::_$u7b$$u7b$closure$u7d$$u7d$::h016d2f7cff698175 /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:263:17 (cloud-hypervisor+0x1dda20)
    #12 std::sys_common::backtrace::__rust_begin_short_backtrace::h7fd2df3e7cfba503 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/backtrace.rs:123:18 (cloud-hypervisor+0x659bca)
    #13 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h89880b05fe892d7e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:477:17 (cloud-hypervisor+0x44100e)
    #14 _$LT$core..panic..unwind_safe..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h487382524d80571f /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/panic/unwind_safe.rs:271:9 (cloud-hypervisor+0x6dda5e)
    #15 std::panicking::try::do_call::h1d9c2ccdc39f3322 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:406:40 (cloud-hypervisor+0x6d36d1)
    #16 __rust_try 3hkmq3dzyyv5ejsx (cloud-hypervisor+0x6d4718)
    #17 std::panicking::try::h251306df23d21913 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:370:19 (cloud-hypervisor+0x6d2459)
    #18 std::panic::catch_unwind::h2a9ac2fb12c3c64e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panic.rs:133:14 (cloud-hypervisor+0x57bdce)
    #19 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::h10f4c340611b55e4 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:476:30 (cloud-hypervisor+0x43f913)
    #20 core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hdd9b37241caf97b3 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/ops/function.rs:227:5 (cloud-hypervisor+0x3cf3f5)
    #21 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::ha5022a6bb7833f62 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d492)
    #22 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h481697829cbc6746 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d492)
    #23 std::sys::unix:🧵:Thread:🆕:thread_start::h6fad62c4c393bbe7 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/thread.rs:108:17 (cloud-hypervisor+0x119d492)

  Mutex M87 acquired here while holding mutex M28 in thread T1:
    #0 pthread_mutex_lock /rustc/llvm/src/llvm-project/compiler-rt/lib/tsan/../sanitizer_common/sanitizer_common_interceptors.inc:4249:3 (cloud-hypervisor+0x9c368)
    #1 std::sys::unix::mutex::Mutex:🔒:h9a2d3e97e05c6430 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/mutex.rs:63:17 (cloud-hypervisor+0x9ea344)
    #2 std::sys_common::mutex::MovableMutex::raw_lock::h8abb3b5cf55c0264 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/mutex.rs:76:18 (cloud-hypervisor+0x96face)
    #3 std::sync::mutex::Mutex$LT$T$GT$:🔒:hecec128d40c6dd44 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sync/mutex.rs:267:13 (cloud-hypervisor+0x97120a)
    #4 _$LT$virtio_devices..transport..pci_device..VirtioPciDevice$u20$as$u20$pci..device..PciDevice$GT$::allocate_bars::h39dc42b48fc8264c /home/rob/src/cloud-hypervisor/virtio-devices/src/transport/pci_device.rs:850:22 (cloud-hypervisor+0x8eb1a4)
    #5 vmm::device_manager::DeviceManager::add_pci_device::h561f6c8ed61db117 /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:3087:20 (cloud-hypervisor+0x3b0c62)
    #6 vmm::device_manager::DeviceManager::add_virtio_pci_device::h23608151d7668a1c /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:3359:20 (cloud-hypervisor+0x3b6707)
    #7 vmm::device_manager::DeviceManager::add_pci_devices::h136cc20cbeb6b977 /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:1236:30 (cloud-hypervisor+0x390aad)
    #8 vmm::device_manager::DeviceManager::create_devices::h29fc5b8a20e1aea5 /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:1155:9 (cloud-hypervisor+0x38f48c)
    #9 vmm::vm::Vm:🆕:h43efe7c6cd97ede5 /home/rob/src/cloud-hypervisor/vmm/src/vm.rs:799:9 (cloud-hypervisor+0x334641)
    #10 vmm::Vmm::vm_boot::h06bdf54b95d5e14f /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:379:26 (cloud-hypervisor+0x2e5ba8)
    #11 vmm::Vmm::control_loop::h40c9b48c7b800bed /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:1299:48 (cloud-hypervisor+0x2f44e0)
    #12 vmm::start_vmm_thread::_$u7b$$u7b$closure$u7d$$u7d$::h016d2f7cff698175 /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:263:17 (cloud-hypervisor+0x1dda20)
    #13 std::sys_common::backtrace::__rust_begin_short_backtrace::h7fd2df3e7cfba503 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/backtrace.rs:123:18 (cloud-hypervisor+0x659bca)
    #14 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h89880b05fe892d7e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:477:17 (cloud-hypervisor+0x44100e)
    #15 _$LT$core..panic..unwind_safe..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h487382524d80571f /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/panic/unwind_safe.rs:271:9 (cloud-hypervisor+0x6dda5e)
    #16 std::panicking::try::do_call::h1d9c2ccdc39f3322 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:406:40 (cloud-hypervisor+0x6d36d1)
    #17 __rust_try 3hkmq3dzyyv5ejsx (cloud-hypervisor+0x6d4718)
    #18 std::panicking::try::h251306df23d21913 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:370:19 (cloud-hypervisor+0x6d2459)
    #19 std::panic::catch_unwind::h2a9ac2fb12c3c64e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panic.rs:133:14 (cloud-hypervisor+0x57bdce)
    #20 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::h10f4c340611b55e4 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:476:30 (cloud-hypervisor+0x43f913)
    #21 core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hdd9b37241caf97b3 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/ops/function.rs:227:5 (cloud-hypervisor+0x3cf3f5)
    #22 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::ha5022a6bb7833f62 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d492)
    #23 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h481697829cbc6746 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d492)
    #24 std::sys::unix:🧵:Thread:🆕:thread_start::h6fad62c4c393bbe7 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/thread.rs:108:17 (cloud-hypervisor+0x119d492)

  Mutex M28 previously acquired by the same thread here:
    #0 pthread_mutex_lock /rustc/llvm/src/llvm-project/compiler-rt/lib/tsan/../sanitizer_common/sanitizer_common_interceptors.inc:4249:3 (cloud-hypervisor+0x9c368)
    #1 std::sys::unix::mutex::Mutex:🔒:hcd1b9aa06ff775d3 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/mutex.rs:63:17 (cloud-hypervisor+0x663954)
    #2 std::sys_common::mutex::MovableMutex::raw_lock::hff98d0b036469bca /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/mutex.rs:76:18 (cloud-hypervisor+0x491bae)
    #3 std::sync::mutex::Mutex$LT$T$GT$:🔒:hc61622e5536f5b72 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sync/mutex.rs:267:13 (cloud-hypervisor+0x36d07b)
    #4 vmm::device_manager::DeviceManager::add_pci_device::h561f6c8ed61db117 /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:3091:22 (cloud-hypervisor+0x3b0a95)
    #5 vmm::device_manager::DeviceManager::add_virtio_pci_device::h23608151d7668a1c /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:3359:20 (cloud-hypervisor+0x3b6707)
    #6 vmm::device_manager::DeviceManager::add_pci_devices::h136cc20cbeb6b977 /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:1236:30 (cloud-hypervisor+0x390aad)
    #7 vmm::device_manager::DeviceManager::create_devices::h29fc5b8a20e1aea5 /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:1155:9 (cloud-hypervisor+0x38f48c)
    #8 vmm::vm::Vm:🆕:h43efe7c6cd97ede5 /home/rob/src/cloud-hypervisor/vmm/src/vm.rs:799:9 (cloud-hypervisor+0x334641)
    #9 vmm::Vmm::vm_boot::h06bdf54b95d5e14f /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:379:26 (cloud-hypervisor+0x2e5ba8)
    #10 vmm::Vmm::control_loop::h40c9b48c7b800bed /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:1299:48 (cloud-hypervisor+0x2f44e0)
    #11 vmm::start_vmm_thread::_$u7b$$u7b$closure$u7d$$u7d$::h016d2f7cff698175 /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:263:17 (cloud-hypervisor+0x1dda20)
    #12 std::sys_common::backtrace::__rust_begin_short_backtrace::h7fd2df3e7cfba503 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/backtrace.rs:123:18 (cloud-hypervisor+0x659bca)
    #13 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h89880b05fe892d7e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:477:17 (cloud-hypervisor+0x44100e)
    #14 _$LT$core..panic..unwind_safe..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h487382524d80571f /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/panic/unwind_safe.rs:271:9 (cloud-hypervisor+0x6dda5e)
    #15 std::panicking::try::do_call::h1d9c2ccdc39f3322 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:406:40 (cloud-hypervisor+0x6d36d1)
    #16 __rust_try 3hkmq3dzyyv5ejsx (cloud-hypervisor+0x6d4718)
    #17 std::panicking::try::h251306df23d21913 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:370:19 (cloud-hypervisor+0x6d2459)
    #18 std::panic::catch_unwind::h2a9ac2fb12c3c64e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panic.rs:133:14 (cloud-hypervisor+0x57bdce)
    #19 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::h10f4c340611b55e4 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:476:30 (cloud-hypervisor+0x43f913)
    #20 core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hdd9b37241caf97b3 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/ops/function.rs:227:5 (cloud-hypervisor+0x3cf3f5)
    #21 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::ha5022a6bb7833f62 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d492)
    #22 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h481697829cbc6746 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d492)
    #23 std::sys::unix:🧵:Thread:🆕:thread_start::h6fad62c4c393bbe7 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/thread.rs:108:17 (cloud-hypervisor+0x119d492)

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-06 09:59:36 +01:00
Sebastien Boeuf
c47ab55e99 virtio-devices: Forward correct evset in vsock muxer
When forwarding an epoll event from the unix muxer to the
targeted connection event handler, the eventset the connection
registered is forwarded instead of the actual epoll
operation (IN/OUT).

For example, if the connection was registered for EPOLLIN,
and receives an EPOLLOUT, the connection will actually handle
an EPOLLOUT.

This is the root cause of previous bug, which caused the
introduction of some workarounds (i.e: handling ewouldblock
when reading after receiving EPOLLIN, which should never happen).

When matching the connection, we retrieve and use the evset of
the connection instead of the one passed as a parameter.
The compiler does not complain for an unused variable because
it was first logged in a debug! statement.

This is an unfortunate naming mistake that caused a lot of problems.

Fixes #3497

Signed-off-by: Eduard Kyvenko <eduard.kyvenko@gmail.com>
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-01-04 12:05:01 +00:00
Rob Bradford
4773e23c77 virtio-devices: block: Expose device topology
If the disk is backed by a block device on the host a non-default
topology will be available and that topology can be advertised by virtio
block.

Fixes: #3262

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-12-17 12:42:10 +01:00
Rob Bradford
4959434219 Revert "virtio-devices: net: Improve throughput with virtio features"
This reverts commit 58d25b3ccc.

This change introduced a regression when running iperf with the guest
running as the server:

marvin:~/src/cloud-hypervisor ((58d25b3c...))$ iperf  -c 192.168.249.2
------------------------------------------------------------
Client connecting to 192.168.249.2, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  1] local 192.168.249.1 port 47078 connected with 192.168.249.2 port 5001
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-10.40 sec  14.0 MBytes  11.3 Mbits/sec
marvin:~/src/cloud-hypervisor ((58d25b3c...))$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size:  128 KByte (default)
------------------------------------------------------------
[  1] local 192.168.249.1 port 5001 connected with 192.168.249.2 port 42866
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-10.01 sec  51.2 GBytes  44.0 Gbits/sec

Fixes: #3450

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-12-13 08:33:27 +00:00
Ziye Yang
9075809494 virtio-devices: Update some comments in epoll_helper.rs
Make some comments more clear.

Signed-off-by: Ziye Yang <ziye.yang@intel.com>
2021-11-23 14:03:05 +01:00
Wei Liu
32af6f9723 virtio-device: add a safety comment for a dup(2) call
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2021-11-17 23:12:11 +00:00
Wei Liu
31b3871eee virtio-devices: add or adjust comments for impl ByteValued
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2021-11-17 14:40:51 +00:00
Sebastien Boeuf
58d25b3ccc virtio-devices: net: Improve throughput with virtio features
By merging receive buffers through the VIRTIO_NET_F_MRG_RXBUF feature,
as well as enabling the use of indirect descriptors through
VIRTIO_RING_F_INDIRECT_DESC feature, we achieve better throughput for
the virtio-net device without hurting its latency.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-11-03 17:11:59 +00:00
Rob Bradford
cd9d1cf8fc pci, virtio-devices, vmm: Allocate PCI 64-bit bars per segment
Since each segment must have a non-overlapping memory range associated
with it the device memory must be equally divided amongst all segments.
A new allocator is used for each segment to ensure that BARs are
allocated from the correct address ranges. This requires changes to
PciDevice::allocate/free_bars to take that allocator and when
reallocating BARs the correct allocator must be identified from the
ranges.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
88378d17a2 vmm: Take PCI segment ID into BAR size allocation
Move the decision on whether to use a 64-bit bar up to the DeviceManager
so that it can use both the device type (e.g. block) and the PCI segment
ID to decide what size bar should be used.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Sebastien Boeuf
0249e8641a Move Cloud Hypervisor to virtio-queue crate
Relying on the vm-virtio/virtio-queue crate from rust-vmm which has been
copied inside the Cloud Hypervisor tree, the entire codebase is moved to
the new definition of a Queue and other related structures.

The reason for this move is to follow the upstream until we get some
agreement for the patches that we need on top of that to make it
properly work with Cloud Hypervisor.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-10-22 11:38:55 +02:00
Rob Bradford
84f0f332b3 virtio-devices: Use #[allow(dead_code)] for unread structs
These structs are not read on the VMM side but are used in communication
with the guest.

As identified by the new beta clippy.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-10-19 19:42:36 +01:00
Rob Bradford
48d4ccbfeb virtio-devices: Call closure directly rather than indirect
As identified by the new beta clippy.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-10-19 19:42:36 +01:00
Rob Bradford
8a56720b0f virtio-devices: Use assert!() rather than if+panic
As identified by the new beta clippy.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-10-19 19:42:36 +01:00
Sebastien Boeuf
5ac013df8b virtio-devices: vhost-user: Set reply_ack conditionally
Setting the reply_ack should depend on the set of acknowledged features
containing the REPLY_ACK flag.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-28 19:48:17 +01:00
Sebastien Boeuf
0fb24ea3ae virtio-devices: mem: Discard unplugged ranges only on activate()
In order to support correctly the snapshot/restore and migration use
cases, we must be careful with the ranges that we discard by punching
holes. On restore, there might be some ranges already plugged in,
meaning they should not be discarded. That's why we loop over the list
of blocks to discard only the ranges that are marked as unplugged.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-28 10:15:22 -07:00
Sebastien Boeuf
e390775bcb vmm, virtio-devices: Move BlocksState creation to the MemoryManager
By creating the BlocksState object in the MemoryManager, we can directly
provide it to the virtio-mem device when being created. This will allow
the MemoryManager through each VirtioMemZone to have a handle onto the
blocks that are plugged at any point in time.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-28 10:15:22 -07:00
Sebastien Boeuf
4450c44fbc virtio-devices: mem: Create a MemoryRangeTable from BlocksState
This is going to be useful to let virtio-mem report the list of ranges
that are currently plugged, so that both snapshot/restore and migration
will copy only what is needed.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-28 10:15:22 -07:00
Sebastien Boeuf
a1caa6549a vmm: Add page size as a parameter for MemoryRangeTable::from_bitmap()
This will be helpful to support the creation of a MemoryRangeTable from
virtio-mem, as it uses 2M pages.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-28 10:15:22 -07:00
Sebastien Boeuf
af3a59aa33 virtio-devices: mem: Add constructor for BlocksState
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-28 10:15:22 -07:00
Sebastien Boeuf
d7115ec656 virtio-devices: mem: Add snapshot/restore support
Adding the snapshot/restore support along with migration as well,
allowing a VM with virtio-mem devices attached to be properly
migrated.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-28 10:15:22 -07:00
Rob Bradford
43365ade2e vmm, pci: Implement virtio-mem support for vfio-user
Implement the infrastructure that lets a virtio-mem device map the guest
memory into the device. This is necessary since with virtio-mem zones
memory can be added or removed and the vfio-user device must be
informed.

Fixes: #3025

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-21 15:42:49 +01:00
Rob Bradford
fd4f32fa69 virtio-mem: Support multiple mappings
For vfio-user the mapping handler is per device and needs to be removed
when the device in unplugged.

For VFIO the mapping handler is for the default VFIO container (used
when no vIOMMU is used - using a vIOMMU does not require mappings with
virtio-mem)

To represent these two use cases use an enum for the handlers that are
stored.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-21 15:42:49 +01:00
Sebastien Boeuf
6fb88c3c5a virtio-devices: balloon: Add snapshot/restore support
Adding the snapshot/restore support along with migration as well,
allowing a VM with a virtio-balloon device attached to be properly
migrated.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-21 14:47:17 +02:00
Michael Zhao
b3fa56544c virtio-devices: iommu: Support AArch64
The MSI IOVA address on X86 and AArch64 is different.

This commit refactored the code to receive the MSI IOVA address and size
from device_manager, which provides the actual IOVA space data for both
architectures.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-09-17 12:19:46 +02:00
Sebastien Boeuf
a6040d7a30 vmm: Create a single VFIO container
For most use cases, there is no need to create multiple VFIO containers
as it causes unwanted behaviors. Especially when passing multiple
devices from the same IOMMU group, we need to use the same container so
that it can properly list the groups that have been already opened. The
correct logic was already there in vfio-ioctls, but it was incorrectly
used from our VMM implementation.

For the special case where we put a VFIO device behind a vIOMMU, we must
create one container per device, as we need to control the DMA mappings
per device, which is performed at the container level. Because we must
keep one container per device, the vIOMMU use case prevents multiple
devices attached to the same IOMMU group to be passed through the VM.
But this is a limitation that we are fine with, especially since the
vIOMMU doesn't let us group multiple devices in the same group from a
guest perspective.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-15 09:08:13 -07:00
Alyssa Ross
330b5ea3be vmm: notify virtio-console of pty resizes
When a pty is resized (using the TIOCSWINSZ ioctl -- see ioctl_tty(2)),
the kernel will send a SIGWINCH signal to the pty's foreground process
group to notify it of the resize.  This is the only way to be notified
by the kernel of a pty resize.

We can't just make the cloud-hypervisor process's process group the
foreground process group though, because a process can only set the
foreground process group of its controlling terminal, and
cloud-hypervisor's controlling terminal will often be the terminal the
user is running it in.  To work around this, we fork a subprocess in a
new process group, and set its process group to be the foreground
process group of the pty.  The subprocess additionally must be running
in a new session so that it can have a different controlling
terminal.  This subprocess writes a byte to a pipe every time the pty
is resized, and the virtio-console device can listen for this in its
epoll loop.

Alternatives I considered were to have the subprocess just send
SIGWINCH to its parent, and to use an eventfd instead of a pipe.
I decided against the signal approach because re-purposing a signal
that has a very specific meaning (even if this use was only slightly
different to its normal meaning) felt unclean, and because it would
have required using pidfds to avoid race conditions if
cloud-hypervisor had terminated, which added complexity.  I decided
against using an eventfd because using a pipe instead allows the child
to be notified (via poll(2)) when nothing is reading from the pipe any
more, meaning it can be reliably notified of parent death and
terminate itself immediately.

I used clone3(2) instead of fork(2) because without
CLONE_CLEAR_SIGHAND the subprocess would inherit signal-hook's signal
handlers, and there's no other straightforward way to restore all signal
handlers to their defaults in the child process.  The only way to do
it would be to iterate through all possible signals, or maintain a
global list of monitored signals ourselves (vmm:vm::HANDLED_SIGNALS is
insufficient because it doesn't take into account e.g. the SIGSYS
signal handler that catches seccomp violations).

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2021-09-14 15:43:25 +01:00
Alyssa Ross
98bfd1e988 virtio-devices: get tty size from the right tty
Previously, we were always getting the size from stdin, even when the
console was hooked up to a pty.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2021-09-14 15:43:25 +01:00
Alyssa Ross
28382a1491 virtio-devices: determine tty size in console
This prepares us to be able to handle console resizes in the console
device's epoll loop, which we'll have to do if the output is a pty,
since we won't get SIGWINCH from it.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2021-09-14 15:43:25 +01:00
Alyssa Ross
8abe8c679b seccomp: allow mmap everywhere brk is allowed
Musl often uses mmap to allocate memory where Glibc would use brk.
This has caused seccomp violations for me on the API and signal
handling threads.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2021-09-10 12:01:31 -07:00
Rob Bradford
33a55bac0f virtio-devices: seccomp: Split out common seccomp rules
As well as reducing the amount of code this also improves the binary
size slightly:

cargo bloat --release -n 2000 --bin cloud-hypervisor | grep virtio_devices::seccomp_filters::get_seccomp_rules

Before:
 0.1%   0.2%   7.8KiB       virtio_devices virtio_devices::seccomp_filters::get_seccomp_rules
After:
 0.0%   0.1%   3.0KiB       virtio_devices virtio_devices::seccomp_filters::get_seccomp_rules

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-10 10:11:12 -07:00
Rob Bradford
687d646c60 virtio-devices, vmm: Shutdown VMM on virtio thread panic
Shutdown the VMM in the virtio (or VMM side of vhost-user) thread
panics.

See: #3031

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-08 09:40:36 +01:00
Rob Bradford
54e523c302 virtio-devices: Use a common method for spawning virtio threads
Introduce a common solution for spawning the virtio threads which will
make it easier to add the panic handling.

During this effort I discovered that there were no seccomp filters
registered for the vhost-user-net thread nor the vhost-user-block
thread. This change also incorporates basic seccomp filters for those as
part of the refactoring.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-08 09:40:36 +01:00
Rob Bradford
e475b12cf7 virtio-devices, vmm: Upgrade restore related messages to info!()
These happen only sporadically so can be included at the info!() level.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-03 09:30:55 -07:00
Rob Bradford
c2144b5690 vmm, virtio-console: Move input reading into virtio-console thread
Move the processing of the input from stdin, PTY or file from the VMM
thread to the existing virtio-console thread. The handling of the resize
of a virtio-console has not changed but the name of the struct used to
support that has been renamed to reflect its usage.

Fixes: #3060

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-02 21:17:33 +01:00
Fazla Mehrab
5db4dede28 block_util, vhdx: vhdx crate integration with the cloud hypervisor
vhdx_sync.rs in block_util implements traits to represent the vhdx
crate as a supported block device in the cloud hypervisor. The vhdx
is added to the block device list in device_manager.rs at the vmm
crate so that it can automatically detect a vhdx disk and invoke the
corresponding crate.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: Fazla Mehrab <akm.fazla.mehrab@intel.com>
2021-08-19 11:43:19 +02:00
Bo Chen
9aba1fdee6 virtio-devices, vmm: Use syscall definitions from the libc crate
Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-08-18 10:42:19 +02:00
Bo Chen
864a5e4fe0 virtio-devices, vmm: Simplify 'get_seccomp_rules'
Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-08-18 10:42:19 +02:00
Bo Chen
7d38a1848b virtio-devices, vmm: Fix the '--seccomp false' option
We are relying on applying empty 'seccomp' filters to support the
'--seccomp false' option, which will be treated as an error with the
updated 'seccompiler' crate. This patch fixes this issue by explicitly
checking whether the 'seccomp' filter is empty before applying the
filter.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-08-18 10:42:19 +02:00
Bo Chen
08ac3405f5 virtio-devices, vmm: Move to the seccompiler crate
Fixes: #2929

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-08-18 10:42:19 +02:00
Sebastien Boeuf
6d34ed03f7 virtio-devices: vhost_user: Refactor through VhostUserCommon
Introducing a new structure VhostUserCommon allowing to factorize a lot
of the code shared between the vhost-user devices (block, fs and net).

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-08-11 17:01:12 -07:00
Sebastien Boeuf
4735cb8563 vmm, virtio-devices: Restore vhost-user devices in a dedicated way
We cannot let vhost-user devices connect to the backend when the Block,
Fs or Net object is being created during a restore/migration. The reason
is we can't have two VMs (source and destination) connected to the same
backend at the same time. That's why we must delay the connection with
the vhost-user backend until the restoration is performed.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-08-10 12:36:58 -07:00
Sebastien Boeuf
a636411522 vitio-devices: vhost_user: Factorize some part of the initialization
Introducing a new function to factorize a small part of the
initialization that is shared between a full reinitialization and a
restoration.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-08-10 12:36:58 -07:00
Sebastien Boeuf
c85aa6dfae virtio-devices: vhost_user: Kill threads upon migration completion
In order to prevent the vhost-user devices from reconnecting to the
backend after the migration has been successfully performed, we make
sure to kill the thread in charge of handling the reconnection
mechanism.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-08-10 12:36:58 -07:00
Sebastien Boeuf
152a3b98c9 virtio-devices: vhost_user: Shutdown communication after migration
During a migration, the vhost-user device talks to the backend to
retrieve the dirty pages. Once done with this, a snapshot will be taken,
meaning there's no need to communicate with the backend anymore. Closing
the communication is needed to let the destination VM being able to
connect to the same backend.

That's why we shutdown the communication with the backend in case a
migration has been started and we're asked for a snapshot.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-08-10 12:36:58 -07:00
Sebastien Boeuf
a738808604 virtio-devices: vhost_user: Make vhost-user handle optional
This anticipates the need for creating a new Blk, Fs or Net object
without having performed the connection with the vhost-user backend yet.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-08-10 12:36:58 -07:00
Sebastien Boeuf
2c54c30435 virtio-devices: vhost_user: common: Fix memory access
It was incorrect to call Vec::from_raw_parts() on the address pointing
to the shared memory log region since Vec is a Rust specific structure
that doesn't directly translate into bytes. That's why we use the same
function from std::slice in order to create a proper slice out of the
memory region, which is then copied into a Vec.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-08-10 12:36:58 -07:00
Sebastien Boeuf
adae986233 virtio-devices: vhost_user: Add LOG_SHMFD protocol feature
Now that the common vhost-user code can handle logging dirty pages
through shared memory, we need to advertise it to the vhost-user
backends with the protocol feature VHOST_USER_PROTOCOL_F_LOG_SHMFD.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-08-10 12:36:58 -07:00
Sebastien Boeuf
1c3f8236e7 virtio-devices, vm-migration: Update MigratableError types
Make sure the error types match the function from the Migratable trait.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-08-10 12:36:58 -07:00
Sebastien Boeuf
204be8611c virtio-devices: vhost_user: net: Fix wrong error message
Due to a previous copy and paste error.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-08-10 12:36:58 -07:00
Markus Theil
5b0d4bb398 virtio-devices: seccomp: allow unix socket connect in vsock thread
Allow vsocks to connect to Unix sockets on the host running
cloud-hypervisor with enabled seccomp.

Reported-by: Philippe Schaaf <philippe.schaaf@secunet.com>
Tested-by: Franz Girlich <franz.girlich@tu-ilmenau.de>
Signed-off-by: Markus Theil <markus.theil@tu-ilmenau.de>
2021-08-06 08:44:47 -07:00
Sebastien Boeuf
9d88e0b417 virtio-devices: vhost_user: Fully implement Migratable trait
All vhost-user devices are now equipped to support migration.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-08-05 06:07:00 -07:00
Sebastien Boeuf
b3f5630c27 virtio-devices: vhost_user: Add common migration logic
Adding the common vhost-user code for starting logging dirty pages when
the migration is started, and its counterpart for stopping, as well as
the code in charge of retrieving the bitmap of the dirty pages that have
been logged.

All these functions are meant to be leveraged from vhost-user devices.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-08-05 06:07:00 -07:00
Sebastien Boeuf
61994cdb14 virtio-devices: vhost_user: Store ability to migrate
Adding a simple field `migration_support` to VhostUserHandle in order to
store the information about the device supporting migration or not. The
value of this flag depends on the feature set negotiated with the
backend. It's considered as supporting migration if VHOST_F_LOG_ALL is
present in the virtio features and if VHOST_USER_PROTOCOL_F_LOG_SHMFD is
present in the vhost-user protocol features.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-08-05 06:07:00 -07:00
Arafatms
8fb53eb167 virtio-devices: vhost-user: Send set_vring_num before setup inflight I/O tracking
backend like SPDK required to know how many virt queues to be handled
before gets VHOST_USER_SET_INFLIGHT_FD message.

fix dpdk core dump while processing vhost_user_set_inflight_fd:
    #0 0x00007fffef47c347 in vhost_user_set_inflight_fd (pdev=0x7fffe2895998, msg=0x7fffe28956f0, main_fd=545) at ../lib/librte_vhost/vhost_user.c:1570
    #1 0x00007fffef47e7b9 in vhost_user_msg_handler (vid=0, fd=545) at ../lib/librte_vhost/vhost_user.c:2735
    #2 0x00007fffef46bac0 in vhost_user_read_cb (connfd=545, dat=0x7fffdc0008c0, remove=0x7fffe2895a64) at ../lib/librte_vhost/socket.c:309
    #3 0x00007fffef45b3f6 in fdset_event_dispatch (arg=0x7fffef6dc2e0 <vhost_user+8192>) at ../lib/librte_vhost/fd_man.c:286
    #4 0x00007ffff09926f3 in rte_thread_init (arg=0x15ee180) at ../lib/librte_eal/common/eal_common_thread.c:175

Signed-off-by: Arafatms <arafatms@outlook.com>
2021-08-04 09:25:00 +02:00
Muminul Islam
83c44a2411 vmm, virtio-devices: Add missing seccomp rules for MSHV
This patch adds all the seccomp rules missing for MSHV.
With this patch MSFT internal CI runs with seccomp enabled.

Signed-off-by: Muminul Islam <muislam@microsoft.com>
2021-08-03 11:09:07 -07:00
Arafatms
62b8955245 virtio-devices: vhost-user: Enable vrings after all queues ready
The vhost-user-net backend needs to prepare all queues before enabling vring.
For example, DPVGW will report 'the RX queue can't find' error, if we enable
vring immediately after kicking it out.

Signed-off-by: Arafatms <arafatms@outlook.com>
2021-07-30 11:12:16 +02:00
Sebastien Boeuf
ccafab6983 virtio-devices: vhost_user: Add snapshot/restore support
Adding the support for snapshot/restore feature for all supported
vhost-user devices.

The complexity of vhost-user-fs device makes it only partially
compatible with the feature. When using the DAX feature, there's no way
to store and remap what was previously mapped in the DAX region. And
when not using the cache region, if the filesystem is mounted, it fails
to be properly restored as this would require a special command to let
the backend know that it must remount what was already mounted before.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-07-29 06:35:03 -07:00
Sebastien Boeuf
382b37f8d1 virtio-devices: vhost_user: Add pause/resume support
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-07-29 06:35:03 -07:00
Sebastien Boeuf
d4b8c8308c virtio-devices: vhost_user: Namespace common functions
This patch moves all vhost-user common functions behind a new structure
VhostUserHandle. There is no functional changes intended, the only goal
being to prepare for storing information through this new structure,
limiting the amount of parameters that are needed for each function.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-07-29 06:35:03 -07:00
Sebastien Boeuf
dcc646f5b1 clippy: Fix redundant allocations
With the new beta version, clippy complains about redundant allocation
when using Arc<Box<dyn T>>, and suggests replacing it simply with
Arc<dyn T>.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-07-29 13:28:57 +02:00
Muminul Islam
e481f97550 vmm, virtio-devices:seccomp: Add MSHV related seccomp rule
MSHV needs SYS_clock_gettime to pause and resume
the guest VM.

Signed-off-by: Muminul Islam <muislam@microsoft.com>
2021-07-15 11:05:11 -07:00
Arafatms
3d4e27fa0a virtio-devices: Remove VIRTIO_F_RING_PACKED from default virtio features
The cloud hypervisor tells the VM and the backend to support the PACKED_RING feature,
but it actually processes various variables according to the split ring logic, such
as last_avail_index. Eventually it will cause the following error (SPDK as an example):

    vhost.c: 516:vhost_vq_packed_ring_enqueue: *ERROR*: descriptor has been used before
    vhost_blk.c: 596:process_blk_task: *ERROR*: ====== Task 0x200113784640 req_idx 0 failed ======
    vhost.c: 629:vhost_vring_desc_payload_to_iov: *ERROR*: gpa_to_vva((nil)) == NULL

Signed-off-by: Arafatms <arafatms@outlook.com>
2021-07-07 14:30:47 +02:00
Rob Bradford
b45264af75 virtio-devices, net_util, vhost_user_net: Retry writing to TAP
If writing to the TAP returns EAGAIN then listen for the TAP to be
writable. When the TAP becomes writable attempt to process the TX queue
again.

Fixes: #2807

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-06-29 12:31:11 -07:00
Rob Bradford
d9680c4c51 virtio-devices, net_util, vhost_user_net: Rename tap_event_id
When adding a TX version the RX version should be renamed to accomodate
this.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-06-29 12:31:11 -07:00
Sebastien Boeuf
d4d62fc9dc deps: Update vhost crate from 1a03a2a to 9982541
This dependency bump needed some manual handling since the API changed
quite a lot regarding some RawFd being changed into either File or
AsRawFd traits.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-06-24 13:13:19 +01:00
Bo Chen
5825ab2dd4 clippy: Address the issue 'needless-borrow'
Issue from beta verion of clippy:

Error:    --> vm-virtio/src/queue.rs:700:59
    |
700 |             if let Some(used_event) = self.get_used_event(&mem) {
    |                                                           ^^^^ help: change this to: `mem`
    |
    = note: `-D clippy::needless-borrow` implied by `-D warnings`
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-06-24 08:55:43 +02:00
Sebastien Boeuf
3c0f06c09c virtio-devices: vhost_user: Set NEED_REPLY when REPLY_ACK is supported
Now that vhost crate allows the caller to set the header flags, we can
set NEED_REPLY whenever the REPLY_ACK protocol feature is supported from
both ends.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-06-16 15:37:18 +02:00
Li Hangjing
3fed07419a virtio-balloon: use fallocate when balloon inflated
For vhost-user devices, memory should be shared between CLH and
vhost-user backend. However, madvise DONTNEED doesn't working in
this case. So, let's use fallocate PUNCH_HOLE to discard those
memory regions instead.

Signed-off-by: Li Hangjing <lihangjing@bytedance.com>
2021-06-16 09:55:22 +02:00
Fei Li
aa27f0e743 virtio-balloon: add deflate_on_oom support
Sometimes we need balloon deflate automatically to give memory
back to guest, especially for some low priority guest processes
under memory pressure. Enable deflate_on_oom to support this.

Usage: --balloon "size=0,deflate_on_oom=on" \

Signed-off-by: Fei Li <lifei.shirley@bytedance.com>
2021-06-16 09:55:22 +02:00
Sebastien Boeuf
a6fe4aa7e9 virtio-devices, vmm: Update virtio-iommu to rely on VIOT
Since using the VIRTIO configuration to expose the virtual IOMMU
topology has been deprecated, the virtio-iommu implementation must be
updated.

In order to follow the latest patchset that is about to be merged in the
upstream Linux kernel, it must rely on ACPI, and in particular the newly
introduced VIOT table to expose the information about the list of PCI
devices attached to the virtual IOMMU.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-06-15 17:05:59 +02:00
Sebastien Boeuf
744e9d06e5 virtio-devices: vhost_user: Fix wrong naming regarding reconnection
Since the reconnection thread took on the responsibility to handle
backend initiated requests as well, the variable naming should reflect
this by avoiding the "reconnect" prefix.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-06-14 08:25:15 -07:00
Jiachen Zhang
0699dc1000 virtio-devices: vhost_user: fs: Enable inflight I/O tracking
Signed-off-by: Jiachen Zhang <zhangjiachen.jaycee@bytedance.com>
2021-06-11 17:13:29 +02:00
Jiachen Zhang
2e3e64a4cf virtio-devices: vhost_user: net: Enable inflight I/O tracking
Signed-off-by: Jiachen Zhang <zhangjiachen.jaycee@bytedance.com>
2021-06-11 17:13:29 +02:00
Jiachen Zhang
36d336841c virtio-devices: vhost_user: blk: Enable inflight I/O tracking
Signed-off-by: Jiachen Zhang <zhangjiachen.jaycee@bytedance.com>
2021-06-11 17:13:29 +02:00
Jiachen Zhang
bfd4aa2fed virtio-devices: vhost_user: Support inflight I/O tracking
Vhost user INFLIGHT_SHMFD protocol feature supports inflight I/O
tracking, this commit implement the vhost-user device (master) support
of the feature. Till this commit, specific vhost-user devices (blk, fs,
or net) have not enable this feature.

Signed-off-by: Jiachen Zhang <zhangjiachen.jaycee@bytedance.com>
2021-06-11 17:13:29 +02:00
Sebastien Boeuf
1a5c6631a5 virtio-devices: vhost_user: Reconnection for slave request handler
Add the support for reconnecting the backend request handler after a
disconnection/crash happened.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-06-10 16:17:23 +02:00
Sebastien Boeuf
acec7e34fc virtio-devices: vhost_user: Factorize slave request handler
Since the slave request handler is common to all vhost-user devices, the
same way the reconnection is, it makes sense to handle the requests from
the backend through the same thread.

The reconnection thread now handles both a reconnection as well as any
request coming from the backend.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-06-10 16:17:23 +02:00
Jiachen Zhang
deca570544 virtio-devices: vhost_user: fs: Support socket reconnection handling
This commit enables socket reconnection for vhost-user-fs backends. Note
that, till this commit:

- The re-establish of the slave communication channel is no supported. So
the socket reconnection does not support virtiofsd with DAX enabled.

- Inflight I/O tracking and restoring is not supported. Therefore, only
virtio-fs daemons that are not processing inflight requests can work
normally after reconnection.

- To make the restarted virtiofsd work normally after reconnection, the
internal status of virtiofsd should also be recovered. This is not the
work of cloud-hypervisor. If the virtio-fs daemon does not support
saving or restoring its internal status, then a re-mount in guest after
socket reconnection should be performed.

Signed-off-by: Jiachen Zhang <zhangjiachen.jaycee@bytedance.com>
2021-06-10 16:17:23 +02:00
Jiachen Zhang
650cbce017 virtio-devices: vhost_user: blk: Support socket reconnection handling
This commit enables socket reconnection for vhost-user-blk backends.

Note that, till this commit, inflight I/O trakcing and restoring is not
supported. Therefore, only vhost-user-blk backend that are not processing
inflight requests can work normally after reconnection.

Signed-off-by: Jiachen Zhang <zhangjiachen.jaycee@bytedance.com>
2021-06-04 11:14:24 +02:00
Jiachen Zhang
058946772a virtio-devices: vhost_user: Set proper avail index to vhost-user backend
We should try to read the last avail index from the vring memory aera. This
is necessary when handling vhost-user socket reconnection.

Signed-off-by: Jiachen Zhang <zhangjiachen.jaycee@bytedance.com>
2021-06-04 11:14:24 +02:00
Bo Chen
2c4fa258a6 virtio-devices, vmm: Deprecate "GuestMemory::with_regions(_mut)"
Function "GuestMemory::with_regions(_mut)" were mainly temporary methods
to access the regions in `GuestMemory` as the lack of iterator-based
access, and hence they are deprecated in the upstream vm-memory crate [1].

[1] https://github.com/rust-vmm/vm-memory/issues/133

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-06-03 08:34:45 +01:00
Bo Chen
b5bcdbaf48 misc: Upgrade to use the vm-memory crate w/ dirty-page-tracking
As the first step to complete live-migration with tracking dirty-pages
written by the VMM, this commit patches the dependent vm-memory crate to
the upstream version with the dirty-page-tracking capability. Most
changes are due to the updated `GuestMemoryMmap`, `GuestRegionMmap`, and
`MmapRegion` structs which are taking an additional generic type
parameter to specify what 'bitmap backend' is used.

The above changes should be transparent to the rest of the code base,
e.g. all unit/integration tests should pass without additional changes.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-06-03 08:34:45 +01:00
Rob Bradford
280bef834b virtio-devices: Add helper to VirtioCommon for EventFd duplication
Add a helper to VirtioCommon which returns duplicates of the EventFds
for kill and pause event.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-06-02 12:39:10 -07:00
Sebastien Boeuf
df92495e31 virtio-devices: vhost_user: Fix connection retry logic
The logic was reversed, causing the retry to fail consistently.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-06-02 17:31:30 +02:00
Sebastien Boeuf
e62aafdf58 virtio-devices: Update seccomp filters for vhost-user-net control queue
The control queue was missing rt_sigprocmask syscall, which was causing
a crash when the VM was shutdown.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-06-02 17:31:18 +02:00
Sebastien Boeuf
48c8649891 virtio-devices: vhost_user: Retry connecting the backend
When in client mode, the VMM will retry connecting the backend for a
minute before giving up.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-06-02 11:59:13 +02:00
Sebastien Boeuf
b4eb015d86 virtio-devices: vhost_user: Move reconnection code to module
The reconnection code is moved to the vhost-user module as it is a
common place to be shared across all vhost-user devices.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-06-02 11:59:13 +02:00
Sebastien Boeuf
3e00247a47 virtio-devices: vhost_user: Factorize backend connection
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-06-02 11:59:13 +02:00
Sebastien Boeuf
6bce7f7937 virtio-devices: vhost_user: net: Handle reconnection
This commit implements the reconnection feature for vhost-user-net in
case the connection with the backend is shutdown.

The guest has no knowledge about what happens when a disconnection
occurs, which simplifies the code to handle the reconnection. The
reconnection happens with the backend, and the VMM goes through the
vhost-user setup once again.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-06-02 11:59:13 +02:00
Rob Bradford
2ac7232288 virtio-devices: vsock: Implement versioning for state
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-06-01 14:23:09 +02:00
Rob Bradford
905135fe8c virtio-devices: transport: Versionize VirtioPciDevice state
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-06-01 14:23:09 +02:00
Rob Bradford
57f532a760 virtio-devices, vm-virtio: Refactor queue state saving/restoring
Use a separate QueueState structure which can be versioned.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-06-01 14:23:09 +02:00
Rob Bradford
831dff8b4e virtio-devices: transport: Version common PCI config state
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-06-01 14:23:09 +02:00
Sebastien Boeuf
e9cc23ea94 virtio-devices: vhost_user: net: Move control queue back
We thought we could move the control queue to the backend as it was
making some good sense. Unfortunately, doing so was a wrong design
decision as it broke the compatibility with OVS-DPDK backend.

This is why this commit moves the control queue back to the VMM side,
meaning an additional thread is being run for handling the communication
with the guest.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-05-26 16:09:32 +01:00
Sebastien Boeuf
92c2101f7c virtio-devices: vhost_user: Enable most virtio reserved features
A lot of the VIRTIO reserved features should be supported or not by the
vhost-user backend. That means on the VMM side, these features should be
available, so that they don't get lost during the negotiation.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-05-25 12:12:22 +02:00
Sebastien Boeuf
47be87c7cc virtio-devices: vhost_user: Don't set features before ACK from guest
The VIRTIO features should not be set before they are acked from the
guest. This code was only present to overcome a vhost crate limitation
that was expecting the VIRTIO features to be set before we could fetch
and set the protocol features.

The vhost crate has been recently fixed by removing the limitation,
therefore there's no need for this workaround in the Cloud Hypervisor
codebase anymore.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-05-24 00:34:46 +02:00
Sebastien Boeuf
f583f993ee virtio-devices: Remove the need for net_util in the crate
Everything that was shared in the net_util.rs file has been now moved to
the net_util crate. The only remaining bit was only used by the
virtio-net implementation, that is why this commit moves this code to
virtio-net, and since there's nothing left in net_util.rs, it can be
removed.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-05-21 14:39:56 +02:00
Sebastien Boeuf
bcb1dfb86f virtio-devices: net: Rely on net_util crate for control queue
Since the net_util crate contains the common code needed for processing
the control queue, let's use it and remove the duplicate from inside the
virtio-devices crate.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-05-21 14:39:56 +02:00
Sebastien Boeuf
d7a69f8aa1 net_util: Move virtio-net helpers to net_util crate
Moving helpers to the net_util crate since we don't want virtio-net
common code to be split between two places. The net_util crate should be
the only place to host virtio-net common code.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-05-21 14:39:56 +02:00
Sebastien Boeuf
f36b5f3e3c virtio-devices: vhost_user: Factorize features negotiation
Factorize the virtio features and vhost-user protocol features
negotiation through a common function that blk, fs and net
implementations can directly rely on.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-05-21 12:03:54 +02:00
Sebastien Boeuf
cdaa4d3ad7 virtio-devices: vhost_user: Set features once acknowledged by guest
Make sure the virtio features are set upon device activation. At the
time the device is activated, we know the guest acknowledged the
features, which mean it's safe to set them back to the backend.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-05-21 12:03:54 +02:00
Sebastien Boeuf
8c73af048a virtio-devices: vhost_user: fs: Cleanup device creation
Prepare the device creation so that it can be factorized in a follow up
commit.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-05-21 12:03:54 +02:00
Sebastien Boeuf
e2121c5d75 virtio-devices: vhost_user: net: Cleanup device creation
Prepare the device creation so that it can be factorized in a follow up
commit.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-05-21 12:03:54 +02:00
Sebastien Boeuf
9a7199a116 virtio-devices: vhost_user: blk: Cleanup device creation
Prepare the device creation so that it can be factorized in a follow up
commit.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-05-21 12:03:54 +02:00
Rob Bradford
c05010887f virtio-devices: seccomp: Cleanup unused seccomp filter entries
The threads in question are no longer created and so no longer need
seccomp rules for them.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-05-19 18:21:47 +02:00
Sebastien Boeuf
334aa8c941 virtio-devices: vhost_user: Don't set features twice
The virtio features are negotiated and set at the time the device is
created, hence there's no need to set the features again while going
through the vhost-user setup that is performed upon queue activation.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-05-19 18:21:47 +02:00
Sebastien Boeuf
5d2df70a79 virtio-devices: vhost_user: net: Remove control queue
Now that the control queue is correctly handled by the backend, there's
no need to handle it as well from the VMM.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-05-19 18:21:47 +02:00
Dayu Liu
8160c2884b docs: Fix some typos in docs and comments
Fix some typos or misspellings without functional change.

Signed-off-by: Dayu Liu <liu.dayu@zte.com.cn>
2021-05-18 17:19:12 +01:00
Sebastien Boeuf
355e7468e4 virtio-devices: vhost-user: Don't run threads for net and block
Some refactoring is performed in order to always expect the irqfd to be
provided by VirtioInterrupt trait. In case no irqfd is available, we
simply fail initializing the vhost-user device. This allows for further
simplification since we can assume the interrupt will always be
triggered directly by the vhost-user backend without proxying through
the VMM. This allows for complete removal of the dedicated thread for
both block and net.

vhost-user-fs is a bit more complex as it requires the slave request
protocol feature in order to support DAX. That's why we still need the
VMM to interfere and therefore run a dedicated thread for it.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-05-13 09:16:27 +01:00
Rob Bradford
496ceed1d0 misc: Remove unnecessary "extern crate"
Now all crates use edition = "2018" then the majority of the "extern
crate" statements can be removed. Only those for importing macros need
to remain.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-05-12 17:26:11 +02:00
Rob Bradford
bd724fc304 virtio-devices: Stop deriving unnecessary traits
These structs only need to derive Versionize now.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-05-12 17:26:11 +02:00
Rob Bradford
b8f5911c4e misc: Remove unused errors from public interface
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-05-11 13:37:19 +02:00
Rob Bradford
c400702272 virtio-devices: Version state structures
Version the state for device state for the virtio devices.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-05-10 14:40:27 +01:00
Rob Bradford
0d3c5c966b virtio-devices: mem: Address clippy issue
error: all if blocks contain the same code at the start
   --> virtio-devices/src/mem.rs:508:9
    |
508 | /         if plug {
509 | |             let handlers =
    self.dma_mapping_handlers.lock().unwrap();
    |
|_____________________________________________________________________^
    |
    = note: `-D clippy::branches-sharing-code` implied by `-D warnings`
    = help: for further information visit
https://rust-lang.github.io/rust-clippy/master/index.html#branches_sharing_code
help: consider moving the start statements out like this
    |
508 |         let handlers = self.dma_mapping_handlers.lock().unwrap();
509 |         if plug {
    |

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-05-07 14:31:57 +02:00
Rob Bradford
18012d9ee8 virtio-devices: iommu: Address clippy issue
error: usage of `contains_key` followed by `insert` on a `BTreeMap`
   --> virtio-devices/src/iommu.rs:439:17
    |
439 | /                 if !mappings.contains_key(&domain) {
440 | |                     mappings.insert(domain, BTreeMap::new());
441 | |                 }
    | |_________________^ help: try this:
`mappings.entry(domain).or_insert_with(|| BTreeMap::new());`
    |
    = note: `-D clippy::map-entry` implied by `-D warnings`
    = help: for further information visit
https://rust-lang.github.io/rust-clippy/master/index.html#map_entry

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-05-07 14:31:57 +02:00
Rob Bradford
a6da5f9e77 virtio-devices: vsock: Address clippy issue
Issue from beta version of clippy:

  --> virtio-devices/src/vsock/csm/txbuf.rs:69:34
   |
69 |                 Box::new(unsafe {mem::MaybeUninit::<[u8;
   Self::SIZE]>::uninit().assume_init()}));
   |
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |
   = note: `#[deny(clippy::uninit_assumed_init)]` on by default
   = help: for further information visit
https://rust-lang.github.io/rust-clippy/master/index.html#uninit_assumed_init

Fix backported Firecracker a8c9dffad439557081f3435a7819bf89b87870e7.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-05-07 14:31:57 +02:00
Rob Bradford
c418074360 virtio-devices: vhost_user: fs: Don't reference packed struct
error: reference to packed field is unaligned
  --> virtio-devices/src/vhost_user/fs.rs:85:21
   |
85 |                     fs.flags[i].bits() as i32,
   |                     ^^^^^^^^^^^
   |
   = note: `-D unaligned-references` implied by `-D warnings`
   = warning: this was previously accepted by the compiler but is being
phased out; it will become a hard error in a future release!
   = note: for more information, see issue #82523
<https://github.com/rust-lang/rust/issues/82523>
   = note: fields of packed structs are not properly aligned, and
creating a misaligned reference is undefined behavior (even if that
reference is never dereferenced)

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-05-07 14:31:57 +02:00
Yi Wang
1adcf5225b virtio-device: fix some misspelled words in comment
Fix some trivial spelling problem.
No functional change.

Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
2021-05-07 09:07:36 +02:00
Sebastien Boeuf
b5c6b04b36 virtio-devices, vmm: vhost: net: Add client mode support
Adding the support for an OVS vhost-user backend to connect as the
vhost-user client. This means we introduce with this patch a new
option to our `--net` parameter. This option is called 'server' in order
to ask the VMM to run as the server for the vhost-user socket.

Fixes #1745

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-05-05 16:05:51 +02:00
Rob Bradford
656b9f97f9 virtio-devices: net: Loop over enabled queue pairs when activating
In some situations (booting with OVMF) fewer queues will be enabled
therefore we should iterate over the number of enabled queues (as passed
into VirtioDevice::activate()) rather than the number of create tap
devices.

Fixes: #2578

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-05-05 10:19:38 +02:00
Rob Bradford
51a93bc635 virtio-devices: net: Add support for VIRTIO_NET_F_CTRL_GUEST_OFFLOADS
This allows the guest to reprogram the offload settings and mitigates
issues where the Linux kernel tries to reprogram the queues even when
the feature is not advertised.

Fixes: #2528

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-04-29 10:02:10 +02:00
Rob Bradford
9ef1a68539 virtio-devices: net: Remove unnecessary Option<> around tap
This doesn't serve any benefit and just makes the code more complex.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-04-29 10:02:10 +02:00
Rob Bradford
fbc7011346 virtio-devices: net: Tolerate unsupported control queue commands
Rather than erroring out and stalling the queue instead report an error
message if the command is invalid and return an error to the guest via
the status field.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-04-29 10:02:10 +02:00
Rob Bradford
2f2bd927b3 virtio-devices: net: Remove unnecessary Clone for NetCtrl
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-04-29 10:02:10 +02:00
Rob Bradford
bd90938f08 virtio-devices: Rename CtrlVirtio to NetCtrl
This better represents its purpose.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-04-29 10:02:10 +02:00
Rob Bradford
4806357f52 virtio-devices: net: Cleanup the MQ handling in the control queue
Cleanup the control queue handling in preparation for supporting
alternative commands.

Note that this change does not make the MQ handling spec compliant.
According to the specification MQ should only be enabled once the number
of queue pairs the guest would like to use has been specified. The only
improvement towards the specication in this change is correct error
handling if the guest specifies an inappropriate number of queues (out
of range.)

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-04-29 10:02:10 +02:00
Rob Bradford
375382cb08 virtio-devices: net: Advertise full set of offload features
This is based on the offload features that can be configured on the tap.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-04-26 18:40:05 +02:00
Rob Bradford
f213083386 virtio-devices: net: Set tap offload features based on those negotiated
Configure the tap offload features to match those that the guest has
acknowledged. The function for converting virtio to tap features came
from crosvm:
4786cee521/devices/src/virtio/net.rs (115)

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-04-26 18:40:05 +02:00
Rob Bradford
dab1cab4a7 virtio-devices: Simplify device state to support Versionize
In order to support using Versionize for state structures it is necessary
to use simpler, primitive, data types in the state definitions used for
snapshot restore.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-04-23 14:24:16 +01:00
Rob Bradford
da58b65997 virtio-devices: net: Support rebooting when tap fd specfied
Duplicate the fd that is specified in the config so that be used again
after a reboot. When rebooting we destroy all VM state and restore from
the config.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-04-22 12:19:01 +02:00
Rob Bradford
bfc65bff2a virtio-devices: transport: Naturally align capability PCI bar
The PCI bar should be naturally aligned i.e. aligned to the size of the
bar itself.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-04-21 16:11:54 +01:00
Rob Bradford
85f7913bb3 virtio-devices: pci_device: Deserialise section directly to state
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-04-20 18:58:37 +02:00
Rob Bradford
0c1c8881ef virtio-devices, block_util: Automatically serialized packed structs
With current serde_derive it is possible to #[derive(Serialize)] on
packed structures if they implement Copy. This allows the removal of the
manual equivalent code.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-04-16 13:27:03 +01:00
Rob Bradford
6f5d4702d4 misc: Simplify snapshot/restore by using helper functions
Simplify snapshot & restore code by using generics to specify helper
functions that take / make a Serialize / Deserialize struct

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-04-08 16:17:14 +01:00
Bo Chen
32ad4982dd virtio-devices: Add rate limiter for the RX queue of virtio-net
Fixes: #1286

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-03-30 19:47:43 +02:00
Bo Chen
b176ddfe2a virtio-devices, vmm: Add rate limiter for the TX queue of virtio-net
Partially fixes: #1286

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-03-30 19:47:43 +02:00
Bo Chen
bfa37f89c4 virtio-devices: net: Refactor 'handle_tx_event'
This patch moves out the actual processing on the TX queue from the
`handle_tx_event()` function into a separate function,
e.g. `process_tx()`. This allows us to resume the TX queue processing
without reading from the TX queue EventFd, which is needed for rate
limiting support.

No functional change.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-03-30 19:47:43 +02:00
Bo Chen
ee871278ee virtio-devices: Move the 'rate_limiter' module to its own crate
To support I/O throttling on virt-net devices, we need to use the
'rate_limiter' module from the 'net_utils' crate. Given the
'virtio-devices' crate has dependency on the 'net_utils', we will need
to move the 'rate_limiter' module out of the 'virtio-devices' crate to
avoid circular dependency issue. Considering the 'rate_limiter' is not
virtio specific and could be reused for non virtio devices, we move it
to its own crate.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-03-30 19:47:43 +02:00
Gaelan Steele
b16fdb1b3a virtio-devices: use Option::map
It's more concise, more idiomatic Rust, and satisfies nightly clippy.

Signed-off-by: Gaelan Steele <gbs@canishe.com>
2021-03-29 09:55:29 +02:00
Rob Bradford
827229d8e4 pci: Address Rust 1.51.0 clippy issue (upper_case_acroynms)
warning: name `IORegion` contains a capitalized acronym
   --> pci/src/configuration.rs:320:5
    |
320 |     IORegion = 0x01,
    |     ^^^^^^^^ help: consider making the acronym lowercase, except the initial letter (notice the capitalization): `IoRegion`
    |
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#upper_case_acronyms

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-03-26 11:32:09 +00:00
Rob Bradford
6837de9057 vmm: Address Rust 1.51.0 clippy issue (upper_case_acroynms)
warning: name `ConvertFromUTF8` contains a capitalized acronym
  --> virtio-devices/src/vsock/unix/mod.rs:32:5
   |
32 |     ConvertFromUTF8(std::str::Utf8Error),
   |     ^^^^^^^^^^^^^^^ help: consider making the acronym lowercase, except the initial letter: `ConvertFromUtf8`
   |
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#upper_case_acronyms

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-03-26 11:32:09 +00:00
Rob Bradford
7c9a83f6e1 virtio-devices: Address Rust 1.51.0 clippy issue (needless_question_mark)
warning: Question mark operator is useless here
   --> virtio-devices/src/seccomp_filters.rs:485:5
    |
485 | /     Ok(SeccompFilter::new(
486 | |         rules.into_iter().collect(),
487 | |         SeccompAction::Log,
488 | |     )?)
    | |_______^
    |
    = note: `#[warn(clippy::needless_question_mark)]` on by default
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_question_mark
help: try
    |
485 |     SeccompFilter::new(
486 |         rules.into_iter().collect(),
487 |         SeccompAction::Log,
488 |     )
    |

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-03-26 11:32:09 +00:00
Rob Bradford
2571cc8041 virtio-devices: Address Rust 1.51.0 clippy issue (vec_init_then_push)
warning: calls to `push` immediately after creation
   --> virtio-devices/src/vhost_user/net.rs:291:13
    |
291 | /             let mut interrupt_list_sub: Vec<(Option<EventFd>, Queue)> = Vec::with_capacity(2);
292 | |             interrupt_list_sub.push(vu_interrupt_list.remove(0));
293 | |             interrupt_list_sub.push(vu_interrupt_list.remove(0));
    | |_________________________________________________________________^ help: consider using the `vec![]` macro: `let mut interrupt_list_sub: Vec<(Option<EventFd>, Queue)> = vec![..];`
    |
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#vec_init_then_push

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-03-26 11:32:09 +00:00
Rob Bradford
aa34d545f6 vm-virtio, virtio-devices: Address Rust 1.51.0 clippy issue (upper_case_acronyms)
error: name `TYPE_UNKNOWN` contains a capitalized acronym
  --> vm-virtio/src/lib.rs:48:5
   |
48 |     TYPE_UNKNOWN = 0xFF,
   |     ^^^^^^^^^^^^ help: consider making the acronym lowercase, except the initial letter: `Type_Unknown`
   |
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#upper_case_acronyms

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-03-26 11:32:09 +00:00
Rob Bradford
8b7aafad16 virtio-devices: block: Remove unused members of Error enum
These are residual enum members from a previous refactoring.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-03-24 17:27:41 +01:00
Zide Chen
f685f0e0b2 vm-virtio: initialize MSI-X vectors with VIRTIO_MSI_NO_VECTOR
In case of the virtio frontend driver doesn't need interupts for
certain queue event, it may explicitly write VIRTIO_MSI_NO_VECTOR
to the virtio common configuration, or it may doesn't configure
the event type vector at all.

This patch initializes both MSI-X configuration vector and queue vector
with VIRTIO_MSI_NO_VECTOR, so that the backend drivers won't trigger
unexpected interrupts to the guest.

Signed-off-by: Zide Chen <zide.chen@intel.com>
2021-03-17 08:05:24 +01:00
Bo Chen
6307db5699 virtio-devices: seccomp: Add 'timerfd_settime' to block device
The `timerfd_settime` syscall is required when I/O throttling is
enabled.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-03-15 20:19:33 +00:00
Bo Chen
af8def364d virtio-devices, vmm: add I/O rate limiter on block device
This patch is based on the 'rate_limiter' module from firecracker[1]. To
simplify dependencies, we reply on 'vmm-sys-util::TimerFd' instead of
the `timerfd` crate.

[1]https://github.com/firecracker-microvm/firecracker/tree/master/src/rate_limiter

Fixes: #1285

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-03-12 09:35:03 +01:00
Sebastien Boeuf
ec9e6edcd0 virtio-devices: Remove unused update_memory() from VirtioDevice trait
Now that virtio devices can be updated with add_memory_region(), there's
no need to keep update_memory() around.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-03-11 19:04:21 +01:00
Sebastien Boeuf
9e53efa3ca virtio-devices: Add support for adding a single memory region
Assuming vhost-user devices support CONFIGURE_MEM_SLOTS protocol
feature, we introduce a new method to the VirtioDevice trait in order to
update one single memory at a time.

In case CONFIGURE_MEM_SLOTS is not supported by the backend (feature not
acked), we fallback onto the current way of updating the memory
mappings, that is with SET_MEM_TABLE.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-03-11 19:04:21 +01:00
Sebastien Boeuf
e3a8d6c13c virtio-devices: vhost-user: net: Fix seccomp filters
On x86_64 architecture, multiple syscalls were missing when shutting
down the vhost-user-net device along with the VM. This was causing the
usual crash related to seccomp filters.

This commit adds these missing syscalls to fix the issue.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-03-11 19:04:21 +01:00
Sebastien Boeuf
581bf4aad5 virtio-devices: vhost-user: Fix device reset
There is no need to get the vring base when resetting the vhost-user
device. This was mostly ignored, but in some cases, it was causing some
actual errors.

A reset must simply be a combination of disabling the vrings along with
the reset of the owner.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-03-11 19:04:21 +01:00
Jiachen Zhang
16b82f3dfe vhost-user: Make ActivateError messages more consistent
Originally, VhostUserSetup is only used by vhost-user-fs. While
vhost-user-blk and vhost-user-net have their own error messages,
we rename VhostUserSetup to VhostUserFsSetup.

Signed-off-by: Jiachen Zhang <zhangjiachen.jaycee@bytedance.com>
2021-03-11 15:09:07 +00:00
Sebastien Boeuf
61f9a4ec6c virtio-devices: mem: Accept handlers to update DMA mappings
Create two functions for registering/unregistering DMA mapping handlers,
each handler being associated with a VFIO device.

Whenever the plugged_size is modified (which means triggered by the
virtio-mem driver in the guest), the virtio-mem backend is responsible
for updating the DMA mappings related to every VFIO device through the
handler previously provided.

It's important to update the map when the handler is either registered
or unregistered as well, as we don't want to miss some plugged memory
that would have been added before the VFIO device is added to the VM.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-03-05 10:38:42 +01:00
Sebastien Boeuf
c27d6df233 vhost: Bump to latest version from upstream
Moving to the latest version of the rust-vmm/vhost crate, before it gets
published on crates.io.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-03-01 15:53:46 +01:00
Rob Bradford
f8875acec2 misc: Bulk upgrade dependencies
In particular update for the vmm-sys-util upgrade and all the other
dependent packages. This requires an updated forked version of
kvm-bindings (due to updated vfio-ioctls) but allowed the removal of our
forked version of kvm-ioctls.

The changes to the API from kvm-ioctls and vmm-sys-util required some
other minor changes to the code.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-02-26 11:31:08 +00:00
Sebastien Boeuf
fa8fcf5f4c vhost: Move to upstream crate
The vhost crate from rust-vmm is ready, which is why we do the switch
from the Cloud Hypervisor fork to the upstream crate.

At the same time, we rename the crate from vhost_rs to vhost.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-02-25 11:20:41 +01:00
Sebastien Boeuf
aee1155870 virtio-devices, vmm: Move to ExternalDmaMapping from vm-device
Now that ExternalDmaMapping is defined in vm-device, let's use it from
there.

This commit also defines the function get_host_address_range() to move
away from the vfio-ioctls dependency.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-02-24 08:02:37 +01:00
Sebastien Boeuf
4ed0e1a3c8 net_util: Simplify TX/RX queue handling
The main idea behind this commit is to remove all the complexity
associated with TX/RX handling for virtio-net. By using writev() and
readv() syscalls, we could get rid of intermediate buffers for both
queues.

The complexity regarding the TAP registration has been simplified as
well. The RX queue is only processed when some data are ready to be
read from TAP. The event related to the RX queue getting more
descriptors only serves the purpose to register the TAP file if it's not
already.

With all these simplifications, the code is more readable but more
performant as well. We can see an improvement of 10% for a single
queue device.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-02-22 10:39:23 +00:00
Rob Bradford
c89095ab85 virtio-devices: Report events for virtio device activation and reset
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-02-18 16:15:13 +00:00
Rob Bradford
9c5be6f660 build: Remove unnecessary Result<> returns
If the function can never return an error this is now a clippy failure:

error: this function's return value is unnecessarily wrapped by `Result`
   --> virtio-devices/src/watchdog.rs:215:5
    |
215 | /     fn set_state(&mut self, state: &WatchdogState) -> io::Result<()> {
216 | |         self.common.avail_features = state.avail_features;
217 | |         self.common.acked_features = state.acked_features;
218 | |         // When restoring enable the watchdog if it was previously enabled. We reset the timer
...   |
223 | |         Ok(())
224 | |     }
    | |_____^
    |
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_wraps

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-02-11 18:18:44 +00:00
Sebastien Boeuf
acfbee5b7a interrupt: Make notifier function return Option<EventFd>
In anticipation for supporting the notifier function for the legacy
interrupt source group, we need this function to return an EventFd
instead of a reference to this same EventFd.

The reason is we can't return a reference when there's an Arc<Mutex<>>
involved in the call chain.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-02-10 17:34:56 +00:00
Wei Liu
ddd0552c83 virtio-devices: vhost-user-net: unpark control queue thread in resume
This thread is virtio-net specific, so it is not handled in the common
virtio device code.

The non-vhost implementation resumes the thread itself. Do the same
thing for vhost-user-net.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2021-02-08 16:16:13 +00:00
Sebastien Boeuf
c397c9c95e vmm, virtio-devices: mem: Don't use MADV_DONTNEED on hugepages
This commit introduces a new information to the VirtioMemZone structure
in order to know if the memory zone is backed by hugepages.

Based on this new information, the virtio-mem device is now able to
determine if madvise(MADV_DONTNEED) should be performed or not. The
madvise documentation specifies that MADV_DONTNEED advice will fail if
the memory range has been allocated with some hugepages.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: Hui Zhu <teawater@antfin.com>
2021-02-04 17:52:30 +00:00
Sebastien Boeuf
54f814f34a virtio-devices: mem: Refactor MemEpollHandler
This commit performs some refactoring to make all functions a method
from a specific object, and in particular methods for MemEpollHandler.

The point is to simplify the code to make it more readable.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-02-04 17:52:30 +00:00
Sebastien Boeuf
ad8adcb955 virtio-devices: mem: Adjust based on specification
Adjust the code to comply better with the virtio-mem specification by
adding some validation for the virtio-mem configuration, but also by
updating the virtio-mem configuration itself.

Nowhere in the virtio-mem specification is stated the usable region size
must be adjusted everytime the plugged size changes. For simplification
reason, and without going against the specification, the usable region
size is now kept static, setting its value to the size of the whole
region.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-02-04 17:52:30 +00:00
Sebastien Boeuf
f24094392e virtio-devices: mem: Improve semantic around Resize object
By introducing a ResizeSender object, we avoid having a Resize clone
with a different content than the original Resize object.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-02-04 17:52:30 +00:00
Sebastien Boeuf
c6854c5a97 block_util: Simplify RAW synchronous implementation
Using directly preadv and pwritev, we can simply use a RawFd instead of
a file, and we don't need to use the more complex implementation from
the qcow crate.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-02-01 13:45:08 +00:00
Sebastien Boeuf
b2e5dbaecb block_util, vmm: Add fixed VHD asynchronous implementation
This commit adds the asynchronous support for fixed VHD disk files.

It introduces FixedVhd as a new ImageType, moving the image type
detection to the block_util crate (instead of qcow crate).

It creates a new vhd module in the block_util crate in order to handle
VHD footer, following the VHD specification.

It creates a new fixed_vhd_async module in the block_util crate to
implement the asynchronous version of fixed VHD disk file. It relies on
io_uring.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-02-01 13:45:08 +00:00
Bo Chen
6664e5a6e7 net_util, virtio-devices, vmm: Accept multiple TAP fds
This patch enables multi-queue support for creating virtio-net devices by
accepting multiple TAP fds, e.g. '--net fds=3:7'.

Fixes: #2164

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-01-28 09:11:39 +00:00
Rob Bradford
5db9b0ec99 net_util: Support supplying flags to open_tap() helper
This helper can open a TAP device and configure the interface on it. If
the device needs to be opened multiple times for MQ then it also handles
that correctly.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-01-28 09:11:39 +00:00
Sebastien Boeuf
2824642e80 virtio-devices: Rename BlockIoUring to Block
Now that BlockIoUring is the only implementation of virtio-block,
handling both synchronous and asynchronous backends based on the
AsyncIo trait, we can rename it to Block.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-01-22 16:10:34 +00:00
Sebastien Boeuf
41cfdb50cd virtio-devices: Remove virtio-block synchronous implementation
Now that both synchronous and asynchronous backends rely on the
asynchronous version of virtio-block (namely BlockIoUring), we can
get rid of the synchronous version (namely Block).

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-01-22 16:10:34 +00:00
Sebastien Boeuf
12e20effd7 block_util: Port synchronous QCOW file to AsyncIo trait
Based on the synchronous QCOW file implementation present in the qcow
crate, we created a new qcow_sync module in block_util that ports this
synchronous implementation to the AsyncIo trait.

The point is to reuse virtio-blk asynchronous implementation for both
synchronous and asynchronous backends.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-01-22 16:10:34 +00:00
Sebastien Boeuf
9fc86a91e2 block_util: Port synchronous RAW file to AsyncIo trait
Based on the synchronous RAW file implementation present in the qcow
crate, we created a new raw_sync module in block_util that ports this
synchronous implementation to the AsyncIo trait.

The point is to reuse virtio-blk asynchronous implementation for both
synchronous and asynchronous backends.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-01-22 16:10:34 +00:00
Sebastien Boeuf
da8ce25abf virtio-devices: Use asynchronous traits for virtio-blk io_uring
Based on the new DiskFile and AsyncIo traits, the implementation of
asynchronous block support does not have to be tied to io_uring anymore.
Instead, the only thing the virtio-blk implementation knows is that it
is using an asynchronous implementation of the underlying disk file.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-01-22 16:10:34 +00:00
Rob Bradford
c90f77e399 virtio-devices: Enforce a minimum number of queues
Even though the driver can provide fewer queues than those advertised
for some device types their is a minimum number that is required for
operation.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-01-20 18:54:36 +01:00
Rob Bradford
a105089702 virtio-devices: Support driver programming fewer queues
It is permissable for the driver to program fewer queues than offered by
the device. Filter the queues so that only the ready ones are included
and check that they have valid addresses configured.

Fixes: #2136

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-01-20 18:54:36 +01:00
Rob Bradford
c366efc19e virtio-devices: block, block_io_uring: Don't assume max queue count
Don't assume that the number of queues provided match the number of
queues offered. The virtio spec allows the driver to program fewer
queues.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-01-20 18:54:36 +01:00
Rob Bradford
23f9ec50fb virtio-devices: Simplify virtio device reset
Rather than having to give and return the ioeventfd used for a device
clone them each time. This will make it simpler when we start handling
the driver enabling fewer queues than advertised by the device.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-01-18 15:05:54 +00:00
Rob Bradford
22649c4a87 virtio-devices: Upon reset reap/join the device threads
We have killed the device thread by writing to the exit EventFd but we
should wait for them to quit to ensure consistency.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-01-14 17:25:14 +01:00
Rob Bradford
23afe89089 virtio-devices: Derive thread names from device ids
In order to make the thread naming more useful derive their name from
the device id (which can be supplied by the user) and a device specific
suffix that has details of the individual queue (or queue pair.)

e.g.

rob@artemis:~$ pstree -p -c -l -t `pidof cloud-hypervisor`
cloud-hyperviso(27501)─┬─{_console}(27525)
                       ├─{_disk0_q0}(27529)
                       ├─{_disk0_q1}(27532)
                       ├─{_net1_ctrl}(27533)
                       ├─{_net1_qp0}(27534)
                       ├─{_net1_qp1}(27535)
                       ├─{_rng}(27526)
                       ├─{http-server}(27504)
                       ├─{seccomp_signal_}(27502)
                       ├─{signal_handler}(27523)
                       ├─{vcpu0}(27520)
                       ├─{vcpu1}(27522)
                       └─{vmm}(27503)

Fixes: #2077

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-01-13 16:56:44 +01:00
Rob Bradford
bdfe6a69ef virtio-devices: iommu: Update to latest version of IOMMU spec
This is order to support the latest patches from
https://jpbrucker.net/git/linux/commit/?h=virtio-iommu/devel&id=ddc3e6ce3f534af827a9c8f91e12a7082de8ec61

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-01-11 12:56:05 +01:00
Rob Bradford
e0d79196c8 virtio-devices, vmm: Enhance debugging around virtio device activation
Sometimes when running under the CI tests fail due to a barrier not
being released and the guest blocks on an MMIO write. Add further
debugging to try and identify the issue.

See: #2118

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-01-08 14:06:44 +00:00
Sebastien Boeuf
f70852c04b virtio-devices: Update seccomp filters for virtio-net thread
On aarch64, the openat() syscall was missing from the seccomp filters
list, preventing the test_watchdog from running properly.

Fixes #2103

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-01-08 12:37:32 +00:00
Rob Bradford
315a730128 virtio-devices: net: Reduce debug level of EVENT_IDX messages
This logging is too spammy for info!() level and should be handled as
debug!() level

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-01-06 13:51:26 +01:00
Rob Bradford
fabd63072b misc: Remove unnecessary literal casts
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-01-04 13:46:37 +01:00