4341 Commits

Author SHA1 Message Date
Rob Bradford
e9d67dc405 vmm: pci: Move creation of vfio_user::Client to DeviceManager
By moving this from the VfioUserPciDevice to DeviceManager the client
can be reused for handling DMA mapping behind an IOMMU.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-21 15:42:49 +01:00
Rob Bradford
fd4f32fa69 virtio-mem: Support multiple mappings
For vfio-user the mapping handler is per device and needs to be removed
when the device in unplugged.

For VFIO the mapping handler is for the default VFIO container (used
when no vIOMMU is used - using a vIOMMU does not require mappings with
virtio-mem)

To represent these two use cases use an enum for the handlers that are
stored.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-21 15:42:49 +01:00
Sebastien Boeuf
6fb88c3c5a virtio-devices: balloon: Add snapshot/restore support
Adding the snapshot/restore support along with migration as well,
allowing a VM with a virtio-balloon device attached to be properly
migrated.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-21 14:47:17 +02:00
Bo Chen
d6c08d902b ci: Use the pre-installed virtiofsd
Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-09-20 16:47:28 +01:00
Wei Liu
86afa38c64 hypervisor: mshv: drop one unsafe in code
The binding already provides a default() method which does the same
thing.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2021-09-20 17:22:31 +02:00
Bo Chen
ae68c802bd Dockerfile: Build and install virtiofsd to the docker image
Given the 'virtiofsd' executable is used in multiple CI workers,
installing them directly to the docker image is more efficient and can
save CI time.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-09-20 13:15:52 +01:00
dependabot[bot]
ccced2ebf4 build: bump arc-swap from 1.3.2 to 1.4.0 in /fuzz
Bumps [arc-swap](https://github.com/vorner/arc-swap) from 1.3.2 to 1.4.0.
- [Release notes](https://github.com/vorner/arc-swap/releases)
- [Changelog](https://github.com/vorner/arc-swap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/vorner/arc-swap/compare/v1.3.2...v1.4.0)

---
updated-dependencies:
- dependency-name: arc-swap
  dependency-type: indirect
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-19 17:41:20 +00:00
dependabot[bot]
d826b4fbdc build: bump arc-swap from 1.3.2 to 1.4.0
Bumps [arc-swap](https://github.com/vorner/arc-swap) from 1.3.2 to 1.4.0.
- [Release notes](https://github.com/vorner/arc-swap/releases)
- [Changelog](https://github.com/vorner/arc-swap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/vorner/arc-swap/compare/v1.3.2...v1.4.0)

---
updated-dependencies:
- dependency-name: arc-swap
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-19 17:12:50 +00:00
Rob Bradford
0faa7afac2 vmm: Add fast path for PCI config IO port
Looking up devices on the port I/O bus is time consuming during the
boot at there is an O(lg n) tree lookup and the overhead from taking a
lock on the bus contents.

Avoid this by adding a fast path uses the hardcoded port address and
size and directs PCI config requests directly to the device.

Command line:
target/release/cloud-hypervisor --kernel ~/src/linux/vmlinux --cmdline "root=/dev/vda1 console=ttyS0" --serial tty --console off --disk path=~/workloads/focal-server-cloudimg-amd64-custom-20210609-0.raw --api-socket /tmp/api

PIO exit: 17913
PCI fast path: 17871
Percentage on fast path: 99.8%

perf before:

marvin:~/src/cloud-hypervisor (main *)$ perf report -g | grep resolve
     6.20%     6.20%  vcpu0            cloud-hypervisor    [.] vm_device:🚌:Bus::resolve

perf after:

marvin:~/src/cloud-hypervisor (2021-09-17-ioapic-fast-path *)$ perf report -g | grep resolve
     0.08%     0.08%  vcpu0            cloud-hypervisor    [.] vm_device:🚌:Bus::resolve

The compromise required to implement this fast path is bringing the
creation of the PciConfigIo device into the DeviceManager::new() so that
it can be used in the VmmOps struct which is created before
DeviceManager::create_devices() is called.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-17 17:09:45 +01:00
Michael Zhao
da8eecc797 docs: Describe about virtio-iommu with FDT
Added a section in "Usage" chapter of "iommu.md" to introduce the
special behavior when virtio-iommu is working with FDT on AArch64.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-09-17 12:19:46 +02:00
Michael Zhao
9dc9e224b9 tests: Enable IOMMU test case on AArch64
For AArch64, now virtual IOMMU is only tested on FDT, not ACPI.
In the case of FDT, the behavior of IOMMU is a bit different with ACPI.
All the devices on the PCI bus will be attached to the virtual IOMMU,
except the virtio-iommu device itself. So these devices will all be
added to IOMMU groups, and appear in folder '/sys/kernel/iommu_groups/'.

The result is, on AArch64 IOMMU group '0' contains "0000:00:01.0" which
is the console device. But on X86, console device is not attached to
IOMMU. So the IOMMU group '0' contains "0000:00:02.0" which is the first
disk.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-09-17 12:19:46 +02:00
Michael Zhao
b3fa56544c virtio-devices: iommu: Support AArch64
The MSI IOVA address on X86 and AArch64 is different.

This commit refactored the code to receive the MSI IOVA address and size
from device_manager, which provides the actual IOVA space data for both
architectures.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-09-17 12:19:46 +02:00
Michael Zhao
b30ddc0837 aarch64: Refactor AArch64 GIC space definitions
Move the definition of MSI space to layout.rs, so other crates can
reference it. Now it is needed by virtio-iommu.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-09-17 12:19:46 +02:00
Michael Zhao
253c06d3ba arch/aarch64: Add virtio-iommu device in FDT
Add a virtio-iommu node into FDT if iommu option is turned on. Now we
support only one virtio-iommu device.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-09-17 12:19:46 +02:00
William Douglas
46f6d9597d vmm: Switch to using the serial_manager for serial input
This change switches from handling serial input in the VMM thread to
its own thread controlled by the SerialManager.

The motivation for this change is to avoid the VMM thread being unable
to process events while serial input is happening and vice versa.

The change also makes future work flushing the serial buffer on PTY
connections easier.

Signed-off-by: William Douglas <william.douglas@intel.com>
2021-09-17 11:15:35 +01:00
William Douglas
7b4f56e372 vmm: Add new serial_manager for serial input handling
This change adds a SerialManager with its own epoll handling that
should be created and run by the DeviceManager when creating an
appropriately configured console (serial tty or pty).

Both stdin and pty input are handled by the SerialManager. The stdin
and pty specific methods used by the VMM should be removed in a future
commit.

Signed-off-by: William Douglas <william.douglas@intel.com>
2021-09-17 11:15:35 +01:00
William Douglas
d6a2f48b32 vmm: device_manager: Make PtyPair implement Clone
The clone method for PtyPair should have been an impl of the Clone
trait but the method ended up not being used. Future work will make
use of the trait however so correct the missing trait implementation.

Signed-off-by: William Douglas <william.douglas@intel.com>
2021-09-17 11:15:35 +01:00
Hui Zhu
c313bcbfd4 net_util: Change libc::getrandom to getrandom::getrandom
libc::getrandom need to be called inside unsafe and it is not
cross-platform friendly.
Change it to getrandom::getrandom that is safe and cross-platform
friendly.

Signed-off-by: Hui Zhu <teawater@antfin.com>
2021-09-17 09:24:10 +02:00
dependabot[bot]
c741c01cc0 build: bump unicode-width from 0.1.8 to 0.1.9 in /fuzz
Bumps [unicode-width](https://github.com/unicode-rs/unicode-width) from 0.1.8 to 0.1.9.
- [Release notes](https://github.com/unicode-rs/unicode-width/releases)
- [Commits](https://github.com/unicode-rs/unicode-width/compare/v0.1.8...v0.1.9)

---
updated-dependencies:
- dependency-name: unicode-width
  dependency-type: indirect
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-17 00:29:15 +00:00
dependabot[bot]
0aba41b3ee build: bump unicode-width from 0.1.8 to 0.1.9
Bumps [unicode-width](https://github.com/unicode-rs/unicode-width) from 0.1.8 to 0.1.9.
- [Release notes](https://github.com/unicode-rs/unicode-width/releases)
- [Commits](https://github.com/unicode-rs/unicode-width/compare/v0.1.8...v0.1.9)

---
updated-dependencies:
- dependency-name: unicode-width
  dependency-type: indirect
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-16 23:38:40 +00:00
dependabot[bot]
f67b3f79ea build: bump vmm-sys-util from 0.8.0 to 0.9.0
Bumps [vmm-sys-util](https://github.com/rust-vmm/vmm-sys-util) from 0.8.0 to 0.9.0.
- [Release notes](https://github.com/rust-vmm/vmm-sys-util/releases)
- [Changelog](https://github.com/rust-vmm/vmm-sys-util/blob/main/CHANGELOG.md)
- [Commits](https://github.com/rust-vmm/vmm-sys-util/compare/v0.8.0...v0.9.0)

---
updated-dependencies:
- dependency-name: vmm-sys-util
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

This needed a bunch of manual updates as well, including vfio-ioctls and
vhost crates. The vhost crate is being patched with the latest version
from rust-vmm because the version 0.1.0 on crates.io doesn't include the
patches we need yet.

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-16 14:01:19 +01:00
Sebastien Boeuf
02173f42fc deps: Update kvm-ioctls to 0.10.0
Updating kvm-ioctls from 0.9.0 to 0.10.0 now that Cloud Hypervisor
relies on kvm-bindings 0.5.0.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-16 12:53:05 +01:00
dependabot[bot]
8ea0c542fd build: bump dirs from 3.0.2 to 4.0.0
Bumps [dirs](https://github.com/soc/dirs-rs) from 3.0.2 to 4.0.0.
- [Release notes](https://github.com/soc/dirs-rs/releases)
- [Commits](https://github.com/soc/dirs-rs/commits)

---
updated-dependencies:
- dependency-name: dirs
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-15 23:39:31 +00:00
dependabot[bot]
c1e896dddb build: bump libc from 0.2.101 to 0.2.102
Bumps [libc](https://github.com/rust-lang/libc) from 0.2.101 to 0.2.102.
- [Release notes](https://github.com/rust-lang/libc/releases)
- [Commits](https://github.com/rust-lang/libc/compare/0.2.101...0.2.102)

---
updated-dependencies:
- dependency-name: libc
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-15 17:23:46 +00:00
Sebastien Boeuf
a6040d7a30 vmm: Create a single VFIO container
For most use cases, there is no need to create multiple VFIO containers
as it causes unwanted behaviors. Especially when passing multiple
devices from the same IOMMU group, we need to use the same container so
that it can properly list the groups that have been already opened. The
correct logic was already there in vfio-ioctls, but it was incorrectly
used from our VMM implementation.

For the special case where we put a VFIO device behind a vIOMMU, we must
create one container per device, as we need to control the DMA mappings
per device, which is performed at the container level. Because we must
keep one container per device, the vIOMMU use case prevents multiple
devices attached to the same IOMMU group to be passed through the VM.
But this is a limitation that we are fine with, especially since the
vIOMMU doesn't let us group multiple devices in the same group from a
guest perspective.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-15 09:08:13 -07:00
dependabot[bot]
a5d77cdee9 build: bump libc from 0.2.101 to 0.2.102 in /fuzz
Bumps [libc](https://github.com/rust-lang/libc) from 0.2.101 to 0.2.102.
- [Release notes](https://github.com/rust-lang/libc/releases)
- [Commits](https://github.com/rust-lang/libc/compare/0.2.101...0.2.102)

---
updated-dependencies:
- dependency-name: libc
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-15 16:06:46 +00:00
Rob Bradford
34f220edcd main: Don't panic() if blocking signals fails
This allows Cloud Hypervisor to be run under `perf` as some of the
signals will already be blocked in the child process.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-15 16:20:28 +01:00
Sebastien Boeuf
bcdac10149 deps: Bump kvm-bindings to v0.5.0
Update the kvm-bindings dependency so that Cloud Hypervisor now depends
on the version 0.5.0, which is based on Linux kernel v5.13.0. We still
have to rely on a forked version to be able to serialize all the KVM
structures we need.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-15 16:20:17 +01:00
Rob Bradford
ccda1a004e devices: cmos: Increase robustness of CMOS device
Check sizes of data reads/writes to avoid panics.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-15 09:40:20 +02:00
Rob Bradford
43f0dd6d25 devices: ioapic: Increase robustness of IOAPIC operations
Validate the size of I/O reads and check that no request is made to an
out of bounds index (which would cause a panic.)

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-15 09:40:20 +02:00
Rob Bradford
a77160dca4 devices: acpi: Increase robustness of bus devices
Check the size of data buffer for reading on the ApciPmTimer device to
avoid a potential panic if the guest uses non-DWORD access.

Simplify the zeroring of the buffer for AcpiShutdownDevice.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-15 09:40:20 +02:00
dependabot[bot]
1d78812b63 build: bump serde_json from 1.0.67 to 1.0.68 in /fuzz
Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.67 to 1.0.68.
- [Release notes](https://github.com/serde-rs/json/releases)
- [Commits](https://github.com/serde-rs/json/compare/v1.0.67...v1.0.68)

---
updated-dependencies:
- dependency-name: serde_json
  dependency-type: indirect
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-15 00:06:30 +00:00
dependabot[bot]
8836715c2d build: bump serde_json from 1.0.67 to 1.0.68
Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.67 to 1.0.68.
- [Release notes](https://github.com/serde-rs/json/releases)
- [Commits](https://github.com/serde-rs/json/compare/v1.0.67...v1.0.68)

---
updated-dependencies:
- dependency-name: serde_json
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-15 00:06:23 +00:00
Alyssa Ross
330b5ea3be vmm: notify virtio-console of pty resizes
When a pty is resized (using the TIOCSWINSZ ioctl -- see ioctl_tty(2)),
the kernel will send a SIGWINCH signal to the pty's foreground process
group to notify it of the resize.  This is the only way to be notified
by the kernel of a pty resize.

We can't just make the cloud-hypervisor process's process group the
foreground process group though, because a process can only set the
foreground process group of its controlling terminal, and
cloud-hypervisor's controlling terminal will often be the terminal the
user is running it in.  To work around this, we fork a subprocess in a
new process group, and set its process group to be the foreground
process group of the pty.  The subprocess additionally must be running
in a new session so that it can have a different controlling
terminal.  This subprocess writes a byte to a pipe every time the pty
is resized, and the virtio-console device can listen for this in its
epoll loop.

Alternatives I considered were to have the subprocess just send
SIGWINCH to its parent, and to use an eventfd instead of a pipe.
I decided against the signal approach because re-purposing a signal
that has a very specific meaning (even if this use was only slightly
different to its normal meaning) felt unclean, and because it would
have required using pidfds to avoid race conditions if
cloud-hypervisor had terminated, which added complexity.  I decided
against using an eventfd because using a pipe instead allows the child
to be notified (via poll(2)) when nothing is reading from the pipe any
more, meaning it can be reliably notified of parent death and
terminate itself immediately.

I used clone3(2) instead of fork(2) because without
CLONE_CLEAR_SIGHAND the subprocess would inherit signal-hook's signal
handlers, and there's no other straightforward way to restore all signal
handlers to their defaults in the child process.  The only way to do
it would be to iterate through all possible signals, or maintain a
global list of monitored signals ourselves (vmm:vm::HANDLED_SIGNALS is
insufficient because it doesn't take into account e.g. the SIGSYS
signal handler that catches seccomp violations).

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2021-09-14 15:43:25 +01:00
Alyssa Ross
98bfd1e988 virtio-devices: get tty size from the right tty
Previously, we were always getting the size from stdin, even when the
console was hooked up to a pty.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2021-09-14 15:43:25 +01:00
Alyssa Ross
28382a1491 virtio-devices: determine tty size in console
This prepares us to be able to handle console resizes in the console
device's epoll loop, which we'll have to do if the output is a pty,
since we won't get SIGWINCH from it.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2021-09-14 15:43:25 +01:00
dependabot[bot]
68e6a14deb build: bump anyhow from 1.0.43 to 1.0.44 in /fuzz
Bumps [anyhow](https://github.com/dtolnay/anyhow) from 1.0.43 to 1.0.44.
- [Release notes](https://github.com/dtolnay/anyhow/releases)
- [Commits](https://github.com/dtolnay/anyhow/compare/1.0.43...1.0.44)

---
updated-dependencies:
- dependency-name: anyhow
  dependency-type: indirect
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-14 00:54:59 +00:00
dependabot[bot]
f3778a7fc7 build: bump anyhow from 1.0.43 to 1.0.44
Bumps [anyhow](https://github.com/dtolnay/anyhow) from 1.0.43 to 1.0.44.
- [Release notes](https://github.com/dtolnay/anyhow/releases)
- [Commits](https://github.com/dtolnay/anyhow/compare/1.0.43...1.0.44)

---
updated-dependencies:
- dependency-name: anyhow
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-14 00:22:00 +00:00
dependabot[bot]
c6a110dd28 build: bump mshv-bindings from b01bbf8 to 0b58354
Bumps [mshv-bindings](https://github.com/rust-vmm/mshv) from `b01bbf8` to `0b58354`.
- [Release notes](https://github.com/rust-vmm/mshv/releases)
- [Commits](b01bbf8f6e...0b5835475c)

---
updated-dependencies:
- dependency-name: mshv-bindings
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-13 23:36:12 +00:00
dependabot[bot]
b0c7df2e59 build: bump vfio-ioctls from a8ee64b to 06be730
Bumps [vfio-ioctls](https://github.com/rust-vmm/vfio-ioctls) from `a8ee64b` to `06be730`.
- [Release notes](https://github.com/rust-vmm/vfio-ioctls/releases)
- [Commits](a8ee64b978...06be730ff1)

---
updated-dependencies:
- dependency-name: vfio-ioctls
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-11 17:43:43 +00:00
dependabot[bot]
1067059c4a build: bump mshv-ioctls from 0d6e4e8 to b01bbf8
Bumps [mshv-ioctls](https://github.com/rust-vmm/mshv) from `0d6e4e8` to `b01bbf8`.
- [Release notes](https://github.com/rust-vmm/mshv/releases)
- [Commits](0d6e4e82b9...b01bbf8f6e)

---
updated-dependencies:
- dependency-name: mshv-ioctls
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-11 17:14:05 +00:00
Alyssa Ross
8abe8c679b seccomp: allow mmap everywhere brk is allowed
Musl often uses mmap to allocate memory where Glibc would use brk.
This has caused seccomp violations for me on the API and signal
handling threads.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2021-09-10 12:01:31 -07:00
Rob Bradford
b6b686c71c vmm: Shutdown VMM if API thread panics
See: #3031

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-10 10:52:08 -07:00
Rob Bradford
171d12943d vmm: memory_manager: Increase robustness of MemoryManager control device
See: #1289

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-10 10:23:19 -07:00
Rob Bradford
bdc44cd8bc vmm: cpu: Increase robustness of CpuManager control device
See: #1289

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-10 10:22:05 -07:00
Rob Bradford
33a55bac0f virtio-devices: seccomp: Split out common seccomp rules
As well as reducing the amount of code this also improves the binary
size slightly:

cargo bloat --release -n 2000 --bin cloud-hypervisor | grep virtio_devices::seccomp_filters::get_seccomp_rules

Before:
 0.1%   0.2%   7.8KiB       virtio_devices virtio_devices::seccomp_filters::get_seccomp_rules
After:
 0.0%   0.1%   3.0KiB       virtio_devices virtio_devices::seccomp_filters::get_seccomp_rules

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-10 10:11:12 -07:00
Rob Bradford
82ace6e327 build: Update version of toolchain in container
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-10 10:10:11 -07:00
Bo Chen
2e56f0df77 ci: Rustify ovs-dpdk setup and cleanup
Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-09-10 07:41:15 +01:00
Bo Chen
a181b77bc8 ci: Add integration test for live migration with OVS-DPDK
Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-09-10 07:41:15 +01:00
Bo Chen
9023412e31 tests: Refactor test_ovs_dpdk
This patch adds a separate function to launch two guest VMs and ensure
they are connected through ovs-dpdk, so that we can reuse this function
in other tests, e.g. the test for live-migration with ovs-dpdk.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-09-10 07:41:15 +01:00