4676 Commits

Author SHA1 Message Date
Sebastien Boeuf
86f86c5348 vmm: Optimize migration for virtio-mem
Copy only the memory ranges that have been plugged through virtio-mem,
allowing for an interesting optimization regarding the time it takes to
migrate a large virtio-mem device. Even if the hotpluggable space is
very large (say 64GiB), if only 1GiB has been previously added to the
VM, only 1GiB will be sent to the destination VM, avoiding the transfer
of the remaining 63GiB which are unused.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-28 10:15:22 -07:00
Sebastien Boeuf
0fb24ea3ae virtio-devices: mem: Discard unplugged ranges only on activate()
In order to correctly support the snapshot/restore and migration use
cases, we must be careful with the ranges that we discard by punching
holes. On restore, there might be some ranges already plugged in,
meaning they should not be discarded. That's why we loop over the list
of blocks to discard only the ranges that are marked as unplugged.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-28 10:15:22 -07:00
Sebastien Boeuf
e390775bcb vmm, virtio-devices: Move BlocksState creation to the MemoryManager
By creating the BlocksState object in the MemoryManager, we can directly
provide it to the virtio-mem device when being created. This will allow
the MemoryManager through each VirtioMemZone to have a handle onto the
blocks that are plugged at any point in time.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-28 10:15:22 -07:00
Sebastien Boeuf
4450c44fbc virtio-devices: mem: Create a MemoryRangeTable from BlocksState
This is going to be useful to let virtio-mem report the list of ranges
that are currently plugged, so that both snapshot/restore and migration
will copy only what is needed.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-28 10:15:22 -07:00
Sebastien Boeuf
a1caa6549a vmm: Add page size as a parameter for MemoryRangeTable::from_bitmap()
This will be helpful to support the creation of a MemoryRangeTable from
virtio-mem, as it uses 2M pages.
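
To illustrate why the page size has to be a parameter, here is a minimal sketch of turning a bitmap into contiguous ranges. It is not the actual `MemoryRangeTable::from_bitmap()` code: the `MemoryRange` struct, the `ranges_from_bitmap` name and signature, and the bit order (least significant bit first in each `u64` word) are all assumptions made for this example.

```rust
/// A contiguous guest-physical range (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct MemoryRange {
    gpa: u64,
    length: u64,
}

/// Convert a bitmap with one bit per page into contiguous ranges,
/// coalescing adjacent set bits. `page_size` is the number of bytes each
/// bit stands for (4 KiB for regular pages, 2 MiB for virtio-mem blocks).
fn ranges_from_bitmap(bitmap: &[u64], start_addr: u64, page_size: u64) -> Vec<MemoryRange> {
    let mut ranges = Vec::new();
    let mut run_start: Option<u64> = None;
    let total_pages = bitmap.len() as u64 * 64;

    // Iterate one step past the end so that a run reaching the last bit
    // is closed by the sentinel iteration.
    for page in 0..=total_pages {
        let set = page < total_pages
            && (bitmap[(page / 64) as usize] >> (page % 64)) & 1 == 1;
        match (set, run_start) {
            (true, None) => run_start = Some(page),
            (false, Some(first)) => {
                ranges.push(MemoryRange {
                    gpa: start_addr + first * page_size,
                    length: (page - first) * page_size,
                });
                run_start = None;
            }
            _ => {}
        }
    }
    ranges
}

fn main() {
    const BLOCK_SIZE: u64 = 2 << 20; // 2 MiB
    // Blocks 0, 1 and 3 are plugged; block 2 is not.
    for r in ranges_from_bitmap(&[0b1011], 0x1_0000_0000, BLOCK_SIZE) {
        println!("gpa={:#x} length={:#x}", r.gpa, r.length);
    }
}
```

The same bitmap yields very different byte ranges depending on `page_size`, which is why a caller tracking 2M blocks cannot reuse a helper that hardcodes 4 KiB pages.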

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-28 10:15:22 -07:00
Sebastien Boeuf
af3a59aa33 virtio-devices: mem: Add constructor for BlocksState
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-28 10:15:22 -07:00
Sebastien Boeuf
d7115ec656 virtio-devices: mem: Add snapshot/restore support
Adding the snapshot/restore support along with migration as well,
allowing a VM with virtio-mem devices attached to be properly
migrated.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-28 10:15:22 -07:00
Sebastien Boeuf
7bbcc0f849 vmm: memory_manager: Make sure the hotplugged_size is up to date
The amount of memory plugged in the virtio-mem region should always be
kept up to date in the hotplugged_size field from VirtioMemZone.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-28 10:15:22 -07:00
Sebastien Boeuf
c4dc7a583d vmm: memory_manager: Simplify the MemoryManager structure
There's no need to duplicate the GuestMemory for snapshot purpose, as we
always have a handle onto the GuestMemory through the guest_memory
field.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-28 10:15:22 -07:00
Sebastien Boeuf
74485924b1 vmm: memory_manager: Simplification to avoid unnecessary locking
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-28 10:15:22 -07:00
Rob Bradford
4889999277 vmm: Only advertise a single PCI bus
Since we only support a single PCI bus right now, advertise only a single
bus in the ACPI tables. This substantially reduces the number of VM exits
from probing.

Number of PCI config I/O port exits: 17871 -> 1551 (91% reduction) with
direct kernel boot.
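
The quoted percentage can be checked with a few lines of arithmetic. The helper name below is made up for this example; only the two exit counts come from the message above.

```rust
/// Percentage reduction when going from `before` to `after`.
fn reduction_percent(before: u64, after: u64) -> f64 {
    (before as f64 - after as f64) / before as f64 * 100.0
}

fn main() {
    let pct = reduction_percent(17871, 1551);
    println!("{:.1}% reduction", pct);
}
```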

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-28 14:10:10 +02:00
dependabot[bot]
eda0dc20d3 build: bump libc from 0.2.102 to 0.2.103
Bumps [libc](https://github.com/rust-lang/libc) from 0.2.102 to 0.2.103.
- [Release notes](https://github.com/rust-lang/libc/releases)
- [Commits](https://github.com/rust-lang/libc/compare/0.2.102...0.2.103)

---
updated-dependencies:
- dependency-name: libc
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-28 10:45:35 +00:00
dependabot[bot]
1e9c84af0f build: bump vm-fdt from ddb3fad to 57796cd
Bumps [vm-fdt](https://github.com/rust-vmm/vm-fdt) from `ddb3fad` to `57796cd`.
- [Release notes](https://github.com/rust-vmm/vm-fdt/releases)
- [Commits](ddb3fad524...57796cde6a)

---
updated-dependencies:
- dependency-name: vm-fdt
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-28 10:37:00 +00:00
Rob Bradford
b50519651c vmm: Simplify slot eject code in PCI ACPI device code
Use a simpler method for extracting the affected slot on the eject
command. Also update the terminology to reflect that this is a slot rather
than a bdf (which is what device id refers to elsewhere).
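
As a rough illustration of what extracting a slot can look like, the sketch below assumes the eject register is written as a 32-bit bitmap with bit N standing for slot N. That layout, and the `slots_to_eject` helper, are assumptions for this example and are not taken from the commit itself.

```rust
/// Decode the slot numbers named in an eject bitmap.
/// Assumed layout: bit N set means "eject the device in slot N".
fn slots_to_eject(mut bitmap: u32) -> Vec<u8> {
    let mut slots = Vec::new();
    while bitmap != 0 {
        // The index of the lowest set bit is the slot number.
        slots.push(bitmap.trailing_zeros() as u8);
        // Clear the lowest set bit and continue with the rest.
        bitmap &= bitmap - 1;
    }
    slots
}

fn main() {
    println!("{:?}", slots_to_eject(0b1000));
}
```

Under this assumption the value is a slot index on a single bus, not a full bus/device/function triple, which matches the terminology change described above.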

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-28 12:03:23 +02:00
dependabot[bot]
08d859a7dd build: bump mshv-ioctls from 0b58354 to 74e46d4
Bumps [mshv-ioctls](https://github.com/rust-vmm/mshv) from `0b58354` to `74e46d4`.
- [Release notes](https://github.com/rust-vmm/mshv/releases)
- [Commits](0b5835475c...74e46d4eac)

---
updated-dependencies:
- dependency-name: mshv-ioctls
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-28 09:25:09 +00:00
dependabot[bot]
0dad7d9331 build: bump pkg-config from 0.3.19 to 0.3.20
Bumps [pkg-config](https://github.com/rust-lang/pkg-config-rs) from 0.3.19 to 0.3.20.
- [Release notes](https://github.com/rust-lang/pkg-config-rs/releases)
- [Changelog](https://github.com/rust-lang/pkg-config-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/pkg-config-rs/compare/0.3.19...0.3.20)

---
updated-dependencies:
- dependency-name: pkg-config
  dependency-type: indirect
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-28 09:03:04 +00:00
dependabot[bot]
a496cecef1 build: bump libc from 0.2.102 to 0.2.103 in /fuzz
Bumps [libc](https://github.com/rust-lang/libc) from 0.2.102 to 0.2.103.
- [Release notes](https://github.com/rust-lang/libc/releases)
- [Commits](https://github.com/rust-lang/libc/compare/0.2.102...0.2.103)

---
updated-dependencies:
- dependency-name: libc
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-28 09:01:16 +00:00
William Douglas
a8f063db7c vmm: Refactor serial buffer to allow flush on PTY when writable
Refactor the serial buffer handling in order to write the serial
buffer's output to a PTY connected after the serial device stops being
written to by the guest.

This change moves the serial buffer initialization inside the serial
manager. That is done to allow the serial buffer to be made aware of
the PTY and epoll fds needed in order to modify the
EpollDispatch::File trigger. These are then used by the serial buffer
to trigger an epoll event when the PTY fd is writable and the buffer
has content in it. They are also used to remove the trigger when the
buffer is emptied in order to avoid unnecessary wake-ups.
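
The arm-on-pending, disarm-on-empty behaviour described above can be sketched as follows. This is a simplified model and not the code from the commit: `MockPty` stands in for the PTY fd, and the boolean `write_trigger_armed` stands in for adding or removing the epoll trigger. All names here are hypothetical.

```rust
use std::collections::VecDeque;
use std::io::{self, Write};

/// Stand-in for a PTY fd: refuses writes until `writable` is set.
struct MockPty {
    writable: bool,
    written: Vec<u8>,
}

impl Write for MockPty {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        if !self.writable {
            return Err(io::ErrorKind::WouldBlock.into());
        }
        self.written.extend_from_slice(buf);
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

struct SerialBuffer<W: Write> {
    out: W,
    pending: VecDeque<u8>,
    /// Models whether the "fd is writable" trigger is currently registered.
    write_trigger_armed: bool,
}

impl<W: Write> SerialBuffer<W> {
    fn new(out: W) -> Self {
        SerialBuffer { out, pending: VecDeque::new(), write_trigger_armed: false }
    }

    /// Called when the guest writes to the serial device.
    fn push(&mut self, data: &[u8]) -> io::Result<()> {
        self.pending.extend(data);
        self.flush_pending()
    }

    /// Called from `push` and again when the output becomes writable.
    fn flush_pending(&mut self) -> io::Result<()> {
        while !self.pending.is_empty() {
            let chunk = self.pending.make_contiguous();
            match self.out.write(chunk) {
                Ok(0) => break,
                Ok(n) => {
                    self.pending.drain(..n);
                }
                Err(e) if e.kind() == io::ErrorKind::WouldBlock => break,
                Err(e) => return Err(e),
            }
        }
        // Keep the trigger only while there is something left to flush,
        // so an idle buffer causes no wake-ups.
        self.write_trigger_armed = !self.pending.is_empty();
        Ok(())
    }
}

fn main() -> io::Result<()> {
    let mut buf = SerialBuffer::new(MockPty { writable: false, written: Vec::new() });
    buf.push(b"boot log\n")?;
    println!("armed while blocked: {}", buf.write_trigger_armed);
    buf.out.writable = true; // a reader connected to the PTY
    buf.flush_pending()?;
    println!("armed after flush: {}", buf.write_trigger_armed);
    Ok(())
}
```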

Signed-off-by: William Douglas <william.douglas@intel.com>
2021-09-27 14:18:21 +01:00
William Douglas
0066ddefe1 devices: Add utility functions for the serial output buffer
In preparation for reorganizing how the serial output is constructed
add methods to the serial devices for setting the out buffer after the
device is created.

Also add a method to enable flushing the output buffer to be used to
write the buffer to the PTY fd once the PTY is writable.

Signed-off-by: William Douglas <william.douglas@intel.com>
2021-09-27 14:18:21 +01:00
Michael Zhao
f9dd0aaf8a scripts: Optimize EDK2 building on AArch64
In the integration tests, we fetch the latest EDK2 code from its master
branch and build it. However, EDK2 master is updated frequently and the
build is time consuming, so this costs a lot of time in CI and in local
testing. Floating on top of a busy master branch also brings potential
risk in tracking and debugging.

Now that Cloud Hypervisor support in EDK2 has been steady, we can pin
the EDK2 software versions to avoid unnecessary updating and building.
We can update the pinned versions manually every few months.

The commit also optimizes the build process by applying multi-threaded
compiling.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-09-27 17:58:16 +08:00
Michael Zhao
8b7880160e scripts: Refactor the bash code for building linux
Simplified the bash code for building the custom Linux kernel in the
integration test.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-09-27 17:58:16 +08:00
Michael Zhao
b7cb6257b5 scripts: Add a bash function to sync external code
Added a bash function to the integration test script to check out the
source code of a Git repository at a specified branch and commit.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-09-27 17:58:16 +08:00
Henry Wang
6037c83585 resources: Add autotools and texinfo for AArch64
These packages will be used to compile `stress` from source, and
the `stress` will be used by the virtio-balloon integration test.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-09-27 17:34:29 +08:00
Sebastien Boeuf
b910a7922d vmm: Fix migration when writing/reading big chunks of data
Both read_exact_from() and write_all_to() functions from the GuestMemory
trait implementation in vm-memory are buggy. They should retry until
they have written or read the expected amount of data, but instead they
simply return an error on a partial write or read. This causes the
migration to fail when sending a large amount of data through the
migration socket, due to large memory regions.
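
The retry behaviour that is expected can be sketched with the standard `Read` and `Write` traits. This is an illustration of the general technique, not the workaround from this commit and not the vm-memory API: `write_all_retry`, `read_exact_retry` and `ChunkedSink` are names invented for this example.

```rust
use std::io::{self, Read, Write};

/// Keep calling `write` until the whole buffer has been accepted.
fn write_all_retry<W: Write>(dst: &mut W, mut buf: &[u8]) -> io::Result<()> {
    while !buf.is_empty() {
        match dst.write(buf) {
            Ok(0) => {
                return Err(io::Error::new(io::ErrorKind::WriteZero, "sink accepted no data"))
            }
            // A short write is not an error: advance and try again.
            Ok(n) => buf = &buf[n..],
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(())
}

/// Keep calling `read` until the whole buffer has been filled.
fn read_exact_retry<R: Read>(src: &mut R, buf: &mut [u8]) -> io::Result<()> {
    let mut filled = 0;
    while filled < buf.len() {
        match src.read(&mut buf[filled..]) {
            Ok(0) => {
                return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "source ended early"))
            }
            Ok(n) => filled += n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(())
}

/// Sink that accepts at most `max_per_call` bytes per call, similar to a
/// socket whose send buffer is smaller than the payload.
struct ChunkedSink {
    max_per_call: usize,
    data: Vec<u8>,
    calls: usize,
}

impl Write for ChunkedSink {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let n = buf.len().min(self.max_per_call);
        self.data.extend_from_slice(&buf[..n]);
        self.calls += 1;
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

fn main() -> io::Result<()> {
    let mut sink = ChunkedSink { max_per_call: 3, data: Vec::new(), calls: 0 };
    write_all_retry(&mut sink, b"0123456789")?;
    println!("{} bytes written in {} calls", sink.data.len(), sink.calls);

    let mut src: &[u8] = b"abcdef";
    let mut out = [0u8; 4];
    read_exact_retry(&mut src, &mut out)?;
    println!("{:?}", out);
    Ok(())
}
```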

This should eventually be fixed in vm-memory, and here is the link to
follow up on the issue: https://github.com/rust-vmm/vm-memory/issues/174

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-27 11:13:56 +02:00
Rob Bradford
ad6dfc5875 build: Temporarily use git version of cargo-fuzz in GH action
This resolves issues between the released version of cargo-fuzz and nightly.

See rust-fuzz/cargo-fuzz#276

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-27 17:08:43 +08:00
dependabot[bot]
cb59976c68 build: bump syn from 1.0.76 to 1.0.77 in /fuzz
Bumps [syn](https://github.com/dtolnay/syn) from 1.0.76 to 1.0.77.
- [Release notes](https://github.com/dtolnay/syn/releases)
- [Commits](https://github.com/dtolnay/syn/compare/1.0.76...1.0.77)

---
updated-dependencies:
- dependency-name: syn
  dependency-type: indirect
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-27 03:01:45 +00:00
dependabot[bot]
520f54cb8d build: bump syn from 1.0.76 to 1.0.77
Bumps [syn](https://github.com/dtolnay/syn) from 1.0.76 to 1.0.77.
- [Release notes](https://github.com/dtolnay/syn/releases)
- [Commits](https://github.com/dtolnay/syn/compare/1.0.76...1.0.77)

---
updated-dependencies:
- dependency-name: syn
  dependency-type: indirect
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-24 09:56:54 +00:00
Rob Bradford
1a2d0e6dd8 build: bump linux-loader from 0.3.0 to 0.4.0
Requires manual change to command line loading.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-24 09:11:57 +00:00
Michael Zhao
7383087230 tests: Enable virtio-iommu test on AArch64
Refactored the test case `test_virtio_iommu` to adapt to both
architectures and to the choice between ACPI and FDT. In the case of ACPI,
a Focal image with a modified kernel is tested.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-09-24 07:57:57 +01:00
Michael Zhao
d72af85c42 vmm: Add "_CCA" field to ACPI DSDT table
"_CCA" is required by DMA configuration on AArch64.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-09-24 07:57:57 +01:00
Michael Zhao
d76d04e2f2 scripts: Prepare a Focal image with custom kernel
On AArch64, ACPI must work with UEFI (EDK2). This way, the kernel is
always loaded from the disk image. We cannot directly specify a custom
kernel when using ACPI.

To use a custom kernel, we have to replace the kernel file in the disk
image by:
- Making a copy of the Focal `raw` image
- Mounting the rootfs with `libguestfs-tools`
- Replacing the compressed kernel file

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-09-24 07:57:57 +01:00
Michael Zhao
56c26b3d9c resource: Install libguestfs-tools in Docker image
Installed `libguestfs-tools` to replace kernel file in cloud image.
Installed a kernel as `libguestfs-tools` requires.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-09-23 08:59:58 +01:00
dependabot[bot]
2280be825e build: bump openssl-sys from 0.9.66 to 0.9.67
Bumps [openssl-sys](https://github.com/sfackler/rust-openssl) from 0.9.66 to 0.9.67.
- [Release notes](https://github.com/sfackler/rust-openssl/releases)
- [Commits](https://github.com/sfackler/rust-openssl/compare/openssl-sys-v0.9.66...openssl-sys-v0.9.67)

---
updated-dependencies:
- dependency-name: openssl-sys
  dependency-type: indirect
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-22 08:00:42 +00:00
dependabot[bot]
73c822f64a build: bump vm-fdt from 06cbff3 to ddb3fad
Bumps [vm-fdt](https://github.com/rust-vmm/vm-fdt) from `06cbff3` to `ddb3fad`.
- [Release notes](https://github.com/rust-vmm/vm-fdt/releases)
- [Commits](06cbff3a02...ddb3fad524)

---
updated-dependencies:
- dependency-name: vm-fdt
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-21 15:47:15 +00:00
Rob Bradford
43365ade2e vmm, pci: Implement virtio-mem support for vfio-user
Implement the infrastructure that lets a virtio-mem device map the guest
memory into the device. This is necessary since with virtio-mem zones
memory can be added or removed and the vfio-user device must be
informed.

Fixes: #3025

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-21 15:42:49 +01:00
Rob Bradford
e9d67dc405 vmm: pci: Move creation of vfio_user::Client to DeviceManager
By moving this from the VfioUserPciDevice to DeviceManager the client
can be reused for handling DMA mapping behind an IOMMU.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-21 15:42:49 +01:00
Rob Bradford
fd4f32fa69 virtio-mem: Support multiple mappings
For vfio-user the mapping handler is per device and needs to be removed
when the device is unplugged.

For VFIO the mapping handler is for the default VFIO container (used
when no vIOMMU is used - using a vIOMMU does not require mappings with
virtio-mem)

To represent these two use cases use an enum for the handlers that are
stored.
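
One way to picture this is a map keyed by an enum with one variant per use case. The sketch below is illustrative only: the `MappingSource` and `MappingHandlers` names, the numeric device id and the closure-based handler type are assumptions, not the types used in the commit.

```rust
use std::cell::Cell;
use std::collections::BTreeMap;
use std::rc::Rc;

/// Who a mapping handler belongs to (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum MappingSource {
    /// The default VFIO container, used when no vIOMMU is present.
    Container,
    /// A single vfio-user device, identified by an assumed numeric id.
    Device(u32),
}

/// A mapping callback taking (guest physical address, size).
type Handler = Box<dyn Fn(u64, u64)>;

#[derive(Default)]
struct MappingHandlers {
    handlers: BTreeMap<MappingSource, Handler>,
}

impl MappingHandlers {
    fn add(&mut self, source: MappingSource, handler: Handler) {
        self.handlers.insert(source, handler);
    }

    /// Returns true if a handler was registered for `source`.
    fn remove(&mut self, source: MappingSource) -> bool {
        self.handlers.remove(&source).is_some()
    }

    /// Notify every registered handler that a range was plugged.
    fn map(&self, gpa: u64, size: u64) {
        for handler in self.handlers.values() {
            handler(gpa, size);
        }
    }
}

fn main() {
    let mapped = Rc::new(Cell::new(0u64));
    let mut handlers = MappingHandlers::default();
    for source in [MappingSource::Container, MappingSource::Device(7)] {
        let counter = Rc::clone(&mapped);
        handlers.add(
            source,
            Box::new(move |_gpa: u64, size: u64| counter.set(counter.get() + size)),
        );
    }
    handlers.map(0x1_0000_0000, 2 << 20);
    // Unplugging the vfio-user device drops only its own handler.
    handlers.remove(MappingSource::Device(7));
    handlers.map(0x1_0020_0000, 2 << 20);
    println!("bytes mapped across handlers: {:#x}", mapped.get());
}
```

Keying by the enum lets a per-device handler be removed on unplug without touching the container handler.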

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-21 15:42:49 +01:00
Sebastien Boeuf
6fb88c3c5a virtio-devices: balloon: Add snapshot/restore support
Adding the snapshot/restore support along with migration as well,
allowing a VM with a virtio-balloon device attached to be properly
migrated.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-21 14:47:17 +02:00
Bo Chen
d6c08d902b ci: Use the pre-installed virtiofsd
Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-09-20 16:47:28 +01:00
Wei Liu
86afa38c64 hypervisor: mshv: drop one unsafe in code
The binding already provides a default() method which does the same
thing.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2021-09-20 17:22:31 +02:00
Bo Chen
ae68c802bd Dockerfile: Build and install virtiofsd to the docker image
Given the 'virtiofsd' executable is used in multiple CI workers,
installing it directly in the Docker image is more efficient and can
save CI time.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-09-20 13:15:52 +01:00
dependabot[bot]
ccced2ebf4 build: bump arc-swap from 1.3.2 to 1.4.0 in /fuzz
Bumps [arc-swap](https://github.com/vorner/arc-swap) from 1.3.2 to 1.4.0.
- [Release notes](https://github.com/vorner/arc-swap/releases)
- [Changelog](https://github.com/vorner/arc-swap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/vorner/arc-swap/compare/v1.3.2...v1.4.0)

---
updated-dependencies:
- dependency-name: arc-swap
  dependency-type: indirect
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-19 17:41:20 +00:00
dependabot[bot]
d826b4fbdc build: bump arc-swap from 1.3.2 to 1.4.0
Bumps [arc-swap](https://github.com/vorner/arc-swap) from 1.3.2 to 1.4.0.
- [Release notes](https://github.com/vorner/arc-swap/releases)
- [Changelog](https://github.com/vorner/arc-swap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/vorner/arc-swap/compare/v1.3.2...v1.4.0)

---
updated-dependencies:
- dependency-name: arc-swap
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-19 17:12:50 +00:00
Rob Bradford
0faa7afac2 vmm: Add fast path for PCI config IO port
Looking up devices on the port I/O bus is time consuming during boot, as
there is an O(log n) tree lookup and the overhead from taking a lock on
the bus contents.

Avoid this by adding a fast path that uses the hardcoded port address and
size and directs PCI config requests directly to the device.
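
The idea can be sketched as a range check performed before the generic bus lookup. This is a simplified model rather than the code in this commit: `PioBus`, `Route` and `route_pio` are hypothetical names, the bus here has no lock, and the only fixed facts used are the conventional PCI configuration ports (0xcf8 to 0xcff).

```rust
use std::collections::BTreeMap;

const PCI_CONFIG_IO_PORT: u64 = 0xcf8;
const PCI_CONFIG_IO_PORT_SIZE: u64 = 0x8;

#[derive(Debug, PartialEq)]
enum Route {
    /// Handled directly, without touching the bus.
    PciConfigFastPath { offset: u64 },
    /// Found through the generic tree lookup.
    Bus { base: u64, offset: u64 },
    Unhandled,
}

/// Minimal stand-in for the port I/O bus: base port -> region length.
struct PioBus {
    devices: BTreeMap<u64, u64>,
}

impl PioBus {
    /// Tree lookup: find the region with the largest base <= port.
    fn resolve(&self, port: u64) -> Option<(u64, u64)> {
        let (&base, &len) = self.devices.range(..=port).next_back()?;
        if port - base < len {
            Some((base, port - base))
        } else {
            None
        }
    }
}

fn route_pio(bus: &PioBus, port: u64) -> Route {
    // Fast path: a constant range check instead of a lookup.
    if (PCI_CONFIG_IO_PORT..PCI_CONFIG_IO_PORT + PCI_CONFIG_IO_PORT_SIZE).contains(&port) {
        return Route::PciConfigFastPath { offset: port - PCI_CONFIG_IO_PORT };
    }
    match bus.resolve(port) {
        Some((base, offset)) => Route::Bus { base, offset },
        None => Route::Unhandled,
    }
}

fn main() {
    let mut devices = BTreeMap::new();
    devices.insert(0x3f8, 8); // a serial port region, for illustration
    let bus = PioBus { devices };
    println!("{:?}", route_pio(&bus, 0xcfc));
    println!("{:?}", route_pio(&bus, 0x3f9));
}
```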

Command line:
target/release/cloud-hypervisor --kernel ~/src/linux/vmlinux --cmdline "root=/dev/vda1 console=ttyS0" --serial tty --console off --disk path=~/workloads/focal-server-cloudimg-amd64-custom-20210609-0.raw --api-socket /tmp/api

PIO exit: 17913
PCI fast path: 17871
Percentage on fast path: 99.8%

perf before:

marvin:~/src/cloud-hypervisor (main *)$ perf report -g | grep resolve
     6.20%     6.20%  vcpu0            cloud-hypervisor    [.] vm_device::bus::Bus::resolve

perf after:

marvin:~/src/cloud-hypervisor (2021-09-17-ioapic-fast-path *)$ perf report -g | grep resolve
     0.08%     0.08%  vcpu0            cloud-hypervisor    [.] vm_device::bus::Bus::resolve

The compromise required to implement this fast path is bringing the
creation of the PciConfigIo device into the DeviceManager::new() so that
it can be used in the VmmOps struct which is created before
DeviceManager::create_devices() is called.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-17 17:09:45 +01:00
Michael Zhao
da8eecc797 docs: Describe about virtio-iommu with FDT
Added a section in "Usage" chapter of "iommu.md" to introduce the
special behavior when virtio-iommu is working with FDT on AArch64.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-09-17 12:19:46 +02:00
Michael Zhao
9dc9e224b9 tests: Enable IOMMU test case on AArch64
On AArch64, the virtual IOMMU is currently only tested with FDT, not ACPI.
With FDT, the behavior of the IOMMU differs slightly from ACPI: all the
devices on the PCI bus are attached to the virtual IOMMU, except the
virtio-iommu device itself. So these devices are all added to IOMMU
groups, and appear in the folder '/sys/kernel/iommu_groups/'.

As a result, on AArch64 IOMMU group '0' contains "0000:00:01.0", which
is the console device. On X86, the console device is not attached to the
IOMMU, so IOMMU group '0' contains "0000:00:02.0", which is the first
disk.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-09-17 12:19:46 +02:00
Michael Zhao
b3fa56544c virtio-devices: iommu: Support AArch64
The MSI IOVA address on X86 and AArch64 is different.

This commit refactored the code to receive the MSI IOVA address and size
from device_manager, which provides the actual IOVA space data for both
architectures.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-09-17 12:19:46 +02:00
Michael Zhao
b30ddc0837 aarch64: Refactor AArch64 GIC space definitions
Move the definition of MSI space to layout.rs, so other crates can
reference it. Now it is needed by virtio-iommu.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-09-17 12:19:46 +02:00
Michael Zhao
253c06d3ba arch/aarch64: Add virtio-iommu device in FDT
Add a virtio-iommu node into FDT if iommu option is turned on. Now we
support only one virtio-iommu device.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-09-17 12:19:46 +02:00
William Douglas
46f6d9597d vmm: Switch to using the serial_manager for serial input
This change switches from handling serial input in the VMM thread to
its own thread controlled by the SerialManager.

The motivation for this change is to avoid the VMM thread being unable
to process events while serial input is happening and vice versa.
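
A minimal sketch of the general pattern, using only the standard library: input is read on a dedicated thread and pushed into shared device state, so the spawning thread stays free for other events. `SerialDevice` and `spawn_serial_input` are hypothetical names, and this simple blocking read loop is a stand-in for whatever event handling the SerialManager actually uses.

```rust
use std::io::{self, Read};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for the emulated serial device's receive queue.
#[derive(Default)]
struct SerialDevice {
    rx_fifo: Vec<u8>,
}

/// Read from `input` on a dedicated thread and queue the bytes on the
/// shared device until the input reaches end of file.
fn spawn_serial_input<R: Read + Send + 'static>(
    mut input: R,
    serial: Arc<Mutex<SerialDevice>>,
) -> JoinHandle<io::Result<()>> {
    thread::spawn(move || {
        let mut chunk = [0u8; 64];
        loop {
            let n = input.read(&mut chunk)?;
            if n == 0 {
                return Ok(());
            }
            // Hold the lock only while queueing this chunk.
            serial.lock().unwrap().rx_fifo.extend_from_slice(&chunk[..n]);
        }
    })
}

fn main() {
    let serial = Arc::new(Mutex::new(SerialDevice::default()));
    let input = io::Cursor::new(b"uname -a\n".to_vec());
    let handle = spawn_serial_input(input, Arc::clone(&serial));
    // The spawning thread could keep processing other events here.
    handle.join().unwrap().unwrap();
    println!("queued {} bytes", serial.lock().unwrap().rx_fifo.len());
}
```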

The change also makes future work flushing the serial buffer on PTY
connections easier.

Signed-off-by: William Douglas <william.douglas@intel.com>
2021-09-17 11:15:35 +01:00