Commit Graph

1886 Commits

Author SHA1 Message Date
Sebastien Boeuf
81ba70a497 pci, vmm: Defer mapping VFIO MMIO regions on restore
When restoring a VM, the restore codepath will take care of mapping the
MMIO regions based on the information from the snapshot, rather than
having the mapping being performed during device creation.

When the device is created, information such as which BARs contain the
MSI-X tables are missing, preventing to perform the mapping of the MMIO
regions.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-06-09 09:19:58 +02:00
Sebastien Boeuf
7df7061610 pci, vmm: Add migratable support to vfio-user devices
Based on recent changes to VfioUserPciDevice, the vfio-user devices can
now be migrated.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-06-09 09:19:58 +02:00
Sebastien Boeuf
c021dda267 pci, vmm: Add migratable support to VFIO devices
Based on recent changes to VfioPciDevice, the VFIO devices can now be
migrated.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-06-09 09:19:58 +02:00
Rob Bradford
94fb9f817d vmm: Fix clippy issues under "guest_debug" feature
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-06-08 11:40:56 +01:00
Rob Bradford
133a5a858a vmm: Pull in gdb dependencies only when building with feature
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-06-08 11:30:20 +01:00
Michael Zhao
a7a15d56dd aarch64: Move setup_regs to hypervisor
`setup_regs` of AArch64 calls KVM sepecific code. Now move it to
`hypervisor` crate.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-06-06 11:07:46 +01:00
Sebastien Boeuf
65dc1c83a9 vmm: cpu: Save and restore CPU states during snapshot/restore
Based on recent KVM host patches (merged in Linux 5.16), it's forbidden
to call into KVM_SET_CPUID2 after the first successful KVM_RUN returned.
That means saving CPU states during the pause sequence, and restoring
these states during the resume sequence will not work with the current
design starting with kernel version 5.16.

In order to solve this problem, let's simply move the save/restore logic
to the snapshot/restore sequences rather than the pause/resume ones.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-06-06 11:07:29 +01:00
Sebastien Boeuf
3edaa8adb6 vmm: Ensure restore matches boot sequence
The vCPU is created and set after all the devices on a VM's boot.
There's no reason to follow a different order on the restore codepath as
this could cause some unexpected behaviors.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-06-06 11:07:17 +01:00
Michael Zhao
9260c3816e vmm: Update unit test for GIC refactoring
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-06-06 10:17:26 +08:00
Michael Zhao
5d45d6d0fb vmm: Move GIC unit test to hypervisor crate
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-06-06 10:17:26 +08:00
Michael Zhao
957d3a7443 aarch64: Simplify GIC related structs definition
Combined the `GicDevice` struct in `arch` crate and the `Gic` struct in
`devices` crate.

After moving the KVM specific code for GIC in `arch`, a very thin wapper
layer `GicDevice` was left in `arch` crate. It is easy to combine it
with the `Gic` in `devices` crate.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-06-06 10:17:26 +08:00
Michael Zhao
04949755c0 arch: Switch to new GIC interface
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-06-06 10:17:26 +08:00
dependabot[bot]
f1495b5767 build: bump uuid from 1.1.0 to 1.1.1
Bumps [uuid](https://github.com/uuid-rs/uuid) from 1.1.0 to 1.1.1.
- [Release notes](https://github.com/uuid-rs/uuid/releases)
- [Commits](https://github.com/uuid-rs/uuid/compare/1.1.0...1.1.1)

---
updated-dependencies:
- dependency-name: uuid
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-06-02 00:48:36 +00:00
Rob Bradford
ade3a9c8f6 virtio-devices, vmm: Optimised async virtio device activation
In order to ensure that the virtio device thread is spawned from the vmm
thread we use an asynchronous activation mechanism for the virtio
devices. This change optimises that code so that we do not need to
iterate through all virtio devices on the platform in order to find the
one that requires activation. We solve this by creating a separate short
lived VirtioPciDeviceActivator that holds the required state for the
activation (e.g. the clones of the queues) this can then be stored onto
the device manager ready for asynchronous activation.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-06-01 09:42:02 +02:00
Yi Wang
dbeb922882 doc: add vm coredump support
Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Co-authored-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-05-30 13:41:40 +02:00
Yi Wang
8b585b96c1 vmm: enable coredump
Based on the newly added guest_debug feature, this patch adds http
endpoint support.

Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Co-authored-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-05-30 13:41:40 +02:00
Yi Wang
ccb604e1e1 vmm: add cpu segment note for coredump
The crash tool use a special note segment which named 'QEMU' to
analyze kaslr info and so on. If we don't add the 'QEMU' note
segment, crash tool can't find linux version to move on.

For now, the most convenient way is to add 'QEMU' note segment to
make crash tool happy.

Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
2022-05-30 13:41:40 +02:00
Yi Wang
0e65ca4a6c vmm: save guest memory for coredump
Guest memory is needed for analysis in crash tool, so save it
for coredump.

Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Co-authored-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-05-30 13:41:40 +02:00
Yi Wang
7e280b6f70 vmm: save elf header for coredump
The vmcore file of guest is an elf format, so the first step of coredump
is to save the elf header.

Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
2022-05-30 13:41:40 +02:00
Yi Wang
90034fd6ba vmm: add GuestDebuggable trait
It's useful to dump the guest, which named coredump so that crash
tool can be used to analysize it when guest hung up.

Let's add GuestDebuggable trait and Coredumpxxx error to support
coredump firstly.

Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Co-authored-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-05-30 13:41:40 +02:00
Rob Bradford
465db7f08c vmm: config: Remove mergeable option from PmemConfig
Fixes: #3968

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-27 09:48:49 +02:00
Rob Bradford
55c5961f43 vmm: config: Remove dax & cache_size options from FsConfig
Fixes: #3889

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-27 09:47:13 +02:00
Rob Bradford
7c3582b4a8 vmm: config: Fix error message regarding use of cache size without dax
The error message incorrectly said that the user was trying to combine
cache_size without dax whereas it is only usuable with dax.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-27 09:47:13 +02:00
Rob Bradford
979797786d vmm: Remove DAX cache setup for virtio-fs devices
Remove the code from the DeviceManager that prepares the DAX cache since
the functionality has now been removed.

Fixes: #3889

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-27 09:47:13 +02:00
Michael Zhao
0fd6521759 aarch64: Avoid depending on layout in GIC code
Removing the dependency on `layout` helps moving GIC code into
`hypervisor` crate.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-05-27 10:57:50 +08:00
Michael Zhao
3fe20cc09a aarch64: Remove GicDevice trait
`GicDevice` trait was defined for the common part of GicV3 and ITS.
Now that the standalone GicV3 do not exist, `GicDevice` is not needed.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-05-27 10:57:50 +08:00
dependabot[bot]
c0cbaff519 build: bump uuid from 1.0.0 to 1.1.0
Bumps [uuid](https://github.com/uuid-rs/uuid) from 1.0.0 to 1.1.0.
- [Release notes](https://github.com/uuid-rs/uuid/releases)
- [Commits](https://github.com/uuid-rs/uuid/compare/1.0.0...1.1.0)

---
updated-dependencies:
- dependency-name: uuid
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-05-26 00:27:34 +00:00
Rob Bradford
fa07d83565 Revert "virtio-devices, vmm: Optimised async virtio device activation"
This reverts commit f160572f9d.

There has been increased flakiness around the live migration tests since
this was merged. Speculatively reverting to see if there is increased
stability.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-21 21:27:33 +01:00
Rob Bradford
f160572f9d virtio-devices, vmm: Optimised async virtio device activation
In order to ensure that the virtio device thread is spawned from the vmm
thread we use an asynchronous activation mechanism for the virtio
devices. This change optimises that code so that we do not need to
iterate through all virtio devices on the platform in order to find the
one that requires activation. We solve this by creating a separate short
lived VirtioPciDeviceActivator that holds the required state for the
activation (e.g. the clones of the queues) this can then be stored onto
the device manager ready for asynchronous activation.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-20 17:07:13 +01:00
Sebastien Boeuf
49db713124 virtio-devices, vmm: Remove unused macro rules
Latest cargo beta version raises warnings about unused macro rules.
Simply remove them to fix the beta build.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-05-20 09:59:43 +01:00
dependabot[bot]
0e16ffbcff build: bump libc from 0.2.125 to 0.2.126
Bumps [libc](https://github.com/rust-lang/libc) from 0.2.125 to 0.2.126.
- [Release notes](https://github.com/rust-lang/libc/releases)
- [Commits](https://github.com/rust-lang/libc/compare/0.2.125...0.2.126)

---
updated-dependencies:
- dependency-name: libc
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-05-18 07:14:37 +00:00
Maksym Pavlenko
3a0429c998 cargo: Clean up serde dependencies
There is no need to include serde_derive separately,
as it can be specified as serde feature instead.

Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2022-05-18 08:21:19 +02:00
dependabot[bot]
58966e952d build: bump signal-hook from 0.3.13 to 0.3.14
Bumps [signal-hook](https://github.com/vorner/signal-hook) from 0.3.13 to 0.3.14.
- [Release notes](https://github.com/vorner/signal-hook/releases)
- [Changelog](https://github.com/vorner/signal-hook/blob/master/CHANGELOG.md)
- [Commits](https://github.com/vorner/signal-hook/compare/v0.3.13...v0.3.14)

---
updated-dependencies:
- dependency-name: signal-hook
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-05-17 04:18:48 +00:00
dependabot[bot]
7db7410920 build: bump virtio-queue from 0.2.0 to 0.3.0
Bumps [virtio-queue](https://github.com/rust-vmm/vm-virtio) from 0.2.0 to 0.3.0.
- [Release notes](https://github.com/rust-vmm/vm-virtio/releases)
- [Commits](https://github.com/rust-vmm/vm-virtio/compare/virtio-queue-v0.2.0...virtio-queue-v0.3.0)

Also relies on main branch from vhost-user-backend since it moved to
virtio-queue 0.3.0 as well, and without this change it wouldn't compile.

---
updated-dependencies:
- dependency-name: virtio-queue
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-05-16 11:47:20 +02:00
Rob Bradford
16a9882153 vmm: cpu: tdx: Don't use fd suffix for something not an FD
The hypervisor::Vcpu is the abstraction over the fd.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-13 15:39:22 +02:00
Rob Bradford
218be2642e hypervisor: Explicitly pub use at the hypervisor crate top-level
Explicitly re-export types from the hypervisor specific modules. This
makes it much clearer what the common functionality that is exposed is.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-13 15:39:22 +02:00
Rob Bradford
cd0df05808 vmm, arch: CpuId is x86_64 specific so import from the x86_64 module
It will be removed as a top-level export from the hypervisor crate.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-13 15:39:22 +02:00
Rob Bradford
d3f66f8702 hypervisor: Make vm module private
And thus only export what is necessary through a `pub use`. This is
consistent with some of the other modules and makes it easier to
understand what the external interface of the hypervisor crate is.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-13 15:39:22 +02:00
dependabot[bot]
8e1ba99e8d build: bump clap from 3.1.17 to 3.1.18
Bumps [clap](https://github.com/clap-rs/clap) from 3.1.17 to 3.1.18.
- [Release notes](https://github.com/clap-rs/clap/releases)
- [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/clap-rs/clap/compare/v3.1.17...v3.1.18)

---
updated-dependencies:
- dependency-name: clap
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-05-12 23:51:40 +00:00
Rob Bradford
b1bd87df19 vmm: Simplify MsiInterruptManager generics
By taking advantage of the fact that IrqRoutingEntry is exported by the
hypervisor crate (that is typedef'ed to the hypervisor specific version)
then the code for handling the MsiInterruptManager can be simplified.

This is particularly useful if in this future it is not a typedef but
rather a wrapper type.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-11 11:19:14 +01:00
Rob Bradford
3f9e8d676a hypervisor: Move creation of irq routing struct to hypervisor crate
This removes the requirement to leak as many datastructures from the
hypervisor crate into the vmm crate.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-11 11:19:14 +01:00
Rob Bradford
c2c813599d vmm: Don't use kvm_ioctls directly
The IoEventAddress is re-exported through the crate at the top-level.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-10 15:57:43 +01:00
Rob Bradford
387d56879b vmm, hypervisor: Clean up nomenclature around offloading VM operations
The trait and functionality is about operations on the VM rather than
the VMM so should be named appropriately. This clashed with with
existing struct for the concrete implementation that was renamed
appropriately.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-10 13:10:01 +01:00
dependabot[bot]
28926c7454 build: bump clap from 3.1.15 to 3.1.17
Bumps [clap](https://github.com/clap-rs/clap) from 3.1.15 to 3.1.17.
- [Release notes](https://github.com/clap-rs/clap/releases)
- [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/clap-rs/clap/compare/v3.1.15...v3.1.17)

---
updated-dependencies:
- dependency-name: clap
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-05-07 01:06:10 +00:00
Sebastien Boeuf
5f722d0d3f vmm: Fix loading RAW firmware
Whenever going through the codepath of loading a RAW firmware, we always
add an extra RAM region to the guest memory through the memory manager.
But we must be careful to use the updated guest memory rather than a
previous reference that wasn't containing the new region, as this can
lead to the following error:

VmBoot(FirmwareLoad(InvalidGuestAddress(GuestAddress(4290772992))))

This is fixed by the current patch, getting the latest reference onto
the guest memory from the memory manager right after the new region has
been added.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-05-06 18:13:28 +02:00
Bo Chen
42c19e14c5 vmm: Add 'shutdown()' to vCPU seccomp filter
This is required when hot-removing a vfio-user device. Details code path
below:

Thread 6 "vcpu0" received signal SIGSYS, Bad system call.
[Switching to Thread 0x7f8196889700 (LWP 2358305)]
0x00007f8196dae7ab in shutdown () at ../sysdeps/unix/syscall-template.S:78
78	T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
(gdb) bt
  0x00007f8196dae7ab in shutdown () at ../sysdeps/unix/syscall-template.S:78
  0x000056189240737d in std::sys::unix::net::Socket::shutdown ()
    at library/std/src/sys/unix/net.rs:383
  std::os::unix::net::stream::UnixStream::shutdown () at library/std/src/os/unix/net/stream.rs:479
  0x000056189210e23d in vfio_user::Client::shutdown (self=0x7f8190014300)
    at vfio_user/src/lib.rs:787
  0x00005618920b9d02 in <pci::vfio_user::VfioUserPciDevice as core::ops::drop::Drop>::drop (
    self=0x7f819002d7c0) at pci/src/vfio_user.rs:551
  0x00005618920b8787 in core::ptr::drop_in_place<pci::vfio_user::VfioUserPciDevice> ()
    at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/ptr/mod.rs:188
  0x00005618920b92e3 in core::ptr::drop_in_place<core::cell::UnsafeCell<dyn pci::device::PciDevice>>
    () at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/ptr/mod.rs:188
  0x00005618920b9362 in core::ptr::drop_in_place<std::sync::mutex::Mutex<dyn pci::device::PciDevice>> () at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/ptr/mod.rs:188
  0x00005618920d8a3e in alloc::sync::Arc<T>::drop_slow (self=0x7f81968852b8)
    at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/alloc/src/sync.rs:1092
  0x00005618920ba273 in <alloc::sync::Arc<T> as core::ops::drop::Drop>::drop (self=0x7f81968852b8)
    at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/alloc/src/sync.rs:1688
 0x00005618920b76fb in core::ptr::drop_in_place<alloc::sync::Arc<std::sync::mutex::Mutex<dyn pci::device::PciDevice>>> ()
    at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/ptr/mod.rs:188
 0x0000561891b5e47d in vmm::device_manager::DeviceManager::eject_device (self=0x7f8190009600,
    pci_segment_id=0, device_id=3) at vmm/src/device_manager.rs:4000
 0x0000561891b674bc in <vmm::device_manager::DeviceManager as vm_device:🚌:BusDevice>::write (
    self=0x7f8190009600, base=70368744108032, offset=8, data=&[u8](size=4) = {...})
    at vmm/src/device_manager.rs:4625
 0x00005618921927d5 in vm_device:🚌:Bus::write (self=0x7f8190006e00, addr=70368744108040,
    data=&[u8](size=4) = {...}) at vm-device/src/bus.rs:235
 0x0000561891b72e10 in <vmm::vm::VmOps as hypervisor::vm::VmmOps>::mmio_write (
    self=0x7f81900097b0, gpa=70368744108040, data=&[u8](size=4) = {...}) at vmm/src/vm.rs:378
 0x0000561892133ae2 in <hypervisor::kvm::KvmVcpu as hypervisor::cpu::Vcpu>::run (
    self=0x7f8190013c90) at hypervisor/src/kvm/mod.rs:1114
 0x0000561891914e85 in vmm::cpu::Vcpu::run (self=0x7f819001b230) at vmm/src/cpu.rs:348
 0x000056189189f2cb in vmm::cpu::CpuManager::start_vcpu::{{closure}}::{{closure}} ()
    at vmm/src/cpu.rs:953

Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-05-05 15:33:26 -07:00
Sebastien Boeuf
058a61148c vmm: Factorize net creation
Since both Net and vhost_user::Net implement the Migratable trait, we
can factorize the common part to simplify the code related to the net
creation.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-05-05 13:08:41 +02:00
Sebastien Boeuf
425902b296 vmm: Factorize disk creation
Since both Block and vhost_user::Blk implement the Migratable trait, we
can factorize the common part to simplify the code related to the disk
creation.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-05-05 13:08:41 +02:00
Sebastien Boeuf
54f39aa8cb vmm: Validate vhost-user-block/net are not configured with iommu=on
Extend the validate() function for both DiskConfig and NetConfig so that
we return an error if a vhost-user-block or vhost-user-net device is
expected to be placed behind the virtual IOMMU. Since these devices
don't support this feature, we can't allow iommu to be set to true in
these cases.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-05-05 13:08:41 +02:00
Rob Bradford
707cea2182 vmm, devices: Move logging of 0x80 timestamp to its own device
This is a cleaner approach to handling the I/O port write to 0x80.
Whilst doing this also use generate the timestamp at the start of the VM
creation. For consistency use the same timestamp for the ARM equivalent.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-04 23:02:53 +01:00