Commit Graph

4918 Commits

Author SHA1 Message Date
Rob Bradford
e01933c8ea build: Release v21.1 (bug fix release)
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-03-11 08:44:05 +00:00
Rob Bradford
c6307865eb virtio-devices: Enable F_EVENT_IDX on control queue if negotiated
With the VIRTIO_F_EVENT_IDX handling now conducted inside the
virtio-queue crate it is necessary to activate the functionality on
every queue if it is negotiatated. Otherwise this leads to a failure of
the guest to signal to the host that there is something in the available
queue as the queue's internal state has not been configured correctly.

Fixes: #3829

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
(cherry picked from commit 223d0cf787)
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-03-11 08:44:05 +00:00
Yi Wang
6275d4f87e vmm: interrupt: fix msi mask irq causing kernel panic on AMD
When mask a msi irq, we set the entry.masked to be true, so kvm
hypervisor will not pass the gsi to kernel through KVM_SET_GSI_ROUTING
ioctl which update kvm->irq_routing. This will trigger kernel
panic on AMD platform when the gsi is the largest one in kernel
kvm->irqfds.items:

crash> bt
PID: 22218  TASK: ffff951a6ad74980  CPU: 73  COMMAND: "vcpu8"
 #0 [ffffb1ba6707fa40] machine_kexec at ffffffff8565b397
 #1 [ffffb1ba6707fa90] __crash_kexec at ffffffff85788a6d
 #2 [ffffb1ba6707fb58] crash_kexec at ffffffff8578995d
 #3 [ffffb1ba6707fb70] oops_end at ffffffff85623c0d
 #4 [ffffb1ba6707fb90] no_context at ffffffff856692c9
 #5 [ffffb1ba6707fbf8] exc_page_fault at ffffffff85f95b51
 #6 [ffffb1ba6707fc50] asm_exc_page_fault at ffffffff86000ace
    [exception RIP: svm_update_pi_irte+227]
    RIP: ffffffffc0761b53  RSP: ffffb1ba6707fd08  RFLAGS: 00010086
    RAX: ffffb1ba6707fd78  RBX: ffffb1ba66d91000  RCX: 0000000000000001
    RDX: 00003c803f63f1c0  RSI: 000000000000019a  RDI: ffffb1ba66db2ab8
    RBP: 000000000000019a   R8: 0000000000000040   R9: ffff94ca41b82200
    R10: ffffffffffffffcf  R11: 0000000000000001  R12: 0000000000000001
    R13: 0000000000000001  R14: ffffffffffffffcf  R15: 000000000000005f
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #7 [ffffb1ba6707fdb8] kvm_irq_routing_update at ffffffffc09f19a1 [kvm]
 #8 [ffffb1ba6707fde0] kvm_set_irq_routing at ffffffffc09f2133 [kvm]
 #9 [ffffb1ba6707fe18] kvm_vm_ioctl at ffffffffc09ef544 [kvm]
    RIP: 00007f143c36488b  RSP: 00007f143a4e04b8  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: 00007f05780041d0  RCX: 00007f143c36488b
    RDX: 00007f05780041d0  RSI: 000000004008ae6a  RDI: 0000000000000020
    RBP: 00000000000004e8   R8: 0000000000000008   R9: 00007f05780041e0
    R10: 00007f0578004560  R11: 0000000000000246  R12: 00000000000004e0
    R13: 000000000000001a  R14: 00007f1424001c60  R15: 00007f0578003bc0
    ORIG_RAX: 0000000000000010  CS: 0033  SS: 002b

To solve this problem, move route.disable() before set_gsi_routes() to
remove the gsi from irqfds.items first.

This problem only exists on AMD platform, 'cause on Intel platform
kernel just return when update irte while it only prints a warning on
AMD.

Also, this patch adjusts the order of enable() and set_gsi_routes() in
unmask(), which should do no harm.

Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
(cherry picked from commit 5375b84e3b)
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-03-11 08:44:05 +00:00
Yi Wang
4dfe1ff77a vmm: interrupt: fix msi mask irq causing kernel panic on AMD
When mask a msi irq, we set the entry.masked to be true, so kvm
hypervisor will not pass the gsi to kernel through KVM_SET_GSI_ROUTING
ioctl which update kvm->irq_routing. This will trigger kernel
panic on AMD platform when the gsi is the largest one in kernel
kvm->irqfds.items:

crash> bt
PID: 22218  TASK: ffff951a6ad74980  CPU: 73  COMMAND: "vcpu8"
 #0 [ffffb1ba6707fa40] machine_kexec at ffffffff8565b397
 #1 [ffffb1ba6707fa90] __crash_kexec at ffffffff85788a6d
 #2 [ffffb1ba6707fb58] crash_kexec at ffffffff8578995d
 #3 [ffffb1ba6707fb70] oops_end at ffffffff85623c0d
 #4 [ffffb1ba6707fb90] no_context at ffffffff856692c9
 #5 [ffffb1ba6707fbf8] exc_page_fault at ffffffff85f95b51
 #6 [ffffb1ba6707fc50] asm_exc_page_fault at ffffffff86000ace
    [exception RIP: svm_update_pi_irte+227]
    RIP: ffffffffc0761b53  RSP: ffffb1ba6707fd08  RFLAGS: 00010086
    RAX: ffffb1ba6707fd78  RBX: ffffb1ba66d91000  RCX: 0000000000000001
    RDX: 00003c803f63f1c0  RSI: 000000000000019a  RDI: ffffb1ba66db2ab8
    RBP: 000000000000019a   R8: 0000000000000040   R9: ffff94ca41b82200
    R10: ffffffffffffffcf  R11: 0000000000000001  R12: 0000000000000001
    R13: 0000000000000001  R14: ffffffffffffffcf  R15: 000000000000005f
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #7 [ffffb1ba6707fdb8] kvm_irq_routing_update at ffffffffc09f19a1 [kvm]
 #8 [ffffb1ba6707fde0] kvm_set_irq_routing at ffffffffc09f2133 [kvm]
 #9 [ffffb1ba6707fe18] kvm_vm_ioctl at ffffffffc09ef544 [kvm]
    RIP: 00007f143c36488b  RSP: 00007f143a4e04b8  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: 00007f05780041d0  RCX: 00007f143c36488b
    RDX: 00007f05780041d0  RSI: 000000004008ae6a  RDI: 0000000000000020
    RBP: 00000000000004e8   R8: 0000000000000008   R9: 00007f05780041e0
    R10: 00007f0578004560  R11: 0000000000000246  R12: 00000000000004e0
    R13: 000000000000001a  R14: 00007f1424001c60  R15: 00007f0578003bc0
    ORIG_RAX: 0000000000000010  CS: 0033  SS: 002b

To solve this problem, move route.disable() before set_gsi_routes() to
remove the gsi from irqfds.items first.

This problem only exists on AMD platform, 'cause on Intel platform
kernel just return when update irte while it only prints a warning on
AMD.

Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
(cherry picked from commit db9e5e5a87)
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-03-11 08:44:05 +00:00
Rob Bradford
bbda07c388 pci: Support DWORD/4-byte writes to the MSI-X control register
The PCI spec does not specify that the access has to be of a specific
size.

Fixes: #3714

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
(cherry picked from commit 9c6e7c4a4b)
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-03-11 08:44:05 +00:00
Rob Bradford
366223c9f4 vmm: Ensure that PIO and MMIO exits complete before pausing
As per this kernel documentation:

      For KVM_EXIT_IO, KVM_EXIT_MMIO, KVM_EXIT_OSI, KVM_EXIT_PAPR, KVM_EXIT_XEN,
      KVM_EXIT_EPR, KVM_EXIT_X86_RDMSR and KVM_EXIT_X86_WRMSR the corresponding
      operations are complete (and guest state is consistent) only after userspace
      has re-entered the kernel with KVM_RUN.  The kernel side will first finish
      incomplete operations and then check for pending signals.

      The pending state of the operation is not preserved in state which is
      visible to userspace, thus userspace should ensure that the operation is
      completed before performing a live migration.  Userspace can re-enter the
      guest with an unmasked signal pending or with the immediate_exit field set
      to complete pending operations without allowing any further instructions
      to be executed.

Since we capture the state as part of the pause and override it as part
of the resume we must ensure the state is consistent otherwise we will
lose the results of the MMIO or PIO operation that caused the exit from
which we paused.

Fixes: #3658

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
(cherry picked from commit 507912385a)
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-03-11 08:44:05 +00:00
Rob Bradford
375356a097 virtio-devices: Add openat() syscall to seccomp filter
When freeing memory sometimes glibc will attempt to read
"/proc/sys/vm/overcommit_memory" to find out how it should release the
blocks. This happens sporadically with Cloud Hypervisor but has been
seen in use. It is not necessary to add the read() syscall to the list
as it is already included in the virtio devices common set. Similarly
the vCPU and vmm threads already have both these in the allowed list.

Fixes: #3609

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
(cherry picked from commit 53caa565bb)
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-03-11 08:44:05 +00:00
Rob Bradford
95ca79974a build: Release v21.0
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-20 14:48:11 +00:00
Sebastien Boeuf
eb5c5f2c7f tests: Add integration test for O_DIRECT
Both OVMF and RHF firmwares triggered an error when O_DIRECT was used
because they didn't align the buffers to the block sector size.

In order to prevent regressions, we're adding a new test validating the
VM can properly boot when the OS disk is opened with O_DIRECT and booted
from the rust-hypervisor-fw.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-01-20 11:49:02 +00:00
Sebastien Boeuf
85bbf75fe8 block_util: Align buffers for O_DIRECT
Whenever the backing file of our virtio-block device is opened with
O_DIRECT, there's a requirement about the buffer address and size to be
aligned to the sector size.

We know virtio-block requests are sector aligned in terms of size, but
we must still check if the buffer address is. In case it's not, we
create an intermediate buffer that will be passed through the system
call. In case of a write operation, the content of the non-aligned
buffer must be copied beforehand, and in case of a read operation, the
content of the aligned buffer must be copied to the non-aligned one
after the operation has been completed.

Fixes #3587

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-01-20 11:49:02 +00:00
Henry Wang
b4566b9eab tests: ignore the result from test_vfio_user
As it is currently unstable.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2022-01-20 11:40:30 +00:00
Anatol Belski
e2a8a1483f acpi: aarch64: Implement DBG2 table
This table is listed as required in the ARM Base Boot Requirements
document. The particular need arises to make the serial debugging of
Windows guest functional.

Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
2022-01-20 09:11:21 +08:00
Rob Bradford
ea60d48853 resources: Update Rust version in container to 1.58
Update to the latest stable release.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-19 09:53:42 -08:00
Rob Bradford
658658e76c hypervisor: kvm: Ignore -EINVAL from KVM_KVMCLOCK_CTRL ioctl()
If the guest hasn't initialised a PV clock then the KVM_KVMCLOCK_CTRL
ioctl will return -EINVAL. Therefore if running in the firmware or an OS
that doesn't use the PV clock then we should ignore that error

Tested by migrating a VM that has not yet booted into the Linux kernel
(just in firmware) by specifying no disk image:

e.g. target/debug/cloud-hypervisor --kernel ~/workloads/hypervisor-fw --api-socket /tmp/api --serial tty --console off

Fixes: #3586

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-19 10:12:57 +01:00
Henry Wang
b7b3b45364 tests: Enable test_vfio_user for AArch64
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2022-01-18 18:00:00 -08:00
Henry Wang
d5b4d0d951 resources: AArch64: Enable Device Mapper and NVME Multipath in config
From 15358ef79d: Device Mapper Multipath
config can avoid systemd errors related to Device Mapper multipath while
guest booting.

From 46672c384c: CONFIG_NVME_MULTIPATH is
needed to fix the observed guest hanging issue cased by systemd crash
while booting.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2022-01-18 18:00:00 -08:00
Henry Wang
1c18d124dc scripts: Add more huge pages for AArch64 integration test
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2022-01-18 18:00:00 -08:00
Henry Wang
7a42ce9310 scripts: AArch64: Build SPDK NVMe before running integration tests
The SPDK-NVMe is needed for the integration test for vfio_user.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-01-18 18:00:00 -08:00
Michael Zhao
1db7718589 pci, vmm: Pass PCI BDF to vfio and vfio_user
On AArch64, PCI BDF is used for devId in MSI-X routing entry.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-01-18 18:00:00 -08:00
Henry Wang
cf68f03ab6 pci: vfio: Skip IOBAR allocation on AArch64
AArch64 does not use IOBAR, and current code of panics the whole VMM if
we need to allocate the IOBAR.

This commit checks if IOBAR is enabled before the arch conditional code
of IOBAR allocation and if the IOBAR is not enabled, we can just skip
the IOBAR allocation and do nothing.

Fixes: https://github.com/cloud-hypervisor/cloud-hypervisor/issues/3479

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2022-01-18 18:00:00 -08:00
Rob Bradford
4ecc778efe vmm: Avoid deadlock between virtio device activation and vcpu pausing
Ensure all pending virtio activations (as triggered by MMIO write on the
vCPU threads leading to a barrier wait) are completed before pausing the
vCPUs as otherwise there will a deadlock with the VMM waiting for the
vCPU to acknowledge it's pause and the vCPU waiting for the VMM to
activate the device and release the barrier.

Fixes: #3585

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-18 17:30:06 -08:00
Wei Liu
48ba999bd9 net_util: drop unneeded clippy::cast_lossless
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-01-18 17:23:27 -08:00
Wei Liu
ef05354c81 memory_manager: drop unneeded clippy suppressions
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-01-18 17:23:27 -08:00
Wei Liu
277cfd07ba device_manager: use if let to drop single match
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-01-18 17:23:27 -08:00
Wei Liu
4f05b8463c virtio-devices: fix clippy::needless_range_loop
Use iterator instead.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-01-18 17:23:27 -08:00
Wei Liu
c9983ff4ad arch: drop allow(clippy::transmute_ptr_to_ptr)
It is not needed.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-01-18 17:23:27 -08:00
Wei Liu
714b529bb2 arch: aarch64: drop unnecessary static lifetime
This also has the side effect for making access_redists_aux function
strictly more useful.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-01-18 17:23:27 -08:00
Wei Liu
99bcebad74 arch: aarch64: do not unnecessarily add mut keyword
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-01-18 17:23:27 -08:00
dependabot[bot]
22d86fb6d2 build: bump clap from 3.0.8 to 3.0.10
Bumps [clap](https://github.com/clap-rs/clap) from 3.0.8 to 3.0.10.
- [Release notes](https://github.com/clap-rs/clap/releases)
- [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/clap-rs/clap/compare/v3.0.8...v3.0.10)

---
updated-dependencies:
- dependency-name: clap
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-01-19 00:09:18 +00:00
dependabot[bot]
2eed24b4b2 build: bump serde_json from 1.0.74 to 1.0.75
Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.74 to 1.0.75.
- [Release notes](https://github.com/serde-rs/json/releases)
- [Commits](https://github.com/serde-rs/json/compare/v1.0.74...v1.0.75)

---
updated-dependencies:
- dependency-name: serde_json
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-01-18 23:41:56 +00:00
dependabot[bot]
be7829b752 build: bump clap from 3.0.8 to 3.0.10 in /fuzz
Bumps [clap](https://github.com/clap-rs/clap) from 3.0.8 to 3.0.10.
- [Release notes](https://github.com/clap-rs/clap/releases)
- [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/clap-rs/clap/compare/v3.0.8...v3.0.10)

---
updated-dependencies:
- dependency-name: clap
  dependency-type: indirect
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-01-18 23:35:44 +00:00
Wei Liu
ea2685e928 block_util: rewrite code and drop allow(clippy::ptr_arg)
The code can be written in a better form and the clippy warning
suppression can be dropped.

No functional change.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-01-18 16:22:21 +01:00
Henry Wang
14ba3f68d3 vmm: cpu: Remove unused import in unit tests
These are the leftovers from the commit 8155be2:
arch: aarch64: vm_memory is not required when configuring vcpu

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2022-01-18 20:34:50 +08:00
Sebastien Boeuf
779bc1a53a edk2: Rely on latest OVMF based on CloudHvX64 target
Update documentation and CI to rely on the new CLOUDHV.fd firmware built
from the newly introduced target CloudHvX64.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-01-18 11:58:26 +01:00
Rob Bradford
8b8daf571a README: Ensure kernel build includes the ELF PVH note
See: #3222

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-18 10:32:08 +01:00
Sebastien Boeuf
482e1ca435 scripts: Pin OVMF EDK2 version
By pinning the OVMF version, we will be able to update the EDK2 fork
with a new version without potentially breaking our Cloud Hypervisor CI.

Once the new version is ready on the EDK2 fork, we'll be able to update
Cloud Hypervisor codebase, replacing the fixed version with the latest,
as well as replacing OVMF.fd with CLOUDHV.fd. This is because we'll
start building from the new target CloudHvX64.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-01-18 10:21:51 +01:00
Rob Bradford
a61302f73f docs: Update Live migration documentation for local migration
Use new --local for efficient live migration when migrating locally for
live upgrade.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-18 09:07:47 +00:00
Rob Bradford
33fd0af8b3 vm-migration: Update protocol for FD based live migration
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-18 09:07:47 +00:00
Rob Bradford
cb243571de tests: Add test_live_migration_local() test
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-18 09:07:47 +00:00
Rob Bradford
70f7f64e23 vmm: api: Add "local" option to OpenAPI YAML file
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-18 09:07:47 +00:00
Rob Bradford
88952cc500 vmm: Send FDs across unix socket for migration when in local mode
When in local migration mode send the FDs for the guest memory over the
socket along with the slot that the FD is associated with. This removes
the requirement for copying the guest RAM and gives significantly faster
live migration performance (of the order of 3s to 60ms).

Fixes: #3566

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-18 09:07:47 +00:00
Rob Bradford
715a7d9065 vmm: Add convenience API for getting slots to FDs mapping
This will be used for sending those file descriptors for local
migration.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-18 09:07:47 +00:00
Rob Bradford
1676fffaad vmm: Check shared memory is enabled for local migration
This is required so that the receiving process can access the existing
process's memory.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-18 09:07:47 +00:00
Rob Bradford
1daef5e8c9 vmm: Propagate the set of memory slots to FDs received in migration
Create the VM using the FDs (wrapped in Files) that have been received
during the migration process.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-18 09:07:47 +00:00
Rob Bradford
735658a49d vm-migration: Add MemoryFd command for setting FDs for memory
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-18 09:07:47 +00:00
Rob Bradford
b95e46565c vmm: Support using existing files for memory slots
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-18 09:07:47 +00:00
Rob Bradford
eeba1d3ad8 vmm: Support using an existing FD for memory
If this FD (wrapped in a File) is supplied when the RAM region is being
created use that over creating a new one.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-18 09:07:47 +00:00
Rob Bradford
271e17bd79 vmm: Extract code for opening a file for memory
This function is used to open an FD (wrapped in a File) that points to
guest memory from memfd_create() or backed on the filesystem.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-18 09:07:47 +00:00
Rob Bradford
b9c260c0de vmm, ch-remote: Add "local" option to send-migration API
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-18 09:07:47 +00:00
dependabot[bot]
6e78ac1837 build: bump clap from 3.0.7 to 3.0.8
Bumps [clap](https://github.com/clap-rs/clap) from 3.0.7 to 3.0.8.
- [Release notes](https://github.com/clap-rs/clap/releases)
- [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/clap-rs/clap/compare/v3.0.7...v3.0.8)

---
updated-dependencies:
- dependency-name: clap
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-01-18 00:28:25 +00:00