3060 Commits

Author SHA1 Message Date
Henry Wang
48544e4e82 vmm: seccomp: whitelist KVM_GET_REG_LIST in seccomp
`KVM_GET_REG_LIST` ioctl is needed in save/restore AArch64 vCPU.
Therefore we whitelist this ioctl in seccomp.

Also this commit unifies the `SYS_FTRUNCATE` syscall for x86_64
and AArch64.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
c6b47d39e0 vmm: refactor vCPU save/restore code in restoring VM
Similarly as the VM booting process, on AArch64 systems,
the vCPUs should be created before the creation of GIC. This
commit refactors the vCPU save/restore code to achieve the
above-mentioned restoring order.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
970a5a410d vmm: decouple vCPU init from configure_vcpus
Since calling `KVM_GET_ONE_REG` before `KVM_VCPU_INIT` will
result in an error: Exec format error (os error 8). This commit
decouples the vCPU init process from `configure_vcpus`. Therefore
in the process of restoring the vCPUs, these vCPUs can be
initialized separately before started.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
47e65cd341 vmm: AArch64: add methods to get saved vCPU states
The construction of `GICR_TYPER` register will need vCPU states.
Therefore this commit adds methods to extract saved vCPU states
from the cpu manager.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
381d0b4372 devices: remove the migration traits for the Gic struct
Unlike x86_64, the "interrupt_controller" in the device manager
for AArch64 is only a `Gic` object that implements the
`InterruptController` to provide the interrupt delivery service.
This is not the real GIC device so that we do not need to save
its states. Also, we do not need to insert it to the device_tree.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
39c9583b48 arch: AArch64: implement save/restore for GICv3
This commit implements the save/restore for GICv3.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
7ddcad1d8b arch: AArch64: add a field gicr_typers for GIC implementations
The value of GIC register `GICR_TYPER` is needed in restoring
the GIC states. This commit adds a field in the GIC device struct
and a method to construct its value.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
dcf6d9d731 device_manager: AArch64: add a field to set/get GIC device entity
In AArch64 systems, the state of GIC device can only be
retrieved from `KVM_GET_DEVICE_ATTR` ioctl. Therefore to implement
saving/restoring the GIC states, we need to make sure that the
GIC object (either the file descriptor or the device itself) can
be extracted after the VM is started.

This commit refactors the code of GIC creation by adding a new
field `gic_device_entity` in device manager and methods to set/get
this field. The GIC object can be therefore saved in the device
manager after calling `arch::configure_system`.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
e7acbcc184 arch: AArch64: support saving RDIST pending tables into guest RAM
This commit adds a function which allows to save RDIST pending
tables to the guest RAM, as well as unit test case for it.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
29ce3076c2 tests: AArch64: Add unit test cases for accessing GIC registers
This commit adds the unit test cases for getting/setting the GIC
distributor, redistributor and ICC registers.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
09d53aad11 arch: AArch64: Porting GIC icc_regs implementation
This commit ports the implementation of GIC ICC registers
from Firecracker.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
bfde6977c8 arch: AArch64: Porting GIC redist_regs implementation
This commit ports the implementation of GIC redistributor registers
from Firecracker.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
f53990c7e7 arch: AArch64: Porting GIC dist_regs implementation
This commit ports the implementation of GIC distributor registers
from Firecracker.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
89a6b63e6e hypervisor: Implement get_device_attr method for AArch64
This commit implements the `get_device_attr` method for the
`KVM_GET_DEVICE_ATTR` ioctl. This ioctl will be used in retrieving
the GIC status.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
b1285cf528 arch: AArch64: move GIC implementations to a separate module
This commit moves the GIC-related code to a separate module.
Therefore the implementation of GIC registers can be introduced
to the new module.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
9dd188a8e8 tests: AArch64: Add unit test cases for vCPU save/restore
Adds 3 more unit test cases for AArch64:

*save_restore_core_regs
*save_restore_system_regs
*get_set_mpstate

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
ffafeda4b6 AArch64: Implement AArch64 vCPU states save/restore
This commit adds methods to save/restore AArch64 vCPU registers,
including:

1. The AArch64 `VcpuKvmState` structure.

2. Some `Vcpu` trait methods of the `KvmVcpu` structure to
enable the save/restore of the AArch64 vCPU states.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
e3d45be6f7 AArch64: Preparation for vCPU save/restore
This commit ports code from firecracker and refactors the existing
AArch64 code as the preparation for implementing save/restore
AArch64 vCPU, including:

1. Modification of `arm64_core_reg` macro to retrive the index of
arm64 core register and implemention of a helper to determine if
a register is a system register.

2. Move some macros and helpers in `arch` crate to the `hypervisor`
crate.

3. Added related unit tests for above functions and macros.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Josh Soref
5c3f4dbe6f ch: Fix various misspelled words
Misspellings were identified by https://github.com/marketplace/actions/check-spelling
* Initial corrections suggested by Google Sheets
* Additional corrections by Google Chrome auto-suggest
* Some manual corrections

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-09-23 08:59:31 +01:00
Jiangbo Wu
22a2a99e5f acpi: Add hotplug numa node
virtio-mem device would use 'VIRTIO_MEM_F_ACPI_PXM' to add memory to NUMA
node, which MUST be existed, otherwise it will be assigned to node id 0,
even if user specify different node id.

According ACPI spec about Memory Affinity Structure, system hardware
supports hot-add memory region using 'Hot Pluggable | Enabled' flags.

Signed-off-by: Jiangbo Wu <jiangbo.wu@intel.com>
2020-09-22 13:11:39 +02:00
Jiangbo Wu
223189c063 mm: Apply zone's property instread of global config
Apply memory zone's property for associated virtio-mem regions.

Signed-off-by: Jiangbo Wu <jiangbo.wu@intel.com>
2020-09-22 09:56:37 +02:00
Jiangbo Wu
80be8ac0dc mm: Apply memory policy for virtio-mem region
Use zone.host_numa_node to create memory zone, so that memory zone
can apply memory policy in according with host numa node ID

Signed-off-by: Jiangbo Wu <jiangbo.wu@intel.com>
2020-09-22 09:56:37 +02:00
dependabot-preview[bot]
097ba3b191 build(deps): bump hermit-abi from 0.1.15 to 0.1.16
Bumps [hermit-abi](https://github.com/hermitcore/libhermit-rs) from 0.1.15 to 0.1.16.
- [Release notes](https://github.com/hermitcore/libhermit-rs/releases)
- [Commits](https://github.com/hermitcore/libhermit-rs/commits)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-09-22 06:00:35 +00:00
Sebastien Boeuf
7c346c3844 vmm: Kill vhost-user self-spawned process on failure
If after the creation of the self-spawned backend, the VMM cannot create
the corresponding vhost-user frontend, the VMM must kill the freshly
spawned process in order to ensure the error propagation can happen.

In case the child process would still be around, the VMM cannot return
the error as it waits onto the child to terminate.

This should help us identify when self-spawned failures are caused by a
connection being refused between the VMM and the backend.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-09-18 17:26:25 +01:00
Rob Bradford
198bd55122 build, release-notes.md: Document 0.10.0 release
Update release notes and version number for the new release.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
v0.10.0
2020-09-18 14:06:10 +01:00
Sebastien Boeuf
555c5c5d9c vmm: Add missing syscalls to signal thread
When the VMM is terminated by receiving a SIGTERM signal, the signal
handler thread must be able to invoke ioctl(TCGETS) and ioctl(TCSETS)
without error.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-09-18 13:40:10 +01:00
Rob Bradford
41a9b1adef vmm: Add missing syscall to vCPU thread
Fixes: #1717

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-09-18 13:40:10 +01:00
Rob Bradford
036c2e5e45 Jenkinsfile: Remove "Build" steps from Jenkinsfile
Build testing of changes happens on GitHub actions and the integration
tests will build the binary (with different feature flags) again. So
these earlier build operations are just wasted time on the critical
path.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-09-18 12:48:30 +01:00
Rob Bradford
66352b100f README: Reference the aarch64 tracking issue.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-09-18 10:25:04 +01:00
Rob Bradford
8589d3f985 README: Fix missing punctuation
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-09-18 10:25:04 +01:00
Rob Bradford
5d535853f4 README: Update table of contents
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-09-18 10:25:04 +01:00
Rob Bradford
98bce5e044 README: Standardise project nomenclature
Only use `cloud-hypervisor` when referring to the binary itself and
prefer Cloud Hypervisor when referring to the project.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-09-18 10:25:04 +01:00
Rob Bradford
ce6353818f README: Update status section
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-09-18 10:25:04 +01:00
Rob Bradford
ea44d0a433 README: Include bzImage as a supported direct kernel boot method
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-09-18 10:25:04 +01:00
Rob Bradford
6d0b05c6b3 README: Update operating system and architecture support statements
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-09-18 10:25:04 +01:00
Rob Bradford
ab80789747 README: Configurable hotplug is no longer an objective
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-09-18 10:25:04 +01:00
Rob Bradford
5506908199 README: Remove KVM exclusivity
With our hypervisor crate we are no aiming to be KVM exclusive.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-09-18 10:25:04 +01:00
Rob Bradford
b666d40f9d README: Remove security reporting guidelines
These guidelines are no longer correct remove them as they are
unhelpful.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-09-18 10:25:04 +01:00
Rob Bradford
f0fae3e8b6 README: Remove pre-production disclaimer
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-09-18 10:25:04 +01:00
Sebastien Boeuf
b8bbe244b7 block_util: Ensure all io_uring operations are asynchronous
Some operations complete directly after they have been submitted, which
means they are not submitted asynchronously and therefore they don't
generate any ioevent. This is the reason why we are not processing some
of the completed operations, which leads to some unpredictable
behaviors.

Forcing all io_uring operations submitted to the SQE to be asynchronous
helps simplifying the code as it ensures the completion of every
operation will generate an ioevent, therefore no operation is missed.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-09-17 23:16:27 +02:00
dependabot-preview[bot]
144f0839ae build(deps): bump cc from 1.0.59 to 1.0.60
Bumps [cc](https://github.com/alexcrichton/cc-rs) from 1.0.59 to 1.0.60.
- [Release notes](https://github.com/alexcrichton/cc-rs/releases)
- [Commits](https://github.com/alexcrichton/cc-rs/compare/1.0.59...1.0.60)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-09-17 07:43:00 +00:00
Sebastien Boeuf
5a3d54fa19 ci: Extend virtio-mem integration test with reboot
Now that virtio-mem supports reboot, we extend the existing integration
tests to validate the amount of RAM after reboot is the same as before
the reboot, but also that we can still resize down the VM or the memory
zone after the reboot.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-09-16 19:20:04 +02:00
Sebastien Boeuf
46d972e402 virtio-devices: mem: Add missing syscall to seccomp filters
The missing syscall rt_sigprocmask(2) was triggered for the musl build
upon rebooting the VM, and was causing the VM to be killed.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-09-16 19:20:04 +02:00
Sebastien Boeuf
87bb041101 resources: x86_64: Enable auto onlining of memory blocks
In order to facilitate the memory hotplug for our users, let's enable
the automatic memory onlining in our guest kernel by activating the
kernel config option 'CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE'.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-09-16 19:20:04 +02:00
Sebastien Boeuf
1e1a50ef70 vmm: Update memory configuration upon virtio-mem resizing
Based on all the preparatory work achieved through previous commits,
this patch updates the 'hotplugged_size' field for both MemoryConfig and
MemoryZoneConfig structures when either the whole memory is resized, or
simply when a memory zone is resized.

This fixes the lack of support for rebooting a VM with the right amount
of memory plugged in.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-09-16 19:20:04 +02:00
Sebastien Boeuf
de2b917f55 vmm: Add hotplugged_size to VirtioMemZone
Adding a new field to VirtioMemZone structure, as it lets us associate
with a particular virtio-mem region the amount of memory that should be
plugged in at boot.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-09-16 19:20:04 +02:00
Sebastien Boeuf
3faf8605f3 vmm: Group virtio-mem fields under a dedicated structure
This patch simplifies the code as we have one single Option for the
VirtioMemZone. This also prepares for storing additional information
related to the virtio-mem region.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-09-16 19:20:04 +02:00
Sebastien Boeuf
4e1b78e1ff vmm: Add 'hotplugged_size' to memory parameters
Add the new option 'hotplugged_size' to both --memory-zone and --memory
parameters so that we can let the user specify a certain amount of
memory being plugged at boot.

This is also part of making sure we can store the virtio-mem size over a
reboot of the VM.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-09-16 19:20:04 +02:00
Hui Zhu
33a1e37c35 virtio-devices: mem: Allow for an initial size
This commit gives the possibility to create a virtio-mem device with
some memory already plugged into it. This is preliminary work to be
able to reboot a VM with the virtio-mem region being already resized.

Signed-off-by: Hui Zhu <teawater@antfin.com>
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-09-16 19:20:04 +02:00
Sebastien Boeuf
8b5202aa5a vmm: Always add virtio-mem region upon VM creation
Now that e820 tables are created from the 'boot_guest_memory', we can
simplify the memory manager code by adding the virtio-mem regions when
they are created. There's no need to wait for the first hotplug to
insert these regions.

This also anticipates the need for starting a VM with some memory
already plugged into the virtio-mem region.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-09-16 19:20:04 +02:00