Commit Graph

2781 Commits

Author SHA1 Message Date
Rob Bradford
f762bc7573 arch: x86_64: Create MP table after SMBIOS table if space
In order to speed up the Linux boot (so as to avoid it having to scan a
large number of pages) place the MP table directly after the SMBIOS
table if there is sufficient room. The start address of the SMBIOS table
is one of the three (and the largest) location that the MP table can
also be located at.

Before:
[    0.000399] x86/PAT: Configuration [0-7]: WB  WC  UC- UC  WB  WP  UC- WT
[    0.014945] check: Scanning 1 areas for low memory corruption

After:
[    0.000284] x86/PAT: Configuration [0-7]: WB  WC  UC- UC  WB  WP  UC- WT
[    0.000421] found SMP MP-table at [mem 0x000f0090-0x000f009f]

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-09-25 19:41:24 +02:00
Wei Liu
7e130a65ba vmm: interrupts: adjust set_gsi_routes
There is no point in manually dropping the lock for gsi_msi_routes then
instantly grabbing it again in set_gsi_routes.

Make set_gsi_routes take a reference to the routing hashmap instead.

No functional change intended.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-09-25 17:17:35 +02:00
Hui Zhu
4913acc05e vmm: Add 'balloon' to memory parameters
Add the option 'balloon' to --memory.

Signed-off-by: Hui Zhu <teawater@antfin.com>
2020-09-25 17:13:39 +02:00
Sebastien Boeuf
c85e396ce5 vmm: cpu: x86: Enable MTRR feature in CPUID
The MTRR feature was missing from the CPUID, which is causing the guest
to ignore the MTRR settings exposed through dedicated MSRs.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-09-25 15:03:52 +02:00
Rob Bradford
f4ec915c5d resources: Remove unused PPS features from kernel config
In particular this removes the annoying PPS messages that fill up the
dmesg log.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-09-25 14:07:41 +02:00
Sebastien Boeuf
ae44e9c076 resources: Reduce x86_64 kernel configuration to fix warnings
Removing the ISA DMA configurations prevents the kernel from accessing
the port I/O 0x87, which was generating the following warning:

WARN:vmm/src/cpu.rs:378 -- Guest PIO read to unregistered address 0x87

Removing the TELCLOCK configuration prevents the kernel from accessing
the port I/O reserved for the memory manager, which was causing the
following warning:

WARN:vmm/src/memory_manager.rs:289 -- Unexpected offset for accessing
memory manager device: 15

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-09-25 10:22:28 +01:00
Rob Bradford
29b74804e1 main: Improve the error reporting when creating the hypervisor object
The ::new() does very little beyond trying to open the /dev/kvm device
so provide a hint to the user about what has gone wrong.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-09-25 11:08:01 +02:00
Bo Chen
1d3c3bc6ec tests: Capture child process stdout/err in 'test_memory_mergeable'
Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-09-25 08:40:23 +02:00
Bo Chen
2441798fe4 tests: Resize the pipe size to 256K for capturing child stdout/err
As discussed in #1707, the `vcpu` thread can be stalled when using
`--serial tty`. To workaround that issue, this patch enforces to resize
the pipe size to 256K when we capture the stdout/stderr of the
cloud-hypervisor child process in the integration tests. Note that the
pipe size (256K) is chosen based on the output size of our integration
tests at this point, which may need to be increased in the future.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-09-25 08:40:23 +02:00
Bo Chen
365b947023 tests: Port test_simple_launch to the new methodology
This is the last test to be ported to the new methodology.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-09-25 08:40:23 +02:00
Bo Chen
cb2f11724a tests: Port test_reboot to the new methodology
Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-09-25 08:40:23 +02:00
Hui Zhu
d03a48162f balloon.rs: BalloonEpollHandler: Fix wrong error in handle_event
error!("Unknown event for virtio-mem");
This error should be
error!("Unknown event for virtio-balloon");

Signed-off-by: Hui Zhu <teawater@antfin.com>
2020-09-25 08:36:23 +02:00
Sebastien Boeuf
de88bef429 pci: msix: Fix masking/enabling semantics
By looking at Linux kernel boot time, we identified that a lot of time
was spent registering and unregistering IRQ fds to KVM. This is not
efficient and certainly not a wrong behavior from the Linux kernel,
but rather a problem with the Cloud-Hypervisor's implementation of
MSI-X.

The way to fix this issue is by ensuring the initial conditions are
correct, which means the entire MSI-X vector table must be disabled
and masked. Additionally, each vector must be individually masked.

With these correct conditions, Linux won't start masking interrupt
vectors, and later unmask them since they will be seen as masked from
the beginning. This means the OS will simply have to unmask them when
needed, avoiding the extra operation.

Another aspect of this patch is to prevent Cloud-Hypervisor from
enabling (by registering IRQ fd) all vectors when either the global
'mask' or 'enable' bits are set. Instead, we can simply let the mask()
and unmask() operations take care of it if needed.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-09-24 22:29:16 +02:00
Sebastien Boeuf
64351c1f3f build: Update Cargo.lock
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-09-23 19:03:19 +02:00
Sebastien Boeuf
2eaf1c70c0 vmm: acpi: Advertise the correct PCI bus range
Since Cloud-Hypervisor currently support one single PCI bus, we must
reflect this through the MCFG table, as it advertises the first bus and
the last bus available. In this case both are bus 0.

This patch saves quite some time during guest kernel boot, as it
prevents from checking each bus for available devices.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-09-23 19:03:19 +02:00
Sebastien Boeuf
ec56710c9b devices: ioapic: Mask entries by default
When created, the IOAPIC entries should be masked, as it is the guest's
responsibility (FW and/or OS) to unmask them if/when necessary.

This patch saves a full round of port I/O writes from the guest to the
IOAPIC, meant for masking the unmasked entries.

Because they're now masked, the entries are not enabled, which means
they are not connected from a KVM perspective, saving from unneeded
registration/unregistration of the irq fds.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-09-23 19:03:19 +02:00
Sebastien Boeuf
827810dbd5 ci: Fix virtiofsd build by staying on older branch
While we figure out the details on how to correctly build virtiofsd from
the latest rebase from the branch "virtio-fs-dev" (which now relies on
QEMU's new build system), let's fix the CI by relying on an older branch
which still relies on the previous build system.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-09-23 17:06:34 +01:00
Henry Wang
c85c1f0d76 ci: AArch64: enable snapshot/restore integration test case
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
961c5f2cb2 vmm: AArch64: enable VM states save/restore for AArch64
The states of GIC should be part of the VM states. This commit
enables the AArch64 VM states save/restore by adding save/restore
of GIC states.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
7c40a78b66 arch: Fix wrong trial of creating GICv3-ITS for non-PCI use cases
Currently for AArch64, the GICv3-ITS is tried to be created first
when PCI is not needed, which is unnecessary. This commit fixes
the problem.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
3ea4a0797d vmm: seccomp: unify AArch64 and x86_64 FTRUNCATE syscall
The definition of libc::SYS_ftruncate on AArch64 is different
from that on x86_64. This commit unifies the previously hard-coded
syscall number for AArch64.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
48544e4e82 vmm: seccomp: whitelist KVM_GET_REG_LIST in seccomp
`KVM_GET_REG_LIST` ioctl is needed in save/restore AArch64 vCPU.
Therefore we whitelist this ioctl in seccomp.

Also this commit unifies the `SYS_FTRUNCATE` syscall for x86_64
and AArch64.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
c6b47d39e0 vmm: refactor vCPU save/restore code in restoring VM
Similarly as the VM booting process, on AArch64 systems,
the vCPUs should be created before the creation of GIC. This
commit refactors the vCPU save/restore code to achieve the
above-mentioned restoring order.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
970a5a410d vmm: decouple vCPU init from configure_vcpus
Since calling `KVM_GET_ONE_REG` before `KVM_VCPU_INIT` will
result in an error: Exec format error (os error 8). This commit
decouples the vCPU init process from `configure_vcpus`. Therefore
in the process of restoring the vCPUs, these vCPUs can be
initialized separately before started.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
47e65cd341 vmm: AArch64: add methods to get saved vCPU states
The construction of `GICR_TYPER` register will need vCPU states.
Therefore this commit adds methods to extract saved vCPU states
from the cpu manager.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
381d0b4372 devices: remove the migration traits for the Gic struct
Unlike x86_64, the "interrupt_controller" in the device manager
for AArch64 is only a `Gic` object that implements the
`InterruptController` to provide the interrupt delivery service.
This is not the real GIC device so that we do not need to save
its states. Also, we do not need to insert it to the device_tree.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
39c9583b48 arch: AArch64: implement save/restore for GICv3
This commit implements the save/restore for GICv3.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
7ddcad1d8b arch: AArch64: add a field gicr_typers for GIC implementations
The value of GIC register `GICR_TYPER` is needed in restoring
the GIC states. This commit adds a field in the GIC device struct
and a method to construct its value.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
dcf6d9d731 device_manager: AArch64: add a field to set/get GIC device entity
In AArch64 systems, the state of GIC device can only be
retrieved from `KVM_GET_DEVICE_ATTR` ioctl. Therefore to implement
saving/restoring the GIC states, we need to make sure that the
GIC object (either the file descriptor or the device itself) can
be extracted after the VM is started.

This commit refactors the code of GIC creation by adding a new
field `gic_device_entity` in device manager and methods to set/get
this field. The GIC object can be therefore saved in the device
manager after calling `arch::configure_system`.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
e7acbcc184 arch: AArch64: support saving RDIST pending tables into guest RAM
This commit adds a function which allows to save RDIST pending
tables to the guest RAM, as well as unit test case for it.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
29ce3076c2 tests: AArch64: Add unit test cases for accessing GIC registers
This commit adds the unit test cases for getting/setting the GIC
distributor, redistributor and ICC registers.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
09d53aad11 arch: AArch64: Porting GIC icc_regs implementation
This commit ports the implementation of GIC ICC registers
from Firecracker.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
bfde6977c8 arch: AArch64: Porting GIC redist_regs implementation
This commit ports the implementation of GIC redistributor registers
from Firecracker.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
f53990c7e7 arch: AArch64: Porting GIC dist_regs implementation
This commit ports the implementation of GIC distributor registers
from Firecracker.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
89a6b63e6e hypervisor: Implement get_device_attr method for AArch64
This commit implements the `get_device_attr` method for the
`KVM_GET_DEVICE_ATTR` ioctl. This ioctl will be used in retrieving
the GIC status.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
b1285cf528 arch: AArch64: move GIC implementations to a separate module
This commit moves the GIC-related code to a separate module.
Therefore the implementation of GIC registers can be introduced
to the new module.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
9dd188a8e8 tests: AArch64: Add unit test cases for vCPU save/restore
Adds 3 more unit test cases for AArch64:

*save_restore_core_regs
*save_restore_system_regs
*get_set_mpstate

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
ffafeda4b6 AArch64: Implement AArch64 vCPU states save/restore
This commit adds methods to save/restore AArch64 vCPU registers,
including:

1. The AArch64 `VcpuKvmState` structure.

2. Some `Vcpu` trait methods of the `KvmVcpu` structure to
enable the save/restore of the AArch64 vCPU states.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
e3d45be6f7 AArch64: Preparation for vCPU save/restore
This commit ports code from firecracker and refactors the existing
AArch64 code as the preparation for implementing save/restore
AArch64 vCPU, including:

1. Modification of `arm64_core_reg` macro to retrive the index of
arm64 core register and implemention of a helper to determine if
a register is a system register.

2. Move some macros and helpers in `arch` crate to the `hypervisor`
crate.

3. Added related unit tests for above functions and macros.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Josh Soref
5c3f4dbe6f ch: Fix various misspelled words
Misspellings were identified by https://github.com/marketplace/actions/check-spelling
* Initial corrections suggested by Google Sheets
* Additional corrections by Google Chrome auto-suggest
* Some manual corrections

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-09-23 08:59:31 +01:00
Jiangbo Wu
22a2a99e5f acpi: Add hotplug numa node
virtio-mem device would use 'VIRTIO_MEM_F_ACPI_PXM' to add memory to NUMA
node, which MUST be existed, otherwise it will be assigned to node id 0,
even if user specify different node id.

According ACPI spec about Memory Affinity Structure, system hardware
supports hot-add memory region using 'Hot Pluggable | Enabled' flags.

Signed-off-by: Jiangbo Wu <jiangbo.wu@intel.com>
2020-09-22 13:11:39 +02:00
Jiangbo Wu
223189c063 mm: Apply zone's property instread of global config
Apply memory zone's property for associated virtio-mem regions.

Signed-off-by: Jiangbo Wu <jiangbo.wu@intel.com>
2020-09-22 09:56:37 +02:00
Jiangbo Wu
80be8ac0dc mm: Apply memory policy for virtio-mem region
Use zone.host_numa_node to create memory zone, so that memory zone
can apply memory policy in according with host numa node ID

Signed-off-by: Jiangbo Wu <jiangbo.wu@intel.com>
2020-09-22 09:56:37 +02:00
dependabot-preview[bot]
097ba3b191 build(deps): bump hermit-abi from 0.1.15 to 0.1.16
Bumps [hermit-abi](https://github.com/hermitcore/libhermit-rs) from 0.1.15 to 0.1.16.
- [Release notes](https://github.com/hermitcore/libhermit-rs/releases)
- [Commits](https://github.com/hermitcore/libhermit-rs/commits)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-09-22 06:00:35 +00:00
Sebastien Boeuf
7c346c3844 vmm: Kill vhost-user self-spawned process on failure
If after the creation of the self-spawned backend, the VMM cannot create
the corresponding vhost-user frontend, the VMM must kill the freshly
spawned process in order to ensure the error propagation can happen.

In case the child process would still be around, the VMM cannot return
the error as it waits onto the child to terminate.

This should help us identify when self-spawned failures are caused by a
connection being refused between the VMM and the backend.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-09-18 17:26:25 +01:00
Rob Bradford
198bd55122 build, release-notes.md: Document 0.10.0 release
Update release notes and version number for the new release.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-09-18 14:06:10 +01:00
Sebastien Boeuf
555c5c5d9c vmm: Add missing syscalls to signal thread
When the VMM is terminated by receiving a SIGTERM signal, the signal
handler thread must be able to invoke ioctl(TCGETS) and ioctl(TCSETS)
without error.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-09-18 13:40:10 +01:00
Rob Bradford
41a9b1adef vmm: Add missing syscall to vCPU thread
Fixes: #1717

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-09-18 13:40:10 +01:00
Rob Bradford
036c2e5e45 Jenkinsfile: Remove "Build" steps from Jenkinsfile
Build testing of changes happens on GitHub actions and the integration
tests will build the binary (with different feature flags) again. So
these earlier build operations are just wasted time on the critical
path.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-09-18 12:48:30 +01:00
Rob Bradford
66352b100f README: Reference the aarch64 tracking issue.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-09-18 10:25:04 +01:00