549 Commits

Author SHA1 Message Date
Rob Bradford
107bfe1075 tests: Fix missing wait() on spawned support commands
Compiling cloud-hypervisor v41.0.0 (/home/rob/src/cloud-hypervisor)
warning: spawned process is never `wait()`ed on
    --> tests/integration.rs:4250:31
     |
4250 |         let mut socat_child = socat_command.spawn().unwrap();
     |                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     |
     = note: consider calling `.wait()`
     = note: not doing so might leave behind zombie processes
     = note: see https://doc.rust-lang.org/stable/std/process/struct.Child.html#warning
     = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#zombie_processes
     = note: `#[warn(clippy::zombie_processes)]` on by default

warning: spawned process is never `wait()`ed on
    --> tests/integration.rs:6687:9
     |
6687 | /         Command::new("/usr/local/bin/spdk-nvme/nvmf_tgt")
6688 | |             .args(["-i", "0", "-m", "0x1"])
6689 | |             .spawn()
6690 | |             .unwrap();
     | |                     ^- help: try: `.wait()`
     | |_____________________|
     |
     |
     = note: not doing so might leave behind zombie processes
     = note: see https://doc.rust-lang.org/stable/std/process/struct.Child.html#warning
     = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#zombie_processes

warning: `cloud-hypervisor` (test "integration") generated 2 warnings
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.83s

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-10-21 15:28:17 +00:00
Ruoqing He
416fe5eaac misc: Remove redundant return statement
As clippy of rust-toolchain version 1.83.0-beta.1 suggests, remove
redundant `return` statement to keep code clean.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2024-10-18 17:46:39 +00:00
Rob Bradford
fc0d808205 tests: Adjust test_tpm memory usage
After updating the OS image to the latest jammy version this test now
requires more memory to run

Fixes: #6782

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-10-17 19:51:05 +00:00
Rob Bradford
caf15f0acb tests: Give test_vfio_user more RAM
When running with the newer jammy OS image error messages from Rust
Hypervisor Firmware about lack of ram when rebooting where observed.
Increase the RAM allocation to allow it to boot after a reboot.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-10-17 19:51:05 +00:00
Rob Bradford
b1547c4ccb tests: Update version of Jammy image in use
This version is generated with the new script and adds kexec-tools.

Fixes: #6726

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-10-17 19:51:05 +00:00
Rob Bradford
81db2f0233 tests: Update for new VFIO worker
Adjust the VFIO device path and the disk image based on the new VFIO CI
worker.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-10-12 19:11:12 +00:00
Ruoqing He
297236a7c0 misc: Eliminate use of assert!((...).is_ok())
Asserting on .is_ok()/.is_err() leads to hard to debug failures (as if
the test fails, it will only say "assertion failed: false". We replace
these with `.unwrap()`, which also prints the exact error variant that
was unexpectedly encountered (we can to this these days thanks to
efforts to implement Display and Debug for our error types). If the
assert!((...).is_ok()) was followed by an .unwrap() anyway, we just drop
the assert.

Inspired by and quoted from @roypat.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2024-10-03 12:03:49 +00:00
Ruoqing He
61e57e1cb1 misc: Further improve imports styling
By introducing `imports_granularity="Module"` format strategy,
effectively groups imports from the same module into one line or block,
improving maintainability and readability.

Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>
2024-09-29 16:13:48 +00:00
Rob Bradford
88a9f79944 misc: Adapt consistent import style formatting
Historically the Cloud Hypervisor coding style has been to ensure that
all imports are ordered and placed in a single group. Unfortunately
cargo fmt has no support for ensuring that all imports are in a single
group so if whitespace lines were added as part of the import statements
then they would only be odered correctly in the group.

By adopting "group_imports="StdExternalCrate" we can enforce a style
where imports are placed in at most three groups for std, external
crates and the crate itself. Choosing a style enforceable by the tooling
reduces the reviewer burden.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-09-29 13:08:12 +01:00
Bo Chen
be5fc769e9 tests: Relax rate limit check for group rate limiter
The rate-limiter worker now is running on Azure VMs whose performance
are more prone to be fluctuate, particularly on block performances. This
commit relaxes the limit check for group rate limiter tests to make them
less flaky.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2024-09-21 08:13:32 +00:00
Praveen K Paladugu
b9f086bcb3 tests: drop landlock parameter while starting dest
After moving landlock config to VMConfig, there is no need to start
destination VM with landlock cmdline options in
test_live_migration_with_landlock test.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
2024-08-05 17:46:30 +00:00
Praveen K Paladugu
8452edfcc7 tests: Test live migration with Landlock
Add a test case to check Live Migration with Landlock support.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
2024-07-06 04:42:58 +00:00
Praveen K Paladugu
466cc5e043 tests: Add disk_hotplug test with Landlock
Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
2024-07-06 04:42:58 +00:00
Praveen K Paladugu
034c674c4c tests: Add a basic Landlock test
Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
2024-07-06 04:42:58 +00:00
Songqian Li
544de7d000 tests: send SIGTERM to kill GuestCommand
Killing CLH by SIGKILL will cause inaccurate code coverage
information. This patch changes the signal to SIGTERM.

Fixes: #6507

Signed-off-by: Songqian Li <sionli@tencent.com>
2024-06-18 08:03:09 +00:00
Josh Soref
42e9632c53 misc: Fix spelling issues
Misspellings were identified by:
  https://github.com/marketplace/actions/check-spelling

* Initial corrections based on forbidden patterns from the action
* Additional corrections by Google Chrome auto-suggest
* Some manual corrections
* Adding markdown bullets to readme credits section

Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2024-06-08 16:31:30 +00:00
Songqian Li
77c80f67ed tests: add snapshot_restore_pvpanic test
Add test_snapshot_restore_pvpanic to integration tests since pvpanic
 already supports snapshot.

Signed-off-by: Songqian Li <sionli@tencent.com>
2024-05-30 17:07:12 +00:00
Songqian Li
de38e4b777 tests: restart systemd daemon to reload systemd configuration
Currently, the watchdog configuration in systemd is reloaded by
reboot. We can cancel this reboot operation by restarting the
systemd daemon process.

Signed-off-by: Songqian Li <sionli@tencent.com>
2024-05-15 11:17:42 +00:00
Purna Pavan Chandra
fe86f90ba6 tests: add back tests_snapshot_restore* but to common_sequential
tests_snapshot_restore* have been earlier removed from common_parallel
due to the falkiness they add testsuite. Running them sequentially would
eliminate the flakiness. Hence, add the tests back to testsuite but into
common_sequential module.

Signed-off-by: Purna Pavan Chandra <paekkaladevi@linux.microsoft.com>
2024-05-14 10:52:46 +00:00
Purna Pavan Chandra
3769b18a70 tests: remove test_snapshot_restore* tests from common_parallel
test_snapshot_restore_* tests often have transient failures and add to
overall flakiness of the integration testsuite. Hence, remove them from
common_parallel. However, these tests need to be added back to
common_sequential

Signed-off-by: Purna Pavan Chandra <paekkaladevi@linux.microsoft.com>
2024-05-14 10:52:46 +00:00
Purna Pavan Chandra
6d74393833 tests: Add test_snapshot_restore_with_fd to integration tests
VM is created with FDs explicitly passed to CH via --net parameter
and snapshotted. New net FDs are passed in turn during restore.
Boilerplate code from _test_snapshot_restore().

Signed-off-by: Purna Pavan Chandra <paekkaladevi@linux.microsoft.com>
2024-05-14 10:52:46 +00:00
Rob Bradford
622a1a3735 tests: Serialise the live migration/upgrade NUMA/balloon tests
These tests use relatively large memory allocations and if they are
allowed to run in parallel can result in the container encountering an
OOM situation.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-05-04 10:06:04 +00:00
Rob Bradford
f5abb168e3 tests: Re-enable live upgrade tests
And bump release verion used to v39.0

Fixes: #6134

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-05-04 10:06:04 +00:00
Purna Pavan Chandra
de1f055481 tests: Reduce Guest memory in _test_snapshot_restore
Change from default 4G to 2G to reduce the memory usage. This will help
avoid "No disk space left" errors when all snapshot tests run at once.

Signed-off-by: Purna Pavan Chandra <paekkaladevi@linux.microsoft.com>
2024-04-30 07:36:59 +00:00
Purna Pavan Chandra
1a94c6c932 tests: clean up snapshot dir after restore
The snapshot_dir is not needed anymore fter the guest is restored.
Hence, remove it. This will potentially avoid transient failures caused
due to insufficient disk space.

Signed-off-by: Purna Pavan Chandra <paekkaladevi@linux.microsoft.com>
2024-04-30 07:36:59 +00:00
Yi Wang
f1e4a82fd3 tests: Add test for nmi
Kernel can panic when get a unknown nmi when set unknown_nmi_panic,
we can use this feature to test if nmi injected to guest.

Signed-off-by: Yi Wang <foxywang@tencent.com>
2024-03-04 10:02:38 +00:00
Bo Chen
0718067851 tests: Fix test_snapshot_restore_hotplug_virtiomem on 16 cores VM
It takes longer time to restore a VM on a VM with 16 cores comparing
with ones with 64 cores.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2024-03-01 07:24:14 +00:00
Rob Bradford
6930370a03 tests: Remove download of unused bionic image for aarch64
The bionic image was being downloaded and converted but no test uses
this image any longer.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-02-22 12:28:40 +00:00
Stefan Nuernberger
d8cd403c5d arch: x86_64: integration test for bzImage boot
Signed-off-by: Stefan Nuernberger <stefan.nuernberger@cyberus-technology.de>
2024-02-19 17:07:50 +00:00
acarp
035c4b20fb block: Set an option to pin virtio block threads to host cpus
Currently the only way to set the affinity for virtio block threads is
to boot the VM, search for the tid of each of the virtio block threads,
then set the affinity manually. This commit adds an option to pin virtio
block queues to specific host cpus (similar to pinning vcpus to host
cpus). A queue_affinity option has been added to the disk flag in
the cli to specify a mapping of queue indices to host cpus.

Signed-off-by: acarp <acarp@crusoeenergy.com>
2024-02-13 09:05:57 +00:00
Bo Chen
36890373cd tests: Avoid clippy warning of unhandled I/O bytes
Fixes beta clippy issue:

error: read amount is not handled
    --> tests/integration.rs:2121:15
     |
2121 |         match pty.read(&mut buf) {
     |               ^^^^^^^^^^^^^^^^^^
     |
     = help: use `Read::read_exact` instead, or handle partial reads
note: the result is consumed here, but the amount of I/O bytes remains unhandled
    --> tests/integration.rs:2122:13
     |
2122 | /             Ok(_) => {
2123 | |                 let output = std::str::from_utf8(&buf).unwrap().to_string();
2124 | |                 match tx.send(output) {
2125 | |                     Ok(_) => (),
2126 | |                     Err(_) => break,
2127 | |                 }
2128 | |             }
     | |_____________^
     = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#unused_io_amount
     = note: `#[deny(clippy::unused_io_amount)]` on by default

Signed-off-by: Bo Chen <chen.bo@intel.com>
2024-02-07 09:25:40 +00:00
Rob Bradford
61afd93a50 tests: Remove unnecessary use of vec![] macro
Beta clippy fix

warning: useless use of `vec!`
    --> tests/integration.rs:5845:23
     |
5845 |         let kernels = vec![direct_kernel_boot_path()];
     |                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: you can use an array directly: `[direct_kernel_boot_path()]`
     |
     = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#useless_vec
     = note: `#[warn(clippy::useless_vec)]` on by default

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2024-02-07 09:25:40 +00:00
Bo Chen
3ce0fef7fd build: Bump vmm-sys-util crate and its consumers
This patch bumps the following crates, including `kvm-bindings@0.7.0`*,
`kvm-ioctls@0.16.0`**, `linux-loader@0.11.0`, `versionize@0.2.0`,
`versionize_derive@0.1.6`***, `vhost@0.10.0`,
`vhost-user-backend@0.13.1`, `virtio-queue@0.11.0`, `vm-memory@0.14.0`,
`vmm-sys-util@0.12.1`, and the latest of `vfio-bindings`, `vfio-ioctls`,
`mshv-bindings`,`mshv-ioctls`, and `vfio-user`.

* A fork of the `kvm-bindings` crate is being used to support
serialization of various structs for migration [1]. Also, code changes
are made to accommodate the updated `struct xsave` from the Linux
kernel. Note: these changes related to `struct xsave` break
live-upgrade.

** The new `kvm-ioctls` crate introduced breaking changes for
the `get/set_one_reg` API on `aarch64` [2], so code changes are made to
the new APIs.

*** A fork of the `versionize_derive` crate is being used to support
versionize on packed structs [3].

[1] https://github.com/cloud-hypervisor/kvm-bindings/tree/ch-v0.7.0
[2] https://github.com/rust-vmm/kvm-ioctls/pull/223
[3] https://github.com/cloud-hypervisor/versionize_derive/tree/ch-0.1.6

Fixes: #6072

Signed-off-by: Bo Chen <chen.bo@intel.com>
2024-01-25 10:14:54 +00:00
Thomas Barrett
9e9bcd24c6 tests: add rate-limit group integration tests
Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>
2024-01-03 10:21:06 -08:00
Yi Wang
ee2f0c3cb4 build: fix clippy Path::join issue
CI reports clippy errors:

error: argument to `Path::join` starts with a path separator
    --> tests/integration.rs:4076:58
     |
4076 |         let serial_socket = guest.tmp_dir.as_path().join("/tmp/serial.socket");
     |                                                          ^^^^^^^^^^^^^^^^^^^^
     |
     = note: joining a path starting with separator will replace the path instead
     = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#join_absolute_paths

Signed-off-by: Yi Wang <foxywang@tencent.com>
2024-01-02 08:58:11 +00:00
Thomas Barrett
5c0b66529a arch: x86_64: handle npot CPU topology
This PR addresses a bug in which the cpu topology of a guest
with non power-of-two number of cores is incorrect. For example,
in some contexts, a virtual machine with 2-sockets and 12-cores
will incorrectly believe that 16 cores are on socket 1 and 8
cores are on socket 2. In other cases, common topology enumeration
software such as hwloc will crash.

The root of the problem was the way that cloud-hypervisor generates
apic_id. On x86_64, the (x2) apic_id embeds information about cpu
topology. The cpuid instruction is primarily used to discover the
number of sockets, dies, cores, threads, etc. Using this information,
the (x2) apic_id is masked to determine which {core, die, socket} the
cpu is on. When the cpu topology is not a power of two
(e.g. a 12-core machine), this requires non-contiguous (x2) apic_id.

Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>
2024-01-01 10:05:03 +00:00
Ravi kumar Veeramally
24f384d239 tests: Migrate docker container from ubuntu 20.04 to 22.04
The following tests have been temporarily disabled:

1. Live upgrade/migration test with ovs-dpdk (#5532);
2. Disk hotplug tests on windows guests (#6037);

This patch has been tested with PR #6048.

Signed-off-by: Ravi kumar Veeramally <ravikumar.veeramally@intel.com>
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
Tested-by: Bo Chen <chen.bo@intel.com>
2023-12-20 12:12:05 -08:00
Bo Chen
602d704558 tests: Stabilize 'test_vfio_user' with retries to run host commands
The 'test_vfio_user' is prone to fail when the system is under high
workloads with errors:

```
Error while connecting to /var/tmp/spdk.sock
Is SPDK application running?
Error details: Invalid or non-existing address: '/var/tmp/spdk.sock'
```

This is because SPDK is not fully functional before we request to
create a nvme device using the vfio_user protocol. This patch stabilize
this test with allowing retires to execute host commands.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-12-14 07:12:04 -08:00
Bo Chen
c0faa75922 tests: Print ExitStatus for 'test_serial_socket_interaction'
This test has been failing fairly often on the AMD worker. Let's
collect more log for debugging.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-12-08 11:26:33 +00:00
Bo Chen
5d411d257a tests: Stabilize snapshot_restore tests
See: #5938

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-12-08 11:26:33 +00:00
Ruslan Mstoi
800f971381 tests: test_watchdog: use event monitor
See: #5127

Signed-off-by: Ruslan Mstoi <ruslan.mstoi@intel.com>
2023-11-24 21:37:25 +00:00
Ruslan Mstoi
341a4558fe tests: test_vdpa_block: fix false positive
Running on host where vdpa_sim_blk module is not correctly loaded
test_vdpa_block passes.

"test common_parallel::test_vdpa_block ... ok"

This commit fixes the vdpa_sim_blk test to fail in that case.

Signed-off-by: Ruslan Mstoi <ruslan.mstoi@intel.com>
2023-11-24 21:37:06 +00:00
Bo Chen
4d80be3a04 tests: Enable live-upgrade tests based on release v36.0
These live-upgrade tests were disabled due to CLI changes (#5791).

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-11-17 08:43:13 +00:00
Bo Chen
de2fcc2d87 tests: Stabilize snapshot_restore tests
Since the 'write()' to the event file was moved to its own thread
(see #5633), we have no reliable way to read the latest contents of
the event file from our integration tests, since we can't ensure the
'read()' from our test always happen after 'write()' is completed from
Cloud Hypervisor. This is also why we started to see random failures on
snapshot_restore tests (particularly when the system workload is high).

This patch adds a 1s sleep before reading the event file to mitigate the
random failures.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-11-14 09:19:25 +00:00
Bo Chen
62db13ba0e tests: Temporarily disable vhost_user_blk tests on aarch64
See: #5934

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-11-09 07:57:44 +00:00
Bo Chen
1be40e2339 tests: Improve debuggability for "test_vfio"
Instead of relying on "wc" and "grep", this patch provides helper
functions for checking line counts and searching/counting keywords.
To understand assertion failures better, it also generate logs for the
L1/L2 VM commands when checks fail.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-11-06 08:50:58 -08:00
Bo Chen
bc04e75b4b test_infra, tests: Unify error message formatting
Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-11-06 08:50:58 -08:00
Bo Chen
5976a37cf4 tests: Print details when checks on event monitor failed
Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-11-06 08:50:58 -08:00
Yong He
d1ba50f10e tests: Add a test simultaneously set serial and console as TTY mode
Add a test that supports configuring serial and console as TTY mode
at the same time. With this configuration, the VM can set up a legacy
serial device as an early printk console device, and then change to a
virito console device after the virito console device is initialized.

In this case, we can capture the logs printed by legacy serial on early
boot, and later by the virtio console.

Signed-off-by: Yong He <alexyonghe@tencent.com>
2023-11-02 11:06:30 -07:00
Bo Chen
4afd8d96f9 tests: Remove "test_vfio" from the bare-metal worker
With #4324 being resolved, the nested VFIO test (e.g. "test_vfio") is
now a part of the general Azure VM-based workers. No need to run it on
the bare-metal worker.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-11-01 15:00:41 +00:00