This function starts the 'receive-migration' for the destination VM,
'send-migration' for the source VM, waits for the live-migration
completion, and prints debug information upon errors.
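A minimal sketch of the ordering described above, assuming the commands
are issued through `ch-remote` with its `--api-socket` option and the
`receive-migration` / `send-migration` subcommands. The helper name, the
socket paths, and the `unix:` migration URL are hypothetical and not
taken from the test code.

```rust
/// Build the two `ch-remote` argument lists in the order they are
/// started: the destination listens first, then the source sends.
/// All paths are illustrative placeholders.
fn migration_commands(
    src_api_socket: &str,
    dest_api_socket: &str,
    migration_socket: &str,
) -> [Vec<String>; 2] {
    let url = format!("unix:{}", migration_socket);
    [
        vec![
            format!("--api-socket={}", dest_api_socket),
            "receive-migration".to_string(),
            url.clone(),
        ],
        vec![
            format!("--api-socket={}", src_api_socket),
            "send-migration".to_string(),
            url,
        ],
    ]
}
```

Each list could then be handed to `std::process::Command::new("ch-remote").args(...)`;
waiting for completion and printing debug output are left out of this sketch.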
Signed-off-by: Bo Chen <chen.bo@intel.com>
This patch moves the actual test logic and assertions from various
functions to the actual tests, which makes these tests more readable and
easier to debug.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Until the issue #4583 is resolved, we must disable this test given it's
failing quite often on the aarch64 worker.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
As 'handle_child_output()' may terminate the test on panic, we need to
clean up the ovs-dpdk setup in advance.
See: #4555
Signed-off-by: Bo Chen <chen.bo@intel.com>
Following our recent v26.0 release we can re-enable our live upgrade
tests to try and make it possible for us to move to making LTS releases.
Currently limited to x86-64 as the live upgrade tests fail on aarch64.
Fixes: #3949
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Only the ovs-dpdk live-migration tests need to run sequentially as they
use the same ovs-dpdk setup.
This is to reduce our CI time, particularly for the live-migration
and aarch64 jobs.
Signed-off-by: Bo Chen <chen.bo@intel.com>
This enables the Windows test module. One basic test is enabled,
while all others remain disabled for aarch64. The Jenkinsfile is
extended with the corresponding step for aarch64.
installAzureCli() is parametrized.
It seems that transferring a 30GB image would take >= 15 minutes. An
optimization here is to gzip the image down to 10GB, which would unpack
in 3 minutes. This is expected to be quicker than transferring an
uncompressed image while on another network.
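The estimate can be checked with a back-of-the-envelope calculation. This
sketch assumes transfer time scales linearly with image size, which is an
assumption; the message only gives the 30GB, 10GB, 15 minute and 3 minute
figures. The function name is hypothetical.

```rust
/// Estimated minutes to fetch and unpack a compressed image, given how
/// long the uncompressed image takes to transfer over the same link.
fn compressed_total_minutes(
    uncompressed_minutes: u64,
    uncompressed_gb: u64,
    compressed_gb: u64,
    unpack_minutes: u64,
) -> u64 {
    // Transfer time shrinks in proportion to size, then unpacking is added.
    uncompressed_minutes * compressed_gb / uncompressed_gb + unpack_minutes
}
```

With the figures above, `compressed_total_minutes(15, 30, 10, 3)` gives 8
minutes against 15 for the uncompressed transfer.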
Signed-off-by: Anatol Belski <ab@php.net>
The test test_virtio_block_topology has been recently failing due to an
error happening in losetup while trying to set the block size. Since
there's no option in losetup for retrying, we took the approach of
programming the expected behavior of creating a loop device relying
directly on the system ioctl LOOP_CONFIGURE. We apply a retry loop based
on the result returned by this ioctl, so that we don't fail on the first
try. We also added a sleep before retrying, hoping this would help the
next iteration to succeed.
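A minimal sketch of such a retry loop. The `LOOP_CONFIGURE` ioctl call
itself is not shown; the closure `op` stands in for it, and the helper
name, attempt count and delay are hypothetical rather than the values
used by the test.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Call `op` at least once and at most `max_attempts` times, sleeping
/// `delay` after each failed attempt. Returns the first success or the
/// error from the final attempt.
fn retry_with_delay<T, E>(
    max_attempts: u32,
    delay: Duration,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => {
                attempt += 1;
                sleep(delay);
            }
        }
    }
}
```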
Fixes: #3494
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
It's been observed on the Bionic image that udev and snapd services can
cause some delay in the VM's shutdown. Disabling them before shutting
down the VM improves the reliability of the test.
Also slightly increase the sleep time to ensure we give the VM enough
time to shut down before checking the list of events provided by the
event monitor.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Since it's not possible to run the integration test test_vfio on Azure
at the moment (because of some nested virtualization issues), we can
temporarily run it on the baremetal CI where we already run some VFIO
tests.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Move the live migration tests to a 'jammy' worker rather than
'jammy-small'. This type of worker has more CPUs (64 vs 16) and more RAM
(256G vs 64G), which should improve the time it takes to run each test.
With this improvement, the test shouldn't fail anymore due to timeout
being reached.
A second improvement is to reduce the amount of vCPUs created for each
VM. The point is simply to check we can migrate a VM with multiple
vCPUs, therefore using 2 instead of 6 should be enough when possible.
When testing NUMA, we can't lower the amount of vCPUs since there's a
quite complex topology that is expected there.
Also, the total amount of vCPUs is reduced from 12 to 4 (again when not
testing with NUMA).
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Improve error catching on the steps creating the block device so that we
can understand if qemu-img or losetup is the faulty command leading to
an empty device path.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Both of these tests have been sporadically failing through multiple CI
runs. The reason is related to cloud-init which fails to run the
"init-local" script during the second boot of the VM. This causes the
network interface to not be available, and therefore the test can't SSH
into the VM as expected. The root cause is the filesystem and cache
corruption that happens on the cloud-init disk.
The way to prevent this issue is to sync the guest filesystem
before we shut it down, and as a safety net, we also wait for a
few seconds for the shutdown command to complete inside the guest before
we trigger the API shutdown or delete.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
It might sometimes take a few seconds for the guest to trigger the OOM
and report it back to the host. That's why this patch adds some sleep
time between the command in the guest supposedly triggering the OOM and
the check of the balloon size from the host.
Fixes: #4336
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
warning: you are deriving `PartialEq` and can implement `Eq`
--> vmm/src/serial_manager.rs:59:30
|
59 | #[derive(Debug, Clone, Copy, PartialEq)]
| ^^^^^^^^^ help: consider deriving `Eq` as well: `PartialEq, Eq`
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#derive_partial_eq_without_eq
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The coredump function produces a vmcore for the crash tool to analyze.
To avoid adding something heavy to the integration tests, we only
check that the ch-remote command runs without reporting an error.
Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
From the logs it appears that booting the VM to the point at which it
can signal to the host can sometimes take longer than the 30 seconds
specified.
Fixes: #4136
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The current patch fixes the following error that was raised by clippy:
error: this let-binding has unit value
--> tests/integration.rs:6538:13
|
6538 | / let _ = stdin
6539 | | .write_all("type=7".as_bytes())
6540 | | .expect("failed to write stdin");
| |_________________________________________________^
|
= note: `-D clippy::let-unit-value` implied by `-D warnings`
= help: for further information visit
https://rust-lang.github.io/rust-clippy/master/index.html#let_unit_value
help: omit the `let` binding
|
6538 ~ stdin
6539 + .write_all("type=7".as_bytes())
6540 + .expect("failed to write stdin");
|
error: could not compile `cloud-hypervisor` due to previous error
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In this way, we can cover a broad range of events from the event monitor
while avoiding code duplication.
Fixes: #4054
Signed-off-by: Bo Chen <chen.bo@intel.com>
This prevents a conflict since the old API socket will not have been
cleaned up (due to the use of SIGKILL.)
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Live upgrade is currently not guaranteed during this development cycle
and we will try to enable these tests after the next release.
Signed-off-by: Bo Chen <chen.bo@intel.com>
By augmenting the existing set of tests, this patch adds a set of
tests for live-upgrade that cover use cases with NUMA,
vhost-user (OVS-DPDK), and local-migration.
Fixes: #3949
Signed-off-by: Bo Chen <chen.bo@intel.com>
We modified a test case to work around the RAM calculation error caused
by hiding 4MiB of memory for UEFI. Now change it back to normal.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
Now that address translations performed by virtio-iommu can error out if
the address can't be translated, we uncovered an issue in integration
test aarch64_acpi::test_virtio_iommu.
We disable the test until we can investigate and fix the root cause.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The extra vDPA device in the test is hotplugged behind the vIOMMU, which
covers the use case of placing a vDPA device behind a virtual IOMMU.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
As reported by the periodic CI runs, it may take more time for the NVMe
device to appear in the guest after being hotplugged as a VFIO user
device on `aarch64` (especially under high load). Let's increase the
timeout after device hotplug from `1s` to `10s` to increase the test
stability.
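The change above simply lengthens a fixed wait. For comparison, an
alternative that this commit does not use is to poll for the device up to
a deadline, so the common case returns early. A minimal sketch, with a
hypothetical helper name and a caller-supplied readiness check:

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Poll `ready` every `interval` until it returns true or `timeout`
/// has elapsed. Returns whether the condition was met in time.
fn wait_until(timeout: Duration, interval: Duration, mut ready: impl FnMut() -> bool) -> bool {
    let deadline = Instant::now() + timeout;
    loop {
        if ready() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        sleep(interval);
    }
}
```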
Fixes: #3495
Signed-off-by: Bo Chen <chen.bo@intel.com>
Compile this feature in by default as it's well supported on both
aarch64 and x86_64 and we only officially support using it (no non-acpi
binaries are available.)
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
It seems the vdpa_sim_block isn't behaving properly after the vhost
device is closed, as it sometimes returns EBUSY when we try to open it
again. The easiest way to deal with this issue is to simplify the
integration test and avoid plugging the same device again after it's
been unplugged.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Disable the DAX feature from the virtio-fs implementation as the feature
is still not stable. The feature is deprecated, meaning the 'dax'
parameter will be removed in about 2 release cycles.
In the meantime, the parameter value is ignored and forced to be
disabled.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The test is sporadically failing whenever we try to hotplug the vDPA
device we've just unplugged. This is causing the kernel to complain with
EBUSY because the device hasn't been released yet. This is happening
because the CI system is under very high load, so the host takes quite
some time to update the state of this device.
The easy way to fix this issue is to increase the sleep time between
the unplug and the replug.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Ensure devices that are specified to be on a PCI segment that is behind
the IOMMU are IOMMU enabled if possible or error out for those devices
that do not support it.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Adding two new integration tests for vDPA, relying on both block and net
simulators from the host kernel.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In this way, we can cover local-migration with dpdk in our regular CI,
to prevent regressions similar to the one reported and fixed by #3657.
Fixes: #3659
Signed-off-by: Bo Chen <chen.bo@intel.com>
Introducing a new integration test relying on the virtio-balloon ability
to free host pages that have been reported as freed by the guest.
This test checks that after consuming a lot of RAM in the guest, the VMM
process is able to release the pages reported by the guest. This is done
by checking that the RSS associated with the VMM's process follows the
memory trend in the guest.
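One way to read a process's RSS on Linux is the `VmRSS` field of
`/proc/<pid>/status`. The sketch below parses that field from the file's
text; whether the test obtains the RSS this way is an assumption, and the
function name is hypothetical.

```rust
/// Extract the resident set size in KiB from the contents of
/// `/proc/<pid>/status`, or `None` if the `VmRSS` field is absent.
fn parse_vm_rss_kib(status: &str) -> Option<u64> {
    status
        .lines()
        // The line looks like "VmRSS:\t  204800 kB".
        .find_map(|line| line.strip_prefix("VmRSS:"))
        .and_then(|rest| rest.split_whitespace().next())
        .and_then(|kib| kib.parse().ok())
}
```

A test could sample this value before and after the guest frees memory
and assert that the second reading is lower.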
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In this way, we can reuse the struct `Guest` with kernel paths and
kernel commands (e.g. hardcoded constants) that are test-specific.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Instead of using hardcoded firmware paths inside the `Guest` struct
constructor, this commit removes `fw_path` related code paths from the
`Guest` struct and asks each test to construct its firmware path
explicitly. This allows better flexibility for the `Guest` struct so
that it can be reused for the performance tests we are adding soon.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Now that multiple file descriptors can be provided along with add-net,
that means we can hotplug a multiqueue macvtap interface to the VM.
The common macvtap test is updated, meaning that both coldplug and
hotplug code paths now use multiqueue.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Add integration tests for "pmu=on". They check whether there is an
"arm-pmu" entry in "/proc/interrupts". As PMU info has not been added
to ACPI, the tests are only for dt.
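A minimal sketch of that check, operating on the text of
"/proc/interrupts" as read from the guest. The function name is
hypothetical, and how the test fetches the file (e.g. over SSH) is not
shown.

```rust
/// Return true if any line of `/proc/interrupts` mentions "arm-pmu".
fn has_arm_pmu_interrupt(proc_interrupts: &str) -> bool {
    proc_interrupts
        .lines()
        .any(|line| line.contains("arm-pmu"))
}
```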
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
As it might take more time for the VM to boot (especially under high
load) when using the firmware, let's increase the timeout waiting for
the VM to be reachable.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Both OVMF and RHF firmwares triggered an error when O_DIRECT was used
because they didn't align the buffers to the block sector size.
In order to prevent regressions, we're adding a new test validating the
VM can properly boot when the OS disk is opened with O_DIRECT and booted
from the rust-hypervisor-fw.
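For context, O_DIRECT I/O on Linux generally requires the buffer address,
the transfer length and the file offset to all be multiples of the
device's logical block size. The sketch below expresses that condition;
the function name is hypothetical and this is not code from either
firmware or from the VMM.

```rust
/// Check the usual O_DIRECT alignment requirements for one transfer.
/// `sector_size` is the logical block size (commonly 512 or 4096).
fn is_direct_io_ready(buf_addr: usize, len: usize, offset: u64, sector_size: usize) -> bool {
    // Checking for a power of two first also rules out a zero divisor.
    sector_size.is_power_of_two()
        && buf_addr % sector_size == 0
        && len % sector_size == 0
        && offset % sector_size as u64 == 0
}
```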
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Update documentation and CI to rely on the new CLOUDHV.fd firmware built
from the newly introduced target CloudHvX64.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
error: this boolean expression can be simplified
--> tests/integration.rs:3755:33
|
3755 | assert!(!(empty > 5), "No login on pty");
| ^^^^^^^^^^^^ help: try: `empty <= 5`
|
= note: `-D clippy::nonminimal-bool` implied by `-D warnings`
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#nonminimal_bool
error: unneeded late initalization
--> tests/integration.rs:7619:13
|
7619 | let mut success;
| ^^^^^^^^^^^^^^^^
|
= note: `-D clippy::needless-late-init` implied by `-D warnings`
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_late_init
help: declare `success` here
|
7621 | let mut success = if let Some(status) = send_migration
| +++++++++++++++++
help: remove the assignments from the branches
|
7625 ~ status.success()
7626 | } else {
7627 ~ false
|
help: add a semicolon after the `if` expression
|
7628 | };
| +
error: unneeded late initalization
--> tests/integration.rs:7838:13
|
7838 | let mut success;
| ^^^^^^^^^^^^^^^^
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_late_init
help: declare `success` here
|
7840 | let mut success = if let Some(status) = send_migration
| +++++++++++++++++
help: remove the assignments from the branches
|
7844 ~ status.success()
7845 | } else {
7846 ~ false
|
help: add a semicolon after the `if` expression
|
7847 | };
| +
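The `needless_late_init` suggestions above are spread over several
fragments; put together, the fix looks roughly like the sketch below.
`FakeStatus` is a hypothetical stand-in for the real exit status type,
and the function is illustrative rather than the code in
tests/integration.rs.

```rust
/// Stand-in for `std::process::ExitStatus`, which cannot be constructed
/// by hand; purely illustrative.
struct FakeStatus {
    ok: bool,
}

impl FakeStatus {
    fn success(&self) -> bool {
        self.ok
    }
}

fn migration_succeeded(send_migration: Option<FakeStatus>) -> bool {
    // Before (flagged by clippy):
    //     let mut success;
    //     if let Some(status) = send_migration { success = status.success(); }
    //     else { success = false; }
    // After: `if let` is an expression, so it initialises the binding directly.
    let success = if let Some(status) = send_migration {
        status.success()
    } else {
        false
    };
    if !success {
        eprintln!("send-migration did not report success");
    }
    success
}
```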
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
When enabling the `mshv` feature, we skip quite some tests and
hence have known dead-code. This annotation silences dead-code
related warnings for our quality workflow to pass.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Given integration tests are placed in a dedicated directory, they don't
need annotations (e.g. `#[cfg(integration_test)]` and `#[cfg(test)]`) or
defining `test mod` to exclude themselves from the common compilation
process.
Signed-off-by: Bo Chen <chen.bo@intel.com>
The test test_virtio_block_topology is flaky on aarch64, so let's disable
it until we find the right way to fix it.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This test relies on using losetup with a block size to create a block
device from a file that has a specific block size for the topology
detection code to pick up and pass through to the guest.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This time we use the Rust Hypervisor Firmware for test_vfio_user() in
order to fix the systemd issues we've seen so far.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This commit enhances the integration tests for multiple PCI segments
by:
(1) Enabling `test_virtio_fs_multi_segment` on AArch64.
(2) Adding a new integration test case for both x86_64 and AArch64 using
direct kernel boot to test virtio-disk with multiple PCI segments.
The test case does:
- Start a VM using direct kernel boot with 16 PCI segments and assign
the last PCI segment with a virtio-disk device.
- Check if the number of PCI host bridges equals 16 after the VM boots.
- Mount the virtio-disk device on the last PCI segment to the rootfs
and write/read data to the virtio-disk device.
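The host bridge check in the steps above can be sketched as follows,
assuming the guest's `lspci` output is available as text and that each
host bridge appears on a line containing "Host bridge". Both the
function name and that way of counting are assumptions, not taken from
the test.

```rust
/// Count the PCI host bridges listed in `lspci`-style output.
fn count_host_bridges(lspci_output: &str) -> usize {
    lspci_output
        .lines()
        .filter(|line| line.contains("Host bridge"))
        .count()
}
```

With 16 PCI segments the test would expect this to return 16, one host
bridge per segment.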
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
Extending the test_simple_launch() integration test to validate Cloud
Hypervisor boots correctly with both rust-hypervisor-fw and OVMF on
x86_64 platforms.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Bumping the OVMF binary version along with UEFI documentation to
reflect the latest set of patches on top of tianocore/edk2 'master'
branch, which can be found on the Cloud Hypervisor fork on 'ch' branch.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Commit ac25172176 bumps `virtiofsd-rs`, the Rust version of
virtiofsd, which causes a warning:
```
warning: use of deprecated parameter '--socket':
Please use the '--socket-path' option instead.
```
This commit updates the cmdline parameter accordingly.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
This test is flaky (#3400), and we are also hitting a bug when using the
latest SPDK/NVMe backend as a VFIO user device (#3401). Let's disable this
test until we fix the above two issues.
Signed-off-by: Bo Chen <chen.bo@intel.com>
For now we only enable the vfio-user test on x86_64 platform, as we have
a known hanging issue to resolve on the aarch64 platform.
Fixes: #3098
Signed-off-by: Bo Chen <chen.bo@intel.com>
This new integration test validates the vCPUs are running on the
expected set of CPUs on the host.
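One ingredient of such a check is parsing a Linux CPU list (the format
used by `Cpus_allowed_list` in `/proc/<pid>/status`, e.g. "0,2-3") so it
can be compared against the expected set. A minimal sketch with a
hypothetical function name; how the test actually queries each vCPU
thread's affinity is not stated here.

```rust
/// Parse a Linux CPU list such as "0,2-3" into sorted, de-duplicated
/// CPU ids. Returns `None` if any element is not a number or a range.
fn parse_cpu_list(list: &str) -> Option<Vec<usize>> {
    let mut cpus = Vec::new();
    for part in list.trim().split(',') {
        match part.split_once('-') {
            // A range "a-b" expands to every CPU from a to b inclusive.
            Some((start, end)) => {
                let (start, end): (usize, usize) = (start.parse().ok()?, end.parse().ok()?);
                cpus.extend(start..=end);
            }
            None => cpus.push(part.parse().ok()?),
        }
    }
    cpus.sort_unstable();
    cpus.dedup();
    Some(cpus)
}
```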
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The elements of a list should be using commas as the correct delimiter
now that it is supported. Deprecate use of colons as delimiter.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Refactor the existing virtio fs test to support controlling the PCI
segment the device should be added to and use this for a multiple
segment test.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Refactor the existing net hotplug test to support controlling the PCI
segment the device should be added to and use this for a multiple
segment test.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Refactor the existing pmem hotplug test to support controlling the PCI
segment the device should be added to and use this for a multiple
segment test.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
On AArch64, device hotplug can be enabled with ACPI. Therefore,
this commit enables the hotplug test case for following devices:
- PCI bar reprogramming
- virtio-disk
- virtio-net
- macvtap
- virtio-vsock
- virtio-pmem: Works with the latest reference kernel
- virtio-fs: Works with the latest reference kernel
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
Currently vfio and nested virtualization are not used on AArch64,
and SGX is an x86_64 only feature. Therefore this commit adds the
architecture gates for helper functions related to vfio, SGX, and
nested virtualization to mute warnings when building tests on the
AArch64 platform.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
Memory hotplug and virtio_balloon work on arm64 with:
- memory hotplug: An updated kernel using ACPI
- virtio balloon: `stress` installed in the cloud image
Therefore, we can enable test cases for them in the integration tests.
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
On MSHV some of the integration test cases are not supported yet
or still in progress. This patch disables all those test cases.
Signed-off-by: Muminul Islam <muislam@microsoft.com>
Adding some bits to the existing live migration test with NUMA in order
to properly validate virtio-mem works with live migration.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Refactored the test case `test_virtio_iommu` to adapt to different
architectures and to the choice between ACPI and FDT. In the case of
ACPI, a Focal image with a modified kernel is tested.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
Adding the snapshot/restore support along with migration as well,
allowing a VM with a virtio-balloon device attached to be properly
migrated.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
For AArch64, virtual IOMMU is currently only tested on FDT, not ACPI.
In the case of FDT, the behavior of IOMMU is a bit different from ACPI.
All the devices on the PCI bus will be attached to the virtual IOMMU,
except the virtio-iommu device itself. So these devices will all be
added to IOMMU groups, and appear in folder '/sys/kernel/iommu_groups/'.
The result is, on AArch64 IOMMU group '0' contains "0000:00:01.0" which
is the console device. But on X86, console device is not attached to
IOMMU. So the IOMMU group '0' contains "0000:00:02.0" which is the first
disk.
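The per-architecture expectation can be summarised in a small sketch. The
function names are hypothetical; the PCI addresses are the ones quoted
above and the sysfs layout is the standard
'/sys/kernel/iommu_groups/<group>/devices/<address>' one.

```rust
/// PCI address expected in IOMMU group 0 for a given architecture name
/// (as reported by `std::env::consts::ARCH`).
fn expected_group0_device(arch: &str) -> Option<&'static str> {
    match arch {
        "aarch64" => Some("0000:00:01.0"), // console device
        "x86_64" => Some("0000:00:02.0"),  // first disk
        _ => None,
    }
}

/// Guest sysfs path for a device inside an IOMMU group.
fn group_device_path(group: u32, pci_address: &str) -> String {
    format!("/sys/kernel/iommu_groups/{}/devices/{}", group, pci_address)
}
```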
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
This patch adds a separate function to launch two guest VMs and ensure
they are connected through ovs-dpdk, so that we can reuse this function
in other tests, e.g. the test for live-migration with ovs-dpdk.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Currently we need to test both device tree and ACPI on AArch64. As
the number of ACPI test cases is gradually increasing and expected
to increase in the future, it is better to extract all ACPI test
cases on AArch64 to a single module.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
This patch adds a separate function to perform common numa checks, so
that we can reuse this function in other tests, e.g. the test for
live-migration with numa.
Signed-off-by: Bo Chen <chen.bo@intel.com>
This test exercises the local live-migration between two Cloud
Hypervisor VMs on the same host. It ensures the following behaviors:
1. The source VM is up and functional (including various virtio-devices
are working properly);
2. The 'send-migration' and 'receive-migration' commands finish
successfully;
3. The source VM terminated gracefully after live migration;
4. The destination VM is functional (including various virtio-devices
are working properly) after live migration.
Note: This test does not use vsock as we can't create two identical
vsock on the same host.
Fixes: #2965
Signed-off-by: Bo Chen <chen.bo@intel.com>
This patch adds a dedicated function to include the common checks on the
virtio-devices from the 'test_snapshot_restore' test, which will also be
reused for the upcoming 'test_live_migration' test.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Two tests for booting Linux cloud image from the different VHDx files:
fixed and dynamic. Another test for testing the dynamic expansion of a
generated VHDx file.
Signed-off-by: Fazla Mehrab <akm.fazla.mehrab@intel.com>
This commit adds an AArch64-only integration test case called
`test_guest_numa_nodes_dt` so that it is possible to test the
NUMA for the FDT on AArch64 platform.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
AArch64 CPU topology can be described using either device tree or
ACPI. Therefore, the integration test should also cover the AArch64
ACPI CPU topology tests.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
Now that vhost-user supports being snapshot and restored, we extend the
existing test_ovs_dpdk to validate snapshot/restore feature works as
expected.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The existing test_macvtap is factorized to be able to support both
coldplug and hotplug of a macvtap interface through virtio-net. Adding
the new test_macvtap_hotplug test allows for validating that sending a
TAP file descriptor through control message along with the add-net
command works.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
To help with readability, we rely on exec_host_command_status() from the
macvtap test, which replaces the former "bash -c ..." syntax.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In order to uniquely identify each SGX EPC section, we introduce a
mandatory option `id` to the `--sgx-epc` parameter.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The output from getty ("login:") does not always appear. This can be
observed interactively when booting the VM. (Mashing return will bring
it up.) Instead of checking for that string to ensure the VM has booted,
check for a message from systemd saying it has started the SSH
daemon.
Fixes: #2799
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
AArch64 tests were divided into 2 steps:
- Build and test with 'acpi' feature
- Build and test without 'acpi'
This can be optimized. We need only to build and test once with default
features ('acpi' is enabled).
On AArch64, ACPI only works with UEFI. If UEFI is not available, the guest
kernel falls back to using FDT. Most AArch64 test cases boot from direct
kernel, so the guest will keep using FDT even if ACPI is enabled. So
nothing is broken.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>