It's been observed on the Bionic image that udev and snapd services can
cause some delay in the VM's shutdown. Disabling them before shutting
down the VM improves the reliability of the test.
Also increasing slightly the sleep time to ensure we give the VM enough
time to shutdown before checking the list of events provided by the
event monitor.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Since it's not possible to run the integration test test_vfio on Azure
at the moment (because of some nested virtualization issues), we can
temporarily run it on the baremetal CI where we already run some VFIO
tests.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Move the live migration tests to a 'jammy' worker rather than
'jammy-small'. This type of worker has more CPUs (64 vs 16) and more RAM
(256G vs 64G), which should improve the time it takes to run each test.
With this improvement, the test shouldn't fail anymore due to timeout
being reached.
A second improvement is to reduce the amount of vCPUs created for each
VM. The point is simply to check we can migrate a VM with multiple
vCPUs, therefore using 2 instead of 6 should be enough when possible.
When testing NUMA, we can't lower the amount of vCPUs since there's a
quite complex topology that is expected there.
Also, the total amount of vCPUs is reduced from 12 to 4 (again when not
testing with NUMA).
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Improve error catching on the steps creating the block device so that we
can understand if qemu-img or losetup is the faulty command leading to
an empty device path.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Both of these tests have been sporadically failing through multiple CI
runs. The reason is related to cloud-init which fails to run the
"init-local" script during the second boot of the VM. This causes the
network interface to not be available, and therefore the test can't SSH
into the VM as expected. The root cause is the filesystem and cache
corruption that happens on the cloud-init disk.
The way to prevent from this issue is to sync the guest filesystem
before we shut it down, and as a security harness, we also wait for a
few seconds for the shutdown command to complete inside the guest before
we trigger the API shutdown or delete.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
It might sometimes take a few seconds for the guest to trigger the OOM
and report it back to the host. That's why this patch adds some sleep
time between the command in the guest supposedly triggering the OOM and
the check of the balloon size from the host.
Fixes#4336
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
warning: you are deriving `PartialEq` and can implement `Eq`
--> vmm/src/serial_manager.rs:59:30
|
59 | #[derive(Debug, Clone, Copy, PartialEq)]
| ^^^^^^^^^ help: consider deriving `Eq` as well: `PartialEq, Eq`
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#derive_partial_eq_without_eq
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
As coredump function is to make a vmcore for crash tool to analyze,
in order not to introduce a big thing in integration, we just check
if ch-remote command runs no error report here.
Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
From the logs it appears that booting the VM to the point at which it
can signal to the host can sometimes take longer than then 30 seconds
specified.
Fixes: #4136
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The current patch fixes the following error that was raised by clippy:
error: this let-binding has unit value
--> tests/integration.rs:6538:13
|
6538 | / let _ = stdin
6539 | | .write_all("type=7".as_bytes())
6540 | | .expect("failed to write stdin");
| |_________________________________________________^
|
= note: `-D clippy::let-unit-value` implied by `-D warnings`
= help: for further information visit
https://rust-lang.github.io/rust-clippy/master/index.html#let_unit_value
help: omit the `let` binding
|
6538 ~ stdin
6539 + .write_all("type=7".as_bytes())
6540 + .expect("failed to write stdin");
|
error: could not compile `cloud-hypervisor` due to previous error
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In this way, we can cover a broad range of events from the event monitor
while avoiding code duplication.
Fixes: #4054
Signed-off-by: Bo Chen <chen.bo@intel.com>
This prevents a conflict since the old API socket will not have been
cleaned up (due to the use of SIGKILL.)
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Live upgrade is currently not guaranteed during this development cycle
and we will try to enable these tests after the next release.
Signed-off-by: Bo Chen <chen.bo@intel.com>
By augmenting existing set of tests, this patch added a set of
tests for live-upgrade that covers use cases with NUMA,
vhost-user (OVS-DPDK), and local-migration.
Fixes: #3949
Signed-off-by: Bo Chen <chen.bo@intel.com>
We modified a test case to workaround the RAM calculation error caused
by hidding 4MiB memory for UEFI. Now change it back to normal.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
Now that address translations performed by virtio-iommu can error out if
the address can't be translated, we uncovered an issue in integration
test aarch64_acpi::test_virtio_iommu.
We disable the test until we can investigate and fix the root cause.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The extra vDPA device in the test is hotplugged behind the vIOMMU, which
covers the use case of placing a vDPA device behind a virtual IOMMU.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
As reported by the periodic CI runs, it may take more time for the NVMe
device to present in the guest after being hotplugged as a VFIO user
device on `aarch64` (especially under high load). Let's increase the
timeout after device hotplug from `1s` to `10s` to increase the test
stability.
Fixes: #3495
Signed-off-by: Bo Chen <chen.bo@intel.com>
Compile this feature in by default as it's well supported on both
aarch64 and x86_64 and we only officially support using it (no non-acpi
binaries are available.)
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
It seems the vdpa_sim_block isn't behaving properly after the vhost
device is closed, as it sometimes returns EBUSY when we try to open it
again. The easiest way to deal with this issue is by simplifying the
integration test, avoid to plug the same device after it's been
unplugged.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>