Add an integration test for coredump that does not require pausing the VM.
Since the coredump file is already tested in test_coredump(), this
patch only tests the VM state after the coredump.
Signed-off-by: Yi Wang <foxywang@tencent.com>
This test case creates a new qcow2 file using the Ubuntu image as
its backing file, and boots a virtual machine with this image file.
Signed-off-by: Yu Li <liyu.yukiteru@bytedance.com>
test_cpu_affinity needs the number of host CPUs. Since it is possible
for the host to have more than 255 CPUs, increase the size of the
datatype used for parsing the string to accommodate this.
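As a minimal sketch of the widening described above (the helper name
`parse_host_cpu_count` is hypothetical and not taken from the test code),
parsing into `u16` instead of `u8` accepts CPU counts above 255:

```rust
/// Hypothetical helper: parse a CPU count from a string such as the output
/// of a command that prints the number of host CPUs.
/// A `u8` can only hold values up to 255, so `u16` is used here instead.
fn parse_host_cpu_count(s: &str) -> Option<u16> {
    // Trim the trailing newline that command output usually carries.
    s.trim().parse::<u16>().ok()
}
```

With a `u8`, the same input would fail to parse as soon as the host reports
256 CPUs or more.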
Signed-off-by: dom.song <dom.song@amperecomputing.com>
Add an integration test for pvpanic that checks the device in two ways:
- verifying the vendor ID and device ID of the PCI device in the guest
- triggering a guest panic and checking the event monitor.
Also, to support the pvpanic-pci driver, add the pvpanic config
to resources.
Signed-off-by: Yi Wang <foxywang@tencent.com>
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
Between musl and glibc there is a difference in the signature of the
ioctl libc function. Use an anonymous cast to force the type conversion.
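A minimal sketch of the idea, assuming the "anonymous cast" refers to Rust's
`as _` form. `fake_ioctl`, `Request` and `EXAMPLE_REQUEST` are hypothetical
stand-ins so the example stays self-contained; they are not the libc
definitions:

```rust
// Stand-in for a parameter type that differs between C libraries: a narrower
// integer when targeting musl, a wider one otherwise (assumed widths).
#[cfg(target_env = "musl")]
type Request = i32;
#[cfg(not(target_env = "musl"))]
type Request = u64;

// Hypothetical function with a platform-dependent signature.
fn fake_ioctl(_fd: i32, request: Request) -> Request {
    request
}

// Illustrative request number only.
const EXAMPLE_REQUEST: u32 = 0x1234;

fn call_with_inferred_cast() -> Request {
    // `as _` lets the compiler infer the target type from the signature,
    // so the same call site compiles against either definition.
    fake_ioctl(0, EXAMPLE_REQUEST as _)
}
```

The call site never names the parameter type, which is what keeps it portable
across the two signatures.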
Signed-off-by: Ravi kumar Veeramally <ravikumar.veeramally@intel.com>
The test was failing due to a regression caused by commit
d5558aea2a.
Failing command:
sudo /mnt/ch-remote --api-socket /tmp/ch_api.sock resize --memory=1073741824
Fixes: #5190
Signed-off-by: Ruslan Mstoi <ruslan.mstoi@intel.com>
In this way, our integration tests exercise the same set of build
features (e.g. "kvm,mshv") being used for releases.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Implemented a `TargetApi` enum to make the process of implementing
tests for the D-Bus and HTTP API more convenient.
Refactored `test_api_{create_boot, shutdown, pause_resume, delete}` tests
with the `TargetApi` enum to also implement tests for the D-Bus API.
Added a new test named `test_api_dbus_and_http_interleaved` that uses
both the HTTP and D-Bus API at the same time.
Modified integration test scripts to enable the `dbus_api` feature when
compiling and start a dbus-session when integration tests are run.
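For illustration only, an API selector of this kind could look roughly like
the sketch below. The variant names, fields and the `run_against_all` helper
are assumptions made for this example and are not the definitions used by the
tests:

```rust
/// Hypothetical shape of an API selector (names and fields are assumed).
#[derive(Debug, Clone, PartialEq, Eq)]
enum TargetApi {
    Http { socket_path: String },
    DBus { service_name: String, object_path: String },
}

impl TargetApi {
    /// Short label, handy when logging which API a shared test body uses.
    fn label(&self) -> &'static str {
        match self {
            TargetApi::Http { .. } => "http",
            TargetApi::DBus { .. } => "dbus",
        }
    }
}

/// Run the same test body once per API and return the labels exercised.
fn run_against_all<F: FnMut(&TargetApi)>(
    targets: &[TargetApi],
    mut body: F,
) -> Vec<&'static str> {
    targets
        .iter()
        .map(|target| {
            body(target);
            target.label()
        })
        .collect()
}
```

The point of such a type is that one test body can be written once and run
against each API in turn.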
Signed-off-by: Omer Faruk Bayram <omer.faruk@sartura.hr>
This fixes the following tests that have been consistently failing on
the CI:
[2023-04-22T07:00:53.760Z] failures:
[2023-04-22T07:00:53.760Z] common_parallel::test_focal_hypervisor_fw
[2023-04-22T07:00:53.760Z] common_parallel::test_focal_ovmf
I'm not sure of the origin of this check but it is obviously dependent on
the underlying platform as the guest OS has not changed. Since it
depends on the host environment it doesn't make sense to assert for it.
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
In this way, we can cover the scenario where a VM with hotplugged net
device using FDs can work properly with reboot.
Signed-off-by: Bo Chen <chen.bo@intel.com>
The balloon_free_page_reporting test case is not expected to
work on MSHV. The reason is that MSHV pins
all the pages when mapping memory for the guest.
Those pages cannot be altered without unpinning them.
MSHV does not support modifying the pages during the guest
life cycle. This test case can be enabled once we add
VA backed VM support.
Signed-off-by: Muminul Islam <muislam@microsoft.com>
The updated image is configured in the same way as the previously used
2019 image. It has the same:
- Credentials
- Services configured, like SAC, SSH, RDP
- Size
All the Windows updates are applied, so the state is current as of this date.
Also, the latest stable version 0.1.229 of the VirtIO Windows drivers
is installed.
Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
KSM doesn't work with MSHV stack since guest memory is pinned
(`pin_user_pages`) and pinned pages cannot be merged.
So, don't run the test for mshv.
Signed-off-by: Anirudh Rayabharam <anrayabh@linux.microsoft.com>
A few breaking changes:
1. `-vvv` needs to be written as `-v -v -v`.
2. `--disk D1 D2` and others need to be written as `--disk D1 --disk D2`.
3. `--option=value` needs to be written as `--option value`.
Change integration tests to adapt to the breaking changes.
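The third change can be sketched as a small argument rewrite. This is only an
illustration with a hypothetical helper name (`split_equals_args`), not the
code used in the integration tests:

```rust
/// Rewrite every `--option=value` argument as two arguments,
/// `--option` and `value`. Other arguments are passed through unchanged.
fn split_equals_args(args: &[&str]) -> Vec<String> {
    let mut out = Vec::new();
    for arg in args {
        match arg.split_once('=') {
            // Only split long options; values such as `size=512M` that
            // follow an option keep their `=`.
            Some((opt, val)) if opt.starts_with("--") => {
                out.push(opt.to_string());
                out.push(val.to_string());
            }
            _ => out.push(arg.to_string()),
        }
    }
    out
}
```

Only the first `=` of a long option is used as the split point, so a value
that itself contains `=` is left intact.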
Signed-off-by: Wei Liu <liuwe@microsoft.com>
Since argh does not support `--option=value`, we need to change the
integration test code to become `--option value`.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
Since SGX testing doesn't rely on a custom guest image anymore, there's
no need to keep the custom filename around as it's already not in use.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
While measuring UDP PPS, we saturate the link, so there are packets
lost. We only account for the packets that are not lost.
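As a rough sketch of the accounting (the function and parameter names are
hypothetical, and the real test may obtain these numbers differently):

```rust
/// Packets per second, counting only the packets that were not lost.
fn udp_pps(total_packets: u64, lost_packets: u64, duration_secs: f64) -> f64 {
    // Guard against a lost count larger than the total.
    let delivered = total_packets.saturating_sub(lost_packets);
    delivered as f64 / duration_secs
}
```

For example, 1000 packets sent with 100 lost over 10 seconds gives 90 packets
per second.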
Signed-off-by: Wei Liu <liuwe@microsoft.com>
This OS is EOL this year and is well tested by the Rust Hypervisor
Firmware CI so there is no need to duplicate this effort.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Since the refactoring of the vm-migration crate broke the backward
compatibility, we must disable the live upgrade tests until next
release.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This moves the devices creation out of the dedicated restore function
which will be eventually removed.
This factorizes the creation of all devices into a single location.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Re-enable the VFIO integration now the machine is back online.
The image has been updated to rely on Ubuntu 22.04 (Jammy) and it's
smaller given only the NVIDIA drivers along with the nvidia-smi tool are
installed.
The test to verify the GPU is functional has been simplified given it
only relies on nvidia-smi to validate it has been able to find the Tesla
T4 card, meaning the associated driver was loaded correctly.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This allows the unification of the same testing methodology with aarch64
and removes a user of the Bionic image.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The jammy disk image has a new enough kernel to support SGX and if we
rely on just the CPUid information (which is sufficient) then we can use
the regular jammy test image for testing.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
There is no need for this test any longer as we have plenty of other
tests that reboot the VM.
Furthermore, this test used an unmodified bionic image, which not only will be
EOL soon but also took a long time to shut down as it still had snapd
installed.
Fixes: #4849
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The systemd journal has a known issue of generating large size logs[1],
which makes it unreliable as a source for retrieving system
information, such as for counting reboot times. This is particularly
problematic on disk-constrained systems, like the VMs we launched for
our integration tests, where the disk size is normally 2GB. By default,
the systemd journal has a size limit of 10% of the size of the
underlying file system (e.g. around 200MB for the VMs of our integration
tests), which would remove archived journal files on demand.
A better alternative to count reboot times is based on information from
`wtmp` (e.g. the login records) which is much more concise and can be
accessed via the `last` command.
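As an illustration, counting reboots from the text printed by `last` could
look like the sketch below. The helper name is hypothetical, and it assumes
that reboot records appear as lines starting with `reboot`:

```rust
/// Count the reboot records in the text printed by `last`.
/// Assumes each reboot is reported on a line that starts with "reboot".
fn count_reboots(last_output: &str) -> usize {
    last_output
        .lines()
        .filter(|line| line.starts_with("reboot"))
        .count()
}
```

Because the input is a handful of short login records, this stays cheap even
on the small disks used by the test VMs.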
[1] https://github.com/systemd/systemd/issues/5285
Fixes: #4749
Signed-off-by: Bo Chen <chen.bo@intel.com>
Adding the support for the user to set the MTU for the vhost-user-net
backend, which allows the integration test to be extended with the test
of the MTU parameter.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Add a new "mtu" parameter to the NetConfig structure and therefore to
the --net option. This allows Cloud Hypervisor's users to define the
Maximum Transmission Unit (MTU) they want to use for the network
interface that they create.
In detail, there are two main aspects. On the one hand, the TAP
interface is created with the proper MTU if it is provided. And on the
other hand the guest is made aware of the MTU through the VIRTIO
configuration. That means the MTU is properly set on both the TAP on the
host and the network interface in the guest.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This function starts the 'receive-migration' for the destination VM,
'send-migration' for the source VM, waits for the live-migration
completion, and prints debug information upon errors.
Signed-off-by: Bo Chen <chen.bo@intel.com>
This patch moves the actual test logic and assertions from various
functions to the actual tests, which makes these tests more readable and
easier to debug.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Until the issue #4583 is resolved, we must disable this test given it's
failing quite often on the aarch64 worker.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
As 'handle_child_output()' may terminate the test on panic, we need to
cleanup ovs-dpdk setup in advance.
See: #4555
Signed-off-by: Bo Chen <chen.bo@intel.com>
Following our recent v26.0 release we can re-enable our live upgrade
tests to try and make it possible for us to move to making LTS releases.
Currently limited to x86-64 as the live upgrade tests fail on aarch64.
Fixes: #3949
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Only the ovs-dpdk live-migration tests need to run sequentially as they
use the same ovs-dpdk setup.
This is to reduce our CI time, particularly for the live-migration
and aarch64 jobs.
Signed-off-by: Bo Chen <chen.bo@intel.com>
This enables the Windows test module. One basic test is enabled,
while all the others remain disabled for aarch64 for now. The Jenkinsfile
is extended with the corresponding step for aarch64.
installAzureCli() is parametrized.
It seems that transferring a 30GB image would take >= 15 minutes. An
optimization here is to gzip the image down to 10GB, which would unpack
in 3 minutes. This is expected to be quicker than transferring an
uncompressed image when on another network.
Signed-off-by: Anatol Belski <ab@php.net>
The test test_virtio_block_topology has been recently failing due to an
error happening in losetup while trying to set the block size. Since
there's no option in losetup for retrying, we took the approach of
programming the expected behavior of creating a loop device relying
directly on the system ioctl LOOP_CONFIGURE. We apply a retry loop based
on the result returned by this ioctl, so that we don't fail on the first
try. We also added a sleep before retrying, hoping this would help the
next iteration to succeed.
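The retry pattern described above can be sketched generically as follows.
`retry_with_delay` is a hypothetical helper, and the operation passed in
stands in for the LOOP_CONFIGURE ioctl call, which is not reproduced here:

```rust
use std::thread::sleep;
use std::time::Duration;

/// Run `op` until it succeeds or `attempts` tries have been made, sleeping
/// for `delay` before each retry. `op` is always run at least once.
fn retry_with_delay<T, E, F>(attempts: u32, delay: Duration, mut op: F) -> Result<T, E>
where
    F: FnMut() -> Result<T, E>,
{
    let mut last = op();
    for _ in 1..attempts {
        if last.is_ok() {
            break;
        }
        // Give the system a moment before trying again.
        sleep(delay);
        last = op();
    }
    // Either the first success or the error from the final attempt.
    last
}
```

Returning the last error keeps the failure visible to the caller when every
attempt fails.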
Fixes: #3494
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
It's been observed on the Bionic image that udev and snapd services can
cause some delay in the VM's shutdown. Disabling them before shutting
down the VM improves the reliability of the test.
Also slightly increase the sleep time to ensure we give the VM enough
time to shut down before checking the list of events provided by the
event monitor.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Since it's not possible to run the integration test test_vfio on Azure
at the moment (because of some nested virtualization issues), we can
temporarily run it on the baremetal CI where we already run some VFIO
tests.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Move the live migration tests to a 'jammy' worker rather than
'jammy-small'. This type of worker has more CPUs (64 vs 16) and more RAM
(256G vs 64G), which should improve the time it takes to run each test.
With this improvement, the test shouldn't fail anymore due to timeout
being reached.
A second improvement is to reduce the number of vCPUs created for each
VM. The point is simply to check we can migrate a VM with multiple
vCPUs, therefore using 2 instead of 6 should be enough when possible.
When testing NUMA, we can't lower the number of vCPUs since a fairly
complex topology is expected there.
Also, the total number of vCPUs is reduced from 12 to 4 (again when not
testing with NUMA).
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Improve error catching on the steps creating the block device so that we
can understand if qemu-img or losetup is the faulty command leading to
an empty device path.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Both of these tests have been sporadically failing through multiple CI
runs. The reason is related to cloud-init which fails to run the
"init-local" script during the second boot of the VM. This causes the
network interface to not be available, and therefore the test can't SSH
into the VM as expected. The root cause is the filesystem and cache
corruption that happens on the cloud-init disk.
The way to prevent this issue is to sync the guest filesystem
before we shut it down, and as a safety measure, we also wait for a
few seconds for the shutdown command to complete inside the guest before
we trigger the API shutdown or delete.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
It might sometimes take a few seconds for the guest to trigger the OOM
and report it back to the host. That's why this patch adds some sleep
time between the command in the guest supposedly triggering the OOM and
the check of the balloon size from the host.
Fixes: #4336
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
warning: you are deriving `PartialEq` and can implement `Eq`
  --> vmm/src/serial_manager.rs:59:30
   |
59 | #[derive(Debug, Clone, Copy, PartialEq)]
   |                              ^^^^^^^^^ help: consider deriving `Eq` as well: `PartialEq, Eq`
   |
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#derive_partial_eq_without_eq
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Since the purpose of the coredump function is to produce a vmcore for
the crash tool to analyze, and to avoid introducing a heavy dependency
into the integration tests, we only check here that the ch-remote
command runs without reporting an error.
Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
From the logs it appears that booting the VM to the point at which it
can signal to the host can sometimes take longer than the 30 seconds
specified.
Fixes: #4136
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The current patch fixes the following error that was raised by clippy:
error: this let-binding has unit value
    --> tests/integration.rs:6538:13
     |
6538 | /             let _ = stdin
6539 | |                 .write_all("type=7".as_bytes())
6540 | |                 .expect("failed to write stdin");
     | |_________________________________________________^
     |
     = note: `-D clippy::let-unit-value` implied by `-D warnings`
     = help: for further information visit
       https://rust-lang.github.io/rust-clippy/master/index.html#let_unit_value
help: omit the `let` binding
     |
6538 ~             stdin
6539 +                 .write_all("type=7".as_bytes())
6540 +                 .expect("failed to write stdin");
     |
error: could not compile `cloud-hypervisor` due to previous error
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In this way, we can cover a broad range of events from the event monitor
while avoiding code duplication.
Fixes: #4054
Signed-off-by: Bo Chen <chen.bo@intel.com>
This prevents a conflict since the old API socket will not have been
cleaned up (due to the use of SIGKILL.)
Signed-off-by: Rob Bradford <robert.bradford@intel.com>