This PR addresses a bug in which the cpu topology of a guest
with a non-power-of-two number of cores is incorrect. For example,
in some contexts, a virtual machine with 2 sockets and 12 cores
will incorrectly believe that 16 cores are on socket 1 and 8
cores are on socket 2. In other cases, common topology enumeration
software such as hwloc will crash.
The root of the problem was the way that cloud-hypervisor generates
apic_id. On x86_64, the (x2) apic_id embeds information about cpu
topology. The cpuid instruction is primarily used to discover the
number of sockets, dies, cores, threads, etc. Using this information,
the (x2) apic_id is masked to determine which {core, die, socket} the
cpu is on. When the cpu topology is not a power of two
(e.g. a 12-core machine), this requires non-contiguous (x2) apic_ids.
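The layout described above can be sketched as follows. This is a minimal
illustration, not the code from this PR: the function names `bits_for` and
`apic_id` are hypothetical, and it assumes only thread, core and socket
levels (no dies), with each level occupying ceil(log2(count)) bits.

```rust
/// Number of bits needed to encode `count` distinct values, i.e. ceil(log2(count)).
fn bits_for(count: u32) -> u32 {
    if count <= 1 {
        0
    } else {
        32 - (count - 1).leading_zeros()
    }
}

/// Build an (x2) apic_id for a linear cpu index (hypothetical helper).
/// Each topology level gets its own bit field, so a level whose count is
/// not a power of two leaves unused ids at the top of its field.
fn apic_id(cpu_index: u32, threads_per_core: u32, cores_per_socket: u32) -> u32 {
    let thread_bits = bits_for(threads_per_core);
    let core_bits = bits_for(cores_per_socket);

    let thread = cpu_index % threads_per_core;
    let core = (cpu_index / threads_per_core) % cores_per_socket;
    let socket = cpu_index / (threads_per_core * cores_per_socket);

    (socket << (core_bits + thread_bits)) | (core << thread_bits) | thread
}
```

With 12 cores per socket and one thread per core, the core field is 4 bits
wide, so cpu index 11 maps to apic_id 11 while cpu index 12 (the first core
of the second socket) maps to apic_id 16. Ids 12 to 15 are skipped, which is
the non-contiguity mentioned above.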
Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>
(cherry picked from commit 5c0b66529a)
This patch bumps the following crates, including `kvm-bindings@0.7.0`*,
`kvm-ioctls@0.16.0`**, `linux-loader@0.11.0`, `versionize@0.2.0`,
`versionize_derive@0.1.6`***, `vhost@0.10.0`,
`vhost-user-backend@0.13.1`, `virtio-queue@0.11.0`, `vm-memory@0.14.0`,
`vmm-sys-util@0.12.1`, and the latest of `vfio-bindings`, `vfio-ioctls`,
`mshv-bindings`, `mshv-ioctls`, and `vfio-user`.
* A fork of the `kvm-bindings` crate is being used to support
serialization of various structs for migration [1]. Also, code changes
are made to accommodate the updated `struct xsave` from the Linux
kernel. Note: these changes related to `struct xsave` break
live-upgrade.
** The new `kvm-ioctls` crate introduced breaking changes for
the `get/set_one_reg` API on `aarch64` [2], so the code is updated to
use the new APIs.
*** A fork of the `versionize_derive` crate is being used to support
versionize on packed structs [3].
[1] https://github.com/cloud-hypervisor/kvm-bindings/tree/ch-v0.7.0
[2] https://github.com/rust-vmm/kvm-ioctls/pull/223
[3] https://github.com/cloud-hypervisor/versionize_derive/tree/ch-0.1.6
Fixes: #6072
Signed-off-by: Bo Chen <chen.bo@intel.com>
(cherry picked from commit 3ce0fef7fd)
Restoring a VM takes longer on a VM with 16 cores than on one
with 64 cores.
Signed-off-by: Bo Chen <chen.bo@intel.com>
(cherry picked from commit 0718067851)
The bionic image was being downloaded and converted but no test uses
this image any longer.
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
(cherry picked from commit 6930370a03)
The 'test_vfio_user' test is prone to failing when the system is under
heavy load, with errors such as:
```
Error while connecting to /var/tmp/spdk.sock
Is SPDK application running?
Error details: Invalid or non-existing address: '/var/tmp/spdk.sock'
```
This is because SPDK is not fully functional before we request the
creation of an nvme device using the vfio_user protocol. This patch
stabilizes the test by allowing retries when executing host commands.
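A retry wrapper of this kind can be sketched as below. This is an
illustrative helper, not the one added by this patch: the name `retry`, its
parameters and the fixed delay between attempts are all assumptions.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Run `op` up to `attempts` times, sleeping `delay` after each failure.
/// Returns the first success, or the last error if every attempt fails.
fn retry<T, E>(
    attempts: u32,
    delay: Duration,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    assert!(attempts > 0, "need at least one attempt");
    let mut tries = 0;
    loop {
        tries += 1;
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(err) if tries >= attempts => return Err(err),
            // Otherwise give the service time to come up and try again.
            Err(_) => sleep(delay),
        }
    }
}
```

A test would wrap the host command that talks to SPDK in the closure, so a
socket that is not yet ready costs a short wait instead of a failure.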
Signed-off-by: Bo Chen <chen.bo@intel.com>
When run on a host where the vdpa_sim_blk module is not correctly
loaded, test_vdpa_block still passes:
"test common_parallel::test_vdpa_block ... ok"
This commit fixes the vdpa_sim_blk test so that it fails in that case.
Signed-off-by: Ruslan Mstoi <ruslan.mstoi@intel.com>
Since the 'write()' to the event file was moved to its own thread
(see #5633), we have no reliable way to read the latest contents of
the event file from our integration tests, since we can't ensure the
'read()' from our test always happens after the 'write()' from
Cloud Hypervisor has completed. This is also why we started to see random failures on
snapshot_restore tests (particularly when the system workload is high).
This patch adds a 1s sleep before reading the event file to mitigate the
random failures.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Instead of relying on "wc" and "grep", this patch provides helper
functions for checking line counts and searching/counting keywords.
To make assertion failures easier to understand, it also generates logs
for the L1/L2 VM commands when checks fail.
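Helpers along these lines might look like the sketch below. The names
`count_lines`, `count_matching_lines` and `check_matching_lines` are
hypothetical and are not taken from this patch; the point is that the
counting happens in Rust on captured output, and a failed check prints that
output.

```rust
/// Number of lines in `output` (the in-process equivalent of `wc -l`).
fn count_lines(output: &str) -> usize {
    output.lines().count()
}

/// Number of lines in `output` containing `keyword` (roughly `grep -c`).
fn count_matching_lines(output: &str, keyword: &str) -> usize {
    output.lines().filter(|line| line.contains(keyword)).count()
}

/// Assert on the matching-line count and include the full output in the
/// panic message so a failure can be diagnosed from the test log.
fn check_matching_lines(output: &str, keyword: &str, expected: usize) {
    let found = count_matching_lines(output, keyword);
    assert_eq!(
        found, expected,
        "expected {} lines containing {:?}, found {}; output:\n{}",
        expected, keyword, found, output
    );
}
```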
Signed-off-by: Bo Chen <chen.bo@intel.com>
Add a test that configures serial and console in TTY mode
at the same time. With this configuration, the VM can set up a legacy
serial device as an early printk console device, and then switch to a
virtio console device after the virtio console device is initialized.
In this case, we can capture the logs printed by legacy serial on early
boot, and later by the virtio console.
Signed-off-by: Yong He <alexyonghe@tencent.com>
With #4324 being resolved, the nested VFIO test (e.g. "test_vfio") is
now a part of the general Azure VM-based workers. No need to run it on
the bare-metal worker.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Partially revert 111225a2a5
and add the new dbus and pvpanic arguments.
As we are switching back to clap, note the following changes.
A few examples:
1. `-v -v -v` needs to be written as `-vvv`.
2. `--disk D1 --disk D2` and others need to be written as `--disk D1 D2`.
3. `--option value` needs to be written as `--option=value`.
Change integration tests to adapt to the breaking changes.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
Signed-off-by: Ravi kumar Veeramally <ravikumar.veeramally@intel.com>
This fixes all typos found by the typos utility with respect to the config file.
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
In the documentation of the function check_latest_events_exact, use the
same events argument name as in the implementation.
Signed-off-by: Ruslan Mstoi <ruslan.mstoi@intel.com>
Add an integration test for coredump that does not require pausing the VM.
As the coredump file has already been tested in test_coredump(), this
patch only tests the VM state after the coredump.
Signed-off-by: Yi Wang <foxywang@tencent.com>
This test case creates a new qcow2 file using an Ubuntu image as
its backing file, and boots a virtual machine with this image file.
Signed-off-by: Yu Li <liyu.yukiteru@bytedance.com>
test_cpu_affinity needs the number of host CPUs. Since it is possible
for the host to have more than 255 CPUs, increase the size of the
datatype used for parsing the string to accommodate this.
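The overflow is easy to reproduce in isolation. In the sketch below,
`parse_cpu_count` is a hypothetical helper and `u16` is an assumed choice of
wider type; the commit message does not say which type the test now uses.

```rust
/// Parse a CPU count as printed by a command-line tool, tolerating the
/// trailing newline. Returns `None` if the text is not a valid number
/// or does not fit in a `u16`.
fn parse_cpu_count(text: &str) -> Option<u16> {
    text.trim().parse::<u16>().ok()
}
```

Parsing "384" as a `u8` fails because the value exceeds 255, whereas
`parse_cpu_count("384\n")` yields `Some(384)`.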
Signed-off-by: dom.song <dom.song@amperecomputing.com>
Add an integration test for pvpanic, using two methods:
- checking the vendor id and device id of the pci device in the guest
- triggering a guest panic and checking the event-monitor.
Also, to support the pvpanic-pci driver, add the pvpanic config
in resources.
Signed-off-by: Yi Wang <foxywang@tencent.com>
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
The signature of the ioctl libc function differs between musl and
glibc. Use an anonymous cast to force the type conversion.
Signed-off-by: Ravi kumar Veeramally <ravikumar.veeramally@intel.com>
The test was failing due to a regression caused by commit
d5558aea2a
Failing command:
sudo /mnt/ch-remote --api-socket /tmp/ch_api.sock resize --memory=1073741824
Fixes: #5190
Signed-off-by: Ruslan Mstoi <ruslan.mstoi@intel.com>
In this way, our integration tests exercise the same set of build
features (e.g. "kvm,mshv") being used for releases.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Implemented a `TargetApi` enum to make the process of implementing
tests for the D-Bus and HTTP API more convenient.
Refactored `test_api_{create_boot, shutdown, pause_resume, delete}` tests
with the `TargetApi` enum to also implement tests for the D-Bus API.
Added a new test named `test_api_dbus_and_http_interleaved` that uses
both the HTTP and D-Bus API at the same time.
Modified integration test scripts to enable the `dbus_api` feature when
compiling and start a dbus-session when integration tests are run.
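The idea of sharing one test body across both transports can be sketched as
follows. This is illustrative only: the variant fields, the `dispatch`
method and the `pause_resume` function are assumptions, and `dispatch`
returns a description string where a real helper would perform the call.

```rust
/// Which control API a test should drive (hypothetical shape).
enum TargetApi {
    /// HTTP API reached through a Unix socket path.
    Http { socket: String },
    /// D-Bus API identified by a service name and an object path.
    DBus { service: String, object_path: String },
}

impl TargetApi {
    /// Describe where `command` would be sent for this transport.
    fn dispatch(&self, command: &str) -> String {
        match self {
            TargetApi::Http { socket } => format!("http[{}] {}", socket, command),
            TargetApi::DBus { service, object_path } => {
                format!("dbus[{} {}] {}", service, object_path, command)
            }
        }
    }
}

/// One test body, parameterized by the API it talks to.
fn pause_resume(api: &TargetApi) -> Vec<String> {
    ["pause", "resume"].iter().map(|cmd| api.dispatch(cmd)).collect()
}
```

Each existing test then becomes a thin wrapper that calls the shared body
once per `TargetApi` variant.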
Signed-off-by: Omer Faruk Bayram <omer.faruk@sartura.hr>
This fixes the following tests that have been consistently failing on
the CI:
[2023-04-22T07:00:53.760Z] failures:
[2023-04-22T07:00:53.760Z] common_parallel::test_focal_hypervisor_fw
[2023-04-22T07:00:53.760Z] common_parallel::test_focal_ovmf
I'm not sure of the origin of this check, but it is obviously dependent
on the underlying platform, as the guest OS has not changed. Since it
depends on the host environment, it doesn't make sense to assert on it.
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
In this way, we can cover the scenario where a VM with a hotplugged net
device using FDs keeps working properly across a reboot.
Signed-off-by: Bo Chen <chen.bo@intel.com>
The balloon_free_page_reporting test case is not
expected to work on MSHV. The reason is that MSHV pins
all the pages when mapping memory for the guest.
Those pages cannot be altered without unpinning them.
MSHV does not support modifying the pages during the guest
life cycle. This test case can be enabled once we add
VA backed VM support.
Signed-off-by: Muminul Islam <muislam@microsoft.com>
The updated image is configured in the same way as the previously used
2019 image. It has the same:
- Credentials
- Services configured, like SAC, SSH, RDP
- Size
All Windows updates are applied, so the image is current as of this date.
Also, the latest stable version 0.1.229 of the VirtIO Windows drivers
is installed.
Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>