AArch64 tests were divided into 2 steps:
- Build and test with 'acpi' feature
- Build and test without 'acpi'
This can be optimized. We need only to build and test once with default
features ('acpi' is enabled).
On AArch64, ACPI only works with UEFI. If UEFI is not available, guest
kernel fall back to use FDT. Most AArch64 test cases boot from direct
kernel, the guest will keep using FDT even if ACPI is enabled. So
nothing is broken.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
Issue from beta verion of clippy:
Error: --> vm-virtio/src/queue.rs:700:59
|
700 | if let Some(used_event) = self.get_used_event(&mem) {
| ^^^^ help: change this to: `mem`
|
= note: `-D clippy::needless-borrow` implied by `-D warnings`
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow
Signed-off-by: Bo Chen <chen.bo@intel.com>
The virtio_balloon test is a bit flaky since we can't really know how
much the balloon is gonna be deflated when the guest is under memory
pressure. That's why it's safer to simply check that the balloon is not
the initial size anymore.
One small detail, but we don't need to check for the balloon size to be
higher than 0 since the returned value is a u64.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Additionally, he disk creation routine is extended to support NTFS and
variable image size.
Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
In order to allow a hotplugged vCPU to be assigned to the correct NUMA
node in the guest, the DSDT table must expose the _PXM method for each
vCPU. This method defines the proximity domain to which each vCPU should
be attached to.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The new test_virtio_balloon() is to verify if the 'deflate_on_oom'
parameter works. Its testing result is as follows:
1. Start a 4G guest with 2G balloon, check memory once starts up.
total_mem is 4294967296 bytes
actual_mem is 2147483648 bytes
orig_balloon is 2147483648 bytes
total used free shared buff/cache available
Mem: 3.8Gi 2.1Gi 1.6Gi 0.0Ki 140Mi 1.6Gi
Swap: 0B 0B 0B
2. Run a command in guest to eat up 25G memory, and check again.
total_mem is 4294967296 bytes
actual_mem is 3121610752 bytes
deflated_balloon is 1173356544 bytes
total used free shared buff/cache available
Mem: 3.8Gi 1.2Gi 2.6Gi 0.0Ki 49Mi 2.5Gi
Swap: 0B 0B 0B
From above, we can notice the balloon memory indeed deflates from
2147483648 bytes to 1173356544 bytes once an oom is going to be
triggered.
Signed-off-by: Fei Li <lifei.shirley@bytedance.com>
A new method has been introduced for WindowsGuest, that would poll on
the SSH connection and try to execute a test command. When successfull,
the polling stops and the guest is considered finished boot. The
downside of this method is that the network setup is needed always even
if the test doesn't use network, however it complies with the behavior
of other tests. I also observe a bit more stability on the local test
run, however it still appears to be a resource issue so some sporadic
fails are possible on a slower machine.
The hardcoded timeouts for guest boot and DHCP setup have been removed.
The dnsmasq invocation uses `--bind-dynamic` so then the daemon can be
started while the target interface could be down yet.
Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
Co-authored-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
test_aarch64_pe_boot was added at the very beginning of AArch64 support.
Now it can be combined to test_vmlinux_boot.
Renamed test_vmlinux_boot to test_direct_kernel_boot for generality.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
Enabled test case test_vmlinux_boot_noacpi on AArch64 and renamed it
test_direct_kernel_boot_noacpi for generality.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
Some new integration tests will require the "stress" binary to be
present in the guest in order to run correctly.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Re-enable virtiofsd testing now that issues with capstone repository
have been resolved.
This reverts commit a14c70019a.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This new function allows for better readability of the code by
factorizing a few of lines of code into a single one.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
It's important to verify the actual exit code returned from the new
function exec_host_command_status() to ensure the command ran
successfully.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This new function allows for better readability of the code by
factorizing a bunch of lines of code into a single one.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
There's no need for ssh_command_ok() anymore since ssh_command() now
returns an error in case the executed command returned with an exit code
different than 0.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Make sure the reconnection functionaliity is well tested from the OVS
DPDK integration test.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In order to simplify and speed up the OVS DPDK test, we switch from
using 'iperf3' to 'netcat' to validate the connectivity is functional.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Since OVS DPDK deprecated the use for the server mode, let's make sure
the integration test uses the client mode instead. This means the OVS
backend is the client and the VMM acts as the server, therefore the
reason why we use "vhost_mode=server".
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Re-enable the OVS DPDK integration test by assigning both VMs to the
NUMA node 0. This ensures the processes are being run on NUMA node 0,
preventing OVS DPDK from abnormal functioning.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The CI is failing due the git server that the submodules required for
this fork of QEMU need to build from is unavailable.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
(cherry picked from commit 2aec0a92a5)
In order to avoid regression regarding OVS-DPDK support, a new
integration test is added. This test consists of running two VMs, both
attached to a distinct OVS port, where both ports are connected to an
OVS bridge. Once the VM are running, the test validates the connection
between the two VMs works correctly.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Adding a new function ssh_command_ok() as a wrapper around the existing
ssh_command() function. The goal being to identify if the command
returned without any error.
This new function allow for a bit of factorization through the codebase.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Yet another small refactoring step for WindowsGuest
after f56471566b.
For this particular case - there's currently neither overloading nor
default argument support in Rust (except a macro or other tricky stuff),
so keep the timeout and other options default for now.
Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
Bump the sleep time before checking the guest RAM size from 10 to 30
seconds. This will help the VFIO baremetal CI passing more consistently.
Fixes#2606
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Re-enable virtiofsd testing now that issues with capstone repository
have been resolved.
This reverts commit 2aec0a92a5.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Relying on dnsmasq running on the host, the Windows guest are now
getting allocated with the expected IP addresses. This allows for
multiple VMs, therefore multiple tests to run in parallel.
The end goal is to reduce the time spent running Windows integration
tests.
Fixes#1891
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This new wrapping structure allows for a better factorization of the
Windows specific code. This makes each test simpler and easier to read.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
By using a Box around the DiskConfig trait, it becomes Sized. For that
reason, we can pass the DiskConfig to the Guest so that it can own it.
This allows for further simplification as the Guest does not need to be
bound to a specific lifetime, which makes things easier.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Refactor the existing vhost-user-net integration tests in order to
extend it with an extra test for vhost-user client mode.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Relying on guest Ubuntu image 21.04, including a 5.11 kernel, this patch
adds some additional tests to the VFIO baremetal integration tests. It
adds a test for ACPI memory hotplug, another one for virtio-mem memory
hotplug, and finally a test for hotplugging the NVIDIA card.
The existing test already taking care of the reboot has been renamed.
The script running "cargo test" has been modified to run only one thread
at a time, so that each test run sequentially. This is mandatory since
the card can't be shared across multiple VMs.
Fixes#2404
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In order to support most recent Ubuntu distributions, we must update
the way of detecting a reboot through the journal since there is no
more "-- Reboot --" logs.
Using the `--list-boots` option is the preferred way for getting the
boot count information from journalctl command. We simply need to add 1
to the count in order to get the reboot count.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Moving to latest Ubuntu version as the guest image is needed to move to
more recent guest kernel (5.11). With more recent kernels, we'll be able
to add hotplug and virtio-mem tests to the VFIO baremetal CI.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The CI is failing due the git server that the submodules required for
this fork of QEMU need to build from is unavailable.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
test_reboot became flaky after the refactoring of the VM reboot code.
This is because we removed the ability to specify a custom timeout.
This patch fixes the issue by allowing a custom timeout of 120s to be
set.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Factorize NVIDIA GPU checks into its own function so that it can be
reused.
Factorize linux guest reboot into its own function to reduce the amount
of code.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Use the PVH vmlinux for all tests (with the exception of the specific
bzImage test.)
See: #2231
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The new kernel 5.12 requires the devices to be manually bound to
vfio-pci while adding a new_id is only needed once per
device_id:vendor_id pair.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
It appears that mshv is not yet there to succeed with these tests. It is
suggested to ignore them and enable later one by one as the
functionality gets fixed.
Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
We now reply on the value from '/sys/kernel/mm/ksm/shared_pages' to
validate our "--memory mergeable=on|off" option. For `mergeable=on`,
we are expecting to see more 'shared_pages' reported by the kernel when
we start more VMs with this option. For `mergeable=off`, we are
expecting the 'shared_pages' value to be always 0, as we are assuming
the rest of the system (in our CI) is not using mergeable memory.
Fixes: #2138
Signed-off-by: Bo Chen <chen.bo@intel.com>
The MCRS method returns a 64-bit memory range descriptor. The
calculation is supposed to be done as follows:
max = min + len - 1
However, every operand is represented not as a QWORD but as combination
of two DWORDs for high and low part. Till now, the calculation was done
this way, please see also inline comments:
max.lo = min.lo + len.lo //this may overflow, need to carry over to high
max.hi = min.hi + len.hi
max.hi = max.hi - 1 // subtraction needs to happen on the low part
This calculation has been corrected the following way:
max.lo = min.lo + len.lo
max.hi = min.hi + len.hi + (max.lo < min.lo) // check for overflow
max.lo = max.lo - 1 // subtract from low part
The relevant part from the generated ASL for the MCRS method:
```
Method (MCRS, 1, Serialized)
{
Acquire (MLCK, 0xFFFF)
\_SB.MHPC.MSEL = Arg0
Name (MR64, ResourceTemplate ()
{
QWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, Cacheable, ReadWrite,
0x0000000000000000, // Granularity
0x0000000000000000, // Range Minimum
0xFFFFFFFFFFFFFFFE, // Range Maximum
0x0000000000000000, // Translation Offset
0xFFFFFFFFFFFFFFFF, // Length
,, _Y00, AddressRangeMemory, TypeStatic)
})
CreateQWordField (MR64, \_SB.MHPC.MCRS._Y00._MIN, MINL) // _MIN: Minimum Base Address
CreateDWordField (MR64, 0x12, MINH)
CreateQWordField (MR64, \_SB.MHPC.MCRS._Y00._MAX, MAXL) // _MAX: Maximum Base Address
CreateDWordField (MR64, 0x1A, MAXH)
CreateQWordField (MR64, \_SB.MHPC.MCRS._Y00._LEN, LENL) // _LEN: Length
CreateDWordField (MR64, 0x2A, LENH)
MINL = \_SB.MHPC.MHBL
MINH = \_SB.MHPC.MHBH
LENL = \_SB.MHPC.MHLL
LENH = \_SB.MHPC.MHLH
MAXL = (MINL + LENL) /* \_SB_.MHPC.MCRS.LENL */
MAXH = (MINH + LENH) /* \_SB_.MHPC.MCRS.LENH */
If ((MAXL < MINL))
{
MAXH += One /* \_SB_.MHPC.MCRS.MAXH */
}
MAXL -= One
Release (MLCK)
Return (MR64) /* \_SB_.MHPC.MCRS.MR64 */
}
```
Fixes#1800.
Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
Since using bzImage is now deprecated, let's update the SGX integration
test to rely on vmlinux instead.
Fixes#2476
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Both changes aim to document the absence of the CPU hot-remove
functionality on Windows.
Closes#2457.
Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
Update the Ubuntu Focal image used as the guest image. It's based on the
latest Focal image released on April 1st 2021, and customized to include
all the utilities we need. As usual, snapd and pollinate services have
been removed.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Fixes the current codebase so that every cargo clippy can be run with
the beta toolchain without any error.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
It must be specified as excluded from the workspace as it must not be
built on non-test targets due to issues with the ssh2 dependency and the
musl toolchain.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This includes:
* OS disk image management
* Cloud init creation
* SSH to guest access
* Waiting for guest to boot
This will be useful in other projects that want to do similar things in
their integration tests.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Relying on a NVIDIA Tesla T4 card present in the SGX machine, this patch
enables baremetal VFIO testing, validated by running several NVIDIA
tools in the guest. The guest image has been prepared to include all the
software needed to run these tests.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Enabled all "ttyS0" related test cases:
- test_serial_off
- test_serial_tty
- test_serial_file
Enabled mandatory guest kernel driver for "ns16550a" on AArch64.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
This removes the dependency on "tempdir" which in turn depends on the
large rand dependency chain.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
If the function can never return an error this is now a clippy failure:
error: this function's return value is unnecessarily wrapped by `Result`
--> virtio-devices/src/watchdog.rs:215:5
|
215 | / fn set_state(&mut self, state: &WatchdogState) -> io::Result<()> {
216 | | self.common.avail_features = state.avail_features;
217 | | self.common.acked_features = state.acked_features;
218 | | // When restoring enable the watchdog if it was previously enabled. We reset the timer
... |
223 | | Ok(())
224 | | }
| |_____^
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_wraps
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Add the ability for cloud-hypervisor to create, manage and monitor a
pty for serial and/or console I/O from a user. The reasoning for
having cloud-hypervisor create the ptys is so that clients, libvirt
for example, could exit and later re-open the pty without causing I/O
issues. If the clients were responsible for creating the pty, when
they exit the main pty fd would close and cause cloud-hypervisor to
get I/O errors on writes.
Ideally the main and subordinate pty fds would be kept in the main
vmm's Vm structure. However, because the device manager owns parsing
the configuration for the serial and console devices, the information
is instead stored in new fields under the DeviceManager structure
directly.
From there hooking up the main fd is intended to look as close to
handling stdin and stdout on the tty as possible (there is some future
work ahead for perhaps moving support for the pty into the
vmm_sys_utils crate).
The main fd is used for reading user input and writing to output of
the Vm device. The subordinate fd is used to setup raw mode and it is
kept open in order to avoid I/O errors when clients open and close the
pty device.
The ability to handle multiple inputs as part of this change is
intentional. The current code allows serial and console ptys to be
created and both be used as input. There was an implementation gap
though with the queue_input_bytes needing to be modified so the pty
handlers for serial and console could access the methods on the serial
and console structures directly. Without this change only a single
input source could be processed as the console would switch based on
its input type (this is still valid for tty and isn't otherwise
modified).
Signed-off-by: William Douglas <william.r.douglas@gmail.com>
Let's create a fixed VHD disk file from the existing RAW file thanks to
qemu-img, and create a new integration test to validate that
Cloud-Hypervisor can boot VHD disk image.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
By using `net_util::open_tap` to create the TAP interface, the created
interface will be deleted when the returned variable (`net_utils::Tap`)
is dropped.
Signed-off-by: Bo Chen <chen.bo@intel.com>
The Windows image is quite large (about 20GiB), hence it takes some time
to copy it for every test in order to avoid potential corruption.
One way to mitigate that without compromising on safety between each
test is by using device mapper. By creating a read-only base, we ensure
the image won't be modified by any of the tests, and by creating one
snapshot for each test, we avoid copying the entire image each time.
A dedicated Copy On Write disk image is created to handle any change
that might be performed on the base image, letting the tests behave as
expected.
Fixes#2155
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
By relying on the Guest object, Windows dedicated tests copy the Windows
guest image before booting from it. The point being to avoid corruption
between multiple tests. This is already how the rest of the integration
tests work, Windows tests were the only ones missing this feature.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This image does not have the pollinate service which can sometimes fail
and prevent SSH from starting as it marks itself as a prerequisite. This
service will never fully succeed as it tries to make a network
connection which will fail inside our test VMs.
Fixes: #2113
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Using --net=host is not necessary for any of the integration tests, so
let's use the default network option called "bridge".
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Some sporadic failures were due to an early connection to the VM while
it was not fully ready. Increasing sleep times fixes these issues.
Fixes#2104
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Given we already check the connected IP address matches the expected
guest IP address, the check on the "booted" message is not needed.
Fixes: #2117
Signed-off-by: Bo Chen <chen.bo@intel.com>
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This test is very flaky and regularly causing CI failures. Until we can
identify the root cause we should disable this test.
See: #2103
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Simplify our image handling by not copying both QCOW2 and raw images for
every test. Allow the test to choose QCOW2 or raw by specifying the
image name manually. A follow on patch will add explicity QCOW2 tests.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
When an SSH command fails we want to be able to see, via a panic() why
and where it failed. Replace use of .unwrap_or_default() from SSH
command calls to ensure that we can see the location of the panic.
Also enhance the existing SSH output code to show the error if there is
one.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The boot time for direct kernel boot based tests is significantly
quicker than booting via the firmware and stock kernel as it triggers a
reboot during the boot process due to the initrd handling.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
When doing a direct kernel boot only have console=ttyS0 in the command
line if we are explicitly testing the serial output. The default
behaviour is `--serial null` so this output will not be visible but will
trigger a KVM exit for every byte which is very costly when running
under nested virtualization.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Starting the virtio device threads from the VMM thread has slowed down
the start of the VM when running on a highly contested system like the
CI.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
On the CI we are seeing that sometimes the epoll is receiving these
errors which do not indicate a failure but that we should retry.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
As we switched to focal for this test we no longer get any output during
the boot unless serial is used over virtio-console.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
There have been a lot of flakes around tests such as
test_virtio_fs_hotplug_dax_on_w_vhost_user_fs_daemon() or
test_virtio_fs_hotplug_dax_on() which all try and hotplug memory.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
With the removal of vhost-user self-spawning support we should migrate
the tests to use the binaries so that we can remove the functionality
from the cloud-hypervisor binary itself.
See: #1925
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
While the addressable space size reduction of 4k in necessary due to
the Linux bug, the 64k alignment of the addressable space size is
required by Windows. This patch satisfies both.
Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
Set the test case test_snapshot_restore X86 only, instead of excluding
it from test command line.
The command line option was added because we used to support migration
with Virtio-MMIO, but not Virtio-PCI.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
Tests not ported include 1) the ones that start guest VMs without
network (e.g. test_net_hotplug, test_initramfs), 2) test_vfio that
involves l2 guest. Also, some tests that use bionic guest image are
given extended timeout (120s) for 'wait_vm_boot'.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Instead of waiting blindly with fixed amount of sleeping time, we can
use the `wait-timeout` crate to explicitly wait VM shutdown (with a
timeout). It can reduces the execution time of some tests
substantially. Also, this patch increases the `shutdown` timeout for
'test_reboot', which should fix the recent sporadic failures on this
test.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Instead of blindly waiting for 20-40s for the guest VM to boot, this
patch waits the notification from the guest VM explicitly by using a
simple TcpListener on the host and a custom systemd service in the
guest.
This patch also ported few tests to use this new machanism, while more
tests are to be ported.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Now that virtio-balloon is not declared as part of the --memory
parameter, the integration tests are updated to keep the correct
behavior.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Given the increased amount of output from cloud-hypervisor, this patch
also increased the PIPE_SIZE to 32MB (from 256KB).
Signed-off-by: Bo Chen <chen.bo@intel.com>
This patch prints the complete commandline when launching
cloud-hypervisor. It also prints the details of the `ssh` command if
the command is failing.
Signed-off-by: Bo Chen <chen.bo@intel.com>
This is a new integration test running Windows as a guest with Cloud
Hypervisor. Once the VM is booted, the test connects to the guest
through SSH and shutdown the VM. If this succeeds, this means the VM
was properly booted to userspace and that the network was functional.
Important to note that because this test generates lots of logs, it
requires a large pipe size for both stdout and stderr.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Introduce a new test that will validate the new option `max_phys_bits`
from the `--cpus` parameter behaves as expected.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Since all unit and integration tests are run inside containers because
they are called from dev_cli.sh, they always run as root. That's why
both unit and integration scripts can be simplified as they don't need
to apply specific capabilities and run cargo tests in a dedicated 'kvm'
group.
Fixes#1683
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The existing virtio-blk hotplug test is extended by removing and
re-adding the virtio-blk device. This ensures the unplug/re-plug
feature is properly tested.
Fixes#1809
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The existing virtio-net hotplug test is extended by removing and
re-adding the virtio-net device. This ensures the unplug/re-plug
feature is properly tested.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
As discussed in #1707, the `vcpu` thread can be stalled when using
`--serial tty`. To workaround that issue, this patch enforces to resize
the pipe size to 256K when we capture the stdout/stderr of the
cloud-hypervisor child process in the integration tests. Note that the
pipe size (256K) is chosen based on the output size of our integration
tests at this point, which may need to be increased in the future.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Now that virtio-mem supports reboot, we extend the existing integration
tests to validate the amount of RAM after reboot is the same as before
the reboot, but also that we can still resize down the VM or the memory
zone after the reboot.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Extend the existing test to validate that each NUMA node gets assigned
the right amount of memory after each memory zone has been resized.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Now that we can resize each memory zone independently, this commit
extends the memory zone related test by validating 'vm.resize-zone'
works correctly.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Extending the Cloud-Hypervisor CI to allow for testing SGX on a
dedicated machine where special image and kernels are ready.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The goal of this commit is to rename the existing NUMA option 'id' with
'guest_numa_id'. This is done without any modification to the way this
option behaves.
The reason for the rename is caused by the observation that all other
parameters with an option called 'id' expect a string to be provided.
Because in this particular case we expect a u32 representing a proximity
domain from the ACPI specification, it's better to name it with a more
explicit name.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Since both --memory-zone and --numa parameters have been updated with
addition and removal of multiple options, this commit updates the
related integration tests to ensure they are still valid.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This patch ported many tests to the new methodology, where the guest log
will be printed only when the test is failing.
Things to finish in follow-up PRs:
1. Special tests not ported yet include: test_reboot,
test_bzimage_reboot, test_serial_null(), test_serial_tty(),
test_serial_file(), test_virtio_console(), test_console_file(),
and test_simple_launch.
2. Few direct calls to 'Command::new(clh_command("cloud-hypervisor"))',
which is still printing the guest console
Signed-off-by: Bo Chen <chen.bo@intel.com>
By extending the existing NUMA integration test, this commit validates
the proper distances between NUMA nodes are exposed to the guest.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Extend the existing NUMA integration to validate that specifying CPUs
for each NUMA node gets propagated to the guest.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This new test validates the guest OS can find the NUMA nodes which have
been defined by the user through the CLI, and that the right amount of
memory is associated with each node.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Relying on the new option 'host_numa_node' from the 'memory-zone'
parameter, the user can now define which NUMA node from the host
should be used to back the current memory zone.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Adding a small test to validate that user defined memory zones work as
expected when using --memory-zone option.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This required a bit of rearranging as it is not possible to call
prepare_daemon() inside a catch_unwind{} block.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Now the testing aspects are run inside a panic handler block rather than
inside a credibility TestBlock. If the test fails then the output from
the cloud-hypervisor binary is then presented.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This reduces the complexity of the test slightly. The PCI BDFs in the L1
needed changing as the block devices come before the network ones.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Rather than using a credible TestBlock to capture the test assertions
instead use a catch_unwind block to catch the panic and turn
it into a Result<>.
If block panicked or the child binary had non-zero exit then, and only
then, print out the child output.
This results in a clearer test output with no interleaving.
Currently only test_counters is ported to this methodology to
demonstrate its use.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This commit enables the test case for testing the basic function
of virtio_vsock (i.e. without the hotplug).
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
This commit enables the `api_create_boot` case in the integration
test as the test for the Cloud Hypervisor API server functionality.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
This commit enables the virtio-fs related integration test cases
for AArch64.
Note that to run virtio-fs cases, the host kernel should be
newer than v5.5.
Fixes: https://github.com/cloud-hypervisor/cloud-hypervisor/issues/1516
Signed-off-by: Henry Wang <henry.wang@arm.com>
This commit enables some mmio-related integration test cases on
AArch64, including:
* some vhost_user test cases
* virtio-blk test cases
* pmem test cases
Also this commit contains a bug fix in creating virtio-blk device.
Previously, when creating the FDT, the virtio-blk device was
labeled in the reverse order of address allocation.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
Under high load, the VM might take some time to hotplug the disk after
the hotplug command has been issued. For this reason, let's put a 10s
sleep before checking for the presence of the new disk.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
We want to give the time to the nested VM to be fully ready before we
check it's correctly setup. This involves 3 layers of virtualization
when running the CI on Azure, which added to the high load happening
because of the parallelization, adds up to the start up time.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
As compiling without acpi (implied by mmio) means that the VM will
terminate on i8042 reset we cannot test the reboot.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Check if increasing the time after the VM is spawned help with getting
more stable numbers for the base PSS.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The compiled AArch64 linux kernel by running `make` is in PE format
instead of vmlinux, vmlinux.pvh and bzImage format. Therefore we
need to add integration test for this case.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
In order to differentiate tests that can be run in parallel versus
tests that must be run on their own, we move all tests into dedicated
modules.
The point is to avoid glitches in results that can be caused by the fact
that other tests (hence VMs) are running at the same time.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Move the CI to rely entirely on Ubuntu cloud images. It's worth noting
that both QCOW2 and RAW images from Ubuntu Focal Fossa have been
modified to include the tools needed from integration tests.
This means fio, iperf, iperf3, netcat and socat have been added to the
image. The snapd package have been fully removed as it was expecting the
support for squashfs (not present when using our own kernel from direct
kernel boot), which was causing some failures, and was preventing
cloud-init from terminating properly.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
With QCOW disk images the space needed is greater than the size of the
iamge as any "zero" blocks in the image are allocated when they are
touched making the image bigger.
Here we add a threshold of 6GiB with added debugging messages to ensure
that there is sufficient disk space to run the tests.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Extend the existing integration test test_snapshot_restore by testing
with more than one vCPU.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Add a simple test to check that the data from the counters matches what
is expected and that the value of the counters increases after an
operation that will hit all counters.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
When a request is made to increase the number of vCPUs in the VM attempt
to reuse any previously removed (and hence inactive) vCPUs before
creating new ones.
This ensures that the APIC ID is not reused for a different KVM vCPU
(which is not allowed) and that the APIC IDs are also sequential.
The two key changes to support this are:
* Clearing the "kill" bit on the old vCPU state so that it does not
immediately exit upon thread recreation.
* Using the length of the vcpus vector (the number of allocated vcpus)
rather than the number of active vCPUs (.present_vcpus()) to determine
how many should be created.
This change also introduced some new info!() debugging on the vCPU
creation/removal path to aid further development in the future.
TEST=Expanded test_cpu_hotplug test.
Fixes: #1338
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Now that snapshot/restore is symmetrical, that is the VM must be paused
before it is snapshot and it must be resumed after it's been restored,
the integration test is updated accordingly.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Add multiple integration tests with various different CPU topologies and
check that they work as expected.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Upon PCI hotplug, the VMM now returns some information about the device
name and the associated b/d/f. This patch extends the integration tests
so that we validate the response is the one that is expected.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This revised version of the patch reuses the back-off strategy from
'ssh_command()' to deal with varying booting time.
Fixes: #1209
Signed-off-by: Bo Chen <chen.bo@intel.com>
This config option provided very little value and instead we now enable
this feature (which then lets the guest control the cache mode)
unconditionally.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Now that snapshot/restore support has been enabled for virtio-vsock, the
corresponding integration test is expanded with some validation that
virtio-vsock supports to be snapshot and restored.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
When compiled with pci feature, the integration test now validates that
/dev/vdb can be correctly read while being placed behing a virtual
IOMMU.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The integration test validating that --serial off works correctly was
not properly written as it was using the FW, which by default would use
the kernel command line found in the EFI partition. Unfortunately, this
kernel command line was including "console=ttyS0", which causes the
kernel to try to write to the serial port, even if there's no serial
port being emulated.
The problem is, when no emulation of the serial port is provided, the
default value returned on 0x3f8 is 0, which makes the guest kernel think
that some data needs to be read.
The only way to avoid all this is by ensuring we can control the kernel
command line by removing any occurence of "console=ttyS0" from it.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Extend the set of tests we have for virtio-net and vhost-user-net to
check for host MAC address setting.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
There's a simple way to trigger PCI BAR reprogramming for a given
device, by removing it and then hotplugging it back. The Linux kernel
will simply choose to place the BARs at different location than the ones
chosen by Cloud-Hypervisor. By doing so, and creating the snapshot after
this hotplug operation, we can manage to validate that the resource are
correctly restored for a given virtio-pci device.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Now that snapshot/restore feature is stable for both virtio-mmio and
virtio-pci devices, we re-enable the integration test for validating
snapshot/restore does not get broken.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Now that virtio-fs has unplug support try unplugging and readding the
virtio-fs device.
Fixes: #1050
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The mmio integration test build currently doesn't use ACPI so piggyback
on that test variation.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Continue to try and track down the instability in the numbers generated
by this test by removing the (unused) network interface. It's possible
we are getting broadcast packets into the tap device and generating a
variable size of allocations.
See: #760
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Switch to using the recently added OptionParser in the code that parses
the network backend.
Whilst doing this also update the net-backend syntax to use "sock"
rather than socket.
Fixes: #1092
Partially fixes: #1091
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
To aid the debugging of why the test_memory_overhead sometimes fails
print a details of all the named regions that it uses in its
calculation.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The current tests support giving a tap name but they only test with tap
interfaces that already exist.
Fixes: #871
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
We pass it to the integration and unit tests script through --libc.
Cargo tests are left unmodified.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
If a size is specified use it (in particular this is required if the
destination is a directory) otherwise seek in the file to get the size
of the file.
Add a new check that the size is a multiple of 2MiB otherwise the kernel
will reject it.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The new 'shared' and 'hugepages' controls aim to replace the 'file'
option in MemoryConfig. This patch also updated all related integration
tests to use the new controls (instead of providing explicit paths to
"/dev/shm" or "/dev/hugepages").
Fixes: #1011
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Signed-off-by: Bo Chen <chen.bo@intel.com>
The newly added integration test "test_snapshot_restore" is very
unstable, which causes our CI to fail on most pull requests, which is
not acceptable. That's why we ignore this test until we can fix the
stability issue.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Now that we have multiple virtio devices supporting snapshot and restore
operations, we can add a new integration test to validate the migration
feature works as expected.
The important part is virtio-net as it is used to ssh into the VM to
verify the VM has been restored in the proper state.
This test only works for virtio-mmio for now. Further support for
testing virtio-pci will be added once the virtio-pci transport layer
will support migration.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
There is some duplication between regular and hotplug virtio-fs tests
that can be factorized by adding a simple hotplug flag to choose if each
test should run with or without hotplugging the device.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
One more time, we're extending the VFIO integration test to verify that
a VFIO device passed through a VM is still usable after some new memory
has been hotplugged.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
the integration test creates an initramfs image based on AlpineLinux mini root filesystem
with a simple /init script that just echoes a string to the console. The string
is passed via the kernel cmdline as an environment variable.
Signed-off-by: Damjan Georgievski <gdamjan@gmail.com>
All vhost-user-blk, vhost-user-net and virtio-fs integration tests are
extended with memory resizing. The point is to validate that we can
hotplug some memory and keep these backends to work as expected.
Just a note that because memory resizing is not supported by mmio, we
limit these extensions to all non-mmio use cases.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The same Clear Linux kernel command line was repeated again and again in
the code. That's why this commit takes care of factorizing it, which
simplifies the code but also simplify maintenance whenever we'd need to
update the block UUID.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The hypervisor-fw does not support virtio-blk through mmio transport
layer, therefore we can only run tests with mmio if these tests boot
directly to kernel.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
There's no reason mmio builds cannot support vhost-user-blk, therefore
there's no reason to skip these tests for mmio.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
As a first step towards factorizing as many vhost-user-blk tests as
possible, this patch takes care of unifying their expected output.
It also makes a clear separation between the tests booting from
vhost-user-blk from the ones passing an extra volume.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In order to limit code duplication, vhost-user-net integration tests are
further factorized, as they now include the self spawning test.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Factorise test_virtio_pmem to test changes are not persisted if file is
readonly.
As virtio-mem is now supported in the upstream kernel we can switch to
using the firmware.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Rather than using a raw OS disk image. This will be useful when the test
is extended to doing I/O on the image.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This feature is stable and there is no need for this to be behind a
flag. This will also reduce the time needed to run the integration test
as we will not be running them all again under the flag.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Add an integration test that builds cloud-hypervisor with
the pvh_boot feature and boots a kernel built with CONFIG_PVH.
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Add a new id option to the VFIO hotplug command so that it matches the
VFIO coldplug semantic.
This is done by refactoring the existing code for VFIO hotplug, where
VmAddDeviceData structure is replaced by DeviceConfig. This structure is
the one used whenever a VFIO device is coldplugged, which is why it
makes sense to reuse it for the hotplug codepath.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Replace sudo invocations of bash creating a subshell with simpler
solutions by running the program directly on the input or using tee.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
As part of bringing the tests up with virtio-mmio the virtio-fs were
wrongly disabled. Re-enable them and remove the cache size check as the
cache does not appear in /proc/iomem on virtio-mmio.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
When booting from vhost-user-block the entropy is sometimes lower
triggering a flaky test. We have already use other, more reliable
methods for checking that the VM has booted.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This commit extends the existing test_vfio by hotplugging an extra
virtio-net device to the L2 VM. The test for validating the hotplug
succeeded is the same as the one to verify the non-hotplugged devices.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Now that virtio-mmio transport has support for SHM regions we should be
able to test virtio-fs against it.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
A new ClearLinux image has been uploaded to the Azure storage account.
It is based off of the ClearLinux cloudguest image 31310 version, with
three extra bundles added to it.
First bundle is curl, which adds the curl binary to the image, second
bundle is iperf, adding the iperf binary to the image, and third bundle
is sysadmin-basic to include utility like netcat.
The image is 2G in size.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
On tests that expect a clean shutdown there is no need to try and kill
the child after wait() has returned as the process has already exited.
Further there is no need to sleep before wait() as wait will block until
the VM and VMM shutdown is complete.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This should address any flakiness as the VMM process will have
completely terminated and all files closed.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Many of the tests attempted to SSH into the VM and then run "shutdown"
but don't actually check that the VM has shutdown correctly and proceed
to kill the child process. Remove the associated SSH commands and sleeps
from those tests that are not explicitly checking the shutdown
behaviour.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Only some tests require the output for the tests to be captured so
default to not capturing the output to a pipe and instead make it
controllable.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Many of the tests use an identical network configuration. Add a
GuestCommand::default_net() to generate this configuration and use it
wherever possible.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Many of the tests use an identical disk configuration. Add a
GuestCommand::default_disks() to generate this configuration and use it
wherever possible.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This is a thin wrapper over std::process:Command which currently only
specifies the default binary but in future will handle more default
behaviour.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>