When shutting down a VM using VFIO, the following error has been
detected:
vfio-ioctls/src/vfio_device.rs:312 -- Could not delete VFIO group:
KvmSetDeviceAttr(Error(9))
After some investigation, it appears the KVM device file descriptor used
for removing a VFIO group was already closed. This is coming from the
Rust sequence of Drop, from the DeviceManager all the way down to
VfioDevice.
Because the DeviceManager owns passthrough_device, which is effectively
a KVM device file descriptor, when the DeviceManager is dropped, the
passthrough_device follows, with the effect of closing the KVM device
file descriptor. Problem is, VfioDevice has not been dropped yet and it
still needs a valid KVM device file descriptor.
That's why the simple way to fix this issue coming from Rust dropping
all resources is to make Linux accountable for it by duplicating the
file descriptor. This way, even when the passthrough_device is dropped,
the KVM file descriptor is closed, but a duplicated instance is still
valid and owned by the VfioContainer.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
We turn on that emulation for Windows. Windows does not have KVM's PV
clock, so calling notify_guest_clock_paused results in an error.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
While using the virtio-iommu device involving L2 scenario, and tearing
things down all the way from L2 back to L0 exposed some bad syscalls
that were not part of the authorized list.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Now that we've written Windows integration tests and the associated
script to launch them, this patch enables the support for Windows tests
in dev_cli.sh, so that we can run it in our Cloud Hypervisor container.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This is a new integration test running Windows as a guest with Cloud
Hypervisor. Once the VM is booted, the test connects to the guest
through SSH and shutdown the VM. If this succeeds, this means the VM
was properly booted to userspace and that the network was functional.
Important to note that because this test generates lots of logs, it
requires a large pipe size for both stdout and stderr.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Introduce a new test that will validate the new option `max_phys_bits`
from the `--cpus` parameter behaves as expected.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
If the user specified a maximum physical bits value through the
`max_phys_bits` option from `--cpus` parameter, the guest CPUID
will be patched accordingly to ensure the guest will find the
right amount of physical bits.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
If the user provided a maximum physical bits value for the vCPUs, the
memory manager will adapt the guest physical address space accordingly
so that devices are not placed further than the specified value.
It's important to note that if the number exceed what is available on
the host, the smaller number will be picked.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In order to let the user choose maximum address space size, this patch
introduces a new option `max_phys_bits` to the `--cpus` parameter.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The 'GuestAddress::unchecked_add' function has undefined behavior when
an overflow occurs. Its alternative 'checked_add' requires use to handle
the overflow explicitly.
Signed-off-by: Bo Chen <chen.bo@intel.com>
We are now reserving a 256M gap in the guest address space each time
when hotplugging memory with ACPI, which prevents users from hotplugging
memory to the maximum size they requested. We confirm that there is no
need to reserve this gap.
This patch removes the 'reserved gaps'. It also refactors the
'MemoryManager::start_addr' so that it is rounding-up to 128M alignment
when hotplugged memory is allowed with ACPI.
Signed-off-by: Bo Chen <chen.bo@intel.com>
In previous dev_cli.sh, the `uname -m` command will generate
either `x86_64` or `aarch64`, which is inconsistent with the
architectures in the Dockerfile, namely `amd64` and `arm64`.
This will cause some dependancy missing in the docker container
when the docker image is built locally.
This commit fixes this inconsistancy.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
We now try to create a ram region of size 0 when the requested memory
size is the same as current memory size. It results in an error of
`GuestMemoryRegion(Mmap(Os { code: 22, kind: InvalidInput, message:
"Invalid argument" }))`. This error is not meaningful to users and we
should not report it.
Signed-off-by: Bo Chen <chen.bo@intel.com>
A failure appeared in AArch64 musl cross build, after upgrading rust
to v1.47.0. A symbol "strrchr" was missing while linking against
static libfdt.a.
The issue could be caused by missing symbol(s) in new rust toolchain.
This fix pins the rust version in this cross build action to a stable-
enough version. Further upgrade will be done manually after testing.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
"Using a stable sort consumes more memory and cpu cycles. Because values
which compare equal are identical, preserving their relative order (the
guarantee that a stable sort provides) means nothing, while the extra
costs still apply."
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This is a new clippy check introduced in 1.47 which requires the use of
the matches!() macro for simple match blocks that return a boolean.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Instead of having the hypervisor crate embedding Cloud-Hypervisor forks
from the rust-vmm project, it's more appropriate to leave the rust-vmm
references in the hypervisor crate, and have the root Cargo.toml being
patched.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This changeset extends the documentation with the UEFI and Windows
related info. The focus is on providing consumer with a minimum
necessary and proper piece of the information to enter the features
quickly. While UEFI is a cross platform topic, it is a required
prerequisite for the Windows usage.
Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
The OneRegister literally means "one (arbitrary) register". Just call it
"Register" instead. There is no need to inherit KVM's naming scheme in
the hypervisor agnostic code.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
The goal here is to replace anywhere possible a virtio structure
with a "C, packed" representation by a "C" representation. Some
virtio structures are not expected to be packed, therefore there's
no reason for using the more restrictive "C, packed" representation.
This is important since "packed" representation can still cause
undefined behaviors with Rust 2018.
By removing the need for "packed" representation, we can simplify a
bit of code by deriving the Serialize trait without writing the
implementation ourselves.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Small patch creating a dedicated `block_io_uring_is_supported()`
function for the non-io_uring case, so that we can simplify the
code in the DeviceManager.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The purpose of that step is to make sure each commit builds. The `check`
command is much faster for that purpose.
On my 32-core machine `cargo check --all` takes around 25 seconds while
`cargo build --all` takes around 35 seconds, so that's quite a bit of
time saving there.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
Since all unit and integration tests are run inside containers because
they are called from dev_cli.sh, they always run as root. That's why
both unit and integration scripts can be simplified as they don't need
to apply specific capabilities and run cargo tests in a dedicated 'kvm'
group.
Fixes#1683
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>