Now feature "mshv" can be built together with "kvm". There is no need to
use "--no-default-features" any more.
Fixes: #5647
Signed-off-by: Bo Chen <chen.bo@intel.com>
The reset system is asynchronous with an I/O event (PIO or MMIO) for
ACPI/i8042/CMOS triggering a write to the reset_evt event handler. The
VMM thread will pick up this event on the VMM main loop and then trigger
a shutdown in the CpuManager. However since there is some delay between
the CPU threads being marked to be killed (through the
CpuManager::cpus_kill_signalled bool) it is possible for the guest vCPU
that triggered the exit to be re-entered when the vCPU KVM_RUN is called
after the I/O exit is completed.
This is undesirable and in particular the Linux kernel will attempt to
jump to real mode after a CMOS based exit - this is unsupported in
nested KVM on AMD on Azure and will trigger an error in KVM_RUN.
Solve this problem by spinning in the device that has triggered the
reset until the vcpus_kill_signalled boolean has been updated
indicating that the VMM thread has received the event and called
CpuManager::shutdown(). In particular if this bool is set then the vCPU
threads will not re-enter the guest.
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
Currently if container registry is inaccessible the image will be built
locally and that takes time. This patch adds support to use mirror
registry. To use a different registry CTR_IMAGE environment variable
must be set. For example:
CTR_IMAGE="registry/cloud-hypervisor" scripts/dev_cli.sh
or
export CTR_IMAGE="registry/cloud-hypervisor"
scripts/dev_cli.sh
Signed-off-by: Ruslan Mstoi <ruslan.mstoi@intel.com>
Split interrupt source group restore into two steps, first restore
the irqfd for each interrupt source entry, and second restore the
GSI routing of the entire interrupt source group.
This patch will reduce restore latency of interrupt source group,
and in a 200-concurrent restore test, the patch reduced the
average IOAPIC restore time from 15ms to 1ms.
Signed-off-by: Yong He <alexyonghe@tencent.com>
The VFIO tests based on the NVMe emulation framework require cause a
high number of file descriptors to be opened on the system. On systems
with a low limit on opened files, tests like test_vfio_user can possibly
fail with "too many open files" exeption. This issue is fixed by raising
the corresponding limit.
A tricky detail here is, that the limit has to be changed in the test
container and not on the host. As the the container is executed in the
privileged mode, the setting in it will override the host value.
Related: #5426
Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
Add integration test of coredump with no need pause.
As file of coredump has been tested in test_coredump(), so this
patch only test vm state after coredump.
Signed-off-by: Yi Wang <foxywang@tencent.com>
If the VMM is not already paused then pause the VM prior to executing
the coredump and then resume it after. If the VM is already paused then
the original state is maintained.
Signed-off-by: Yi Wang <foxywang@tencent.com>
As the commit 2f15d027c05f (KVM: x86: Properly handle APF vs disabled
LAPIC situation) has been merged into kernel 5.13, so update the minimum
performance host kernel version to that.
Signed-off-by: Yi Wang <foxywang@tencent.com>
The commit b92fe648e925 (vmm: cpu: Disable KVM_FEATURE_ASYNC_PF_INT in
CPUID) disabled APF (Asynchronous Page Fault) mechanism to address
problem that makes vcpu thread spin 100%. As the actual issue is in
KVM, which has been merged in commit 2f15d027c05f (KVM: x86: Properly
handle APF vs disabled LAPIC situation) since 2021, so it's okay to
re-enable APF now.
Signed-off-by: Yi Wang <foxywang@tencent.com>
Add MSHV_CREATE_DEVICE, MSHV_SET_DEVICE_ATTR ioctls to filters. These
ioctls are required to passthrough PCI devices on mshv.
Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
The pause of vcpu is async now, which makes the vm pause is not
synchronised. As virtio device calls paused_sync wait() to make
sure device_manager pause synchronously, if we make vcpu pause
synchronously, the vm pause can be synchronously then. After
vm.pause() returns the vm is really paused now.
This patch adds a AtomicBool variable to mark vcpu paused state,
to make sure the pause of CpuManager is synchronised.
Signed-off-by: Yi Wang <foxywang@tencent.com>
This test case creates a new qcow2 file using the image of ubuntu as
its backing file, and boot a virtual machine with this image file.
Signed-off-by: Yu Li <liyu.yukiteru@bytedance.com>
This preserves any data that the backing file had on a cluster when
doing a write to a subset of that cluster. These writes cause a
performance penalty on creating new clusters if a backing file is
present.
This commit is based on crosvm implementation:
5ad3bc3459
Signed-off-by: Yu Li <liyu.yukiteru@bytedance.com>
Reads to qcow files with backing files will fall through to the backing
file if there is no allocated cluster. As of this change, a write will
still trash the cluster and hide any data already present.
This commit is based on crosvm implementation:
d8144a56e2
Signed-off-by: Yu Li <liyu.yukiteru@bytedance.com>
This commit allows opening qcow with a backing file, which supports any
type implementing `BlockBackend`.
This commit is based on crosvm implementation:
9ca6039b03
Signed-off-by: Yu Li <liyu.yukiteru@bytedance.com>
This commit introduces the trait `BlockBackend` with generic ops
including read, write and seek, which can be used for common I/O
interfaces for the block types without using `DiskFile` and `AsyncIo`.
Signed-off-by: Yu Li <liyu.yukiteru@bytedance.com>
This commit merges crates `qcow`, `vhdx` and `block_util` into the
crate `block`, which can allow `qcow` to use functions from `block_util`
without introducing a circular crate dependency.
This commit is based on crosvm implementation:
f2eecc4152
Signed-off-by: Yu Li <liyu.yukiteru@bytedance.com>
This commit introduces a generic type `FixedVhd` for its sync and async
disk I/O types, and the disk I/O types will use it directly instead of
maintaining a file themselves.
Signed-off-by: Yu Li <liyu.yukiteru@bytedance.com>
Using this over the LocalApic supports APIC IDs (and hence number of
vCPUs) above 256 (size increases from u8 to u32.)
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
If the VM has been configured but not yet booted, all we need to do to
support removing a device is to remove it from the config, so it will
never be created.
Signed-off-by: Alyssa Ross <hi@alyssa.is>