We don't register ioevents for SEV-SNP guests, so we should not
unregister them either.
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
With the nightly toolchain (2024-02-18), cargo check flags redundant
imports, either because they are pulled in by the prelude or by an
earlier match.
Remove those redundant imports.
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
When a guest running on a terminal reboots, the sigwinch_listener
subprocess exits and a new one restarts. The parent never wait()s
for children, so the old subprocess remains as a zombie. With further
reboots, more and more zombies build up.
As there are no other children for which we want the exit status,
the easiest fix is to take advantage of the implicit reaping specified
by POSIX when we set the disposition of SIGCHLD to SIG_IGN.
For this to work, we also need to set the correct default exit signal
of SIGCHLD when using clone3() with CLONE_CLEAR_SIGHAND. Unlike the fallback
fork() path, clone_args::default() initialises the exit signal to zero,
which results in a child with non-standard reaping behaviour.
Signed-off-by: Chris Webb <chris@arachsys.com>
Allow cloud-hypervisor to direct boot the bzImage kernel format using
the regular 32 bit entry point. This can share the memory and vcpu
setup with the regular PVH boot code, but requires the setup of the
'zero page'.
Signed-off-by: Stefan Nuernberger <stefan.nuernberger@cyberus-technology.de>
For SEV-SNP guests, IO events are handled by the GHCB protocol.
When we get the notification, we have to notify via eventfd.
Signed-off-by: Muminul Islam <muislam@microsoft.com>
For SEV-SNP guests, devices use sw-iotlb to gain access to guest
memory for DMA. For that, F_IOMMU/F_ACCESS_PLATFORM must be exposed in
the feature set of virtio devices.
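As a rough illustration of the feature-bit change, the sketch below ORs
the bit into a device's offered feature set. VIRTIO_F_ACCESS_PLATFORM is
bit 33 in the virtio specification (formerly named
VIRTIO_F_IOMMU_PLATFORM). The helper name offered_features and the
sev_snp_enabled parameter are hypothetical and not taken from the patch.

```rust
/// Bit position of VIRTIO_F_ACCESS_PLATFORM in the virtio feature set.
const VIRTIO_F_ACCESS_PLATFORM: u64 = 33;

/// Hypothetical helper: returns the feature set a device offers to the
/// guest. When `sev_snp_enabled` is true, the ACCESS_PLATFORM bit is added
/// to the device's base features.
fn offered_features(base_features: u64, sev_snp_enabled: bool) -> u64 {
    if sev_snp_enabled {
        base_features | (1u64 << VIRTIO_F_ACCESS_PLATFORM)
    } else {
        base_features
    }
}
```

Calling offered_features(base, true) leaves every bit of base set and
additionally sets bit 33.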
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
According to the PCIe specification, a 64-bit MMIO BAR should be
naturally aligned. In addition to being more compliant with
the specification, naturally aligned BARs are mapped with
the largest possible page size by the host iommu driver, which
should speed up boot time and reduce IOTLB thrashing for virtual
machines with VFIO devices.
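Natural alignment here means the BAR base address is a multiple of the
BAR size, which is a power of two. A minimal sketch of that arithmetic
follows. The function names are illustrative only and do not come from
the allocator touched by this commit.

```rust
/// Rounds `addr` up to the next multiple of `size`.
/// Returns None if `size` is not a non-zero power of two or if the
/// rounding would overflow a u64.
fn align_up_natural(addr: u64, size: u64) -> Option<u64> {
    if !size.is_power_of_two() {
        return None;
    }
    let mask = size - 1;
    addr.checked_add(mask).map(|a| a & !mask)
}

/// Returns true if `addr` is a multiple of the power-of-two `size`.
fn is_naturally_aligned(addr: u64, size: u64) -> bool {
    size.is_power_of_two() && addr & (size - 1) == 0
}
```

For a 1 GiB BAR (0x4000_0000), a candidate address of 0x1_2345_0000 is
rounded up to 0x1_4000_0000.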
Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>
The 'generate_ram_ranges' function currently hardcodes the assumption
that there are only 2 E820 RAM entries. This is not flexible enough to
handle vendor specific memory holes. Returning a Vec is also more
convenient for users of this function.
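To illustrate why a Vec is a better fit than a fixed pair of entries,
here is a sketch that carves an arbitrary number of holes out of an
address range. The name ram_ranges, its signature, and the half-open
(start, end) tuples are assumptions for illustration and are not the
actual 'generate_ram_ranges' interface.

```rust
/// Hypothetical helper: returns the parts of [start, end) that are not
/// covered by any hole. Holes are half-open (start, end) pairs and may be
/// given in any order.
fn ram_ranges(start: u64, end: u64, holes: &[(u64, u64)]) -> Vec<(u64, u64)> {
    let mut sorted: Vec<(u64, u64)> = holes.to_vec();
    sorted.sort();

    let mut ranges = Vec::new();
    let mut cursor = start;
    for (hole_start, hole_end) in sorted {
        if hole_start >= end {
            break;
        }
        // RAM between the current position and the next hole.
        if hole_start > cursor {
            ranges.push((cursor, hole_start));
        }
        // Skip past the hole without ever moving backwards.
        cursor = cursor.max(hole_end);
    }
    if cursor < end {
        ranges.push((cursor, end));
    }
    ranges
}
```

With a single hole this yields the familiar two entries, and each
additional hole simply adds one more element to the returned Vec.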
Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>
For SEV-SNP guests, the ACPI tables are generated inside the IGVM
file, so we don't need to generate them inside cloud-hypervisor.
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
... enabled VMs. IOEvents are not supported for SEV-SNP VMs. All
the IO events are delivered via the GHCB protocol.
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
This will help identify whether a VM supports SEV-SNP and, based on
that, enable or disable certain features.
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
Implement a workflow to run static analysis and linting of all shell
scripts using shfmt and shellcheck.
Fixes: #5396
Signed-off-by: Ruslan Mstoi <ruslan.mstoi@intel.com>
We occasionally saw cloud-hypervisor crash due to seccomp violations. The
coredumps showed the HTTP API thread crashing after it attempted to call
sched_yield(). The call came from the Rust stdlib's mpmc module, which calls
sched_yield() if several attempts to busy-wait for a condition to be fulfilled
fall short.
Since the system call is harmless and it comes from the stdlib, I opted to allow
all threads to call it.
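The spin-then-yield pattern described above can be sketched as follows.
This is not the stdlib's mpmc code: SPIN_LIMIT and wait_until_set are
made-up names, and the backoff shape is an assumption. What it relies on
is that std::thread::yield_now() is implemented with sched_yield() on
Linux, which is how the syscall reaches the seccomp filter.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

/// Number of spinning rounds before falling back to yielding (assumption).
const SPIN_LIMIT: u32 = 6;

/// Waits until `flag` becomes true. Spins with exponentially growing
/// rounds first, then yields the CPU on every further iteration.
/// Returns how many times the thread yielded.
fn wait_until_set(flag: &AtomicBool) -> u32 {
    let mut step = 0u32;
    let mut yields = 0u32;
    while !flag.load(Ordering::Acquire) {
        if step < SPIN_LIMIT {
            // Busy-wait phase: no syscall is made here.
            for _ in 0..(1u32 << step) {
                std::hint::spin_loop();
            }
            step += 1;
        } else {
            // Yield phase: on Linux this issues sched_yield().
            thread::yield_now();
            yields += 1;
        }
    }
    yields
}
```

If the flag is already set, the function returns immediately without
spinning or yielding, which is why the syscall only shows up
occasionally under contention.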
Signed-off-by: Peteris Rudzusiks <rye@stripe.com>
Currently the only way to set the affinity for virtio block threads is
to boot the VM, search for the tid of each of the virtio block threads,
then set the affinity manually. This commit adds an option to pin virtio
block queues to specific host cpus (similar to pinning vcpus to host
cpus). A queue_affinity option has been added to the disk flag in
the cli to specify a mapping of queue indices to host cpus.
Signed-off-by: acarp <acarp@crusoeenergy.com>
The traditional way of configuring vCPUs doesn't work for SEV-SNP guests. All
the vCPU configuration for an SEV-SNP guest is provided via the VMSA.
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
This also fixes the following clippy warning on nightly build from cargo
fuzz:
warning: struct `RegionEntry` is never constructed
   --> /home/chenb/project/cloud-hypervisor/cloud-hypervisor/block/src/vhdx/vhdx_header.rs:357:8
    |
357 | struct RegionEntry {
    |        ^^^^^^^^^^^
    |
    = note: `RegionEntry` has derived impls for the traits `Debug` and `Clone`, but these are intentionally ignored during dead code analysis
    = note: `#[warn(dead_code)]` on by default
Signed-off-by: Bo Chen <chen.bo@intel.com>
For all VFIO devices, map all non-emulated MMIO regions to
the vfio container to allow PCIe P2P between all VFIO devices
on the virtual machine. This is required for a wide variety of
advanced GPU workloads such as GPUDirect P2P (DMA between two
GPUs) and GPUDirect RDMA (DMA between a GPU and an IB device).
Signed-off-by: Thomas Barrett <tbarrett@crusoeenergy.com>
Currently, only the kernel and firmware are checked as a payload.
IGVM should be checked as well. Otherwise, it hangs indefinitely.
Signed-off-by: Muminul Islam <muislam@microsoft.com>
Unaccepted GPA is usually raised by the Microsoft hypervisor in case of
a mismatch between GPA and GVA mappings. This is a fatal message from the
hypervisor's perspective, so we need to error out from the vcpu run
loop. Also add some debug messages to identify the broken mapping
between GVA and GPA.
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
On guests with large amounts of memory, using the `prefault` option can
lead to a very long boot time. This commit implements the strategy
taken by QEMU to prefault memory in parallel using multiple threads,
decreasing the time to allocate memory for large guests by
an order of magnitude or more.
For example, this commit reduces the time to allocate memory for a
guest configured with 704 GiB of memory on 1 NUMA node using 1 GiB
hugepages from 81.44134669s to just 6.865287881s.
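The idea of prefaulting in parallel can be sketched with the standard
library alone: split the region into page-aligned chunks and let each
thread touch one byte per page. This is an illustration under stated
assumptions, not the code from this commit. The 4 KiB PAGE_SIZE, the
function name, and the use of a plain byte slice instead of the guest
memory mapping are all hypothetical.

```rust
use std::thread;

/// Page size used to step through the region (assumption: 4 KiB).
const PAGE_SIZE: usize = 4096;

/// Touches one byte in every page of `mem`, dividing the pages across up
/// to `num_threads` threads. Each byte is read and written back unchanged,
/// so existing contents are preserved. Returns the number of pages touched.
fn prefault_parallel(mem: &mut [u8], num_threads: usize) -> usize {
    if mem.is_empty() {
        return 0;
    }
    let num_threads = num_threads.max(1);
    let total_pages = (mem.len() + PAGE_SIZE - 1) / PAGE_SIZE;
    let pages_per_thread = (total_pages + num_threads - 1) / num_threads;
    let chunk_len = pages_per_thread * PAGE_SIZE;

    thread::scope(|s| {
        let handles: Vec<_> = mem
            .chunks_mut(chunk_len)
            .map(|chunk| {
                s.spawn(move || {
                    let mut touched = 0usize;
                    for page in chunk.chunks_mut(PAGE_SIZE) {
                        let p: *mut u8 = page.as_mut_ptr();
                        // Volatile access so the write is not optimised away.
                        unsafe {
                            let v = std::ptr::read_volatile(p);
                            std::ptr::write_volatile(p, v);
                        }
                        touched += 1;
                    }
                    touched
                })
            })
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).sum()
    })
}
```

Because every thread owns a disjoint chunk, no synchronisation is needed
beyond joining the threads at the end of the scope.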
Signed-off-by: Sean Banko <sbanko@crusoeenergy.com>