Run loop in hypervisor needs a callback mechanism to access resources
like guest memory, mmio, pio etc.
VmmOps trait is introduced here, which is implemented by vmm module.
While handling vcpuexits in run loop, this trait allows hypervisor
module access to the above mentioned resources via callbacks.
Signed-off-by: Praveen Paladugu <prapal@microsoft.com>
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
A new version of vm-memory was released upstream which resulted in some
components pulling in that new version. Update the version number used
to point to the latest version but continue to use our patched version
due to the fix for #1258
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The PMEM support has an option called "discard_writes" which when true
will prevent changes to the device from hitting the backing file. This
is trying to be the equivalent of "readonly" support of the block
device.
Previously the memory of the device was marked as KVM_READONLY. This
resulted in a trap when the guest attempted to write to it resulting a
VM exit (and recently a warning). This has a very detrimental effect on
the performance so instead make "discard_writes" truly CoW by mapping
the memory as `PROT_READ | PROT_WRITE` and using `MAP_PRIVATE` to
establish the CoW mapping.
Fixes: #1795
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
When all refblocks are consumed, the loop looking for the first free
cluster would access the element at refcounts[refcounts.len()], which is
out of bounds. Modify the free cluster search loop to check that the
index is in bounds before accessing it.
Tested-by: Daniel Verkamp <dverkamp@chromium.org>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Dylan Reid <dgreid@chromium.org>
Commit-Queue: Daniel Verkamp <dverkamp@chromium.org>
(cherry picked from crosvm commit f21572c7187c8beb9c6bfea6446351ae93200d01)
Fixes: #1792
Signed-off-by: Bo Chen <chen.bo@intel.com>
The virtio-balloon change the memory size is asynchronous.
VirtioBalloonConfig.actual of balloon device show current balloon size.
This commit add memory_actual_size to vm.info to show memory actual size.
Signed-off-by: Hui Zhu <teawater@antfin.com>
When the destination mode is physical, the destination field should
only be defined through bits 56-59, as defined in the IOAPIC spec. But
from the APIC specification, the APIC ID is always defined on 8 bits no
matter which destination mode is selected. That's why we always retrieve
the destination field based on bits 56-63.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This patch adds two required dependencies to fuzz/Cargo.toml, and fixes
the building error on the 'block' fuzzer.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Now that Docker images are automatically generated for both amd64 and
arm64 architectures, there's no need to generate the arm64 image locally
on the ARM CI during a CI run. The image should be available from
DockerHub instead.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Write to the exit_evt EventFD which will trigger all the devices and
vCPUs to exit. This is slightly cleaner than just exiting the process as
any temporary files will be removed.
Fixes: #1242
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This commit moves back to the branch "virtio-fs-dev" from virtiofsd, as
we figured the changes needed to use this branch and the requirements
from the new meson build from QEMU.
It updates the container version to ensure the dev_cli.sh script will
rely on the latest container which contains the needed packages.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
By fixing the Dockerfile, we have now finalized the automated generation
of the Docker images for both architectures (amd64 and arm64).
Fixes#953
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This patch adds the missing the `iommu` and `id` option for
`VmAddDevice` in the openApi yaml to respect the internal data structure
in the code base. Also, setting the `id` explicitly for VFIO device
hotplug is required for VFIO device unplug through openAPI calls.
Signed-off-by: Bo Chen <chen.bo@intel.com>
According to openAPI specification [1], the format for `integer` types
can be only `int32` or `int64`, unsigned and 8-bits integers are not
supported.
This patch replaces `uint64` with `int64`, `uint32` with `int32` and
`uint8` with `int32`.
[1]: https://swagger.io/specification/#data-types
Signed-off-by: Julio Montes <julio.montes@intel.com>
MsiInterruptGroup doesn't need to know the internal field names of
InterruptRoute. Introduce two helper functions to eliminate references
to irq_fd. This is done similarly to the enable and disable helper
functions.
Also drop the pub keyword from InterruptRoute fields. It is not needed
anymore.
No functional change.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
In order to support both amd64 and arm64, we rely on the TARGETARCH
variable that is passed from the docker buildx command, based on the
platform used to build the container image.
There is no way to rely directly on $(uname -m) to assign a variable
with the correct x86_64 or aarch64 values we're looking for. Both ENV
and ARG don't evaluate the command, which means they see it as a simple
string. Using RUN is the only way to evaluate a command.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In order to build virtiofsd from the latest build system, the Python
package python3-setuptools is required.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The idea is to rely on this new Github Action to detect when the
Dockerfile is updated after a push to the master branch on the
repository.
Once triggered, this action builds the Docker image for both
linux/amd64 and linux/arm64 platforms, and updates it directly
on Docker Hub.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
According to openAPI specification[1], the format for `integer` types
can be only `int32` or `int64`, unsigned integers are not supported.
This patch replaces `uint64` with `int64`.
[1]: https://swagger.io/specification/#data-types
Signed-off-by: Julio Montes <julio.montes@intel.com>
In order to speed up the Linux boot (so as to avoid it having to scan a
large number of pages) place the MP table directly after the SMBIOS
table if there is sufficient room. The start address of the SMBIOS table
is one of the three (and the largest) location that the MP table can
also be located at.
Before:
[ 0.000399] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WP UC- WT
[ 0.014945] check: Scanning 1 areas for low memory corruption
After:
[ 0.000284] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WP UC- WT
[ 0.000421] found SMP MP-table at [mem 0x000f0090-0x000f009f]
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
There is no point in manually dropping the lock for gsi_msi_routes then
instantly grabbing it again in set_gsi_routes.
Make set_gsi_routes take a reference to the routing hashmap instead.
No functional change intended.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
The MTRR feature was missing from the CPUID, which is causing the guest
to ignore the MTRR settings exposed through dedicated MSRs.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Removing the ISA DMA configurations prevents the kernel from accessing
the port I/O 0x87, which was generating the following warning:
WARN:vmm/src/cpu.rs:378 -- Guest PIO read to unregistered address 0x87
Removing the TELCLOCK configuration prevents the kernel from accessing
the port I/O reserved for the memory manager, which was causing the
following warning:
WARN:vmm/src/memory_manager.rs:289 -- Unexpected offset for accessing
memory manager device: 15
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The ::new() does very little beyond trying to open the /dev/kvm device
so provide a hint to the user about what has gone wrong.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
As discussed in #1707, the `vcpu` thread can be stalled when using
`--serial tty`. To workaround that issue, this patch enforces to resize
the pipe size to 256K when we capture the stdout/stderr of the
cloud-hypervisor child process in the integration tests. Note that the
pipe size (256K) is chosen based on the output size of our integration
tests at this point, which may need to be increased in the future.
Signed-off-by: Bo Chen <chen.bo@intel.com>
By looking at Linux kernel boot time, we identified that a lot of time
was spent registering and unregistering IRQ fds to KVM. This is not
efficient and certainly not a wrong behavior from the Linux kernel,
but rather a problem with the Cloud-Hypervisor's implementation of
MSI-X.
The way to fix this issue is by ensuring the initial conditions are
correct, which means the entire MSI-X vector table must be disabled
and masked. Additionally, each vector must be individually masked.
With these correct conditions, Linux won't start masking interrupt
vectors, and later unmask them since they will be seen as masked from
the beginning. This means the OS will simply have to unmask them when
needed, avoiding the extra operation.
Another aspect of this patch is to prevent Cloud-Hypervisor from
enabling (by registering IRQ fd) all vectors when either the global
'mask' or 'enable' bits are set. Instead, we can simply let the mask()
and unmask() operations take care of it if needed.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Since Cloud-Hypervisor currently support one single PCI bus, we must
reflect this through the MCFG table, as it advertises the first bus and
the last bus available. In this case both are bus 0.
This patch saves quite some time during guest kernel boot, as it
prevents from checking each bus for available devices.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>