Bound checks for virtio-mem and ACPI memory hotplug are off by
one and two, respectively. This prevents users to fully use the reserved
memory hotplug size.
For ACPI, if we specific `--memory size=2G,hotplug_size=4G` and run
`ch-remote resize --memory 6G`, cloud-hypervisor will report the
following error because of the incorrect bound check:
`<vmm> ERROR:vmm/src/lib.rs:1631 -- Error when resizing VM:
MemoryManager(InsufficientHotplugRam)`
Similarly, for virtio-mem, cloud-hypervisor will fail the incorrect
bound check and abort the resize. The VM will see the following error
in dmesg:
`virtio_mem virtio3: unknown error, marking device broken: -22`
This patch has fixed both bound checks and ensure that users can
hot add memory up to the reserved hotplug size.
Signed-off-by: Yuhong Zhong <yz@cs.columbia.edu>
Misspellings were identified by:
https://github.com/marketplace/actions/check-spelling
* Initial corrections based on forbidden patterns from the action
* Additional corrections by Google Chrome auto-suggest
* Some manual corrections
* Adding markdown bullets to readme credits section
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
The memory region that is associated with the hotpluggable part of
a virtio-mem zone isn't backed by the file specified in the
MemoryZoneConfig. The file is used only for the fixed part of the
zone. When you try to restore a snapshot with virtio-mem, the
backing file is used for all its regions. This results in the
following error:
VmRestore(MemoryManager(GuestMemoryRegion(MappingPastEof)))
This patch sets backing_file only for the fixed part of a virtio-mem
zone.
Fixes: #6337
Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
With the nightly toolchain (2024-02-18) cargo check will flag up
redundant imports either because they are pulled in by the prelude on
earlier match.
Remove those redundant imports.
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
On guests with large amounts of memory, using the `prefault` option can
lead to a very long boot time. This commit implements the strategy
taken by QEMU to prefault memory in parallel using multiple threads,
decreasing the time to allocate memory for large guests by
an order of magnitude or more.
For example, this commit reduces the time to allocate memory for a
guest configured with 704 GiB of memory on 1 NUMA node using 1 GiB
hugepages from 81.44134669s to just 6.865287881s.
Signed-off-by: Sean Banko <sbanko@crusoeenergy.com>
This fixes all typos found by the typos utility with respect to the config file.
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
This commit renames `ram_region_sub_size` to `ram_region_available_size`
and make its value align down to the default page size or hugepage
size of the current memory zone, which can prevent the memory zone from
being split into misaligned parts. And if the available size of ram
region is zero, this region will be marked as consumed even it has
unused space.
Note that there is two methods to use hugepages.
1. Specify `hugepages` for `memory` or `memory-zone`, if the
`hugepage_size` is not specified, the value can be got by `statfs`
for `/dev/hugepages`.
2. Specify a `file` in hugetlbfs for `memory-zone`, the hugepage size
can also be got by `statfs` for the file.
The value for alignment will be the hugepage size if this memory zone
is using hugepages, otherwise the value will be default page size of
system.
Fixes: #5463
Signed-off-by: Yu Li <liyu.yukiteru@bytedance.com>
The previous `arch_memory_regions` function will provide some memory
regions with the specified memory size and fill all the previous
regions before using the next one, but sometimes there may be no need
to fill up the previous one, e.g., the previous one should be aligned
with hugepage size.
This commit make `arch_memory_regions` function not take any
parameters and return the max available regions, the memory manager
can use them on demand.
Fixes: #5463
Signed-off-by: Yu Li <liyu.yukiteru@bytedance.com>
Significant API changes have occured, most significantly is the switch
to an approach which does not require vm-memory and can run no_std.
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
We can ideally defer the address space allocation till we start the
vCPUs for the very first time. Because the VM will not access the memory
until the CPUs start running. Thus there is no need to allocate the
address space eagerly and wait till the time we are going to start the
vCPUs for the first time.
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
This functionality has been obsoleted by our native support for
hugepages and shared memory.
See: #5082
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This simplifies the Snapshot creation as we expect a SnapshotData to be
provided most of the time.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The information about the identifier related to a Snapshot is only
relevant from the BTreeMap perspective, which is why we can get rid of
the duplicated identifier in every Snapshot structure.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
There's no reason to carry a HashMap of SnapshotDataSection per
Snapshot. And given we now provide at most one SnapshotDataSection per
Snapshot, there's no need to keep the id part of the SnapshotDataSection
structure.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The coredump functionality is only implemented for x86_64 so it should
only be compiled in there.
Fixes: #4964
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
If the memory is not backed by a file then it is possible to enable
Transparent Huge Pages on the memory and take advantage of the benefits
of huge pages without requiring the specific allocation of an appropriate
number of huge pages.
TEST=Boot and see that in /proc/`pidof cloud-hypervisor`/smaps that the
region is now THPeligible (and that also pages are being used.)
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
We can't use MAP_ANONYMOUS and still have huge pages so MAP_SHARED is
effectively required when using huge pages.
Unfortunately it is not as simple as always forcing MAP_SHARED if
hugepages is on as this might be inappropriate in the backing file case
hence why there is additional complexity of assigning to mmap_flags on
each case and the MAP_SHARED is only turned on for the anonymous file
huge page case as well as anonymous shared file case.
See: #4805
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
If we do not need an anonymous file backing the memory then do not
create one.
As a side effect this addresses an issue with CoW (mmap with MAP_PRIVATE
but no MAP_ANONYMOUS) when the memory is pinned for VFIO.
Fixes: #4805
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Until there is a need for sharing the memory fd with a child process, we
should err on the safe side to close it on exec.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
This is preliminary work to ensure a migrated VM is created right before
it is restored. This will be useful when moving to a design where the VM
is both created and restored simultaneously from the Snapshot.
In details, that means the MemoryManager is the object that must be
created upon receiving the config from the source VM, so that memory
content can be later received and filled into the GuestMemory.
Only after these steps happened, the snapshot is received from the
source VM, and the actual Vm object can be created from both the
snapshot and the MemoryManager previously created.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Add tracing of the VM boot sequence from the point at which the request
to create a VM is received to the hand-off to the vCPU threads running.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
There's no need to delegate the resize operation to the virtio-mem
thread. This can come directly from the vmm thread which will use the
Mem object to update the VIRTIO configuration and trigger the interrupt
for the guest to be notified.
In order to achieve what's described above, the VirtioMemZone structure
now has a handle onto the Mem object directly. This avoids the need for
intermediate Resize and ResizeSender structures.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
uefi_flash is used when load firmware, that is load payload depends on
device manager. move uefi_flash to memory manager can eliminate the
dependency.
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
Guest memory is needed for analysis in crash tool, so save it
for coredump.
Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Co-authored-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
It's useful to dump the guest, which named coredump so that crash
tool can be used to analysize it when guest hung up.
Let's add GuestDebuggable trait and Coredumpxxx error to support
coredump firstly.
Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Co-authored-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
There is no need to include serde_derive separately,
as it can be specified as serde feature instead.
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
If using the ACPI based hotplug only memory can be added so if the
hotplug RAM size is the same as the boot RAM size then do not include
the memory manager DSDT entries.
Also: this change simplifies the code marginally by making the
HotplugMethod enum Copyable.
This was identified from the following perf output:
1.78% 0.00% vmm cloud-hypervisor [.] <vmm::memory_manager::MemorySlots as acpi_tables::aml::Aml>::append_aml_bytes
|
---<vmm::memory_manager::MemorySlots as acpi_tables::aml::Aml>::append_aml_bytes
<vmm::memory_manager::MemorySlot as acpi_tables::aml::Aml>::append_aml_bytes
acpi_tables::aml::Name::new
<acpi_tables::aml::Path as acpi_tables::aml::Aml>::append_aml_bytes
__libc_malloc
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The reserved space is for devices.
Some devices (like TPM) require arbitrary addresses close to 4GiB.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
The introduction of a error if live resizing is not possible is a
regression compared to the original behaviour where the new size would
be stored in the config and reflected in the next boot. This behaviour
was also inconsistent with the effect of resizing with no VM booted.
Instead of generating an error allow the code to go ahead and update the
config so that the new size will be available upon the reboot.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>