This is preliminary work to ensure a migrated VM is created right before
it is restored. This will be useful when moving to a design where the VM
is created and restored simultaneously from the Snapshot.
In detail, this means the MemoryManager is the object that must be
created upon receiving the config from the source VM, so that the
memory content can later be received and filled into the GuestMemory.
Only after these steps have happened is the snapshot received from the
source VM, and the actual Vm object can then be created from both the
snapshot and the previously created MemoryManager.
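A rough sketch of the intended ordering (all helper names below are
hypothetical, for illustration only):

```rust
// Hypothetical destination-side sequence; none of these helpers are
// the actual Cloud Hypervisor API.
fn receive_migration(socket: &mut MigrationSocket) -> Result<Vm, Error> {
    // 1. The source sends the VM config first.
    let config = recv_config(socket)?;

    // 2. Create the MemoryManager from that config. This sets up the
    //    GuestMemory regions that the incoming memory content will fill.
    let memory_manager = MemoryManager::new(&config)?;

    // 3. Stream the guest memory content into the GuestMemory.
    recv_memory(socket, &memory_manager)?;

    // 4. Only now receive the snapshot and build the Vm from both the
    //    snapshot and the previously created MemoryManager.
    let snapshot = recv_snapshot(socket)?;
    Vm::new_from_snapshot(&snapshot, memory_manager)
}
```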
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The kernel will trigger a SIGBUS upon hugetlb page faults when there
are no huge pages available. We have no way to ensure that enough huge
pages are available on the host system, nor a way to gracefully report
the lack of huge pages in advance from Cloud Hypervisor. For these
reasons, we have to stop using huge pages in the virtio-mem fuzzer so
that it does not run into SIGBUS errors.
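For illustration, a minimal sketch (using the libc crate, with an
arbitrary size) of the kind of mapping the fuzzer should rely on:

```rust
use std::ptr;

fn main() {
    let size: usize = 256 << 20; // 256 MiB, arbitrary for illustration

    // Regular anonymous memory: page faults are always backed on
    // demand. Adding MAP_HUGETLB to the flags would instead make later
    // faults raise SIGBUS whenever the host has no huge page left.
    let addr = unsafe {
        libc::mmap(
            ptr::null_mut(),
            size,
            libc::PROT_READ | libc::PROT_WRITE,
            libc::MAP_ANONYMOUS | libc::MAP_PRIVATE, // no MAP_HUGETLB
            -1,
            0,
        )
    };
    assert_ne!(addr, libc::MAP_FAILED);
}
```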
Signed-off-by: Bo Chen <chen.bo@intel.com>
These look alarming if you are booting with a distro kernel, which is
now a recommended approach.
See: #4786
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The restore path of MemoryManager is handled specially without
implementing a `Snapshottable::restore()`. Remove the explicit call to
it along the migration code path to avoid confusion.
See: #4783
Signed-off-by: Bo Chen <chen.bo@intel.com>
The systemd journal has a known issue of generating very large logs
[1], which makes it unreliable as a source for retrieving system
information, such as counting reboots. This is particularly problematic
on disk-constrained systems, like the VMs we launch for our integration
tests, where the disk size is normally 2GB. By default, the systemd
journal has a size limit of 10% of the size of the underlying file
system (e.g. around 200MB for the VMs of our integration tests), and it
removes archived journal files on demand once that limit is reached.
A better alternative for counting reboots is the information from
`wtmp` (the login records), which is much more concise and can be
accessed via the `last` command.
[1] https://github.com/systemd/systemd/issues/5285
Fixes: #4749
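As a sketch of the alternative (the helper below is ours, not code
from the tests): `last reboot` prints one line per reboot recorded in
/var/log/wtmp, so counting reboots reduces to counting those lines.

```rust
use std::process::Command;

// Count reboots from the wtmp login records instead of the journal.
fn reboot_count() -> std::io::Result<usize> {
    let output = Command::new("last").arg("reboot").output()?;
    let stdout = String::from_utf8_lossy(&output.stdout);
    // `last reboot` prints one "reboot   system boot ..." line per boot.
    Ok(stdout.lines().filter(|l| l.starts_with("reboot")).count())
}
```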
Signed-off-by: Bo Chen <chen.bo@intel.com>
Refresh our README in a consistent style and update it to reflect:
* A recommendation to use binaries
* Clarify our relationship with other Rust-based VMMs and the Rust-VMM project
* Ensure instructions result in a usable image (cloud-init)
* Simplify script instructions
* Move compilation details elsewhere
* Add Fedora 36 image details
* Point to CLOUDHV as well as Rust Hypervisor Firmware
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
In the function create_for_size(), when calculating the number of
clusters for the L1 table, the code incorrectly divides the total
number of entries in the L1 table by the cluster size. It should
instead divide by the number of entries/addresses that fit into a
cluster.
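A sketch of the corrected arithmetic (function name ours): each L1
entry is an 8-byte address, so a cluster holds cluster_size / 8
entries, and that is the right divisor.

```rust
// Number of clusters needed to hold the L1 table. Each L1 entry is an
// 8-byte (u64) address, so one cluster stores cluster_size / 8 entries.
fn l1_table_clusters(l1_entries: u64, cluster_size: u64) -> u64 {
    let entries_per_cluster = cluster_size / 8;
    l1_entries.div_ceil(entries_per_cluster)
}

// Example: 131072 L1 entries with 64 KiB clusters gives 8192 entries
// per cluster, i.e. 16 clusters; dividing by the cluster size itself
// would have wrongly given 2.
```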
Signed-off-by: lv.mengzhao <lv.mengzhao@zte.com.cn>
Vdpa now implements the Migratable trait, which allows the device to be
added to the DeviceTree and therefore allows live migrating any vDPA
device that supports being suspended.
Given a vDPA device can't be resumed from a suspended state without
resetting everything, we support neither pause/resume nor
snapshot/restore for a vDPA device (the latter requires resume to be
supported).
In order for the migration to work locally, reusing the same device on
the same host machine, the vhost-vdpa handler is dropped after the
snapshot has been performed, which allows the destination VM to open the
device without any conflict about the device being busy.
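A rough sketch of that last point, with assumed field and helper
names:

```rust
// Assumed shape: `vhost` is the Option-wrapped vhost-vdpa handle.
impl Vdpa {
    fn snapshot(&mut self) -> Result<Snapshot, MigratableError> {
        let snapshot = snapshot_from_state(&self.state())?; // placeholder
        // Dropping the handle closes /dev/vhost-vdpa-N, so the
        // destination VM can open the same device without EBUSY.
        self.vhost = None;
        Ok(snapshot)
    }
}
```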
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In order to prepare for migration support, we need to be able to
create a Vdpa object without a VhostKernVdpa object associated with it.
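A minimal sketch of what this looks like (field names assumed):

```rust
pub struct Vdpa {
    id: String,
    // None when the object is created ahead of a restore; populated
    // once the kernel device is actually opened.
    vhost: Option<VhostKernVdpa<GuestMemoryAtomic<GuestMemoryMmap>>>,
    // ...
}
```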
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Adding VHOST_VDPA_GET_CONFIG_SIZE and VHOST_VDPA_SUSPEND to the list of
authorized ioctls for the vmm thread.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
When restoring a VM, the BAR type can be found directly from the
snapshot resources. This is more reliable than the previous method,
which used self.use_64bit_bar from VirtioPciDevice: at the time the
BARs are allocated, the VirtioDevice hasn't been restored yet, so the
computed value of use_64bit_bar is wrong for a device like vDPA. At
that point the device type is not yet known, and relying on the stored
resources is the only reliable approach.
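A sketch of the idea, with assumed names:

```rust
// Prefer the snapshotted resource over the device when deciding
// between 32-bit and 64-bit memory BARs.
let use_64bit_bar = match snapshotted_bar_type {
    Some(PciBarRegionType::Memory64BitRegion) => true,
    Some(_) => false,
    // No snapshot resources: fresh boot, the device is authoritative.
    None => device.lock().unwrap().use_64bit_bar(),
};
```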
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
With an updated Linux kernel, the guest kernel reprograms the same
entry in the MSI-X table multiple times. Fortunately it does so with
the entry masked (the default).
The costly part of MSI-X table reprogramming is going out to the host
kernel to update the GSI routing entries. This change makes three
optimisations (see the sketch after the list):
1. If the table entry is unchanged: skip handling the update
2. Only update the GSI routing table if the table entry is unmasked
(this skips extra calls to the ioctl() for reprogramming the
entries).
3. Only generate a message on entry unmasking if the global MSI-X enable
bit is set.
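A condensed sketch of the resulting write path (types and method names
are stand-ins, not the actual MsixConfig code):

```rust
#[derive(Clone, Copy, PartialEq)]
struct MsixTableEntry {
    msg_addr: u64,
    msg_data: u32,
    vector_ctl: u32,
}

impl MsixTableEntry {
    fn masked(&self) -> bool {
        // Bit 0 of Vector Control is the per-vector mask bit.
        self.vector_ctl & 0x1 != 0
    }
}

struct MsixConfig {
    table: Vec<MsixTableEntry>,
    enabled: bool, // global MSI-X enable bit
}

impl MsixConfig {
    fn update_routes(&self) { /* GSI routing ioctl() goes out here */ }
    fn inject_msi(&self, _index: usize) { /* deliver the message */ }

    fn write_table_entry(&mut self, index: usize, entry: MsixTableEntry) {
        // 1. Unchanged entry: nothing to do at all.
        if self.table[index] == entry {
            return;
        }
        let was_masked = self.table[index].masked();
        self.table[index] = entry;

        // 2. Only push GSI routing updates once the entry is unmasked,
        //    skipping repeated ioctl() calls while the guest programs
        //    a masked entry.
        if !entry.masked() {
            self.update_routes();
        }

        // 3. On unmasking, only generate a message if the global MSI-X
        //    enable bit is set.
        if was_masked && !entry.masked() && self.enabled {
            self.inject_msi(index);
        }
    }
}
```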
Fixes: #4273
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This way, all functions related to generating default values for
vm-config structs live in the same location.
Signed-off-by: Bo Chen <chen.bo@intel.com>