Following the new design proposal to improve the restore codepath when
migrating a VM, all virtio devices are supplied with an optional state
they can use to restore from (sketched after the device list below).
The restore() implementation that every device was providing has been
removed in order to avoid going through the restoration twice.
Here is the list of devices now following the new restore design:
- Block (virtio-block)
- Net (virtio-net)
- Rng (virtio-rng)
- Fs (vhost-user-fs)
- Blk (vhost-user-block)
- Net (vhost-user-net)
- Pmem (virtio-pmem)
- Vsock (virtio-vsock)
- Mem (virtio-mem)
- Balloon (virtio-balloon)
- Watchdog (virtio-watchdog)
- Vdpa (vDPA)
- Console (virtio-console)
- Iommu (virtio-iommu)
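A minimal sketch of the pattern, assuming a hypothetical BlockState
struct; the real per-device state types differ:

    // Hypothetical `BlockState`; the real per-device state structs differ.
    #[derive(Clone, Default)]
    pub struct BlockState {
        pub avail_features: u64,
        pub acked_features: u64,
    }

    pub struct Block {
        avail_features: u64,
        acked_features: u64,
    }

    impl Block {
        // `state` is `Some(..)` when restoring a migrated VM and `None`
        // when creating a fresh device, so restoration happens exactly
        // once, at construction time.
        pub fn new(state: Option<BlockState>) -> Self {
            let state = state.unwrap_or_default();
            Block {
                avail_features: state.avail_features,
                acked_features: state.acked_features,
            }
        }
    }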
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The add_device() function, from the device manager code, takes a
DeviceConfig as a parameter instead of a VmAddDevice.
The change was originally made as part of 34412c9b41, and it didn't
break Kata Containers because the VmAddDevice and DeviceConfig structs
share most of their fields, except for the `pci_segment` field, which
is optional for serialization and not used by the client yet.
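An illustrative sketch of the new signature; the struct fields shown
here are assumptions, not the exact Cloud Hypervisor definitions:

    pub struct DeviceConfig {
        pub path: String,
        pub id: Option<String>,
        // Optional for serialization and not used by the Kata
        // Containers client yet.
        pub pci_segment: u16,
    }

    pub struct DeviceManager;

    impl DeviceManager {
        // Previously this took a `VmAddDevice`.
        pub fn add_device(&mut self, cfg: &DeviceConfig) -> Result<(), String> {
            let _ = cfg;
            Ok(())
        }
    }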
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Considering error messages will mostly be nested, ensuring there is no
punctuation at the end will make the error log more readable.
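A small standalone illustration (not project code) of how trailing
punctuation degrades nested error messages:

    fn main() {
        let inner = "failed to open disk image";
        // With trailing periods, the nested chain reads as:
        //   "failed to create block device.: failed to open disk image."
        // Without them, the chain stays clean:
        println!("Error: failed to create block device: {inner}");
    }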
Signed-off-by: Bo Chen <chen.bo@intel.com>
With a known number of queues and queue events, we can make each of
them explicit and avoid vector/direct indexing, which is cleaner and
slightly more efficient.
Signed-off-by: Bo Chen <chen.bo@intel.com>
The number of queues and associated events is known and fixed. We can
define and use each of them explicitly and avoid using a vector (and
hence direct indexing), which is cleaner and slightly more efficient.
Signed-off-by: Bo Chen <chen.bo@intel.com>
The number of queues and associated events is known and fixed. We can
define and use each of them explicitly and avoid using a vector (and
hence direct indexing), which is cleaner and slightly more efficient.
Also, this refactoring makes it clearer that we are not handling "event
queue" events (as "_event_queue" is intentionally unused).
Signed-off-by: Bo Chen <chen.bo@intel.com>
In this way, the virtio-iommu code can properly report an error when
a wrong number of queues is provided, instead of triggering an
out-of-bounds error.
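A sketch of the up-front validation; the error type and the expected
queue count (request queue plus event queue) are illustrative
assumptions:

    const NUM_QUEUES: usize = 2;

    fn validate_queues<T>(queues: &[T]) -> Result<(), String> {
        if queues.len() != NUM_QUEUES {
            // Report a clear error instead of indexing out of bounds
            // later on.
            return Err(format!(
                "invalid number of queues: expected {NUM_QUEUES}, got {}",
                queues.len()
            ));
        }
        Ok(())
    }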
Signed-off-by: Bo Chen <chen.bo@intel.com>
This is preliminary work to ensure a migrated VM is created right before
it is restored. This will be useful when moving to a design where the VM
is both created and restored simultaneously from the Snapshot.
In detail, this means the MemoryManager is the object that must be
created upon receiving the config from the source VM, so that the memory
content can later be received and filled into the GuestMemory.
Only after these steps have happened is the snapshot received from the
source VM, and the actual Vm object can be created from both the
snapshot and the previously created MemoryManager.
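A high-level sketch of the destination-side ordering; all function
names here are illustrative placeholders, not the real API:

    struct MemoryManager;
    struct Snapshot;
    struct Vm;

    fn create_memory_manager() -> MemoryManager { MemoryManager }
    fn receive_memory_content(_mm: &MemoryManager) {}
    fn receive_snapshot() -> Snapshot { Snapshot }
    fn create_vm(_snapshot: Snapshot, _mm: MemoryManager) -> Vm { Vm }

    fn receive_migration() -> Vm {
        // 1. Create the MemoryManager as soon as the config arrives
        //    from the source VM, so the guest memory exists before its
        //    content does.
        let memory_manager = create_memory_manager();

        // 2. Receive the memory content and fill it into GuestMemory.
        receive_memory_content(&memory_manager);

        // 3. Only then receive the snapshot and build the Vm from both
        //    the snapshot and the previously created MemoryManager.
        let snapshot = receive_snapshot();
        create_vm(snapshot, memory_manager)
    }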
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The kernel will trigger a SIGBUS upon hugetlb page faults when no huge
pages are available. We have no way to ensure that enough huge pages
are available on the host system, nor a way to gracefully report the
lack of huge pages in advance from Cloud Hypervisor. For these reasons,
we have to avoid using huge pages in the virtio-mem fuzzer to prevent
SIGBUS errors.
Signed-off-by: Bo Chen <chen.bo@intel.com>
These look alarming if you are booting with a distro kernel, which is
now a recommended approach.
See: #4786
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The restore path of MemoryManager is handled specially without
implementing `Snapshottable::restore()`. Removing the explicit call to
it along the migration code path to avoid confusion.
See: #4783
Signed-off-by: Bo Chen <chen.bo@intel.com>
The systemd journal has a known issue of generating excessively large
logs[1], which makes it unreliable as a source for retrieving system
information, such as for counting reboot times. This is particularly
problematic on disk-constrained systems, like the VMs we launched for
our integration tests, where the disk size is normally 2GB. By default,
the systemd journal has a size limit of 10% of the size of the
underlying file system (e.g. around 200MB for the VMs of our integration
tests), which would remove archived journal files on demand.
A better alternative to count reboot times is based on information from
`wtmp` (e.g. the login records) which is much more concise and can be
accessed via the `last` command.
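An illustrative helper (not code from the test suite) counting reboots
from the wtmp records via `last`, as described above:

    use std::process::Command;

    fn count_reboots() -> std::io::Result<usize> {
        // `last reboot` prints one wtmp record per reboot, each line
        // starting with the word "reboot".
        let output = Command::new("last").arg("reboot").output()?;
        let stdout = String::from_utf8_lossy(&output.stdout);
        Ok(stdout.lines().filter(|l| l.starts_with("reboot")).count())
    }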
[1] https://github.com/systemd/systemd/issues/5285
Fixes: #4749
Signed-off-by: Bo Chen <chen.bo@intel.com>
Refresh our README in a consistent style and update it to reflect:
* A recommendation to use binaries
* Clarify our relationship with other Rust based VMMs/Rust-VMM project
* Ensure instructions result in a usable image (cloud-init)
* Simplify script instructions
* Move compilation details elsewhere
* Add Fedora 36 image details
* Point to CLOUDHV as well as Rust Hypervisor Firmware
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
In the create_for_size() function, when calculating the number of
clusters for the L1 table, the total number of entries in the L1 table
is incorrectly divided by the cluster size. It should instead be
divided by the number of entries/addresses that can fit into a cluster.
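A sketch of the corrected arithmetic; the names are illustrative. Each
L1 entry is an 8-byte address, so a cluster holds cluster_size / 8
entries, not cluster_size entries:

    fn l1_clusters(num_l1_entries: u64, cluster_size: u64) -> u64 {
        let entries_per_cluster = cluster_size / 8; // 8 bytes per entry
        // Wrong: (num_l1_entries + cluster_size - 1) / cluster_size
        (num_l1_entries + entries_per_cluster - 1) / entries_per_cluster
    }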
Signed-off-by: lv.mengzhao <lv.mengzhao@zte.com.cn>
Vdpa now implements the Migratable trait, which allows the device to be
added to the DeviceTree and therefore allows live migrating any vDPA
device that supports being suspended.
Given a vDPA device can't be resumed from a suspended state without
having to reset everything, we support neither pause/resume nor
snapshot/restore (which requires resume to be supported) for a vDPA
device.
In order for the migration to work locally, reusing the same device on
the same host machine, the vhost-vdpa handler is dropped after the
snapshot has been performed, which allows the destination VM to open the
device without any conflict about the device being busy.
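A conceptual sketch only; the real Migratable implementation differs in
detail, and VhostKernVdpaHandle is a stand-in name:

    struct VhostKernVdpaHandle;

    struct Vdpa {
        // Optional so the handle can be dropped once a snapshot is
        // taken.
        vhost: Option<VhostKernVdpaHandle>,
    }

    impl Vdpa {
        fn snapshot(&mut self) -> Result<(), String> {
            // ... serialize the device state ...

            // Drop the vhost-vdpa handle so a destination VM on the
            // same host can open the device without a "busy" conflict.
            self.vhost.take();
            Ok(())
        }
    }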
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In order to anticipate migration support, we need to be able to create
a Vdpa object without a VhostKernVdpa object associated with it.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Adding VHOST_VDPA_GET_CONFIG_SIZE and VHOST_VDPA_SUSPEND to the list of
authorized ioctls for the vmm thread.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
When restoring a VM, the BAR type can be found directly from the
snapshot resources. It is more reliable than the previous method, which
used self.use_64bit_bar from VirtioPciDevice, because at the time the
BARs are allocated, the VirtioDevice hasn't been restored yet, meaning
the way to determine the value of use_64bit_bar is wrong for a device
like vDPA. At that time, the device type is not known, and relying on
the stored resources is the only reliable way.
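An illustrative sketch, where the resource enum and its fields are
assumptions rather than the actual Cloud Hypervisor types: the BAR
width is read back from the snapshotted resources instead of consulting
use_64bit_bar.

    enum Resource {
        PciBar { index: usize, is_64bit: bool },
        Other,
    }

    fn bar_is_64bit(resources: &[Resource], bar_index: usize) -> Option<bool> {
        // Look the BAR up in the stored resources rather than trusting
        // device state that hasn't been restored yet.
        resources.iter().find_map(|r| match r {
            Resource::PciBar { index, is_64bit } if *index == bar_index => {
                Some(*is_64bit)
            }
            _ => None,
        })
    }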
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>