Implement the VIRTIO_BALLOON_F_REPORTING feature, indicating to the
guest it can report set of free pages. A new virtqueue dedicated for
receiving the information about the free pages is created. The VMM
releases the memory by punching holes with fallocate() if the guest
memory is backed by a file, and madvise() the host about the ranges of
memory that shouldn't be needed anymore.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Adding a new parameter free_page_reporting=on|off to the balloon device
so that we can enable the corresponding feature from virtio-balloon.
Running a VM with a balloon device where this feature is enabled allows
the guest to report pages that are free from guest's perspective. This
information is used by the VMM to release the corresponding pages on the
host.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Improving the existing code for better readability and in anticipation
for adding an additional virtqueue for the free page reporting feature.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Instead of relying on the virtio-queue crate to store the information
about the MSI-X vectors for each queue, we handle this directly from the
PCI transport layer.
This is the first step in getting closer to the upstream version of
virtio-queue so that we can eventually move fully to the upstream
version.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Relying on the vm-virtio/virtio-queue crate from rust-vmm which has been
copied inside the Cloud Hypervisor tree, the entire codebase is moved to
the new definition of a Queue and other related structures.
The reason for this move is to follow the upstream until we get some
agreement for the patches that we need on top of that to make it
properly work with Cloud Hypervisor.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Adding the snapshot/restore support along with migration as well,
allowing a VM with a virtio-balloon device attached to be properly
migrated.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Introduce a common solution for spawning the virtio threads which will
make it easier to add the panic handling.
During this effort I discovered that there were no seccomp filters
registered for the vhost-user-net thread nor the vhost-user-block
thread. This change also incorporates basic seccomp filters for those as
part of the refactoring.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
We are relying on applying empty 'seccomp' filters to support the
'--seccomp false' option, which will be treated as an error with the
updated 'seccompiler' crate. This patch fixes this issue by explicitly
checking whether the 'seccomp' filter is empty before applying the
filter.
Signed-off-by: Bo Chen <chen.bo@intel.com>
For vhost-user devices, memory should be shared between CLH and
vhost-user backend. However, madvise DONTNEED doesn't working in
this case. So, let's use fallocate PUNCH_HOLE to discard those
memory regions instead.
Signed-off-by: Li Hangjing <lihangjing@bytedance.com>
Sometimes we need balloon deflate automatically to give memory
back to guest, especially for some low priority guest processes
under memory pressure. Enable deflate_on_oom to support this.
Usage: --balloon "size=0,deflate_on_oom=on" \
Signed-off-by: Fei Li <lifei.shirley@bytedance.com>
As the first step to complete live-migration with tracking dirty-pages
written by the VMM, this commit patches the dependent vm-memory crate to
the upstream version with the dirty-page-tracking capability. Most
changes are due to the updated `GuestMemoryMmap`, `GuestRegionMmap`, and
`MmapRegion` structs which are taking an additional generic type
parameter to specify what 'bitmap backend' is used.
The above changes should be transparent to the rest of the code base,
e.g. all unit/integration tests should pass without additional changes.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Add a helper to VirtioCommon which returns duplicates of the EventFds
for kill and pause event.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Now all crates use edition = "2018" then the majority of the "extern
crate" statements can be removed. Only those for importing macros need
to remain.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
error: name `TYPE_UNKNOWN` contains a capitalized acronym
--> vm-virtio/src/lib.rs:48:5
|
48 | TYPE_UNKNOWN = 0xFF,
| ^^^^^^^^^^^^ help: consider making the acronym lowercase, except the initial letter: `Type_Unknown`
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#upper_case_acronyms
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Even though the driver can provide fewer queues than those advertised
for some device types their is a minimum number that is required for
operation.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Rather than having to give and return the ioeventfd used for a device
clone them each time. This will make it simpler when we start handling
the driver enabling fewer queues than advertised by the device.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
In order to make the thread naming more useful derive their name from
the device id (which can be supplied by the user) and a device specific
suffix that has details of the individual queue (or queue pair.)
e.g.
rob@artemis:~$ pstree -p -c -l -t `pidof cloud-hypervisor`
cloud-hyperviso(27501)─┬─{_console}(27525)
├─{_disk0_q0}(27529)
├─{_disk0_q1}(27532)
├─{_net1_ctrl}(27533)
├─{_net1_qp0}(27534)
├─{_net1_qp1}(27535)
├─{_rng}(27526)
├─{http-server}(27504)
├─{seccomp_signal_}(27502)
├─{signal_handler}(27523)
├─{vcpu0}(27520)
├─{vcpu1}(27522)
└─{vmm}(27503)
Fixes: #2077
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
error: field assignment outside of initializer for an instance created with Default::default()
--> virtio-devices/src/mem.rs:496:9
|
496 | resp.resp_type = resp_type;
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
note: consider initializing the variable with `mem::VirtioMemResp { resp_type: resp_type, ..Default::default() }` and removing relevant reassignments
--> virtio-devices/src/mem.rs:495:9
|
495 | let mut resp = VirtioMemResp::default();
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#field_reassign_with_default
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
When a total ordering between multiple atomic variables is not required
then use Ordering::Acquire with atomic loads and Ordering::Release with
atomic stores.
This will improve performance as this does not require a memory fence
on x86_64 which Ordering::SeqCst will use.
Add a comment to the code in the vCPU handling code where it operates on
multiple atomics to explain why Ordering::SeqCst is required.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The virtio-balloon change the memory size is asynchronous.
VirtioBalloonConfig.actual of balloon device show current balloon size.
This commit add memory_actual_size to vm.info to show memory actual size.
Signed-off-by: Hui Zhu <teawater@antfin.com>
In order to simplify the transition to VirtioCommon and to avoid needing
to set empty fields derive Default for VirtioCommon.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
As we never join the spawned virtio-devices worker threads, the error
returned from each worker thread is lost. For now, we simply print out
the error from each worker thread.
Fixes: #1551
Signed-off-by: Bo Chen <chen.bo@intel.com>
Using the Rust Barrier mechanism, this patch forces each virtio device
to acknowledge they've been correctly paused before going further.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Instead of passing only the event type through the handle_event()
callback, we make the trait slightly more generic by providing the
epoll event to each virtio device implementation.
This is particularly useful for vsock as it will need the event set.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Migrate to EpollHelper so as to remove code that is duplicated between
multiple virtio devices.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Remove the write_config() implementations that only generate a warning
as that is now done at the VirtioDevice level.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This commit store balloon size to MemoryConfig.
After reboot, virtio-balloon can use this size to inflate back to
the size before reboot.
Signed-off-by: Hui Zhu <teawater@antfin.com>
The implementation of the virtio-balloon was slightly wrong as it was
generating the GPA (Guest Physical Address) from the PFN (Page Frame
Number) which was a u32. That means the GPA was created as a u32, and
later a cast was done to extend it to a u64 type. Unfortunately, by
doing so, the GPA was wrong if the value was supposedly more than 32
bits.
That's why the PFN is casted into a u64 before the GPA is generated,
which creates the GPA on 64 bits directly.
Additionally, this patch simplifies the process_queue() function,
relying on multiple vm-memory helpers.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>