Commit Graph

116 Commits

Author SHA1 Message Date
William Douglas
48963e322a Enable pty console
Add the ability for cloud-hypervisor to create, manage and monitor a
pty for serial and/or console I/O from a user. The reasoning for
having cloud-hypervisor create the ptys is so that clients, libvirt
for example, could exit and later re-open the pty without causing I/O
issues. If the clients were responsible for creating the pty, when
they exit the main pty fd would close and cause cloud-hypervisor to
get I/O errors on writes.

Ideally the main and subordinate pty fds would be kept in the main
vmm's Vm structure. However, because the device manager owns parsing
the configuration for the serial and console devices, the information
is instead stored in new fields under the DeviceManager structure
directly.

From there hooking up the main fd is intended to look as close to
handling stdin and stdout on the tty as possible (there is some future
work ahead for perhaps moving support for the pty into the
vmm_sys_utils crate).

The main fd is used for reading user input and writing to output of
the Vm device. The subordinate fd is used to setup raw mode and it is
kept open in order to avoid I/O errors when clients open and close the
pty device.

The ability to handle multiple inputs as part of this change is
intentional. The current code allows serial and console ptys to be
created and both be used as input. There was an implementation gap
though with the queue_input_bytes needing to be modified so the pty
handlers for serial and console could access the methods on the serial
and console structures directly. Without this change only a single
input source could be processed as the console would switch based on
its input type (this is still valid for tty and isn't otherwise
modified).

Signed-off-by: William Douglas <william.r.douglas@gmail.com>
2021-02-09 10:03:28 +00:00
Rob Bradford
981bb72a09 vmm: api: Add "power-button" API entry point
This will lead to the triggering of an ACPI button inside the guest in
order to cleanly shutdown the guest.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-01-13 17:00:39 +00:00
Rob Bradford
e0d79196c8 virtio-devices, vmm: Enhance debugging around virtio device activation
Sometimes when running under the CI tests fail due to a barrier not
being released and the guest blocks on an MMIO write. Add further
debugging to try and identify the issue.

See: #2118

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-01-08 14:06:44 +00:00
Rob Bradford
03db48306b vmm: Activate virtio device from VMM thread
When a device is ready to be activated signal to the VMM thread via an
EventFd that there is a device to be activated. When the VMM receives a
notification on the EventFd that there is a device to be activated
notify the device manager to attempt to activate any devices that have
not been activated.

As a side effect the VMM thread will create the virtio device threads.

Fixes: #1863

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-12-17 11:23:53 +00:00
Rob Bradford
280d4fb245 vmm: Include device tree in vm.info API
The DeviceNode cannot be fully represented as it embeds a Rust style
enum (i.e. with data) which is instead represented by a simple
associative array.

Fixes: #1167

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-12-01 16:44:25 +01:00
Rob Bradford
b5b97f7b05 vmm: When receiving a migration store the config
The configuration is stored separately to the Vm in the VMM. The failure
to store the config was preventing the VM from shutting down correctly
as Vmm::vm_delete() checks for the presence of the config.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-11-25 01:27:26 +01:00
Rob Bradford
df6b52924f vmm: Unlink created socket after source connects
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-11-25 01:27:26 +01:00
Rob Bradford
3ac9b6c404 vmm: Implement live migration
Now the VM is paused/resumed by the migration process itself.

0. The guest configuration is sent to the destination
1. Dirty page log tracking is started by start_memory_dirty_log()
2. All guest memory is sent to the destination
3. Up to 5 attempts are made to send the dirty guest memory to the
   destination...
4. ...before the VM is paused
5. One last set of dirty pages is sent to the destination
6. The guest is snapshotted and sent to the destination
7. When the migration is completed the destination unpauses the received
   VM.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-11-17 16:57:11 +00:00
Rob Bradford
cf6763dfdb vmm: migration: Add missing response check
A read and check of the response was missing from when sending the
memory to the destination.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-11-17 16:57:11 +00:00
Rob Bradford
11a69450ba vm-migration, vmm: Send configuration in separate step
Prior to sending the memory the full state is not needed only the
configuration. This is sufficient to create the appropriate structures
in the guest and have the memory allocations ready for filling.

Update the protocol documentation to add a separate config step and move
the state to after the memory is transferred. As the VM is created in a
separate step to restoring it the requires a slightly different
constructor as well as saving the VM object for the subsequent commands.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-11-17 16:57:11 +00:00
Rob Bradford
ca60adda70 vmm: Add support for sending and receiving migration if VM is paused
This is tested by:

Source VMM:

target/debug/cloud-hypervisor --kernel ~/src/linux/vmlinux \
--pmem file=~/workloads/focal.raw --cpus boot=1 \
--memory size=2048M \
--cmdline"root=/dev/pmem0p1 console=ttyS0" --serial tty --console off \
--api-socket=/tmp/api1 -v

Destination VMM:

target/debug/cloud-hypervisor --api-socket=/tmp/api2 -v

And the following commands:

target/debug/ch-remote --api-socket=/tmp/api1 pause
target/debug/ch-remote --api-socket=/tmp/api2 receive-migration unix:/tmp/foo &
target/debug/ch-remote --api-socket=/tmp/api1 send-migration unix:/tmp/foo
target/debug/ch-remote --api-socket=/tmp/api2 resume

The VM is then responsive on the destination VMM.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-11-11 11:07:24 +01:00
Rob Bradford
dfe2dadb3e vmm: memory_manager: Make the snapshot source directory an Option
This allows the code to be reused when creating the VM from a snapshot
when doing VM migration.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-11-11 11:07:24 +01:00
Rob Bradford
7ac764518c vmm: api: Implement API support for migration
Add API entry points with stub implementation for sending and receiving
a VM from one VMM to another.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-11-11 11:07:24 +01:00
Rob Bradford
7b77f1ef90 vmm: Remove self-spawning functionality for vhost-user-{net,block}
This also removes the need to lookup up the "exe" symlink for finding
the VMM executable path.

Fixes: #1925

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-11-09 00:16:15 +01:00
Michael Zhao
093a581ee1 vmm: Implement VM rebooting on AArch64
The logic to handle AArch64 system event was: SHUTDOWN and RESET were
all treated as RESET.

Now we handle them differently:
- RESET event will trigger Vmm::vm_reboot(),
- SHUTDOWN event will trigger Vmm::vm_shutdown().

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2020-10-30 17:14:44 +00:00
Rob Bradford
dfd21cbfc5 vmm: Use thiserror/anyhow for vmm::Error
This gives a nicer user experience and this error can now be used as the
source for other errors based off this.

See: #1910

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-10-27 13:27:23 +00:00
Sebastien Boeuf
3594685279 vmm: Move balloon code from MemoryManager to DeviceManager
Now that we have a new dedicated way of asking for a balloon through the
CLI and the REST API, we can move all the balloon code to the device
manager. This allows us to simplify the memory manager, which is already
quite complex.

It also simplifies the behavior of the balloon resizing command. Instead
of providing the expected size for the RAM, which is complex when memory
zones are involved, it now expects the balloon size. This is a much more
straightforward behavior as it really resizes the balloon to the desired
size. Additionally to the simplication, the benefit of this approach is
that it does not need to be tied to the memory manager at all.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-10-22 16:33:16 +02:00
Hui Zhu
c75f8b2f89 virtio-balloon: Add memory_actual_size to vm.info to show memory actual size
The virtio-balloon change the memory size is asynchronous.
VirtioBalloonConfig.actual of balloon device show current balloon size.

This commit add memory_actual_size to vm.info to show memory actual size.

Signed-off-by: Hui Zhu <teawater@antfin.com>
2020-10-01 17:46:30 +02:00
Sebastien Boeuf
015c78411e vmm: Add a 'resize-zone' action to the API actions
Implement a new VM action called 'resize-zone' allowing the user to
resize one specific memory zone at a time. This relies on all the
preliminary work from the previous commits to resize each virtio-mem
device independently from each others.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-09-16 19:20:04 +02:00
Bo Chen
ff7ed8f628 vmm: Propagate the SeccompAction value to the Vm struct constructor
This patch propagates the SeccompAction value from main to the
Vm struct constructor (i.e. Vm::new_from_memory_manager), so that we can
use it to construct the DeviceManager and CpuManager struct for
controlling the behavior of the seccomp filters for vcpu/virtio-device
worker threads.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-08-04 11:40:49 +02:00
Bo Chen
b41884a406 main, vmm: seccomp: Use SeccompAction instead of SeccompLevel
This patch replaces the usage of 'SeccompLevel' with 'SeccompAction',
which is the first step to support the 'log' action over system
calls that are not on the allowed list of seccomp filters.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-08-04 11:40:49 +02:00
Hui Zhu
8ffbc3d031 vmm: api: ch-remote: Add balloon to VmResizeData
Signed-off-by: Hui Zhu <teawater@antfin.com>
2020-07-07 17:25:13 +01:00
Rob Bradford
b69f6d4f6c vhost_user_net, vhost_user_block, option_parser: Remove vmm dependency
Remove the vmm dependency from vhost_user_block and vhost_user_net where
it was existing to use config::OptionParser. By moving the OptionParser
to its own crate at the top-level we can remove the very heavy
dependency that these vhost-user backends had.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-07-06 18:33:29 +01:00
Rob Bradford
bca8a19244 vmm: Implement HTTP API for obtaining counters
The counters are a hash of device name to hash of counter name to u64
value. Currently the API is only implemented with a stub that returns an
empty set of counters.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-06-25 07:02:44 +02:00
Sebastien Boeuf
8038161861 vmm: Get and set clock during pause and resume operations
In order to maintain correct time when doing pause/resume and
snapshot/restore operations, this patch stores the clock value
on pause, and restore it on resume. Because snapshot/restore
expects a VM to be paused before the snapshot and paused after
the restore, this covers the migration use case too.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-06-23 14:36:01 +01:00
Muminul Islam
e4dee57e81 arch, pci, vmm: Initial switch to the hypervisor crate
Start moving the vmm, arch and pci crates to being hypervisor agnostic
by using the hypervisor trait and abstractions. This is not a complete
switch and there are still some remaining KVM dependencies.

Signed-off-by: Muminul Islam <muislam@microsoft.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-06-22 15:03:15 +02:00
Sebastien Boeuf
4fe7347fb9 vmm: Manually implement Serialize for PciDeviceInfo
In order to provide a more comprehensive b/d/f to the user, the
serialization of PciDeviceInfo is implemented manually to control the
formatting.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-06-12 13:37:18 +01:00
Sebastien Boeuf
83cd9969df vmm: Enable HTTP response for PCI device hotplug
This patch completes the series by connecting the dots between the HTTP
frontend and the device manager backend.

Any request to hotplug a VFIO, disk, fs, pmem, net, or vsock device will
now return a response including the device name and the place of the
device in the PCI topology.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-06-12 13:37:18 +01:00
Sebastien Boeuf
f08e9b6a73 vmm: device_manager: Return PciDeviceInfo from a hotplugged device
In order to provide the device name and PCI b/d/f associated with a
freshly hotplugged device, the hotplugging functions from the device
manager return a new structure called PciDeviceInfo.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-06-12 13:37:18 +01:00
Michael Zhao
97a1e5e1d2 vmm: Exit VMM event loop after guest shutdown for AArch64
X86 and AArch64 work in different ways to shutdown a VM.
X86 exit VMM event loop through ACPI device;
AArch64 need to exit from CPU loop of a SystemEvent.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2020-06-11 15:00:17 +01:00
Bo Chen
625bab69bd vmm: api: Allow to delete non-booted VMs
The action of "vm.delete" should not report errors on non-booted
VMs. This patch also revised the "docs/api.md" to reflect the right
'Prerequisites' of different API actions, e.g. on "vm.delete" and
"vm.boot".

Fixes: #1110

Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-06-09 05:58:32 +01:00
Bo Chen
7c3e19c65a vhost_user_backend, vmm: Close leaked file descriptors
Explicit call to 'close()' is required on file descriptors allocated
from 'epoll::create()', which is missing for the 'EpollContext' and
'VringWorker'. This patch enforces to close the file descriptors by
reusing the Drop trait of the 'File' struct.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-05-19 09:22:09 +02:00
Sebastien Boeuf
f6a71bec36 vmm: Add unit tests for DeviceTree
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-05-05 16:08:42 +02:00
Sebastien Boeuf
64e01684f9 vmm: Create new module device_tree
This module will be dedicated to DeviceNode and DeviceTree definitions
along with some dedicated unit tests.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-05-05 16:08:42 +02:00
Rob Bradford
8de7448d44 vmm: api: Add "add-vsock" API entry point
This allows the hotplugging of vsock devices.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-29 12:44:49 +01:00
Dean Sheather
c2abadc293 vmm: Add ability to add virtio-fs device post-boot
Adds DeviceManager method `make_virtio_fs_device` which creates a single
device, and modifies `make_virtio_fs_devices` to use this method.

Implements the new `vm.add-fs route`.

Signed-off-by: Dean Sheather <dean@coder.com>
2020-04-20 20:36:26 +02:00
Sebastien Boeuf
8d9d22436a vmm: Add "prefault" option when restoring
Now that the restore path uses RestoreConfig structure, we add a new
parameter called "prefault" to it. This will give the user the ability
to populate the pages corresponding to the mapped regions backed by the
snapshotted memory files.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-08 10:56:14 +02:00
Sebastien Boeuf
a517ca23a0 vmm: Move restore parameters into common RestoreConfig structure
The goal here is to move the restore parameters into a dedicated
structure that can be reused from the entire codebase, making the
addition or removal of a parameter easier.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-08 10:56:14 +02:00
Sebastien Boeuf
6eb721301c vmm: Enable restore feature
This connects the dots together, making the request from the user reach
the actual implementation for restoring the VM.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-07 12:26:10 +02:00
Sebastien Boeuf
53613319cc vmm: Enable snapshot feature
This connects the dots together, making the request from the user reach
the actual implementation for snapshotting the VM.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-07 12:26:10 +02:00
Samuel Ortiz
b55b83c6e8 vmm: vm: Implement the Transportable trait
This is only implementing the send() function in order to store all Vm
states into a file.

This needs to be extended for live migration, by adding more transport
methods, and also the recv() function must be implemented.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-04-07 12:26:10 +02:00
Samuel Ortiz
1ed357cf34 vmm: vm: Implement the Snapshottable trait
By aggregating snapshots from the CpuManager, the MemoryManager and the
DeviceManager, Vm implements the snapshot() function from the
Snapshottable trait.
And by restoring snapshots from the CpuManager, the MemoryManager and
the DeviceManager, Vm implements the restore() function from the
Snapshottable trait.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
2020-04-07 12:26:10 +02:00
Yi Sun
50b3f008d1 vmm: cpu: Implement the Snapshottable trait
Implement the Snapshottable trait for Vcpu, and then implements it for
CpuManager. Note that CpuManager goes through the Snapshottable
implementation of Vcpu for every vCPU in order to implement the
Snapshottable trait for itself.

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-04-07 12:26:10 +02:00
Samuel Ortiz
92c73c3b78 vmm: Add a VmRestore command
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-04-02 13:24:25 +01:00
Samuel Ortiz
cf8f8ce93a vmm: api: Add a Snapshot command
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
2020-04-02 13:24:25 +01:00
Samuel Ortiz
1b1a2175ca vm-migration: Define the Snapshottable and Transportable traits
A Snapshottable component can snapshot itself and
provide a MigrationSnapshot payload as a result.

A MigrationSnapshot payload is a map of component IDs to a list of
migration sections (MigrationSection). As component can be made of
several Migratable sub-components (e.g. the DeviceManager and its
device objects), a migration snapshot can be made of multiple snapshot
itself.
A snapshot is a list of migration sections, each section being a
component state snapshot. Having multiple sections allows for easier and
backward compatible migration payload extensions.

Once created, a migratable component snapshot may be transported and this
is what the Transportable trait defines, through 2 methods: send and recv.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
2020-04-02 13:24:25 +01:00
Rob Bradford
57c3fa4b1e vmm: Add "add-net" to the API
Add the HTTP and internal API entry points for adding a network device
at runtime.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-25 17:58:06 +01:00
Rob Bradford
f6f4c68fb4 vmm: Add "add-pmem" to the API
Add the HTTP and internal API entry points for adding persistent memory
at runtime.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-25 13:18:17 +01:00
Rob Bradford
4c9d15d44c vmm: Fix copy and paste error message
vm_remove_device was copied from vm_add_device but the error message
wasn't correctly updated.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-25 09:35:53 +00:00
Rob Bradford
f2151b2734 vmm: Add "add-disk" to the API
Add the HTTP and internal API entry points for adding disks at runtime.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-25 09:35:53 +00:00