Commit Graph

658 Commits

Author SHA1 Message Date
Muminul Islam
e1a07ce3c4 vmm: vm: Unpark the threads before shutdown when the current state is paused
If the current state is paused that means most of the handles got killed by pthread_kill
We need to unpark those threads to make the shutdown worked. Otherwise
The shutdown API hangs and the API is not responding afterwards. So
before the shutdown call we need to resume the VM make it succeed.

Fixes: #817

Signed-off-by: Muminul Islam <muislam@microsoft.com>
2020-04-27 09:09:12 +02:00
Rob Bradford
1df38daf74 vmm, tests: Make specifying a size optional for virtio-pmem
If a size is specified use it (in particular this is required if the
destination is a directory) otherwise seek in the file to get the size
of the file.

Add a new check that the size is a multiple of 2MiB otherwise the kernel
will reject it.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-24 18:30:05 +01:00
Rob Bradford
7481e4d959 vmm: config: Validate that shared memory is enabled if using vhost-user
Check that if any device using vhost-user (net & disk with
vhost_user=true) or virtio-fs is enabled then check shared memory is
also enabled.

Fixes: #848

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-24 16:01:49 +01:00
Bo Chen
2ac6971a8b vmm: MemoryManager: Cleanup the usage of std::ffi/io/result
Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-04-23 21:39:51 +02:00
Bo Chen
3f42f86d81 vmm: Add the 'shared' and 'hugepages' controls to MemoryConfig
The new 'shared' and 'hugepages' controls aim to replace the 'file'
option in MemoryConfig. This patch also updated all related integration
tests to use the new controls (instead of providing explicit paths to
"/dev/shm" or "/dev/hugepages").

Fixes: #1011

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-04-23 21:39:51 +02:00
Martin Xu
5a380a6918 vmm: memory_manager: Support non-power-of-2 block sizes
Replace alignment calculation of start address with functionally
equivalent version that does not assume that the block size is a power
of two.

Signed-off-by: Martin Xu <martin.xu@intel.com>
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-22 09:11:51 +02:00
Sebastien Boeuf
c22fd39170 vmm: Remove virtio device's userspace mapping on hot-unplug
When a virtio device is dynamically removed from the VM through the
hot-unplug mechanism, every mapping associated with it must be properly
removed.

Based on the previous patches letting a VirtioDevice expose the list of
userspace mappings associated with it, this patch can now remove all the
KVM userspace memory regions through the MemoryManager.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-21 10:02:21 +01:00
Sebastien Boeuf
0a97c25464 vmm: Extend MemoryManager to remove userspace mappings
The same way we added a helper for creating userspace memory mappings
from the MemoryManager, this patch adds a new helper to remove some
previously added mappings.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-21 10:02:21 +01:00
Sebastien Boeuf
fbcf3a7a7a vm-virtio: Implement userspace_mappings() for virtio-pmem
When hot-unplugging the virtio-pmem from the VM, we don't remove the
associated userspace mapping. This patch will let us fix this in a
following patch. For now, it simply adapts the code so that the Pmem
device knows about the mapping associated with it. By knowing about it,
it can expose it to the caller through the new userspace_mappings()
function.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-21 10:02:21 +01:00
Sebastien Boeuf
18f7789a81 vmm: Add hotplugged virtio devices to the DeviceManager list
The hotplugged virtio devices were not added to the list of virtio
devices from the DeviceManager. This patch fixes it, as it was causing
hotplugged virtio-fs devices from not supporting memory hotplug, since
they were never getting the update as they were not part of the list of
virtio devices held by the DeviceManager.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-20 20:36:26 +02:00
Dean Sheather
c2abadc293 vmm: Add ability to add virtio-fs device post-boot
Adds DeviceManager method `make_virtio_fs_device` which creates a single
device, and modifies `make_virtio_fs_devices` to use this method.

Implements the new `vm.add-fs route`.

Signed-off-by: Dean Sheather <dean@coder.com>
2020-04-20 20:36:26 +02:00
Dean Sheather
bb2139a408 vmm/api: Add vm.add-fs route
Currently unimplemented. Once implemented, this API will allow for
creating virtio-fs devices in the VM after it has booted.

Signed-off-by: Dean Sheather <dean@coder.com>
2020-04-20 20:36:26 +02:00
Sebastien Boeuf
d35e775ed9 vmm: Update KVM userspace mapping when PCI BAR remapping
In the context of the shared memory region used by virtio-fs in order to
support DAX feature, the shared region is exposed as a dedicated PCI
BAR, and it is backed by a KVM userspace mapping.

Upon BAR remapping, the BAR is moved to a different location in the
guest address space, and the KVM mapping must be updated accordingly.

Additionally, we need the VirtioDevice to report the updated guest
address through the shared memory region returned by get_shm_regions().
That's why a new setter is added to the VirtioDevice trait, so that
after the mapping has been updated for KVM, we can tell the VirtioDevice
the new guest address the shared region is located at.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-20 16:01:25 +02:00
Sebastien Boeuf
ac7178ef2a vmm: Keep migratable devices list as a Vec
The order the elements are pushed into the list is important to restore
them in the right order. This is particularly important for MmioDevice
(or VirtioPciDevice) and their VirtioDevice counterpart.

A device must be fully ready before its associated transport layer
management can trigger its restoration, which will end up activating the
device in most cases.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-17 19:29:41 +02:00
Rob Bradford
e7e0e8ac38 vmm, devices: Add firmware debug port device
OVMF and other standard firmwares use I/O port 0x402 as a simple debug
port by writing ASCII characters to it. This is gated under a feature
that is not enabled by default.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-17 12:54:00 +02:00
Rob Bradford
f9a0445c3d vmm: vm: Remove device from configuration after unplug
This ensures that a device that is removed will not reappear after a
reboot.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-16 17:03:25 +02:00
Rob Bradford
444e5c2a04 vmm: device_manager: Generalise NoAvailableVfioDeviceName
We now support assigning device ids for VFIO and virtio-pci devices so
this error can be generalised.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-16 17:03:25 +02:00
Rob Bradford
5bab9c3894 vmm: device_manager: Assign ids to pmem/net/disk devices if absent
If the id has not been provided by the user generate an incrementing id.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-16 17:03:25 +02:00
Rob Bradford
514491a051 vmm: device_manager: Support unplugging virtio-pci devices
Extend the eject_device() method on DeviceManager to also support
virtio-pci devices being unplugged.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-16 17:03:25 +02:00
Rob Bradford
476e4ce24f vmm: device_manager: Add virtio-pci devices into id to BDF map
In order to support hotplugging there is a map of human readable device
id to PCI BDF map.

As the device id is part of the specific device configuration (e.g.
NetConfig) it is necessary to return the id through from the helper
functions that create the devices through to the functions that add
those devices to the bus. This necessitates changing a great deal of
function prototypes but otherwise has little impact.

Currently only if an id is supplied by the user as part of the device
configuration is it populated into this map. A later commit will
populate with an autogenerated name where none is supplied by the user.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-16 17:03:25 +02:00
Rob Bradford
b38470df4b vmm: config: Add "id" parameter to {Net, Disk, Pmem}Config
This id will be used to unplug the device if the user has chosen an id.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-16 17:03:25 +02:00
Rob Bradford
1beb62ed2d vmm: vm: Don't panic on kernel load error
Rather than panic()ing when we get a kernel loading error populate the
error upwards.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-16 17:03:25 +02:00
Rob Bradford
72fdfff15d vmm: device_manager: Remove unused "_mmap_regions" member
Now that ownership of the memory regions used for the virtio-pmem and
vhost-user-fs devices have been moved into those devices it is no longer
necessary to track them inside DeviceManager.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-14 17:46:11 +01:00
Rob Bradford
70ecd6bab4 vmm, virtio: fs: Move freeing of mappped region into device
Move the release of the managed memory region from the DeviceManager to
the vhost-user-fs device. This ensures that the memory will be freed when
the device is unplugged which will lead to it being Drop()ed.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-14 17:46:11 +01:00
Rob Bradford
0c6706a510 vmm, virtio: pmem: Move freeing of mappped region into device
Move the release of the managed memory region from the DeviceManager to
the virtio-pmem device. This ensures that the memory will be freed when
the device is unplugged which will lead to it being Drop()ed.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-14 17:46:11 +01:00
Sebastien Boeuf
b1554642e4 vmm: seccomp: Add missing mremap() syscall
While testing self spawned vhost-user backends, it appeared that the
backend was aborting due to a missing system call in the seccomp
filters. mremap() was the culprit and this patch simply adds it to the
whitelist.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-14 14:11:41 +02:00
dependabot-preview[bot]
886c0f9093 build(deps): bump libc from 0.2.68 to 0.2.69
Bumps [libc](https://github.com/rust-lang/libc) from 0.2.68 to 0.2.69.
- [Release notes](https://github.com/rust-lang/libc/releases)
- [Commits](https://github.com/rust-lang/libc/compare/0.2.68...0.2.69)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-04-14 09:27:04 +01:00
Rob Bradford
28abfa9de5 vmm: openapi: Mark "initramfs" field nullable
This should make it a pointer in the Go generated code so that it will
be ommitted and thus not populated with an unhelpful default value.

Fixes: #1015

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-09 23:25:18 +02:00
Rob Bradford
c260640fd5 vmm: config: Use Default::default() value for initramfs field
This ensures that the field is filled with None when it is not specified
as part of the deserialisation step.

Fixes: #1015

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-09 17:28:45 +02:00
Alejandro Jimenez
7134f3129f vmm: Allow PVH boot with initramfs
We can now allow guests that specify an initramfs to boot
using the PVH boot protocol.

Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
2020-04-09 17:28:03 +02:00
Rob Bradford
2d3f518c72 vmm: config: Error if both socket and path are specified for a disk
This allows the validation of this requirement for both command line
booted VMs and those booted via the API.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-08 12:06:09 +01:00
Rob Bradford
eeb7e2529d vmm: config: Move max vCPUs > boot vCPUs check to validate()
This allows the validation of this requirement for both command line
booted VMs and those booted via the API.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-08 12:06:09 +01:00
Rob Bradford
12edb24678 vmm: config: Validate that serial/console file mode has a path
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-08 12:06:09 +01:00
Rob Bradford
aaf382eee2 vmm: Move kernel check to VmConfig::validate() method
Replace the existing VmConfig::valid() check with a call into
.validate() as part of earlier config setup or boot API checks.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-08 12:06:09 +01:00
Rob Bradford
3b0da2d895 vmm: vm: Validate configuration on API boot
When performing an API boot validate the configuration. For now only
some very basic validation is performed but in subsequent commits
the validation will be extended.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-08 12:06:09 +01:00
Rob Bradford
99b2ada4d0 vmm: Start splitting configuration parsing and validation
The configuration comes from a variety of places (commandline, REST API
and restore) however some validation was only happening on the command
line parsing path.  Therefore introduce a new ability to validate the
configuration before proceeding so that this can be used for commandline
and API boots.

For now move just the console and serial output mode validation under
the new validation API.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-08 12:06:09 +01:00
Sebastien Boeuf
0ea706faf5 vmm: openapi: Update OpenAPI definition with RestoreConfig
Making sure the OpenAPI definition is up to date with newly added
structure and parameters to support VM restoration.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-08 10:56:14 +02:00
Sebastien Boeuf
8d9d22436a vmm: Add "prefault" option when restoring
Now that the restore path uses RestoreConfig structure, we add a new
parameter called "prefault" to it. This will give the user the ability
to populate the pages corresponding to the mapped regions backed by the
snapshotted memory files.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-08 10:56:14 +02:00
Sebastien Boeuf
a517ca23a0 vmm: Move restore parameters into common RestoreConfig structure
The goal here is to move the restore parameters into a dedicated
structure that can be reused from the entire codebase, making the
addition or removal of a parameter easier.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-08 10:56:14 +02:00
Sebastien Boeuf
6712958f23 vmm: memory: Add prefault option when creating region
When CoW can be used, the VM restoration time is reduced, but the pages
are not populated. This can lead to some slowness from the guest when
accessing these pages.

Depending on the use case, we might prefer a slower boot time for better
performances from guest runtime. The way to achieve this is to prefault
the pages in this case, using the MAP_POPULATE flag along with CoW.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-08 10:56:14 +02:00
Sebastien Boeuf
b2cdee80b6 vmm: memory: Restore with Copy-on-Write when possible
This patch extends the previous behavior on the restore codepath.
Instead of copying the memory regions content from the snapshot files
into the new memory regions, the VMM will use the snapshot region files
as the backing files behind each mapped region. This is done in order to
reduce the time for the VM to be restored.

When the source VM has been initially started with a backing file, this
means it has been mapped with the MAP_SHARED flag. For this case, we
cannot use the CoW trick to speed up the VM restore path and we simply
fallback onto the copy of the memory regions content.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-08 10:56:14 +02:00
Sebastien Boeuf
d771223b2f vmm: memory: Extend new() to support external backing files
Whenever a MemoryManager is restored from a snapshot, the memory regions
associated with it might need to directly back the mapped memory for
increased performances. If that's the case, a list of external regions
is provided and the MemoryManager should simply ignore what's coming
from the MemoryConfig.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-08 10:56:14 +02:00
Sebastien Boeuf
ee5a041a0f vmm: memory: Add Copy-on-Write parameter when creating region
Now that we can choose specific mmap flags for the guest RAM, we create
a new parameter "copy_on_write" meaning that the memory mappings backed
by a file should be performed with MAP_PRIVATE instead of MAP_SHARED.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-08 10:56:14 +02:00
Sebastien Boeuf
be4e1e8712 vmm: memory: Use fine grained mmap wrapper
In order to anticipate the need for special mmap flags when memory
mapping the guest RAM, we need to switch from from_file() wrapper to
build() wrapper.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-08 10:56:14 +02:00
Sebastien Boeuf
b9f9f01fcc vmm: Extend seccomp filters to allow snapshot/restore
A few KVM ioctls were missing in order to perform both snapshot and
restore while keeping seccomp enabled.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-07 12:26:10 +02:00
Sebastien Boeuf
6eb721301c vmm: Enable restore feature
This connects the dots together, making the request from the user reach
the actual implementation for restoring the VM.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-07 12:26:10 +02:00
Sebastien Boeuf
53613319cc vmm: Enable snapshot feature
This connects the dots together, making the request from the user reach
the actual implementation for snapshotting the VM.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-07 12:26:10 +02:00
Samuel Ortiz
2cd0bc0a2c vmm: Create initial VM from its snapshot
The MemoryManager is somehow a special case, as its restore() function
was not implemented as part of the Snapshottable trait. Instead, and
because restoring memory regions rely both on vm.json and every memory
region snapshot file, the memory manager is restored at creation time.
This makes the restore path slightly different from CpuManager, Vcpu,
DeviceManager and Vm, but achieve the correct restoration of the
MemoryManager along with its memory regions filled with the correct
content.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-04-07 12:26:10 +02:00
Samuel Ortiz
b55b83c6e8 vmm: vm: Implement the Transportable trait
This is only implementing the send() function in order to store all Vm
states into a file.

This needs to be extended for live migration, by adding more transport
methods, and also the recv() function must be implemented.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-04-07 12:26:10 +02:00
Samuel Ortiz
1ed357cf34 vmm: vm: Implement the Snapshottable trait
By aggregating snapshots from the CpuManager, the MemoryManager and the
DeviceManager, Vm implements the snapshot() function from the
Snapshottable trait.
And by restoring snapshots from the CpuManager, the MemoryManager and
the DeviceManager, Vm implements the restore() function from the
Snapshottable trait.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
2020-04-07 12:26:10 +02:00