Commit Graph

224 Commits

Author SHA1 Message Date
Dean Sheather
c2abadc293 vmm: Add ability to add virtio-fs device post-boot
Adds DeviceManager method `make_virtio_fs_device` which creates a single
device, and modifies `make_virtio_fs_devices` to use this method.

Implements the new `vm.add-fs route`.

Signed-off-by: Dean Sheather <dean@coder.com>
2020-04-20 20:36:26 +02:00
Sebastien Boeuf
d35e775ed9 vmm: Update KVM userspace mapping when PCI BAR remapping
In the context of the shared memory region used by virtio-fs in order to
support DAX feature, the shared region is exposed as a dedicated PCI
BAR, and it is backed by a KVM userspace mapping.

Upon BAR remapping, the BAR is moved to a different location in the
guest address space, and the KVM mapping must be updated accordingly.

Additionally, we need the VirtioDevice to report the updated guest
address through the shared memory region returned by get_shm_regions().
That's why a new setter is added to the VirtioDevice trait, so that
after the mapping has been updated for KVM, we can tell the VirtioDevice
the new guest address the shared region is located at.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-20 16:01:25 +02:00
Sebastien Boeuf
ac7178ef2a vmm: Keep migratable devices list as a Vec
The order the elements are pushed into the list is important to restore
them in the right order. This is particularly important for MmioDevice
(or VirtioPciDevice) and their VirtioDevice counterpart.

A device must be fully ready before its associated transport layer
management can trigger its restoration, which will end up activating the
device in most cases.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-17 19:29:41 +02:00
Rob Bradford
e7e0e8ac38 vmm, devices: Add firmware debug port device
OVMF and other standard firmwares use I/O port 0x402 as a simple debug
port by writing ASCII characters to it. This is gated under a feature
that is not enabled by default.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-17 12:54:00 +02:00
Rob Bradford
444e5c2a04 vmm: device_manager: Generalise NoAvailableVfioDeviceName
We now support assigning device ids for VFIO and virtio-pci devices so
this error can be generalised.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-16 17:03:25 +02:00
Rob Bradford
5bab9c3894 vmm: device_manager: Assign ids to pmem/net/disk devices if absent
If the id has not been provided by the user generate an incrementing id.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-16 17:03:25 +02:00
Rob Bradford
514491a051 vmm: device_manager: Support unplugging virtio-pci devices
Extend the eject_device() method on DeviceManager to also support
virtio-pci devices being unplugged.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-16 17:03:25 +02:00
Rob Bradford
476e4ce24f vmm: device_manager: Add virtio-pci devices into id to BDF map
In order to support hotplugging there is a map of human readable device
id to PCI BDF map.

As the device id is part of the specific device configuration (e.g.
NetConfig) it is necessary to return the id through from the helper
functions that create the devices through to the functions that add
those devices to the bus. This necessitates changing a great deal of
function prototypes but otherwise has little impact.

Currently only if an id is supplied by the user as part of the device
configuration is it populated into this map. A later commit will
populate with an autogenerated name where none is supplied by the user.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-16 17:03:25 +02:00
Rob Bradford
72fdfff15d vmm: device_manager: Remove unused "_mmap_regions" member
Now that ownership of the memory regions used for the virtio-pmem and
vhost-user-fs devices have been moved into those devices it is no longer
necessary to track them inside DeviceManager.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-14 17:46:11 +01:00
Rob Bradford
70ecd6bab4 vmm, virtio: fs: Move freeing of mappped region into device
Move the release of the managed memory region from the DeviceManager to
the vhost-user-fs device. This ensures that the memory will be freed when
the device is unplugged which will lead to it being Drop()ed.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-14 17:46:11 +01:00
Rob Bradford
0c6706a510 vmm, virtio: pmem: Move freeing of mappped region into device
Move the release of the managed memory region from the DeviceManager to
the virtio-pmem device. This ensures that the memory will be freed when
the device is unplugged which will lead to it being Drop()ed.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-14 17:46:11 +01:00
Samuel Ortiz
1ed357cf34 vmm: vm: Implement the Snapshottable trait
By aggregating snapshots from the CpuManager, the MemoryManager and the
DeviceManager, Vm implements the snapshot() function from the
Snapshottable trait.
And by restoring snapshots from the CpuManager, the MemoryManager and
the DeviceManager, Vm implements the restore() function from the
Snapshottable trait.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
2020-04-07 12:26:10 +02:00
Samuel Ortiz
a0d5dbce6c vmm: device_manager: Implement the Snapshottable trait
Based on the list of Migratable devices stored by the DeviceManager, the
DeviceManager can implement the Snapshottable trait by aggregating all
devices snapshots together.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-04-07 12:26:10 +02:00
Yi Sun
93d3abfd6e vmm: device_manager: Make serial and ioapic devices migratable
Serial and Ioapic both implement the Migratable trait, hence the
DeviceManager can store them in the list of Migratable devices.

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
2020-04-07 12:26:10 +02:00
Samuel Ortiz
b584ec3fb3 vmm: memory_manager: Own the system allocator
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-04-03 18:05:18 +01:00
Samuel Ortiz
7a50646c02 vmm: device_manager: Convert migratable_devices to a map
We must be able to map a migratable component id to its device.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-04-03 18:05:18 +01:00
Samuel Ortiz
1b1a2175ca vm-migration: Define the Snapshottable and Transportable traits
A Snapshottable component can snapshot itself and
provide a MigrationSnapshot payload as a result.

A MigrationSnapshot payload is a map of component IDs to a list of
migration sections (MigrationSection). As component can be made of
several Migratable sub-components (e.g. the DeviceManager and its
device objects), a migration snapshot can be made of multiple snapshot
itself.
A snapshot is a list of migration sections, each section being a
component state snapshot. Having multiple sections allows for easier and
backward compatible migration payload extensions.

Once created, a migratable component snapshot may be transported and this
is what the Transportable trait defines, through 2 methods: send and recv.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
2020-04-02 13:24:25 +01:00
Sebastien Boeuf
9e18177654 vmm: Add memory hotplug support to VFIO PCI devices
Extend the update_memory() method from DeviceManager so that VFIO PCI
devices can update their DMA mappings to the physical IOMMU, after a
memory hotplug has been performed.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-27 09:35:39 +01:00
Sebastien Boeuf
cc67131ecc vmm: Retrieve new memory region when memory is extended
Whenever the memory is resized, it's important to retrieve the new
region to pass it down to the device manager, this way it can decide
what to do with it.

Also, there's no need to use a boolean as we can instead use an Option
to carry the information about the region. In case of virtio-mem, there
will be no region since the whole memory has been reserved up front by
the VMM at boot. This means only the ACPI hotplug will return a region
and is the only method that requires the memory to be updated from the
device manager.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-27 09:35:39 +01:00
Rob Bradford
8f323e61d8 vmm: Add support to DeviceManager for hotplugging network devices
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-25 17:58:06 +01:00
Rob Bradford
42a9896fe4 vmm: device_manager: Refactor make_virtio_net_devices
Split it into a method that creates a single device which is called by
the multiple device version so this can be used when dynamically adding
a device.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-25 17:58:06 +01:00
Hui Zhu
e63f98182a vmm: device: Add make_virtio_mem_devices
Add make_virtio_mem_devices to add virtio-mem to vmm.

Signed-off-by: Hui Zhu <teawater@antfin.com>
2020-03-25 15:54:16 +01:00
Rob Bradford
f7def621dd vmm: Add support to DeviceManager for hotplugging pmem devices
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-25 13:18:17 +01:00
Rob Bradford
8c3ea8cd76 vmm: device_manager: Refactor make_virtio_pmem_devices
Split it into a method that creates a single device which is called by
the multiple device version so this can be used when dynamically adding
a device.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-25 13:18:17 +01:00
Rob Bradford
b3082c1984 vmm: Add support to DeviceManager for hotplugging disks
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-25 09:35:53 +00:00
Rob Bradford
2be703ca92 vmm: device_manager: Refactor make_virtio_block_devices
Split it into a method that creates a single device which is called by
the multiple device version so this can be used when dynamically adding
a device.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-25 09:35:53 +00:00
Sebastien Boeuf
e54f8ec8a5 vmm: Update memory through DeviceManager
Whenever the VM memory is resized, DeviceManager needs to be notified
so that it can subsequently notify each virtio devices about it.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-24 19:01:15 +00:00
Rob Bradford
f7197e8415 vmm: Add a "discard_writes=" to --pmem
This opens the backing file read-only, makes the pages in the mmap()
read-only and also makes the KVM mapping read-only. The file is also
mapped with MAP_PRIVATE to make the changes local to this process only.

This is functional alternative to having support for making a
virtio-pmem device readonly. Unfortunately there is no concept of
readonly virtio-pmem (or any type of NVDIMM/PMEM) in the Linux kernel so
to be able to have a block device that is appears readonly in the guest
requires significant specification and kernel changes.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-20 14:46:34 +01:00
Rob Bradford
d11a67b0fe vmm: Use more generic MmapRegion constructor
Switch to MmapRegion::build() and fill in the fields appropriately.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-20 14:46:34 +01:00
Rob Bradford
7257e890ef vmm: Add "readonly" parameter MemoryManager::create_userspace_mapping
Use this boolean to turn on the KVM_MEM_READONLY flag to indicate that
this memory mapping should not be writable by the VM.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-20 14:46:34 +01:00
Qiu Wenbo
c503118d16 vmm: fix a corrupted stack caused by get_win_size
According to `asm-generic/termios.h`, the `struct winsize` should be:

struct winsize {
        unsigned short ws_row;
        unsigned short ws_col;
        unsigned short ws_xpixel;
        unsigned short ws_ypixel;
};

The ioctl of TIOCGWINSZ will trigger a segfault on aarch64.

Signed-off-by: Qiu Wenbo <qiuwenbo@phytium.com.cn>
2020-03-20 07:30:06 +01:00
Rob Bradford
87990f9e67 vmm: Add virtio-pci device to B/D/F hash table
This table currently contains only all the VFIO devices and it should
really contain all the PCI devices.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-18 19:05:58 +00:00
Rob Bradford
fb185fa839 vmm: Always return PCI B/D/F from add_virtio_pci_device
Previously this was only returned if the device had an IOMMU mapping and
whether the device should be added to the virtio-iommu. This was already
captured earlier as part of creating the device so use that information
instead.

Always returning the B/D/F is helpful as it facilitates virtio PCI
device hotplug.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-18 19:05:58 +00:00
Rob Bradford
7e599b4450 vmm: Make disk path optional
When using "--disk" with a vhost socket and not using self spawning then
it is not necessary or helpful to specify the path.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-13 11:41:52 +00:00
Sebastien Boeuf
8d785bbd5f pci: Fix the PciBus using HashMap instead of Vec
By using a Vec to hold the list of devices on the PciBus, there's a
problem when we use unplug. Indeed, the vector of devices gets reduced
and if the unplugged device was not the last one from the list, every
other device after this one is shifted on the bus.

To solve this problem, a HashMap is used. This allows to keep track of
the exact place where each device stands on the bus.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-13 10:54:34 +01:00
Sebastien Boeuf
efba48dddb vmm: Don't put a VFIO device behind the vIOMMU by default
With some of the factorization that happened to be able to support VFIO
hotplug, one mistake was made. In case a vIOMMU is created through a
virtio-iommu device, and no matter the "iommu" option value from the
VFIO device parameter, the VFIO device was always placed behind the
virtual IOMMU.

This commit fixes this wrong behavior by making sure the device
configuration is taken into account to decide if it should be attached
or not to the virtual IOMMU.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-11 19:50:31 +01:00
Sebastien Boeuf
34412c9b41 vmm: Add id option to VFIO hotplug
Add a new id option to the VFIO hotplug command so that it matches the
VFIO coldplug semantic.

This is done by refactoring the existing code for VFIO hotplug, where
VmAddDeviceData structure is replaced by DeviceConfig. This structure is
the one used whenever a VFIO device is coldplugged, which is why it
makes sense to reuse it for the hotplug codepath.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-11 19:50:31 +01:00
Sebastien Boeuf
9023444ad3 vmm: Add id field to --device through CLI
Add the ability to specify the "id" associated with a device, by adding
an extra option to the parameter --device.

This new option is not mandatory, and by default, the VMM will take care
of finding a unique identifier.

If the identifier provided by the user through this new option is not
unique, an error will be thrown and the VM won't be started.

Fixes #881

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-11 13:10:57 +00:00
Sebastien Boeuf
f4a956a60a vmm: Remove 32 bits MMIO range from correct address space
The 32 bits MMIO address space is handled separately from the 64 bits
one. For this reason, we need to invoke the appropriate freeing function
to remove a range from this address space.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-11 13:10:30 +00:00
Sebastien Boeuf
432eb5b70a vmm: Free PCI BARs when unplugging PCI device
Now that PciDevice trait has a dedicated function to remove the bars,
the DeviceManager can invoke this function whenever a PCI device is
unplugged from the VM.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-11 13:10:30 +00:00
Sebastien Boeuf
b50cbe5064 pci: Give PCI device ID back when removing a device
Upon removal of a PCI device, make sure we don't hold onto the device ID
as it could be reused for another device later.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-10 17:05:06 +00:00
Sebastien Boeuf
df71aaee3f pci: Make the device ID allocation smarter
In order to handle the case where devices are very often plugged and
unplugged from a VM, we need to handle the PCI device ID allocation
better.

Any PCI device could be removed, which means we cannot simply rely on
the vector size to give the next available PCI device ID.

That's why this patch stores in memory the information about the 32
slots availability. Based on this information, whenever a new slot is
needed, the code can correctly provide an available ID, or simply return
an error because all slots are taken.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-10 17:05:06 +00:00
Sebastien Boeuf
81173bf4ab vmm: Add id field to DeviceConfig structure
Add a new field to the DeviceConfig, allowing the VMM to allocate a name
to the VFIO devices.

By identifying a VFIO device with a unique name, we can make sure a user
can properly unplug it at any time.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-10 17:05:06 +00:00
Sebastien Boeuf
6cbdb9aa47 vmm: api: Introduce new "remove-device" HTTP endpoint
This commit introduces the new command "remove-device" that will let a
user hot-unplug a VFIO PCI device from an already running VM.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-10 17:05:06 +00:00
Sebastien Boeuf
991f3bb5da vmm: Remove VFIO device from everywhere it is referenced
This commit implements the eject function so that a VFIO device will be
removed from any bus it might sit on, and from any list it might be
stored in.

The idea is to reach a point where there is no reference of the device
anywhere in the code, so that the Drop implementation will be invoked
and so that the device will be fully removed from the VMM.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-10 17:05:06 +00:00
Sebastien Boeuf
6adebbc6a0 vmm: Detect when guest notifies about ejecting PCI device
When the guest OS is done removing a PCI device, it will invoke the _EJ0
method from ACPI, associated with the device. This will trigger a port
IO write to a region known by the VMM. Upon this writing, the VMM will
trap the VM exit and retrieve the written value.

Based on the value, the VMM will invoke its eject_device() method to
finalize the removal of the device.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-10 17:05:06 +00:00
Sebastien Boeuf
08604ac6a8 vmm: Store PCI devices as Any devices from DeviceManager
As we try to keep track of every PCI device related to the VM, we don't
want to have separate lists depending on the concrete type associated
with the PciDevice trait. Also, we want to be able to cast the actual
type into any trait or concrete type.

The most efficient way to solve all these issues is to store every
device as an Arc<dyn Any + Send + Sync>. This gives the ability to
downcast into the appropriate concrete type, and then to cast back into
any trait that we might need.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-10 17:05:06 +00:00
Sebastien Boeuf
0f99d3f7cc vmm: Store VFIO device's name and its PCI b/d/f
Add a new list storing the device names across the entire codebase. VFIO
devices are added to the list whenever a new one is created. By default,
each VFIO device is given a name "vfioX" where X is the first available
integer.

Along with this new list of names, another list is created, grouping PCI
device's name with its associated b/d/f. This will be useful to keep
track of the created devices so that we can implement unplug
functionality.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-10 17:05:06 +00:00
Sebastien Boeuf
09829c44b2 vmm: Remove IO bus strong reference from Vm
The Vm structure was used to store a strong reference to the IO bus.
This is not needed anymore since the AddressManager is logically the
one holding this strong reference. This has been made possible by the
introduction of Weak references on the Bus structure itself.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-04 18:46:44 +01:00
Sebastien Boeuf
2dbb376175 vmm: Remove all Weak references from DeviceManager
Now that the BusDevice devices are stored as Weak references by the
IO and MMIO buses, there's no need to use Weak references from the
DeviceManager anymore.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-04 18:46:44 +01:00