Commit Graph

200 Commits

Author SHA1 Message Date
Rob Bradford
1f6cbad01a vmm: Add support for spawning vhost-user-block backend
If no socket is supplied when enabling "vhost_user=true" on "--disk"
follow the "exe" path in the /proc entry for this process and launch the
network backend (via the vmm_path field.)

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-18 08:43:47 +00:00
Sebastien Boeuf
3edc2bd6ab vmm: Prevent memory overcommitment through virtio-fs shared regions
When a virtio-fs device is created with a dedicated shared region, by
default the region should be mapped as PROT_NONE so that no pages can be
faulted in.

It's only when the guest performs the mount of the virtiofs filesystem
that we can expect the VMM, on behalf of the backend, to perform some
new mappings in the reserved shared window, using PROT_READ and/or
PROT_WRITE.

Fixes #763

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-02-17 15:03:47 +01:00
Rob Bradford
bc75c1b4e1 vmm: Add support for spawning vhost-user-net backend
If no socket is supplied when enabling "vhost_user=true" on "--net"
follow the "exe" path in the /proc entry for this process and launch the
network backend (via the vmm_path field.)

Currently this only supports creating a new tap interface as the network
backend also only supports that.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-14 17:32:49 +00:00
Rob Bradford
b04eb4770b vmm: Follow the "exe" symlink from the PID directory in /proc
It is necessary to do this at the start of the VMM execution rather than
later as it must be done in the main thread in order to satisfy the
checks required by PTRACE_MODE_READ_FSCREDS (see proc(5) and
ptrace(2))

The alternative is to run as CAP_SYS_PTRACE but that has its
disadvantages.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-14 17:32:49 +00:00
Rob Bradford
7c9e8b103f vmm: device_manager: Shutdown all virtio devices
When the DeviceManager is dropped explicitly shutdown() all virtio
devices.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-14 17:32:49 +00:00
Samuel Ortiz
da2b3c92d3 vm-device: interrupt: Remove InterruptType dependencies and definitions
Having the InterruptManager trait depend on an InterruptType forces
implementations into supporting potentially very different kind of
interrupts from the same code base. What we're defining through the
current, interrupt type based create_group() method is a need for having
different interrupt managers for different kind of interrupts.

By associating the InterruptManager trait to an interrupt group
configuration type, we create a cleaner design to support that need as
we're basically saying that one interrupt manager should have the single
responsibility of supporting one kind of interrupt (defined through its
configuration).

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-02-04 19:32:45 +01:00
Samuel Ortiz
84fc807bc6 interrupt: Interrupt manager split
We create 2 different interrupt managers for separately handling
creation of legacy and MSI interrupt groups.
Doing so allows us to have a cleaner interrupt manager and IOAPIC
initialization path. It also prepares for an InterruptManager trait
design improvement where we remove the interrupt source type dependency
by associating an interrupt configuration type to the trait.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-02-04 19:32:45 +01:00
Rob Bradford
880a57c920 vmm: Remove VmInfo struct
After refactoring the VmInfo struct is no longer needed.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-03 12:28:30 +00:00
Rob Bradford
07bc292fa5 vmm: device_manager: Get VmFd from AddressManager
A reference to the VmFd is stored on the AddressManager so it is not
necessary to pass in the VmInfo into all methods that need it as it can
be obtained from the AddressManager.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-03 12:28:30 +00:00
Rob Bradford
6411c3ae42 vmm: device_manager: Use MemoryManager to get guest memory
The DeviceManager has a reference to the MemoryManager so use that to
get the GuestMemoryMmap rather than the version stored in the VmInfo
struct.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-03 12:28:30 +00:00
Rob Bradford
066fc6c0d1 vmm: device_manager: Get VM config from the struct member
Remove the use of vm_info in methods to get the config and instead use
the config stored on the DeviceManager itself.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-03 12:28:30 +00:00
Rob Bradford
77ae3de4f3 vmm: device_manager: Make legacy device addition a method
Remove some in/out parameters and instead rely on them as members of the
&mut self parameter.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-03 12:28:30 +00:00
Rob Bradford
599275b610 vmm: device_manager: Make ACPI device creation a method
Remove some in/out parameters and instead rely on them as members of the
&mut self parameter.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-03 12:28:30 +00:00
Rob Bradford
b8c1b2e174 vmm: device_manager: Make console creation a method
Remove some in/out parameters and instead rely on them as members of the
&mut self parameter.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-03 12:28:30 +00:00
Rob Bradford
b5440e2d0a vmm: device_manager: Make virtio device creation functions methods
Remove some in/out parameters and instead rely on them as members of the
&mut self parameter. This prepares the way to more easily store state on
the DeviceManager.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-03 12:28:30 +00:00
Rob Bradford
e90c6f3c44 vmm: device_manager: Make make_virtio_devices a method
Remove some in/out parameters and instead rely on them as members of the
&mut self parameter. A follow-up commit will change the callee functions
that create the devices themselves.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-03 12:28:30 +00:00
Rob Bradford
dbc09ad0ef vmm: device_manager: Make add_vfio_devices a method
Remove some in/out parameters and instead rely on them as members of the
&mut self parameter.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-03 12:28:30 +00:00
Rob Bradford
d9e1c2cd22 vmm: device_manager: Make add_virtio_pci_device a method
Remove some in/out parameters and instead rely on them as members of the
&mut self parameter.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-03 12:28:30 +00:00
Rob Bradford
aaa5e2e9ea vmm: device_manager: Make add_virtio_mmio_device a method
Remove some in/out parameters and instead rely on them as members of the
&mut self parameter.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-03 12:28:30 +00:00
Rob Bradford
2987476e0a vmm: device_manager: Make add_pci_devices and add_mmio_devices methods
Modify these functions to take an &mut self and become methods on
DeviceManager. This allows the removal of some in/out parameters and
leads the way to further refactoring and simplification.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-03 12:28:30 +00:00
Rob Bradford
3dbae423bb vmm: device_manager: Only add MemoryManager to I/O bus on ACPI builds
The MemoryManager should only be included on the I/O bus when doing ACPI
builds as that is the only time it will be interrogated.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-03 12:28:30 +00:00
Rob Bradford
68fa97eb0e vmm: device_manager: Always embed MemoryManager in the struct
Currently the MemoryManager is only used on the ACPI code paths after
the DeviceManager has been created. This will change in a future commit
as part of the refactoring so for now always include it but name it with
underscore prefix to indicate it might not always be used.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-03 12:28:30 +00:00
Sebastien Boeuf
ac01ceddbb vmm: Cleanup list of PCI IDs related to virtual IOMMU
Now that devices attached to the virtual IOMMU are described through
virtio configuration, there is no need for the DeviceManager to store
the list of IDs for all these devices. Instead, things are handled
locally when PCI devices are being added.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-30 10:37:40 +01:00
Sebastien Boeuf
097cff2d85 vmm: Use virtio topology for virtio-iommu
Instead of relying on the ACPI tables to describe the devices attached
to the virtual IOMMU, let's use the virtio topology, as the ACPI support
is getting deprecated.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-30 10:37:40 +01:00
Rob Bradford
aeeae661fc vmm: Support vhost-user-block via "--disks"
Add a socket and vhost_user parameter to this option so that the same
configuration option can be used for both virtio-block and
vhost-user-block.  For now it is necessary to specify both vhost_user
and socket parameters as auto activation is not yet implemented. The wce
parameter for supporting "Write Cache Enabling" is also added to the
disk configuration.

The original command line parameter is still supported for now and will
be removed in a future release.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-01-29 08:06:37 +00:00
Rob Bradford
a831aa214c vmm: Support vhost-user-net via "--net"
Add a socket and vhost_user parameter to this option so that the same
configuration option can be used for both virtio-net and vhost-user-net.
For now it is necessary to specify both vhost_user and socket parameters
as auto activation is not yet implemented. The original command line
parameter is still supported for now.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-01-28 12:39:26 +00:00
Sebastien Boeuf
f5b53ae4be vm-virtio: Implement multiqueue/multithread support for virtio-blk
This commit improves the existing virtio-blk implementation, allowing
for better I/O performance. The cost for the end user is to accept
allocating more vCPUs to the virtual machine, so that multiple I/O
threads can run in parallel.

One thing to notice, the amount of vCPUs must be egal or superior to the
amount of queues dedicated to the virtio-blk device.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-28 09:26:53 +01:00
Sebastien Boeuf
0fa1e2c241 vmm: Handle mapping from devices regions through vm-memory
Devices like virtio-pmem and virtio-fs require some dedicated memory
region to be mapped. The memory mapping from the DeviceManager is being
replaced by the usage of MmapRegion from the vm-memory crate.

The unmap will happen automatically when the MmapRegion will be dropped,
which should happen when the DeviceManager gets dropped.

Fixes #240

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-24 17:56:49 +01:00
Rob Bradford
a34893a402 Revert "vmm: Move MemoryManager from I/O ports to MMIO region"
This reverts commit 03108fb88b.
2020-01-24 12:08:31 +01:00
Rob Bradford
57ed006992 Revert "devices, vmm: Move GED device to MMIO region"
This reverts commit 5e3c62dc6a.
2020-01-24 12:08:31 +01:00
Rob Bradford
5e3c62dc6a devices, vmm: Move GED device to MMIO region
Move GED device reporting of required device type to scan into an MMIO
region rather than an I/O port.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-01-23 16:04:58 +00:00
Rob Bradford
03108fb88b vmm: Move MemoryManager from I/O ports to MMIO region
Rather than have the MemoryManager device sit on the I/O bus allocate
space for MMIO and add it to the MMIO bus.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-01-23 16:04:58 +00:00
Sebastien Boeuf
0042f1de75 ioapic: Rely fully on the InterruptSourceGroup to manage interrupts
This commit relies on the interrupt manager and the resulting interrupt
source group to abstract the knowledge about KVM and how interrupts are
updated and delivered.

This allows the entire "devices" crate to be freed from kvm_ioctls and
kvm_bindings dependencies.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-23 11:20:08 +00:00
Sebastien Boeuf
2dca959084 ioapic: Create the InterruptSourceGroup from InterruptManager
The interrupt manager is passed to the IOAPIC creation, and the IOAPIC
now creates an InterruptSourceGroup for MSI interrupts based on it.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-23 11:20:08 +00:00
Sebastien Boeuf
52800a871a vmm: Create an InterruptManager dedicated to IOAPIC
By introducing a new InterruptManager dedicated to the IOAPIC, we don't
have to solve the chicken and eggs problem about which of the
InterruptManager or the Ioapic should be created first. It's also
totally fine to have two interrupt manager instances as they both share
the same list of GSI routes and the same allocator.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-23 11:20:08 +00:00
Sergio Lopez
925c862f98 vmm: device_manager: Add 'direct' support for virtio-blk
vhost_user_blk already has it, so it's only fair to give it to
virtio-blk too. Extend DiskConfig with a 'direct' property, honor
it while opening the file backing the disk image, and pass it to
vm_virtio::RawFile.

Fixes #631

Signed-off-by: Sergio Lopez <slp@redhat.com>
2020-01-21 13:39:45 +00:00
Sergio Lopez
fb79e75afc vmm: device_manager: Add read-only support for virtio-blk
vhost_user_blk already has it, so it's only fair to give it to
virtio-blk too. Extend DiskConfig with a 'readonly' properly, and pass
it to vm_virtio::Block.

Signed-off-by: Sergio Lopez <slp@redhat.com>
2020-01-21 13:39:45 +00:00
Sebastien Boeuf
9ac06bf613 ci: Run clippy for each specific feature
The build is run against "--all-features", "pci,acpi", "pci" and "mmio"
separately. The clippy validation must be run against the same set of
features in order to validate the code is correct.

Because of these new checks, this commit includes multiple fixes
related to the errors generated when manually running the checks.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-21 11:44:40 +01:00
Sebastien Boeuf
99f39291fd pci: Simplify PciDevice trait
There's no need for assign_irq() or assign_msix() functions from the
PciDevice trait, as we can see it's never used anywhere in the codebase.
That's why it's better to remove these methods from the trait, and
slightly adapt the existing code.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-21 10:44:48 +01:00
Sebastien Boeuf
a20b383be8 vmm: Always use a reference for InterruptManager
Since the InterruptManager is never stored into any structure, it should
be passed as a reference instead of being cloned.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-21 10:44:48 +01:00
Sebastien Boeuf
bb8cd9eb24 vmm: Use LegacyUserspaceInterruptGroup for acpi device
This commit replaces the way legacy interrupts were handled with the
brand new implementation of the legacy InterruptSourceGroup for KVM.

Additionally, since it removes the last bit relying on the Interrupt
trait, the trait and its implementation can be removed from the
codebase.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-21 10:44:48 +01:00
Sebastien Boeuf
75e22ff34e vmm: Use LegacyUserspaceInterruptGroup for serial device
This commit replaces the way legacy interrupts were handled with the
brand new implementation of the legacy InterruptSourceGroup for KVM.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-21 10:44:48 +01:00
Sebastien Boeuf
8d7c4ea334 vmm: Use LegacyUserspaceInterruptGroup for mmio devices
This commit replaces the way legacy interrupts were handled with the
brand new implementation of the legacy InterruptSourceGroup for KVM.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-21 10:44:48 +01:00
Sebastien Boeuf
f70c9937fb vmm: Add ioapic to KvmInterruptManager
By having a reference to the IOAPIC, the KvmInterruptManager is going
to be able to initialize properly the legacy interrupt source group.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-21 10:44:48 +01:00
Sebastien Boeuf
e73cb1ff80 vmm: Initialize InterruptManager sooner
In order to let the InterruptManager be shared across both PCI and MMIO
devices, this commit moves the initialization earlier in the code.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-21 10:44:48 +01:00
Sebastien Boeuf
4bb12a2d8d interrupt: Reorganize all interrupt management with InterruptManager
Based on all the previous changes, we can at this point replace the
entire interrupt management with the implementation of InterruptManager
and InterruptSourceGroup traits.

By using KvmInterruptManager from the DeviceManager, we can provide both
VirtioPciDevice and VfioPciDevice a way to pick the kind of
InterruptSourceGroup they want to create. Because they choose the type
of interrupt to be MSI/MSI-X, they will be given a MsiInterruptGroup.

Both MsixConfig and MsiConfig are responsible for the update of the GSI
routes, which is why, by passing the MsiInterruptGroup to them, they can
still perform the GSI route management without knowing implementation
details. That's where the InterruptSourceGroup is powerful, as it
provides a generic way to manage interrupt, no matter the type of
interrupt and no matter which hypervisor might be in use.

Once the full replacement has been achieved, both SystemAllocator and
KVM specific dependencies can be removed.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-17 23:43:45 +01:00
Sebastien Boeuf
c396baca46 vm-virtio: Modify VirtioInterrupt callback into a trait
Callbacks are not the most Rust idiomatic way of programming. The right
way is to use a Trait to provide multiple implementation of the same
interface.

Additionally, a Trait will allow for multiple functions to be defined
while using callbacks means that a new callback must be introduced for
each new function we want to add.

For these two reasons, the current commit modifies the existing
VirtioInterrupt callback into a Trait of the same name.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-17 23:43:45 +01:00
Sebastien Boeuf
2381f32ae0 msix: Add gsi_msi_routes to MsixConfig
Because MsixConfig will be responsible for updating KVM GSI routes at
some point, it is necessary that it can access the list of routes
contained by gsi_msi_routes.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-17 23:43:45 +01:00
Sebastien Boeuf
9b60fcdc39 msix: Add VmFd to MsixConfig
Because MsixConfig will be responsible for updating the KVM GSI routes
at some point, it must have access to the VmFd to invoke the KVM ioctl
KVM_SET_GSI_ROUTING.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-17 23:43:45 +01:00
Sebastien Boeuf
86c760a0d9 msix: Add SystemAllocator to MsixConfig
The point here is to let MsixConfig take care of the GSI allocation,
which means the SystemAllocator must be passed from the vmm crate all
the way down to the pci crate.

Once this is done, the GSI allocation and irq_fd creation is performed
by MsixConfig directly.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-17 23:43:45 +01:00
Sebastien Boeuf
f5704d32b3 vmm: Move gsi_msi_routes creation to be shared across all PCI devices
Because we will need to share the same list of GSI routes across
multiple PCI devices (virtio-pci, VFIO), this commit moves the creation
of such list to a higher level location in the code.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-17 23:43:45 +01:00
Sergio Lopez
a14aee9213 qcow: Use RawFile as backend instead of File
Use RawFile as backend instead of File. This allows us to abstract
the access to the actual image with a specialized layer, so we have a
place where we can deal with the low-level peculiarities.

Signed-off-by: Sergio Lopez <slp@redhat.com>
2020-01-17 17:28:44 +00:00
Sergio Lopez
c5a656c9dc vm-virtio: block: Add support for alignment restrictions
Doing I/O on an image opened with O_DIRECT requires to adhere to
certain restrictions, requiring the following elements to be aligned:

 - Address of the source/destination memory buffer.
 - File offset.
 - Length of the data to be read/written.

The actual alignment value depends on various elements, and according
to open(2) "(...) there is currently no filesystem-independent
interface for an application to discover these restrictions (...)".

To discover such value, we iterate through a list of alignments
(currently, 512 and 4096) calling pread() with each one and checking
if the operation succeeded.

We also extend RawFile so it can be used as a backend for QcowFile,
so the later can be easily adapted to support O_DIRECT too.

Signed-off-by: Sergio Lopez <slp@redhat.com>
2020-01-17 17:28:44 +00:00
Cathy Zhang
652e7b9b8a vm-virtio: Implement multiple queue support for net devices
Update the common part in net_util.rs under vm-virtio to add mq
support, meanwhile enable mq for virtio-net device, vhost-user-net
device and vhost-user-net backend. Multiple threads will be created,
one thread will be responsible to handle one queue pair separately.
To gain the better performance, it requires to have the same amount
of vcpus as queue pair numbers defined for the net device, due to
the cpu affinity.

Multiple thread support is not added for vhost-user-net backend
currently, it will be added in future.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
2020-01-17 12:06:19 +01:00
Cathy Zhang
4ab88a8173 net_util: Add multiple queue support for tap
Add support to allow VMMs to open the same tap device many times, it will
create multiple file descriptors meanwhile.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
2020-01-17 12:06:19 +01:00
Cathy Zhang
1ae7deb393 vm-virtio: Implement refactor for net devices and backend
Since the common parts are put into net_util.rs under vm-virtio,
refactoring code for virtio-net device, vhost-user-net device
and backend to shrink the code size and improve readability
meanwhile.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
2020-01-17 12:06:19 +01:00
Rob Bradford
8b500d7873 deps: Bump vm-memory and linux-loader version
The function GuestMemory::end_addr() has been renamed to last_addr()

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-01-15 20:21:22 +01:00
Rob Bradford
7310ab6fa7 devices, vmm: Use a bit field for ACPI GED interrupt type
Use independent bits for storing whether there is a CPU or memory device
changed when reporting changes via ACPI GED interrupt. This prevents a
later notification squashing an earlier one and ensure that hotplugging
both CPU and memory at the same time succeeds.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-01-15 20:21:22 +01:00
Rob Bradford
4e414f0d84 vmm: device_manager: Scan memory devices upon GED interrupt
If there is a GED interrupt and the field indicates that the memory
device has changed triggers a scan of the memory devices.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-01-15 20:21:22 +01:00
Rob Bradford
8ecf736982 vmm: device_manager: Add the MemoryManager to the I/O bus
Now that the MemoryManager has I/O port functionality it needs to be
exposed on the I/O bus.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-01-15 20:21:22 +01:00
Rob Bradford
78dcb1862c vmm: device_manager: Store the type of notification in a local value
When the value is read from the I/O port via the ACPI AML functions to
determine what has been triggered the notifiction value is reset
preventing a second read from exposing the value. If we need support
multiple types of GED notification (such as memory hotplug) then we
should avoid reading the value multiple times.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-01-15 20:21:22 +01:00
Samuel Ortiz
5788d36583 vmm: Do not create virtio devices when missing a transport
If neither PCI or MMIO are built in, we should not bother creating any
virtio devices at all.
When building a minimal VMM made of a kernel with an initramfs and a
serial console, the RNG virtio device is still created even though there
is no way it can ever get probed.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-01-14 07:42:09 +01:00
Rob Bradford
b2589d4f3f vm-virtio, vmm, vfio: Store GuestMemoryMmap in an Arc<ArcSwap<T>>
This allows us to change the memory map that is being used by the
devices via an atomic swap (by replacing the map with another one). The
ArcSwap provides the mechanism for atomically swapping from to another
whilst still giving good read performace. It is inside an Arc so that we
can use a single ArcSwap for all users.

Not covered by this change is replacing the GuestMemoryMmap itself.

This change also removes some vertical whitespace from use blocks in the
files that this commit also changed. Vertical whitespace was being used
inconsistently and broke rustfmt's behaviour of ordering the imports as
it would only do it within the block.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-01-02 13:20:11 +00:00
Rob Bradford
a551398135 vmm: device_manager: Use MemoryManager to create KVM mapping
Use the newly exported funtionality to reduce the amount of duplicated
code.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-12-23 10:25:40 +00:00
Rob Bradford
7df88793a0 vmm: device_manager: Get device range from MemoryManager
This removes the duplication of these values.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-12-23 10:25:40 +00:00
Rob Bradford
61cfe3e72d vmm: Obtain sequential KVM memory slot numbers from MemoryManager
This removes the need to handle a mutable integer and also centralises
the allocation of these slot numbers.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-12-23 10:25:40 +00:00
Rob Bradford
260cebb8cf vmm: Introduce MemoryManager
The memory manager is responsible for setting up the guest memory and in
the long term will also handle addition of guest memory.

In this commit move code for creating the backing memory and populating
the allocator into the new implementation trying to make as minimal
changes to other code as possible.

Follow on commits will further reduce some of the duplicated code.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-12-23 10:25:40 +00:00
Rob Bradford
d5682cd306 vmm: device_manager: Rewrite if chain using match
To reflect updated clippy rules:

error: `if` chain can be rewritten with `match`
    --> vmm/src/device_manager.rs:1508:25
     |
1508 | /                         if ret > 0 {
1509 | |                             debug!("MSI message successfully delivered");
1510 | |                         } else if ret == 0 {
1511 | |                             warn!("failed to deliver MSI message, blocked by guest");
1512 | |                         }
     | |_________________________^
     |
     = note: `-D clippy::comparison-chain` implied by `-D warnings`
     = help: Consider rewriting the `if` chain to use `cmp` and `match`.
     = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#comparison_chain

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-12-20 00:52:03 +01:00
Rob Bradford
e25a47b32c vmm: device_manager: Remove redundant clones
Address updated clippy errors:

error: redundant clone
   --> vmm/src/device_manager.rs:699:32
    |
699 |             .insert(acpi_device.clone(), 0x3c0, 0x4)
    |                                ^^^^^^^^ help: remove this
    |
    = note: `-D clippy::redundant-clone` implied by `-D warnings`
note: this value is dropped without further use
   --> vmm/src/device_manager.rs:699:21
    |
699 |             .insert(acpi_device.clone(), 0x3c0, 0x4)
    |                     ^^^^^^^^^^^
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#redundant_clone

error: redundant clone
   --> vmm/src/device_manager.rs:737:26
    |
737 |             .insert(i8042.clone(), 0x61, 0x4)
    |                          ^^^^^^^^ help: remove this
    |
note: this value is dropped without further use
   --> vmm/src/device_manager.rs:737:21
    |
737 |             .insert(i8042.clone(), 0x61, 0x4)
    |                     ^^^^^
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#redundant_clone

error: redundant clone
   --> vmm/src/device_manager.rs:754:29
    |
754 |                 .insert(cmos.clone(), 0x70, 0x2)
    |                             ^^^^^^^^ help: remove this
    |
note: this value is dropped without further use
   --> vmm/src/device_manager.rs:754:25
    |
754 |                 .insert(cmos.clone(), 0x70, 0x2)
    |                         ^^^^
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#redundant_clone

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-12-20 00:52:03 +01:00
Rob Bradford
e8313e3e69 vmm: acpi: Refactor ACPI CPU notification
Continue to notify on all vCPUs but instead separate the notification
functionality into two methods, CSCN that walks through all the CPUs
and CTFY which notifies based on the numerical CPU id. This is an
interim step towards only notifying on changed CPUs and ultimately CPU
removal.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-12-16 23:57:14 +01:00
Samuel Ortiz
f0b7412495 vmm: device_manager: Add all virtio devices to the migratable list
We want to track all migratable devices through the DeviceManager.

Fixes: #341

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-12-12 08:50:36 +01:00
Samuel Ortiz
35dd1523c9 vmm: device_manager: Implement the Pausable trait
Since the Snapshotable placeholder and Migratable traits are provided as
well, the DeviceManager object and all its objects are now Migratable.

All Migratable devices are tracked as Arc<Mutex<dyn Migratable>>
references.

Keeping track of all migratable devices allows for implementing the
Migratable trait for the DeviceManager structure, making the whole
device model potentially migratable.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-12-12 08:50:36 +01:00
Samuel Ortiz
35d7721683 vmm: Convert virtio devices to Arc<Mutex<T>>
Migratable devices can be virtio or legacy devices.
In any case, they can potentially be tracked through one of the IO bus
as an Arc<Mutex<dyn BusDevice>>. In order for the DeviceManager to also
keep track of such devices as Migratable trait objects, they must be
shared as mutable atomic references, i.e. Arc<Mutex<T>>. That forces all
Migratable objects to be tracked as Arc<Mutex<dyn Migratable>>.

Virtio devices are typically migratable, and thus for them to be
referenced by the DeviceManager, they now should be built as
Arc<Mutex<VirtioDevice>>.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-12-12 08:50:36 +01:00
Sebastien Boeuf
64c5e3d8cb vmm: api: Adjust FsConfig for OpenAPI
The FsConfig structure has been recently adjusted so that the default
value matches between OpenAPI and CLI. Unfortunately, with the current
description, there is no way from the OpenAPI to describe a cache_size
value "None", so that DAX does not get enabled. Usually, using a Rust
"Option" works because the default value is None. But in this case, the
default value is Some(8G), which means we cannot describe a None.

This commit tackles the problem, introducing an explicit parameter
"dax", and leaving "cache_size" as a simple u64 integer.

This way, the default value is dax=true and cache_size=8G, but it lets
the opportunity to disable DAX entirely with dax=false, which will
simply ignore the cache_size value.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-12-11 15:50:24 +00:00
Sebastien Boeuf
5e0bbf9c3b vmm: Don't factorize vhost-user configurations
We want to set different default configurations for vhost-user-net and
vhost-user-blk, which is the reason why the common part corresponding to
the number of queues and the queue size cannot be embedded.

This prepares for the following commit, matching API and CLI behaviors.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-12-11 15:50:24 +00:00
Rob Bradford
ba59c62044 vmm, devices: Remove hardcoded IRQ number for GED device
Remove the previously hardcoded IRQ number used for the GED device.
Instead allocate the IRQ using the allocator and use that value in the
definition in the ACPI device.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-12-09 16:58:00 +00:00
Sebastien Boeuf
aa94e9b8f3 Revert "vmm: api: Modify FsConfig to be OpenAPI friendly"
This reverts commit defc5dcd9c.
2019-12-06 18:08:10 +00:00
Rob Bradford
9b1ba14f2d vmm: Delegate device related ACPI DSDT table work to DeviceManager
Move the code for handling the creation of the DSDT entries for devices
into the DeviceManager.

This will make it easier to handle device hotplug and also in the future
remove some hardcoded ACPI constants.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-12-06 17:44:00 +00:00
Sebastien Boeuf
defc5dcd9c vmm: api: Modify FsConfig to be OpenAPI friendly
When consumer of the HTTP API try to interact with cloud-hypervisor,
they have to provide the equivalent of the config structure related to
each component they need. Problem is, the Rust enum type "Option" cannot
be obtained from the OpenAPI YAML definition.

This patch intends to fix this inconsistency between what is possible
through the CLI and what's possible through the HTTP API by using simple
types bool and int64 instead of Option<u64>.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-12-06 06:38:48 -08:00
Rob Bradford
59d01712ad vmm: Remove kernel based IOAPIC handling from the device manager
Previously the device setup code assumed that if no IOAPIC was passed in
then the device should be added to the kernel irqchip. As an earlier
change meant that there was always a userspace IOAPIC this kernel based
code can be removed.

The accessor still returns an Option type to leave scope for
implementing a situation without an IOAPIC (no serial or GED device).
This change does not add support no-IOAPIC mode as the original code did
not either.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-12-06 12:34:06 +01:00
Rob Bradford
9b1cb9621f vmm: Remove pin based interrupt setup for virtio devices
With MSI now required remove pin based interrupt support from all the
virtio PCI device setup.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-12-06 12:34:06 +01:00
Rob Bradford
1722708612 vmm: Switch to storing VmConfig inside an Arc<Mutex<>>
This permits the runtime reconfiguration of the VM.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-12-05 16:39:19 +00:00
Sebastien Boeuf
08258d5dad vfio: pci: Allow multiple devices to be passed through
The KVM_SET_GSI_ROUTING ioctl is very simple, it overrides the previous
routes configuration with the new ones being applied. This means the
caller, in this case cloud-hypervisor, needs to maintain the list of all
interrupts which needs to be active at all times. This allows to
correctly support multiple devices to be passed through the VM and being
functional at the same time.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-12-04 08:48:17 +01:00
Rob Bradford
791ca3388f vmm: device_manager: Add ability to notify via GED device
Add ability to notify via the GED device that there is some new hotplug
activity. This will be used by the CpuManager (and later DeviceManager
itself) to notify of new hotplug activity.

Currently it has a hardcoded IRQ of 5 as the ACPI tables also need to
refer to this IRQ and the IRQ allocation does not permit the allocation
of specific IRQs.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-12-02 13:49:04 +00:00
Rob Bradford
7ad68d499a vmm: device_manager: Allocate I/O port for ACPI shutdown device
The refactoring in ce1765c8af dropped the
code to allocate the I/O port.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-12-02 13:49:04 +00:00
Samuel Ortiz
0f21781fbe cargo: Bump the kvm and vmm-sys-util crates
Since the kvm crates now depend on vmm-sys-util, the bump must be
atomic.
The kvm-bindings and ioctls 0.2.0 and 0.4.0 crates come with a few API
changes, one of them being the use of a kvm_ioctls specific error type.
Porting our code to that type makes for a fairly large diff stat.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-11-29 17:48:02 +00:00
Sebastien Boeuf
f979380620 vmm: Mark guest persistent memory pages as mergeable
In case the VM is started with the flag "--pmem mergeable=on", it means
the user expects the guest persistent memory pages to be marked as
mergeable. This commit relies on the madvise(MADV_MERGEABLE) system call
to inform the host kernel about these pages.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-11-22 15:28:10 +00:00
Rob Bradford
50c8335d3d vmm: device_manager: Expose the SystemAllocator
This allows other code to allocate I/O ports for use on the (already)
exposed IO bus.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-11-21 09:17:15 -08:00
Samuel Ortiz
f0e618431d vmm: device_manager: Use consistent naming when adding devices
When adding devices to the guest, and populating the device model, we
should prefix the routines with add_. When we're just creating the
device objects but not yet adding them we use make_.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-11-19 13:36:21 -08:00
Samuel Ortiz
a2ee681665 vmm: device_manager: Add an MMIO devices creation routine
In order to reduce the DeviceManager's new() complexity, we can move the
MMIO devices creation code into its own routine.

Fixes: #441

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-11-19 13:36:21 -08:00
Samuel Ortiz
79b8f8e477 vmm: device_manager: Add a PCI devices creation routine
In order to reduce the DeviceManager's new() complexity, we can move the
PCI devices creation code into its own routine.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-11-19 13:36:21 -08:00
Samuel Ortiz
5087f633f6 vmm: device_manager: Add an IOAPIC creation routine
In order to reduce the DeviceManager's new() complexity, we can move the
ACPI device creation code into its own routine.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-11-19 13:36:21 -08:00
Samuel Ortiz
ce1765c8af vmm: device_manager: Add an ACPI device creation routine
In order to reduce the DeviceManager's new() complexity, we can move the
ACPI device creation code into its own routine.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-11-19 13:36:21 -08:00
Samuel Ortiz
cfca2759fc vmm: device_manager: Add a legacy devices creation routine
In order to reduce the DeviceManager's new() complexity, we can move the
legacy devices creation code into its own routine.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-11-19 13:36:21 -08:00
Samuel Ortiz
4b469b98cf vmm: device_manager: Add a console creation routine
In order to reduce the DeviceManager's new() complexity, we can move the
console creation code into its own routine.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-11-19 13:36:21 -08:00
Rob Bradford
b3388c343d vmm: device_manager: Ensure I/O ports are allocated
Ensure that we tell the allocator about all the I/O ports that we are
using for I/O bus attached devices (serial, i8042, ACPI device.)

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-11-05 10:13:01 +00:00
Sebastien Boeuf
5694ac2b1e vm-virtio: Create new VirtioTransport trait to abstract ioeventfds
In order to group together some functions that can be shared across
virtio transport layers, this commit introduces a new trait called
VirtioTransport.

The first function of this trait being ioeventfds() as it is needed from
both virtio-mmio and virtio-pci devices, represented by MmioDevice and
VirtioPciDevice structures respectively.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-31 09:30:59 +01:00
Sebastien Boeuf
3fa5df4161 vmm: Unregister old ioeventfds when reprogramming PCI BAR
Now that kvm-ioctls has been updated, the function unregister_ioevent()
can be used to remove eventfd previously associated with some specific
PIO or MMIO guest address. Particularly, it is useful for the PCI BAR
reprogramming case, as we want to ensure the eventfd will only get
triggered by the new BAR address, and not the old one.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-31 09:30:59 +01:00
Sebastien Boeuf
587a420429 cargo: Update to the latest kvm-ioctls version
We need to rely on the latest kvm-ioctls version to benefit from the
recent addition of unregister_ioevent(), allowing us to detach a
previously registered eventfd to a PIO or MMIO guest address.

Because of this update, we had to modify the current constraint we had
on the vmm-sys-util crate, using ">= 0.1.1" instead of being strictly
tied to "0.2.0".

Once the dependency conflict resolved, this commit took care of fixing
build issues caused by recent modification of kvm-ioctls relying on
EventFd reference instead of RawFd.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-31 09:30:59 +01:00
Sebastien Boeuf
c7cabc88b4 vmm: Conditionally update ioeventfds for virtio PCI device
The specific part of PCI BAR reprogramming that happens for a virtio PCI
device is the update of the ioeventfds addresses KVM should listen to.
This should not be triggered for every BAR reprogramming associated with
the virtio device since a virtio PCI device might have multiple BARs.

The update of the ioeventfds addresses should only happen when the BAR
related to those addresses is being moved.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-31 09:30:59 +01:00