229 Commits

Author SHA1 Message Date
Sergio Lopez
d17fa784bc vm-virtio: Implement support for EVENT_IDX
VIRTIO_RING_F_EVENT_IDX is a virtio feature that allows to avoid
device <-> driver notifications under some circunstances, most
notably when actively polling the queue.

This commit implements support for in in the vm-virtio
crate. Consumers of this crate will also need to add support for it by
exposing the feature and calling using update_avail_event() and
get_used_event() accordingly.

Signed-off-by: Sergio Lopez <slp@redhat.com>
2020-02-19 17:13:47 +00:00
Sebastien Boeuf
793d4e7b8d vmm: Move codebase to GuestMemoryAtomic from vm-memory
Relying on the latest vm-memory version, including the freshly
introduced structure GuestMemoryAtomic, this patch replaces every
occurrence of Arc<ArcSwap<GuestMemoryMmap> with
GuestMemoryAtomic<GuestMemoryMmap>.

The point is to rely on the common RCU-like implementation from
vm-memory so that we don't have to do it from Cloud-Hypervisor.

Fixes #735

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-02-19 13:48:19 +00:00
Rob Bradford
4d60ef59bc vm-virtio: vhost_user: block: On shutdown() drop the socket
This causes the vhost-user-block backend to shutdown.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-18 08:43:47 +00:00
Rob Bradford
503887843f vm-virtio: vhost_user: net: On shutdown() drop the socket
This causes the vhost-user-net backend to shutdown.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-14 17:32:49 +00:00
Rob Bradford
545ea9ea33 vm-virtio: Add shutdown method to VirtioDevice trait
This allows the VMM to explicitly shutdown devices as part of the VM
shutdown ahead of what Drop::drop() would do.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-14 17:32:49 +00:00
Sebastien Boeuf
4dd16c2686 vm-virtio: Detect if a tap interface supports multiqueue
By detecting if an existing tap interface supports multiqueue, we now
have the information to determine if the command line parameters
regarding the number of queues is correct.

Fixes #738

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-02-12 18:05:42 +00:00
dependabot-preview[bot]
d46c61c5d4 build(deps): bump byteorder from 1.3.2 to 1.3.4
Bumps [byteorder](https://github.com/BurntSushi/byteorder) from 1.3.2 to 1.3.4.
- [Release notes](https://github.com/BurntSushi/byteorder/releases)
- [Changelog](https://github.com/BurntSushi/byteorder/blob/master/CHANGELOG.md)
- [Commits](https://github.com/BurntSushi/byteorder/compare/1.3.2...1.3.4)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-02-07 14:18:07 +00:00
Cathy Zhang
14eddf72b4 vm-virtio: Simplify virtio feature handling
Remove duplicated code across the different devices by handling
the virtio feature pages in VirtioDevice itself rather than
in the backends. This works as no virtio devices use feature
bits beyond 64-bits.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
2020-02-07 08:32:21 +00:00
Sebastien Boeuf
3447e226d9 dependencies: bump vm-memory from 4237db3 to f3d1c27
This commit updates Cloud-Hypervisor to rely on the latest version of
the vm-memory crate.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-02-06 11:40:45 +01:00
Samuel Ortiz
da2b3c92d3 vm-device: interrupt: Remove InterruptType dependencies and definitions
Having the InterruptManager trait depend on an InterruptType forces
implementations into supporting potentially very different kind of
interrupts from the same code base. What we're defining through the
current, interrupt type based create_group() method is a need for having
different interrupt managers for different kind of interrupts.

By associating the InterruptManager trait to an interrupt group
configuration type, we create a cleaner design to support that need as
we're basically saying that one interrupt manager should have the single
responsibility of supporting one kind of interrupt (defined through its
configuration).

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-02-04 19:32:45 +01:00
Sebastien Boeuf
56d7c04226 vm-virtio: vsock: Don't return error when epoll_wait is interrupted
The existing code taking care of the epoll loop was too restrictive as
it was considering all errors the same. But in case the error is EINTR,
this means the syscall has been interrupted while waiting, and it should
be resumed to wait again.

This patch enforces the parsing of the returned error and prevent the
code from assuming EINTR should be handled as all other errors.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-02-04 18:16:37 +01:00
Yang Zhong
1038a07dd6 vhost-user-blk: Device support multiple queues
The previous code only support one queue, and we need
to support MQ in vhost user block device. This patch
can work with SPDK with MQ setting.

Signed-off-by: Yang Zhong <yang.zhong@intel.com>
2020-02-03 09:49:27 +01:00
Sebastien Boeuf
bac0d1e689 iommu: Implement virtio topology configuration
Based on the new structures previously introduced, the new topology
feature is being fully implemented through this commit. This allows
the description of the devices attached to the virtual IOMMU, which
is why a new function attach_devices() has been introduced. It gives
the virtual IOMMU device the full list of devices which must be attached
to it, letting the device share this information through its virtio
configuration.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-30 10:37:40 +01:00
Sebastien Boeuf
0c73ff8129 iommu: Add topology structures
The virtio-iommu device defines a new virtio feature allowing the
topology to be discovered fully through virtio configuration.

By topology, it means describing the devices attached to the virtual
IOMMU. This is currently managed through ACPI with IORT and VIOT table,
but this is another way of describing it.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-30 10:37:40 +01:00
Sebastien Boeuf
db42caef42 vm-virtio: Handle special virtio-pci capability CAP_PCI_CFG
The virtio capability VIRTIO_PCI_CAP_PCI_CFG is exposed through the
device's PCI config space the same way other virtio-pci capabilities
are exposed.

The main and important difference is that this specific capability is
designed as a way for the guest to access virtio capabilities without
mapping the PCI BAR. This is very rarely used, but it can be useful when
it is too early for the guest to be able to map the BARs.

One thing to note, this special feature MUST be implemented, based on
the virtio specification.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-30 09:25:52 +01:00
Sebastien Boeuf
db9f9b7820 pci: Make self mutable when reading from PCI config space
In order to anticipate the need to support more features related to the
access of a device's PCI config space, this commits changes the self
reference in the function read_config_register() to be mutable.

This also brings some more flexibility for any implementation of the
PciDevice trait.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-30 09:25:52 +01:00
Sebastien Boeuf
e155e3690c vm-virtio: Simplify virtio-fs configuration
This commit introduces a clear definition of the virtio-fs
configuration structure, allowing vhost-user-fs device to
rely on it.

This makes the code more readable for developers.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-28 10:28:14 +00:00
Sebastien Boeuf
8e48fc445f vm-virtio: Simplify virtio-blk configuration
This commit reuses the clear definition of the virtio-blk
configuration structure, allowing both vhost-user-blk and
virtio-blk devices to rely on it.

This makes the code more readable for developers.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-28 10:28:14 +00:00
Sebastien Boeuf
8946a09afd vm-virtio: Simplify virtio-net configuration
This commit introduces a clear definition of the virtio-net
configuration structure, allowing both vhost-user-net and
virtio-net devices to rely on it.

This makes the code more readable for developers.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-28 10:28:14 +00:00
Sebastien Boeuf
f5b53ae4be vm-virtio: Implement multiqueue/multithread support for virtio-blk
This commit improves the existing virtio-blk implementation, allowing
for better I/O performance. The cost for the end user is to accept
allocating more vCPUs to the virtual machine, so that multiple I/O
threads can run in parallel.

One thing to notice, the amount of vCPUs must be egal or superior to the
amount of queues dedicated to the virtio-blk device.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-28 09:26:53 +01:00
Samuel Ortiz
c4b3ed7223 vm-virtio: Further factorization
The trait bound and non trait bound virtio devices can use the same
inner implementation.
Also, the virtio pausable trait definiton can also be factorized.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-01-28 07:51:13 +01:00
Samuel Ortiz
bce76271c5 vm-virtio: Define a separate macro alias for ctrl queue devices
Now that we have factorized the common virtio pausable implementation,
it's cleaner to have a dedicated macro for control queue devices rather
than overload the macro prototype.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-01-28 07:51:13 +01:00
Samuel Ortiz
2e2b1e4230 vm-virtio: Remove the multiqueue argument from the pausable macro
We only need the ctrl queue one.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-01-28 07:51:13 +01:00
Samuel Ortiz
2cb7ec04a4 vm-virtio: Pausable macro factorization improvements
By adding an internal layer of abstraction (the hidden VirtioPausable
trait), we can factorize the virtio common code.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-01-28 07:51:13 +01:00
Samuel Ortiz
c06a827cbb vm-virtio: Rename epoll_thread to epoll_threads
Now that we unified epoll_thread to potentially be a vector of threads,
it makes sense to make it a plural field.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-01-28 07:51:13 +01:00
Samuel Ortiz
f648f2856d vm-virtio: Make all virtio devices potentially multi-threaded
Although only the block and net virtio devices can actually be multi
threaded (for now), handling them as special cases makes the code more
complex.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-01-28 07:51:13 +01:00
Sebastien Boeuf
0a7bcc9a7d vm-virtio: Fix map_err losing the inner error
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-24 12:42:09 +01:00
Sebastien Boeuf
9ac06bf613 ci: Run clippy for each specific feature
The build is run against "--all-features", "pci,acpi", "pci" and "mmio"
separately. The clippy validation must be run against the same set of
features in order to validate the code is correct.

Because of these new checks, this commit includes multiple fixes
related to the errors generated when manually running the checks.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-21 11:44:40 +01:00
Sebastien Boeuf
99f39291fd pci: Simplify PciDevice trait
There's no need for assign_irq() or assign_msix() functions from the
PciDevice trait, as we can see it's never used anywhere in the codebase.
That's why it's better to remove these methods from the trait, and
slightly adapt the existing code.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-21 10:44:48 +01:00
Sebastien Boeuf
8d7c4ea334 vmm: Use LegacyUserspaceInterruptGroup for mmio devices
This commit replaces the way legacy interrupts were handled with the
brand new implementation of the legacy InterruptSourceGroup for KVM.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-21 10:44:48 +01:00
Sebastien Boeuf
8049666eff vm-virtio: Cleanup from kvm_iotcls and kvm_bindings dependencies
Now that KVM specific interrupts are handled through InterruptManager
trait implementation, the vm-virtio crate does not need to rely on
kvm_ioctls and kvm_bindings crates.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-17 23:43:45 +01:00
Sebastien Boeuf
4bb12a2d8d interrupt: Reorganize all interrupt management with InterruptManager
Based on all the previous changes, we can at this point replace the
entire interrupt management with the implementation of InterruptManager
and InterruptSourceGroup traits.

By using KvmInterruptManager from the DeviceManager, we can provide both
VirtioPciDevice and VfioPciDevice a way to pick the kind of
InterruptSourceGroup they want to create. Because they choose the type
of interrupt to be MSI/MSI-X, they will be given a MsiInterruptGroup.

Both MsixConfig and MsiConfig are responsible for the update of the GSI
routes, which is why, by passing the MsiInterruptGroup to them, they can
still perform the GSI route management without knowing implementation
details. That's where the InterruptSourceGroup is powerful, as it
provides a generic way to manage interrupt, no matter the type of
interrupt and no matter which hypervisor might be in use.

Once the full replacement has been achieved, both SystemAllocator and
KVM specific dependencies can be removed.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-17 23:43:45 +01:00
Sebastien Boeuf
be421dccea vm-virtio: Optimize vhost-user interrupt notification
Thanks to the recently introduced function notifier() in the
VirtioInterrupt trait, all vhost-user devices can now bypass
listening onto an intermediate event fd as they can provide the
actual fd responsible for triggering the interrupt directly to
the vhost-user backend.

In case the notifier does not provide the event fd, the code falls
back onto the creation of an intermediate event fd it needs to listen
to, so that it can trigger the interrupt on behalf of the backend.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-17 23:43:45 +01:00
Sebastien Boeuf
1f029dd2dc vm-virtio: Add notifier to VirtioInterrupt trait
The point is to be able to retrieve directly the event fd related to
the interrupt, as this might optimize the way VirtioDevice devices are
implemented.

For instance, this can be used by vhost-user devices to provide
vhost-user backends directly with the event fd triggering the
interrupt related to a virtqueue.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-17 23:43:45 +01:00
Sebastien Boeuf
c396baca46 vm-virtio: Modify VirtioInterrupt callback into a trait
Callbacks are not the most Rust idiomatic way of programming. The right
way is to use a Trait to provide multiple implementation of the same
interface.

Additionally, a Trait will allow for multiple functions to be defined
while using callbacks means that a new callback must be introduced for
each new function we want to add.

For these two reasons, the current commit modifies the existing
VirtioInterrupt callback into a Trait of the same name.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-17 23:43:45 +01:00
Sebastien Boeuf
19aeac40c9 msix: Remove the need for interrupt callback
Now that MsixConfig has access to the irq_fd descriptors associated with
each vector, it can directly write to it anytime it needs to trigger an
interrupt.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-17 23:43:45 +01:00
Sebastien Boeuf
2381f32ae0 msix: Add gsi_msi_routes to MsixConfig
Because MsixConfig will be responsible for updating KVM GSI routes at
some point, it is necessary that it can access the list of routes
contained by gsi_msi_routes.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-17 23:43:45 +01:00
Sebastien Boeuf
9b60fcdc39 msix: Add VmFd to MsixConfig
Because MsixConfig will be responsible for updating the KVM GSI routes
at some point, it must have access to the VmFd to invoke the KVM ioctl
KVM_SET_GSI_ROUTING.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-17 23:43:45 +01:00
Sebastien Boeuf
86c760a0d9 msix: Add SystemAllocator to MsixConfig
The point here is to let MsixConfig take care of the GSI allocation,
which means the SystemAllocator must be passed from the vmm crate all
the way down to the pci crate.

Once this is done, the GSI allocation and irq_fd creation is performed
by MsixConfig directly.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-17 23:43:45 +01:00
Sergio Lopez
c5a656c9dc vm-virtio: block: Add support for alignment restrictions
Doing I/O on an image opened with O_DIRECT requires to adhere to
certain restrictions, requiring the following elements to be aligned:

 - Address of the source/destination memory buffer.
 - File offset.
 - Length of the data to be read/written.

The actual alignment value depends on various elements, and according
to open(2) "(...) there is currently no filesystem-independent
interface for an application to discover these restrictions (...)".

To discover such value, we iterate through a list of alignments
(currently, 512 and 4096) calling pread() with each one and checking
if the operation succeeded.

We also extend RawFile so it can be used as a backend for QcowFile,
so the later can be easily adapted to support O_DIRECT too.

Signed-off-by: Sergio Lopez <slp@redhat.com>
2020-01-17 17:28:44 +00:00
Cathy Zhang
652e7b9b8a vm-virtio: Implement multiple queue support for net devices
Update the common part in net_util.rs under vm-virtio to add mq
support, meanwhile enable mq for virtio-net device, vhost-user-net
device and vhost-user-net backend. Multiple threads will be created,
one thread will be responsible to handle one queue pair separately.
To gain the better performance, it requires to have the same amount
of vcpus as queue pair numbers defined for the net device, due to
the cpu affinity.

Multiple thread support is not added for vhost-user-net backend
currently, it will be added in future.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
2020-01-17 12:06:19 +01:00
Cathy Zhang
4ab88a8173 net_util: Add multiple queue support for tap
Add support to allow VMMs to open the same tap device many times, it will
create multiple file descriptors meanwhile.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
2020-01-17 12:06:19 +01:00
Cathy Zhang
cf7e0cdf95 vm-virtio: Add multiple queue handling with control queue
Current guest kernel will check the oneline cpu count, in principle,
if the online cpu count is not smaller than the number of queue pairs
VMM reported, the net packets could be put/get to all the virtqueues,
otherwise, only the number of queue pairs that match the oneline cpu
count will have packets work with. guest kernel will send command
through control queue to tell VMMs the actual queue pair numbers which
it could currently play with. Add mq process in control queue handling
to get the queue pair number, VMM will verify if it is in a valid range,
nothing else but this.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
2020-01-17 12:06:19 +01:00
Cathy Zhang
709f7fe607 vm-virtio: Implement control queue support for net devices
While feature VIRTIO_NET_F_CTRL_VQ is negotiated, control queue
will exits besides the Tx/Rx virtqueues, an epoll handler should
be started to monitor and handle the control queue event.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
2020-01-17 12:06:19 +01:00
Cathy Zhang
d38787c578 vm-virtio: Add control queue support in net_util.rs
As virtio spec 1.1 said, the driver uses the control queue
to send commands to manipulate various features of the devices,
such as VIRTIO_NET_F_MQ which is required by multiple queue
support. Here add the control queue handling process.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
2020-01-17 12:06:19 +01:00
Cathy Zhang
1ae7deb393 vm-virtio: Implement refactor for net devices and backend
Since the common parts are put into net_util.rs under vm-virtio,
refactoring code for virtio-net device, vhost-user-net device
and backend to shrink the code size and improve readability
meanwhile.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
2020-01-17 12:06:19 +01:00
Cathy Zhang
6ae2597d19 vm-virtio: Create new module to abstract common parts for net devices
There are some common logic shared among virtio-net device, vhost-user-net
device and vhost-user-net backend, abstract those parts into net_util.rs
to improve code maintainability and readability.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
2020-01-17 12:06:19 +01:00
Cathy Zhang
3485e89080 vm-virtio: Stop delivering interrupt while NO_VECTOR
According to virtio spec, for used buffer notifications, if
MSI-X capability is enabled, and queue msix vector is
VIRTIO_MSI_NO_VECTOR 0xffff, the device must not deliver an
interrupt for that virtqueue.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
2020-01-17 12:06:19 +01:00
Samuel Ortiz
c2f6dfce88 vm-virtio: Fix VirtioDeviceType traits
The From and Display traits were not handling some of the enum
definitions. We no longer have a default case for Display so any future
misses will fail at build time.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-01-09 15:29:09 +01:00
Sebastien Boeuf
38468d3d9b vm-virtio: Improve virtio-console input processing
The way the code is currently implemented, only by writing to STDIN a
user can trigger some input to reach the VM through virtio-console. But
in case, there were not enough virtio descriptors to process what was
retrieved from STDIN, the remaining bits would be transferred only if
STDIN was triggered again. The missing part is that when some
descriptors are made available from the guest, the virtio-console device
should try to send any possible remaining bits.

By triggering the function process_input_queue() whenever the guest
notifies the host that some new descriptors are ready for the receive
queue, this patch allows to fill the implementation void that was left.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-08 15:37:02 +01:00