Commit Graph

302 Commits

Author SHA1 Message Date
Sebastien Boeuf
8e48fc445f vm-virtio: Simplify virtio-blk configuration
This commit reuses the clear definition of the virtio-blk
configuration structure, allowing both vhost-user-blk and
virtio-blk devices to rely on it.

This makes the code more readable for developers.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-28 10:28:14 +00:00
Sebastien Boeuf
8946a09afd vm-virtio: Simplify virtio-net configuration
This commit introduces a clear definition of the virtio-net
configuration structure, allowing both vhost-user-net and
virtio-net devices to rely on it.

This makes the code more readable for developers.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-28 10:28:14 +00:00
Sebastien Boeuf
f5b53ae4be vm-virtio: Implement multiqueue/multithread support for virtio-blk
This commit improves the existing virtio-blk implementation, allowing
for better I/O performance. The cost for the end user is to accept
allocating more vCPUs to the virtual machine, so that multiple I/O
threads can run in parallel.

One thing to notice, the amount of vCPUs must be egal or superior to the
amount of queues dedicated to the virtio-blk device.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-28 09:26:53 +01:00
Samuel Ortiz
c4b3ed7223 vm-virtio: Further factorization
The trait bound and non trait bound virtio devices can use the same
inner implementation.
Also, the virtio pausable trait definiton can also be factorized.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-01-28 07:51:13 +01:00
Samuel Ortiz
bce76271c5 vm-virtio: Define a separate macro alias for ctrl queue devices
Now that we have factorized the common virtio pausable implementation,
it's cleaner to have a dedicated macro for control queue devices rather
than overload the macro prototype.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-01-28 07:51:13 +01:00
Samuel Ortiz
2e2b1e4230 vm-virtio: Remove the multiqueue argument from the pausable macro
We only need the ctrl queue one.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-01-28 07:51:13 +01:00
Samuel Ortiz
2cb7ec04a4 vm-virtio: Pausable macro factorization improvements
By adding an internal layer of abstraction (the hidden VirtioPausable
trait), we can factorize the virtio common code.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-01-28 07:51:13 +01:00
Samuel Ortiz
c06a827cbb vm-virtio: Rename epoll_thread to epoll_threads
Now that we unified epoll_thread to potentially be a vector of threads,
it makes sense to make it a plural field.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-01-28 07:51:13 +01:00
Samuel Ortiz
f648f2856d vm-virtio: Make all virtio devices potentially multi-threaded
Although only the block and net virtio devices can actually be multi
threaded (for now), handling them as special cases makes the code more
complex.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-01-28 07:51:13 +01:00
Sebastien Boeuf
0a7bcc9a7d vm-virtio: Fix map_err losing the inner error
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-24 12:42:09 +01:00
Sebastien Boeuf
9ac06bf613 ci: Run clippy for each specific feature
The build is run against "--all-features", "pci,acpi", "pci" and "mmio"
separately. The clippy validation must be run against the same set of
features in order to validate the code is correct.

Because of these new checks, this commit includes multiple fixes
related to the errors generated when manually running the checks.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-21 11:44:40 +01:00
Sebastien Boeuf
99f39291fd pci: Simplify PciDevice trait
There's no need for assign_irq() or assign_msix() functions from the
PciDevice trait, as we can see it's never used anywhere in the codebase.
That's why it's better to remove these methods from the trait, and
slightly adapt the existing code.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-21 10:44:48 +01:00
Sebastien Boeuf
8d7c4ea334 vmm: Use LegacyUserspaceInterruptGroup for mmio devices
This commit replaces the way legacy interrupts were handled with the
brand new implementation of the legacy InterruptSourceGroup for KVM.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-21 10:44:48 +01:00
Sebastien Boeuf
4bb12a2d8d interrupt: Reorganize all interrupt management with InterruptManager
Based on all the previous changes, we can at this point replace the
entire interrupt management with the implementation of InterruptManager
and InterruptSourceGroup traits.

By using KvmInterruptManager from the DeviceManager, we can provide both
VirtioPciDevice and VfioPciDevice a way to pick the kind of
InterruptSourceGroup they want to create. Because they choose the type
of interrupt to be MSI/MSI-X, they will be given a MsiInterruptGroup.

Both MsixConfig and MsiConfig are responsible for the update of the GSI
routes, which is why, by passing the MsiInterruptGroup to them, they can
still perform the GSI route management without knowing implementation
details. That's where the InterruptSourceGroup is powerful, as it
provides a generic way to manage interrupt, no matter the type of
interrupt and no matter which hypervisor might be in use.

Once the full replacement has been achieved, both SystemAllocator and
KVM specific dependencies can be removed.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-17 23:43:45 +01:00
Sebastien Boeuf
be421dccea vm-virtio: Optimize vhost-user interrupt notification
Thanks to the recently introduced function notifier() in the
VirtioInterrupt trait, all vhost-user devices can now bypass
listening onto an intermediate event fd as they can provide the
actual fd responsible for triggering the interrupt directly to
the vhost-user backend.

In case the notifier does not provide the event fd, the code falls
back onto the creation of an intermediate event fd it needs to listen
to, so that it can trigger the interrupt on behalf of the backend.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-17 23:43:45 +01:00
Sebastien Boeuf
1f029dd2dc vm-virtio: Add notifier to VirtioInterrupt trait
The point is to be able to retrieve directly the event fd related to
the interrupt, as this might optimize the way VirtioDevice devices are
implemented.

For instance, this can be used by vhost-user devices to provide
vhost-user backends directly with the event fd triggering the
interrupt related to a virtqueue.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-17 23:43:45 +01:00
Sebastien Boeuf
c396baca46 vm-virtio: Modify VirtioInterrupt callback into a trait
Callbacks are not the most Rust idiomatic way of programming. The right
way is to use a Trait to provide multiple implementation of the same
interface.

Additionally, a Trait will allow for multiple functions to be defined
while using callbacks means that a new callback must be introduced for
each new function we want to add.

For these two reasons, the current commit modifies the existing
VirtioInterrupt callback into a Trait of the same name.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-17 23:43:45 +01:00
Sebastien Boeuf
19aeac40c9 msix: Remove the need for interrupt callback
Now that MsixConfig has access to the irq_fd descriptors associated with
each vector, it can directly write to it anytime it needs to trigger an
interrupt.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-17 23:43:45 +01:00
Sebastien Boeuf
2381f32ae0 msix: Add gsi_msi_routes to MsixConfig
Because MsixConfig will be responsible for updating KVM GSI routes at
some point, it is necessary that it can access the list of routes
contained by gsi_msi_routes.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-17 23:43:45 +01:00
Sebastien Boeuf
9b60fcdc39 msix: Add VmFd to MsixConfig
Because MsixConfig will be responsible for updating the KVM GSI routes
at some point, it must have access to the VmFd to invoke the KVM ioctl
KVM_SET_GSI_ROUTING.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-17 23:43:45 +01:00
Sebastien Boeuf
86c760a0d9 msix: Add SystemAllocator to MsixConfig
The point here is to let MsixConfig take care of the GSI allocation,
which means the SystemAllocator must be passed from the vmm crate all
the way down to the pci crate.

Once this is done, the GSI allocation and irq_fd creation is performed
by MsixConfig directly.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-17 23:43:45 +01:00
Sergio Lopez
c5a656c9dc vm-virtio: block: Add support for alignment restrictions
Doing I/O on an image opened with O_DIRECT requires to adhere to
certain restrictions, requiring the following elements to be aligned:

 - Address of the source/destination memory buffer.
 - File offset.
 - Length of the data to be read/written.

The actual alignment value depends on various elements, and according
to open(2) "(...) there is currently no filesystem-independent
interface for an application to discover these restrictions (...)".

To discover such value, we iterate through a list of alignments
(currently, 512 and 4096) calling pread() with each one and checking
if the operation succeeded.

We also extend RawFile so it can be used as a backend for QcowFile,
so the later can be easily adapted to support O_DIRECT too.

Signed-off-by: Sergio Lopez <slp@redhat.com>
2020-01-17 17:28:44 +00:00
Cathy Zhang
652e7b9b8a vm-virtio: Implement multiple queue support for net devices
Update the common part in net_util.rs under vm-virtio to add mq
support, meanwhile enable mq for virtio-net device, vhost-user-net
device and vhost-user-net backend. Multiple threads will be created,
one thread will be responsible to handle one queue pair separately.
To gain the better performance, it requires to have the same amount
of vcpus as queue pair numbers defined for the net device, due to
the cpu affinity.

Multiple thread support is not added for vhost-user-net backend
currently, it will be added in future.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
2020-01-17 12:06:19 +01:00
Cathy Zhang
4ab88a8173 net_util: Add multiple queue support for tap
Add support to allow VMMs to open the same tap device many times, it will
create multiple file descriptors meanwhile.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
2020-01-17 12:06:19 +01:00
Cathy Zhang
cf7e0cdf95 vm-virtio: Add multiple queue handling with control queue
Current guest kernel will check the oneline cpu count, in principle,
if the online cpu count is not smaller than the number of queue pairs
VMM reported, the net packets could be put/get to all the virtqueues,
otherwise, only the number of queue pairs that match the oneline cpu
count will have packets work with. guest kernel will send command
through control queue to tell VMMs the actual queue pair numbers which
it could currently play with. Add mq process in control queue handling
to get the queue pair number, VMM will verify if it is in a valid range,
nothing else but this.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
2020-01-17 12:06:19 +01:00
Cathy Zhang
709f7fe607 vm-virtio: Implement control queue support for net devices
While feature VIRTIO_NET_F_CTRL_VQ is negotiated, control queue
will exits besides the Tx/Rx virtqueues, an epoll handler should
be started to monitor and handle the control queue event.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
2020-01-17 12:06:19 +01:00
Cathy Zhang
d38787c578 vm-virtio: Add control queue support in net_util.rs
As virtio spec 1.1 said, the driver uses the control queue
to send commands to manipulate various features of the devices,
such as VIRTIO_NET_F_MQ which is required by multiple queue
support. Here add the control queue handling process.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
2020-01-17 12:06:19 +01:00
Cathy Zhang
1ae7deb393 vm-virtio: Implement refactor for net devices and backend
Since the common parts are put into net_util.rs under vm-virtio,
refactoring code for virtio-net device, vhost-user-net device
and backend to shrink the code size and improve readability
meanwhile.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
2020-01-17 12:06:19 +01:00
Cathy Zhang
6ae2597d19 vm-virtio: Create new module to abstract common parts for net devices
There are some common logic shared among virtio-net device, vhost-user-net
device and vhost-user-net backend, abstract those parts into net_util.rs
to improve code maintainability and readability.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
2020-01-17 12:06:19 +01:00
Cathy Zhang
3485e89080 vm-virtio: Stop delivering interrupt while NO_VECTOR
According to virtio spec, for used buffer notifications, if
MSI-X capability is enabled, and queue msix vector is
VIRTIO_MSI_NO_VECTOR 0xffff, the device must not deliver an
interrupt for that virtqueue.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
2020-01-17 12:06:19 +01:00
Samuel Ortiz
c2f6dfce88 vm-virtio: Fix VirtioDeviceType traits
The From and Display traits were not handling some of the enum
definitions. We no longer have a default case for Display so any future
misses will fail at build time.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-01-09 15:29:09 +01:00
Sebastien Boeuf
38468d3d9b vm-virtio: Improve virtio-console input processing
The way the code is currently implemented, only by writing to STDIN a
user can trigger some input to reach the VM through virtio-console. But
in case, there were not enough virtio descriptors to process what was
retrieved from STDIN, the remaining bits would be transferred only if
STDIN was triggered again. The missing part is that when some
descriptors are made available from the guest, the virtio-console device
should try to send any possible remaining bits.

By triggering the function process_input_queue() whenever the guest
notifies the host that some new descriptors are ready for the receive
queue, this patch allows to fill the implementation void that was left.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-08 15:37:02 +01:00
Sebastien Boeuf
e4c3401a33 vm-virtio: Don't waste a descriptor if not filled
In case the virtio descriptor is pulled out of the Queue iterator, it
is important to fill it and tag it as used. This is already done from
the successful code path, but in case there's an error during the
filling, we should make sure to put the descriptor back in the list of
available descriptors. This way, when the error occurs, we don't loose
a descriptor, and it could be used later.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-08 15:37:02 +01:00
Sebastien Boeuf
7a3e6caca4 vm-virtio: Simplify virtio-console input processing
The existing code was a bit too complex and it was introducing a bug
when trying to paste long lines directly to the console. By simplifying
the code, and by doing proper usage of the drain() function, the bug is
fixed by this commit.

Here is the similar output one could have gotten from time to time, when
pasting important amounts of bytes:

ERROR:vm-virtio/src/console.rs:104 -- Failed to write slice:
InvalidGuestAddress(GuestAddress(1040617472))

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-08 15:37:02 +01:00
Sebastien Boeuf
84445aae93 vm-virtio: Implement multi-mapping for virtio-fs
The virtio-fs messages coming from the slave can contain multiple
mappings (up to 8) through one single request. By implementing such
feature, the virtio-fs implementation of cloud-hypervisor is optimal and
fully functional as it resolves a bug that was seen when running fio
testing without this patch.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-08 09:27:07 +01:00
Sebastien Boeuf
e1822cfdad vm-virtio: Implement VIRTIO_IOMMU_F_PROBE feature
By implementing this virtio feature, we let the virtio-iommu driver call
the device backend so that it can probe each device that gets attached.

Through this probing, the device provides a range of reserved memory
related to MSI. This is mandatory for x86 architecture as we want to
avoid the default MSI range assigned by the virtio-iommu driver if no
range is provided at all. The default range is 0x8000000-0x80FFFFF but
it only makes sense for ARM architectures.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-08 09:27:07 +01:00
Rob Bradford
32a39f9b95 vm-virtio: Fix broken write_base_regs() unit test
The following commit broke this unit test:

"""
vmm: Convert virtio devices to Arc<Mutex<T>>

Migratable devices can be virtio or legacy devices.
In any case, they can potentially be tracked through one of the IO bus
as an Arc<Mutex<dyn BusDevice>>. In order for the DeviceManager to also
keep track of such devices as Migratable trait objects, they must be
shared as mutable atomic references, i.e. Arc<Mutex<T>>. That forces all
Migratable objects to be tracked as Arc<Mutex<dyn Migratable>>.

Virtio devices are typically migratable, and thus for them to be
referenced by the DeviceManager, they now should be built as
Arc<Mutex<VirtioDevice>>.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
"""

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-01-03 10:20:53 +01:00
Rob Bradford
b2589d4f3f vm-virtio, vmm, vfio: Store GuestMemoryMmap in an Arc<ArcSwap<T>>
This allows us to change the memory map that is being used by the
devices via an atomic swap (by replacing the map with another one). The
ArcSwap provides the mechanism for atomically swapping from to another
whilst still giving good read performace. It is inside an Arc so that we
can use a single ArcSwap for all users.

Not covered by this change is replacing the GuestMemoryMmap itself.

This change also removes some vertical whitespace from use blocks in the
files that this commit also changed. Vertical whitespace was being used
inconsistently and broke rustfmt's behaviour of ordering the imports as
it would only do it within the block.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-01-02 13:20:11 +00:00
Rob Bradford
9fb1c46cd1 vm-virtio: Remove unnecessary cloning
Found by updated clippy:

error: redundant clone
   --> vm-virtio/src/block.rs:182:5
    |
182 |     .to_owned();
    |     ^^^^^^^^^^^ help: remove this
    |
    = note: `-D clippy::redundant-clone` implied by `-D warnings`
note: this value is dropped without further use
   --> vm-virtio/src/block.rs:176:21
    |
176 |       let device_id = format!(
    |  _____________________^
177 | |         "{}{}{}",
178 | |         blk_metadata.st_dev(),
179 | |         blk_metadata.st_rdev(),
180 | |         blk_metadata.st_ino()
181 | |     )
182 | |     .to_owned();
    | |____^
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#redundant_clone

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-12-20 00:52:03 +01:00
Sebastien Boeuf
9701fde209 vm-virtio: Add connection handshake to vsock
This patch has been cherry-picked from the Firecracker tree. The
reference commit is 1db04ccc69862f30b7814f30024d112d1b86b80e.

Changed the host-initiated vsock connection protocol to include a
trivial handshake.

The new protocol looks like this:
- [host] CONNECT <port><LF>
- [guest/success] OK <assigned_host_port><LF>

On connection failure, the host host connection is reset without any
accompanying message, as before.

This allows host software to more easily detect connection failures, for
instance when attempting to connect to a guest server that may have not
yet started listening for client connections.

Signed-off-by: Dan Horobeanu <dhr@amazon.com>
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-12-19 09:48:17 +01:00
Samuel Ortiz
664431ff14 vsock: vhost_user: vfio: Fix potential host memory overflow
The vsock packets that we're building are resolving guest addresses to
host ones and use the latter as raw pointers.
If the corresponding guest mapped buffer spans across several regions in
the guest, they will do so in the host as well. Since we have no
guarantees that host regions are contiguous, it may lead the VMM into
trying to access memory outside of its memory space.

For now we fix that by ensuring that the guest buffers do not span
across several regions. If they do, we error out.
Ideally, we should enhance the rust-vmm memory model to support safe
acces across host regions.

Fixes CVE-2019-18960

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-12-12 22:15:50 +01:00
Samuel Ortiz
a122da4bef vm-virtio: vhost: Implement the Pausable trait for all vhost-user devices
Due to the amount of code currently duplicated across vhost-user devices,
the stats for this commit is on the large side but it's mostly more
duplicated code, unfortunately.

Migratable and Snapshotable placeholder implementations are provided as
well, making all vhost-user devices Migratable.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-12-12 08:50:36 +01:00
Samuel Ortiz
dae0b2ef72 vm-virtio: Implement the Pausable trait for all virtio devices
Due to the amount of code currently duplicated across virtio devices,
the stats for this commit is on the large side but it's mostly more
duplicated code, unfortunately.

Migratable and Snapshotable placeholder implementations are provided as
well, making all virtio devices Migratable.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-12-12 08:50:36 +01:00
Samuel Ortiz
35d7721683 vmm: Convert virtio devices to Arc<Mutex<T>>
Migratable devices can be virtio or legacy devices.
In any case, they can potentially be tracked through one of the IO bus
as an Arc<Mutex<dyn BusDevice>>. In order for the DeviceManager to also
keep track of such devices as Migratable trait objects, they must be
shared as mutable atomic references, i.e. Arc<Mutex<T>>. That forces all
Migratable objects to be tracked as Arc<Mutex<dyn Migratable>>.

Virtio devices are typically migratable, and thus for them to be
referenced by the DeviceManager, they now should be built as
Arc<Mutex<VirtioDevice>>.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-12-12 08:50:36 +01:00
Sebastien Boeuf
8845326aa2 vm-virtio: Introduce DescriptorChain iterator
In order to iterate over a chain of descriptor chains, this code has
been ported over from crosvm, based on the commit
961461350c0b6824e5f20655031bf6c6bf6b7c30.

The main modification compared to the original code is the way the
sorting between readable and writable descriptors happens.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-11-22 22:17:47 +01:00
Rob Bradford
ff36fa99e6 vm-virtio: Replace use of deprecated std::mem::uninitialized
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-11-08 20:43:52 +00:00
Sergio Lopez
3a3dd0096c vm-virtio: export block::Request and related funcs/structs
Export block::Request and related functions and structs so the code
can be shared with vhost-user-blk.

Signed-off-by: Sergio Lopez <slp@redhat.com>
2019-11-07 10:36:30 +00:00
Sebastien Boeuf
5694ac2b1e vm-virtio: Create new VirtioTransport trait to abstract ioeventfds
In order to group together some functions that can be shared across
virtio transport layers, this commit introduces a new trait called
VirtioTransport.

The first function of this trait being ioeventfds() as it is needed from
both virtio-mmio and virtio-pci devices, represented by MmioDevice and
VirtioPciDevice structures respectively.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-31 09:30:59 +01:00
Sebastien Boeuf
c7cabc88b4 vmm: Conditionally update ioeventfds for virtio PCI device
The specific part of PCI BAR reprogramming that happens for a virtio PCI
device is the update of the ioeventfds addresses KVM should listen to.
This should not be triggered for every BAR reprogramming associated with
the virtio device since a virtio PCI device might have multiple BARs.

The update of the ioeventfds addresses should only happen when the BAR
related to those addresses is being moved.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-31 09:30:59 +01:00
Sebastien Boeuf
de21c9ba4f pci: Remove ioeventfds() from PciDevice trait
The PciDevice trait is supposed to describe only functions related to
PCI. The specific method ioeventfds() has nothing to do with PCI, but
instead would be more specific to virtio transport devices.

This commit removes the ioeventfds() method from the PciDevice trait,
adding some convenient helper as_any() to retrieve the Any trait from
the structure behing the PciDevice trait. This is the only way to keep
calling into ioeventfds() function from VirtioPciDevice, so that we can
still properly reprogram the PCI BAR.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-31 09:30:59 +01:00
Sebastien Boeuf
149b61b213 pci: Detect BAR reprogramming
Based on the value being written to the BAR, the implementation can
now detect if the BAR is being moved to another address. If that is the
case, it invokes move_bar() function from the DeviceRelocation trait.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-29 16:48:02 +01:00
Sebastien Boeuf
4f8054fa82 pci: Store the type of BAR to return correct address
Based on the type of BAR, we can now provide the correct address related
to a BAR index provided by the caller.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-29 16:48:02 +01:00
Sebastien Boeuf
efbafdf9ed vm-virtio: Allow 2MiB mappings
In order to speed up the boot time and reduce the amount of mappings,
this patch exposes the virtio-iommu device as supporting both 2M and 4k
page sizes.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-18 07:21:40 +02:00
Sebastien Boeuf
c65ead5de8 vm-virtio: Trigger external map/unmap from virtio-iommu
This patch relies on the trait implementation provided for each device
which requires some sort of external update based on a map or unmap.

Whenever a MAP or UNMAP request comes through the virtqueues, it
triggers a call to the external mapping trait with map()/unmap()
functions being invoked.

Those external mappings are meant to be used from VFIO and vhost-user
devices as they need to update their own mappings. In case of VFIO, the
goal is to update the DMAR table in the physical IOMMU, while vhost-user
devices needs to update their internal representation of the virtqueues.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-16 07:27:06 +02:00
Sebastien Boeuf
f40adff2a1 vm-virtio: Add virtio-iommu support
This patch introduces the first implementation of the virtio-iommu
device. This device emulates an IOMMU for the guest, which allows
special use cases like nesting passed through devices, or even using
IOVAs from the guest.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-07 10:12:07 +02:00
Sebastien Boeuf
0acb1e329d vm-virtio: Translate addresses for devices attached to IOMMU
In case some virtio devices are attached to the virtual IOMMU, their
vring addresses need to be translated from IOVA into GPA. Otherwise it
makes no sense to try to access them, and they would cause out of range
errors.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-07 10:12:07 +02:00
Sebastien Boeuf
6566c739e1 vm-virtio: Add IOMMU support to virtio-vsock
Adding virtio feature VIRTIO_F_IOMMU_PLATFORM when explicitly asked by
the user. The need for this feature is to be able to attach the virtio
device to a virtual IOMMU.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-07 10:12:07 +02:00
Sebastien Boeuf
9ab00dcb75 vm-virtio: Add IOMMU support to virtio-rng
Adding virtio feature VIRTIO_F_IOMMU_PLATFORM when explicitly asked by
the user. The need for this feature is to be able to attach the virtio
device to a virtual IOMMU.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-07 10:12:07 +02:00
Sebastien Boeuf
ee1899c6f6 vm-virtio: Add IOMMU support to virtio-pmem
Adding virtio feature VIRTIO_F_IOMMU_PLATFORM when explicitly asked by
the user. The need for this feature is to be able to attach the virtio
device to a virtual IOMMU.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-07 10:12:07 +02:00
Sebastien Boeuf
392f1ec155 vm-virtio: Add IOMMU support to virtio-console
Adding virtio feature VIRTIO_F_IOMMU_PLATFORM when explicitly asked by
the user. The need for this feature is to be able to attach the virtio
device to a virtual IOMMU.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-07 10:12:07 +02:00
Sebastien Boeuf
9fad680db1 vm-virtio: Add IOMMU support to virtio-net
Adding virtio feature VIRTIO_F_IOMMU_PLATFORM when explicitly asked by
the user. The need for this feature is to be able to attach the virtio
device to a virtual IOMMU.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-07 10:12:07 +02:00
Sebastien Boeuf
9ebb1a55bc vm-virtio: Add IOMMU support to virtio-blk
Adding virtio feature VIRTIO_F_IOMMU_PLATFORM when explicitly asked by
the user. The need for this feature is to be able to attach the virtio
device to a virtual IOMMU.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-07 10:12:07 +02:00
Sebastien Boeuf
85e1865cb5 vm-virtio: Implement reset() for vhost-user-fs
The virtio specification defines a device can be reset, which was not
supported by this vhost-user-fs implementation. The reason it is needed
is to support unbinding this device from the guest driver, and rebind it
to vfio-pci driver.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-07 10:12:07 +02:00
Sebastien Boeuf
4b1328a29c vm-virtio: Implement reset() for vhost-user-net
The virtio specification defines a device can be reset, which was not
supported by this vhost-user-net implementation. The reason it is needed
is to support unbinding this device from the guest driver, and rebind it
to vfio-pci driver.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-07 10:12:07 +02:00
Sebastien Boeuf
8225d4cd6e vm-virtio: Implement reset() for virtio-console
The virtio specification defines a device can be reset, which was not
supported by this virtio-console implementation. The reason it is needed
is to support unbinding this device from the guest driver, and rebind it
to vfio-pci driver.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-07 10:12:07 +02:00
Sebastien Boeuf
dac7737919 vm-virtio: Implement reset() for virtio-vsock
The virtio specification defines a device can be reset, which was not
supported by this virtio-vsock implementation. The reason it is needed
is to support unbinding this device from the guest driver, and rebind
it to vfio-pci driver.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-07 10:12:07 +02:00
Sebastien Boeuf
3e750de43f vm-virtio: Implement reset() for virtio-pmem
The virtio specification defines a device can be reset, which was not
supported by this virtio-pmem implementation. The reason it is needed
is to support unbinding this device from the guest driver, and rebind
it to vfio-pci driver.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-07 10:12:07 +02:00
Sebastien Boeuf
eb91bc812b vm-virtio: Implement reset() for virtio-rng
The virtio specification defines a device can be reset, which was not
supported by this virtio-rng implementation. The reason it is needed
is to support unbinding this device from the guest driver, and rebind
it to vfio-pci driver.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-07 10:12:07 +02:00
Sebastien Boeuf
59b4aaba87 vm-virtio: Implement reset() for virtio-net
The virtio specification defines a device can be reset, which was not
supported by this virtio-net implementation. The reason it is needed is
to support unbinding this device from the guest driver, and rebind it to
vfio-pci driver.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-10-07 10:12:07 +02:00
Cathy Zhang
d724511a91 vm-virtio: Add set_protocol_features in vhost-user-net
While implement vhost-user-net backend with Tap interface, it keeps
failed to enable the tx vring, since there is a checking in
slave_req_handler.rs to require acked_protocol_features to be setup
as a pre-requirement, which is filled by set_protocol_features call.
Add this call in vhost-user-net device implementation to address the issue.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
2019-09-30 13:06:00 -07:00
Sebastien Boeuf
2e2cad91ae vhost_user_backend: Add new crate
The purpose of this new crate is to provide a common library to all
vhost-user backend implementations. The more is handled by this library,
the less duplication will need to happen in each vhost-user daemon.

This crate relies a lot on vhost_rs, vm-memory and vm-virtio crates.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-09-30 09:26:11 -07:00
Rob Bradford
2ae3919181 vm-virtio: Fix formatting
With the 1.38.0 toolchain rustfmt is even stricter about formatting now

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-27 08:05:56 -07:00
Samuel Ortiz
3dc7aff00e vmm: Make vhost-user configuration owned
Convert Path to PathBuf, &str to String and remove the associated lifetime.

Fixes #298

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-09-24 08:39:39 +01:00
Sebastien Boeuf
f06b2aaaa7 vm-virtio: vhost-user: Set the right vring size
The vhost-user implementation was always passing the maximum size
supported by the virtqueues to the backend, but this is obviously wrong
as it must pass the size being set by the driver running in the guest.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-09-23 17:29:38 +01:00
Sebastien Boeuf
2cd406ba50 vm-virtio: Fix virtio-pci BAR type
The 32 or 64 bits type for the memory BAR was not set correctly. This
patch ensure the right type is applied to the BAR.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-09-21 09:11:30 +01:00
Sebastien Boeuf
d723b7dae8 vm-virtio: vhost-user-blk: Add support for reset
If we expect the vhost-user-blk device to be used for booting a VMM
along with the firmware, then need the device to support being reset.

In the vhost-user context, this means the backend needs to be informed
the vrings are disabled and stopped, and the owner needs to be reset
too.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-09-20 15:56:51 +02:00
Yang Zhong
360980d93c vhost-user-blk: enable write_config for WCE
In vhost-user-blk, only WCE value can be set back to device in
guest kernel like
echo "write through" > /sys/block/vda/cache_type

So write_config() will only set WCE value from guest kernel to
vhost user side.

Signed-off-by: Yang Zhong <yang.zhong@intel.com>
2019-09-20 15:56:51 +02:00
Yang Zhong
39083d705b vhost-user-blk: make read_config work
Since config space in vhost-user-blk are mostly from backend
device, this change will get config space info from backend
by vhost-user protocol.

Signed-off-by: Yang Zhong <yang.zhong@intel.com>
2019-09-20 15:56:51 +02:00
Yang Zhong
397d388710 vm-virtio: Add vhost-user-blk implementation
vhost-user-blk has better performance than virtio-blk, so we need
add vhost-user-blk support with SPDK in Rust-based VMMs.

Signed-off-by: Yang Zhong <yang.zhong@intel.com>
2019-09-20 15:56:51 +02:00
Sebastien Boeuf
0a0c7358a2 virtio-bindings: Rely on the upstream crate from rust-vmm
Now that virtio-bindings is a crate part of the rust-vmm project, we
want to rely on this one instead of the local one we had so far.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-09-19 07:13:54 -07:00
Rob Bradford
180e6d1e78 vm-virtio: Allocate BARs for virtio-block devices in 32-bit hole
Currently all devices and guest memory share the same 64GiB
allocation. With guest memory working upwards and devices working
downwards. This creates issues if you want to either have a VM with a
large amount of memory or want to have devices with a large allocation
(e.g. virtio-pmem.)

As it is possible for the hypervisor to place devices anywhere in its
address range it is required for simplistic users like the firmware to
set up an identity page table mapping across the full range. Currently
the hypervisor sets up an identify mapping of 1GiB which the firmware
extends to 64GiB to match the current address space size of the
hypervisor.

A simpler solution is to place the device needed for booting with the
firmware (virtio-block) inside the 32-bit memory hole. This allows the
firmware to easily access the block device and paves the way for
increasing the address space beyond the current 64GiB limit.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-19 10:43:55 +01:00
Rob Bradford
0739c2c7fd vm-virtio: Fix compilation warning from "mmio" feature only build
Use the correct constant for the newly initialised device state.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-16 08:55:35 -07:00
Rob Bradford
26974c7625 vm-virtio: Add MMIO transport
Derived from the crosvm code at 5656c124af2bb956dba19e409a269ca588c685e3
and adapted to work within cloud-hypervisor:

Main differences:

* Interrupt handling is done via a VirtioInterrupt turned into a
devices::Interrupt
* GuestMemory -> GuestMemoryMmap
* Differences in read/write for BusDevice
* Different crates for EventFd and GuestAddress

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-13 12:30:13 +01:00
Rob Bradford
c042483953 build: make PCI (virtio and vfio) disableable at build time
Although included by default it is now possible to build without PCI
support.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-13 12:30:13 +01:00
Sebastien Boeuf
7975394901 vm-virtio: vsock: Port unit testing from Firecracker
This unit testing porting effort is based off of Firecracker commit
1e1cb6f8f8003e0bdce11d265f0feb23249a03f6

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-09-06 10:51:25 -07:00
Sebastien Boeuf
5a3472847d vm-virtio: vsock: Implement VsockEpollHandler
This is the last step connecting the dots between the virtio-vsock
device and the bulk of the logic hosted in the unix and csm modules.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-09-06 10:51:25 -07:00
Sebastien Boeuf
475e487ac3 vmm: Create vsock backend
This commit relies on the new vsock::unix module to create the backend
that will be used from the virtio-vsock device.

The concept of backend is interesting here as it would allow for a vhost
kernel backend to be plugged if that was needed someday.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-09-06 10:51:25 -07:00
Sebastien Boeuf
434a5d0edf vm-virtio: vsock: Port submodule unix from Firecracker
This code porting is based off of Firecracker commit
1e1cb6f8f8003e0bdce11d265f0feb23249a03f6

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-09-06 10:51:25 -07:00
Sebastien Boeuf
df61a8fea2 vm-virtio: vsock: Port submodule csm and packet from Firecracker
This code porting is based off of Firecracker commit
1e1cb6f8f8003e0bdce11d265f0feb23249a03f6

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-09-06 10:51:25 -07:00
Sebastien Boeuf
22f91ab3a2 vm-virtio: Move vsock to its own module
There is a lot of code related to this virtio-vsock hybrid
implementation, that's why it's better to keep it under its
own module.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-09-06 10:51:25 -07:00
Sebastien Boeuf
c48ca61417 vm-virtio: Add virtio-vsock skeleton
This is the first commit introducing the support for virtio-vsock.

This is based off of Firecracker commit
1e1cb6f8f8003e0bdce11d265f0feb23249a03f6

Fixes #102

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-09-06 10:51:25 -07:00
Cathy Zhang
8c2a9a75ec vm-virtio: Update backend feature set for vhost-user-net
Regarding vhost-user-net, there are features in avail_features
and acked_features, like VIRTIO_NET_F_MAC which is required by
driver and device to transfer mac address through config space,
but not needed by backend, like ovs+dpdk, so it's necessary to
adjust backend_features based on acked_features before calling
set_features() API.

This fix is to record backend_features in vhost-user-net to avoid
requesting it twice.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
2019-09-05 07:11:58 -07:00
Cathy Zhang
b8622b5c69 vm-virtio: Address event count error and refactor data setting
New event is added in VhostUserEpollHandler for vhost-user fs,
but the total event count is not update accordingly. Fix the
issue and refactor the event data setting for new event
expansion in the future.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
2019-09-05 07:11:58 -07:00
Sebastien Boeuf
772191b409 vm-virtio: vhost-user: Rely on acked features to setup backend
At this point in the code, the acked features have been provided by the
guest and they can be set back to the backend. There's no need to
retrieve one more time the backend features for this purpose.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-31 17:33:17 +01:00
Sebastien Boeuf
97699a521f vm-virtio: vhost-user: Vring should be enabled after initialization
As mentioned in the vhost-user specification, each ring is initialized
in a stopped state. This means each ring should be enabled only after
it has been correctly initialized.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-31 17:33:17 +01:00
Sebastien Boeuf
a4ebcf486d vm-virtio: vhost-user-net: Map proper error when getting features
Simple patch replacing unwrap() with appropriate map_err().

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-31 17:33:17 +01:00
Sebastien Boeuf
cdfe576eb1 vm-virtio: vhost-user-net: Set the right set of features
The available features are masked with the backend features, therefore
the available features should be the one used when calling into
set_features() API.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-31 17:33:17 +01:00
Sebastien Boeuf
bc42420583 vm-virtio: Expand vhost-user handler to be reused from virtio-fs
In order to factorize the code between vhost-user-net and virtio-fs one
step further, this patch extends the vhost-user handler implementation
to support slave requests.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-31 17:33:17 +01:00
Sebastien Boeuf
b7d3ad9063 vm-virtio: fs: Factorize vhost-user setup
This patch factorizes the existing virtio-fs code by relying onto the
common code part of the vhost_user module in the vm-virtio crate.

In details, it factorizes the vhost-user setup, and reuses the error
types defined by the module instead of defining its own types.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-31 17:33:17 +01:00
Sebastien Boeuf
56cad00f2e vm-virtio: Move fs.rs to vhost_user module
vhost-user-net introduced a new module vhost_user inside the vm-virtio
crate. Because virtio-fs is actually vhost-user-fs, it belongs to this
new module and needs to be moved there.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-31 17:33:17 +01:00
Cathy Zhang
633f51af9c vm-virtio: Add vhost-user-net implementation
vhost-user framwork could provide good performance in data intensive
scenario due to the memory sharing mechanism. Implement vhost-user-net
device to get the benefit for Rust-based VMMs network.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
2019-08-30 15:00:26 +01:00
Sebastien Boeuf
dfb18ef14a net: Make TAP registration functions immutable
By making the registration functions immutable, this patch prevents from
self borrowing issues with the RwLock on self.mem.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-22 08:24:15 +01:00
Sebastien Boeuf
0b8856d148 vmm: Add RwLock to the GuestMemoryMmap
Following the refactoring of the code allowing multiple threads to
access the same instance of the guest memory, this patch goes one step
further by adding RwLock to it. This anticipates the future need for
being able to modify the content of the guest memory at runtime.

The reasons for adding regions to an existing guest memory could be:
- Add virtio-pmem and virtio-fs regions after the guest memory was
  created.
- Support future hotplug of devices, memory, or anything that would
  require more memory at runtime.

Because most of the time, the lock will be taken as read only, using
RwLock instead of Mutex is the right approach.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-22 08:24:15 +01:00
Sebastien Boeuf
ec0b5567c8 vmm: Share the guest memory instead of cloning it
The VMM guest memory was cloned (copied) everywhere the code needed to
have ownership of it. In order to clean the code, and in anticipation
for future support of modifying this guest memory instance at runtime,
it is important that every part of the code share the same instance.

Because VirtioDevice implementations need to have access to it from
different threads, that's why Arc must be used in this case.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-22 08:24:15 +01:00
Rob Bradford
f4d41d600b virtio: net: Remove TAP fd from epoll when no available descriptors
When there are no available descriptors in the queue (observed when the
network interface hasn't been brought up by the kernel) stop waiting for
notifications that the TAP fd should be read from.

This avoids a situation where the TAP device has data avaiable and wakes
up the virtio-net thread only for the virtio-net thread not read that
data as it has nowhere to put it.

When there are descriptors available in the queue then we resume waiting
for the epoll event on the TAP fd.

This bug demonstrated itself as 100% CPU usage for cloud-hypervisor
binary prior to the guest network interface being brought up. The
solution was inspired by the Firecracker virtio-net code.

Fixes: #208

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-08-21 08:41:28 -07:00
Sebastien Boeuf
44d8ab06ac vm-virtio: Remove unused dependency from unit tests
AtomicSize was imported but not used.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-21 08:51:25 +01:00
Sebastien Boeuf
658c076eb2 linters: Fix clippy issues
Latest clippy version complains about our existing code for the
following reasons:

- trait objects without an explicit `dyn` are deprecated
- `...` range patterns are deprecated
- lint `clippy::const_static_lifetime` has been renamed to
  `clippy::redundant_static_lifetimes`
- unnecessary `unsafe` block
- unneeded return statement

All these issues have been fixed through this patch, and rustfmt has
been run to cleanup potential formatting errors due to those changes.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-15 09:10:04 -07:00
Sebastien Boeuf
2e0508cdc6 vm-virtio: fs: Add DAX shared region support
This patch enables the vhost-user protocol features to let the slave
initiates some request towards the master (VMM). It also takes care of
receiving the requests from the slave and take appropriate actions based
on the request type.

The way the flow works now are as follow:
 - The VMM creates a region of memory that is made available to the
   guest by exposing it through the virtio-fs PCI BAR 2.
 - The virtio-fs device is created by the VMM, exposing some protocol
   features bits to virtiofsd, letting it know that it can send some
   request to the VMM through a dedicated socket.
 - On behalf of the guest driver asking for reading or writing a file,
   virtiofsd sends a request to the VMM, asking for a file descriptor to
   be mapped into the shared memory region at a specific offset.
 - The guest can directly read/write the file at the offset of the
   memory region.

This implementation is more performant than the one using exclusively
the virtqueues. With the virtqueues, the content of the file needs to be
copied to the queues every time the guest is asking to access it.
With the shared memory region, the virtqueues become the control plane
where the libfuse commands are sent to virtiofsd. The data plane is
literally the whole memory region which does not need any extra copy of
the file content. The only penalty is the first time a file is accessed,
it needs to be mapped into the VMM virtual address space.

Another interesting case where this solution will not perform as well as
expected is when a file is larger than the region itself. This means the
file needs to be mapped in several times, but more than that this means
it needs to be remapped every time it's being accessed.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-13 13:57:53 +02:00
Sebastien Boeuf
3c29c47783 vmm: Create shared memory region for virtio-fs
When the cache_size parameter from virtio-fs device is not empty, the
VMM creates a dedicated memory region where the shared files will be
memory mapped by the virtio-fs device.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-13 13:57:53 +02:00
Sebastien Boeuf
f30ba069b7 vm-virtio: Allocate shared memory regions on dedicated BAR
In the context of shared memory regions, they could not be present for
most of the virtio devices. For this reason, we prefer dedicate a BAR
for the shared memory regions.

Another reason is that memory regions, if there are several, can be
allocated all at once as a contiguous region, which then can be used as
its own BAR. It would be more complicated to try to allocate the BAR 0
holding the regular information about the virtio-pci device along with
the shared memory regions.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-13 13:57:53 +02:00
Sebastien Boeuf
e0fda0611c vm-virtio: Remove virtio-pci dependency from VirtioDevice
This patch cleans up the VirtioDevice trait. Since some function are PCI
specific and since they are not even used, it makes sense to remove them
from the trait definition.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-13 13:57:53 +02:00
Sebastien Boeuf
e2b38cc050 vm-virtio: Extend VirtioDevice trait to retrieve shared memory regions
Based on the newly added SharedMemoryConfig capability to the virtio
specification, and based on the fact that it is not tied to the type of
transport (pci or mmio), we can create as part of the VirtioDevice trait
a new method that will provide the shared memory regions associated with
the device.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-13 13:57:53 +02:00
Sebastien Boeuf
d97079d793 vm-virtio: Update VirtioPciCap and introduce VirtioPciCap64
Based on the latest version of the virtio specification, the structure
virtio_pci_cap has been updated and a new structure virtio_pci_cap64 has
been introduced.

virtio_pci_cap now includes a field "id" that does not modify the
existing structure size since there was a 3 bytes reserved field
already there. The id is used in the context of shared memory regions
which need to be identified since there could be more than one of this
kind of capability.

virtio_pci_cap64 is a new structure that includes virtio_pci_cap and
extends it to allow 64 bits offsets and 64 bits region length. This is
used in the context of shared memory regions capability, as we might need
to describe regions of 4G or more, that could be placed at a 4G offset
or more in the associated BAR.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-13 13:57:53 +02:00
Sebastien Boeuf
d180deb679 vm-virtio: pci: Fix PCI capability length
The length of the PCI capability as it is being calculated by the guest
was not accurate since it was not including the implicit 2 bytes offset.

The reason for this offset is that the structure itself does not contain
the capability ID (1 byte) and the next capability pointer (1 byte), but
the structure exposed through PCI config space does include those bytes.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-13 13:57:53 +02:00
Rob Bradford
6c06420a11 vm-virtio: net: Fix out-of-range slice panic when under load
The numbr of bytes read was being incorrectly increased by the potential
length of the end of the sliced data rather than the number of bytes
that was in the range. This caused a panic when the the network was
under load by using iperf.

It's important to note that in the Firecracker code base the function
that read_slice() returns the number of bytes read which is used to
increment this counter. The VM memory version however only returns the
empty unit "()".

Fixes: #166

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-08-12 15:35:11 +01:00
fazlamehrab
df5058ec0a vm-virtio: Implement console size config feature
One of the features of the virtio console device is its size can be
configured and updated. Our first iteration of the console device
implementation is lack of this feature. As a result, it had a
default fixed size which could not be changed. This commit implements
the console config feature and lets us change the console size from
the vmm side.

During the activation of the device, vmm reads the current terminal
size, sets the console configuration accordinly, and lets the driver
know about this configuration by sending an interrupt. Later, if
someone changes the terminal size, the vmm detects the corresponding
event, updates the configuration, and sends interrupt as before. As a
result, the console device driver, in the guest, updates the console
size.

Signed-off-by: A K M Fazla Mehrab <fazla.mehrab.akm@intel.com>
2019-08-09 13:55:43 -07:00
Sebastien Boeuf
aa44726658 vm-virtio: Don't trigger an MSI-X interrupt if not enabled
Relying on the newly added MSI-X helper, the interrupt callback checks
the interrupts are enabled on the device before to try triggering the
interrupt.

Fixes #156

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-08 17:38:47 +01:00
Rob Bradford
9caad7394d build, misc: Bump vmm-sys-util dependency
The structure of the vmm-sys-util crate has changed with lots of code
moving to submodules.

This change adjusts the use of the imported structs to reference the
submodules.

Fixes: #145

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-08-02 07:42:20 -07:00
Sebastien Boeuf
baec27698e vm-virtio: Don't break from epoll loop on EINTR
The existing code taking care of the epoll loop was too restrictive as
it was propagating the error returned from the epoll_wait() syscall, no
matter what was the error. This causes the epoll loop to be broken,
leading to a non-functional virtio device.

This patch enforces the parsing of the returned error and prevent from
the error propagation in case it is EINTR, which stands for Interrupted.
In case the epoll loop is interrupted, it is appropriate to retry.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-02 08:37:34 +01:00
Sebastien Boeuf
98d7955e34 vm-virtio: Add support for notifying about virtio config update
As per the VIRTIO specification, every virtio device configuration can
be updated while the guest is running. The guest needs to be notified
when this happens, and it can be done in two different ways, depending
on the type of interrupt being used for those devices.

In case the device uses INTx, the allocated IRQ pin is shared between
queues and configuration updates. The way for the guest to differentiate
between an interrupt meant for a virtqueue or meant for a configuration
update is tied to the value of the ISR status field. This field is a
simple 32 bits bitmask where only bit 0 and 1 can be changed, the rest
is reserved.

In case the device uses MSI/MSI-X, the driver should allocate a
dedicated vector for configuration updates. This case is much simpler as
it only requires the device to send the appropriate MSI vector.

The cloud-hypervisor codebase was not supporting the update of a virtio
device configuration. This patch extends the existing VirtioInterrupt
closure to accept a type that can be Config or Queue, so that based on
this type, the closure implementation can make the right choice about
which interrupt pin or vector to trigger.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-07-29 15:34:37 +01:00
fazlamehrab
577d44c8eb vm-virtio: Add virtio console device for single port operation
The virtio console device is a console for the communication between
the host and guest userspace. It has two parts: the device and the
driver. The console device is implemented here as a virtio-pci device
to the guest. On the other side, the guest OS expected to have a
character device driver which provides an interface to the userspace
applications.

The console device can have multiple ports where each port has one
transmit queue and one receive queue. The current implementation only
supports one port. For data IO communication, one or more empty
buffers are placed in the receive queue for incoming data, and
outgoing characters are placed in the transmit queue. Details spec
can be found from the following link.

https://docs.oasis-open.org/virtio/virtio/v1.1/csprd01/virtio-v1.1-csprd01.pdf#e7

Apart from the console, for the communication between guest and host,
the Cloud Hypervisor has a legacy serial device implemented. However,
the implementation of a console device lets us be independent of legacy
pin-based interrupts without losing the logs and access to the VM.

Signed-off-by: A K M Fazla Mehrab <fazla.mehrab.akm@intel.com>
2019-07-22 23:08:56 +01:00
Chao Peng
96fb38a5aa vm-allocator: Align address at allocation time
There is alignment support for AddressAllocator but there are occations
that the alignment is known only when we call allocate(). One example
is PCI BAR which is natually aligned, means for which we have to align
the base address to its size.

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
2019-07-22 09:51:16 -07:00
Sebastien Boeuf
1268165040 pci: Allow for registering IO and Memory BAR
This patch adds the support for both IO and Memory BARs by expecting
the function allocate_bars() to identify the type of each BAR.
Based on the type, register_mapping() insert the address range on the
appropriate bus (PIO or MMIO).

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-07-22 09:50:10 -07:00
Sebastien Boeuf
72007f016a pci: Improve MSI-X code to let VFIO rely on it
This commit enhances the current msi-x code hosted in the pci crate
in order to be reused by the vfio crate. Specifically, it creates
several useful methods for the MsixCap structure that can simplify
the caller's code.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-07-22 09:50:10 -07:00
Rob Bradford
7499210d0c vm-virtio: net: Remove attributes for test exclusions
Now that the tests are in use this import and function is used.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-07-16 17:09:05 +02:00
Rob Bradford
af15ce9dc3 vm-virtio: Update test activate() function
The type of interrupt_evt has changed along with the addition of an
msix_config member for the virtio device.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-07-16 17:09:05 +02:00
Samuel Ortiz
4605ecf1a8 pci: Extend the Device trait to carry the device BARs
When reading from or writing to a PCI BAR to handle a VM exit, we need
to have the BAR address itself to be able to support multiple BARs PCI
devices.

Fixes: #87

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-07-08 07:39:21 +02:00
Samuel Ortiz
8173e1ccd7 devices: Extend the Bus trait to carry the device range base
With the range base for the IO/MMIO vm exit address, a device with
multiple ranges has all the needed information for resolving which of
its range the exit is coming from

Fixes: #87

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-07-08 07:39:21 +02:00
Samuel Ortiz
4a15316101 vm-virtio: Fix the network and storage PCI class and sub-class
Use the virtio device type to generate the righ class and subclass.

Fixes: #83

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-07-02 17:37:12 +02:00
Samuel Ortiz
77684f473d vm-virtio: Implement the u32 to VirtioDeviceType conversion
The From trait allows us to compare and convert an integer
with and into a virtio device type.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-07-02 17:37:12 +02:00
Sebastien Boeuf
8862d61042 vm-virtio: Add virtio-pmem implementation
This commit introduces the implementation of the virtio-pmem device
based on the pending proposal of the virtio specification here:
https://lists.oasis-open.org/archives/virtio-dev/201903/msg00083.html

It is also based on the kernel patches coming along with the virtio
proposal: https://lkml.org/lkml/2019/6/12/624

And it is based off of the current crosvm implementation found in
devices/src/virtio/pmem.rs relying on commit
bb340d9a94d48514cbe310d05e1ce539aae31264

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-07-01 14:38:55 +01:00
Sebastien Boeuf
1ddc8f2f0d vm-virtio: Add vhost-user-fs support
The vhost-user-fs or virtio-fs device allows files and directories to
be shared between host and guest. This patch adds the implementation
of this device to the cloud-hypervisor device model.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-06-27 21:46:00 +02:00
Sebastien Boeuf
8dc06aa50d vm-virtio: Remove unneeded code
Remove legacy code coming from Firecracker and/or Crosvm.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-06-27 21:46:00 +02:00
Jing Liu
9da2343cb7 device: Improvement for BusDevice trait and PciDevice trait
BusDevice includes two methods which are only for PCI devices, which should
be as members of PciDevice trait for a better clean high level APIs.

Signed-off-by: Jing Liu <jing2.liu@linux.intel.com>
2019-06-25 06:17:30 -07:00
Sebastien Boeuf
24dbe7003a irq: Fix pin based interrupt for virtio-pci
When the KVM capability KVM_CAP_SIGNAL_MSI is not present, the VMM
falls back from MSI-X onto pin based interrupts. Unfortunately, this
was not working as expected because the VirtioPciDevice object was
always creating an MSI-X capability structure in the PCI configuration
space. This was causing the guest drivers to expect MSI-X interrupts
instead of the pin based generated ones.

This patch takes care of avoiding the creation of a dedicated MSI-X
capability structure when MSI is not supported by KVM.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-06-07 18:19:52 +01:00
Sebastien Boeuf
4d98dcb077 msix: Handle MSI-X device masking
As mentioned in the PCI specification, the Function Mask from the
Message Control Register can be set to prevent a device from injecting
MSI-X messages. This supersedes the vector masking as it interacts at
the device level.

Here quoted from the specification:
For MSI and MSI-X, while a vector is masked, the function is prohibited
from sending the associated message, and the function must set the
associated Pending bit whenever the function would otherwise send the
message. When software unmasks a vector whose associated Pending bit is
set, the function must schedule sending the associated message, and
clear the Pending bit as soon as the message has been sent. Note that
clearing the MSI-X Function Mask bit may result in many messages
needing to be sent.

This commit implements the behavior described above by reorganizing
the way the PCI configuration space is being written. It is indeed
important to be able to catch a change in the Message Control
Register without having to implement it for every PciDevice
implementation. Instead, the PciConfiguration has been modified to
take care of handling any update made to this register.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-06-07 13:33:53 +01:00
Sebastien Boeuf
d810c7712d msix: Handle MSI-X vector masking
The current MSI-X implementation completely ignores the values found
in the Vector Control register related to a specific vector, and never
updates the Pending Bit Array.

According to the PCI specification, MSI-X vectors can be masked
through the Vector Control register on bit 0. If this bit is set,
the device should not inject any MSI message. When the device
runs into such situation, it must not inject the interrupt, but
instead it must update the bit corresponding to the vector number
in the Pending Bit Array.

Later on, if/when the Vector Control register is updated, and if
the bit 0 is flipped from 0 to 1, the device must look into the PBA
to find out if there was a pending interrupt for this specific
vector. If that's the case, an MSI message is injected and the
bit from the PBA is cleared.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-06-07 13:33:53 +01:00
Sebastien Boeuf
42378caa8b vm-virtio: Fix alignment and MSI-X table size on the BAR
As mentioned in the PCI specification:

If a dedicated Base Address register is not feasible, it is
recommended that a function isolate the MSI-X structures from
the non-MSI-X structures with aligned 8 KB ranges rather than
the mandatory aligned 4 KB ranges.

That's why this patch ensures that each structure present on the
BAR is 8KiB aligned.

It also fixes the MSI-X table and PBA sizes so that they can support
up to 2048 vectors, as specified for MSI-X.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-06-07 13:33:53 +01:00
Sebastien Boeuf
47a4065aaf interrupt: Use a single closure to describe pin based and MSI-X
In order to factorize the complexity brought by closures, this commit
merges IrqClosure and MsixClosure into a generic InterruptDelivery one.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-06-06 15:27:35 +01:00
Sebastien Boeuf
8df05b72dc vmm: Add MSI-X support to virtio-pci devices
In order to allow virtio-pci devices to use MSI-X messages instead
of legacy pin based interrupts, this patch implements the MSI-X
support for cloud-hypervisor. The VMM code and virtio-pci bits have
been modified based on the "msix" module previously added to the pci
crate.

Fixes #12

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-06-06 15:27:35 +01:00
Sebastien Boeuf
d3c7b45542 interrupt: Make IRQ delivery generic
Because we cannot always assume the irq fd will be the way to send
an IRQ to the guest, this means we cannot make the assumption that
every virtio device implementation should expect an EventFd to
trigger an IRQ.

This commit organizes the code related to virtio devices so that it
now expects a Rust closure instead of a known EventFd. This lets the
caller decide what should be done whenever a device needs to trigger
an interrupt to the guest.

The closure will allow for other type of interrupt mechanism such as
MSI to be implemented. From the device perspective, it could be a
pin based interrupt or an MSI, it does not matter since the device
will simply call into the provided callback, passing the appropriate
Queue as a reference. This design keeps the device model generic.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-06-06 15:27:35 +01:00
Chao Peng
6ecdd98634 virtio: Enable qcow support for virtio-block
With this enabled, one can pass a QCOW format disk
image with '--disk' switch.

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
2019-05-13 22:08:29 +01:00
Samuel Ortiz
fe99c29743 vm-virtio: Remove useless PCI BAR debug log
We should not unconditionally display our virtio PCI BAR setting.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-05-10 16:32:39 +02:00
Chao Peng
8e7579b20e vm-virtio: Add virtio-rng implementation
Most of the code is taken from crosvm(bbd24c5) but is modified to
be adapted to the current VirtioDevice definition and epoll
implementation.

A new command option '--rng' is provided and it gives one the option
to override the entropy source which is /dev/urandom by default.

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
2019-05-10 16:32:39 +02:00
Sebastien Boeuf
6d27cfb3b6 vm-virtio: Create virtio-net device
In order to provide connectivity through network interface between
host and guest, this patch introduces the virtio-net backend.

This code is based on Firecracker commit
d4a89cdc0bd2867f821e3678328dabad6dd8b767

It is a trimmed down version of the original files as it removes the
rate limiter support. It has been ported to support vm-memory crate
and the epoll handler has been modified in order to run a dedicated
epoll loop from the device itself. This epoll loop runs in its own
dedicated thread.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-05-10 16:32:39 +02:00
Rob Bradford
1151b07682 vm-virtio: block: Add support for resetting a block device
As it is necessary to return the interrupt EventFD and the queue EventFD
to the transport layer upon reset the activate function has been
modified to clone these descriptors as well as the underlying disk
itself.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-05-09 15:44:18 +02:00
Rob Bradford
3b2faa9f11 vm-virtio: Reset underlying device on driver request
If the driver triggers a reset by writing zero into the status register
then reset the underlying device if supported. A device reset also
requires resetting various aspects of the queue.

In order to be able to do a subsequent reactivate it is required to
reclaim certain resources (interrupt and queue EventFDs.) If a device
reset is requested by the driver but the underlying device does not
support it then generate an error as the driver would not be able to
configure it anyway.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-05-09 15:44:18 +02:00
Samuel Ortiz
040ea5432d cloud-hypervisor: Add proper licensing
Add the BSD and Apache license.
Make all crosvm references point to the BSD license.
Add the right copyrights and identifier to our VMM code.
Add Intel copyright to the vm-virtio and pci crates.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-05-09 15:44:17 +02:00
Sebastien Boeuf
b67e0b3dad vmm: Use virtio-blk to support booting from disk image
After the virtio-blk device support has been introduced in the
previous commit, the vmm need to rely on this new device to boot
from disk images instead of initrd built into the kernel.

In order to achieve the proper support of virtio-blk, this commit
had to handle a few things:

  - Register an ioevent fd for each virtqueue. This important to be
    notified from the virtio driver that something has been written
    on the queue.

  - Fix the retrieval of 64bits BAR address. This is needed to provide
    the right address which need to be registered as the notification
    address from the virtio driver.

  - Fix the write_bar and read_bar functions. They were both assuming
    to be provided with an address, from which they were trying to
    find the associated offset. But the reality is that the offset is
    directly provided by the Bus layer.

  - Register a new virtio-blk device as a virtio-pci device from the
    vm.rs code. When the VM is started, it expects a block device to
    be created, using this block device as the VM rootfs.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-05-08 08:55:09 +02:00
Sebastien Boeuf
65f96e408f virtio: Add virtio-blk implementation
This commit introduces the virtio-blk backend implementation, which is
the first device implementing the VirtioDevice trait.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-05-08 08:55:09 +02:00
Samuel Ortiz
c2c51dc9d1 vm-virtio: Add PCI transport support
Copied from crosvm 107edb3e with one main modification: VirtioPciDevice
implements BusDevice.

We need this modification because it is the only way for us to be able
to add a VirtioPciDevice to the MMIO bus. Bus insertion takes a
BusDevice. The fact that VirtioPciDevice implements PciDevice which
itself implements BusDevice does not mean that Rust will automatically
downcast a VirtioPciDevice into a BusDevice.

crosvm works around that issue by having the PCI, virtio and BusDevice
implementations in the same crate.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-05-08 08:55:06 +02:00
Samuel Ortiz
8246434710 vm-virtio: Initial crate
Copied from Firecracker 17a9089d for the queue implementation and from
crosvm 107edb3e for the device Trait. The device trait has some PCI
specific methods hence its crosvm origin.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-05-08 08:55:06 +02:00