Commit Graph

341 Commits

Author SHA1 Message Date
Stefano Garzarella
bb3cf7c30c vsock: add handshake regression test
This patch has been cherry-picked from the Firecracker tree. The
reference commit is 6dbe8e021a64ba3742081741a7538cdfd93a102e.

Signed-off-by: Gabriel Ionescu <gbi@amazon.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
2020-06-16 22:02:06 +02:00
Stefano Garzarella
6ab4e247e6 vsock: fixed TX buf flushing
This patch has been cherry-picked from the Firecracker tree. The
reference commit is 78819f35f63f5777a58e3e1e774b3270b32881ed.

The vsock TX buffer flush operation would report inconsistent results,
under specific circumstances.

The flush operation is performed in two steps, since it's dealing with a
ring buffer, an the data to be flushed may wrap around. If the first
step was successful, but the second one failed, the whole flush
operation would report an error, thus causing flow control accounting to
lose track of the bytes that were successfully written by the first
pass.

This commit changes the flush behavior to always report success when
some data has been written to the backing stream.

Signed-off-by: Dan Horobeanu <dhr@amazon.com>
Signed-off-by: Gabriel Ionescu <gbi@amazon.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
2020-06-16 22:02:06 +02:00
Stefano Garzarella
aca2baf458 vsock: fixed flow control regression
This patch has been cherry-picked from the Firecracker tree. The
reference commit is 2da612a9cdce85c91fb54ab22d950ec6ccc93b27.

Fixed a bug introduced by a271d08f0b1ba0ee82761cd49244b6a8017bcede,
whereby the flow control accouting would be off by a few bytes, for
host-initiated connections.

The connection ack message ("OK <port_num><CR>") was accounted for as
data sent by the guest, so its length was substracted from the total
amount of data the guest was allowed to send.

This commit changes the way this ack message is sent, so that it
bypasses flow control accouting.

Signed-off-by: Dan Horobeanu <dhr@amazon.com>
Signed-off-by: Gabriel Ionescu <gbi@amazon.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
2020-06-16 22:02:06 +02:00
Stefano Garzarella
3ce4bd5ec8 vsock: absorb spurious EPOLLOUT events
This patch has been cherry-picked from the Firecracker tree. The
reference commit is 109e631566350867dafa4b16c3919dfd1533eeea.

This commit changes the vsock connection state machine behavior to absorb
any EWOULDBLOCK errors recevied while handling an EPOLLOUT event. Previously,
this condition would lead to immediate connection termination.

Signed-off-by: Dan Horobeanu <dhr@amazon.com>
Signed-off-by: Gabriel Ionescu <gbi@amazon.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
2020-06-16 22:02:06 +02:00
Stefano Garzarella
0530b4e1ed vsock: absorb spurious EPOLLIN events
This patch has been cherry-picked from the Firecracker tree. The
reference commit is 660d18cf7fee5b38c3b1b17a5da6544b9025909d.

Apparently, epoll_wait sometimes yields false EPOLLIN events (i.e. events
follwing which read() would fail with EWOULDBLOCK). This would cause the
vsock connection state machine to terminate connections, since an error
was detected on the underlying Unix socket.

This commit changes the vsock connection state machine code to handle such
erroneous EPOLLIN events by absorbing EWOULDBLOCK read() errors.

Signed-off-by: Dan Horobeanu <dhr@amazon.com>
Signed-off-by: Gabriel Ionescu <gbi@amazon.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
2020-06-16 22:02:06 +02:00
Stefano Garzarella
a3f24e5fb9 vsock: flow control fix
This patch has been cherry-picked from the Firecracker tree. The
reference commit is 1cc8b8a678eb28b20f5843556bdb7fbb2dfa6284.

Fixed a logical error in the vsock flow control, that would cause credit
update packets to not be sent at the right time.

Signed-off-by: Dan Horobeanu <dhr@amazon.com>
Signed-off-by: Gabriel Ionescu <gbi@amazon.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
2020-06-16 22:02:06 +02:00
Stefano Garzarella
5fc52a5056 vsock: fixed rxq logic
This patch has been cherry-picked from the Firecracker tree. The
reference commit is d2475773557c82d2abad2fc8bdf69e7d01444109.

Fixed a vsock muxer issue that would cause a connection to be removed
from the RX queue, even though it still had pending RX data.

Signed-off-by: Dan Horobeanu <dhr@amazon.com>
Signed-off-by: Gabriel Ionescu <gbi@amazon.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
2020-06-16 22:02:06 +02:00
Stefano Garzarella
096ffe08f2 vm-virtio: vsock: add is_empty method to VsockPacket
This patch adds `is_empty` method to VsockPacket to fix the
following clippy error:

error: item `vsock::packet::VsockPacket` has a public `len` method but no corresponding `is_empty` method
   --> vm-virtio/src/vsock/packet.rs💯1
    |
100 | / impl VsockPacket {
101 | |     /// Create the packet wrapper from a TX virtq chain head.
102 | |     ///
103 | |     /// The chain head is expected to hold valid packet header data. A following packet buffer
...   |
334 | |     }
335 | | }
    | |_^
    |
    = note: `-D clippy::len-without-is-empty` implied by `-D warnings`
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#len_without_is_empty

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
2020-06-15 18:31:54 +01:00
Stefano Garzarella
b74a855446 vm-virtio: make VsockPacket public
This patch makes VsockPacket public to allow other crates
(e.g. vhost-user-vsock) to use it.

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
2020-06-15 18:31:54 +01:00
Anatol Belski
abd6204d27 source: Fix file permissions
Rust sources and some data files should not be executable. The perms are
set to 644.

Signed-off-by: Anatol Belski <ab@php.net>
2020-06-10 18:47:27 +01:00
Rob Bradford
9b71ba20ac vmm, vm-virtio: Stop always autogenerating a host MAC address
This removes the need to use CAP_NET_ADMIN privileges and instead the
host MAC addres is either provided by the user or alternatively it is
retrieved from the kernel.

TEST=Run cloud-hypervisor without CAP_NET_ADMIN permission and a
preconfigured tap device:

sudo ip tuntap add name tap0 mode tap
sudo ifconfig tap0 192.168.249.1 netmask 255.255.255.0 up
cargo clean
cargo build
target/debug/cloud-hypervisor --serial tty --console off --kernel ~/src/rust-hypervisor-firmware/target/target/release/hypervisor-fw --disk path=~/workloads/clear-33190-kvm.img --net tap=tap0

VM was also rebooted to check that works correctly.

Fixes: #1274

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-06-08 17:56:10 +02:00
Bo Chen
a8cdf2f070 tests,vm-virtio,vmm: Use 'socket' for all CLI/API parameters
This patch unifies the inconsistent uses of 'socket' and 'sock' from our
CLI/API parameters.

Fixes: #1091

Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-06-08 17:41:12 +02:00
Arron Wang
6ff107afe4 vm-device: Switch to use get_host_address_range in vfio-ioctls
The API has change to use generic GuestMemory trait:
pub fn get_host_address_range<M: GuestMemory>(
    mem: &M,
    addr: GuestAddress,
    size: usize,
) -> Option<*mut u8> {

Signed-off-by: Arron Wang <arron.wang@intel.com>
2020-06-04 08:48:55 +02:00
Samuel Ortiz
3336e80192 vfio: Switch to the vfio-ioctls crate ch branch
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-06-04 08:48:55 +02:00
Rob Bradford
a4d377a066 vm-virtio: net: Implement VIRTIO_RING_F_EVENT_IDX
If VIRTIO_RING_F_EVENT_IDX is negotiated only generate suppress
interrupts if the guest has asked us to do so.

Fixes: #788

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-06-03 08:28:49 +02:00
Rob Bradford
f06970730b vm-virtio: net: Handle lost interrupts on restore
In some situations it is seen that the first interrupt sent to the guest
is lost upon a restore (due to the tap worker being awake ahead of the
vPUs).

This causes problems with VIRTIO_RING_F_EVENT_IDX interrupt suppression
as the guest will not be interrupted again in order to mitigate this we
always interrupt the guest until the device itself has been signalled by
the guest.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-06-03 08:28:49 +02:00
Rob Bradford
a5596020b3 vm-virtio: Add some info! level debugging interrupt generation
This was very helpful when debugging interrupt issues and will be useful
for the future.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-06-03 08:28:49 +02:00
Rob Bradford
fcc62efc41 vm-virtio: net: Prepare NetQueuePair for use in vhost-user-net
This requires exposing the struct members and also using Option<..>
types for the main epoll fd and the memory as they are initialised later
in vhost-user-net.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-06-02 13:26:52 +02:00
Rob Bradford
2dbd11864e vm-virtio: net: Split network handling
Split handling of behaviour that is independent of the device itself so
that it can be reused in the vhost-user-net device.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-06-02 13:26:52 +02:00
Rob Bradford
237cb184b4 vm-virtio: net: Add further missing error reporting
Ensure that errors generated from rx_single_frame are propagated
correctly.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-06-02 13:26:52 +02:00
Rob Bradford
36d072e69c vm-virtio: Add error propagation for TAP listener (un)registration
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-06-02 13:26:52 +02:00
Rob Bradford
3151b5d82a vm-virtio: net: Refactor to support code reuse
Split out functions that work just on the TAP device and queues. Whilst
doing so also improve the error handling to return Results rather than
drop errors.

This change also addresses a bug where the TAP event suppression could
ineffectual because it was being enabled immediately after it may have
been disabled:

resume_rx -> rx_single_frame -> unregister_listener -> resume_rx ->
register_listener.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-06-02 13:26:52 +02:00
dependabot-preview[bot]
aac87196d6 build(deps): bump vm-memory from 0.2.0 to 0.2.1
Bumps [vm-memory](https://github.com/rust-vmm/vm-memory) from 0.2.0 to 0.2.1.
- [Release notes](https://github.com/rust-vmm/vm-memory/releases)
- [Changelog](https://github.com/rust-vmm/vm-memory/blob/v0.2.1/CHANGELOG.md)
- [Commits](https://github.com/rust-vmm/vm-memory/compare/v0.2.0...v0.2.1)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-05-28 17:06:48 +01:00
Rob Bradford
c31ad72ee9 build: Address issues found by 1.43.0 clippy
These are mostly due to use of "bare use" statements and unnecessary vector
creation.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-05-27 19:32:12 +02:00
dependabot-preview[bot]
a4bb96d45c build(deps): bump libc from 0.2.70 to 0.2.71
Bumps [libc](https://github.com/rust-lang/libc) from 0.2.70 to 0.2.71.
- [Release notes](https://github.com/rust-lang/libc/releases)
- [Commits](https://github.com/rust-lang/libc/compare/0.2.70...0.2.71)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-05-27 09:02:13 +02:00
Rob Bradford
af8292b623 vmm, config, vhost_user_blk: remove "wce" parameter
This config option provided very little value and instead we now enable
this feature (which then lets the guest control the cache mode)
unconditionally.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-05-21 08:40:43 +02:00
Rob Bradford
9101bdd7a9 vm-virtio: block: Ensure backing file consistency
Correctly implement the virtio specification by setting the writeback
field on the request based on the algorithm in the spec.

TEST=Boot with hypervisor-firmware with CH in verbose mode. See info
level messages saying cache mode is writethrough in firmware (no support
for flush or WCE). Once in the Linux kernel see messages that mode is
writeback.

Fixes: #1216
Fixes: #680

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-05-21 08:40:43 +02:00
Rob Bradford
10db2131bd vm-virtio: block: Add "writeback" control to Request
When this is set to false the write needs to be followed by a flush on
the underlying disk (leading to a fsync()).

The default behaviour is not changed with this change.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-05-21 08:40:43 +02:00
Rob Bradford
1fac263263 vm-virtio: Use config name as per spec
The spec calls this field "writeback" which is much clearer than than
"wce".

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-05-21 08:40:43 +02:00
Rob Bradford
a813b57f59 vm-virtio, vhost_user_{fs,block,backend}: Move EVENT_IDX handling
Move the method that is used to decide whether the guest should be
signalled into the Queue implementation from vm-virtio. This removes
duplicated code between vhost_user_backend and the vm-virtio block
implementation.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-05-20 12:56:25 +02:00
Rob Bradford
8ae7a38da5 build: Use same virtio-bindings version
Consistently use the crates.io 0.1.0 version based on Linux 5.0.0

Fixes: #1192

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-05-20 12:55:59 +02:00
Rob Bradford
3947809c36 vm-virtio: block: Ensure that VIRTIO_BLK_T_FLUSH requests actually sync
The implementation of this virtio block (and vhost-user block) command
called a function that was a no-op on Linux. Use the same function as
virtio-pmem to ensure that data is not lost when the guest asks for it
to be flused to disk.

Fixes: #399

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-05-20 12:54:10 +02:00
Sebastien Boeuf
f442c62bc5 vm-virtio: Implement Snapshottable trait for Vsock
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-05-19 18:52:08 +02:00
Sebastien Boeuf
646d33fea3 vm-virtio: Set queue fields explicitely during restore
For both virtio-mmio and virtio-pci transport layers, we were setting
every field from the saved snapshot during a restore. This is a problem
when we don't want to override specific fields such as iommu_mapping_cb
because the saved snapshot doesn't contain the appropriate information.

That's why this commit sets only the appropriate field from the saved
snapshot during a restore.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-05-19 09:03:41 +01:00
Sebastien Boeuf
02cbea546d vm-virtio: Implement Snapshottable trait for Iommu
Provide implementation for both snapshot() and restore() methods from
the Snapshottable trait, so that we can snapshot and restore a VM with
devices attached to a virtual IOMMU.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-05-19 09:03:41 +01:00
Bo Chen
35782bd9f8 vm-virtio: Close file descriptors created by epoll::create()
This patch fixes file descriptor leak related to epoll::create() from
various virtio devices.

Fixes: #1124

Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-05-19 09:22:09 +02:00
Rob Bradford
039accc139 vhost_user_net, vm-virtio: Interrupt guest when TX queue is updated
According to the virtio spec the guest should always be interrupted when
"used" descriptors are returned from the device to the driver. However
this was not the case for the TX queue in either the virtio-net
implementation or the vhost-user-net implementation.

This would have meant that the guest could end up with a reduced TX
throughput as it would not know that the packets had been dispatched via
the VMM.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-05-18 17:02:53 +02:00
Rob Bradford
4366dd92ac vm-virtio: block: Add support for VIRTIO_RING_F_EVENT_IDX
Permit the guest to suppress interrupts from the host as an
optimisation.

Fixes: #786

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-05-15 19:03:41 +02:00
Rob Bradford
1b8b5ac179 vhost-user_net, vm-virtio, vmm: Permit host MAC address setting
Add a new "host_mac" parameter to "--net" and "--net-backend" and use
this to set the MAC address on the tap interface. If no address is given
one is randomly assigned and is stored in the config.

Support for vhost-user-net self spawning was also included.

Fixes: #1177

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-05-15 11:45:09 +01:00
dependabot-preview[bot]
2991fd2a48 build(deps): bump libc from 0.2.69 to 0.2.70
Bumps [libc](https://github.com/rust-lang/libc) from 0.2.69 to 0.2.70.
- [Release notes](https://github.com/rust-lang/libc/releases)
- [Commits](https://github.com/rust-lang/libc/compare/0.2.69...0.2.70)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-05-12 20:26:43 +02:00
Sebastien Boeuf
02bd50f6ab vm-virtio: Add helper to set the configuration BAR value
From a VirtioPciDevice perspective, there are two types of BARs, either
the virtio configuration BAR or the SHaredMemory BAR.

The SHaredMemory BAR address comes from the virtio device directly as
the memory region had been previously allocated when the virtio device
has been created. So for this BAR, there's nothing to do when restoring
a VM, since the associated virtio device is already restored with the
appropriate resources, hence the BAR will already be at the right
address.

The remaining configuration BAR is different, as we usually get its
address from the SystemAllocator. This means in case we restore a VM,
we must provide this value, bypassing the allocator. This is what this
commit takes care of, by letting the caller set the base address for the
configuration BAR prior to allocating the BARs.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-05-12 17:37:31 +01:00
Sebastien Boeuf
308b790cfc vm-virtio: Implement Snapshottable trait for VirtioPciDevice
This gives Cloud-Hypervisor the possibility to snapshot and restore a
VM running with virtio-pci devices attached to it.

The VirtioPciDevice snapshot contains a vector of sub-snapshots to store
and restore information related to MsixConfig, VirtioPciCommonConfig and
PciConfiguration structures, along with snapshot data related to
VirtioPciDevice itself.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-05-11 11:38:16 +01:00
Sebastien Boeuf
6d59428641 vm-virtio: Implement Snapshottable trait for VirtioPciCommonConfig
This structure contains all the virtio generic information, and as part
of restoring a VM with virtio-pci devices, it is important to restore
these values to ensure the device's proper functioning.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-05-11 11:38:16 +01:00
Sebastien Boeuf
475040b29e vm-virtio: Correctly reset the virtqueues
Upon a virtio reset, the driver expects that available and used indexes
will be reset to 0. That's why we need to reset these values from the
VMM for any virtio device that might get reset.

This issue was not detected before because the Vec<Queue> maintained
through VirtioPciDevice or MmioDevice was never updated from the virtio
device thread after the device had been actived. For this reason, upon
reset, both available and used indexes were already at the value 0.

The issue arose when trying to reset a device after the VM was restored.
That's because during the restore, each queue is assigned with the right
available and used indexes before it is passed to the device through the
activate function. And that's why upon reset, each queue was still
assigned with these indexes while it should have been reset to 0.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-05-08 11:31:50 +01:00
Sebastien Boeuf
d809f2fe09 vm-virtio: Add virtio reset() support to MmioDevice
All our virtio devices support to be reset, but the virtio-mmio
transport layer was not implemented for it. This patch fixes this
lack of support.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-05-08 11:31:16 +01:00
Rob Bradford
fec97e0586 vm-virtio, vmm: Delete unix socket on shutdown
It's not possible to call UnixListener::Bind() on an existing file so
unlink the created socket when shutting down the Vsock device.

This will allow the VM to be rebooted with a vsock device.

Fixes: #1083

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-05-05 13:01:38 +02:00
Hui Zhu
327d67fadf virtio-mem: Return reize error in MemEpollHandler.run
Return resize error in MemEpollHandler.run.

Fixes: #1081

Signed-off-by: Hui Zhu <teawater@antfin.com>
2020-05-03 10:21:49 +01:00
Sebastien Boeuf
06487131f9 vm-virtio: pci: Expect an identifier upon device creation
This identifier is chosen from the DeviceManager so that it will manage
all identifiers across the VM, which will ensure uniqueness.

It is based off the name from the virtio device attached to this
transport layer.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-29 19:34:31 +01:00
Sebastien Boeuf
eeb7e10d1f vm-virtio: mmio: Expect an identifier upon device creation
This identifier is chosen from the DeviceManager so that it will manage
all identifiers across the VM, which will ensure uniqueness.

It is based off the name from the virtio device attached to this
transport layer.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-29 19:34:31 +01:00
Sebastien Boeuf
556871570e vm-virtio: iommu: Expect an identifier upon device creation
This identifier is chosen from the DeviceManager so that it will manage
all identifiers across the VM, which will ensure uniqueness.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-29 19:34:31 +01:00
Sebastien Boeuf
052eff1ca7 vm-virtio: console: Expect an identifier upon device creation
This identifier is chosen from the DeviceManager so that it will manage
all identifiers across the VM, which will ensure uniqueness.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-29 19:34:31 +01:00
Sebastien Boeuf
354c2a4b3d vm-virtio: vhost-user-net: Expect an identifier upon device creation
This identifier is chosen from the DeviceManager so that it will manage
all identifiers across the VM, which will ensure uniqueness.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-29 19:34:31 +01:00
Sebastien Boeuf
46e0b3ff75 vm-virtio: vhost-user-blk: Expect an identifier upon device creation
This identifier is chosen from the DeviceManager so that it will manage
all identifiers across the VM, which will ensure uniqueness.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-29 19:34:31 +01:00
Sebastien Boeuf
bb7fa71fcb vm-virtio: vhost-user-fs: Expect an identifier upon device creation
This identifier is chosen from the DeviceManager so that it will manage
all identifiers across the VM, which will ensure uniqueness.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-29 19:34:31 +01:00
Sebastien Boeuf
ec5ff395cf vm-virtio: vsock: Expect an identifier upon device creation
This identifier is chosen from the DeviceManager so that it will manage
all identifiers across the VM, which will ensure uniqueness.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-29 19:34:31 +01:00
Sebastien Boeuf
9b53044aae vm-virtio: mem: Expect an identifier upon device creation
This identifier is chosen from the DeviceManager so that it will manage
all identifiers across the VM, which will ensure uniqueness.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-29 19:34:31 +01:00
Sebastien Boeuf
1592a9292f vm-virtio: pmem: Expect an identifier upon device creation
This identifier is chosen from the DeviceManager so that it will manage
all identifiers across the VM, which will ensure uniqueness.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-29 19:34:31 +01:00
Sebastien Boeuf
2e91b73881 vm-virtio: rng: Expect an identifier upon device creation
This identifier is chosen from the DeviceManager so that it will manage
all identifiers across the VM, which will ensure uniqueness.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-29 19:34:31 +01:00
Sebastien Boeuf
9eb7413fab vm-virtio: net: Expect an identifier upon device creation
This identifier is chosen from the DeviceManager so that it will manage
all identifiers across the VM, which will ensure uniqueness.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-29 19:34:31 +01:00
Sebastien Boeuf
be946caf4b vm-virtio: blk: Expect an identifier upon device creation
This identifier is chosen from the DeviceManager so that it will manage
all identifiers across the VM, which will ensure uniqueness.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-29 19:34:31 +01:00
Sebastien Boeuf
a5de49558e vmm: Only allow removal of specific types of virtio device
Now that all virtio devices are assigned with identifiers, they could
all be removed from the VM. This is not something that we want to allow
because it does not make sense for some devices. That's why based on the
device type, we remove the device or we return an error to the user.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-29 13:33:19 +01:00
Jose Carlos Venegas Munoz
3eaeba4b55 vm-virtio: Fix FS_IO callback for virtio-fs
FS_IO is part of the actions a vhost-user-fs daemon can ask the VMM to
perform on its behalf. It is meant to read/write the content from a file
descriptor directly into a guest memory region. This region can either
be a RAM region or the dedicated cache region for virtio-fs.

The way FS_IO was implemented, it was only expecting the guest physical
address provided through the "cache_offset" field to refer to the cache
region. Unfortunately, this was only implementing FS_IO partially.

This patch extends the existing FS_IO implementation by checking the GPA
against the cache region as a first step, but if it is not part of the
cache region address range, then we fallback onto searching for a RAM
region that could match. If there is a matching RAM region, we retrieve
the corresponding host address to let the VMM read/write from/to it.

Fixes: #1054

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
2020-04-23 15:01:28 +01:00
Yi Sun
4fc75cf2b0 vm-virtio: Implement Snapshottable trait for Console
This patch implements the Snapshottable trait for virtio-console, which
enables migration support for it. A VM with a virtio-console device
attached can be snapshot and then restored without issues.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
2020-04-22 14:45:16 +02:00
Yi Sun
d41ce909a2 vm-virtio: Implement Snapshottable trait for Pmem
This brings the migration support to virtio-pmem device.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
2020-04-22 14:45:02 +02:00
Sebastien Boeuf
49322c5ebe vm-virtio: Implement the Snapshottable trait for Net
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
2020-04-21 21:25:03 +02:00
Sebastien Boeuf
24c2b67aa4 vm-virtio: Improve virtio-net rx queue processing
The frame buffer must be updated depending on the amount read from it,
which depends on the number and depth of descriptors available at the
time of the processing.

This patch handles this buffer update, and allow for large buffers to be
correctly processed in multiple rounds.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-21 21:25:03 +02:00
Sebastien Boeuf
03dd24978e vm-virtio: Restore queues based on used index
On the restore path, using the available and used indexes read from
memory to fill the Queue structure was a mistake. Indeed, the available
index is written from the guest and it reflects the last available index
in the descriptor table. But the driver might have queued a lot of
buffers which have not yet been used by the device. This leads to a
situation where the next_avail from Queue is completely different from
the one we can read from memory.

Instead, the right way to determine the next_avail index that should be
used by the device is by relying on the used index from the memory. This
index represents the correct information we're looking for as it has
been updated before the snapshot to let the guest know the next index to
process.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-21 21:25:03 +02:00
Sebastien Boeuf
cf707da1a8 vm-virtio: Extend Queue helpers
First, this modifies the existing helpers on how to get indexes for
available and used rings from memory. Instead of updating the queue
through each helper, they are now used as simple getters.

Based on these new getters, we could create a new helper to determine if
the queue has some available descriptors already queued from the driver
side. This helper is going to be particularly helpful when trying to
determine from a virtio thread if a queue is already loaded with some
available buffers that can be used to send information to the guest.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-21 21:25:03 +02:00
Sebastien Boeuf
b2de1cd523 vm-virtio: Implement shutdown() for virtio-fs
Since the virtio-fs device is backed by a vhost-user process, it is
important to implement the proper shutdown() function from the
VirtioDevice trait, as vhost-user-blk and vhost-user-net do.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-21 10:02:21 +01:00
Sebastien Boeuf
fbcf3a7a7a vm-virtio: Implement userspace_mappings() for virtio-pmem
When hot-unplugging the virtio-pmem from the VM, we don't remove the
associated userspace mapping. This patch will let us fix this in a
following patch. For now, it simply adapts the code so that the Pmem
device knows about the mapping associated with it. By knowing about it,
it can expose it to the caller through the new userspace_mappings()
function.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-21 10:02:21 +01:00
Sebastien Boeuf
b0353992d6 vm-virtio: Implement userspace_mappings() for virtio-fs
This will help when we will implement the hot-unplug of the virtio-fs
device, as we will have to remove correctly the userspace mappings
associated with the device.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-21 10:02:21 +01:00
Sebastien Boeuf
3fb0a02fa2 vm-virtio: Get userspace mappings from VirtioDevice
Introduce new getter function to the VirtioDevice trait, as it will
allow the caller to retrieve the list of userspace mappings associated
with the device.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-21 10:02:21 +01:00
Sebastien Boeuf
d35e775ed9 vmm: Update KVM userspace mapping when PCI BAR remapping
In the context of the shared memory region used by virtio-fs in order to
support DAX feature, the shared region is exposed as a dedicated PCI
BAR, and it is backed by a KVM userspace mapping.

Upon BAR remapping, the BAR is moved to a different location in the
guest address space, and the KVM mapping must be updated accordingly.

Additionally, we need the VirtioDevice to report the updated guest
address through the shared memory region returned by get_shm_regions().
That's why a new setter is added to the VirtioDevice trait, so that
after the mapping has been updated for KVM, we can tell the VirtioDevice
the new guest address the shared region is located at.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-20 16:01:25 +02:00
Sebastien Boeuf
49cc73a4ca vm-virtio: pci: Make sure to return the correct list of BARs
By adding the shared memory regions to the list of BARs, we make sure
the DeviceManager will register it as a BAR on the PCI bus. Without
this, when PCI BAR reprogramming happens, the PCI bus errors since it
does not know about any BAR at the specified address.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-20 16:01:25 +02:00
Yi Sun
187b1eec8b vm-virtio: Implement the Snapshottable trait for Block
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
2020-04-17 19:29:41 +02:00
Samuel Ortiz
a484aa7be6 vm-virtio: Implement the Snapshottable trait for Rng
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
2020-04-17 19:29:41 +02:00
Sebastien Boeuf
b6fdbf7a44 vm-virtio: Implement Snapshottable trait for MmioDevice
Any virtio device relying on the mmio transport layer can be snapshotted
and restored thanks to this new patch. From the MmioDevice perspective,
it is mainly a matter of saving the information about the virtqueues as
the restore path will need them to activate the device (if needed
because it has been activated before being snapshotted).

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-17 19:29:41 +02:00
Sebastien Boeuf
12fec55064 vm-virtio: Add helpers to update queue indexes
In anticipation for adding snapshot/restore support to virtio devices,
this commit introduces two new helpers updating the available and used
indexes of a queue, relying on the guest memory.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-17 19:29:41 +02:00
Samuel Ortiz
fd45e94510 vm-virtio: Add the ability to serialize a Queue
This commit relies on serde to serialize and deserialize the content of
a Queue structure. This will be useful information to store when
implementing snapshot/restore feature for virtio devices.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
2020-04-17 19:29:41 +02:00
Rob Bradford
9bd5ec8967 pci, vfio, vm-virtio: Specify a PCI revision ID of 1 for virtio-pci
Add support for specifying the PCI revision in the PCI configuration and
populate this with the value of 1 for virtio-pci devices.

The virtio-pci specification is slightly ambiguous only saying that
transitional (i.e. devices that support legacy and virtio 1.0) should
set this to 0. In practice it seems that software expects the revision
to be set to 1 for modern only devices.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-17 13:46:48 +02:00
Rob Bradford
2fa652aa4c vm-virtio: pci: Add virtio_device() accessor
Add an accessor to return the underlying VirtioDevice. This is useful
for managing the removal of the device from internal datastructures when
handling virtio-pci device unplug.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-16 17:03:25 +02:00
Rob Bradford
8ff3633782 vm-virtio: pci: Update the BARs used by the VirtioPciDevice
In order to support freeing the memory that is allocated we need to make
sure that we update the internal representation so that free_bars() can
correctly free the memory if the device has its BARs moved.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-15 12:02:19 +02:00
Rob Bradford
a216c2ebd3 vm-virtio: pci: Implement free_bars() for VirtioPciDevice
Implement the free_bars() method from the PciDevice trait which is used
as part of the device removal process. Although there is only one BAR
allocated by VirtioPciDevice simplify the code by using a vector.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-15 12:02:19 +02:00
Rob Bradford
70ecd6bab4 vmm, virtio: fs: Move freeing of mappped region into device
Move the release of the managed memory region from the DeviceManager to
the vhost-user-fs device. This ensures that the memory will be freed when
the device is unplugged which will lead to it being Drop()ed.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-14 17:46:11 +01:00
Rob Bradford
0c6706a510 vmm, virtio: pmem: Move freeing of mappped region into device
Move the release of the managed memory region from the DeviceManager to
the virtio-pmem device. This ensures that the memory will be freed when
the device is unplugged which will lead to it being Drop()ed.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-14 17:46:11 +01:00
dependabot-preview[bot]
886c0f9093 build(deps): bump libc from 0.2.68 to 0.2.69
Bumps [libc](https://github.com/rust-lang/libc) from 0.2.68 to 0.2.69.
- [Release notes](https://github.com/rust-lang/libc/releases)
- [Commits](https://github.com/rust-lang/libc/compare/0.2.68...0.2.69)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-04-14 09:27:04 +01:00
Yang Zhong
183529d024 vmm: Cleanup warning from build
Remove unnecessary parentheses from code and this will cleanup
the warning from cargo build.

Signed-off-by: Yang Zhong <yang.zhong@intel.com>
2020-04-07 09:45:31 +02:00
Samuel Ortiz
1b1a2175ca vm-migration: Define the Snapshottable and Transportable traits
A Snapshottable component can snapshot itself and
provide a MigrationSnapshot payload as a result.

A MigrationSnapshot payload is a map of component IDs to a list of
migration sections (MigrationSection). As component can be made of
several Migratable sub-components (e.g. the DeviceManager and its
device objects), a migration snapshot can be made of multiple snapshot
itself.
A snapshot is a list of migration sections, each section being a
component state snapshot. Having multiple sections allows for easier and
backward compatible migration payload extensions.

Once created, a migratable component snapshot may be transported and this
is what the Transportable trait defines, through 2 methods: send and recv.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
2020-04-02 13:24:25 +01:00
Eryu Guan
33be24bd5a vhost-user-fs: return EINVAL if req is out of range in fs_slave_mmap/unmap/sync
Return libc::EINVAL instead of custom "Wrong offset" error, as mmap(2)
returns EINVAL when offset/len is invalid.

Signed-off-by: Eryu Guan <eguan@linux.alibaba.com>
2020-03-27 11:27:56 +01:00
Eryu Guan
78b5cbc63a vhost-user-fs: validate fs_slave_map/unmap/sync request
In fs_slave_map/unmap/sync, we only made sure offset < cache_size, but
didn't validate (offset + len). We should ensure [offset, offset+len]
is within cache range as well.

Signed-off-by: Eryu Guan <eguan@linux.alibaba.com>
2020-03-27 11:27:56 +01:00
Hui Zhu
51d102c708 vm-virtio: Add virtio-mem device
The basic idea of virtio-mem is to provide a flexible, cross-architecture
memory hot plug and hot unplug solution that avoids many limitations
imposed by existing technologies, architectures, and interfaces. More
details can be found in https://lkml.org/lkml/2019/12/12/681.

This commit add virtio-mem device.

Signed-off-by: Hui Zhu <teawater@antfin.com>
2020-03-25 15:54:16 +01:00
Eryu Guan
61e34331c2 virtio-fs: validate request len in fs_slave_io()
We made sure gpa is in cache range, but not the end addr of request,
which is (gpa + len). If the end addr of request is beyond dax cache
window, vmm would corrupt guest memory or crash.

Fix it by making sure end addr of request is within cache range as well.
And while we're at it, return EFAULT if the request is out of range, as
write(2)/read(2) returns EFAULT when buffer is outside accessible
address space.

Signed-off-by: Eryu Guan <eguan@linux.alibaba.com>
2020-03-25 13:12:26 +01:00
Sebastien Boeuf
d75e7456fc vm-virtio: vhost-user: Send memory update to the backend
In order to keep vhost-user backend to work across guest memory resizing
happening when memory is hot-plugged or hot-unplugged, both blk, net and
fs devices are implementing the notifier to let the backend know.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-24 19:01:15 +00:00
Sebastien Boeuf
7ff82af4b2 vm-virtio: vhost-user: Factorize SET_MEM_TABLE setup
By factorizing the setup of the memory table for vhost-user, we
anticipate the fact that vhost-user devices are going to reuse this
function when the guest memory will be updated.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-24 19:01:15 +00:00
Sebastien Boeuf
bc874a9b6f vm-virtio: Add update_memory() to VirtioDevice trait
The virtio devices backed by a vhost-user backend must send an update to
the associated backend with the new file descriptors corresponding to
the memory regions.

This patch allows such devices to be notified when such update needs to
happen.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-24 19:01:15 +00:00
Eryu Guan
18fbd303ab vhost-user-fs: return correct result of fs_slave_io()
Virtio-fs daemon expects fs_slave_io() returns the number of bytes
read/written on success, but we always return 0 and make userspace think
nothing has been read/written.

Fix it by returning the actual bytes read/written. Note that This
depends on the corresponding fix in vhost crate.

Fixes: #949
Signed-off-by: Eryu Guan <eguan@linux.alibaba.com>
2020-03-24 14:55:56 +01:00
Rob Bradford
8acc15a63c build: Bump vm-memory and linux-loader dependencies
linux-loader depends on vm-memory so must be updated at the same time.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-23 14:27:41 +00:00
Sergio Lopez
6329219749 vm-virtio: queue: Use a SeqCst fence on get_used_event
On x86_64, a hint to the compiler is not enough, we need to issue a
MFENCE instruction. Replace the Acquire fence with a SeqCst one.

Without this, it's still possible to miss an used_event update,
leading to the omission of a notification, possibly stalling the
vring.

Signed-off-by: Sergio Lopez <slp@redhat.com>
2020-03-18 13:36:17 +00:00
dependabot-preview[bot]
51f51ea17d build(deps): bump libc from 0.2.67 to 0.2.68
Bumps [libc](https://github.com/rust-lang/libc) from 0.2.67 to 0.2.68.
- [Release notes](https://github.com/rust-lang/libc/releases)
- [Commits](https://github.com/rust-lang/libc/compare/0.2.67...0.2.68)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-03-17 21:36:38 +00:00
Liu Bo
5c1207c198 vhost-user-fs: handle FS_IO request
Virtiofs's dax window can be used as read/write's source (e.g. mmap a file
on virtiofs), but the dax window area is not shared with vhost-user
backend, i.e. virtiofs daemon.

To make those IO work, addresses of this kind of IO source are routed to
VMM via FS_IO requests to perform a read/write from an fd directly to the
given GPA.

This adds the support of FS_IO request to clh's vhost-user-fs master part.

Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
2020-03-17 08:23:38 +01:00