SerialBuffer uses VecDeque::extend, which calls realloc, which a
maximum buffer size of 1 MiB. Starting at allocation sizes of
128 KiB, musl's mallocng allocator will use mremap for the allocation.
Since this was not permitted by the seccomp rules, heavy write load
could crash cloud-hypervisor with a seccomp failure. (Encountered
using virtio-console, but I don't see any reason it wouldn't happen
for the legacy serial device too.)
Signed-off-by: Alyssa Ross <hi@alyssa.is>
(cherry picked from commit beed5e5d6d)
Cloud Hypervisor's vhost-user implementation will reconnect if it gets
disconnected from the backend. That means connections happen inside
the vhost-user seccomp sandbox, so all syscalls used in reconnecting
have to be allowed in that sandbox.
clock_nanosleep is used by Glibc, and nanosleep is used by musl.
Signed-off-by: Alyssa Ross <hi@alyssa.is>
The code of the stable branch diverges from the main branch, so we
can't directly backport the corresponding commit to fix the clippy
issues.
See: commit 5e52729453
Signed-off-by: Bo Chen <chen.bo@intel.com>
This change is important to do a proper resource cleanup. We decided
to do this repetitive approach as VirtioCommon can't implement Drop
without major changes to the corresponding code. Also, devices such as
Net can't easily use the epoll_threads-abstraction from VirtioCommon as
it has multiple threads with different semantics.
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
(cherry picked from commit ad6c0ee52b)
TEST=Boot `--disk readonly=on` along with a guest that tries to write
(unmodified hypervisor-fw) and observe that the virtio device thread no
longer panics.
Fixes: #4888
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
It's perfectly reasonable to expect if that some virtio threads trigger
libc behaviour that needs mprotect() that all virtio threads would do
the same.
Fixes: #4874
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
It is required to close all file descriptors pointing to an opened TAP
device prior to reopening the TAP device; otherwise it will return
-EBUSY as the device can only be opened once (excluding MQ use cases.)
When rebooting the VM the virtio-net threads would still be running and
so the TAP file descriptor may not have been closed. To ensure that the
TAP FD is closed wait for all the epoll threads to exit after receiving the
KILL_EVENT.
Fixes: #4868
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
An integer overflow from our virtio-mem device can be triggered
from (misbehaved) guest driver with malicious requests. This patch
handles this integer overflow explicitly and treats it as an invalid
request.
Note: this bug was detected by our virtio-mem fuzzer through 'oss-fuzz'.
Signed-off-by: Bo Chen <chen.bo@intel.com>
To support all virtio-devices, this patch replaces the customized
EpollHelper::run` with customized `EpollHelper::run_with_timeout` for
fuzzing.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Since the processing of the console inputs was moved from the VMM thread
to the virtio-console thread (#3061), we have been using the 'FILE_EVENT'
to handle input from stdin/pty/file, which made 'INPUT_EVENT' obsoleted.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Following the new design proposal to improve the restore codepath when
migrating a VM, all virtio devices are supplied with an optional state
they can use to restore from. The restore() implementation every device
was providing has been removed in order to prevent from going through
the restoration twice.
Here is the list of devices now following the new restore design:
- Block (virtio-block)
- Net (virtio-net)
- Rng (virtio-rng)
- Fs (vhost-user-fs)
- Blk (vhost-user-block)
- Net (vhost-user-net)
- Pmem (virtio-pmem)
- Vsock (virtio-vsock)
- Mem (virtio-mem)
- Balloon (virtio-balloon)
- Watchdog (virtio-watchdog)
- Vdpa (vDPA)
- Console (virtio-console)
- Iommu (virtio-iommu)
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Considering error messages will be mostly nested, ensuring no
punctuation at the end will make the error log more readable.
Signed-off-by: Bo Chen <chen.bo@intel.com>
With known number of queues and queue events, we can make each of them
more explicit and avoid using vector/direct indexing, which is cleaner
and slightly more efficient.
Signed-off-by: Bo Chen <chen.bo@intel.com>
The the number of queues and associated events is known and fixed. We
can define and use each of them explicitly and avoid using vector (and
hence direct indexing), which is cleaner and slightly more efficient.
Signed-off-by: Bo Chen <chen.bo@intel.com>
The the number of queues and associated events is known and fixed. We
can define and use each of them explicitly and avoid using vector (and
hence direct indexing), which is cleaner and slightly more efficient.
Also, this refactoring makes it clearer that we are not handling "event
queue" events (as "_event_queue" is not being used intentionally).
Signed-off-by: Bo Chen <chen.bo@intel.com>
In this way, the virtio-iommu code can properly report an error when
a wrong number of queues is provided, instead of triggering an
out-of-bound error.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Vdpa now implements the Migratable trait, which allows the device to be
added to the DeviceTree and therefore allows live migrating any vDPA
device that supports being suspended.
Given a vDPA device can't be resumed from a suspended state without
having to reset everything, we don't support pause/resume for a vDPA
device, as well as snapshot/restore (which requires resume to be
supported).
In order for the migration to work locally, reusing the same device on
the same host machine, the vhost-vdpa handler is dropped after the
snapshot has been performed, which allows the destination VM to open the
device without any conflict about the device being busy.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In order to anticipate for migration support, we need to be able to
create a Vdpa object without VhostKernVdpa object associated with it.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
When restoring a VM, the BAR type can be found directly from the
snapshot resources. It is more reliable than the previous method which
was using self.use_64bit_bar from VirtioPciDevice because at the time
the BARs are allocated, the VirtioDevice hasn't been restored yet,
meaning the way to determine the value of use_64bit_bar is wrong for a
device like vDPA. At this time, the device type is not known and relying
on the stored resources is the only reliable way.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The RNG device never reads from the guest memory it reads from a file
and writes to the guest memory.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Don't silently ignore the descriptors provided by the guest. This is
consistent with other devices.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
With the virtio-rng device the descriptors that are provided by the
guest must be writable and of non-zero length. Also propagate an error
if writing to the guest memory fails.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Adjust MTU logic such that:
1. Apply an MTU to the TAP interface if the user supplies it
2. Always query the TAP interface for the MTU and expose that.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>