Split interrupt source group restore into two steps, first restore
the irqfd for each interrupt source entry, and second restore the
GSI routing of the entire interrupt source group.
This patch will reduce restore latency of interrupt source group,
and in a 200-concurrent restore test, the patch reduced the
average IOAPIC restore time from 15ms to 1ms.
Signed-off-by: Yong He <alexyonghe@tencent.com>
This commit merges crates `qcow`, `vhdx` and `block_util` into the
crate `block`, which can allow `qcow` to use functions from `block_util`
without introducing a circular crate dependency.
This commit is based on crosvm implementation:
f2eecc4152
Signed-off-by: Yu Li <liyu.yukiteru@bytedance.com>
Add new configuration for offloading features, including
Checksum/TSO/UFO, and set these offloading features as
enabled by default.
Fixes: #4792.
Signed-off-by: Yong He <alexyonghe@tencent.com>
To synthesize the interactions between the virtio-net device and the tap
interface, the fuzzer utilizes a pair of unix domain sockets: one socket
(e.g. the dummy tap frontend) is used to construct the 'net_util::Tap'
instance for creating a virtio-net device; the other socket (e.g. the
dummy tap backend) is used in a epoll loop for handling the tx and rx
requests from the virtio-net device.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Following the new design proposal to improve the restore codepath when
migrating a VM, all virtio devices are supplied with an optional state
they can use to restore from. The restore() implementation every device
was providing has been removed in order to prevent from going through
the restoration twice.
Here is the list of devices now following the new restore design:
- Block (virtio-block)
- Net (virtio-net)
- Rng (virtio-rng)
- Fs (vhost-user-fs)
- Blk (vhost-user-block)
- Net (vhost-user-net)
- Pmem (virtio-pmem)
- Vsock (virtio-vsock)
- Mem (virtio-mem)
- Balloon (virtio-balloon)
- Watchdog (virtio-watchdog)
- Vdpa (vDPA)
- Console (virtio-console)
- Iommu (virtio-iommu)
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The kernel will trigger a SIGBUS upon hugetlb page faults when there is
no huge pages available. We neither have a way to ensure enough huge
pages available on the host system, nor have a way to gracefully report
the lack of huge pages in advance from Cloud Hypervisor. For these
reasons, we have to avoid using huge pages from the virtio-mem fuzzer to
avoid SIGBUS errors.
Signed-off-by: Bo Chen <chen.bo@intel.com>
With the guest memory size of 1MB, a valid descriptor size can be close
to the guest memory size (e.g. 1MB) and can contain close to 256k
valid pfn entries (each entry is 4 bytes). Multiplying the queue
size (e.g. 256), there can be close to 64 millions pfn entries to
process in a single request. This is why the oss-fuzz reported a
timeout (with a limit of 60s).
By reducing the guest memory size and the queue size, the worst-case now
is 8 million pfn entries for fuzzing, which can be finished in around 20
seconds according to my local experiment.
Signed-off-by: Bo Chen <chen.bo@intel.com>
As the virt queues are initialized with random bytes from the fuzzing
engine, a descriptor buffer for the available ring can have a very large
length (e.g. up to 4GB). This means there can be up to 1 billion
entries (e.g. page frame number) for virtio-balloon to process a signal
available descriptor (given each entry is 4 bytes). This is the reason
why oss-fuzz reported a hanging issue for this fuzzer, where the
generated descriptor buffer length is 4,278,321,152.
We can avoid this kind of long execution by reducing the size of guest
memory. For example, with 1MB of guest memory, the number of descriptor
entries for processing is limited ~256K.
Signed-off-by: Bo Chen <chen.bo@intel.com>
This function is for really for the transport layer to trigger a device
reset. Instead name it appropriately for the fuzzing specific use case.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The fuzzer exercises the inflate, deflate and reporting events of
virtio-balloon via creating three queues and kicking three events.
Signed-off-by: Bo Chen <chen.bo@intel.com>
To make the fuzzer faster and more effective, the guest memory is
setup with a much smaller size (comparing with other virtio device
fuzzers) and a hole between the memory for holding virtio queue and
the rest of guest data. It brings two benefits: 1) avoid writing large
chunk of data from 'urandom' into the available descriptor chain (which
makes the fuzzer faster); 2) reduce substantial amount of overwrites to
the virtio queue data by the data from 'urandom (which makes the fuzzer
more deterministic and hence effective).
Signed-off-by: Bo Chen <chen.bo@intel.com>
The fuzzer is focusing on the virtio-pmem code that processes guest
inputs (e.g. virt queues). Given 'flush' is the only virtio-pmem
request, the fuzzer is essentially testing the code for parsing and
error handling.
Signed-off-by: Bo Chen <chen.bo@intel.com>
To make the fuzzers more focused and more efficient, we now provide
default addresses for the descriptor table, available ring, and used
ring, which ensures the virt-queue has a valid memory layout (e.g. no
overlapping between descriptor tables, available ring, and used ring).
Signed-off-by: Bo Chen <chen.bo@intel.com>
Instead of always fuzzing virt-queues with default values (mostly 0s),
the fuzzer now initializes the virt-queue based on the fuzzed input
bytes, such as the tail position of the available ring, queue size
selected by driver, descriptor table address, available ring address,
used ring address, etc. In this way, the fuzzer can explore the
virtio-block code path with various virt-queue setup.
Signed-off-by: Bo Chen <chen.bo@intel.com>
The current fuzzer defines a 'format' for the random input 'bytes' from
libfuzzer, but this 'format' failed to improve the fuzzing
efficiency. Instead, the 'format' parsing process obfuscates the fuzzer and
makes the fuzzing engine much harder to focus on the actual fuzzing
target (e.g. virtio-block queue event handling). It is actually worse than
simply using the random inputs as the virt queue content for fuzzing.
We can later introduce a different 'format' to the input 'bytes' for
better fuzzing, say focusing more on virito-block fuzzing through
ensuring the virt queue content always has a valid 'available'
descriptor chain to process.
Signed-off-by: Bo Chen <chen.bo@intel.com>
This also ensures that the 'queue_evt' is fully processed, as we enforce
the main thread is waiting for the virtio-block thread to process the
'kill_evt' which is after the 'queue_evt' processing.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Currently the main thread returns immediately after sending a 'queue'
event which is rarely received and processed by the virtio-block
thread (unless system is in high workload). In this way, the fuzzer is
mostly doing nothing and is unable to reproduce its behavior
deterministically (from the same inputs). This patch relies on a
'level-triggered' epoll to ensure a 'queue' event is properly processed
before return from the main thread.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Fuzz the HTTP API handling code with a minimum HTTP API
receiver (e.g. mocking the behavior of our "vmm" thread).
Signed-off-by: Bo Chen <chen.bo@intel.com>
The new virtio-queue version introduced some breaking changes which need
to be addressed so that Cloud Hypervisor can still work with this
version.
The most important change is about removing a handle to the guest memory
from the Queue, meaning the caller has to provide the guest memory
handle for multiple methods from the QueueT trait.
One interesting aspect is that QueueT has been widely extended to
provide every getter and setter we need to access and update the Queue
structure without having direct access to its internal fields.
This patch ports all the virtio and vhost-user devices to this new crate
definition. It also updates both vhost-user-block and vhost-user-net
backends based on the updated vhost-user-backend crate. It also updates
the fuzz directory.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Instead of passing separately a list of Queues and the equivalent list
of EventFds, we consolidate these two through a tuple along with the
queue index.
The queue index can be very useful if looking for the actual index
related to the queue, no matter if other queues have been enabled or
not.
It's also convenient to have the EventFd associated with the Queue so
that we don't have to carry two lists with the same amount of items.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Instead of relying on the virtio-queue crate to store the information
about the MSI-X vectors for each queue, we handle this directly from the
PCI transport layer.
This is the first step in getting closer to the upstream version of
virtio-queue so that we can eventually move fully to the upstream
version.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This crate contains up to date definition of the Queue, AvailIter,
DescriptorChain and Descriptor structures forked from the upstream
crate rust-vmm/vm-virtio 27b18af01ee2d9564626e084a758a2b496d2c618.
The following patches have been applied on top of this base in order to
make it work correctly with Cloud Hypervisor requirements:
- Add MSI vector field to the Queue
In order to help with MSI/MSI-X support, it is convenient to store the
value of the interrupt vector inside the Queue directly.
- Handle address translations
For devices with access to data in memory being translated, we add to
the Queue the ability to translate the address stored in the
descriptor.
It is very helpful as it performs the translation right after the
untranslated address is read from memory, avoiding any errors from
happening from the consumer's crate perspective. It also allows the
consumer to reduce greatly the amount of duplicated code for applying
the translation in many different places.
- Add helpers for Queue structure
They are meant to help crate's consumers getting/setting information
about the Queue.
These patches can be found on the 'ch' branch from the Cloud Hypervisor
fork: https://github.com/cloud-hypervisor/vm-virtio.git
This patch takes care of updating the Cloud Hypervisor code in
virtio-devices and vm-virtio to build correctly with the latest version
of virtio-queue.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Relying on the vm-virtio/virtio-queue crate from rust-vmm which has been
copied inside the Cloud Hypervisor tree, the entire codebase is moved to
the new definition of a Queue and other related structures.
The reason for this move is to follow the upstream until we get some
agreement for the patches that we need on top of that to make it
properly work with Cloud Hypervisor.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The fuzzer needs to take a larger input for the whole disk image to
be most useful. Since the file is small we can test by reading and
writing over the whole file.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Signed-off-by: Fazla Mehrab <akm.fazla.mehrab@intel.com>
Instead of running the generic block fuzzer with QCOW, it's better to
use a RAW file since it's less complex and it will focus on virtqueues.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Now that BlockIoUring is the only implementation of virtio-block,
handling both synchronous and asynchronous backends based on the
AsyncIo trait, we can rename it to Block.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Now that both synchronous and asynchronous backends rely on the
asynchronous version of virtio-block (namely BlockIoUring), we can
get rid of the synchronous version (namely Block).
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This patch adds two required dependencies to fuzz/Cargo.toml, and fixes
the building error on the 'block' fuzzer.
Signed-off-by: Bo Chen <chen.bo@intel.com>
Add the basic infrastructure for fuzzing along with a qcow fuzzer ported
from crosvm and adapted to our code.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>