On x86_64, a hint to the compiler is not enough; we need to issue an
MFENCE instruction. Replace the Acquire fence with a SeqCst one.
Without this, it's still possible to miss a used_event update,
leading to the omission of a notification and possibly stalling the
vring.
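A minimal sketch of the check this fence protects. The Ring type is
hypothetical and models the guest-written used_event field as an
AtomicU16; the real code reads it from guest memory.

```rust
use std::num::Wrapping;
use std::sync::atomic::{fence, AtomicU16, Ordering};

/// Hypothetical stand-in for the guest-visible `used_event` field.
pub struct Ring {
    used_event: AtomicU16,
}

impl Ring {
    pub fn new(used_event: u16) -> Self {
        Ring {
            used_event: AtomicU16::new(used_event),
        }
    }

    /// Returns true if moving the used index from `old_idx` to `new_idx`
    /// crossed the `used_event` value published by the guest.
    pub fn needs_notification(&self, old_idx: u16, new_idx: u16) -> bool {
        // Full barrier instead of an Acquire fence, so that the earlier
        // store to the used ring is ordered before the load below.
        fence(Ordering::SeqCst);
        let used_event = Wrapping(self.used_event.load(Ordering::Relaxed));
        let (old, new) = (Wrapping(old_idx), Wrapping(new_idx));
        // Usual virtio event-index comparison, in wrapping u16 arithmetic.
        new - used_event - Wrapping(1) < new - old
    }
}
```

With used_event at 0, moving the used index from 0 to 1 requests a
notification; with used_event at 5 it does not.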
Signed-off-by: Sergio Lopez <slp@redhat.com>
I spent a few minutes trying to understand why we were unconditionally
updating the VM config memory size, even if the guest memory resizing
did not happen.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
We set it to 0xff, which is for unregistered loaders.
The kernel checks that the bootloader ID is set when e.g. loading
ramdisks, so not setting it when we get a bootparams header from the
loader will prevent the kernel from loading ramdisks.
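A minimal sketch of the idea. SetupHeader is a hypothetical, heavily
trimmed view of the header, and kernel_would_load_ramdisk() is only a
simplified model of the kernel-side check, not the kernel's code.

```rust
/// Loader ID used for loaders without a registered ID.
pub const UNREGISTERED_LOADER_ID: u8 = 0xff;

/// Hypothetical, trimmed view of the boot parameters header.
#[derive(Default)]
pub struct SetupHeader {
    pub type_of_loader: u8,
}

/// Marks the header as coming from an unregistered bootloader.
pub fn set_loader_id(hdr: &mut SetupHeader) {
    hdr.type_of_loader = UNREGISTERED_LOADER_ID;
}

/// Simplified model (an assumption): a ramdisk is only considered when
/// the bootloader identified itself with a non-zero ID.
pub fn kernel_would_load_ramdisk(hdr: &SetupHeader) -> bool {
    hdr.type_of_loader != 0
}
```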
Fixes: #918
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
The IORT table for virtio-iommu use was removed and replaced with a
purely virtio-based solution. Although the table construction was
removed, these structures were left behind.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The virtiofs DAX window can be used as the source of a read/write (e.g.
after mmap'ing a file on virtiofs), but the DAX window area is not shared
with the vhost-user backend, i.e. the virtiofs daemon.
To make such I/O work, requests whose source lies in the DAX window are
routed to the VMM via FS_IO requests, which perform a read/write from an
fd directly to the given GPA.
This adds support for the FS_IO request to the master side of clh's
vhost-user-fs.
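A rough sketch of the read direction, under simplifying assumptions:
guest memory is modelled as a flat byte slice starting at GPA 0, and
fs_io_read() and its parameters are hypothetical names, not the
vhost-user API.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Copies `len` bytes from `src`, starting at `fd_offset`, into guest
/// memory at `gpa`. `guest_mem` is a hypothetical flat view of guest RAM.
pub fn fs_io_read<F: Read + Seek>(
    src: &mut F,
    fd_offset: u64,
    guest_mem: &mut [u8],
    gpa: usize,
    len: usize,
) -> io::Result<()> {
    // Reject ranges that overflow or fall outside guest memory.
    let end = gpa
        .checked_add(len)
        .filter(|&end| end <= guest_mem.len())
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "GPA range out of bounds"))?;
    src.seek(SeekFrom::Start(fd_offset))?;
    // Read straight into the guest range, with no intermediate buffer.
    src.read_exact(&mut guest_mem[gpa..end])
}
```

The write direction would mirror this with Write instead of Read.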
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Add an integration test that builds cloud-hypervisor with
the pvh_boot feature and boots a kernel built with CONFIG_PVH.
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Validate correct GDT entries, initial segment configuration, and control
register bits that are required by PVH boot protocol.
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Expand the unit tests to cover the configure_system() code when
using the PVH boot protocol. Verify the method for adding memory
map table entries in the format specified by PVH boot protocol.
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Use a new feature called "pvh_boot" to enable using the PVH boot
protocol if the guest kernel supports it. The feature can be enabled
by building with:
    cargo build [--release] --features "pvh_boot"
Once performance has been evaluated, this can be made part of the
default set of features so that any guest that supports it boots
using PVH as the preferred option, as is the case in QEMU.
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Fill the hvm_start_info and related memory map structures as
specified in the PVH boot protocol. Write the data structures
to guest memory at the GPA that will be stored in %rbx when
the guest starts.
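A minimal sketch of the start info structure and of writing it to guest
memory. The field layout follows the Xen hvm_start_info definition as I
understand it; HvmStartInfo::new(), to_bytes() and write_to_guest() are
hypothetical helpers, and guest memory is modelled as a flat byte slice.

```rust
/// Magic value identifying an hvm_start_info structure.
pub const XEN_HVM_START_MAGIC_VALUE: u32 = 0x336e_c578;

#[derive(Debug, Default, Clone, Copy)]
pub struct HvmStartInfo {
    pub magic: u32,
    pub version: u32,
    pub flags: u32,
    pub nr_modules: u32,
    pub modlist_paddr: u64,
    pub cmdline_paddr: u64,
    pub rsdp_paddr: u64,
    pub memmap_paddr: u64,
    pub memmap_entries: u32,
    pub reserved: u32,
}

impl HvmStartInfo {
    pub fn new(cmdline_paddr: u64, memmap_paddr: u64, memmap_entries: u32) -> Self {
        HvmStartInfo {
            magic: XEN_HVM_START_MAGIC_VALUE,
            version: 1, // version 1 adds the memory map fields
            cmdline_paddr,
            memmap_paddr,
            memmap_entries,
            ..Default::default()
        }
    }

    /// Serializes the fields in declaration order, little-endian.
    pub fn to_bytes(&self) -> Vec<u8> {
        let mut out = Vec::with_capacity(56);
        for v in &[self.magic, self.version, self.flags, self.nr_modules] {
            out.extend_from_slice(&v.to_le_bytes());
        }
        for v in &[self.modlist_paddr, self.cmdline_paddr, self.rsdp_paddr, self.memmap_paddr] {
            out.extend_from_slice(&v.to_le_bytes());
        }
        for v in &[self.memmap_entries, self.reserved] {
            out.extend_from_slice(&v.to_le_bytes());
        }
        out
    }
}

/// Copies the structure to `gpa`; returns false if it does not fit.
pub fn write_to_guest(mem: &mut [u8], gpa: usize, info: &HvmStartInfo) -> bool {
    let bytes = info.to_bytes();
    match gpa.checked_add(bytes.len()).and_then(|end| mem.get_mut(gpa..end)) {
        Some(dst) => {
            dst.copy_from_slice(&bytes);
            true
        }
        None => false,
    }
}
```

The GPA passed to write_to_guest() is the value that would later be
placed in %rbx.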
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
In order to properly initialize the kvm regs/sregs structs for
the guest, the load_kernel() return type must specify which
boot protocol to use with the entry point address it returns.
Make load_kernel() return an EntryPoint struct containing the
required information. This structure will later be used
in the vCPU configuration methods to set up the appropriate
initial conditions for the guest.
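A minimal sketch of how such a return type could drive register setup.
The field and variant names, the Regs struct and setup_regs() are
assumptions for illustration, not the actual definitions.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum BootProtocol {
    LinuxBoot,
    PvhBoot,
}

/// What load_kernel() would hand back: where to jump, and how.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct EntryPoint {
    pub entry_addr: u64,
    pub protocol: BootProtocol,
}

/// Hypothetical subset of the guest's general-purpose registers.
#[derive(Debug, Default, PartialEq)]
pub struct Regs {
    pub rip: u64,
    pub rbx: u64,
    pub rsi: u64,
}

/// Picks the initial register state from the boot protocol.
/// `boot_data_gpa` is the GPA of the protocol's boot data structure.
pub fn setup_regs(entry: EntryPoint, boot_data_gpa: u64) -> Regs {
    match entry.protocol {
        // PVH: %rbx carries the GPA of the start info structure.
        BootProtocol::PvhBoot => Regs {
            rip: entry.entry_addr,
            rbx: boot_data_gpa,
            ..Default::default()
        },
        // Linux 64-bit boot protocol: %rsi points at the boot parameters.
        BootProtocol::LinuxBoot => Regs {
            rip: entry.entry_addr,
            rsi: boot_data_gpa,
            ..Default::default()
        },
    }
}
```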
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Create supporting definitions to use the hvm start info and memory
map table entry struct definitions from the linux-loader crate in
order to enable PVH boot protocol support.
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
@dependabot bumped the dependency to 0.4.10, but this is no longer a
valid version, so downgrade appropriately.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Extended attributes (xattr) support has a huge impact on write
performance. The reason for this is that, if enabled, FUSE sends a
setxattr request after each write operation, and due to the inode
locking inside the kernel during said request, the ability to execute
the operations in parallel becomes heavily limited.
Signed-off-by: Sergio Lopez <slp@redhat.com>
This change enables vhost_user_fs to process multiple requests in
parallel by scheduling them into a ThreadPool (from the Futures
crate).
Parallelism on a single file is limited by the nature of the operation
executed on it. A recent commit replaced the Mutex that protects the
File within HandleData with a RwLock, to allow some operations (at
this moment, only "read" and "write") to proceed in parallel by
acquiring a read lock.
A more complex approach was also implemented [1], involving
instrumentation through vhost_user_backend to be able to serialize
completions, reducing the pressure on the vring RwLock. This strategy
improved the performance on some corner cases, while making it worse
on other, more common ones. This fact, in addition to it requiring
wider changes through the source code, prompted me to drop it in favor
of this one.
[1] https://github.com/slp/cloud-hypervisor/tree/vuf_async
Signed-off-by: Sergio Lopez <slp@redhat.com>
"DescriptorChain"s are tied to the lifetime of the referenced
GuestMemoryMmap object (for good reasons), but sometimes (i.e., when
processing descriptors from different contexts) we may need to switch
them to point to a different GuestMemoryMmap.
Here we introduce the structure DescriptorHead, which holds the data
needed to rebuild a DescriptorChain, the method "get_head", which
returns the DescriptorHead for a DescriptorChain, and the method
"new_from_head", which creates a new DescriptorChain from a
DescriptorHead and a new reference to a GuestMemoryMmap.
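A minimal sketch of the pattern. GuestMemory is a hypothetical stand-in
for GuestMemoryMmap, and the fields stored in the head are assumptions
chosen for illustration.

```rust
/// Hypothetical stand-in for GuestMemoryMmap.
pub struct GuestMemory {
    pub id: u32,
}

/// Plain data needed to rebuild a chain; it borrows nothing.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct DescriptorHead {
    desc_table: u64,
    table_size: u16,
    index: u16,
}

/// A chain is tied to the lifetime of the memory it references.
pub struct DescriptorChain<'a> {
    mem: &'a GuestMemory,
    desc_table: u64,
    table_size: u16,
    index: u16,
}

impl<'a> DescriptorChain<'a> {
    pub fn new(mem: &'a GuestMemory, desc_table: u64, table_size: u16, index: u16) -> Self {
        DescriptorChain { mem, desc_table, table_size, index }
    }

    /// Extracts the lifetime-free part of the chain.
    pub fn get_head(&self) -> DescriptorHead {
        DescriptorHead {
            desc_table: self.desc_table,
            table_size: self.table_size,
            index: self.index,
        }
    }

    /// Rebuilds a chain against a (possibly different) memory reference.
    pub fn new_from_head(mem: &'a GuestMemory, head: DescriptorHead) -> Self {
        DescriptorChain::new(mem, head.desc_table, head.table_size, head.index)
    }

    pub fn memory(&self) -> &'a GuestMemory {
        self.mem
    }
}
```

Because DescriptorHead holds no reference, it can be moved into another
context and turned back into a chain there.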
Signed-off-by: Sergio Lopez <slp@redhat.com>
For rustfmt to accept modern syntax like "async move", a .rustfmt.toml
with "edition = 2018" needs to be created.
https://github.com/rust-lang/rustfmt/issues/3149
Signed-off-by: Sergio Lopez <slp@redhat.com>
Replace HandleData's File Mutex with a RwLock for finer-grained
locking. This allows operations on the same File that are safe to
run in parallel (at this moment, read and write) to acquire a read
lock and avoid waiting on each other.
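A minimal sketch of the locking scheme. HandleData is made generic here
so the example needs no real file; with_shared() and with_exclusive()
are hypothetical helper names.

```rust
use std::sync::RwLock;

/// Hypothetical stand-in for HandleData; `F` plays the role of File.
pub struct HandleData<F> {
    pub file: RwLock<F>,
}

impl<F> HandleData<F> {
    pub fn new(file: F) -> Self {
        HandleData {
            file: RwLock::new(file),
        }
    }

    /// For operations that are safe to run concurrently: several
    /// callers can hold the shared lock at the same time.
    pub fn with_shared<R>(&self, op: impl FnOnce(&F) -> R) -> R {
        let guard = self.file.read().unwrap();
        op(&*guard)
    }

    /// For operations that need the handle to themselves.
    pub fn with_exclusive<R>(&self, op: impl FnOnce(&mut F) -> R) -> R {
        let mut guard = self.file.write().unwrap();
        op(&mut *guard)
    }
}
```

Taking only the shared lock for read and write is workable because
positional I/O on a file (for example read_at and write_at from
std::os::unix::fs::FileExt) takes &self.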
Signed-off-by: Sergio Lopez <slp@redhat.com>
When using "--disk" with a vhost socket and without self spawning, it
is neither necessary nor helpful to specify the path.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This option was superseded by using "--net" with "vhost_user=true". It
was no longer being parsed but had been left behind.
Fixes: #806
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Using a Vec to hold the list of devices on the PciBus causes a
problem on unplug. The vector of devices shrinks, and if the
unplugged device was not the last one in the list, every device
after it is shifted on the bus.
To solve this problem, a HashMap is used. This makes it possible to
keep track of the exact place where each device stands on the bus.
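A minimal sketch of the map-based bus. The slot type, the String device
placeholder and the method names are assumptions for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical bus: maps a slot number to a device (here just a name).
#[derive(Default)]
pub struct PciBus {
    devices: HashMap<u32, String>,
}

impl PciBus {
    pub fn add_device(&mut self, slot: u32, name: &str) {
        self.devices.insert(slot, name.to_string());
    }

    /// Removing a device leaves every other slot untouched.
    pub fn remove_device(&mut self, slot: u32) -> Option<String> {
        self.devices.remove(&slot)
    }

    pub fn device_at(&self, slot: u32) -> Option<&str> {
        self.devices.get(&slot).map(String::as_str)
    }
}
```

With a Vec, removing the device at index 1 would move the device at
index 2 down to index 1; with the map, slot 2 still resolves to the same
device after slot 1 is unplugged.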
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The desired_ram option is in bytes; make the amount of memory to add
larger.
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Some of the help strings had extra newlines in them or otherwise strange
wrapping. The strings were rewrapped with the nightly version of rustfmt
that supports string formatting.
Fixes: #899
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This prevents the output being wrapped at 120 characters and giving
strange results.
Fixes: #899
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This change, combined with the compiler hint to inline get_used_event,
shortens the window between the memory read and the actual check by
calling get_used_event from needs_notification.
Without it, when putting enough pressure on the vring, it's possible
that a notification is wrongly omitted, causing the queue to stall.
Signed-off-by: Sergio Lopez <slp@redhat.com>
get_used_event is used from vhost_user_backend:needs_notification to
check whether an interrupt must be sent to the guest to notify there
are new items in the queue. Shorten the update window by asking the
compiler to inline this method, so a write won't slip between the
read of the memory contents and the actual check.
Signed-off-by: Sergio Lopez <slp@redhat.com>
Since we only keep one single version of the kernel config file in our
repository, there is no reason to keep the filename complex.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The kernel version is updated from 5.5-rc1 to 5.6-rc4, including the
updated kernel config file.
The kernel branch contains virtio-fs, virtio-iommu and virtio-mem
patches that are not upstream yet. It also contains one fix for
virtio-vsock which will be merged upstream in the next release.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>