Commit Graph

755 Commits

Author SHA1 Message Date
Sebastien Boeuf
f787c409c4 vmm: cpu: Factorize vcpu starting code
Anticipating the need for a slightly different function for restoring
vCPUs, this patch factorizes most of the vCPU creation, so that it can
be reused for migration purposes.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-07 12:26:10 +02:00
Cathy Zhang
722f9b6628 vmm: cpu: Get and set KVM vCPU state
These two new helpers will be useful to capture a vCPU state and being
able to restore it at a later time.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-04-07 12:26:10 +02:00
Cathy Zhang
13756490b5 vmm: cpu: Track all Vcpus through CpuManager
In anticipation for the CpuManager to aggregate all Vcpu snapshots
together, this change makes sure the CpuManager has a handle onto
every vCPU.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-04-07 12:26:10 +02:00
Samuel Ortiz
a0d5dbce6c vmm: device_manager: Implement the Snapshottable trait
Based on the list of Migratable devices stored by the DeviceManager, the
DeviceManager can implement the Snapshottable trait by aggregating all
devices snapshots together.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-04-07 12:26:10 +02:00
Yi Sun
93d3abfd6e vmm: device_manager: Make serial and ioapic devices migratable
Serial and Ioapic both implement the Migratable trait, hence the
DeviceManager can store them in the list of Migratable devices.

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
2020-04-07 12:26:10 +02:00
Samuel Ortiz
12b036a824 Cargo: Update dependencies for the KVM serialization work
We need the project to rely on kvm-bindings and kvm-ioctls branches
which include the serde derive to be able to serialize and deserialize
some KVM structures.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-04-07 12:26:10 +02:00
Rob Bradford
c7dfbd8a84 vmm: config: Implement fmt::Display for error
Fixes: #367

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-06 10:31:24 +01:00
Rob Bradford
d8119fda13 vmm: config: Remove unused error entries
These entries are not currently used.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-06 10:31:24 +01:00
Rob Bradford
1a10f16ad0 vmm: config: Consolidate size parsing code
The parse_size helper function can now be consolidated into the
ByteSized FromStr implementation.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-06 10:31:24 +01:00
Rob Bradford
f449486b9b vmm: config: Make toggle parsing more tolerant
Support "true" and "false" as well as well as capitalised forms.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-06 10:31:24 +01:00
Rob Bradford
a4e0ce58c7 vmm: config: Consolidate on/off parsing
Now all parsing code makes use of the Toggle and it's FromStr support
move the helper function into the from_str() implementation.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-06 10:31:24 +01:00
Rob Bradford
c731a943d4 vmm: config: Port vsock to OptionParser
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-06 10:31:24 +01:00
Rob Bradford
37264cf21b vmm: config: Add unit testing for vsock
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-06 10:31:24 +01:00
Rob Bradford
8665898ff3 vmm: config: Port device parsing to OptionParser
Also make the "path" option required and generate an error if it is not
provided.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-06 10:31:24 +01:00
Rob Bradford
a85e2fa735 vmm: config: Add unit test for VFIO device parsing
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-06 10:31:24 +01:00
Rob Bradford
bed282b801 vmm: config: Add "valueless" options to OptionParser
Valueless options are those like "off" or "tty" as used by the console
options.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-06 10:31:24 +01:00
Rob Bradford
2ae3392d32 vmm: config: Port console parsing to OptionParser
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-06 10:31:24 +01:00
Rob Bradford
143d63c88e vmm: config: Add unit test for console parsing
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-06 10:31:24 +01:00
Rob Bradford
5ab58e743a vmm: config: Port pmem option to OptionParser
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-06 10:31:24 +01:00
Rob Bradford
233ad78b3a vmm: config: Add parsing test for pmem
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-06 10:31:24 +01:00
Rob Bradford
13dc637350 vmm: config: Port filesystem parsing to OptionParser
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-06 10:31:24 +01:00
Rob Bradford
7a071c28db vmm: config: Implement unit testing for virtio-fs parsing
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-06 10:31:24 +01:00
Rob Bradford
e4cd3072d4 vmm: config: Port RNG options to OptionParser
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-06 10:31:24 +01:00
Rob Bradford
708dbb973a vmm: config: Add RNG parsing unit test
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-06 10:31:24 +01:00
Rob Bradford
057e71d266 vmm: config: Accept empty value strings
The integration tests and documentation make use of empty value strings
like "--net tap=" accept them but return None so that the default value
will be used as expected.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-06 10:31:24 +01:00
Rob Bradford
218c780f67 vmm: config: Port network parsing to OptionParser
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-06 10:31:24 +01:00
Rob Bradford
8754720e2d vmm: config: Add unit test for net parsing
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-06 10:31:24 +01:00
Rob Bradford
224e3ddef4 vmm: config: Switch disk parsing to OptionParser
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-06 10:31:24 +01:00
Rob Bradford
9e10244716 vmm: config: Add unit test for disk parsing
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-06 10:31:24 +01:00
Rob Bradford
e40ae6274b vmm: config: Port memory option parsing to OptionParser
This simplifies the parsing of the option by using OptionParser along
with its automatic conversion behaviour.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-06 10:31:24 +01:00
Rob Bradford
be32065aa4 vmm: config: Add "ByteSized" type for simplifying parsing of byte sizes
Byte sizes are quantities ending in "K", "M", "G" and by implementing
this type with a FromStr implementation the values can be converted
using .parse().

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-06 10:31:24 +01:00
Rob Bradford
f01bd7d56d vmm: config: Implement FromStr for HotplugMethod
This allows the use of .parse() to automatically convert the string to
the enum.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-06 10:31:24 +01:00
Rob Bradford
746138039d vmm: config: Add a Toggle type for "on/off" strings
Some of the config parameters take an "on" or "off". Add a way to
neatly parse that.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-06 10:31:24 +01:00
Rob Bradford
929142bc2e vmm: config: Add memory parsing unit test
Before porting over to OptionParser add a unit test to validate the
current memory parsing code. This showed up a bug where the "size=" was
always required. Temporarily resolve this by assigning the string a
default value which will later be replaced when the code is refactored.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-06 10:31:24 +01:00
Rob Bradford
68203ea414 vmm: config: Port CPU parsing to OptionParser
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-06 10:31:24 +01:00
Rob Bradford
9e6a2825ba vmm: config: Add unit test for CPU parsing
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-06 10:31:24 +01:00
Rob Bradford
9e7231cd69 vmm: config: Introduce basic OptionParser
This will be used to simplify and consolidate much of the parsing code
used for command line parameters.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-06 10:31:24 +01:00
Samuel Ortiz
447af8e702 vmm: vm: Factorize the device and cpu managers creation routine
Into a new_from_memory_manager() routine.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-04-03 18:05:18 +01:00
Samuel Ortiz
c73c9b112c vmm: vm: Open kernel and initramfs once all managers are created
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-04-03 18:05:18 +01:00
Samuel Ortiz
0646a90626 vmm: cpu: Pass CpusConfig to simplify the new() prototype
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-04-03 18:05:18 +01:00
Samuel Ortiz
b584ec3fb3 vmm: memory_manager: Own the system allocator
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-04-03 18:05:18 +01:00
Samuel Ortiz
ef2b11ee6c vmm: memory_manager: Pass MemoryConfig to simplify the new() prototype
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-04-03 18:05:18 +01:00
Samuel Ortiz
622f3f8fb6 vmm: vm: Avoid ioapic variable creation
For a more readable VM creation routine.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-04-03 18:05:18 +01:00
Samuel Ortiz
164e810069 vmm: cpu: Move CPUID patching to CpuManager
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-04-03 18:05:18 +01:00
Samuel Ortiz
1a2c1f9751 vmm: vm: Factorize the KVM setup code
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-04-03 18:05:18 +01:00
Samuel Ortiz
7a50646c02 vmm: device_manager: Convert migratable_devices to a map
We must be able to map a migratable component id to its device.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-04-03 18:05:18 +01:00
Samuel Ortiz
8f300bed83 vmm: api: Add a /api/v1/vm.restore endpoint
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-04-02 13:24:25 +01:00
Samuel Ortiz
92c73c3b78 vmm: Add a VmRestore command
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-04-02 13:24:25 +01:00
Samuel Ortiz
39d4f817f0 vmm: http: Add a /api/v1/vm.snapshot endpoint
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-04-02 13:24:25 +01:00
Samuel Ortiz
cf8f8ce93a vmm: api: Add a Snapshot command
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
2020-04-02 13:24:25 +01:00
Sebastien Boeuf
452475c280 vmm: Add migration helpers
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-02 13:24:25 +01:00
Samuel Ortiz
1b1a2175ca vm-migration: Define the Snapshottable and Transportable traits
A Snapshottable component can snapshot itself and
provide a MigrationSnapshot payload as a result.

A MigrationSnapshot payload is a map of component IDs to a list of
migration sections (MigrationSection). As component can be made of
several Migratable sub-components (e.g. the DeviceManager and its
device objects), a migration snapshot can be made of multiple snapshot
itself.
A snapshot is a list of migration sections, each section being a
component state snapshot. Having multiple sections allows for easier and
backward compatible migration payload extensions.

Once created, a migratable component snapshot may be transported and this
is what the Transportable trait defines, through 2 methods: send and recv.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
2020-04-02 13:24:25 +01:00
Sebastien Boeuf
2d17f4384a vmm: seccomp: Add missing open() syscall
On some systems, the open() system call is used by Cloud-Hypervisor,
that's why it should be part of the seccomp filters whitelist.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-02 09:56:48 +02:00
Sebastien Boeuf
e4ea8b0bef vmm: Add missing syscalls to the seccomp filters
Both clock_gettime and gettimeofday syscalls where missing when running
Cloud-Hypervisor on a Linux host without vDSO enabled. On a system with
vDSO enabled, the syscalls performed by vDSO were not filtered, that's
why we didn't have to whitelist them.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-27 16:50:52 +00:00
Sebastien Boeuf
9e18177654 vmm: Add memory hotplug support to VFIO PCI devices
Extend the update_memory() method from DeviceManager so that VFIO PCI
devices can update their DMA mappings to the physical IOMMU, after a
memory hotplug has been performed.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-27 09:35:39 +01:00
Sebastien Boeuf
cc67131ecc vmm: Retrieve new memory region when memory is extended
Whenever the memory is resized, it's important to retrieve the new
region to pass it down to the device manager, this way it can decide
what to do with it.

Also, there's no need to use a boolean as we can instead use an Option
to carry the information about the region. In case of virtio-mem, there
will be no region since the whole memory has been reserved up front by
the VMM at boot. This means only the ACPI hotplug will return a region
and is the only method that requires the memory to be updated from the
device manager.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-27 09:35:39 +01:00
Samuel Ortiz
8fc7bf2953 vmm: Move to the latest linux-loader
Commit 2adddce2 reorganized the crate for a cleaner multi architecture
(x86_64 and aarch64) support.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-03-27 08:48:20 +01:00
Sebastien Boeuf
785812d976 vmm: Fallback to legacy boot if PVH is enabled along with initramfs
For now, the codebase does not support booting from initramfs with PVH
boot protocol, therefore we need to fallback to the legacy boot.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-26 11:59:03 +01:00
Damjan Georgievski
6cce7b9560 arch: load initramfs and populate zero page
* load the initramfs File into the guest memory, aligned to page size
* finally setup the initramfs address and its size into the boot params
  (in configure_64bit_boot)

Signed-off-by: Damjan Georgievski <gdamjan@gmail.com>
2020-03-26 11:59:03 +01:00
Damjan Georgievski
1f9bc68c54 openapi: Add initramfs support
added InitramfsConfig property to the REST API spec

Signed-off-by: Damjan Georgievski <gdamjan@gmail.com>
2020-03-26 11:59:03 +01:00
Damjan Georgievski
4db252b418 main, vmm: add --initramfs cli option
currently unused, the initramfs argument is added to the cli,
and stored in vmm::config:VmConfig as an Option(InitramfsConfig(PathBuf))

Signed-off-by: Damjan Georgievski <gdamjan@gmail.com>
2020-03-26 11:59:03 +01:00
Rob Bradford
6244beb9d5 openapi: Add "vm.add-net" entry point
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-25 17:58:06 +01:00
Rob Bradford
57c3fa4b1e vmm: Add "add-net" to the API
Add the HTTP and internal API entry points for adding a network device
at runtime.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-25 17:58:06 +01:00
Rob Bradford
f664cddec9 vmm: Add support for adding network devices to the VM
The persistent memory will be hotplugged via DeviceManager and saved in
the config for later use.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-25 17:58:06 +01:00
Rob Bradford
8f323e61d8 vmm: Add support to DeviceManager for hotplugging network devices
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-25 17:58:06 +01:00
Rob Bradford
42a9896fe4 vmm: device_manager: Refactor make_virtio_net_devices
Split it into a method that creates a single device which is called by
the multiple device version so this can be used when dynamically adding
a device.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-25 17:58:06 +01:00
Rob Bradford
9df601a1df bin, vmm: Centralise the net syntax
This will allow the syntax to be reused with cloud-hypervsor binary and
ch-remote.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-25 17:58:06 +01:00
Samuel Ortiz
41d7b3a387 vmm: memory_manager: Only send the GED notification for the ACPI method
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-03-25 15:54:16 +01:00
Hui Zhu
15d9ec0149 openapit: Add hotplug_method to MemoryConfig
Add hotplug_method to MemoryConfig in cloud-hypervisor.yaml.

Signed-off-by: Hui Zhu <teawater@antfin.com>
2020-03-25 15:54:16 +01:00
Hui Zhu
e63f98182a vmm: device: Add make_virtio_mem_devices
Add make_virtio_mem_devices to add virtio-mem to vmm.

Signed-off-by: Hui Zhu <teawater@antfin.com>
2020-03-25 15:54:16 +01:00
Hui Zhu
e6b934a56a vmm: Add support for virtio-mem
This commit adds new option hotplug_method to memory config.
It can set the hotplug method to "acpi" or "virtio-mem".

Signed-off-by: Hui Zhu <teawater@antfin.com>
2020-03-25 15:54:16 +01:00
Rob Bradford
75878dd90a openapi: Add "vm.add-pmem" entry point
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-25 13:18:17 +01:00
Rob Bradford
f6f4c68fb4 vmm: Add "add-pmem" to the API
Add the HTTP and internal API entry points for adding persistent memory
at runtime.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-25 13:18:17 +01:00
Rob Bradford
15de30f141 vmm: Add support for adding pmem devices to the VM
The persistent memory will be hotplugged via DeviceManager and saved in
the config for later use.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-25 13:18:17 +01:00
Rob Bradford
f7def621dd vmm: Add support to DeviceManager for hotplugging pmem devices
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-25 13:18:17 +01:00
Rob Bradford
8c3ea8cd76 vmm: device_manager: Refactor make_virtio_pmem_devices
Split it into a method that creates a single device which is called by
the multiple device version so this can be used when dynamically adding
a device.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-25 13:18:17 +01:00
Rob Bradford
a7296bbb52 bin, vmm: Centralise the pmem syntax
This will allow the syntax to be reused with cloud-hypervisor binary and
ch-remote.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-25 13:18:17 +01:00
Rob Bradford
4c9d15d44c vmm: Fix copy and paste error message
vm_remove_device was copied from vm_add_device but the error message
wasn't correctly updated.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-25 09:35:53 +00:00
Rob Bradford
82cad99c0b openapi: Add "vm.add-disk" entry point
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-25 09:35:53 +00:00
Rob Bradford
f2151b2734 vmm: Add "add-disk" to the API
Add the HTTP and internal API entry points for adding disks at runtime.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-25 09:35:53 +00:00
Rob Bradford
164ec2b8e6 vmm: Add support for adding disks to the VM
The disk will be hotplugged via DeviceManager and saved in the config
for later use.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-25 09:35:53 +00:00
Rob Bradford
b3082c1984 vmm: Add support to DeviceManager for hotplugging disks
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-25 09:35:53 +00:00
Rob Bradford
2be703ca92 vmm: device_manager: Refactor make_virtio_block_devices
Split it into a method that creates a single device which is called by
the multiple device version so this can be used when dynamically adding
a device.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-25 09:35:53 +00:00
Rob Bradford
66da29d8dd bin, vmm: Centralise the disk syntax
This will allow the syntax to be reused with cloud-hypervsor binary and
ch-remote.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-25 09:35:53 +00:00
Sebastien Boeuf
e54f8ec8a5 vmm: Update memory through DeviceManager
Whenever the VM memory is resized, DeviceManager needs to be notified
so that it can subsequently notify each virtio devices about it.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-24 19:01:15 +00:00
Sebastien Boeuf
feb8d7ae90 vmm: Separate seccomp filters between VMM and API threads
This separates the filters used between the VMM and API threads, so that
we can apply different rules for each thread.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-24 14:59:57 +01:00
Sebastien Boeuf
f1a23d712f vmm: api: Add seccomp to the HTTP API thread
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-24 14:59:57 +01:00
Sebastien Boeuf
db62cb3f4d vmm: Add seccomp filter to the VMM thread
This commit introduces the application of the seccomp filter to the VMM
thread. The filter is empty for now (SeccompLevel::None).

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-24 14:59:57 +01:00
Sebastien Boeuf
cb98d90097 vmm: Create new seccomp_filter module
Based on the seccomp crate, we create a new vmm module responsible for
creating a seccomp filter that will be applied to the VMM main thread.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-24 14:59:57 +01:00
Sebastien Boeuf
708f02dc26 vmm: Pull seccomp crate from Firecracker
The seccomp crate from Firecracker is nicely implemented, documented and
tested, which is a good reason for relying on it to create and apply
seccomp filters.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-24 14:59:57 +01:00
Rob Bradford
8acc15a63c build: Bump vm-memory and linux-loader dependencies
linux-loader depends on vm-memory so must be updated at the same time.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-23 14:27:41 +00:00
Rob Bradford
f7197e8415 vmm: Add a "discard_writes=" to --pmem
This opens the backing file read-only, makes the pages in the mmap()
read-only and also makes the KVM mapping read-only. The file is also
mapped with MAP_PRIVATE to make the changes local to this process only.

This is functional alternative to having support for making a
virtio-pmem device readonly. Unfortunately there is no concept of
readonly virtio-pmem (or any type of NVDIMM/PMEM) in the Linux kernel so
to be able to have a block device that is appears readonly in the guest
requires significant specification and kernel changes.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-20 14:46:34 +01:00
Rob Bradford
d11a67b0fe vmm: Use more generic MmapRegion constructor
Switch to MmapRegion::build() and fill in the fields appropriately.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-20 14:46:34 +01:00
Rob Bradford
7257e890ef vmm: Add "readonly" parameter MemoryManager::create_userspace_mapping
Use this boolean to turn on the KVM_MEM_READONLY flag to indicate that
this memory mapping should not be writable by the VM.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-20 14:46:34 +01:00
Qiu Wenbo
c503118d16 vmm: fix a corrupted stack caused by get_win_size
According to `asm-generic/termios.h`, the `struct winsize` should be:

struct winsize {
        unsigned short ws_row;
        unsigned short ws_col;
        unsigned short ws_xpixel;
        unsigned short ws_ypixel;
};

The ioctl of TIOCGWINSZ will trigger a segfault on aarch64.

Signed-off-by: Qiu Wenbo <qiuwenbo@phytium.com.cn>
2020-03-20 07:30:06 +01:00
Rob Bradford
0788600702 build: Remove "pvh_boot" feature flag
This feature is stable and there is no need for this to be behind a
flag. This will also reduce the time needed to run the integration test
as we will not be running them all again under the flag.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-19 13:05:44 +00:00
Rob Bradford
477bc17f18 bin: Share VFIO device syntax between cloud-hypervisor and ch-remote
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-18 23:38:55 +00:00
Jose Carlos Venegas Munoz
a31ffef085 openapi: Add hotplug_size for memory hotplug
Add hotplug_size, needed to be defined when hotplug is used.

Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
2020-03-18 19:06:07 +00:00
Rob Bradford
87990f9e67 vmm: Add virtio-pci device to B/D/F hash table
This table currently contains only all the VFIO devices and it should
really contain all the PCI devices.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-18 19:05:58 +00:00
Rob Bradford
fb185fa839 vmm: Always return PCI B/D/F from add_virtio_pci_device
Previously this was only returned if the device had an IOMMU mapping and
whether the device should be added to the virtio-iommu. This was already
captured earlier as part of creating the device so use that information
instead.

Always returning the B/D/F is helpful as it facilitates virtio PCI
device hotplug.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-18 19:05:58 +00:00
Samuel Ortiz
63eeed29cc vm: Comment on the VM config update from memory hotplug
I spent a few minutes trying to understand why we were unconditionally
updating the VM config memory size, even if the guest memory resizing
did not happen.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-03-18 12:48:40 +01:00
dependabot-preview[bot]
51f51ea17d build(deps): bump libc from 0.2.67 to 0.2.68
Bumps [libc](https://github.com/rust-lang/libc) from 0.2.67 to 0.2.68.
- [Release notes](https://github.com/rust-lang/libc/releases)
- [Commits](https://github.com/rust-lang/libc/compare/0.2.67...0.2.68)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-03-17 21:36:38 +00:00
Rob Bradford
28a5f9dc19 vmm: acpi: Remove unused IORT related structures
The IORT table for virtio-iommu use was removed and replaced with a
purely virtio based solution. Although the table construction was
removed these structures were left behind.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-17 12:46:26 +00:00
Alejandro Jimenez
9e247c4e06 pvh: Introduce "pvh_boot" feature
Use a new feature called "pvh_boot" to enable using the PVH boot
protocol if the guest kernel supports it. The feature can be enabled
by building with:

cargo build [--release] --features "pvh_boot"

Once performance has been evaluated, this can be made part of the
default set of features so that any guest that supports it boots
using PVH as the preferred option as is the case in QEMU.

Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
2020-03-13 18:29:44 +01:00
Alejandro Jimenez
a22bc3559f pvh: Write start_info structure to guest memory
Fill the hvm_start_info and related memory map structures as
specified in the PVH boot protocol. Write the data structures
to guest memory at the GPA that will be stored in %rbx when
the guest starts.

Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
2020-03-13 18:29:44 +01:00
Alejandro Jimenez
840a9a97ff pvh: Initialize vCPU regs/sregs for PVH boot
Set the initial values of the KVM vCPU registers as specified in
the PVH boot ABI:

https://xenbits.xen.org/docs/unstable/misc/pvh.html

Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
2020-03-13 18:29:44 +01:00
Alejandro Jimenez
24f0e42e6a pvh: Introduce EntryPoint struct
In order to properly initialize the kvm regs/sregs structs for
the guest, the load_kernel() return type must specify which
boot protocol to use with the entry point address it returns.

Make load_kernel() return an EntryPoint struct containing the
required information. This structure will later be used
in the vCPU configuration methods to setup the appropriate
initial conditions for the guest.

Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
2020-03-13 18:29:44 +01:00
Rob Bradford
4579afa091 vmm: For --disk error if socket and path is specified
This is an error as the path should be specfied by the unmanaged
backend.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-13 11:41:52 +00:00
Rob Bradford
7e599b4450 vmm: Make disk path optional
When using "--disk" with a vhost socket and not using self spawning then
it is not necessary or helpful to specify the path.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-13 11:41:52 +00:00
Sebastien Boeuf
8d785bbd5f pci: Fix the PciBus using HashMap instead of Vec
By using a Vec to hold the list of devices on the PciBus, there's a
problem when we use unplug. Indeed, the vector of devices gets reduced
and if the unplugged device was not the last one from the list, every
other device after this one is shifted on the bus.

To solve this problem, a HashMap is used. This allows to keep track of
the exact place where each device stands on the bus.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-13 10:54:34 +01:00
Jose Carlos Venegas Munoz
40b38a4222 openapi: Make desired_ram int64 format
The option desired_ram is in byte, make larger the amount of memory to
add.

Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
2020-03-12 23:17:56 +01:00
Sebastien Boeuf
efba48dddb vmm: Don't put a VFIO device behind the vIOMMU by default
With some of the factorization that happened to be able to support VFIO
hotplug, one mistake was made. In case a vIOMMU is created through a
virtio-iommu device, and no matter the "iommu" option value from the
VFIO device parameter, the VFIO device was always placed behind the
virtual IOMMU.

This commit fixes this wrong behavior by making sure the device
configuration is taken into account to decide if it should be attached
or not to the virtual IOMMU.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-11 19:50:31 +01:00
Sebastien Boeuf
34412c9b41 vmm: Add id option to VFIO hotplug
Add a new id option to the VFIO hotplug command so that it matches the
VFIO coldplug semantic.

This is done by refactoring the existing code for VFIO hotplug, where
VmAddDeviceData structure is replaced by DeviceConfig. This structure is
the one used whenever a VFIO device is coldplugged, which is why it
makes sense to reuse it for the hotplug codepath.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-11 19:50:31 +01:00
Samuel Ortiz
18dc916380 vmm: Switch to the micro-http package
It's been extracted from the Firecracker code base.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-03-11 17:38:01 +01:00
Sebastien Boeuf
9023444ad3 vmm: Add id field to --device through CLI
Add the ability to specify the "id" associated with a device, by adding
an extra option to the parameter --device.

This new option is not mandatory, and by default, the VMM will take care
of finding a unique identifier.

If the identifier provided by the user through this new option is not
unique, an error will be thrown and the VM won't be started.

Fixes #881

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-11 13:10:57 +00:00
Sebastien Boeuf
f4a956a60a vmm: Remove 32 bits MMIO range from correct address space
The 32 bits MMIO address space is handled separately from the 64 bits
one. For this reason, we need to invoke the appropriate freeing function
to remove a range from this address space.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-11 13:10:30 +00:00
Sebastien Boeuf
432eb5b70a vmm: Free PCI BARs when unplugging PCI device
Now that PciDevice trait has a dedicated function to remove the bars,
the DeviceManager can invoke this function whenever a PCI device is
unplugged from the VM.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-11 13:10:30 +00:00
Sebastien Boeuf
b50cbe5064 pci: Give PCI device ID back when removing a device
Upon removal of a PCI device, make sure we don't hold onto the device ID
as it could be reused for another device later.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-10 17:05:06 +00:00
Sebastien Boeuf
df71aaee3f pci: Make the device ID allocation smarter
In order to handle the case where devices are very often plugged and
unplugged from a VM, we need to handle the PCI device ID allocation
better.

Any PCI device could be removed, which means we cannot simply rely on
the vector size to give the next available PCI device ID.

That's why this patch stores in memory the information about the 32
slots availability. Based on this information, whenever a new slot is
needed, the code can correctly provide an available ID, or simply return
an error because all slots are taken.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-10 17:05:06 +00:00
Sebastien Boeuf
e514b124ed vmm: Update VmConfig when removing VFIO device
This commit ensures that when a VFIO device is hot-unplugged from the
VM, it is also removed from the VmConfig. This prevents a potential
reboot from creating the device.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-10 17:05:06 +00:00
Sebastien Boeuf
81173bf4ab vmm: Add id field to DeviceConfig structure
Add a new field to the DeviceConfig, allowing the VMM to allocate a name
to the VFIO devices.

By identifying a VFIO device with a unique name, we can make sure a user
can properly unplug it at any time.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-10 17:05:06 +00:00
Sebastien Boeuf
6cbdb9aa47 vmm: api: Introduce new "remove-device" HTTP endpoint
This commit introduces the new command "remove-device" that will let a
user hot-unplug a VFIO PCI device from an already running VM.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-10 17:05:06 +00:00
Sebastien Boeuf
991f3bb5da vmm: Remove VFIO device from everywhere it is referenced
This commit implements the eject function so that a VFIO device will be
removed from any bus it might sit on, and from any list it might be
stored in.

The idea is to reach a point where there is no reference of the device
anywhere in the code, so that the Drop implementation will be invoked
and so that the device will be fully removed from the VMM.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-10 17:05:06 +00:00
Sebastien Boeuf
6adebbc6a0 vmm: Detect when guest notifies about ejecting PCI device
When the guest OS is done removing a PCI device, it will invoke the _EJ0
method from ACPI, associated with the device. This will trigger a port
IO write to a region known by the VMM. Upon this writing, the VMM will
trap the VM exit and retrieve the written value.

Based on the value, the VMM will invoke its eject_device() method to
finalize the removal of the device.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-10 17:05:06 +00:00
Sebastien Boeuf
08604ac6a8 vmm: Store PCI devices as Any devices from DeviceManager
As we try to keep track of every PCI device related to the VM, we don't
want to have separate lists depending on the concrete type associated
with the PciDevice trait. Also, we want to be able to cast the actual
type into any trait or concrete type.

The most efficient way to solve all these issues is to store every
device as an Arc<dyn Any + Send + Sync>. This gives the ability to
downcast into the appropriate concrete type, and then to cast back into
any trait that we might need.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-10 17:05:06 +00:00
Sebastien Boeuf
0f99d3f7cc vmm: Store VFIO device's name and its PCI b/d/f
Add a new list storing the device names across the entire codebase. VFIO
devices are added to the list whenever a new one is created. By default,
each VFIO device is given a name "vfioX" where X is the first available
integer.

Along with this new list of names, another list is created, grouping PCI
device's name with its associated b/d/f. This will be useful to keep
track of the created devices so that we can implement unplug
functionality.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-10 17:05:06 +00:00
Rob Bradford
f0a3e7c4a1 build: Bump linux-loader and vm-memory dependencies
linux-loader now uses the released vm-memory so we must move to that
version at the same time.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-03-05 11:01:30 +01:00
Sebastien Boeuf
09829c44b2 vmm: Remove IO bus strong reference from Vm
The Vm structure was used to store a strong reference to the IO bus.
This is not needed anymore since the AddressManager is logically the
one holding this strong reference. This has been made possible by the
introduction of Weak references on the Bus structure itself.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-04 18:46:44 +01:00
Sebastien Boeuf
2dbb376175 vmm: Remove all Weak references from DeviceManager
Now that the BusDevice devices are stored as Weak references by the
IO and MMIO buses, there's no need to use Weak references from the
DeviceManager anymore.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-04 18:46:44 +01:00
Sebastien Boeuf
9e915a0284 vmm: Remove all Weak references from CpuManager
Now that the BusDevice devices are stored as Weak references by the
IO and MMIO buses, there's no need to use Weak references from the
CpuManager anymore.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-04 18:46:44 +01:00
Sebastien Boeuf
49268bff3b pci: Remove all Weak references from PciBus
Now that the BusDevice devices are stored as Weak references by the IO
and MMIO buses, there's no need to use Weak references from the PciBus
anymore.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-04 18:46:44 +01:00
Sebastien Boeuf
7773812f58 vmm: Store the list of BusDevice devices from DeviceManager
The point is to make sure the DeviceManager holds a strong reference of
each BusDevice inserted on the IO and MMIO buses. This will allow these
buses to hold Weak references onto the BusDevice devices.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-04 18:46:44 +01:00
Sebastien Boeuf
d0820cc026 vmm: Make add_vfio_device mutable
The method add_vfio_device() from the DeviceManager needs to be mutable
if we want later to be able to update some internal fields from the
DeviceManager from this same function.

This commit simply takes care of making the necessary changes to change
this function as mutable.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-04 18:46:44 +01:00
Sebastien Boeuf
948f808da6 vm: Rename DeviceManager field in Vm structure
It's more logical to name the field referring to the DeviceManager as
"device_manager" instead of "devices".

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-04 18:46:44 +01:00
Sebastien Boeuf
d47f733e51 vmm: Break the cyclic dependency between DeviceManager and IO bus
By inserting the DeviceManager on the IO bus, we introduced some cyclic
dependency:

  DeviceManager ---> AddressManager ---> Bus ---> BusDevice
        ^                                             |
        |                                             |
        +---------------------------------------------+

This cycle needs to be broken by inserting a Weak reference instead of
an Arc (considered as a strong reference).

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-04 12:06:02 +00:00
Sebastien Boeuf
c1af13efeb vmm: Update VmConfig when adding new device
Ensures the configuration is updated after a new device has been
hotplugged. In the event of a reboot, this means the new VM will be
started with the new device that had been previously hotplugged.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-04 12:06:02 +00:00
Sebastien Boeuf
a86f4369a7 vmm: Add VFIO PCI device hotplug support
This commit finalizes the VFIO PCI hotplug support, based on all the
previous commits preparing for it.

One thing to notice, this does not support vIOMMU yet. This means we can
hotplug VFIO PCI devices, but we cannot attach them to an existing or a
new virtio-iommu device.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-04 12:06:02 +00:00
Sebastien Boeuf
320fea0eaf vmm: Factorize VFIO PCI device creation
This factorization is very important as it will allow both the standard
codepath and the VFIO PCI hotplug codepath to rely on the same function
to perform the addition of a new VFIO PCI device.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-04 12:06:02 +00:00
Sebastien Boeuf
00716f90a0 vmm: Store virtio-iommu device from DeviceManager
Helps with future refactoring of VFIO device creation.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-04 12:06:02 +00:00
Sebastien Boeuf
5902dfa403 vmm: Store VFIO KVM device from DeviceManager
Helps with future refactoring of VFIO device creation.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-04 12:06:02 +00:00
Sebastien Boeuf
d9c1b4396e vmm: Store MSI InterruptManager from DeviceManager
Helps with future refactoring of VFIO device creation.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-04 12:06:02 +00:00
Sebastien Boeuf
02adc4061a vmm: Store PciBus from DeviceManager
Helps with future refactoring of VFIO device creation.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-04 12:06:02 +00:00
Sebastien Boeuf
d0218e94a3 vmm: Trigger hotplug notification to the guest
Whenever the user wants to hotplug a new VFIO PCI device, the VMM will
have to trigger a hotplug notification through the GED device.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-04 12:06:02 +00:00
Sebastien Boeuf
0e58741a09 vmm: api: Introduce new "add-device" HTTP endpoint
This commit introduces the new command "add-device" that will let a user
hotplug a VFIO PCI device to an already running VM.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-04 12:06:02 +00:00
Sebastien Boeuf
0f1396acef vmm: Insert PCI device hotplug operation region on IO bus
Through the BusDevice implementation from the DeviceManager, and by
inserting the DeviceManager on the IO bus for a specific IO port range,
the VMM now has the ability to handle PCI device hotplug.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-04 12:06:02 +00:00
Sebastien Boeuf
65774e8a78 vmm: Implement BusDevice for DeviceManager
In anticipation of inserting the DeviceManager on the IO/MMIO buses,
the DeviceManager must implement the BusDevice trait.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-04 12:06:02 +00:00
Sebastien Boeuf
8dbc84318c vmm: acpi: Add PCNT method to invoke DVNT
Create a small method that will perform both hotplug of all the devices
identified by PCIU bitmap, and then perform the hotunplug of all the
devices identified by the PCID bitmap.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-04 12:06:02 +00:00
Sebastien Boeuf
c62db97a81 vmm: acpi: Add _EJ0 to each PCI device slot
The _EJ0 method provides the guest OS a way to notify the VMM that the
device has been properly ejected from the guest OS. Only after this
point, the VMM can fully remove the device.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-04 12:06:02 +00:00
Sebastien Boeuf
4dc2a39f3a vmm: acpi: Create PHPR container
This new PHPR device in the DSDT table introduces some specific
operation regions and the associated fields.

PCIU stands for "PCI up", which identifies PCI devices that must be
added.
PCID stands for "PCI down", which identifies PCI devices that must be
removed.
B0EJ stands for "Bus 0 eject", which identifies which device on the bus
has been ejected by the guest OS.

Thanks to these fields, the VMM and the guest OS can communicate while
performing hotplug/hotunplug operations.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-04 12:06:02 +00:00
Sebastien Boeuf
c3a0685e2d vmm: acpi: Add notification method for PCI device slots
Adds the DVNT method to the PCI0 device in the DSDT table. This new
method is responsible for checking each slot and notify the guest OS if
one of the slots is supposed to be added or removed.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-04 12:06:02 +00:00
Sebastien Boeuf
5a68d5b6a7 vmm: acpi: Create PCI device slots
This commit introduces the ACPI support for describing the 32 device
slots attached to the main PCI host bridge.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-04 12:06:02 +00:00
Bin Liu
d6e6901957 vmm/api: Fix vm.info response definition
Update cloud-hypervisor.yaml with latest code.

Fixes: #841

Signed-off-by: liubin <liubin0329@gmail.com>
2020-03-03 09:34:25 +01:00
Sebastien Boeuf
8142c823ed vmm: Move DeviceManager into an Arc<Mutex<>>
In anticipation of the support for device hotplug, this commit moves the
DeviceManager object into an Arc<Mutex<>> when the DeviceManager is
being created. The reason is, we need the DeviceManager to implement the
BusDevice trait and then provide it to the IO bus, so that IO accesses
related to device hotplug can be handled correctly.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-02-27 11:12:31 +01:00
Qiu Wenbo
9de3ace8c7 devices: implement Aml trait for GED device
Fixes: #657

Signed-off-by: Qiu Wenbo <qiuwenbo@phytium.com.cn>
2020-02-25 08:32:16 +00:00
Sebastien Boeuf
b77fdeba2d msi/msi-x: Prevent from losing masked interrupts
We want to prevent from losing interrupts while they are masked. The
way they can be lost is due to the internals of how they are connected
through KVM. An eventfd is registered to a specific GSI, and then a
route is associated with this same GSI.

The current code adds/removes a route whenever a mask/unmask action
happens. Problem with this approach, KVM will consume the eventfd but
it won't be able to find an associated route and eventually it won't
be able to deliver the interrupt.

That's why this patch introduces a different way of masking/unmasking
the interrupts, simply by registering/unregistering the eventfd with the
GSI. This way, when the vector is masked, the eventfd is going to be
written but nothing will happen because KVM won't consume the event.
Whenever the unmask happens, the eventfd will be registered with a
specific GSI, and if there's some pending events, KVM will trigger them,
based on the route associated with the GSI.

Suggested-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-02-25 08:31:14 +00:00
Rob Bradford
bba5ef3a59 vmm: Remove deprecated CPU syntax
Remove the old way of specifying the number of vCPUs to use.

Fixes: #678

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-24 07:26:31 +01:00
Rob Bradford
374ac77c63 main, vmm: Remove deprecated --vhost-user-net
This has been superseded by using --net with vhost_user=true and
socket=<socket>

Fixes: #678

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-24 07:26:31 +01:00
Rob Bradford
ffd816ebfa main, vmm: Remove deprecated --vhost-user-blk
This has been superseded by using --disk with vhost_user=true and
socket=<socket>

Fixes: #678

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-24 07:26:31 +01:00
dependabot-preview[bot]
f190cb05b5 build(deps): bump libc from 0.2.66 to 0.2.67
Bumps [libc](https://github.com/rust-lang/libc) from 0.2.66 to 0.2.67.
- [Release notes](https://github.com/rust-lang/libc/releases)
- [Commits](https://github.com/rust-lang/libc/compare/0.2.66...0.2.67)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-02-21 08:03:30 +00:00
Sergio Lopez
d2f1749edb vmm: config: Add poll_queue property to DiskConfig
Recently, vhost_user_block gained the ability of actively polling the
queue, a feature that can be disabled with the poll_queue property.

This change adds this property to DiskConfig, so it can be used
through the "disk" argument.

For the moment, it can only be used when vhost_user=true, but this
will change once virtio-block gets the poll_queue feature too.

Fixes: #787

Signed-off-by: Sergio Lopez <slp@redhat.com>
2020-02-20 18:06:54 +01:00
Sergio Lopez
378dd81204 vmm: openapi: Add missing "direct" knob to DiskConfig
Add missing "direct" knob that should be exposed through the REST API.

Signed-off-by: Sergio Lopez <slp@redhat.com>
2020-02-20 18:06:54 +01:00
Sergio Lopez
056f5481ac vmm: openapi: Fix "readonly" and "wce" defaults in DiskConfig
Fix "readonly" and "wce" defaults in cloud-hypervisor.yaml to match
their respective defaults in config.rs:DiskConfig.

Signed-off-by: Sergio Lopez <slp@redhat.com>
2020-02-20 18:06:54 +01:00
Samuel Ortiz
c49e31a6d9 vmm: api: Return a resize error when resize fails
And not a VmCreate one.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-02-20 12:26:12 +01:00
Samuel Ortiz
ebc6391bea vmm: api: Fix resize command typos
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-02-20 12:26:12 +01:00
Samuel Ortiz
9de755334d vmm: openapi: Update DiskConfig
It's missing a few knobs (readonly, vhost, wce) that should be exposed
through the rest API.

Fixes: #790

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-02-20 12:17:50 +01:00
Rob Bradford
ed1e7817cc vmm: Workaround double reboot triggered by the kernel
The kernel does not adhere to the ACPI specification (probably to work
around broken hardware) and rather than busy looping after requesting an
ACPI reset it will attempt to reset by other mechanisms (such as i8042
reset.)

In order to trigger a reset the devices write to an EventFd (called
reset_evt.) This is used by the VMM to identify if a reset is requested
and make the VM reboot. As the reset_evt is part of the VMM and reused
for both the old and new VM it is possible for the newly booted VM to
immediately get reset as there is an old event sitting in the EventFd.

The simplest solution is to "drain" the reset_evt EventFd on reboot to
make sure that there is no spurious events in the EventFd.

Fixes: #783

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-19 18:51:14 +01:00
Sebastien Boeuf
793d4e7b8d vmm: Move codebase to GuestMemoryAtomic from vm-memory
Relying on the latest vm-memory version, including the freshly
introduced structure GuestMemoryAtomic, this patch replaces every
occurrence of Arc<ArcSwap<GuestMemoryMmap> with
GuestMemoryAtomic<GuestMemoryMmap>.

The point is to rely on the common RCU-like implementation from
vm-memory so that we don't have to do it from Cloud-Hypervisor.

Fixes #735

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-02-19 13:48:19 +00:00
Rob Bradford
1f6cbad01a vmm: Add support for spawning vhost-user-block backend
If no socket is supplied when enabling "vhost_user=true" on "--disk"
follow the "exe" path in the /proc entry for this process and launch the
network backend (via the vmm_path field.)

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-18 08:43:47 +00:00
Sebastien Boeuf
3edc2bd6ab vmm: Prevent memory overcommitment through virtio-fs shared regions
When a virtio-fs device is created with a dedicated shared region, by
default the region should be mapped as PROT_NONE so that no pages can be
faulted in.

It's only when the guest performs the mount of the virtiofs filesystem
that we can expect the VMM, on behalf of the backend, to perform some
new mappings in the reserved shared window, using PROT_READ and/or
PROT_WRITE.

Fixes #763

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-02-17 15:03:47 +01:00
Rob Bradford
bc75c1b4e1 vmm: Add support for spawning vhost-user-net backend
If no socket is supplied when enabling "vhost_user=true" on "--net"
follow the "exe" path in the /proc entry for this process and launch the
network backend (via the vmm_path field.)

Currently this only supports creating a new tap interface as the network
backend also only supports that.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-14 17:32:49 +00:00
Rob Bradford
b04eb4770b vmm: Follow the "exe" symlink from the PID directory in /proc
It is necessary to do this at the start of the VMM execution rather than
later as it must be done in the main thread in order to satisfy the
checks required by PTRACE_MODE_READ_FSCREDS (see proc(5) and
ptrace(2))

The alternative is to run as CAP_SYS_PTRACE but that has its
disadvantages.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-14 17:32:49 +00:00
Rob Bradford
7c9e8b103f vmm: device_manager: Shutdown all virtio devices
When the DeviceManager is dropped explicitly shutdown() all virtio
devices.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-14 17:32:49 +00:00
Sebastien Boeuf
3447e226d9 dependencies: bump vm-memory from 4237db3 to f3d1c27
This commit updates Cloud-Hypervisor to rely on the latest version of
the vm-memory crate.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-02-06 11:40:45 +01:00
Sebastien Boeuf
62ccccc303 vmm: Make sure to retry creating the VM on EINTR
If the ioctl syscall KVM_CREATE_VM gets interrupted while creating the
VM, it is expected that we should retry since EINTR should not be
considered a standard error.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-02-05 12:06:21 +01:00
Samuel Ortiz
da2b3c92d3 vm-device: interrupt: Remove InterruptType dependencies and definitions
Having the InterruptManager trait depend on an InterruptType forces
implementations into supporting potentially very different kind of
interrupts from the same code base. What we're defining through the
current, interrupt type based create_group() method is a need for having
different interrupt managers for different kind of interrupts.

By associating the InterruptManager trait to an interrupt group
configuration type, we create a cleaner design to support that need as
we're basically saying that one interrupt manager should have the single
responsibility of supporting one kind of interrupt (defined through its
configuration).

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-02-04 19:32:45 +01:00
Samuel Ortiz
84fc807bc6 interrupt: Interrupt manager split
We create 2 different interrupt managers for separately handling
creation of legacy and MSI interrupt groups.
Doing so allows us to have a cleaner interrupt manager and IOAPIC
initialization path. It also prepares for an InterruptManager trait
design improvement where we remove the interrupt source type dependency
by associating an interrupt configuration type to the trait.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-02-04 19:32:45 +01:00
Rob Bradford
880a57c920 vmm: Remove VmInfo struct
After refactoring the VmInfo struct is no longer needed.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-03 12:28:30 +00:00
Rob Bradford
07bc292fa5 vmm: device_manager: Get VmFd from AddressManager
A reference to the VmFd is stored on the AddressManager so it is not
necessary to pass in the VmInfo into all methods that need it as it can
be obtained from the AddressManager.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-03 12:28:30 +00:00
Rob Bradford
6411c3ae42 vmm: device_manager: Use MemoryManager to get guest memory
The DeviceManager has a reference to the MemoryManager so use that to
get the GuestMemoryMmap rather than the version stored in the VmInfo
struct.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-03 12:28:30 +00:00
Rob Bradford
066fc6c0d1 vmm: device_manager: Get VM config from the struct member
Remove the use of vm_info in methods to get the config and instead use
the config stored on the DeviceManager itself.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-03 12:28:30 +00:00
Rob Bradford
77ae3de4f3 vmm: device_manager: Make legacy device addition a method
Remove some in/out parameters and instead rely on them as members of the
&mut self parameter.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-03 12:28:30 +00:00
Rob Bradford
599275b610 vmm: device_manager: Make ACPI device creation a method
Remove some in/out parameters and instead rely on them as members of the
&mut self parameter.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-03 12:28:30 +00:00
Rob Bradford
b8c1b2e174 vmm: device_manager: Make console creation a method
Remove some in/out parameters and instead rely on them as members of the
&mut self parameter.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-03 12:28:30 +00:00
Rob Bradford
b5440e2d0a vmm: device_manager: Make virtio device creation functions methods
Remove some in/out parameters and instead rely on them as members of the
&mut self parameter. This prepares the way to more easily store state on
the DeviceManager.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-03 12:28:30 +00:00
Rob Bradford
e90c6f3c44 vmm: device_manager: Make make_virtio_devices a method
Remove some in/out parameters and instead rely on them as members of the
&mut self parameter. A follow-up commit will change the callee functions
that create the devices themselves.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-03 12:28:30 +00:00
Rob Bradford
dbc09ad0ef vmm: device_manager: Make add_vfio_devices a method
Remove some in/out parameters and instead rely on them as members of the
&mut self parameter.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-03 12:28:30 +00:00
Rob Bradford
d9e1c2cd22 vmm: device_manager: Make add_virtio_pci_device a method
Remove some in/out parameters and instead rely on them as members of the
&mut self parameter.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-03 12:28:30 +00:00
Rob Bradford
aaa5e2e9ea vmm: device_manager: Make add_virtio_mmio_device a method
Remove some in/out parameters and instead rely on them as members of the
&mut self parameter.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-03 12:28:30 +00:00
Rob Bradford
2987476e0a vmm: device_manager: Make add_pci_devices and add_mmio_devices methods
Modify these functions to take an &mut self and become methods on
DeviceManager. This allows the removal of some in/out parameters and
leads the way to further refactoring and simplification.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-03 12:28:30 +00:00
Rob Bradford
3dbae423bb vmm: device_manager: Only add MemoryManager to I/O bus on ACPI builds
The MemoryManager should only be included on the I/O bus when doing ACPI
builds as that is the only time it will be interrogated.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-03 12:28:30 +00:00
Rob Bradford
68fa97eb0e vmm: device_manager: Always embed MemoryManager in the struct
Currently the MemoryManager is only used on the ACPI code paths after
the DeviceManager has been created. This will change in a future commit
as part of the refactoring so for now always include it but name it with
underscore prefix to indicate it might not always be used.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-02-03 12:28:30 +00:00
Sebastien Boeuf
ac01ceddbb vmm: Cleanup list of PCI IDs related to virtual IOMMU
Now that devices attached to the virtual IOMMU are described through
virtio configuration, there is no need for the DeviceManager to store
the list of IDs for all these devices. Instead, things are handled
locally when PCI devices are being added.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-30 10:37:40 +01:00
Sebastien Boeuf
097cff2d85 vmm: Use virtio topology for virtio-iommu
Instead of relying on the ACPI tables to describe the devices attached
to the virtual IOMMU, let's use the virtio topology, as the ACPI support
is getting deprecated.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-30 10:37:40 +01:00
dependabot-preview[bot]
1651cc3953 build(deps): bump kvm-ioctls from 0.4.0 to 0.5.0
Bumps [kvm-ioctls](https://github.com/rust-vmm/kvm-ioctls) from 0.4.0 to 0.5.0.
- [Release notes](https://github.com/rust-vmm/kvm-ioctls/releases)
- [Changelog](https://github.com/rust-vmm/kvm-ioctls/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-vmm/kvm-ioctls/compare/v0.4.0...v0.5.0)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-01-29 10:22:51 +00:00
Rob Bradford
75e6762897 vmm: Give deprecation warning for "--vhost-user-blk" syntax
This will be removed in a future release.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-01-29 08:06:37 +00:00
Rob Bradford
969b5ee4e8 vmm: config: Add warning about specifying "wce" without "vhost-user"
Currently configuring WCE is only supported when using vhost-user.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-01-29 08:06:37 +00:00
Rob Bradford
aeeae661fc vmm: Support vhost-user-block via "--disks"
Add a socket and vhost_user parameter to this option so that the same
configuration option can be used for both virtio-block and
vhost-user-block.  For now it is necessary to specify both vhost_user
and socket parameters as auto activation is not yet implemented. The wce
parameter for supporting "Write Cache Enabling" is also added to the
disk configuration.

The original command line parameter is still supported for now and will
be removed in a future release.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-01-29 08:06:37 +00:00
Rob Bradford
2c6f528c23 vmm: Give deprecation warning for "--vhost-user-net" syntax
This will be removed in a future release.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-01-28 12:39:26 +00:00
Rob Bradford
a831aa214c vmm: Support vhost-user-net via "--net"
Add a socket and vhost_user parameter to this option so that the same
configuration option can be used for both virtio-net and vhost-user-net.
For now it is necessary to specify both vhost_user and socket parameters
as auto activation is not yet implemented. The original command line
parameter is still supported for now.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-01-28 12:39:26 +00:00
Sebastien Boeuf
f5b53ae4be vm-virtio: Implement multiqueue/multithread support for virtio-blk
This commit improves the existing virtio-blk implementation, allowing
for better I/O performance. The cost for the end user is to accept
allocating more vCPUs to the virtual machine, so that multiple I/O
threads can run in parallel.

One thing to notice, the amount of vCPUs must be egal or superior to the
amount of queues dedicated to the virtio-blk device.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-01-28 09:26:53 +01:00