Commit Graph

293 Commits

Author SHA1 Message Date
Bo Chen
2612a6df29 vmm: seccomp: Add seccomp filters for the vcpu worker thread
Partially fixes: #925

Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-09-11 07:42:31 +02:00
Sebastien Boeuf
e15dba2925 vmm: Rename NUMA option 'id' into 'guest_numa_id'
The goal of this commit is to rename the existing NUMA option 'id' with
'guest_numa_id'. This is done without any modification to the way this
option behaves.

The reason for the rename is caused by the observation that all other
parameters with an option called 'id' expect a string to be provided.

Because in this particular case we expect a u32 representing a proximity
domain from the ACPI specification, it's better to name it with a more
explicit name.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-09-07 07:37:14 +02:00
Sebastien Boeuf
f21c04166a vmm: Move NUMA node list creation to Vm structure
Based on the previous changes introducing new options for both memory
zones and NUMA configuration, this patch changes the behavior of the
NUMA node definition. Instead of relying on the memory zones to define
the guest NUMA nodes, everything goes through the --numa parameter. This
allows for defining NUMA nodes without associating any particular memory
range to it. And in case one wants to associate one or multiple memory
ranges to it, the expectation is to describe a list of memory zone
through the --numa parameter.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-09-07 07:37:14 +02:00
Sebastien Boeuf
9548e7e857 vmm: Update NUMA node distances internally
Based on the NumaConfig which now provides distance information, we can
internally update the list of NUMA nodes with the exact distances they
should be located from other nodes.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-09-01 18:09:01 +02:00
Sebastien Boeuf
db28db8567 vmm: Update NUMA nodes based on NumaConfig
Relying on the list of CPUs defined through the NumaConfig, this patch
will update the internal list of CPUs attached to each NUMA node.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-09-01 15:25:00 +02:00
Sebastien Boeuf
871138d5cc vm-migration: Make snapshot() mutable
There will be some cases where the implementation of the snapshot()
function from the Snapshottable trait will require to modify some
internal data, therefore we make this possible by updating the trait
definition with snapshot(&mut self).

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-08-25 16:43:10 +02:00
Michael Zhao
afc98a5ec9 vmm: Fix AArch64 clippy warnings of vmm and other crates
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2020-08-24 10:59:08 +02:00
Muminul Islam
92b4499c1e vmm, hypervisor: Add vmstate to snapshot and restore path
Signed-off-by: Muminul Islam <muislam@microsoft.com>
2020-08-24 08:48:15 +02:00
Bo Chen
704edd544c virtio-devices: seccomp: Add seccomp_filter module
This patch added the seccomp_filter module to the virtio-devices crate
by taking reference code from the vmm crate. This patch also adds
allowed-list for the virtio-block worker thread.

Partially fixes: #925

Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-08-04 11:40:49 +02:00
Bo Chen
ff7ed8f628 vmm: Propagate the SeccompAction value to the Vm struct constructor
This patch propagates the SeccompAction value from main to the
Vm struct constructor (i.e. Vm::new_from_memory_manager), so that we can
use it to construct the DeviceManager and CpuManager struct for
controlling the behavior of the seccomp filters for vcpu/virtio-device
worker threads.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-08-04 11:40:49 +02:00
Wei Liu
218ec563fc vmm: fix warnings when KVM is not enabled
Some imports are only used by KVM. Some variables and code become dead
or unused when KVM is not enabled.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-07-28 21:08:39 +01:00
Michael Zhao
e3e771727a arch: Refactor GIC code to seperate KVM specific code
Shrink GICDevice trait to contain hypervisor agnostic API's only, which
are used in generating FDT.
Move all KVM specific logic into KvmGICDevice trait.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2020-07-21 16:22:02 +02:00
Michael Zhao
3e051e7b2c arch, vmm: Enable initramfs on AArch64
Ported Firecracker commit 144b6c.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2020-07-20 14:20:53 +01:00
Wei Liu
d80e383dbb arch: move test cases to vmm crate
This saves us from adding a "kvm" feature to arch crate merely for the
purpose of running tests.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-07-15 17:21:07 +02:00
Wei Liu
598eaf9f86 vmm: use hypervisor::new in test_vm
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-07-15 17:21:07 +02:00
Sebastien Boeuf
a5c4f0fc6f arch, vmm: Add e820 entry related to SGX EPC region
SGX expects the EPC region to be reported as "reserved" from the e820
table. This patch adds a new entry to the table if SGX is enabled.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-07-15 15:08:56 +02:00
Sebastien Boeuf
1603786374 vmm: Pass MemoryManager through CpuManager creation
Instead of passing the GuestMemoryMmap directly to the CpuManager upon
its creation, it's better to pass a reference to the MemoryManager. This
way we will be able to know if SGX EPC region along with one or multiple
sections are present.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-07-15 15:08:56 +02:00
Sebastien Boeuf
84cf12d86a arch, vmm: Create SGX virtual EPC sections from MemoryManager
Based on the presence of one or multiple SGX EPC sections from the VM
configuration, the MemoryManager will allocate a contiguous block of
guest address space to hold the entire EPC region. Within this EPC
region, each EPC section is memory mapped.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-07-15 15:08:56 +02:00
Michael Zhao
f2e484750a arch: aarch64: Add PCIe node in FDT for AArch64
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2020-07-14 14:34:54 +01:00
Hui Zhu
800220acbb virtio-balloon: Store the balloon size to support reboot
This commit store balloon size to MemoryConfig.
After reboot, virtio-balloon can use this size to inflate back to
the size before reboot.

Signed-off-by: Hui Zhu <teawater@antfin.com>
2020-07-07 17:25:13 +01:00
Hui Zhu
8ffbc3d031 vmm: api: ch-remote: Add balloon to VmResizeData
Signed-off-by: Hui Zhu <teawater@antfin.com>
2020-07-07 17:25:13 +01:00
Wei Liu
a4f484bc5e hypervisor: Define a VM-Exit abstraction
In order to move the hypervisor specific parts of the VM exit handling
path, we're defining a generic, hypervisor agnostic VM exit enum.

This is what the hypervisor's Vcpu run() call should return when the VM
exit can not be completely handled through the hypervisor specific bits.
For KVM based hypervisors, this means directly forwarding the IO related
exits back to the VMM itself. For other hypervisors that e.g. rely on the
VMM to decode and emulate instructions, this means the decoding itself
would happen in the hypervisor crate exclusively, and the rest of the VM
exit handling would be handled through the VMM device model implementation.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>

Fix test_vm unit test by using the new abstraction and dropping some
dead code.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-07-06 12:59:43 +01:00
Wei Liu
cfa758fbb1 vmm, hypervisor: introduce and use make_user_memory_region
This removes the last KVM-ism from memory_manager. Also make use of that
method in other places.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-07-06 12:31:19 +02:00
Samuel Ortiz
acfe5eb94f vmm: vm: Rename fd variable into something more meaningful
The fd naming is quite KVM specific. Since we're now using the
hypervisor crate abstractions, we can rename those into something more
readable and meaningful.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-07-06 09:35:30 +01:00
Rob Bradford
2a6eb31d5b vm-virtio, virtio-devices: Split device implementation from virt queues
Split the generic virtio code (queues and device type) from the
VirtioDevice trait, transport and device implementations.

This also simplifies the feature handling in vhost_user_backend as the
vm-virtio crate is no longer has any features.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-07-02 17:09:28 +01:00
dependabot-preview[bot]
f3c8f827cc build(deps): bump linux-loader from 2a62f21 to ec930d7
Bumps [linux-loader](https://github.com/rust-vmm/linux-loader) from `2a62f21` to `ec930d7`.
- [Release notes](https://github.com/rust-vmm/linux-loader/releases)
- [Commits](2a62f21b44...ec930d700f)

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-06-30 07:05:06 +00:00
Sebastien Boeuf
86377127df vmm: Resume devices after vCPUs have been resumed
Because we don't want the guest to miss any event triggered by the
emulation of devices, it is important to resume all vCPUs before we can
resume the DeviceManager with all its associated devices.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-06-25 12:01:34 +02:00
Wei Liu
1741af74ed hypervisor: add safety statement in set_user_memory_region
When set_user_memory_region was moved to hypervisor crate, it was turned
into a safe function that wrapped around an unsafe call. All but one
call site had the safety statements removed. But safety statement was
not moved inside the wrapper function.

Add the safety statement back to help reasoning in the future. Also
remove that one last instance where the safety statement is not needed .

No functional change.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-06-25 10:25:13 +02:00
Wei Liu
b27439b6ed arch, hypervisor, vmm: KvmHyperVisor -> KvmHypervisor
"Hypervisor" is one word. The "v" shouldn't be capitalised.

No functional change.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-06-25 10:25:13 +02:00
Wei Liu
b00171e17d vmm: use MemoryRegion where applicable
That removes one more KVM-ism in VMM crate.

Note that there are more KVM specific code in those files to be split
out, but we're not at that stage yet.

No functional change.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-06-25 10:25:13 +02:00
Rob Bradford
d983c0a680 vmm: Expose counters from virtio devices to API
Collate the virtio device counters in DeviceManager for each device that
exposes any and expose it through the recently added HTTP API.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-06-25 07:02:44 +02:00
Rob Bradford
bca8a19244 vmm: Implement HTTP API for obtaining counters
The counters are a hash of device name to hash of counter name to u64
value. Currently the API is only implemented with a stub that returns an
empty set of counters.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-06-25 07:02:44 +02:00
Sebastien Boeuf
8038161861 vmm: Get and set clock during pause and resume operations
In order to maintain correct time when doing pause/resume and
snapshot/restore operations, this patch stores the clock value
on pause, and restore it on resume. Because snapshot/restore
expects a VM to be paused before the snapshot and paused after
the restore, this covers the migration use case too.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-06-23 14:36:01 +01:00
Sebastien Boeuf
8a165b5314 vmm: Restore the VM in "paused" state
Because we need to pause the VM before it is snapshot, it should be
restored in a paused state to keep the sequence symmetrical. That's the
reason why the state machine regarding the valid VM's state transition
needed to be updated accordingly.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-06-23 10:15:03 +02:00
Muminul Islam
e4dee57e81 arch, pci, vmm: Initial switch to the hypervisor crate
Start moving the vmm, arch and pci crates to being hypervisor agnostic
by using the hypervisor trait and abstractions. This is not a complete
switch and there are still some remaining KVM dependencies.

Signed-off-by: Muminul Islam <muislam@microsoft.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-06-22 15:03:15 +02:00
Wei Liu
fb461c820f vmm: vm: enable test_vm test case
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-06-12 14:46:58 +01:00
Wei Liu
b99b5777bb vmm: vm: move some imports into test_vm
They are only needed there. Not moving them causes rustc to complain
about unused imports.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-06-12 14:46:58 +01:00
Sebastien Boeuf
83cd9969df vmm: Enable HTTP response for PCI device hotplug
This patch completes the series by connecting the dots between the HTTP
frontend and the device manager backend.

Any request to hotplug a VFIO, disk, fs, pmem, net, or vsock device will
now return a response including the device name and the place of the
device in the PCI topology.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-06-12 13:37:18 +01:00
Sebastien Boeuf
3316348d4c vmm: vm: Carry information from hotplugged PCI device
Pass from the device manager to the calling code the information about
the PCI device that has just been hotplugged.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-06-12 13:37:18 +01:00
Wei Liu
5ebd02a572 vmm: vm: fix test_vm test case
We should break out from the loop after getting the HLT exit, otherwise
the VM hangs forever.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-06-12 08:38:07 +02:00
Michael Zhao
5cd1730bc4 vmm: Configure VM on AArch64
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2020-06-11 15:00:17 +01:00
Anatol Belski
abd6204d27 source: Fix file permissions
Rust sources and some data files should not be executable. The perms are
set to 644.

Signed-off-by: Anatol Belski <ab@php.net>
2020-06-10 18:47:27 +01:00
Samuel Ortiz
d24aa72d3e vfio: Rename to vfio-ioctls
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-06-04 08:48:55 +02:00
Michael Zhao
969e5e0b51 vmm: Split configure_system() from load_kernel() for x86_64
Now the flow of both architectures are aligned to:
1. load kernel
2. create VCPU's
3. configure system
4. start VCPU's

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2020-06-03 11:27:29 +02:00
Michael Zhao
20cf21cd9d vmm: Change booting process to cover AArch64 requirements
Between X86 and AArch64, there is some difference in booting a VM:
- X86_64 can setup IOAPIC before creating any VCPU.
- AArch64 have to create VCPU's before creating GIC.

The old process is:
1. load_kernel()
    load kernel binary
    configure system
2. activate_vcpus()
    create & start VCPU's

So we need to separate "activate_vcpus" into "create_vcpus" and
"activate_vcpus" (to start vcpus only). Setup GIC and create FDT
between the 2 steps.

The new procedure is:
1. load_kernel()
    load kernel binary
    (X86_64) configure system
2. create VCPU's
3. (AArch64) setup GIC
4. (AArch64) configure system
5. start VCPU's

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2020-06-03 11:27:29 +02:00
Michael Zhao
1befae872d build: Fixed build errors and warnings on AArch64
This is a preparing commit to build and test CH on AArch64. All building
issues were fixed, but no functionality was introduced.
For X86, the logic of code was not changed at all.
For ARM, the architecture specific part is still empty. And we applied
some tricks to workaround lint warnings. But such code will be replaced
later by other commits with real functionality.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2020-05-21 11:56:26 +01:00
Rob Bradford
9ccc7daa83 build, vmm: Update to latest kvm-ioctls
The ch branch has been rebased to incorporate the latest upstream code
requiring a small change to the unit tests.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-05-13 17:14:49 +02:00
Rob Bradford
31bde4f5da vmm: Unpark the DeviceManager threads in shutdown
To ensure that the DeviceManager threads (such as those used for virtio
devices) are cleaned up it is necessary to unpark them so that they get
cleanly terminated as part of the shutdown.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-05-07 09:00:14 +02:00
Rob Bradford
cd60de8f7f Revert "vmm: vm: Unpark the threads before shutdown when the current state is paused"
This reverts commit e1a07ce3c4.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-05-07 09:00:14 +02:00
Sebastien Boeuf
adf297066d vmm: Create devices in different path if restoring the VM
In case the VM is created from scratch, the devices should be created
after the DeviceManager has been created. But this should not affect the
restore codepath, as in this case the devices should be created as part
of the restore() function.

It's necessary to perform this differentiation as the restore must go
through the following steps:
- Create the DeviceManager
- Restore the DeviceManager with the right state
- Create the devices based on the restored DeviceManager's device tree
- Restore each device based on the restored DeviceManager's device tree

That's why this patch leverages the recent split of the DeviceManager's
creation to achieve what's needed.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-05-05 16:08:42 +02:00