Commit Graph

31 Commits

Author SHA1 Message Date
Michael Zhao
a95b6bbd8b vmm: Add seccomp rules for starting vhost-user-net backend on AArch64
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2020-08-31 08:19:23 +02:00
Sebastien Boeuf
1b4591aecc vmm: memory_manager: Apply NUMA policy to memory zones
Relying on the new option 'host_numa_node' from the 'memory-zone'
parameter, the user can now define which NUMA node from the host
should be used to back the current memory zone.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-08-27 08:39:38 -07:00
Sebastien Boeuf
bdef54ead6 vmm: Add brk syscall to the API thread
The brk syscall is not always called as the system might not need it.
But when it's needed from the API thread, this causes the thread to
terminate as it is not part of the authorized list of syscalls.

This should fix some sporadic failures on the CI with the musl build.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-08-11 15:04:21 +01:00
Jose Carlos Venegas Munoz
90acb01bad vmm: seccomp: add mprotect to API thread filter
Add mprotect to API thread rules. Prevent the VMM is
killed when it is used.

Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
2020-08-05 21:35:21 +01:00
Bo Chen
8e74637ebb main, vmm: seccomp: Add the '--seccomp log' option
This patch extends the CLI option '--seccomp' to accept the 'log'
parameter in addition 'true/false'. It also refactors the
vmm::seccomp_filters module to support both "SeccompAction::Trap" and
"SeccompAction::Log".

Fixes: #1180

Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-08-04 11:40:49 +02:00
Bo Chen
b41884a406 main, vmm: seccomp: Use SeccompAction instead of SeccompLevel
This patch replaces the usage of 'SeccompLevel' with 'SeccompAction',
which is the first step to support the 'log' action over system
calls that are not on the allowed list of seccomp filters.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-08-04 11:40:49 +02:00
Sebastien Boeuf
917027c55b vmm: Rely on virtio-blk io_uring when possible
In case the host supports io_uring and the specific io_uring options
needed, the VMM will choose the asynchronous version of virtio-blk.
This will enable better I/O performances compared to the default
synchronous version.

This is also important to note the VMM won't be able to use the
asynchronous version if the backend image is in QCOW format.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-08-03 14:15:01 +01:00
Jianyong Wu
d24b110519 seccomp: AArch64: Add SYS_unlinkat to seccomp whitelist
This commit fixes an "Bad syscall" error when shutting down the VM
on AArch64 by adding the SYS_unlinkat syscall to the seccomp
whitelist.

Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2020-07-27 07:25:07 +00:00
Michael Zhao
726e45e0ce vmm: Divide Seccomp KVM IOCTL rules by architecture
Refactored the construction of KVM IOCTL rules for Seccomp.
Separating the rules by architecture can reduce the risk of bugs and
attacks.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2020-07-06 13:40:38 +01:00
Michael Zhao
8820e9e133 vmm: Fix Seccomp filter for AArch64
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2020-07-02 08:46:24 +01:00
Sebastien Boeuf
e35d4c5b28 hypervisor: Store all supported MSRs
On x86 architecture, we need to save a list of MSRs as part of the vCPU
state. By providing the full list of MSRs supported by KVM, this patch
fixes the remaining snapshot/restore issues, as the vCPU is restored
with all its previous states.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-06-30 14:03:03 +01:00
Sebastien Boeuf
e2b5c78dc5 hypervisor: Re-order vCPU state for storing and restoring
Some vCPU states such as MP_STATE can be modified while retrieving
other states. For this reason, it's important to follow a specific
order that will ensure a state won't be modified after it has been
saved. Comments about ordering requirements have been copied over
from Firecracker commit 57f4c7ca14a31c5536f188cacb669d2cad32b9ca.

This patch also set the previously saved VCPU_EVENTS, as this was
missing from the restore codepath.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-06-30 14:03:03 +01:00
Sebastien Boeuf
9f4714c32a vmm: Extend seccomp filters with KVM_KVMCLOCK_CTRL
Now that the VMM uses KVM_KVMCLOCK_CTRL from the KVM API, it must be
added to the seccomp filters list.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-06-24 12:38:56 +02:00
Sebastien Boeuf
f5150aa261 vmm: Extend seccomp filters with KVM_GET_CLOCK and KVM_SET_CLOCK
Now that the VMM uses both KVM_GET_CLOCK and KVM_SET_CLOCK from the KVM
API, they must be added to the seccomp filters list.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-06-23 14:36:01 +01:00
Sebastien Boeuf
a998e89375 build(deps): bump signal-hook from 0.1.15 to 0.1.16
Bumps [signal-hook](https://github.com/vorner/signal-hook) from 0.1.15
to 0.1.16.
- [Release notes](https://github.com/vorner/signal-hook/releases)
- [Changelog](https://github.com/vorner/signal-hook/blob/master/CHANGELOG.md)
- [Commits](vorner/signal-hook@v0.1.15...v0.1.16)

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-06-22 14:09:11 +01:00
Rob Bradford
929d70bc7f net_util: Only try and enable the TAP device if it not already enabled
This allows an existing TAP interface to be used without needing
CAP_NET_ADMIN permissions on the Cloud Hypervisor binary as the ioctl to
bring up the interface is avoided.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-06-08 17:56:10 +02:00
Rob Bradford
c31ad72ee9 build: Address issues found by 1.43.0 clippy
These are mostly due to use of "bare use" statements and unnecessary vector
creation.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-05-27 19:32:12 +02:00
Rob Bradford
0728bece0c vmm: seccomp: Ensure that umask() can be reprogrammed
When doing self spawning the child will attempt to set the umask() again. Let
it through the seccomp rules so long as it the safe mask again.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-05-27 16:46:51 +01:00
Michael Zhao
1befae872d build: Fixed build errors and warnings on AArch64
This is a preparing commit to build and test CH on AArch64. All building
issues were fixed, but no functionality was introduced.
For X86, the logic of code was not changed at all.
For ARM, the architecture specific part is still empty. And we applied
some tricks to workaround lint warnings. But such code will be replaced
later by other commits with real functionality.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2020-05-21 11:56:26 +01:00
Rob Bradford
11049401ce vmm: seccomp: Add ioctl() commands interface hardware address
This is necessary to support setting the MAC address on the tap
interface on the host.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-05-15 11:45:09 +01:00
Sebastien Boeuf
68fc432978 vmm: Update seccomp filters with clock_nanosleep
The clock_nanosleep system call needs to be whitelisted since the commit
12e00c0f45 introduced the use of a sleep()
function. Without this patch, we can see an error when the VM is paused
or killed.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-05-15 12:34:53 +02:00
Rob Bradford
8cef35745b vmm: seccomp: Add fork, gettid and pipe2 syscalls to permitted list
This is needed for self spawning with the musl target.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-29 17:57:01 +01:00
Rob Bradford
ce7678f29f vmm: seccomp: Add tkill syscall to permitted list
This is needed for rebooting on the musl target.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-29 17:57:01 +01:00
Rob Bradford
12758d7fad vmm: seccomp: Add epoll_pwait syscall to permitted list
This is needed for basic operation on the musl target.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-04-29 17:57:01 +01:00
Bo Chen
3f42f86d81 vmm: Add the 'shared' and 'hugepages' controls to MemoryConfig
The new 'shared' and 'hugepages' controls aim to replace the 'file'
option in MemoryConfig. This patch also updated all related integration
tests to use the new controls (instead of providing explicit paths to
"/dev/shm" or "/dev/hugepages").

Fixes: #1011

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-04-23 21:39:51 +02:00
Sebastien Boeuf
b1554642e4 vmm: seccomp: Add missing mremap() syscall
While testing self spawned vhost-user backends, it appeared that the
backend was aborting due to a missing system call in the seccomp
filters. mremap() was the culprit and this patch simply adds it to the
whitelist.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-14 14:11:41 +02:00
Sebastien Boeuf
b9f9f01fcc vmm: Extend seccomp filters to allow snapshot/restore
A few KVM ioctls were missing in order to perform both snapshot and
restore while keeping seccomp enabled.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-07 12:26:10 +02:00
Sebastien Boeuf
2d17f4384a vmm: seccomp: Add missing open() syscall
On some systems, the open() system call is used by Cloud-Hypervisor,
that's why it should be part of the seccomp filters whitelist.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-04-02 09:56:48 +02:00
Sebastien Boeuf
e4ea8b0bef vmm: Add missing syscalls to the seccomp filters
Both clock_gettime and gettimeofday syscalls where missing when running
Cloud-Hypervisor on a Linux host without vDSO enabled. On a system with
vDSO enabled, the syscalls performed by vDSO were not filtered, that's
why we didn't have to whitelist them.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-27 16:50:52 +00:00
Sebastien Boeuf
feb8d7ae90 vmm: Separate seccomp filters between VMM and API threads
This separates the filters used between the VMM and API threads, so that
we can apply different rules for each thread.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-24 14:59:57 +01:00
Sebastien Boeuf
cb98d90097 vmm: Create new seccomp_filter module
Based on the seccomp crate, we create a new vmm module responsible for
creating a seccomp filter that will be applied to the VMM main thread.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-03-24 14:59:57 +01:00