Commit Graph

2366 Commits

Author SHA1 Message Date
Sebastien Boeuf
e35d4c5b28 hypervisor: Store all supported MSRs
On x86 architecture, we need to save a list of MSRs as part of the vCPU
state. By providing the full list of MSRs supported by KVM, this patch
fixes the remaining snapshot/restore issues, as the vCPU is restored
with all its previous states.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-06-30 14:03:03 +01:00
Sebastien Boeuf
49b4fba283 hypervisor: Retrieve list of supported MSRs
Add a new function to the hypervisor trait so that the caller can
retrieve the list of MSRs supported by this hypervisor.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-06-30 14:03:03 +01:00
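As a rough illustration of commit 49b4fba283, the sketch below shows how a caller might ask a hypervisor for its supported MSR indices and prepare one entry per index. The names `Hypervisor`, `get_msr_list`, `MsrEntry`, `build_msr_entries` and `FakeHypervisor` are assumptions made for this example and are not taken from the actual crate.

```rust
/// Hypothetical, trimmed-down hypervisor trait (an assumption for this sketch).
pub trait Hypervisor {
    /// Returns the indices of the MSRs this hypervisor supports.
    fn get_msr_list(&self) -> Result<Vec<u32>, String>;
}

/// One saved MSR: its index and the value read from the vCPU.
#[derive(Debug, Clone, PartialEq)]
pub struct MsrEntry {
    pub index: u32,
    pub data: u64,
}

/// Builds one zeroed entry per supported MSR, ready to be filled in
/// when the vCPU state is saved.
pub fn build_msr_entries(hypervisor: &dyn Hypervisor) -> Result<Vec<MsrEntry>, String> {
    let indices = hypervisor.get_msr_list()?;
    Ok(indices
        .into_iter()
        .map(|index| MsrEntry { index, data: 0 })
        .collect())
}

/// Fake hypervisor used only for illustration.
pub struct FakeHypervisor;

impl Hypervisor for FakeHypervisor {
    fn get_msr_list(&self) -> Result<Vec<u32>, String> {
        // IA32_TIME_STAMP_COUNTER (0x10) and IA32_EFER (0xC000_0080).
        Ok(vec![0x10, 0xC000_0080])
    }
}
```

Keeping the list behind a trait method means the caller never hard-codes which MSRs exist on a given hypervisor.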
Sebastien Boeuf
e2b5c78dc5 hypervisor: Re-order vCPU state for storing and restoring
Some vCPU states such as MP_STATE can be modified while retrieving
other states. For this reason, it's important to follow a specific
order that will ensure a state won't be modified after it has been
saved. Comments about ordering requirements have been copied over
from Firecracker commit 57f4c7ca14a31c5536f188cacb669d2cad32b9ca.

This patch also sets the previously saved VCPU_EVENTS, as this was
missing from the restore codepath.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-06-30 14:03:03 +01:00
Wei Liu
2b8accf49a vmm: interrupt: put KVM code into a kvm module
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-06-30 12:09:42 +01:00
Wei Liu
c31e747005 vmm: interrupt: generify impl InterruptManager for MsiInterruptManager
The logic can be shared among hypervisor implementations.

The 'static bound is used so that we don't need to deal with an extra
lifetime parameter everywhere. It should be okay because we know the
entry type E doesn't contain any reference.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-06-30 12:09:42 +01:00
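The role of the 'static bound in commit c31e747005 can be sketched as follows. This is a simplified model, not the project's code: the trait and struct names mirror the commit titles, but the fields and the methods `gsi_count`, `routed`, `set_route` and `create_group` are assumptions for illustration. The point is that returning a boxed trait object with no explicit lifetime requires the concrete group type, and therefore `E`, to be 'static.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Simplified group trait (the methods are assumptions for this sketch).
pub trait InterruptSourceGroup {
    fn gsi_count(&self) -> usize;
    fn routed(&self) -> usize;
}

/// Group generic over the hypervisor-specific entry type E.
pub struct MsiInterruptGroup<E> {
    routes: Arc<Mutex<HashMap<u32, E>>>,
    gsis: Vec<u32>,
}

impl<E> InterruptSourceGroup for MsiInterruptGroup<E> {
    fn gsi_count(&self) -> usize {
        self.gsis.len()
    }

    /// Counts how many of this group's GSIs currently have a route.
    fn routed(&self) -> usize {
        let routes = self.routes.lock().unwrap();
        self.gsis.iter().filter(|gsi| routes.contains_key(*gsi)).count()
    }
}

pub struct MsiInterruptManager<E> {
    routes: Arc<Mutex<HashMap<u32, E>>>,
}

// `Box<dyn InterruptSourceGroup>` is short for
// `Box<dyn InterruptSourceGroup + 'static>`, so E must be 'static here.
// Without the bound, every signature would need an explicit lifetime.
impl<E: 'static> MsiInterruptManager<E> {
    pub fn new() -> Self {
        MsiInterruptManager {
            routes: Arc::new(Mutex::new(HashMap::new())),
        }
    }

    pub fn set_route(&self, gsi: u32, entry: E) {
        self.routes.lock().unwrap().insert(gsi, entry);
    }

    pub fn create_group(&self, base: u32, count: u32) -> Box<dyn InterruptSourceGroup> {
        Box::new(MsiInterruptGroup {
            routes: Arc::clone(&self.routes),
            gsis: (base..base + count).collect(),
        })
    }
}
```

An entry type that owns all its data, such as a plain struct of integers, satisfies the bound automatically.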
Wei Liu
ade904e356 vmm: interrupt: generify impl InterruptSourceGroup for MsiInterruptGroup
At this point we can use the same logic for all hypervisor
implementations.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-06-30 12:09:42 +01:00
Wei Liu
2b466ed80c vmm: interrupt: provide MsiInterruptGroupOps trait
Currently it only contains a function named set_gsi_routes.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-06-30 12:09:42 +01:00
Wei Liu
b2abead65b vmm: interrupt: provide and use extension trait RoutingEntryExt
This trait contains a function which produces an interrupt routing entry.

Implement that trait for KvmRoutingEntry and rewrite the update
function.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-06-30 12:09:42 +01:00
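The extension-trait approach described in commit b2abead65b might look roughly like the sketch below. Only the names `RoutingEntryExt` and `KvmRoutingEntry` come from the commit message. The `KvmRoute` and `MsiConfig` structs, their fields, and the `make_entry` signature are assumptions chosen to keep the example self-contained.

```rust
/// Simplified stand-in for a KVM MSI route (fields are assumptions).
#[derive(Debug, Clone, PartialEq)]
pub struct KvmRoute {
    pub gsi: u32,
    pub address: u64,
    pub data: u32,
}

/// Generic routing entry: only `route` is hypervisor specific.
#[derive(Debug, Clone, PartialEq)]
pub struct RoutingEntry<E> {
    pub route: E,
    pub masked: bool,
}

/// Alias kept so existing call sites do not need to change.
pub type KvmRoutingEntry = RoutingEntry<KvmRoute>;

/// Hypothetical MSI configuration as programmed by the guest.
pub struct MsiConfig {
    pub high_addr: u32,
    pub low_addr: u32,
    pub data: u32,
}

/// Extension trait: each hypervisor implements it for its own entry type.
pub trait RoutingEntryExt {
    fn make_entry(gsi: u32, config: &MsiConfig) -> Self;
}

impl RoutingEntryExt for KvmRoutingEntry {
    fn make_entry(gsi: u32, config: &MsiConfig) -> Self {
        RoutingEntry {
            route: KvmRoute {
                gsi,
                // Combine the two 32-bit halves into one 64-bit address.
                address: (u64::from(config.high_addr) << 32) | u64::from(config.low_addr),
                data: config.data,
            },
            masked: false,
        }
    }
}
```

Generic code can then be written against `E: RoutingEntryExt` and call `E::make_entry` without knowing which hypervisor is in use.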
Wei Liu
4dbca81b86 vmm: interrupt: rename set_kvm_gsi_routes to set_gsi_routes
This function will be used to commit routing information to the
hypervisor.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-06-30 12:09:42 +01:00
Wei Liu
fd7b42e54d vmm: interrupt: inline mask_kvm_entry
The logic for looking up the correct interrupt can be shared among
hypervisors.

No functional change.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-06-30 12:09:42 +01:00
Wei Liu
0ec39da90c vmm: interrupt: generify KvmMsiInterruptManager
The observation is that only the route entry is hypervisor dependent.

Keep a definition of KvmMsiInterruptManager to avoid too much code
churn.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-06-30 12:09:42 +01:00
Wei Liu
d5149e95cb vmm: interrupt: generify KvmRoutingEntry and KvmMsiInterruptGroup
The observation is that only the route field is hypervisor specific.

Provide a new function in blanket implementation. Also redefine
KvmRoutingEntry with RoutingEntry to avoid code churn.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-06-30 12:09:42 +01:00
Wei Liu
637f58bcd9 vmm: interrupt: drop Kvm prefix from KvmLegacyUserspaceInterruptManager
This data structure doesn't contain KVM specific stuff.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-06-30 12:09:42 +01:00
Wei Liu
574cab6990 vmm: interrupt: create GSI hashmap directly
The observation is that the GSI hashmap remains untouched before getting
passed into the MSI interrupt manager. We can create that hashmap
directly in the interrupt manager's new function.

This drops one import from the interrupt module.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-06-30 12:09:42 +01:00
dependabot-preview[bot]
f3c8f827cc build(deps): bump linux-loader from 2a62f21 to ec930d7
Bumps [linux-loader](https://github.com/rust-vmm/linux-loader) from `2a62f21` to `ec930d7`.
- [Release notes](https://github.com/rust-vmm/linux-loader/releases)
- [Commits](2a62f21b44...ec930d700f)

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-06-30 07:05:06 +00:00
Rob Bradford
fbbe348447 arch: x86-64: Add missing End of Table entry
The OVMF firmware loops around looking for an entry marking the end of
the table. Without this entry, processing the tables becomes an infinite loop.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-06-30 08:15:48 +02:00
Rob Bradford
2c3c335de6 arch: x86_64: Add basic SMBIOS support
Taken from crosvm: 44336b913126d73f9f8d6854f57aac92b5db809e and adapted
for Cloud Hypervisor.

This is basic and incomplete support but Linux correctly finds the DMI
data based on this:

root@clr-c6ed47bc1c9d473d9a3a8bddc50ee4cb ~ # dmesg | grep -i dmi
[    0.000000] DMI: Cloud Hypervisor cloud-hypervisor, BIOS 0

root@clr-c6ed47bc1c9d473d9a3a8bddc50ee4cb ~ # dmesg | grep -i smbio
[    0.000000] SMBIOS 3.2.0 present.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-06-30 08:15:48 +02:00
Wei Liu
24c051c663 vmm: hypervisor: drop duplicate comment
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-06-29 21:51:59 +01:00
Wei Liu
2518b9e3cd vmm: hypervisor: fix white space issues
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-06-29 21:51:59 +01:00
dependabot-preview[bot]
5ac3299234 build(deps): bump serde_json from 1.0.55 to 1.0.56
Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.55 to 1.0.56.
- [Release notes](https://github.com/serde-rs/json/releases)
- [Commits](https://github.com/serde-rs/json/compare/v1.0.55...v1.0.56)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-06-29 19:54:47 +00:00
dependabot-preview[bot]
58980067f4 build(deps): bump ssh2 from 0.8.1 to 0.8.2
Bumps [ssh2](https://github.com/alexcrichton/ssh2-rs) from 0.8.1 to 0.8.2.
- [Release notes](https://github.com/alexcrichton/ssh2-rs/releases)
- [Commits](https://github.com/alexcrichton/ssh2-rs/compare/0.8.1...0.8.2)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-06-29 18:02:38 +00:00
dependabot-preview[bot]
d2780a6575 build(deps): bump unicode-width from 0.1.7 to 0.1.8
Bumps [unicode-width](https://github.com/unicode-rs/unicode-width) from 0.1.7 to 0.1.8.
- [Release notes](https://github.com/unicode-rs/unicode-width/releases)
- [Commits](https://github.com/unicode-rs/unicode-width/commits/v0.1.8)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-06-29 17:37:40 +00:00
Samuel Ortiz
5a6b8d6323 dev_cli: Add a shell command
And drop the caller into a privileged root shell.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-06-29 18:17:39 +01:00
Henry Wang
d824d55562 unit tests: Fix broken AArch64 unit tests
https://github.com/cloud-hypervisor/cloud-hypervisor/pull/1225
introduces a hypervisor abstraction crate, which breaks some of
the unit test cases on AArch64. This commit fixes related test
cases.

Signed-off-by: Henry Wang <henry.wang@arm.com>
2020-06-29 18:00:42 +01:00
Henry Wang
462c58d58b tests: Enable AArch64 Jenkins CI with unit tests for GNU
This commit enables the AArch64 Jenkins CI, which builds and runs the
unit tests with the GNU toolchain.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-06-29 18:00:42 +01:00
dependabot-preview[bot]
ba3f1bcde2 build(deps): bump cc from 1.0.55 to 1.0.56
Bumps [cc](https://github.com/alexcrichton/cc-rs) from 1.0.55 to 1.0.56.
- [Release notes](https://github.com/alexcrichton/cc-rs/releases)
- [Commits](https://github.com/alexcrichton/cc-rs/compare/1.0.55...1.0.56)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-06-29 16:20:33 +00:00
Rob Bradford
6ee9903601 docs: Update API documentation
Update the API documentation to reflect that the hotplug APIs return
data about the device as well as the newly added /vm.counters API.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-06-27 00:07:47 +02:00
Rob Bradford
522d8c8412 vmm: openapi: Add the /vm.counters API entry point
This is a hash table of string to hash tables of u64s. In JSON these
hash tables are object types.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-06-27 00:07:47 +02:00
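To make the shape described in commit 522d8c8412 concrete, the sketch below models the counters as a map of device name to a map of counter name to u64, and renders it as nested JSON objects. It is only an illustration: the device and counter names used with it are hypothetical, `BTreeMap` is used instead of a hash table purely to get a stable key order, and the hand-written `to_json` helper assumes keys need no escaping (a real implementation would more likely use a JSON library).

```rust
use std::collections::BTreeMap;

/// Device name -> (counter name -> value).
pub type Counters = BTreeMap<String, BTreeMap<String, u64>>;

/// Renders the counters as a JSON object of objects.
/// Assumes keys contain no characters that need JSON escaping.
pub fn to_json(counters: &Counters) -> String {
    let devices: Vec<String> = counters
        .iter()
        .map(|(device, values)| {
            let fields: Vec<String> = values
                .iter()
                .map(|(name, value)| format!("\"{}\":{}", name, value))
                .collect();
            format!("\"{}\":{{{}}}", device, fields.join(","))
        })
        .collect();
    format!("{{{}}}", devices.join(","))
}

/// Convenience helper to record one counter value for a device.
pub fn set_counter(counters: &mut Counters, device: &str, name: &str, value: u64) {
    counters
        .entry(device.to_string())
        .or_default()
        .insert(name.to_string(), value);
}
```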
Muminul Islam
72ae1577ed hypervisor: Update license to Apache-2.0 OR BSD-3-Clause
Initially the licensing was just Apache-2.0. This patch changes
the licensing to a dual license, Apache-2.0 OR BSD-3-Clause.

Signed-off-by: Muminul Islam <muislam@microsoft.com>
2020-06-27 00:06:39 +02:00
dependabot-preview[bot]
e1ea06e74a build(deps): bump cc from 1.0.54 to 1.0.55
Bumps [cc](https://github.com/alexcrichton/cc-rs) from 1.0.54 to 1.0.55.
- [Release notes](https://github.com/alexcrichton/cc-rs/releases)
- [Commits](https://github.com/alexcrichton/cc-rs/compare/1.0.54...1.0.55)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-06-26 04:55:46 +00:00
dependabot-preview[bot]
19ce0a37a4 build(deps): bump epoll from 4.3.0 to 4.3.1
Bumps [epoll](https://github.com/nathansizemore/epoll) from 4.3.0 to 4.3.1.
- [Release notes](https://github.com/nathansizemore/epoll/releases)
- [Commits](https://github.com/nathansizemore/epoll/commits/4.3.1)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2020-06-25 14:03:04 +00:00
Sebastien Boeuf
a7f0f9dfea vm-virtio: Ensure pause event is caught by every virtio thread
Each virtio thread was reading/draining the pause_evt pipe when
detecting the associated event. Problem is, when a virtio device has
multiple threads, they all share the same pause_evt pipe, which can
prevent some threads from receiving the event. If the first thread to
catch the event is quickly clearing the pipe, some other threads might
simply miss the event and they will not enter the "paused" state as
expected.

This is a behavior that was spotted with virtio-net as it usually uses
2 threads by default (1 for TX/RX queues and 1 for the control queue).

The way to solve this issue is by letting each thread drain the pipe
during the resume codepath, that is after the thread has been unparked.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-06-25 12:01:34 +02:00
Sebastien Boeuf
86377127df vmm: Resume devices after vCPUs have been resumed
Because we don't want the guest to miss any event triggered by the
emulation of devices, it is important to resume all vCPUs before we can
resume the DeviceManager with all its associated devices.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-06-25 12:01:34 +02:00
Sebastien Boeuf
f6eeba781b vmm: Save and restore vCPU states during pause/resume operations
We need consistency between pause/resume and snapshot/restore
operations. The symmetrical behavior of pausing/snapshotting
and restoring/resuming has been introduced recently, and we must
now ensure that no matter if we're using pause/resume or
snapshot/restore features, the resulting VM should be running in
the exact same way.

That's why the vCPU state is now stored upon VM pausing, with the
snapshot operation being a simple serialization of the previously saved
state. In the same way, the vCPU state is now restored upon VM resuming,
with the restore operation being a simple deserialization of the
previously saved state.

It's interesting to note that this patch ensures time consistency from a
guest perspective, no matter which clocksource is being used. From a
previous patch, the KVM clock was saved/restored upon VM pause/resume.
We now have the same behavior for TSC, as the TSC from each vCPU is
saved/restored upon VM pause/resume too.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-06-25 12:01:34 +02:00
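The pause/snapshot and restore/resume symmetry described in commit f6eeba781b can be modelled with a small sketch. Everything here is a simplified assumption: `CpuState` stands in for the real vCPU state, the `live` field stands in for state held by the hypervisor, and serialization is reduced to cloning the saved value.

```rust
/// Stand-in for the real vCPU state (fields are assumptions).
#[derive(Debug, Clone, Default, PartialEq)]
pub struct CpuState {
    pub tsc: u64,
    pub rip: u64,
}

pub trait Pausable {
    fn pause(&mut self);
    fn resume(&mut self);
}

#[derive(Default)]
pub struct Vcpu {
    /// Models the state that lives inside the hypervisor.
    live: CpuState,
    /// State captured at pause time, or injected by a restore.
    saved_state: Option<CpuState>,
}

impl Pausable for Vcpu {
    /// Pausing captures the current state.
    fn pause(&mut self) {
        self.saved_state = Some(self.live.clone());
    }

    /// Resuming applies whatever state was saved or restored.
    fn resume(&mut self) {
        if let Some(state) = &self.saved_state {
            self.live = state.clone();
        }
    }
}

impl Vcpu {
    /// Snapshot only hands out the state saved by `pause`.
    pub fn snapshot(&self) -> Option<CpuState> {
        self.saved_state.clone()
    }

    /// Restore only stores the state; `resume` applies it.
    pub fn restore(&mut self, state: CpuState) {
        self.saved_state = Some(state);
    }

    /// Simulates guest execution advancing the TSC.
    pub fn run_for(&mut self, ticks: u64) {
        self.live.tsc += ticks;
    }

    pub fn live_state(&self) -> &CpuState {
        &self.live
    }
}
```

With this split, a VM that is paused and resumed and a VM that is snapshotted and restored both go through the same save and apply steps.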
Sebastien Boeuf
18e7d7a1f7 vmm: cpu: Resume before shutdown in a specific way
Instead of calling the resume() function from the CpuManager, which
involves more than what is needed from the shutdown codepath, and
potentially ends up with a deadlock, we replace it with a subset.

The full resume operation is reserved for a VM that has been paused.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-06-25 12:01:34 +02:00
Sebastien Boeuf
65132fb99d vmm: Implement Pausable trait for Vcpu
We want each Vcpu to store the vCPU state upon VM pausing. This is the
reason why we need to explicitly implement the Pausable trait for the
Vcpu structure.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-06-25 12:01:34 +02:00
Wei Liu
1741af74ed hypervisor: add safety statement in set_user_memory_region
When set_user_memory_region was moved to hypervisor crate, it was turned
into a safe function that wrapped around an unsafe call. All but one
call site had the safety statements removed. But the safety statement was
not moved inside the wrapper function.

Add the safety statement back to help reasoning in the future. Also
remove that one last instance where the safety statement is not needed.

No functional change.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-06-25 10:25:13 +02:00
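The pattern behind commit 1741af74ed, a safe function wrapping an unsafe call with the justification written next to the unsafe block, can be sketched as below. The `MemoryRegion` fields, the `raw_set_user_memory_region` stand-in and the wording of the safety comment are assumptions for illustration, not the crate's actual code.

```rust
/// Simplified description of a guest memory slot (fields are assumptions).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct MemoryRegion {
    pub slot: u32,
    pub guest_phys_addr: u64,
    pub memory_size: u64,
    pub userspace_addr: u64,
}

/// Hypothetical stand-in for the unsafe hypervisor call.
///
/// # Safety
/// The caller must guarantee that the region describes valid host memory
/// that does not overlap any other registered region.
unsafe fn raw_set_user_memory_region(region: &MemoryRegion) -> Result<(), String> {
    if region.memory_size == 0 {
        return Err(format!("slot {}: empty region", region.slot));
    }
    Ok(())
}

/// Safe wrapper: the safety argument lives here, once, instead of being
/// repeated (or forgotten) at every call site.
pub fn set_user_memory_region(region: MemoryRegion) -> Result<(), String> {
    // SAFETY (illustrative): in this sketch we assume the VMM only builds
    // regions from memory it has mapped itself and never lets them overlap.
    unsafe { raw_set_user_memory_region(&region) }
}
```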
Wei Liu
b27439b6ed arch, hypervisor, vmm: KvmHyperVisor -> KvmHypervisor
"Hypervisor" is one word. The "v" shouldn't be capitalised.

No functional change.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-06-25 10:25:13 +02:00
Wei Liu
b00171e17d vmm: use MemoryRegion where applicable
That removes one more KVM-ism in VMM crate.

Note that there is more KVM-specific code in those files to be split
out, but we're not at that stage yet.

No functional change.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-06-25 10:25:13 +02:00
Rob Bradford
48a05c4727 tests: Add simple counters integration test
Add a simple test to check that the data from the counters matches what
is expected and that the value of the counters increases after an
operation that will hit all counters.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-06-25 07:02:44 +02:00
Rob Bradford
980b49da94 vm-virtio: block: Implement counters for block device
Expose counters for read/write bytes/ops from the virtio block device.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-06-25 07:02:44 +02:00
Rob Bradford
d983c0a680 vmm: Expose counters from virtio devices to API
Collate the virtio device counters in DeviceManager for each device that
exposes any and expose it through the recently added HTTP API.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-06-25 07:02:44 +02:00
Rob Bradford
9b7afd4aac bin: ch-remote: Implement "counters" command
This is used to obtain the counters from the VM. The raw JSON data is
presented to the user.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-06-25 07:02:44 +02:00
Rob Bradford
bca8a19244 vmm: Implement HTTP API for obtaining counters
The counters are a hash of device name to hash of counter name to u64
value. Currently the API is only implemented with a stub that returns an
empty set of counters.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-06-25 07:02:44 +02:00
Rob Bradford
fd4aba8eae vmm: api: Implement support for GET handlers EndpointHandler
This can be used for simple API requests which return data but do not
require any input.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-06-25 07:02:44 +02:00
Rob Bradford
80be393b16 vmm: api: Order HTTP entry points in alphabetical order
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-06-25 07:02:44 +02:00
Rob Bradford
6713a3c859 vm-virtio: net: Expose network counters through VirtioDevice
Through the counters() function on the trait expose the accumulated
counters.

TEST=Observe that the counters from the VM match those from the tap on
the host (RX-TX inverted) and inside the guest (non inverted.)

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-06-25 07:02:44 +02:00
Rob Bradford
dd54883a07 vm-virtio: device: Extend the VirtioDevice trait to expose counters
The counters are a hash of counter name to (wrapping) u64 value. The
interpretation layer is responsible for converting this data into a
rate.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-06-25 07:02:44 +02:00
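Since commit dd54883a07 leaves rate computation to the interpretation layer, here is a minimal sketch of how such a layer might turn two samples of a wrapping u64 counter into a rate. The function name `rate_per_second` and its signature are assumptions for this example. It relies on `wrapping_sub`, so a single wrap-around between samples still yields the correct delta.

```rust
use std::time::Duration;

/// Computes a per-second rate from two samples of a wrapping counter.
/// Returns None if no time has elapsed between the samples.
pub fn rate_per_second(previous: u64, current: u64, elapsed: Duration) -> Option<f64> {
    let seconds = elapsed.as_secs_f64();
    if seconds <= 0.0 {
        return None;
    }
    // wrapping_sub gives the right delta even if the counter wrapped once.
    let delta = current.wrapping_sub(previous);
    Some(delta as f64 / seconds)
}
```

The sampling interval must be short enough that the counter cannot wrap more than once between two reads.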
Rob Bradford
2b4a0358de vm-virtio: net: Implement counters for network traffic
Add counters for RX/TX bytes and RX/TX frames. These are collected on a
per queue basis and then accumulated into an atomic shared value across
the different threads for the device as a whole.

Collecting and accumulating these counters makes minimal difference in
the iperf results. Any difference seen is within what is observed as
natural variation in this test.

e.g.

With counter updates:

$ iperf3 -c 192.168.249.2
Connecting to host 192.168.249.2, port 5201
[  5] local 192.168.249.1 port 52706 connected to 192.168.249.2 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  6.19 GBytes  53.2 Gbits/sec    0   3.01 MBytes
[  5]   1.00-2.00   sec  6.31 GBytes  54.2 Gbits/sec    0   3.01 MBytes
[  5]   2.00-3.00   sec  6.29 GBytes  54.0 Gbits/sec    0   3.01 MBytes
[  5]   3.00-4.00   sec  6.22 GBytes  53.4 Gbits/sec    0   3.01 MBytes
[  5]   4.00-5.00   sec  6.14 GBytes  52.8 Gbits/sec    0   3.01 MBytes
[  5]   5.00-6.00   sec  6.13 GBytes  52.7 Gbits/sec    0   3.01 MBytes
[  5]   6.00-7.00   sec  6.20 GBytes  53.3 Gbits/sec    0   3.01 MBytes
[  5]   7.00-8.00   sec  6.16 GBytes  52.9 Gbits/sec    0   3.01 MBytes
[  5]   8.00-9.00   sec  6.13 GBytes  52.6 Gbits/sec    0   3.01 MBytes
[  5]   9.00-10.00  sec  6.15 GBytes  52.8 Gbits/sec    0   3.01 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  61.9 GBytes  53.2 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  61.9 GBytes  53.2 Gbits/sec                  receiver

iperf Done.

Without counter updates:

$ iperf3 -c 192.168.249.2
Connecting to host 192.168.249.2, port 5201
[  5] local 192.168.249.1 port 52716 connected to 192.168.249.2 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  6.29 GBytes  54.1 Gbits/sec    0   3.03 MBytes
[  5]   1.00-2.00   sec  6.18 GBytes  53.1 Gbits/sec    0   3.03 MBytes
[  5]   2.00-3.00   sec  6.26 GBytes  53.8 Gbits/sec    0   3.03 MBytes
[  5]   3.00-4.00   sec  6.24 GBytes  53.6 Gbits/sec    0   3.03 MBytes
[  5]   4.00-5.00   sec  6.27 GBytes  53.9 Gbits/sec    1   3.03 MBytes
[  5]   5.00-6.00   sec  6.31 GBytes  54.2 Gbits/sec    0   3.03 MBytes
[  5]   6.00-7.00   sec  6.29 GBytes  54.1 Gbits/sec    0   3.03 MBytes
[  5]   7.00-8.00   sec  6.16 GBytes  52.9 Gbits/sec    0   3.03 MBytes
[  5]   8.00-9.00   sec  6.13 GBytes  52.6 Gbits/sec    0   3.03 MBytes
[  5]   9.00-10.00  sec  6.25 GBytes  53.7 Gbits/sec    0   3.03 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  62.4 GBytes  53.6 Gbits/sec    1             sender
[  5]   0.00-10.00  sec  62.4 GBytes  53.6 Gbits/sec                  receiver

iperf Done.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-06-25 07:02:44 +02:00
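The accumulation scheme from commit 2b4a0358de, per-queue counters folded into shared atomic values, can be sketched as follows. This is a simplified model rather than the device code: only the RX side is shown (TX would be analogous), and the type names, the `flush` method and the `simulate` helper are assumptions for illustration.

```rust
use std::num::Wrapping;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Device-wide totals shared by all queue threads.
#[derive(Default)]
pub struct NetCounters {
    pub rx_bytes: AtomicU64,
    pub rx_frames: AtomicU64,
}

/// Thread-local counters for a single queue: no atomics on the hot path.
#[derive(Default)]
pub struct QueueCounters {
    rx_bytes: Wrapping<u64>,
    rx_frames: Wrapping<u64>,
}

impl QueueCounters {
    pub fn record_rx(&mut self, frame_len: u64) {
        self.rx_bytes += Wrapping(frame_len);
        self.rx_frames += Wrapping(1);
    }

    /// Adds the local totals to the shared atomics and resets them.
    pub fn flush(&mut self, shared: &NetCounters) {
        shared.rx_bytes.fetch_add(self.rx_bytes.0, Ordering::AcqRel);
        shared.rx_frames.fetch_add(self.rx_frames.0, Ordering::AcqRel);
        *self = QueueCounters::default();
    }
}

/// Runs one thread per queue and returns the (bytes, frames) totals.
pub fn simulate(queues: usize, frames_per_queue: u64, frame_len: u64) -> (u64, u64) {
    let shared = Arc::new(NetCounters::default());
    let handles: Vec<_> = (0..queues)
        .map(|_| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                let mut local = QueueCounters::default();
                for _ in 0..frames_per_queue {
                    local.record_rx(frame_len);
                }
                local.flush(&shared);
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    (
        shared.rx_bytes.load(Ordering::Acquire),
        shared.rx_frames.load(Ordering::Acquire),
    )
}
```

Batching updates locally and touching the atomics only on flush is one way to keep the overhead small, in line with the minimal iperf difference reported in the commit message.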
Dr. David Alan Gilbert
0583ce921b vhost_user_fs: Allow fchmod in seccomp
This corresponds to QEMU's 63659fe74e76f5c52854 commit.
The setattr code uses both fchmod and fchmodat in different cases;
however, we only had fchmodat in the whitelist.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-06-24 21:56:58 +01:00