297 Commits

Author SHA1 Message Date
Sebastien Boeuf
932c8c9713 vmm: Add CPU affinity support
With the introduction of a new option `affinity` to the `cpus`
parameter, Cloud Hypervisor can now let the user choose the set
of host CPUs on which to run each vCPU.

This is useful when trying to achieve CPU pinning, as well as making
sure the VM runs on a specific NUMA node.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-11-12 09:40:37 +00:00
Rob Bradford
f8d9c073f0 vmm: Add "--platform"
This currently contains only the number of PCI segments to create.
This is limited to 16 at the moment, which should allow 496 user-specified
PCI devices.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
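The figure of 496 devices quoted in the message above can be reproduced with a little arithmetic. The sketch below assumes 32 device slots per segment with one slot per segment taken by the segment's own bridge; that reservation is an assumption made to explain the number, not something the commit message states, and all names are hypothetical.

```rust
/// Device slots on a single PCI bus (the device number is 5 bits wide).
const SLOTS_PER_SEGMENT: u32 = 32;
/// Assumption: one slot per segment is occupied by the segment's bridge.
const RESERVED_SLOTS_PER_SEGMENT: u32 = 1;
/// Segment limit stated in the commit message.
const MAX_PCI_SEGMENTS: u32 = 16;

/// Number of slots left for user-specified devices across `segments` segments.
fn user_device_slots(segments: u32) -> u32 {
    segments * (SLOTS_PER_SEGMENT - RESERVED_SLOTS_PER_SEGMENT)
}
```

Calling `user_device_slots(MAX_PCI_SEGMENTS)` gives 16 * 31 = 496, matching the commit message.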
Rob Bradford
83066cf58e vmm: Set a default maximum physical address size
When using PVH for booting (which we use for all firmwares and direct
kernel boot) the Linux kernel does not configure LA57 correctly. As such
we need to limit the address space to the maximum 4-level paging address
space.

If the user knows that their guest image can take advantage of the
5-level addressing and they need it for their workload then they can
increase the physical address space appropriately.

This PR removes the TDX specific handling as the new address space limit
is below the one that that code specified.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-10-01 08:59:15 -07:00
Yu Li
08021087ec vmm: add prefault option in memory and memory-zone
The argument `prefault` is provided in MemoryManager, but it can
only be used by SGX and restore.
With prefault (MAP_POPULATE) set, fewer page faults occur at
runtime, at the cost of a slower boot.

This commit adds `prefault` in MemoryConfig and MemoryZoneConfig.
To resolve the conflict between memory and restore, the argument
`prefault` has been changed from `bool` to `Option<bool>`. When
its value is None, the setting from the memory config is used;
otherwise the value in the Option is used.

Signed-off-by: Yu Li <liyu.yukiteru@bytedance.com>
2021-09-29 14:17:35 +02:00
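The `Option<bool>` override described in the message above can be sketched as follows. This is a minimal illustration rather than the actual Cloud Hypervisor code: the struct is a hypothetical stand-in that carries only the one relevant field, and the function name is invented for the example.

```rust
/// Hypothetical stand-in for the memory configuration; only the field
/// relevant to this example is shown.
struct MemoryConfig {
    prefault: bool,
}

/// Resolve the effective prefault setting.
///
/// `override_prefault` is `Some(..)` when a caller (for example the restore
/// path) wants to force a value, and `None` when the configured value
/// should be used.
fn effective_prefault(config: &MemoryConfig, override_prefault: Option<bool>) -> bool {
    override_prefault.unwrap_or(config.prefault)
}
```

With `None` the configured value wins; with `Some(v)` the caller's value wins regardless of the configuration.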
Rob Bradford
34f220edcd main: Don't panic() if blocking signals fails
This allows Cloud Hypervisor to be run under `perf` as some of the
signals will already be blocked in the child process.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-15 16:20:28 +01:00
Alyssa Ross
7549149bb5 vmm: ensure signal handlers run on the right thread
Despite setting up a dedicated thread for signal handling, we weren't
making sure that the signals we were listening for there were actually
dispatched to the right thread.  While signal-hook provides an
iterator API, so we can know that we're only processing the signals
coming out of the iterator on our signal handling thread, the actual
signal handling code from signal-hook, which pushes the signals onto
the iterator, can run on any thread.  This can lead to seccomp
violations when the signal-hook signal handler does something that
isn't allowed on that thread by our seccomp policy.

To reproduce, resize a terminal running cloud-hypervisor continuously
for a few minutes.  Eventually, the kernel will deliver a SIGWINCH to
a thread with a restrictive seccomp policy, and a seccomp violation
will trigger.

As part of this change, it's also necessary to allow rt_sigreturn(2)
on the signal handling thread, so signal handlers are actually allowed
to run on it.  The fact that this didn't seem to be needed before
makes me think that signal handlers were almost _never_ actually
running on the signal handling thread.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2021-09-02 21:33:31 +01:00
Bo Chen
08ac3405f5 virtio-devices, vmm: Move to the seccompiler crate
Fixes: #2929

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-08-18 10:42:19 +02:00
Rob Bradford
7fbec7113e main, config: Add support for --user-device
This allows the user to specify devices that are running in a different
userspace process and communicated with via vfio-user.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-08-10 16:01:00 +01:00
Yukiteru
2b1173acc3 main: Add missing comma in help of arguments
The help of arguments `memory` and `memory-zone` is missing a comma.
Before adding, these parts are as follows:

> hugepage_size=<hugepage_size>hotplug_method=acpi|virtio-mem

After adding, these parts will be:

> hugepage_size=<hugepage_size>,hotplug_method=acpi|virtio-mem

Signed-off-by: Yukiteru Lee <wfly1998@sina.com>
2021-07-12 17:43:40 +02:00
Bo Chen
5825ab2dd4 clippy: Address the issue 'needless-borrow'
Issue from beta version of clippy:

Error:    --> vm-virtio/src/queue.rs:700:59
    |
700 |             if let Some(used_event) = self.get_used_event(&mem) {
    |                                                           ^^^^ help: change this to: `mem`
    |
    = note: `-D clippy::needless-borrow` implied by `-D warnings`
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-06-24 08:55:43 +02:00
Rob Bradford
496ceed1d0 misc: Remove unnecessary "extern crate"
Now that all crates use edition = "2018", the majority of the "extern
crate" statements can be removed. Only those for importing macros need
to remain.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-05-12 17:26:11 +02:00
Rob Bradford
7e0ccce225 vmm: config: Validate that vCPUs is sufficient for MQ queue count
Fixes: #2563

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-05-04 19:49:34 +02:00
Rob Bradford
da8136e49d arch, vmm: Remove support for LinuxBoot
By supporting just PVH boot on x86-64 we simplify our boot path
substantially.

Fixes: #2231

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-04-30 16:16:48 +02:00
William Douglas
767b4f0e59 main: Enable the api-socket to be passed as an fd
To avoid race issues where the api-socket may not be created by the
time a cloud-hypervisor caller is ready to look for it, enable the
caller to pass the api-socket fd directly.

Avoid breaking current callers by allowing the --api-socket path to be
passed directly, as it is now, in addition to through the path argument.

Signed-off-by: William Douglas <william.r.douglas@gmail.com>
2021-04-26 14:40:49 -07:00
Rob Bradford
2d2623238d main: Move logging setup to start_vmm()
This allows the return of errors which will be printed using the
existing code and removes panic()s

Fixes: #2342

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-04-07 16:29:20 +01:00
Rob Bradford
af02262b4b main: Move event monitor handling to start_vmm()
This allows the return of errors which will be printed using the
existing code and removes panic()s

Fixes: #2342

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-04-07 16:29:20 +01:00
Rob Bradford
19c5e91b6e main: Address Rust 1.51.0 clippy issue (upper_case_acronyms)
warning: name `StartVMMThread` contains a capitalized acronym
  --> src/main.rs:50:5
   |
50 |     StartVMMThread(#[source] vmm::Error),
   |     ^^^^^^^^^^^^^^ help: consider making the acronym lowercase, except the initial letter: `StartVmmThread`
   |
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#upper_case_acronyms

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-03-26 11:32:09 +00:00
Rob Bradford
13724dbd22 main: Remove messages from startup
Remove the startup message containing incomplete VM configuration
details. It's a bit unusual for a tool such as Cloud Hypervisor to print
these kinds of details, which are a direct representation of the command
line.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-03-18 11:32:28 +00:00
Rob Bradford
78f9ddc6be main: Remove default API server path
Only create an API server if the user specifies one with `--api-socket`

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-03-17 11:30:26 +00:00
Rob Bradford
9b0996a71f vmm, main: Optionalise creation of API server
Only create the API server if we have a valid API server path. For now
this is not a functional change, since there is a default API server path
in the clap handling; rather, it prepares for making creation optional.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-03-17 11:30:26 +00:00
Rob Bradford
57c8c250fd tdx: Permit starting Cloud Hypervisor without --kernel
This is not required if TDX is present.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-03-08 18:30:00 +00:00
Rob Bradford
66a3bed086 vmm: config: Add "--tdx" option parsing
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-03-08 18:30:00 +00:00
Vineeth Pillai
fd9bd1c86c main: minor fix in the help message for event monitor
Signed-off-by: Vineeth Pillai <viremana@linux.microsoft.com>
2021-03-08 15:32:18 +00:00
Rob Bradford
b65502c3c1 main: Refine event monitor control
Replace "--monitor-fd" with "--event-monitor" which can either take
"fd=<int>" or "path=<path>" which can point to e.g. a named pipe and
allow more flexibility.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-03-04 19:12:40 +01:00
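The two accepted forms, "fd=<int>" and "path=<path>", lend themselves to a small enum. The sketch below shows one way such a value could be parsed; the type and function names are hypothetical and this is not the parser used by the project.

```rust
use std::path::PathBuf;

/// Where structured event data should be written (hypothetical type).
#[derive(Debug, PartialEq)]
enum EventMonitorTarget {
    /// An already-open file descriptor, from "fd=<int>".
    Fd(i32),
    /// A filesystem path such as a named pipe, from "path=<path>".
    Path(PathBuf),
}

/// Parse the value given to "--event-monitor".
fn parse_event_monitor(arg: &str) -> Result<EventMonitorTarget, String> {
    match arg.split_once('=') {
        Some(("fd", value)) => value
            .parse::<i32>()
            .map(EventMonitorTarget::Fd)
            .map_err(|e| format!("invalid fd '{}': {}", value, e)),
        Some(("path", value)) if !value.is_empty() => {
            Ok(EventMonitorTarget::Path(PathBuf::from(value)))
        }
        _ => Err(format!("expected fd=<int> or path=<path>, got '{}'", arg)),
    }
}
```

Keeping the two cases in one enum lets the caller open the path or wrap the descriptor in a single `match`, and anything that is neither form is rejected with an error instead of a panic.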
Rob Bradford
4822ed79e1 main: Add "--monitor-fd" to write structured event data to
If supplied then structured JSON event data will be written to that file
descriptor.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-02-18 16:15:13 +00:00
William Douglas
48963e322a Enable pty console
Add the ability for cloud-hypervisor to create, manage and monitor a
pty for serial and/or console I/O from a user. The reasoning for
having cloud-hypervisor create the ptys is so that clients, libvirt
for example, could exit and later re-open the pty without causing I/O
issues. If the clients were responsible for creating the pty, when
they exit the main pty fd would close and cause cloud-hypervisor to
get I/O errors on writes.

Ideally the main and subordinate pty fds would be kept in the main
vmm's Vm structure. However, because the device manager owns parsing
the configuration for the serial and console devices, the information
is instead stored in new fields under the DeviceManager structure
directly.

From there hooking up the main fd is intended to look as close to
handling stdin and stdout on the tty as possible (there is some future
work ahead for perhaps moving support for the pty into the
vmm_sys_utils crate).

The main fd is used for reading user input and writing to output of
the Vm device. The subordinate fd is used to set up raw mode and it is
kept open in order to avoid I/O errors when clients open and close the
pty device.

The ability to handle multiple inputs as part of this change is
intentional. The current code allows serial and console ptys to be
created and both be used as input. There was an implementation gap
though with the queue_input_bytes needing to be modified so the pty
handlers for serial and console could access the methods on the serial
and console structures directly. Without this change only a single
input source could be processed as the console would switch based on
its input type (this is still valid for tty and isn't otherwise
modified).

Signed-off-by: William Douglas <william.r.douglas@gmail.com>
2021-02-09 10:03:28 +00:00
Rob Bradford
29607f38ad vmm: config: Add a hugepage_size option
This allows the user to use an alternative huge page size; otherwise the
default size will be used.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-02-05 09:24:02 +00:00
Rob Bradford
39d080e0c1 main: Give a friendly message when we get a seccomp violation
If we receive SIGSYS and identify it as a seccomp violation then give
friendly instructions on how to debug further. We are unable to decode
the siginfo_t struct ourselves due to https://github.com/rust-lang/libc/issues/716

Fixes: #2139

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-01-12 09:23:29 +00:00
Rob Bradford
8a27735826 main: Add thread name to log output
If the thread has no name then it is reported as "anonymous".

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-12-18 16:05:14 +00:00
Wei Liu
c4f8e4b000 main: provide a sensible error message when /dev/mshv is missing
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-12-09 17:28:36 +00:00
Rob Bradford
d1d0421103 main: Remove --net-backend and --block-backend from cloud-hypervisor
Remove the parameters used for self spawning from the cloud-hypervisor
binary.

See: #1925

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-11-18 11:46:32 +01:00
Rob Bradford
0005d11e32 vmm: config: Require a socket when using vhost-user
With self-spawning being removed both parameters are now required.

Fixes: #1925

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-11-09 00:16:15 +01:00
Rob Bradford
be1b6bc1e1 main: Remove API socket when exiting
When exiting remove the API socket from the filesystem.

Fixes: #1241

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-10-27 13:27:23 +00:00
Rob Bradford
c22b788b47 main: Simplify error and return handling in start_vmm
Use a Result<> type with an error to simplify the code in start_vmm().
This will also make it easier to add cleanup functionality.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-10-27 13:27:23 +00:00
Sebastien Boeuf
f4e391922f vmm: Remove balloon options from --memory parameter
The standalone `--balloon` parameter being fully functional at this
point, we can get rid of the balloon options from the --memory
parameter.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-10-22 16:33:16 +02:00
Sebastien Boeuf
1d479e5e08 vmm: Introduce new --balloon parameter
This introduces a new way of defining the virtio-balloon device. Instead
of going through the --memory parameter, the idea is to consider balloon
as a standalone virtio device.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-10-22 16:33:16 +02:00
Rob Bradford
885ee9567b vmm: Add support for creating virtio-watchdog
The watchdog device is created through the "--watchdog" parameter. At
most a single watchdog can be created per VM.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-10-21 16:02:39 +01:00
Sebastien Boeuf
52ad78886c vmm: Introduce new CPU option to set maximum physical bits
In order to let the user choose maximum address space size, this patch
introduces a new option `max_phys_bits` to the `--cpus` parameter.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-10-13 18:58:36 +02:00
Rob Bradford
03f7d39ce5 main: Set default log level to warn!() equivalent.
Using our standard configuration and default kernel we trigger no
messages at this level.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-10-06 16:52:29 +01:00
Rob Bradford
2c2e7016c7 main: Improve documentation for --kernel
Fixes: #1712

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-10-01 10:08:25 +01:00
Hui Zhu
4913acc05e vmm: Add 'balloon' to memory parameters
Add the option 'balloon' to --memory.

Signed-off-by: Hui Zhu <teawater@antfin.com>
2020-09-25 17:13:39 +02:00
Rob Bradford
29b74804e1 main: Improve the error reporting when creating the hypervisor object
The ::new() does very little beyond trying to open the /dev/kvm device
so provide a hint to the user about what has gone wrong.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-09-25 11:08:01 +02:00
Sebastien Boeuf
4e1b78e1ff vmm: Add 'hotplugged_size' to memory parameters
Add the new option 'hotplugged_size' to both --memory-zone and --memory
parameters so that we can let the user specify a certain amount of
memory being plugged at boot.

This is also part of making sure we can store the virtio-mem size over a
reboot of the VM.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-09-16 19:20:04 +02:00
Sebastien Boeuf
c645a72c17 vmm: Add 'hotplug_size' to memory zones
In anticipation of resizing support of an individual memory zone,
this commit introduces a new option 'hotplug_size' to '--memory-zone'
parameter. This defines the amount of memory that can be added through
each specific memory zone.

Because memory zone resize is tied to virtio-mem, make sure the user
selects 'virtio-mem' hotplug method, otherwise return an error.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-09-16 19:20:04 +02:00
Rob Bradford
5495ab7415 vmm: Add "kvm_hyperv" toggle to "--cpus"
This turns on the KVM HyperV emulation.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-09-16 16:08:01 +01:00
Sebastien Boeuf
1970ee89da main, vmm: Remove guest_numa_node option from memory zones
The way to describe guest NUMA nodes has been updated through previous
commits, letting the user describe the full NUMA topology through the
--numa parameter (or NumaConfig).

That's why we can remove the deprecated and unused 'guest_numa_node'
option.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-09-07 07:37:14 +02:00
Sebastien Boeuf
3ff82b4b65 main, vmm: Add mandatory id to memory zones
In anticipation of allowing memory zones to be removed, but also in
anticipation of refactoring the NUMA parameter, we introduce a mandatory
'id' option to the --memory-zone parameter.

This forces the user to provide a unique identifier for each memory zone
so that we can refer to these.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-09-07 07:37:14 +02:00
Sebastien Boeuf
42f963d6f2 main, vmm: Add new --numa parameter
Through this new parameter, we give users the opportunity to specify a
set of CPUs attached to a NUMA node that has been previously created
from the --memory-zone parameter.

This parameter will be extended in the future to describe the distance
between multiple nodes.

For instance, if a user wants to attach CPUs 0, 1, 2 and 6 to a NUMA
node, here are two different ways of doing so:
Either
	./cloud-hypervisor ... --numa id=0,cpus=0-2:6
Or
	./cloud-hypervisor ... --numa id=0,cpus=0:1:2:6

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-09-01 15:25:00 +02:00
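The two `cpus=` spellings in the message above describe the same set: items are separated by `:` and an item may be an inclusive range written `first-last`. Below is a minimal sketch of a parser for that list syntax, based only on the two examples in the message; the function name is hypothetical and the real parser may accept more forms.

```rust
/// Expand a CPU list such as "0-2:6" into the individual CPU numbers.
///
/// Items are separated by ':'; an item is either a single CPU number or an
/// inclusive range "first-last".
fn parse_cpu_list(list: &str) -> Result<Vec<u32>, String> {
    let mut cpus = Vec::new();
    for item in list.split(':') {
        match item.split_once('-') {
            Some((first, last)) => {
                let lo = first
                    .parse::<u32>()
                    .map_err(|e| format!("invalid CPU '{}': {}", first, e))?;
                let hi = last
                    .parse::<u32>()
                    .map_err(|e| format!("invalid CPU '{}': {}", last, e))?;
                if lo > hi {
                    return Err(format!("empty range '{}'", item));
                }
                cpus.extend(lo..=hi);
            }
            None => {
                let cpu = item
                    .parse::<u32>()
                    .map_err(|e| format!("invalid CPU '{}': {}", item, e))?;
                cpus.push(cpu);
            }
        }
    }
    Ok(cpus)
}
```

Both `parse_cpu_list("0-2:6")` and `parse_cpu_list("0:1:2:6")` expand to CPUs 0, 1, 2 and 6, which is the equivalence the commit message illustrates.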
Sebastien Boeuf
768dbd1fb0 vmm: Add 'guest_numa_node' option to 'memory-zone'
With the introduction of this new option, the user will be able to
describe if a particular memory zone should belong to a specific NUMA
node from a guest perspective.

For instance, using '--memory-zone size=1G,guest_numa_node=2' would let
the user describe that a memory zone of 1G in the guest should be
exposed as being associated with the NUMA node 2.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-09-01 14:11:49 +02:00
Sebastien Boeuf
e6f585a31c vmm: Add 'host_numa_nodes' option to memory zones
Since memory zones have been introduced, it is now possible for a user
to specify multiple backends for the guest RAM. By adding a new option
'host_numa_node' to the 'memory-zone' parameter, we allow the guest RAM
to be backed by memory that might come from a specific NUMA node on the
host.

The option expects a node identifier, specifying which NUMA node should
be used to allocate the memory associated with a specific memory zone.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2020-08-27 08:39:38 -07:00