Commit Graph

126 Commits

Author SHA1 Message Date
Samuel Ortiz
46cde1a38e vmm: Rename the VM start and stop operations to boot and shutdown
To match the OpenAPI description. And also to map the real life
terminology.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-10-04 09:36:33 +02:00
Samuel Ortiz
f2de4d0315 vmm: config: Make the cmdline config serializable
The linux_loader crate Cmdline struct is not serializable.
Instead of forcing the upstream create to carry a serde dependency, we
simply use a String for the passed command line and build the actual
CmdLine when we need it (in vm::new()).
Also, the cmdline offset is not a configuration knob, so we remove it.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-10-04 09:36:33 +02:00
Samuel Ortiz
b14fd37db9 vmm: Make --kernel optional
The kernel path was the only mandatory command line option.
With the addition of the --api-socket option, we can run without a
kernel path and get it later through the API.

Since we can end up with VM configurations that are no longer valid by
default, we need to provide a validation check for it. For now, if the
kernel path is not defined, the VM configuration is invalid.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-10-04 09:36:33 +02:00
Rob Bradford
a0455167d0 vmm: Use layout constant for kernel command line
Remove the unnecessary field on CmdlineConfig and switch to using the
common offset.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-27 11:48:30 -07:00
Rob Bradford
0e7a1fc923 arch, vmm: Start documenting major regions of RAM and reserved memory
Using the existing layout module start documenting the major regions of
RAM and those areas that are reserved. Some of the constants have also
been renamed to be more consistent and some functions that returned
constant variables have been replaced.

Future commits will move more constants into this file to make it the
canonical source of information about the memory layout.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-27 08:55:47 -07:00
Samuel Ortiz
8188074300 main: Start the VMM thread
We now start the main VMM thread, which will be listening for VM and IPC
related events.
In order to start the configured VM, we no longer directly call the VM
API but we use the IPC instead, to first create and then start a VM.

Fixes: #303

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-09-26 16:21:14 +02:00
Samuel Ortiz
4671a5831f vmm: Move the EpollContext implementation to lib
The VMM thread and control loop will be the sole consumer of the
EpollContext and EpollDispatch API, so let's move it to lib.rs.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-09-26 16:21:14 +02:00
Samuel Ortiz
6710a39b5a vmm: Pass the exit and reset fds to the vm creation method
As we're going to move the control loop to the VMM thread, the exit and
reset EventFds are no longer going to be owned by the VM.
We pass a copy of them when creating the Vm instead.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-09-26 16:21:14 +02:00
Samuel Ortiz
feb1c33084 vmm: Add a VM config getter
We will need it from the VMM thread, when trying to reboot a VM.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-09-26 16:21:14 +02:00
Samuel Ortiz
47167a658e vmm: Add a VM console handling method
In order to handle the VM STDIN stream from a separate VMM thread
without having to export the DeviceManager, we simply add a console
handling method to the Vm structure.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-09-26 16:21:14 +02:00
Samuel Ortiz
ea7abc6c80 vmm: Add a VM stop method
In order to transfer the control loop to a separate VMM thread, we want
to shrink the VM control loop to a bare minimum.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-09-26 16:21:14 +02:00
Samuel Ortiz
e6ef9ece2c vmm: Move the tty setting to the VM start routine
We want to shrink the control loop to a bare minimal.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-09-26 16:21:14 +02:00
Samuel Ortiz
2e9d815701 vmm: Use a reference counted VmConfig when creating a new VM
Once passed to the VM creation routine, a VmConfig structure is
immutable. We can simply carry a Arc of it instead of a reference.
This also allows us to remove any lifetime bound from our VM.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-09-26 16:21:14 +02:00
Samuel Ortiz
bdfd1a3f38 vmm: Remove the Vmm structure
The Vmm structure is just a placeholder for the KVM instance. We can
create it directly from the VM creation routine instead.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-09-24 10:12:04 +02:00
Samuel Ortiz
9c5135da7a vmm: Simplify the VM start flow
We can integrate the kernel loading into the VM start method.
The VM start flow is then: Vm::new() -> vm.start(), which feels more
natural.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-09-24 10:12:04 +02:00
Samuel Ortiz
acc60b0ad5 vmm: Make VsockConfig owned
Convert Path to PathBuf and remove the associated lifetime.
Now we can remove the VmConfig associated lifetime.

Fixes #298

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-09-24 08:39:39 +01:00
Samuel Ortiz
9c5bfb8e13 vmm: Make MemoryConfig owned
Convert Path to PathBuf and remove the associated lifetime.

Fixes #298

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-09-24 08:39:39 +01:00
Rob Bradford
5b3ca78dac vmm: Use the full host physical address range
Probe for the size of the host physical address range and use that to
establish the address range for the VM. This removes the limitation on
the size of the VM RAM and gives more space for the devices.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-19 10:43:55 +01:00
Rob Bradford
f0360c92d9 arch: acpi: Set the upper device range based on RAM levels
After the 32-bit gap the memory is shared between the devices and the
RAM. Ensure that the ACPI tables correctly indicate where the RAM ends
and the device area starts by patching the precompiled tables. We get
the following valid output now from the PCI bus probing (8GiB guest)

[    0.317757] pci_bus 0000:00: resource 4 [io  0x0000-0x0cf7 window]
[    0.319035] pci_bus 0000:00: resource 5 [io  0x0d00-0xffff window]
[    0.320215] pci_bus 0000:00: resource 6 [mem 0x000a0000-0x000bffff window]
[    0.321431] pci_bus 0000:00: resource 7 [mem 0xc0000000-0xfebfffff window]
[    0.322613] pci_bus 0000:00: resource 8 [mem 0x240000000-0xfffffffff window]

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-19 10:43:55 +01:00
Rob Bradford
c042483953 build: make PCI (virtio and vfio) disableable at build time
Although included by default it is now possible to build without PCI
support.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-13 12:30:13 +01:00
Rob Bradford
6d27ac9dfc vmm: Allow the DeviceManager to inject extra kernel commandline entries
This is useful for virtio-mmio to be able to provide the commandline
entries for the devices.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-13 12:30:13 +01:00
Rob Bradford
05b5115e67 vmm: Call DeviceManager's register_devices() on creation
Rather than calling it at the very start of the VM execution (i.e. when
the VCPUs are created) do it as part of the DeviceManager creation.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-10 20:04:00 +02:00
Rob Bradford
8f37dec498 vmm: "close" the SIGWINCH signal handler
Rather than sending a signal to the signal handler used for handling
SIGWINCH calls instead use the crate provided termination method. This
also unregisters the signal handler which also means that there won't be
a leaked signal handler remaining.

This leaked signal handler is what was causing a failure to cleanup up
the thread on subsequent requests breaking two reboots in a row.

Fixes: #252

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-09 15:42:26 +02:00
Rob Bradford
d2db34edf2 vmm: Hide underlying console setup from VM
Refactor the underlying console details into the DeviceManager and
abstract away.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-06 09:26:37 -07:00
Rob Bradford
d089ee4e25 vmm: Move ownership of the exit/reset EventFd to Vm structure
It makes more sense there as it is used by more than just the
DeviceManager.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-06 09:26:37 -07:00
Rob Bradford
2f4de81175 vmm: Access ioapic/io_bus/mmio_bus from DeviceManager via accessor
This paves the way for introducing a trait for the DeviceManager.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-06 09:26:37 -07:00
Rob Bradford
9ac967e3d8 vmm: Split DeviceManager into it's own file
Refactor out DeviceManager into it's own file. This is part of a bigger
effort to reduce complexity in the vm.rs file but will also allow future
separation to allow making PCI support optional.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-06 09:26:37 -07:00
Rob Bradford
5dd675710b vmm: Call munmap() on regions that have been mmap()ed
For virtio-fs and virtio-pmem regions of memory are manually mapped into
the address space of the VMM. In order to cleanly reboot we need to
unmap those regions.

Fixes: #223

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-05 10:38:14 +01:00
Rob Bradford
f59cad15a3 vmm: Cleanup signal_handler thread used for console SIGWINCH handling
Do this by using the same mechanism as the vCPU threads by sending a
signal to the thread. As this is the same mechanism reuse the same code
and rename the "vcpus" member to "threads" to indicate this represents
both the vCPU threads and also the signal handler thread.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-04 09:21:01 -07:00
Rob Bradford
9e764fc091 vmm, arch, devices: Put ACPI support behind a default feature
Put the ACPI support behind a feature and ensure that the code compiles
without that feature by adding an extra build to Travis.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-03 19:18:49 +02:00
Rob Bradford
bb2e7bb942 vmm: Shutdown vCPU threads
As part of the cleanup of the VM shutdown all the vCPU threads. This is
achieved by toggling a shared atomic boolean variable which is checked
in the vCPU loop. To trigger the vCPU code to look at this boolean it is
necessary to send a signal to the vCPU which will interrupt the running
KVM_RUN ioctl.

Fixes: #229

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-03 19:18:49 +02:00
Rob Bradford
ad128bf72d vmm: Give vCPU and signal handler thread useful names
Sadly only the first few characters of the thread name is preserved so
use a shorter name for the vCPU thread for now. Also give the signal
handling thread a name.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-03 19:18:49 +02:00
Rob Bradford
614eb68f16 vm: Make triple-fault and i8042 reset reboot the VM
Now we have ACPI shutdown we should reboot on these reset triggers.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-03 19:18:49 +02:00
Rob Bradford
5a187ee2c2 x86_64/devices: acpi: Add support for ACPI shutdown & reboot
Add an I/O port "device" to handle requests from the kernel to shutdown
or trigger a reboot, borrowing an I/O used for ACPI on the Q35 platform.
The details of this I/O port are included in the FADT
(SLEEP_STATUS_REG/SLEEP_CONTROL_REG/RESET_REG) with the details of the
value to write in the FADT for reset (RESET_VALUE) and in the DSDT for
shutdown (S5 -> 0x05)

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-03 19:18:49 +02:00
Rob Bradford
ae66a44d26 vmm: Support both reset and shutdown
Add a 2nd EventFd to the VM to control resetting (rebooting) the VM this
supplements the EventFd used for managing shutdown of the VM.

The default behaviour on i8042 or triple-fault based reset is currently
unchanged i.e. it will trigger a shutdown.

In order to support restarting the VM it was necessary to make start()
function take a reference to the config.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-03 19:18:49 +02:00
Rob Bradford
2610f4353d arch: acpi: Only add ACPI COM1 device if serial is turned on
Only add the ACPI PNP device for the COM1 serial port if it is not
turned off with "--serial off"

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-03 19:18:49 +02:00
Rob Bradford
451502b50b vm: If a VCPU thread errors out then exit the hypervisor
Currently when the VCPU thread exits on an error the VMM continues to
run with no way of shutting down the main thread.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-03 19:18:49 +02:00
Sebastien Boeuf
b7d3ad9063 vm-virtio: fs: Factorize vhost-user setup
This patch factorizes the existing virtio-fs code by relying onto the
common code part of the vhost_user module in the vm-virtio crate.

In details, it factorizes the vhost-user setup, and reuses the error
types defined by the module instead of defining its own types.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-31 17:33:17 +01:00
Sebastien Boeuf
56cad00f2e vm-virtio: Move fs.rs to vhost_user module
vhost-user-net introduced a new module vhost_user inside the vm-virtio
crate. Because virtio-fs is actually vhost-user-fs, it belongs to this
new module and needs to be moved there.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-31 17:33:17 +01:00
Cathy Zhang
584a2cccee vmm: Add vhost-user-net support
Update vm configuration and device initial process to add
vhost-user-net support.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
2019-08-30 15:00:26 +01:00
Cathy Zhang
51306555e7 vmm: Add hugetlbfs handling support
The currently directory handling process to open tempfile by
OpenOptions with custom_flags(O_TMPFILE) is workable for tmp
filesystem, but not workable for hugetlbfs, add new directory
handling process which works fine for both tmpfs and hugetlbfs.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
2019-08-30 15:00:26 +01:00
Sebastien Boeuf
0b8856d148 vmm: Add RwLock to the GuestMemoryMmap
Following the refactoring of the code allowing multiple threads to
access the same instance of the guest memory, this patch goes one step
further by adding RwLock to it. This anticipates the future need for
being able to modify the content of the guest memory at runtime.

The reasons for adding regions to an existing guest memory could be:
- Add virtio-pmem and virtio-fs regions after the guest memory was
  created.
- Support future hotplug of devices, memory, or anything that would
  require more memory at runtime.

Because most of the time, the lock will be taken as read only, using
RwLock instead of Mutex is the right approach.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-22 08:24:15 +01:00
Sebastien Boeuf
ec0b5567c8 vmm: Share the guest memory instead of cloning it
The VMM guest memory was cloned (copied) everywhere the code needed to
have ownership of it. In order to clean the code, and in anticipation
for future support of modifying this guest memory instance at runtime,
it is important that every part of the code share the same instance.

Because VirtioDevice implementations need to have access to it from
different threads, that's why Arc must be used in this case.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-22 08:24:15 +01:00
Sebastien Boeuf
658c076eb2 linters: Fix clippy issues
Latest clippy version complains about our existing code for the
following reasons:

- trait objects without an explicit `dyn` are deprecated
- `...` range patterns are deprecated
- lint `clippy::const_static_lifetime` has been renamed to
  `clippy::redundant_static_lifetimes`
- unnecessary `unsafe` block
- unneeded return statement

All these issues have been fixed through this patch, and rustfmt has
been run to cleanup potential formatting errors due to those changes.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-15 09:10:04 -07:00
Samuel Ortiz
c52e276a5c vmm: Log debug ioport timestamps
We timestamp the VM creation time, and log the elapsed time between that
instant and the debug ioport events.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-08-15 16:06:54 +02:00
Samuel Ortiz
48a9300667 vmm: Log 0x80 IO port writes
The 0x80 IO port is typically used for BIOS debugging and testing on
bare metal x86 platforms.
We use that port and its dedicated 16 debug codes to time and track the
guest boot process.

Fixes #63

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-08-15 16:06:54 +02:00
Sebastien Boeuf
3c29c47783 vmm: Create shared memory region for virtio-fs
When the cache_size parameter from virtio-fs device is not empty, the
VMM creates a dedicated memory region where the shared files will be
memory mapped by the virtio-fs device.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-13 13:57:53 +02:00
fazlamehrab
df5058ec0a vm-virtio: Implement console size config feature
One of the features of the virtio console device is its size can be
configured and updated. Our first iteration of the console device
implementation is lack of this feature. As a result, it had a
default fixed size which could not be changed. This commit implements
the console config feature and lets us change the console size from
the vmm side.

During the activation of the device, vmm reads the current terminal
size, sets the console configuration accordinly, and lets the driver
know about this configuration by sending an interrupt. Later, if
someone changes the terminal size, the vmm detects the corresponding
event, updates the configuration, and sends interrupt as before. As a
result, the console device driver, in the guest, updates the console
size.

Signed-off-by: A K M Fazla Mehrab <fazla.mehrab.akm@intel.com>
2019-08-09 13:55:43 -07:00
Rob Bradford
d9a355f85a vmm: Add new "null" serial/console output mode
Poor performance was observed when booting kernels with "console=ttyS0"
and the serial port disabled.

This change introduces a "null" console output mode and makes it the
default for the serial console. In this case the serial port
is advertised as per other output modes but there is no input and any
output is dropped.

Fixes: #163

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-08-09 09:04:48 -07:00
Rob Bradford
f910476dd7 vmm: Only send stdin input to serial/console if it can handle it
Do not send the contents of stdin to the serial or console device if
they're not in tty mode.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-08-09 09:04:48 -07:00