We should return an explicit error when the transition from one VM state
to another is invalid.
The valid_transition() routine for the VmState enum essentially
describes the VM state machine.
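
A minimal sketch of what such a routine could look like. Only the names
VmState and valid_transition() come from this commit; the set of states,
the allowed transitions and the InvalidStateTransition error type are
assumptions made for illustration.

```rust
/// Hypothetical set of VM states; the real enum may differ.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum VmState {
    Created,
    Running,
    Paused,
    Shutdown,
}

/// Hypothetical error describing the rejected transition.
#[derive(Debug, PartialEq, Eq)]
pub struct InvalidStateTransition {
    pub from: VmState,
    pub to: VmState,
}

impl VmState {
    /// Returns Ok(()) when moving from `self` to `new_state` is allowed,
    /// and an explicit error otherwise.
    pub fn valid_transition(self, new_state: VmState) -> Result<(), InvalidStateTransition> {
        use VmState::*;
        // The match arms are the state machine: each pair is one allowed edge.
        let allowed = match (self, new_state) {
            (Created, Running) | (Created, Shutdown) => true,
            (Running, Paused) | (Running, Shutdown) => true,
            (Paused, Running) | (Paused, Shutdown) => true,
            // Assumed here: a shut down VM can be started again.
            (Shutdown, Running) => true,
            _ => false,
        };
        if allowed {
            Ok(())
        } else {
            Err(InvalidStateTransition { from: self, to: new_state })
        }
    }
}
```

Keeping every allowed edge in a single match makes the state machine
readable in one place, and any pair not listed is rejected by default.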
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
In order to pause a VM, we signal all the vCPU threads to get them out
of vmx non-root. Once out, the vCPU thread will check for an atomic
pause boolean. If it's set to true, then the thread will park until
being resumed.
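
The parking logic can be sketched with the standard library alone. This
is an illustration, not the code from this commit: spawn_fake_vcpu, the
stop flag and the tick counter are hypothetical, and unpark() stands in
for the resume path. In the real flow a signal is what kicks the thread
out of the guest so that it gets to look at the flag.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for a vCPU thread loop.
pub fn spawn_fake_vcpu(
    paused: Arc<AtomicBool>,
    stop: Arc<AtomicBool>,
    ticks: Arc<AtomicU64>,
) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        while !stop.load(Ordering::SeqCst) {
            // Park for as long as the pause flag is set. Re-checking the
            // flag in a loop guards against spurious wakeups.
            while paused.load(Ordering::SeqCst) && !stop.load(Ordering::SeqCst) {
                thread::park();
            }
            // Stand-in for one iteration of running the guest.
            ticks.fetch_add(1, Ordering::SeqCst);
            thread::sleep(Duration::from_millis(1));
        }
    })
}
```

Resuming then amounts to clearing the flag and calling
handle.thread().unpark() on the thread's JoinHandle.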
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
So that we don't need to forward an ExitBehaviour up to the VMM thread.
This simplifies the control loop and the VMM thread even further.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
This commit is the glue between the virtio-pci devices attached to the
vIOMMU, and the IORT ACPI table exposing them to the guest as sitting
behind this vIOMMU.
An important thing is the trait implementation provided to the virtio
vrings for each device attached to the vIOMMU, as they need to perform
proper address translation before they can access the buffers.
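
As a rough sketch of the idea, a vring could be handed a translation
object like the one below. The trait name DmaRemapping, the
IommuMapping type and its range table are hypothetical and only
illustrate the lookup; they are not the trait added by this commit.

```rust
use std::collections::BTreeMap;

/// Hypothetical translation hook handed to a vring.
pub trait DmaRemapping {
    /// Translates an I/O virtual address into a guest physical address.
    fn translate(&self, iova: u64) -> Option<u64>;
}

/// Hypothetical mapping table: IOVA range start -> (guest address, size).
#[derive(Default)]
pub struct IommuMapping {
    mappings: BTreeMap<u64, (u64, u64)>,
}

impl IommuMapping {
    pub fn map(&mut self, iova: u64, gpa: u64, size: u64) {
        self.mappings.insert(iova, (gpa, size));
    }
}

impl DmaRemapping for IommuMapping {
    fn translate(&self, iova: u64) -> Option<u64> {
        // Find the closest range starting at or below the address.
        let (start, (gpa, size)) = self.mappings.range(..=iova).next_back()?;
        let offset = iova - start;
        if offset < *size {
            Some(gpa + offset)
        } else {
            // The address falls past the end of the closest range.
            None
        }
    }
}
```

With a mapping of IOVA 0x1000 to guest address 0x8000_0000 over 0x2000
bytes, translating 0x1800 yields 0x8000_0800, and an unmapped address
yields None so the device can refuse to touch the buffer.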
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The virtual IOMMU exposed through virtio-iommu device has a dependency
on ACPI. It needs to expose the device ID of the virtio-iommu device,
and all the other devices attached to this virtual IOMMU. The IDs are
expressed from a PCI bus perspective, based on segment, bus, device and
function.
The guest relies on the topology description provided by the IORT table
to attach devices to the virtio-iommu device.
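
For illustration, one common way to pack segment, bus, device and
function into a single ID is shown below. The function name and the
choice to place the segment in the upper 16 bits are assumptions; the
bus/device/function layout (8, 5 and 3 bits) is the standard PCI one.

```rust
/// Packs a PCI address into one ID: segment in bits 31..16, bus in
/// bits 15..8, device in bits 7..3 and function in bits 2..0.
/// Returns None when device or function do not fit in their fields.
pub fn pci_device_id(segment: u16, bus: u8, device: u8, function: u8) -> Option<u32> {
    if device > 0x1f || function > 0x7 {
        return None;
    }
    Some(
        ((segment as u32) << 16)
            | ((bus as u32) << 8)
            | ((device as u32) << 3)
            | (function as u32),
    )
}
```

For example, device 1 function 0 on bus 0 of segment 0 packs to 0x8.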
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
We used to have error definitions spread across vmm, vm, api,
and http.
We now have a cleaner separation: All API routines only return an
ApiResult. All VM operations, including the VMM wrappers, return a
VmResult. This makes it easier to carry errors up to the HTTP caller.
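
A sketch of the layering, under assumptions: only the names ApiResult
and VmResult come from this commit. The error variants, the vm_boot()
and api_vm_boot() helpers and the status code mapping are hypothetical.

```rust
/// Hypothetical errors raised by VM operations.
#[derive(Debug, PartialEq)]
pub enum VmError {
    NotCreated,
}

pub type VmResult<T> = Result<T, VmError>;

/// Hypothetical API errors, each wrapping the VM error that caused it.
#[derive(Debug, PartialEq)]
pub enum ApiError {
    VmBoot(VmError),
}

pub type ApiResult<T> = Result<T, ApiError>;

/// A VM operation (or VMM wrapper) only ever returns a VmResult.
pub fn vm_boot(created: bool) -> VmResult<()> {
    if created {
        Ok(())
    } else {
        Err(VmError::NotCreated)
    }
}

/// An API routine only ever returns an ApiResult.
pub fn api_vm_boot(created: bool) -> ApiResult<()> {
    vm_boot(created).map_err(ApiError::VmBoot)
}

/// The HTTP layer only needs to understand ApiResult.
pub fn http_status(res: &ApiResult<()>) -> u16 {
    match res {
        Ok(()) => 204,
        Err(_) => 500,
    }
}
```

Wrapping the inner error instead of flattening it keeps the original
cause available all the way up to the HTTP caller.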
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
The linux_loader crate Cmdline struct is not serializable.
Instead of forcing the upstream crate to carry a serde dependency, we
simply use a String for the passed command line and build the actual
Cmdline when we need it (in vm::new()).
Also, the cmdline offset is not a configuration knob, so we remove it.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
The kernel path was the only mandatory command line option.
With the addition of the --api-socket option, we can run without a
kernel path and get it later through the API.
Since we can end up with VM configurations that are no longer valid by
default, we need to provide a validation check for it. For now, if the
kernel path is not defined, the VM configuration is invalid.
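
The check described above could look roughly like this. The field
layout of VmConfig and the valid() method name are assumptions made for
the sketch; only the rule itself (no kernel path means invalid) comes
from this commit.

```rust
use std::path::PathBuf;

/// Hypothetical, heavily reduced VM configuration.
#[derive(Clone, Debug, Default)]
pub struct VmConfig {
    /// None until a kernel is provided, e.g. later through the API.
    pub kernel: Option<PathBuf>,
    pub cmdline: String,
}

impl VmConfig {
    /// For now the only rule: a configuration without a kernel path
    /// is invalid.
    pub fn valid(&self) -> bool {
        self.kernel.is_some()
    }
}
```

A default configuration is therefore invalid until a kernel path is
filled in.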
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Using the existing layout module, start documenting the major regions of
RAM and those areas that are reserved. Some of the constants have also
been renamed to be more consistent and some functions that returned
constant variables have been replaced.
Future commits will move more constants into this file to make it the
canonical source of information about the memory layout.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
We now start the main VMM thread, which will be listening for VM and IPC
related events.
In order to start the configured VM, we no longer directly call the VM
API but we use the IPC instead, to first create and then start a VM.
Fixes: #303
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
The VMM thread and control loop will be the sole consumer of the
EpollContext and EpollDispatch API, so let's move it to lib.rs.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
As we're going to move the control loop to the VMM thread, the exit and
reset EventFds are no longer going to be owned by the VM.
We pass a copy of them when creating the Vm instead.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
In order to handle the VM STDIN stream from a separate VMM thread
without having to export the DeviceManager, we simply add a console
handling method to the Vm structure.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
In order to transfer the control loop to a separate VMM thread, we want
to shrink the VM control loop to a bare minimum.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Once passed to the VM creation routine, a VmConfig structure is
immutable. We can simply carry an Arc of it instead of a reference.
This also allows us to remove any lifetime bound from our VM.
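
A reduced sketch of the effect on the types. The fields shown are
hypothetical; the point is only that holding an Arc<VmConfig> instead
of a &'a VmConfig removes the lifetime parameter from Vm.

```rust
use std::sync::Arc;

/// Hypothetical, heavily reduced VM configuration.
#[derive(Debug)]
pub struct VmConfig {
    pub cmdline: String,
}

/// No lifetime parameter is needed: the Vm shares ownership of the
/// immutable configuration.
pub struct Vm {
    config: Arc<VmConfig>,
}

impl Vm {
    pub fn new(config: Arc<VmConfig>) -> Self {
        Vm { config }
    }

    pub fn cmdline(&self) -> &str {
        &self.config.cmdline
    }
}
```

A Vm built this way can be moved into another thread, which a
borrowed, non-'static reference would not allow.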
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
The Vmm structure is just a placeholder for the KVM instance. We can
create it directly from the VM creation routine instead.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
We can integrate the kernel loading into the VM start method.
The VM start flow is then: Vm::new() -> vm.start(), which feels more
natural.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Convert Path to PathBuf and remove the associated lifetime.
Now we can remove the VmConfig associated lifetime.
Fixes: #298
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Probe for the size of the host physical address range and use that to
establish the address range for the VM. This removes the limitation on
the size of the VM RAM and gives more space for the devices.
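
The arithmetic behind this can be sketched as follows. The function
name is hypothetical and the probing itself is left out: on x86_64 the
number of physical address bits is typically reported by CPUID leaf
0x80000008, and it is taken here as a plain parameter.

```rust
/// Returns the last valid address for a host with `phys_bits` bits of
/// physical address space.
pub fn address_space_end(phys_bits: u8) -> u64 {
    assert!((1..=64).contains(&phys_bits), "unsupported address width");
    if phys_bits == 64 {
        u64::MAX
    } else {
        (1u64 << phys_bits) - 1
    }
}
```

With 36 physical address bits, the address space ends at 0xfffffffff.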
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
After the 32-bit gap the memory is shared between the devices and the
RAM. Ensure that the ACPI tables correctly indicate where the RAM ends
and the device area starts by patching the precompiled tables. We get
the following valid output now from the PCI bus probing (8GiB guest):
[ 0.317757] pci_bus 0000:00: resource 4 [io 0x0000-0x0cf7 window]
[ 0.319035] pci_bus 0000:00: resource 5 [io 0x0d00-0xffff window]
[ 0.320215] pci_bus 0000:00: resource 6 [mem 0x000a0000-0x000bffff window]
[ 0.321431] pci_bus 0000:00: resource 7 [mem 0xc0000000-0xfebfffff window]
[ 0.322613] pci_bus 0000:00: resource 8 [mem 0x240000000-0xfffffffff window]
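
The start of the last window can be cross-checked with a little
arithmetic. This is a sketch: the constant and function names are
hypothetical, and it assumes the 32-bit gap starts at 3GiB (matching
the 0xc0000000 start of resource 7) with the remaining RAM placed
right after 4GiB.

```rust
const GIB: u64 = 1 << 30;
/// Assumed start of the 32-bit gap (0xc000_0000).
const MEM_32BIT_GAP_START: u64 = 3 * GIB;
/// Assumed start of the RAM placed after the 32-bit gap.
const RAM_64BIT_START: u64 = 4 * GIB;

/// Returns the address where RAM above 4GiB ends, which is where the
/// 64-bit device area starts in this sketch.
pub fn high_ram_end(ram_size: u64) -> u64 {
    // Whatever does not fit below the gap is placed above 4GiB.
    let above_gap = ram_size.saturating_sub(MEM_32BIT_GAP_START);
    RAM_64BIT_START + above_gap
}
```

For the 8GiB guest, 3GiB sit below the gap and the remaining 5GiB end
at 0x240000000, which is the start of resource 8 in the log.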
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Rather than calling it at the very start of the VM execution (i.e. when
the VCPUs are created), do it as part of the DeviceManager creation.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Rather than sending a signal to the signal handler used for handling
SIGWINCH calls, use the termination method provided by the crate. This
also unregisters the signal handler, so no leaked signal handler
remains.
This leaked signal handler is what was causing a failure to clean up
the thread on subsequent requests, breaking two reboots in a row.
Fixes: #252
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Refactor out DeviceManager into its own file. This is part of a bigger
effort to reduce complexity in the vm.rs file but will also allow future
separation to allow making PCI support optional.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
For virtio-fs and virtio-pmem, regions of memory are manually mapped into
the address space of the VMM. In order to cleanly reboot we need to
unmap those regions.
Fixes: #223
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Do this by using the same mechanism as the vCPU threads by sending a
signal to the thread. As this is the same mechanism, reuse the same code
and rename the "vcpus" member to "threads" to indicate this represents
both the vCPU threads and also the signal handler thread.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Put the ACPI support behind a feature and ensure that the code compiles
without that feature by adding an extra build to Travis.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
As part of the VM cleanup, shut down all the vCPU threads. This is
achieved by toggling a shared atomic boolean variable which is checked
in the vCPU loop. To trigger the vCPU code to look at this boolean it is
necessary to send a signal to the vCPU which will interrupt the running
KVM_RUN ioctl.
Fixes: #229
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Sadly only the first few characters of the thread name are preserved, so
use a shorter name for the vCPU thread for now. Also give the signal
handling thread a name.
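
For reference, naming a thread goes through std::thread::Builder. The
helper and the "vcpu0" style of name below are hypothetical. On Linux
the kernel keeps at most 15 bytes of the name, so a short name survives
intact in tools like top or ps.

```rust
use std::thread;

/// Spawns a named thread that reports the name it sees for itself.
pub fn spawn_named(name: &str) -> std::io::Result<thread::JoinHandle<Option<String>>> {
    thread::Builder::new()
        .name(name.to_string())
        .spawn(|| thread::current().name().map(str::to_owned))
}
```

Note that thread::current().name() returns the full name stored by the
Rust runtime; the truncation only affects the name visible to the OS.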
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Add an I/O port "device" to handle requests from the kernel to shut down
or trigger a reboot, borrowing an I/O port used for ACPI on the Q35 platform.
The details of this I/O port are included in the FADT
(SLEEP_STATUS_REG/SLEEP_CONTROL_REG/RESET_REG) with the details of the
value to write in the FADT for reset (RESET_VALUE) and in the DSDT for
shutdown (S5 -> 0x05)
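
A sketch of how a write to such a port might be decoded. It assumes a
single port shared by the reset and sleep control registers and a
RESET_VALUE of 1, neither of which is stated here. The SLP_TYP and
SLP_EN bit positions follow the ACPI sleep control register layout, and
the S5 value of 0x05 is the one given above. All names are hypothetical.

```rust
/// Assumed value advertised in the FADT for reset.
pub const RESET_VALUE: u8 = 1;
/// Sleep type advertised in the DSDT for S5 (shutdown).
pub const S5_SLEEP_TYPE: u8 = 0x05;
/// ACPI sleep control register: SLP_TYP is bits 4..2, SLP_EN is bit 5.
const SLP_TYP_SHIFT: u8 = 2;
const SLP_TYP_MASK: u8 = 0b111;
const SLP_EN_BIT: u8 = 1 << 5;

#[derive(Debug, PartialEq, Eq)]
pub enum Request {
    Reset,
    Shutdown,
    Ignore,
}

/// Decodes a one byte write from the guest into a request.
pub fn decode_write(value: u8) -> Request {
    if value == RESET_VALUE {
        return Request::Reset;
    }
    let sleep_enabled = (value & SLP_EN_BIT) != 0;
    let sleep_type = (value >> SLP_TYP_SHIFT) & SLP_TYP_MASK;
    if sleep_enabled && sleep_type == S5_SLEEP_TYPE {
        Request::Shutdown
    } else {
        Request::Ignore
    }
}
```

Under these assumptions a write of (0x05 << 2) | (1 << 5) requests a
shutdown, and a write of RESET_VALUE requests a reset.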
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Add a 2nd EventFd to the VM to control resetting (rebooting) the VM. This
supplements the EventFd used for managing shutdown of the VM.
The default behaviour on i8042 or triple-fault based reset is currently
unchanged, i.e. it will trigger a shutdown.
In order to support restarting the VM, it was necessary to make the start()
function take a reference to the config.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Only add the ACPI PNP device for the COM1 serial port if it is not
turned off with "--serial off".
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Currently when the VCPU thread exits on an error the VMM continues to
run with no way of shutting down the main thread.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This patch factorizes the existing virtio-fs code by relying on the
common code part of the vhost_user module in the vm-virtio crate.
In detail, it factorizes the vhost-user setup, and reuses the error
types defined by the module instead of defining its own types.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
vhost-user-net introduced a new module vhost_user inside the vm-virtio
crate. Because virtio-fs is actually vhost-user-fs, it belongs to this
new module and needs to be moved there.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>