Commit Graph

587 Commits

Author SHA1 Message Date
Yi Wang
3225c0c7c8 vmm: Automatically pause VM for coredump
If the VMM is not already paused then pause the VM prior to executing
the coredump and then resume it after. If the VM is already paused then
the original state is maintained.

Signed-off-by: Yi Wang <foxywang@tencent.com>
2023-07-31 17:05:46 +01:00
Alyssa Ross
69bd0036d9 vmm: support removing devices before VM is booted
If the VM has been configured but not yet booted, all we need to do to
support removing a device is to remove it from the config, so it will
never be created.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2023-07-14 10:01:23 -07:00
Anatol Belski
7bf2a2c382 vmm: arch: Make phys_bits functionality use CPU vendor API
Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
2023-05-31 23:54:33 +02:00
Rob Bradford
89e658d9ff misc: Update for beta clippy failures on x86-64
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2023-05-30 07:18:17 -07:00
Bo Chen
0b1e626fe3 vmm: Allocate guest memory address space before TDX initialization
The refactoring on deferring address space allocation (#5169) broke TDX,
as TDX initialization needs to access guest memory for encryption and
measurement of guest pages.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-05-23 09:00:00 -07:00
Alyssa Ross
21d40d7489 main: reset tty if starting the VM fails
When I refactored this to centralise resetting the tty into
DeviceManager::drop, I tested that the tty was reset if an error
happened on the vmm thread, but not on the main thread.  It turns out
that if an error happened on the main thread, the process would just
exit, so drop handlers on other threads wouldn't get run.

To fix this, I've changed start_vmm() to write to the VMM's exit
eventfd and then join the thread if an error happens after the vmm
thread is started.

Fixes: b6feae0a ("vmm: only touch the tty flags if it's being used")
Signed-off-by: Alyssa Ross <hi@alyssa.is>
2023-05-02 09:33:53 +01:00
Alyssa Ross
c90a0ffff6 vmm: reset to the original termios
Previously, we used two different functions for configuring ttys.
vmm_sys_util::terminal::Terminal::set_raw_mode() was used to configure
stdio ttys, and cfmakeraw() was used to configure ptys created by
cloud-hypervisor.  When I centralized the stdio tty cleanup, I also
switched to using cfmakeraw() everywhere, to avoid duplication.

cfmakeraw sets the OPOST flag, but when we later reset the ttys, we
used vmm_sys_util::terminal::Terminal::set_canon_mode(), which does
not unset this flag.  This meant that the terminal was getting mostly,
but not fully, reset.

To fix this without depending on the implementation of cfmakeraw(),
let's just store the original termios for stdio terminals, and restore
them to exactly the state we found them in when cloud-hypervisor exits.

Fixes: b6feae0a ("vmm: only touch the tty flags if it's being used")
Signed-off-by: Alyssa Ross <hi@alyssa.is>
2023-05-02 09:33:53 +01:00
Alyssa Ross
b6feae0ace vmm: only touch the tty flags if it's being used
When neither serial nor console are connected to the tty,
cloud-hypervisor shouldn't touch the tty at all.  One way in which
this is annoying is that if I am running cloud-hypervisor without it
using my terminal, I expect to be able to suspend it with ^Z like any
other process, but that doesn't work if it's put the terminal into raw
mode.

Instead of putting the tty into raw mode when a VM is created or
restored, do it when a serial or console device is created.  Since we
now know it can't be put into raw mode until the Vm object is created,
we can move setting it back to canon mode into the drop handler for
that object, which should always be run in normal operation.  We still
also put the tty into canon mode in the SIGTERM / SIGINT handler, but
check whether the tty was actually used, rather than whether stdin is
a tty.  This requires passing on_tty around as an atomic boolean.

I explored more of an abstraction over the tty — having an object that
encapsulated stdout and put the tty into raw mode when initialized and
into canon mode when dropped — but it wasn't practical, mostly due to
the special requirements of the signal handler.  I also investigated
whether the SIGWINCH listener process could be used here, which I
think would have worked but I'm hesitant to involve it in serial
handling as well as conosle handling.

There's no longer a check for whether the file descriptor is a tty
before setting it into canon mode — it's redundant, because if it's
not a tty it just won't respond to the ioctl.

Tested by shutting down through the API, SIGTERM, and an error
injected after setting raw mode.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2023-04-17 16:33:17 +01:00
Alyssa Ross
520aff2efc vmm: don't redundantly set the TTY to canon mode
If the VM is shut down, either it's going to be started again, in
which case we still want to be in raw mode, or the process is about to
exit, in which case canon mode will be set at the end of main.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2023-04-17 16:33:17 +01:00
Alyssa Ross
38a1b45783 vmm: use the SIGWINCH listener for TTYs too
Previously, we were only using it for PTYs, because for PTYs there's
no alternative.  But since we have to have it for PTYs anyway, if we
also use it for TTYs, we can eliminate all of the code that handled
SIGWINCH for TTYs.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2023-04-05 11:23:06 +01:00
Yong He
76d6d28f3e vmm: do not start signal thread to resize console if no need
Now cloud hypervisor will start signal thread to catch
SIGWINCH signal, cloud hypervisor then will resize the
guest console via vconsole.

This patch skip starting signal thread when there is no
need to resize guest console, such as console is not
configured.

Signed-off-by: Yong He <alexyonghe@tencent.com>
2023-02-28 09:40:07 -08:00
Jinank Jain
b54ce6c3db vmm: Defer address space allocation
We can ideally defer the address space allocation till we start the
vCPUs for the very first time. Because the VM will not access the memory
until the CPUs start running. Thus there is no need to allocate the
address space eagerly and wait till the time we are going to start the
vCPUs for the first time.

Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
2023-02-10 11:52:20 +01:00
Wei Liu
34b3170680 vmm: fix two typos
Change "thead" to "thread".

Also make sure the two messages are distinguishable by adding "vmm" and
"vm" prefix.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2023-01-30 21:10:02 +00:00
Sebastien Boeuf
e4ae668bcd tdx: Update support based on kvm-upstream v5.19
In order to comply with latest TDX version, we rely onto the branch
kvm-upstream-2022.08.07-v5.19-rc8 from https://github.com/intel/tdx
repository. Updates are based on changes that happened in
arch/x86/include/uapi/asm/kvm.h headers file.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2023-01-20 09:59:56 +00:00
Rob Bradford
5e52729453 misc: Automatically fix cargo clippy issues added in 1.65 (stable)
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-12-14 14:27:19 +00:00
Bo Chen
de06bf4aed vmm: Make "Vm::generate_cmdline()" public for fuzzing
Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-12-12 13:50:28 +00:00
Sebastien Boeuf
3931b99d4e vm-migration: Introduce new constructor for Snapshot
This simplifies the Snapshot creation as we expect a SnapshotData to be
provided most of the time.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-12-09 10:26:06 +01:00
Sebastien Boeuf
4ae6b595d7 vm-migration: Rename add_data_section() into add_data()
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-12-09 10:26:06 +01:00
Sebastien Boeuf
748018ace3 vm-migration: Don't store the id as part of Snapshot structure
The information about the identifier related to a Snapshot is only
relevant from the BTreeMap perspective, which is why we can get rid of
the duplicated identifier in every Snapshot structure.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-12-09 10:26:06 +01:00
Sebastien Boeuf
4517b76a23 vm-migration: Rename SnapshotDataSection into SnapshotData
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-12-09 10:26:06 +01:00
Sebastien Boeuf
1b32e2f8b2 vm-migration: Simplify SnapshotDataSection structure
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-12-09 10:26:06 +01:00
Sebastien Boeuf
5b3bcfa233 vm-migration: Snapshot should have a unique SnapshotDataSection
There's no reason to carry a HashMap of SnapshotDataSection per
Snapshot. And given we now provide at most one SnapshotDataSection per
Snapshot, there's no need to keep the id part of the SnapshotDataSection
structure.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-12-09 10:26:06 +01:00
Sebastien Boeuf
0489b6314e tdx: Support new way of declaring memory resources
Without breaking the former way of declaring them. This is simply based
on the presence of the GUID TDX Metadata offset. If not present, we
consider the firmware is quite old and therefore we fallback onto the
previous way to expose memory resources.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-12-08 10:13:12 -08:00
Rob Bradford
cefbf6b4a3 vmm: guest_debug: Mark coredump functionality x86_64 only
The coredump functionality is only implemented for x86_64 so it should
only be compiled in there.

Fixes: #4964

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-12-05 17:23:52 +00:00
Sebastien Boeuf
31209474b3 vmm: Move TDX initialization before vCPUs creation
TDX was broken by the recent refactoring moving the vCPU creation
earlier than before. The simple and correct way to fix this problem is
by moving the TDX initialization right before the vCPUs creation. The
rest of the TDX setup can remain where it is.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-12-05 07:49:40 -08:00
Rob Bradford
725e388684 vmm: Seperate the CPUID setup from the CpuManager::new()
This allows the decoupling of CpuManager and MemoryManager.

See: #4761

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-12-01 22:41:01 +00:00
Rob Bradford
c5eac2e822 vmm: Don't store GuestMemoryMmap for "guest_debug" functionality
This removes the storage of the GuestMemoryMmap on the CpuManager
further allowing the decoupling of the CpuManager from the
MemoryManager.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-12-01 22:41:01 +00:00
Rob Bradford
c7b22156da aarch, vmm: Reduce requirement for guest memory to vCPU boot only
When configuring the vCPUs it is only necessary to provide the guest
memory when booting fresh (for populating the guest memory). As such
refactor the vCPU configuration to remove the use of the
GuestMemoryMmap stored on the CpuManager.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-12-01 22:41:01 +00:00
Sebastien Boeuf
d98f2618bd vmm: Create restored Vm as paused
Thanks to the new way of restoring Vm, we can now create the Vm object
directly with the appropriate VmState rather than having to patch it at
a later time.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-12-01 10:16:44 -08:00
Sebastien Boeuf
2e01bf7f74 vmm: Provide an owned Snapshot rather than a reference
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-12-01 10:16:44 -08:00
Sebastien Boeuf
a4160c1fef vmm: Simplify list of parameters to Vm::new()
No need to provide a boolean to know if the VM is being restored given
we already have this information from the Option<Snapshot>.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-12-01 10:16:44 -08:00
Sebastien Boeuf
d0c53a5357 vmm: Move Vm to the new restore design
Now the entire codebase has been moved to the new restore design, we can
complete the work by creating a dedicated restore() function for the Vm
object and get rid of the method restore() from the Snapshottable trait.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-12-01 10:16:44 -08:00
Michael Zhao
b173f6f654 vmm,devices: Change Gic snapshot and restore path
The snapshot and restore of AArch64 Gic was done in Vm. Now it is moved
to DeviceManager.

The benefit is that the restore can be done while the Gic is created in
DeviceManager.

While the moving of state data from Vm snapshot to DeviceManager
snapshot breaks the compatability of migration from older versions.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-12-01 17:07:25 +01:00
Michael Zhao
def1d7cf86 vmm: Remove GICR typers in snapshot on AArch64
The GICR typers are also set in restoring the GIC. Saving them in
snapshot is not needed.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-12-01 17:07:25 +01:00
Sebastien Boeuf
e8c6d83f3f vmm: Merge Vm::new_from_snapshot with Vm::new
Given the recent factorization that happened in vm.rs, we're now able to
merge Vm::new_from_snapshot with Vm::new.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-12-01 13:46:31 +01:00
Sebastien Boeuf
1c36065754 vmm: Move devices creation to Vm creation
This moves the devices creation out of the dedicated restore function
which will be eventually removed.

This factorizes the creation of all devices into a single location.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-12-01 13:46:31 +01:00
Sebastien Boeuf
bccfa81368 vmm: Restore clock from Vm creation (x86_64 only)
This allows the clock restoration to be moved out of the dedicated
restore function, which will eventually be removed.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-12-01 13:46:31 +01:00
Sebastien Boeuf
a6959a7469 vmm: Move DeviceManager to new restore design
Based on all the work that has already been merged, it is now possible
to fully move DeviceManager out of the previous restore model, meaning
there's no need for a dedicated restore() function to be implemented
there.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-12-01 13:46:31 +01:00
Sebastien Boeuf
4487c8376b vmm: Move CpuManager and Vcpu to the new restore design
Every Vcpu is now created with the right state if there's an available
snapshot associated with it. This simplifies the restore logic.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-12-01 09:27:00 +01:00
Sebastien Boeuf
90b5014a50 vmm: device_manager: Remove 'restoring' attribute
Given 'restoring' isn't needed anymore from the DeviceManager structure,
let's simplify it.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-11-29 13:46:30 +01:00
Michael Zhao
7d16c74020 vmm: Refactor AArch64 GIC initialization process
In the new process, `device::Gic::new()` covers additional actions:
1. Creating `hypervisor::vGic`
2. Initializing interrupt routings

The change makes the vGic device ready in the beginning of
`DeviceManager::create_devices()`. This can unblock the GIC related
devices initialization in the `DeviceManager`.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-11-23 11:49:57 +01:00
Sebastien Boeuf
86e7f07485 vmm: cpu: Create vCPUs before the DeviceManager
Moving the creation of the vCPUs before the DeviceManager gets created
will allow for the aarch64 vGIC to be created before the DeviceManager
as well in a follow up patch. The end goal being to adopt the same
creation sequence for both x86_64 and aarch64, and keeping in mind that
the vGIC requires every vCPU to be created.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-11-23 11:49:57 +01:00
Sebastien Boeuf
578780ed0c vmm: cpu: Split vCPU creation
Split the vCPU creation into two distincts parts. On the one hand we
create the actual Vcpu object with the creation of the hypervisor::Vcpu.
And on the other hand, we configure the existing Vcpu, setting registers
to proper values (such as setting the entry point).

This will allow for further work to move the creation earlier in the
boot, so that the hypervisor::Vcpu will be already created when the
DeviceManager gets created.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-11-23 11:49:57 +01:00
Sebastien Boeuf
ec01062ada vmm: Switch order between DeviceManager and CpuManager creation
The CpuManager is now created before the DeviceManager. This is required
as preliminary work for creating the vCPUs before the DeviceManager,
which is required to ensure both x86_64 and aarch64 follow the same
sequence.

It's important to note the optimization for faster PIO accesses on the
PCI config space had to be removed given the VmOps was required by the
CpuManager and by the Vcpu by extension. But given the PciConfigIo is
created as part of the DeviceManager, there was no proper way of moving
things around so that we could provide PciConfigIo early enough.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-11-23 11:49:57 +01:00
Rob Bradford
e37ec26ccf vmm: Remove PCI PIO optimisation
This optimisation provided some peformance improvement when measured by
perf however when considered in terms of boot time peformance this
optimisation doesn't have any impact measurable using our
peformance-metrics tooling.

Removing this optimisation helps simplify the VMM internals as it allows
the reordering of the VM creation process permitting refactoring of the
restore code path.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-11-22 19:47:53 +00:00
Wei Liu
d05586f520 vmm: modify or provide safety comments
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-11-18 12:50:01 +00:00
Wei Liu
d274fe9cb8 vmm: fix tdx check
The field has been moved in 3793ffe888.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-11-18 12:50:01 +00:00
Praveen K Paladugu
af261f231c vmm: Add required acpi entries for vtpm device
Add an TPM2 entry to DSDT ACPI table. Add a TPM2 table to guest's ACPI.

Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Co-authored-by: Sean Yoo <t-seanyoo@microsoft.com>
2022-11-15 16:42:21 +00:00
Rob Bradford
6e0bd73c90 build: Bump linux-loader from 0.6.0 to 0.7.0
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-11-02 11:02:00 +00:00
Bo Chen
a9ec0f33c0 misc: Fix clippy issues
Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-11-02 09:41:43 +01:00