Significant API changes have occured, most significantly is the switch
to an approach which does not require vm-memory and can run no_std.
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
Now cloud hypervisor will start signal thread to catch
SIGWINCH signal, cloud hypervisor then will resize the
guest console via vconsole.
This patch skip starting signal thread when there is no
need to resize guest console, such as console is not
configured.
Signed-off-by: Yong He <alexyonghe@tencent.com>
The PR #2333 added I/O rate limiter on block device, with some options
in `DiskConfig`. And the PR #2401 added rate limiter on virtio-net
device with same options, but it still throws `Error::ParseDisk`.
This commit fixes it with correct values.
Fixes: #2401
Signed-off-by: Yu Li <liyu.yukiteru@bytedance.com>
Once error occur, vcpu thread may exit, this should
be critical event for the whole VM, we should fire
exit event and set vcpu state.
If we don't set vcpu state, the shutdown process
will hang at signal_thread, which is waiting the
vcpu state to change.
Signed-off-by: Yong He <alexyonghe@tencent.com>
We need to provide valid FDs while creating 'NetConfig' instances even
for unit tests. Closing invalid FDs would cause random unit test
failures.
Also, two identical 'NetConfig' instances are not allowed any more,
because it would lead to close the same FD twice. This is consistent
with the fact that a clone of a "NetConfig" instance is no
longer *equal* to the instance itself.
Fixes: #5203
Signed-off-by: Bo Chen <chen.bo@intel.com>
These are owned by the config (and are duplicated before being used to
create the `Tap` for the virtio-net device.)
By implementing Drop on NetConfig we have issues with moving out of
members that don't implement the Copy trait. This requires a small
adjustment to the unit tests that use the Default::default() function.
Fixes: #5197
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The custom version duplicates any FDs that have been provided so that
the validation logic used on hotplug, which takes a clone of the config,
can be safely carried out.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This code is indentical to what is in this repository. When a release
gets made we can then switch to that.
Fixes: #5122
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
If swtpm becomes unresponsive, guest gets blocked at "recvmsg" on tpm's
data FD. This change adds a timeout to the data fd socket. If swtpm
becomes unresponsive guest waits for "timeout" (secs) and continues to
run after returning an I/O error to tpm commands.
Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
We can ideally defer the address space allocation till we start the
vCPUs for the very first time. Because the VM will not access the memory
until the CPUs start running. Thus there is no need to allocate the
address space eagerly and wait till the time we are going to start the
vCPUs for the first time.
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
This hypervisor leaf includes details of the TSC frequency if that is
available from KVM. This can be used to efficiently calculate time
passed when there is an invariant TSC.
TEST=Run `cpuid` in the guest and observe the frequency populated.
Fixes: #5178
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Change "thead" to "thread".
Also make sure the two messages are distinguishable by adding "vmm" and
"vm" prefix.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
In order to comply with latest TDX version, we rely onto the branch
kvm-upstream-2022.08.07-v5.19-rc8 from https://github.com/intel/tdx
repository. Updates are based on changes that happened in
arch/x86/include/uapi/asm/kvm.h headers file.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
A few breaking changes:
1. `-vvv` needs to be written as `-v -v -v`.
2. `--disk D1 D2` and others need to be written as `--disk D1 --disk D2`.
3. `--option=value` needs to be written as `--option value`
Change integration tests to adapt to the breaking changes.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
Add new configuration for offloading features, including
Checksum/TSO/UFO, and set these offloading features as
enabled by default.
Fixes: #4792.
Signed-off-by: Yong He <alexyonghe@tencent.com>
MSHV does not require to ensure MMIO/PIO exits complete
before pausing. This patch makes sure the above requirement
by checking the hypervisor type run-time.
Fixes#5037
Signed-off-by: Muminul Islam <muislam@microsoft.com>
This functionality has been obsoleted by our native support for
hugepages and shared memory.
See: #5082
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
To align the logging messages with the rest of the code, this
message should be aligned with another similar occurrence in
epoll_helper.rs
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
The double underscore made it different from how other projects would
name this particular macro.
No functional change.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
Remove from the documentation and API definition but continue support
using the field (with a deprecation warning.)
See: #4837
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This simplifies the Snapshot creation as we expect a SnapshotData to be
provided most of the time.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The information about the identifier related to a Snapshot is only
relevant from the BTreeMap perspective, which is why we can get rid of
the duplicated identifier in every Snapshot structure.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
There's no reason to carry a HashMap of SnapshotDataSection per
Snapshot. And given we now provide at most one SnapshotDataSection per
Snapshot, there's no need to keep the id part of the SnapshotDataSection
structure.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Without breaking the former way of declaring them. This is simply based
on the presence of the GUID TDX Metadata offset. If not present, we
consider the firmware is quite old and therefore we fallback onto the
previous way to expose memory resources.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In particular update to latest linux-loader release and point to latest
vfio repository for both crates hosted there.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The datatype used for the ioctl() C library call is different between it
and the glibc toolchains. The easiest solution is to have the compiler
type cast to type of the parameter.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The coredump functionality is only implemented for x86_64 so it should
only be compiled in there.
Fixes: #4964
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
TDX was broken by the recent refactoring moving the vCPU creation
earlier than before. The simple and correct way to fix this problem is
by moving the TDX initialization right before the vCPUs creation. The
rest of the TDX setup can remain where it is.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This removes the storage of the GuestMemoryMmap on the CpuManager
further allowing the decoupling of the CpuManager from the
MemoryManager.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
When configuring the vCPUs it is only necessary to provide the guest
memory when booting fresh (for populating the guest memory). As such
refactor the vCPU configuration to remove the use of the
GuestMemoryMmap stored on the CpuManager.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Thanks to the new way of restoring Vm, we can now create the Vm object
directly with the appropriate VmState rather than having to patch it at
a later time.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
No need to provide a boolean to know if the VM is being restored given
we already have this information from the Option<Snapshot>.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Now the entire codebase has been moved to the new restore design, we can
complete the work by creating a dedicated restore() function for the Vm
object and get rid of the method restore() from the Snapshottable trait.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The snapshot and restore of AArch64 Gic was done in Vm. Now it is moved
to DeviceManager.
The benefit is that the restore can be done while the Gic is created in
DeviceManager.
While the moving of state data from Vm snapshot to DeviceManager
snapshot breaks the compatability of migration from older versions.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
Given the recent factorization that happened in vm.rs, we're now able to
merge Vm::new_from_snapshot with Vm::new.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This moves the devices creation out of the dedicated restore function
which will be eventually removed.
This factorizes the creation of all devices into a single location.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This allows the clock restoration to be moved out of the dedicated
restore function, which will eventually be removed.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Based on all the work that has already been merged, it is now possible
to fully move DeviceManager out of the previous restore model, meaning
there's no need for a dedicated restore() function to be implemented
there.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Every Vcpu is now created with the right state if there's an available
snapshot associated with it. This simplifies the restore logic.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Following the new restore design, it is not appropriate to set every
virtio device threads into a paused state after they've been started.
This is why we remove the line of code pausing the devices only after
they've been restored, and replace it with a small patch in every virtio
device implementation. When a virtio device is created as part of a
restored VM, the associated "paused" boolean is set to true. This
ensures the corresponding thread will be directly parked when being
started, avoiding the thread to be in a different state than the one it
was on the source VM during the snapshot.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Moving the Ioapic object to the new restore design, meaning the Ioapic
is created directly with the right state, and it shares the same
codepath as when it's created from scratch.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Creating a dedicated Result type for VirtioPciDevice, associated with
the new VirtioPciDeviceError enum. This allows for a clearer handling of
the errors generated through VirtioPciDevice::new().
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The code for restoring a VirtioPciDevice has been updated, including the
dependencies VirtioPciCommonConfig, MsixConfig and PciConfiguration.
It's important to note that both PciConfiguration and MsixConfig still
have restore() implementations because Vfio and VfioUser devices still
rely on the old way for restore.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In the new process, `device::Gic::new()` covers additional actions:
1. Creating `hypervisor::vGic`
2. Initializing interrupt routings
The change makes the vGic device ready in the beginning of
`DeviceManager::create_devices()`. This can unblock the GIC related
devices initialization in the `DeviceManager`.
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
Moving the creation of the vCPUs before the DeviceManager gets created
will allow for the aarch64 vGIC to be created before the DeviceManager
as well in a follow up patch. The end goal being to adopt the same
creation sequence for both x86_64 and aarch64, and keeping in mind that
the vGIC requires every vCPU to be created.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Split the vCPU creation into two distincts parts. On the one hand we
create the actual Vcpu object with the creation of the hypervisor::Vcpu.
And on the other hand, we configure the existing Vcpu, setting registers
to proper values (such as setting the entry point).
This will allow for further work to move the creation earlier in the
boot, so that the hypervisor::Vcpu will be already created when the
DeviceManager gets created.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
The CpuManager is now created before the DeviceManager. This is required
as preliminary work for creating the vCPUs before the DeviceManager,
which is required to ensure both x86_64 and aarch64 follow the same
sequence.
It's important to note the optimization for faster PIO accesses on the
PCI config space had to be removed given the VmOps was required by the
CpuManager and by the Vcpu by extension. But given the PciConfigIo is
created as part of the DeviceManager, there was no proper way of moving
things around so that we could provide PciConfigIo early enough.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This optimisation provided some peformance improvement when measured by
perf however when considered in terms of boot time peformance this
optimisation doesn't have any impact measurable using our
peformance-metrics tooling.
Removing this optimisation helps simplify the VMM internals as it allows
the reordering of the VM creation process permitting refactoring of the
restore code path.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Add an TPM2 entry to DSDT ACPI table. Add a TPM2 table to guest's ACPI.
Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Co-authored-by: Sean Yoo <t-seanyoo@microsoft.com>