This test (which relies on nesting) is failing on the VFIO worker. The tests that use the
dedicated hardware pass fine.
See: #5190
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
We need to provide valid FDs when creating 'NetConfig' instances, even
for unit tests, as closing invalid FDs would cause random unit test
failures.
Also, two identical 'NetConfig' instances are no longer allowed, because
that would lead to closing the same FD twice. This is consistent with
the fact that a clone of a 'NetConfig' instance is no longer *equal* to
the instance itself.
Fixes: #5203
Signed-off-by: Bo Chen <chen.bo@intel.com>
These are owned by the config (and are duplicated before being used to
create the `Tap` for the virtio-net device).
By implementing Drop on NetConfig we have issues with moving out of
members that don't implement the Copy trait. This requires a small
adjustment to the unit tests that use the Default::default() function.
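As a hedged illustration of the pattern (the field name `fds` and the
struct shape are simplified and not the actual Cloud Hypervisor
definition; error handling is elided):

```rust
use std::os::unix::io::RawFd;

// Illustrative struct; the real NetConfig has many more fields.
struct NetConfig {
    fds: Option<Vec<RawFd>>,
}

impl Drop for NetConfig {
    fn drop(&mut self) {
        if let Some(fds) = &self.fds {
            for fd in fds {
                // SAFETY: these FDs are owned by the config and are closed
                // exactly once, here.
                unsafe { libc::close(*fd) };
            }
        }
    }
}
```

Once `Drop` is implemented, moving a non-`Copy` field out of the struct
(e.g. `let fds = config.fds;`) is rejected by the compiler, which is why
the unit tests need the small adjustment mentioned above.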
Fixes: #5197
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The custom version duplicates any FDs that have been provided so that
the validation logic used on hotplug, which takes a clone of the config,
can be safely carried out.
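A minimal sketch of such a Clone implementation, assuming the same
simplified struct as above (not the actual Cloud Hypervisor code; -1
error handling from fcntl is elided):

```rust
use std::os::unix::io::RawFd;

struct NetConfig {
    fds: Option<Vec<RawFd>>,
}

impl Clone for NetConfig {
    fn clone(&self) -> Self {
        let fds = self.fds.as_ref().map(|fds| {
            fds.iter()
                .map(|fd| {
                    // SAFETY: duplicate a valid FD, with CLOEXEC set, so the
                    // clone owns its own copy and can be dropped safely.
                    unsafe { libc::fcntl(*fd, libc::F_DUPFD_CLOEXEC, 0) }
                })
                .collect()
        });
        NetConfig { fds }
    }
}
```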
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This code is identical to what is in this repository. When a release
gets made we can then switch to that.
Fixes: #5122
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
If swtpm becomes unresponsive, the guest gets blocked at "recvmsg" on
the TPM's data FD. This change adds a timeout to the data FD socket. If
swtpm becomes unresponsive, the guest waits for "timeout" (secs) and
continues to run after returning an I/O error to TPM commands.
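A minimal sketch of the idea using a Unix domain socket in Rust; the
timeout value and the way the error is surfaced to the TPM command path
are illustrative:

```rust
use std::io::Read;
use std::os::unix::net::UnixStream;
use std::time::Duration;

fn read_tpm_response(stream: &mut UnixStream, buf: &mut [u8]) -> std::io::Result<usize> {
    // If swtpm never answers, this read fails (WouldBlock/TimedOut) after
    // the timeout instead of blocking the guest forever.
    stream.set_read_timeout(Some(Duration::from_secs(5)))?;
    stream.read(buf)
}
```

The caller can then translate that I/O error into a TPM error response
and let the guest continue running.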
Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
The updated image is configured in the same way as the previously used
2019 image; it has the same
- Credentials
- Services configured, like SAC, SSH, RDP
- Size
All the Windows updates are applied so the state is up to date.
Also, the latest stable version 0.1.229 of the VirtIO Windows drivers
is installed.
Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
We can defer the address space allocation until we start the vCPUs for
the very first time, because the VM will not access the memory until the
CPUs start running. Thus there is no need to allocate the address space
eagerly; we can wait until we are about to start the vCPUs for the first
time.
Signed-off-by: Jinank Jain <jinankjain@microsoft.com>
This hypervisor leaf includes details of the TSC frequency if that is
available from KVM. This can be used to efficiently calculate elapsed
time when there is an invariant TSC.
TEST=Run `cpuid` in the guest and observe the frequency populated.
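As a guest-side illustration (not the Cloud Hypervisor code itself), the
frequency can be read from the hypervisor timing leaf; the leaf number
0x4000_0010 with the TSC frequency in kHz in EAX is an assumption based
on the conventional KVM layout:

```rust
#[cfg(target_arch = "x86_64")]
fn guest_tsc_khz() -> u32 {
    // Hypervisor timing information leaf (assumed 0x4000_0010):
    // EAX = TSC frequency in kHz, EBX = APIC frequency in kHz.
    let leaf = unsafe { core::arch::x86_64::__cpuid(0x4000_0010) };
    leaf.eax
}
```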
Fixes: #5178
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Updates include:
- Add references to 'TDX Tools'
- Expand instructions on building and using TDShim
- Add version information of guest/host kernel, TDVF, TDShim being tested
Signed-off-by: Bo Chen <chen.bo@intel.com>
This is required for booting Linux:
From: https://lore.kernel.org/all/20221028141220.29217-3-kirill.shutemov@linux.intel.com/
"""
Virtualization Exceptions (#VE) are delivered to TDX guests due to
specific guest actions such as using specific instructions or accessing
a specific MSR.
Notable reason for #VE is access to specific guest physical addresses.
It requires special security considerations as it is not fully in
control of the guest kernel. VMM can remove a page from EPT page table
and trigger #VE on access.
The primary use-case for #VE on a memory access is MMIO: VMM removes
page from EPT to trigger exception in the guest which allows guest to
emulate MMIO with hypercalls.
MMIO only happens on shared memory. All conventional kernel memory is
private. This includes everything from kernel stacks to kernel text.
Handling exceptions on arbitrary accesses to kernel memory is
essentially impossible as handling #VE may require access to memory
that also triggers the exception.
TDX module provides mechanism to disable #VE delivery on access to
private memory. If SEPT_VE_DISABLE TD attribute is set, private EPT
violation will not be reflected to the guest as #VE, but will trigger
exit to VMM.
Make sure the attribute is set by VMM. Panic otherwise.
There's small window during the boot before the check where kernel has
early #VE handler. But the handler is only for port I/O and panic as
soon as it sees any other #VE reason.
SEPT_VE_DISABLE makes SEPT violation unrecoverable and terminating the
TD is the only option.
Kernel has no legitimate use-cases for #VE on private memory. It is
either a guest kernel bug (like access of unaccepted memory) or
malicious/buggy VMM that removes guest page that is still in use.
In both cases terminating TD is the right thing to do.
"""
With this change Cloud Hypervisor can boot the current Linux guest
kernel.
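A hedged sketch of what setting the attribute could look like on the VMM
side; the bit position (28) follows the Linux guest-side definition of
ATTR_SEPT_VE_DISABLE, and the surrounding helper is illustrative rather
than the actual Cloud Hypervisor TDX code:

```rust
/// SEPT_VE_DISABLE TD attribute: private EPT violations exit to the VMM
/// instead of being reflected into the guest as #VE.
const TDX_ATTR_SEPT_VE_DISABLE: u64 = 1 << 28;

fn td_attributes() -> u64 {
    // Other attributes (debug, perfmon, ...) would be OR-ed in here.
    TDX_ATTR_SEPT_VE_DISABLE
}
```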
Reported-By: Jiaqi Gao <jiaqi.gao@intel.com>
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Do the following:
1. Use from_be_bytes to drop mutable slices.
2. Check for the exact buffer size throughout.
3. Simplify ptm_to_request where possible.
4. Make error messages style consistent.
Fix a typo in a code comment while at it.
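For example, a big-endian u32 field can be decoded from an immutable
slice in one step (the buffer layout and function name here are
illustrative):

```rust
use std::convert::TryInto;
use std::io::{Error, ErrorKind};

fn parse_length(buf: &[u8]) -> std::io::Result<u32> {
    // Take exactly the bytes we need; reject anything shorter.
    let bytes: [u8; 4] = buf
        .get(0..4)
        .and_then(|s| s.try_into().ok())
        .ok_or_else(|| Error::new(ErrorKind::InvalidData, "buffer too short"))?;
    Ok(u32::from_be_bytes(bytes))
}
```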
Signed-off-by: Wei Liu <liuwe@microsoft.com>
There is no guarantee that the write can send the whole buffer at once.
On those rare occasions, we should return a sensible error.
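A small sketch of the check, assuming a plain `Write` implementor; the
error kind and message are illustrative:

```rust
use std::io::{Error, ErrorKind, Write};

fn send_all(sock: &mut impl Write, buf: &[u8]) -> std::io::Result<()> {
    let written = sock.write(buf)?;
    if written != buf.len() {
        // A short write is not expected here; surface it instead of
        // silently dropping the tail of the buffer.
        return Err(Error::new(ErrorKind::WriteZero, "partial write to swtpm socket"));
    }
    Ok(())
}
```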
Signed-off-by: Wei Liu <liuwe@microsoft.com>
The largest possible PTM response is only 16 bytes. Size the output
buffer correctly.
In the socket read function, rely on the caller to provide a
sufficiently large buffer. That eliminates another large stack variable.
In total this saves almost 8KB stack space.
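Illustratively, the fixed-size response buffer and the caller-provided
read buffer could look like this (names and shapes are made up for the
sketch):

```rust
use std::io::Read;

// Largest possible PTM response handled by this code path.
const MAX_PTM_RESPONSE_SIZE: usize = 16;

// The read helper no longer allocates a large local buffer; the caller
// passes one that is already sized appropriately.
fn read_response(sock: &mut impl Read, out: &mut [u8]) -> std::io::Result<usize> {
    sock.read(out)
}

fn handle_command(sock: &mut impl Read) -> std::io::Result<Vec<u8>> {
    let mut out = [0u8; MAX_PTM_RESPONSE_SIZE];
    let n = read_response(sock, &mut out)?;
    Ok(out[..n].to_vec())
}
```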
Signed-off-by: Wei Liu <liuwe@microsoft.com>
Change "thead" to "thread".
Also make sure the two messages are distinguishable by adding "vmm" and
"vm" prefixes.
Signed-off-by: Wei Liu <liuwe@microsoft.com>
The number of aligned operations cannot be larger than the number of
descriptors. Initializing the capacity to 1 is good enough per the
observation that most of the time there is only one data descriptor in a
given request.
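In code terms this is simply the following (variable name and element
type are illustrative):

```rust
// Most requests carry a single data descriptor, so a capacity of 1 avoids
// over-allocating; the Vec grows as needed in the rare multi-descriptor case.
let mut aligned_operations: Vec<(u64, usize)> = Vec::with_capacity(1);
```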
Signed-off-by: Wei Liu <liuwe@microsoft.com>
Before Linux v6.0, AArch64 didn't support "socket" in "cpu-map"
(CPU topology) of FDT.
We found that clusters can be used in the same way as sockets. That is
how we implemented the socket settings in Cloud Hypervisor. But in fact
it was a bug.
Linux commit 26a2b7 fixed the mistake, so cluster nodes can no longer
act as sockets. And in a following commit dea8c0, sockets were
supported.
This patch fixes the way sockets are configured. In each socket, a
default cluster is added to contain all the cores, because the cluster
layer is mandatory in the CPU topology on AArch64.
This fix will break the socket settings on guests where the kernel
version is lower than v6.0. In that case, if the socket number is set to
more than 1, the kernel will treat that as an FDT mistake and all the
CPUs will be put in a single cluster of a single socket.
The patch only impacts the case of using FDT, not ACPI.
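A hedged sketch of the resulting cpu-map layout using the vm-fdt writer
API; the function shape, node naming and phandle handling are simplified
and not the exact Cloud Hypervisor code:

```rust
use vm_fdt::{FdtWriter, FdtWriterResult};

fn create_cpu_map(
    fdt: &mut FdtWriter,
    sockets: u32,
    cores_per_socket: u32,
    cpu_phandles: &[u32],
) -> FdtWriterResult<()> {
    let cpu_map = fdt.begin_node("cpu-map")?;
    for s in 0..sockets {
        let socket = fdt.begin_node(&format!("socket{s}"))?;
        // One default cluster per socket: the cluster level is mandatory in
        // the AArch64 cpu-map binding, even when it is not used for grouping.
        let cluster = fdt.begin_node("cluster0")?;
        for c in 0..cores_per_socket {
            let core = fdt.begin_node(&format!("core{c}"))?;
            fdt.property_u32("cpu", cpu_phandles[(s * cores_per_socket + c) as usize])?;
            fdt.end_node(core)?;
        }
        fdt.end_node(cluster)?;
        fdt.end_node(socket)?;
    }
    fdt.end_node(cpu_map)?;
    Ok(())
}
```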
Signed-off-by: Michael Zhao <michael.zhao@arm.com>