Commit Graph

62 Commits

Author SHA1 Message Date
Yong He
0149e65081 vm-device: support batch update interrupt source group GSI
Split interrupt source group restore into two steps, first restore
the irqfd for each interrupt source entry, and second restore the
GSI routing of the entire interrupt source group.

This patch will reduce restore latency of interrupt source group,
and in a 200-concurrent restore test, the patch reduced the
average IOAPIC restore time from 15ms to 1ms.

Signed-off-by: Yong He <alexyonghe@tencent.com>
2023-08-03 15:58:36 +01:00
Rob Bradford
5e52729453 misc: Automatically fix cargo clippy issues added in 1.65 (stable)
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-12-14 14:27:19 +00:00
Bo Chen
a9ec0f33c0 misc: Fix clippy issues
Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-11-02 09:41:43 +01:00
Michael Zhao
5d45d6d0fb vmm: Move GIC unit test to hypervisor crate
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-06-06 10:17:26 +08:00
Michael Zhao
0fd6521759 aarch64: Avoid depending on layout in GIC code
Removing the dependency on `layout` helps moving GIC code into
`hypervisor` crate.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-05-27 10:57:50 +08:00
Rob Bradford
b1bd87df19 vmm: Simplify MsiInterruptManager generics
By taking advantage of the fact that IrqRoutingEntry is exported by the
hypervisor crate (that is typedef'ed to the hypervisor specific version)
then the code for handling the MsiInterruptManager can be simplified.

This is particularly useful if in this future it is not a typedef but
rather a wrapper type.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-11 11:19:14 +01:00
Rob Bradford
3f9e8d676a hypervisor: Move creation of irq routing struct to hypervisor crate
This removes the requirement to leak as many datastructures from the
hypervisor crate into the vmm crate.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-05-11 11:19:14 +01:00
Rob Bradford
ed87e42e6f vm-device, pci, devices: Remove InterruptSourceGroup::{un}mask
The calls to these functions are always preceded by a call to
InterruptSourceGroup::update(). By adding a masked boolean to that
function call it possible to remove 50% of the calls to the
KVM_SET_GSI_ROUTING ioctl as the the update will correctly handle the
masked or unmasked case.

This causes the ioctl to disappear from the perf report for a boot of
the VM.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-04-11 22:56:48 +01:00
Yi Wang
5375b84e3b vmm: interrupt: fix msi mask irq causing kernel panic on AMD
When mask a msi irq, we set the entry.masked to be true, so kvm
hypervisor will not pass the gsi to kernel through KVM_SET_GSI_ROUTING
ioctl which update kvm->irq_routing. This will trigger kernel
panic on AMD platform when the gsi is the largest one in kernel
kvm->irqfds.items:

crash> bt
PID: 22218  TASK: ffff951a6ad74980  CPU: 73  COMMAND: "vcpu8"
 #0 [ffffb1ba6707fa40] machine_kexec at ffffffff8565b397
 #1 [ffffb1ba6707fa90] __crash_kexec at ffffffff85788a6d
 #2 [ffffb1ba6707fb58] crash_kexec at ffffffff8578995d
 #3 [ffffb1ba6707fb70] oops_end at ffffffff85623c0d
 #4 [ffffb1ba6707fb90] no_context at ffffffff856692c9
 #5 [ffffb1ba6707fbf8] exc_page_fault at ffffffff85f95b51
 #6 [ffffb1ba6707fc50] asm_exc_page_fault at ffffffff86000ace
    [exception RIP: svm_update_pi_irte+227]
    RIP: ffffffffc0761b53  RSP: ffffb1ba6707fd08  RFLAGS: 00010086
    RAX: ffffb1ba6707fd78  RBX: ffffb1ba66d91000  RCX: 0000000000000001
    RDX: 00003c803f63f1c0  RSI: 000000000000019a  RDI: ffffb1ba66db2ab8
    RBP: 000000000000019a   R8: 0000000000000040   R9: ffff94ca41b82200
    R10: ffffffffffffffcf  R11: 0000000000000001  R12: 0000000000000001
    R13: 0000000000000001  R14: ffffffffffffffcf  R15: 000000000000005f
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #7 [ffffb1ba6707fdb8] kvm_irq_routing_update at ffffffffc09f19a1 [kvm]
 #8 [ffffb1ba6707fde0] kvm_set_irq_routing at ffffffffc09f2133 [kvm]
 #9 [ffffb1ba6707fe18] kvm_vm_ioctl at ffffffffc09ef544 [kvm]
    RIP: 00007f143c36488b  RSP: 00007f143a4e04b8  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: 00007f05780041d0  RCX: 00007f143c36488b
    RDX: 00007f05780041d0  RSI: 000000004008ae6a  RDI: 0000000000000020
    RBP: 00000000000004e8   R8: 0000000000000008   R9: 00007f05780041e0
    R10: 00007f0578004560  R11: 0000000000000246  R12: 00000000000004e0
    R13: 000000000000001a  R14: 00007f1424001c60  R15: 00007f0578003bc0
    ORIG_RAX: 0000000000000010  CS: 0033  SS: 002b

To solve this problem, move route.disable() before set_gsi_routes() to
remove the gsi from irqfds.items first.

This problem only exists on AMD platform, 'cause on Intel platform
kernel just return when update irte while it only prints a warning on
AMD.

Also, this patch adjusts the order of enable() and set_gsi_routes() in
unmask(), which should do no harm.

Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
2022-03-10 09:27:50 +01:00
Yi Wang
db9e5e5a87 vmm: interrupt: fix msi mask irq causing kernel panic on AMD
When mask a msi irq, we set the entry.masked to be true, so kvm
hypervisor will not pass the gsi to kernel through KVM_SET_GSI_ROUTING
ioctl which update kvm->irq_routing. This will trigger kernel
panic on AMD platform when the gsi is the largest one in kernel
kvm->irqfds.items:

crash> bt
PID: 22218  TASK: ffff951a6ad74980  CPU: 73  COMMAND: "vcpu8"
 #0 [ffffb1ba6707fa40] machine_kexec at ffffffff8565b397
 #1 [ffffb1ba6707fa90] __crash_kexec at ffffffff85788a6d
 #2 [ffffb1ba6707fb58] crash_kexec at ffffffff8578995d
 #3 [ffffb1ba6707fb70] oops_end at ffffffff85623c0d
 #4 [ffffb1ba6707fb90] no_context at ffffffff856692c9
 #5 [ffffb1ba6707fbf8] exc_page_fault at ffffffff85f95b51
 #6 [ffffb1ba6707fc50] asm_exc_page_fault at ffffffff86000ace
    [exception RIP: svm_update_pi_irte+227]
    RIP: ffffffffc0761b53  RSP: ffffb1ba6707fd08  RFLAGS: 00010086
    RAX: ffffb1ba6707fd78  RBX: ffffb1ba66d91000  RCX: 0000000000000001
    RDX: 00003c803f63f1c0  RSI: 000000000000019a  RDI: ffffb1ba66db2ab8
    RBP: 000000000000019a   R8: 0000000000000040   R9: ffff94ca41b82200
    R10: ffffffffffffffcf  R11: 0000000000000001  R12: 0000000000000001
    R13: 0000000000000001  R14: ffffffffffffffcf  R15: 000000000000005f
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #7 [ffffb1ba6707fdb8] kvm_irq_routing_update at ffffffffc09f19a1 [kvm]
 #8 [ffffb1ba6707fde0] kvm_set_irq_routing at ffffffffc09f2133 [kvm]
 #9 [ffffb1ba6707fe18] kvm_vm_ioctl at ffffffffc09ef544 [kvm]
    RIP: 00007f143c36488b  RSP: 00007f143a4e04b8  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: 00007f05780041d0  RCX: 00007f143c36488b
    RDX: 00007f05780041d0  RSI: 000000004008ae6a  RDI: 0000000000000020
    RBP: 00000000000004e8   R8: 0000000000000008   R9: 00007f05780041e0
    R10: 00007f0578004560  R11: 0000000000000246  R12: 00000000000004e0
    R13: 000000000000001a  R14: 00007f1424001c60  R15: 00007f0578003bc0
    ORIG_RAX: 0000000000000010  CS: 0033  SS: 002b

To solve this problem, move route.disable() before set_gsi_routes() to
remove the gsi from irqfds.items first.

This problem only exists on AMD platform, 'cause on Intel platform
kernel just return when update irte while it only prints a warning on
AMD.

Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
2022-03-10 09:27:50 +01:00
Michael Zhao
c9374d87ac vmm: Update devid in kvm_irq_routing_entry
After introducing multiple PCI segments, the `devid` value in
`kvm_irq_routing_entry` exceeds the maximum supported range on AArch64.

This commit restructed the `devid` to the allowed range.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-12-01 09:24:01 +08:00
Sebastien Boeuf
dcc646f5b1 clippy: Fix redundant allocations
With the new beta version, clippy complains about redundant allocation
when using Arc<Box<dyn T>>, and suggests replacing it simply with
Arc<dyn T>.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-07-29 13:28:57 +02:00
Michael Zhao
239e39ddbc vmm: Fix clippy warnings on AArch64
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-06-24 08:59:53 -07:00
Michael Zhao
ff46fb69d0 aarch64: Fix IRQ number setting for ACPI
On FDT, VMM can allocate IRQ from 0 for devices.
But on ACPI, the lowest range below 32 has to be avoided.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-05-25 10:20:37 +02:00
Wei Liu
810ed7e887 vmm: interrupt: drop unnecessary type from impl
The original code had a generic type E. It was later replaced by a
concrete type. The code should have been simplified when the replacement
happened.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2021-04-13 14:18:14 +01:00
Wei Liu
27ba8133a4 vmm: interrupt: drop RoutingEntryExt trait
We can now directly use associated functions.

This simplifies code and causes no functional change.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2021-03-25 10:13:43 +01:00
Wei Liu
f5c550affb vmm: interrupt: simplify interrupt handling traits
Drop the generic type E and use IrqRoutngEntry directly. This allows
dropping a bunch of trait bounds from code.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2021-03-25 10:13:43 +01:00
Wei Liu
1d9f27c9fb vmm: interrupt: extract common code from MSHV and KVM
Their make_entry functions look the same now. Extract the logic to a
common function.

No functional change.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2021-03-25 10:13:43 +01:00
Vineeth Pillai
68401e6e4a hypervisor:mshv: Support the move of MSI routing to kernel
Signed-off-by: Vineeth Pillai <viremana@linux.microsoft.com>
2021-03-23 11:06:13 +01:00
Michael Zhao
afc83582be aarch64: Enable IRQ routing for legacy devices
On AArch64, interrupt controller (GIC) is emulated by KVM. VMM need to
set IRQ routing for devices, including legacy ones.

Before this commit, IRQ routing was only set for MSI. Legacy routing
entries of type KVM_IRQ_ROUTING_IRQCHIP were missing. That is way legacy
devices (like serial device ttyS0) does not work.

The setting of X86 IRQ routing entries are not impacted.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-03-15 20:59:50 +08:00
Sebastien Boeuf
3bd47ffdc1 interrupt: Add a notifier method to the InterruptController
Both GIC and IOAPIC must implement a new method notifier() in order to
provide the caller with an EventFd corresponding to the IRQ it refers
to.

This is needed in anticipation for supporting INTx with VFIO PCI
devices.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-02-10 17:34:56 +00:00
Sebastien Boeuf
acfbee5b7a interrupt: Make notifier function return Option<EventFd>
In anticipation for supporting the notifier function for the legacy
interrupt source group, we need this function to return an EventFd
instead of a reference to this same EventFd.

The reason is we can't return a reference when there's an Arc<Mutex<>>
involved in the call chain.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-02-10 17:34:56 +00:00
Muminul Islam
f4af668d76 hypervisor, vmm: Implement MsiInterruptOps for mshv
Co-Developed-by: Wei Liu <liuwe@microsoft.com>
Signed-off-by: Wei Liu <liuwe@microsoft.com>
Signed-off-by: Muminul Islam <muislam@microsoft.com>
2020-12-09 14:55:20 +01:00
Muminul Islam
9ce6c3b75c hypervisor, vmm: Feature guard KVM specific code
There are some code base and function which are purely KVM specific for
now and we don't have those supports in mshv at the moment but we have plan
for the future. We are doing a feature guard with KVM. For example, KVM has
mp_state, cpu clock support,  which we don't have for mshv. In order to build
those code we are making the code base for KVM specific compilation.

Signed-off-by: Muminul Islam <muislam@microsoft.com>
2020-12-09 14:55:20 +01:00
Rob Bradford
ffaab46934 misc: Use a more relaxed memory model when possible
When a total ordering between multiple atomic variables is not required
then use Ordering::Acquire with atomic loads and Ordering::Release with
atomic stores.

This will improve performance as this does not require a memory fence
on x86_64 which Ordering::SeqCst will use.

Add a comment to the code in the vCPU handling code where it operates on
multiple atomics to explain why Ordering::SeqCst is required.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-12-02 19:04:30 +01:00
Rob Bradford
0fec326582 hypervisor, vmm: Remove shared ownership of VmmOps
This interface is used by the vCPU thread to delegate responsibility for
handling MMIO/PIO operations and to support different approaches than a
VM exit.

During profiling I found that we were spending 13.75% of the boot CPU
uage acquiring access to the object holding the VmmOps via
ArcSwap::load_full()

    13.75%     6.02%  vcpu0            cloud-hypervisor    [.] arc_swap::ArcSwapAny<T,S>::load_full
            |
            ---arc_swap::ArcSwapAny<T,S>::load_full
               |
                --13.43%--<hypervisor::kvm::KvmVcpu as hypervisor::cpu::Vcpu>::run
                          std::sys_common::backtrace::__rust_begin_short_backtrace
                          core::ops::function::FnOnce::call_once{{vtable-shim}}
                          std::sys::unix:🧵:Thread:🆕:thread_start

However since the object implementing VmmOps does not need to be mutable
and it is only used from the vCPU side we can change the ownership to
being a simple Arc<> that is passed in when calling create_vcpu().

This completely removes the above CPU usage from subsequent profiles.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2020-11-19 00:16:02 +01:00
Michael Zhao
2f2e10ea35 arch: Remove GICv2
Virtio-mmio is removed, now virtio-pci is the only option for virtio
transport layer. We use MSI for PCI device interrupt. While GICv2, the
legacy interrupt controller, doesn't support MSI. So GICv2 is not very
practical for Cloud-hypervisor, we can remove it.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2020-10-19 14:58:48 +01:00
Wei Liu
4ef97d8ddb vmm: interrupts: clearly separate MsiInterruptGroup and InterruptRoute
MsiInterruptGroup doesn't need to know the internal field names of
InterruptRoute. Introduce two helper functions to eliminate references
to irq_fd. This is done similarly to the enable and disable helper
functions.

Also drop the pub keyword from InterruptRoute fields. It is not needed
anymore.

No functional change.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-09-29 13:51:35 +02:00
Wei Liu
7e130a65ba vmm: interrupts: adjust set_gsi_routes
There is no point in manually dropping the lock for gsi_msi_routes then
instantly grabbing it again in set_gsi_routes.

Make set_gsi_routes take a reference to the routing hashmap instead.

No functional change intended.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-09-25 17:17:35 +02:00
Henry Wang
e7acbcc184 arch: AArch64: support saving RDIST pending tables into guest RAM
This commit adds a function which allows to save RDIST pending
tables to the guest RAM, as well as unit test case for it.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Henry Wang
29ce3076c2 tests: AArch64: Add unit test cases for accessing GIC registers
This commit adds the unit test cases for getting/setting the GIC
distributor, redistributor and ICC registers.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2020-09-23 12:37:25 +01:00
Michael Zhao
e3e771727a arch: Refactor GIC code to seperate KVM specific code
Shrink GICDevice trait to contain hypervisor agnostic API's only, which
are used in generating FDT.
Move all KVM specific logic into KvmGICDevice trait.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2020-07-21 16:22:02 +02:00
Wei Liu
e1af251c9f vmm, hypervisor: adjust set_gsi_routing / set_gsi_routes
Make set_gsi_routing take a list of IrqRoutingEntry. The construction of
hypervisor specific structure is left to set_gsi_routing.

Now set_gsi_routes, which is part of the interrupt module, is only
responsible for constructing a list of routing entries.

This further splits hypervisor specific code from hypervisor agnostic
code.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-07-20 07:32:32 +02:00
Wei Liu
d80e383dbb arch: move test cases to vmm crate
This saves us from adding a "kvm" feature to arch crate merely for the
purpose of running tests.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-07-15 17:21:07 +02:00
Michael Zhao
cce6237536 pci: Enable GSI routing (MSI type) for AArch64
In this commit we saved the BDF of a PCI device and set it to "devid"
in GSI routing entry, because this field is mandatory for GICv3-ITS.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2020-07-14 14:34:54 +01:00
Samuel Ortiz
8186a8eee6 vmm: interrupt: Rename vm_fd
The _fd suffix is KVM specific. But since it now point to an hypervisor
agnostic hypervisor::Vm implementation, we should just rename it vm.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-07-06 09:35:30 +01:00
Wei Liu
2b8accf49a vmm: interrupt: put KVM code into a kvm module
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-06-30 12:09:42 +01:00
Wei Liu
c31e747005 vmm: interrupt: generify impl InterruptManager for MsiInterruptManager
The logic can be shared among hypervisor implementations.

The 'static bound is used such that we don't need to deal with extra
lifetime parameter everywhere. It should be okay because we know the
entry type E doesn't contain any reference.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-06-30 12:09:42 +01:00
Wei Liu
ade904e356 vmm: interrupt: generify impl InterruptSourceGroup for MsiInterruptGroup
At this point we can use the same logic for all hypervisor
implementations.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-06-30 12:09:42 +01:00
Wei Liu
2b466ed80c vmm: interrupt: provide MsiInterruptGroupOps trait
Currently it only contains a function named set_gsi_routes.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-06-30 12:09:42 +01:00
Wei Liu
b2abead65b vmm: interrupt: provide and use extension trait RoutingEntryExt
This trait contains a function which produces a interrupt routing entry.

Implement that trait for KvmRoutingEntry and rewrite the update
function.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-06-30 12:09:42 +01:00
Wei Liu
4dbca81b86 vmm: interrupt: rename set_kvm_gsi_routes to set_gsi_routes
This function will be used to commit routing information to the
hypervisor.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-06-30 12:09:42 +01:00
Wei Liu
fd7b42e54d vmm: interrupt: inline mask_kvm_entry
The logic for looking up the correct interrupt can be shared among
hypervisors.

No functional change.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-06-30 12:09:42 +01:00
Wei Liu
0ec39da90c vmm: interrupt: generify KvmMsiInterruptManager
The observation is only the route entry is hypervisor dependent.

Keep a definition of KvmMsiInterruptManager to avoid too much code
churn.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-06-30 12:09:42 +01:00
Wei Liu
d5149e95cb vmm: interrupt: generify KvmRoutingEntry and KvmMsiInterruptGroup
The observation is that only the route field is hypervisor specific.

Provide a new function in blanket implementation. Also redefine
KvmRoutingEntry with RoutingEntry to avoid code churn.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-06-30 12:09:42 +01:00
Wei Liu
637f58bcd9 vmm: interrupt: drop Kvm prefix from KvmLegacyUserspaceInterruptManager
This data structure doesn't contain KVM specific stuff.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-06-30 12:09:42 +01:00
Wei Liu
574cab6990 vmm: interrupt: create GSI hashmap directly
The observation is that the GSI hashmap remains untouched before getting
passed into the MSI interrupt manager. We can create that hashmap
directly in the interrupt manager's new function.

The drops one import from the interrupt module.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-06-30 12:09:42 +01:00
Wei Liu
4cc37d7b9a vmm: interrupt: drop a few pub keywords
Those items are not used elsewhere. Restrict their scope.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-06-24 12:39:42 +02:00
Wei Liu
1661adbbaf vmm: interrupt: add "Kvm" prefix to MsiInterruptGroup
The structure is tightly coupled with KVM. It uses KVM specific
structures and calls. Add Kvm prefix to it.

Microsoft hypervisor will implement its own interrupt group(s) later.

No functional change intended.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2020-06-24 12:39:42 +02:00
Muminul Islam
e4dee57e81 arch, pci, vmm: Initial switch to the hypervisor crate
Start moving the vmm, arch and pci crates to being hypervisor agnostic
by using the hypervisor trait and abstractions. This is not a complete
switch and there are still some remaining KVM dependencies.

Signed-off-by: Muminul Islam <muislam@microsoft.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2020-06-22 15:03:15 +02:00