Commit Graph

1784 Commits

Author SHA1 Message Date
Sebastien Boeuf
86bc313f38 virtio-devices, vmm: Register a DMA handler to VirtioPciDevice
Given that some virtio device might need some DMA handling, we provide a
way to store this through the VirtioPciDevice layer, so that it can be
accessed when the PCI device is removed.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-03-11 12:37:17 +01:00
Sebastien Boeuf
54d63e774c vmm: device_manager: Extend MetaVirtioDevice with a DMA handler
In anticipation for handling potential DMA mapping/unmapping operations for a
virtio device, we extend the MetaVirtioDevice with an additional field
that holds an optional DMA handler.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-03-11 12:37:17 +01:00
Sebastien Boeuf
f801b0fc72 vmm: device_manager: Factorize virtio device tuple into structure
The tuple of information related to each virtio device is too big, and
it's better to factorize it through a dedicated structure.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-03-11 12:37:17 +01:00
Sebastien Boeuf
80296b9497 vmm: device_manager: Remove typedef VirtioDeviceArc
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-03-11 12:37:17 +01:00
Yi Wang
5375b84e3b vmm: interrupt: fix msi mask irq causing kernel panic on AMD
When mask a msi irq, we set the entry.masked to be true, so kvm
hypervisor will not pass the gsi to kernel through KVM_SET_GSI_ROUTING
ioctl which update kvm->irq_routing. This will trigger kernel
panic on AMD platform when the gsi is the largest one in kernel
kvm->irqfds.items:

crash> bt
PID: 22218  TASK: ffff951a6ad74980  CPU: 73  COMMAND: "vcpu8"
 #0 [ffffb1ba6707fa40] machine_kexec at ffffffff8565b397
 #1 [ffffb1ba6707fa90] __crash_kexec at ffffffff85788a6d
 #2 [ffffb1ba6707fb58] crash_kexec at ffffffff8578995d
 #3 [ffffb1ba6707fb70] oops_end at ffffffff85623c0d
 #4 [ffffb1ba6707fb90] no_context at ffffffff856692c9
 #5 [ffffb1ba6707fbf8] exc_page_fault at ffffffff85f95b51
 #6 [ffffb1ba6707fc50] asm_exc_page_fault at ffffffff86000ace
    [exception RIP: svm_update_pi_irte+227]
    RIP: ffffffffc0761b53  RSP: ffffb1ba6707fd08  RFLAGS: 00010086
    RAX: ffffb1ba6707fd78  RBX: ffffb1ba66d91000  RCX: 0000000000000001
    RDX: 00003c803f63f1c0  RSI: 000000000000019a  RDI: ffffb1ba66db2ab8
    RBP: 000000000000019a   R8: 0000000000000040   R9: ffff94ca41b82200
    R10: ffffffffffffffcf  R11: 0000000000000001  R12: 0000000000000001
    R13: 0000000000000001  R14: ffffffffffffffcf  R15: 000000000000005f
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #7 [ffffb1ba6707fdb8] kvm_irq_routing_update at ffffffffc09f19a1 [kvm]
 #8 [ffffb1ba6707fde0] kvm_set_irq_routing at ffffffffc09f2133 [kvm]
 #9 [ffffb1ba6707fe18] kvm_vm_ioctl at ffffffffc09ef544 [kvm]
    RIP: 00007f143c36488b  RSP: 00007f143a4e04b8  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: 00007f05780041d0  RCX: 00007f143c36488b
    RDX: 00007f05780041d0  RSI: 000000004008ae6a  RDI: 0000000000000020
    RBP: 00000000000004e8   R8: 0000000000000008   R9: 00007f05780041e0
    R10: 00007f0578004560  R11: 0000000000000246  R12: 00000000000004e0
    R13: 000000000000001a  R14: 00007f1424001c60  R15: 00007f0578003bc0
    ORIG_RAX: 0000000000000010  CS: 0033  SS: 002b

To solve this problem, move route.disable() before set_gsi_routes() to
remove the gsi from irqfds.items first.

This problem only exists on AMD platform, 'cause on Intel platform
kernel just return when update irte while it only prints a warning on
AMD.

Also, this patch adjusts the order of enable() and set_gsi_routes() in
unmask(), which should do no harm.

Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
2022-03-10 09:27:50 +01:00
Yi Wang
db9e5e5a87 vmm: interrupt: fix msi mask irq causing kernel panic on AMD
When mask a msi irq, we set the entry.masked to be true, so kvm
hypervisor will not pass the gsi to kernel through KVM_SET_GSI_ROUTING
ioctl which update kvm->irq_routing. This will trigger kernel
panic on AMD platform when the gsi is the largest one in kernel
kvm->irqfds.items:

crash> bt
PID: 22218  TASK: ffff951a6ad74980  CPU: 73  COMMAND: "vcpu8"
 #0 [ffffb1ba6707fa40] machine_kexec at ffffffff8565b397
 #1 [ffffb1ba6707fa90] __crash_kexec at ffffffff85788a6d
 #2 [ffffb1ba6707fb58] crash_kexec at ffffffff8578995d
 #3 [ffffb1ba6707fb70] oops_end at ffffffff85623c0d
 #4 [ffffb1ba6707fb90] no_context at ffffffff856692c9
 #5 [ffffb1ba6707fbf8] exc_page_fault at ffffffff85f95b51
 #6 [ffffb1ba6707fc50] asm_exc_page_fault at ffffffff86000ace
    [exception RIP: svm_update_pi_irte+227]
    RIP: ffffffffc0761b53  RSP: ffffb1ba6707fd08  RFLAGS: 00010086
    RAX: ffffb1ba6707fd78  RBX: ffffb1ba66d91000  RCX: 0000000000000001
    RDX: 00003c803f63f1c0  RSI: 000000000000019a  RDI: ffffb1ba66db2ab8
    RBP: 000000000000019a   R8: 0000000000000040   R9: ffff94ca41b82200
    R10: ffffffffffffffcf  R11: 0000000000000001  R12: 0000000000000001
    R13: 0000000000000001  R14: ffffffffffffffcf  R15: 000000000000005f
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #7 [ffffb1ba6707fdb8] kvm_irq_routing_update at ffffffffc09f19a1 [kvm]
 #8 [ffffb1ba6707fde0] kvm_set_irq_routing at ffffffffc09f2133 [kvm]
 #9 [ffffb1ba6707fe18] kvm_vm_ioctl at ffffffffc09ef544 [kvm]
    RIP: 00007f143c36488b  RSP: 00007f143a4e04b8  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: 00007f05780041d0  RCX: 00007f143c36488b
    RDX: 00007f05780041d0  RSI: 000000004008ae6a  RDI: 0000000000000020
    RBP: 00000000000004e8   R8: 0000000000000008   R9: 00007f05780041e0
    R10: 00007f0578004560  R11: 0000000000000246  R12: 00000000000004e0
    R13: 000000000000001a  R14: 00007f1424001c60  R15: 00007f0578003bc0
    ORIG_RAX: 0000000000000010  CS: 0033  SS: 002b

To solve this problem, move route.disable() before set_gsi_routes() to
remove the gsi from irqfds.items first.

This problem only exists on AMD platform, 'cause on Intel platform
kernel just return when update irte while it only prints a warning on
AMD.

Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
2022-03-10 09:27:50 +01:00
Wei Liu
4cf22e4ec7 arch: do not hardcode MMIO region length in MmioDeviceInfo
Add a field for its length and fix up users.

Things work just because all hardcoded values agree with each other.
This is prone to breakage.

No functional change.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-03-04 15:21:48 +08:00
Feng Ye
6c1fe07d90 openapi: Mark ReceiveMigrationData.receiver_url as required
Signed-off-by: Feng Ye <yefeng@smartx.com>
2022-02-24 09:17:22 +01:00
Sebastien Boeuf
00fbd77494 vmm: api: Make 'local' optional in SendMigrationData
Make sure the OpenAPI definition matches the code.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-02-23 14:37:41 +01:00
Feng Ye
c504f302e9 vmm: api: Make VmSendMigrationData.local optional
Fixes: #3756

Signed-off-by: Feng Ye <yefeng@smartx.com>
2022-02-23 11:56:09 +00:00
Akira Moroo
2451c4d833 vmm: Implement GDB event handler to enable --gdb flag
This commit adds event fds and the event handler to send/receive
requests and responses from the GDB thread. It also adds `--gdb` flag to
enable GDB stub feature.

Signed-off-by: Akira Moroo <retrage01@gmail.com>
2022-02-23 11:16:09 +00:00
Akira Moroo
23bb629241 vmm: Add stop_on_boot to Vm to stop VM on boot
This commit adds `stop_on_boot` to `Vm` so that the VM stops before
starting on boot requested. This change is required to keep the target
VM stopped before a debugger attached as the user expected.

Signed-off-by: Akira Moroo <retrage01@gmail.com>
2022-02-23 11:16:09 +00:00
Akira Moroo
bae63a8b8c vmm: Add debug_request to send debug request
This commit adds `Vm::debug_request` to handle `GdbRequestPayload`,
which will be sent from the GDB thread.

Signed-off-by: Akira Moroo <retrage01@gmail.com>
2022-02-23 11:16:09 +00:00
Akira Moroo
2f430e08e1 vmm: Implement multicore GDB stub support
This commit adds GDB stub implementation with multicore support. This
implementaton is based on the gdbstub crate example code [1].

[1]
https://github.com/daniel5151/gdbstub/tree/master/examples/armv4t_multicore

Signed-off-by: Akira Moroo <retrage01@gmail.com>
2022-02-23 11:16:09 +00:00
Akira Moroo
f1c4705638 vmm: Add Debuggable trait implementation
This commit adds initial gdb.rs implementation for `Debuggable` trait to
describe a debuggable component. Some part of the trait bound
implementations is based on the crosvm GDB stub code [1].

[1] https://github.com/google/crosvm/blob/main/src/gdb.rs

Signed-off-by: Akira Moroo <retrage01@gmail.com>
2022-02-23 11:16:09 +00:00
Akira Moroo
a2a492f3df seccomp: Add ioctls to seccomp filter for guest debug
This commit adds `KVM_SET_GUEST_DEBUG` and `KVM_TRANSLATE` ioctls to
seccomp filter to enable guest debugging without `--seccomp=false`.

Signed-off-by: Akira Moroo <retrage01@gmail.com>
2022-02-23 11:16:09 +00:00
Akira Moroo
f452e51488 vmm: Add BreakPoint to VmState
This commit adds `VmState::BreakPoint` to handle hardware breakpoint.
The VM will enter this state when a breakpoint hits or a debugger
interrupts the execution.

Signed-off-by: Akira Moroo <retrage01@gmail.com>
2022-02-23 11:16:09 +00:00
Fabiano Fidêncio
dd77070f16 openapi: Update the PciBdf type
42b5d4a2f7 has changed how the PciBdf
field of a DeviceNode is represented (from an int32 to its own struct).

To avoid marshelling / demarshelling issues for the projects relying on
the openapi auto generated code, let's propagate the change, updating
the yaml file accordingly.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-02-22 15:10:08 +00:00
Michael Zhao
0fc3fad363 vmm: Limit "Dies" in VCPU topology on AArch64
`Dies per package` setting of VCPU topology doesnot apply on AArch64.
Now we only accept `1` value. This way we can make the `dies` field
transparent, avoid it from impacting the topology setting.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-02-22 09:21:00 +08:00
Michael Zhao
0fa31539eb vmm: Add default VCPU topology in PPTT on AArch64
When VCPU topology is not specified, fill the PPTT with default setting.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-02-22 09:21:00 +08:00
Sebastien Boeuf
0ac094c0d1 vmm: Handle TDX hypercalls with INVALID_OPERAND
Based on the helpers from the hypervisor crate, the VMM can identify
what type of hypercall has been issued through the KVM_EXIT_TDX reason.

For now, we only log warnings and set the status to INVALID_OPERAND
since these hypercalls aren't supported. The proper handling will be
implemented later.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-02-18 14:41:07 +01:00
Sebastien Boeuf
a3dfe726f8 vmm: cpu: Avoid useless cloning of Arc<Mutex<Vcpu>>
Since the object returned from CpuManager.create_vcpu() is never used,
we can avoid the cloning of this object.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-02-18 14:41:07 +01:00
Sebastien Boeuf
42b5d4a2f7 pci, vmm: Update DeviceNode to store PciBdf instead of u32
By having the DeviceNode storing a PciBdf, we simplify the internal code
as well as allow for custom Serialize/Deserialize implementation for the
PciBdf structure. These custom implementations let us display the PCI
s/b/d/f in a human readable format.

Fixes #3711

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-02-16 11:57:23 +00:00
Fabiano Fidêncio
5752a2a4fb openapi: Add the 204 response to vm-add-* actions
As we've added support for cold adding devices to a VM that was created
but not already started, we should propagate the `204` response
generated on those cases to the yaml file, so openapi-generator can
produce the correct client code on the go side, to handle both `200` and
`204` successful results.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-02-15 11:07:26 -08:00
Fabiano Fidêncio
5d2db68f67 vmm: lib: Allow config changes before the VM is booted
Instead of erroring out when trying to change the configuration of the
VM somewhere between the VM was created but not yet booted, let's allow
users to change that without any issue, as long as the VM has already
been created.

Fixes: #3639

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-02-15 11:07:26 -08:00
Fabiano Fidêncio
b780a916bb vmm: lib: Add unit tests
Let's add very basic unit for the vm_add_$device() functions, so we can
easily expand those when changing its behaviour in the coming commits.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-02-15 11:07:26 -08:00
Fabiano Fidêncio
16782e8c6d vmm: lib: Do the config validation in the Vmm
Instead of doing the validation of the configuration change as part of
the vm, let's do this in the uper layer, in the Vmm.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-02-15 11:07:26 -08:00
Fabiano Fidêncio
bd024bffb1 vmm: config: Move add_to_config to config.rs
Let's move add_to_config to config.rs so it can be used from both inside
and outside of the vm.rs file.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-02-15 11:07:26 -08:00
Fabiano Fidêncio
55479a64d2 openapi: Expose TDx configuration
TDx support is already present on the project for quite some time, but
the TDx configuration was not yet exposed to the ones using CH via the
OpenAPI auto generated code.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-02-14 11:12:12 +01:00
Rob Bradford
57184f110a openapi: Add PlatformConfig to OpenAPI spec
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-02-11 11:20:04 +00:00
Rob Bradford
20b9f95afd vmm: Attach all devices from specified segments to the IOMMU
Since the devices behind the IOMMU cannot be changed at runtime we offer
the ability to place all devices on user chosen segments behind the
IOMMU. This allows the hotplugging of devices behind the IOMMU provided
that they are assigned to a segment that is located behind the iommu.

Fixes: #911

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-02-11 11:20:04 +00:00
Rob Bradford
6994b33a24 vmm: Add "iommu_segments" to --platform
This provides a list of segments on which all devices will be placed
behind the IOMMU.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-02-11 11:20:04 +00:00
Sebastien Boeuf
052f38fa96 vmm: Enable guest to report free pages through virtio-balloon
Adding a new parameter free_page_reporting=on|off to the balloon device
so that we can enable the corresponding feature from virtio-balloon.

Running a VM with a balloon device where this feature is enabled allows
the guest to report pages that are free from guest's perspective. This
information is used by the VMM to release the corresponding pages on the
host.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-02-11 12:10:07 +01:00
Rob Bradford
5e19422fcf vmm: config: Fix PCI segment validation error format string
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-02-09 13:50:36 +00:00
Rob Bradford
26d1a76ad9 vmm: config: Validate balloon size is less than RAM size
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-02-09 13:50:36 +00:00
Sebastien Boeuf
10676b74dc vmm: Split VM config and VM state for snapshot/restore
In order to allow for human readable output for the VM configuration, we
pull it out of the snapshot, which becomes effectively the list of
states from the VM. The configuration is stored through a dedicated file
in JSON format (not including any binary output).

Having the ability to read and modify the VM configuration manually
between the snapshot and restore phases makes debugging easier, as well
as empowers users for extending the use cases relying on the
snapshot/restore feature.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-02-08 15:06:49 +00:00
Rob Bradford
507912385a vmm: Ensure that PIO and MMIO exits complete before pausing
As per this kernel documentation:

      For KVM_EXIT_IO, KVM_EXIT_MMIO, KVM_EXIT_OSI, KVM_EXIT_PAPR, KVM_EXIT_XEN,
      KVM_EXIT_EPR, KVM_EXIT_X86_RDMSR and KVM_EXIT_X86_WRMSR the corresponding
      operations are complete (and guest state is consistent) only after userspace
      has re-entered the kernel with KVM_RUN.  The kernel side will first finish
      incomplete operations and then check for pending signals.

      The pending state of the operation is not preserved in state which is
      visible to userspace, thus userspace should ensure that the operation is
      completed before performing a live migration.  Userspace can re-enter the
      guest with an unmasked signal pending or with the immediate_exit field set
      to complete pending operations without allowing any further instructions
      to be executed.

Since we capture the state as part of the pause and override it as part
of the resume we must ensure the state is consistent otherwise we will
lose the results of the MMIO or PIO operation that caused the exit from
which we paused.

Fixes: #3658

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-02-07 15:26:22 +00:00
Sebastien Boeuf
832f09a075 vmm: tdx: Insert payload into the HOB
If a payload is found in the TDVF section, and after it's been copied to
the guest memory, make sure to create the corresponding TdPayload
structure and insert it through the HOB.

Signed-off-by: Jiaqi Gao <jiaqi.gao@intel.com>
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-02-04 13:57:56 +01:00
Sebastien Boeuf
3c421593c3 vmm: tdx: Don't load the kernel the usual way
In case of TDX, if a kernel and/or a command line are provided by the
user, they can't be treated the same way as for the non-TDX case. That
is why this patch ensures the function load_kernel() is only invoked for
the non-TDX case.

For the TDX case, whenever TDVF contains a Payload and/or PayloadParam
sections, the file provided through --kernel and the parameters provided
through --cmdline are copied at the locations specified by each TDVF
section.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-02-04 13:57:56 +01:00
Sebastien Boeuf
7b93a8dd78 vmm: config: Allow --kernel to be used with TDX
The TDVF specification has been updated with the ability to provide a
specific payload, which means we will be able to achieve direct kernel
boot.

For that reason, let's not prevent the user from using --kernel
parameter when running with TDX.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-02-04 13:57:56 +01:00
Sebastien Boeuf
b3ca1d90e9 vmm: Stop dirty logging only if it has been started
Now that we introduced a separate method to indicate when the migration
is started, both start_dirty_log() and stop_dirty_log() don't have to
carry an implicit meaning as they can focus entirely on the dirty log
being started or stopped.

For that reason, we can now safely move stop_dirty_log() to the code
section performing non-local migration. It makes only sense to stop
logging dirty pages if this has been started before.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-02-03 13:33:26 +01:00
lizhaoxin1
a45e458c50 vm-migration: Add start_migration() to Migratable trait
In order to clearly decouple when the migration is started compared to
when the dirty logging is started, we introduce a new method to the
Migratable trait. This clarifies the semantics as we don't end up using
start_dirty_log() for identifying when the migration has been started.
And similarly, we rely on the already existing complete_migration()
method to know when the migration has been ended.

A bug was reported when running a local migration with a vhost-user-net
device in server mode. The reason was because the migration_started
variable was never set to "true", since the start_dirty_log() function
was never invoked.

Signed-off-by: lizhaoxin1 <Lxiaoyouling@163.com>
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-02-03 13:33:26 +01:00
Fabiano Fidêncio
0dafd47a7c vmm: openapi: Remove mention to net fds
While cloud-hypervisor does support receiving the file descriptors of a
tuntap device, advertising the fds structure via the openAPI can lead to
misinterpretations of what can and what should be done.

An unadvertised consumer will think that they could rather just set the
file descriptors there directly, or even pass them as a byte array.
However, the proper way to go in those cases would be actually sending
those via send_msg(), together with the request.

As hacking the openAPI auto-generated code to properly do this is not
*that* trivial, and as doing so during a `create VM` request is not
supported, we better not advertising those.

Please, for more details, also check:
https://github.com/cloud-hypervisor/cloud-hypervisor/pull/3607#issuecomment-1020935523

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-01-31 10:38:28 +00:00
Sebastien Boeuf
4e46a1bc3c vmm: api: Support multiple fds with add-net
Based on the latest code from the micro-http crate, this patch adds the
support for multiple file descriptors to be sent along with the add-net
request. This means we can now hotplug multiqueue network interface to
the VM.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-01-31 10:37:53 +00:00
Sebastien Boeuf
8eed276d14 vm-virtio: Define AccessPlatform trait
Moving the whole codebase to rely on the AccessPlatform definition from
vm-virtio so that we can fully remove it from virtio-queue crate.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-01-27 10:00:20 +00:00
Henry Wang
8f4aa07a80 vmm: vm: Init PMU during the VM restore process
If a PMU is enabled in a VM, we also need to initialize the PMU
when the VM is restored. Otherwise, vCPUs cannot be started after
the VM is restored.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2022-01-21 17:59:36 +08:00
Jianyong Wu
5462fd810c seccomp: add ioctl group to seccomp authorized list for arm64
When enable PMU on arm64, ioctl with group KVM_HAS_DEVICE_ATTR will be
blocked by seccomp, add it to authorized list.

Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2022-01-21 17:59:36 +08:00
Jianyong Wu
81c5855184 fdt: add PMU node to fdt
PMU node in fdt stores some important info like irq number.

Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2022-01-21 17:59:36 +08:00
Jianyong Wu
53060874a7 vmm: Init PMU for vcpu when create vm
PMU is needed in guest for performance profiling, thus should be
enabled.

Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2022-01-21 17:59:36 +08:00
Sebastien Boeuf
7b9a110540 vmm: tdx: Pass ACPI tables through the HOB
Relying on helpers for creating the ACPI tables and to add each table to
the HOB, this patch connects the dot to provide the set of ACPI tables
to the TD firmware.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-01-20 16:50:55 +00:00
Sebastien Boeuf
ea0729c016 vmm: acpi: Create ACPI tables for TDX
The way to create ACPI tables for TDX is different as each table must be
passed through the HOB. This means the XSDT table is not required since
the firmware will take care of creating it. Same for RSDP, this is
firmware responsibility to provide it to the guest.

That's why this patch creates a TDX dedicated function, returning a list
of Sdt objects, which will let the calling code copy the content of each
table through the HOB.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-01-20 16:50:55 +00:00
Sebastien Boeuf
cdc14815be vmm: tdx: Only create ACPI tables if not running TDX
In case of TDX, we don't want to create the ACPI tables the same way we
do for all the other use cases. That's because the ACPI tables don't
need to be written to guest memory at a specific address, instead they
are passed directly through the HOB.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-01-20 16:50:55 +00:00
Sebastien Boeuf
4fda4ad6c9 arch, vmm: tdx: Remove TD_VMM_DATA mechanism
It's been decided the ACPI tables will be passed to the firmware in a
different way, rather than using TD_VMM_DATA. Since TD_VMM_DATA was
introduced for this purpose, there's no reason to keep it in our
codebase.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2022-01-20 16:50:55 +00:00
Anatol Belski
e2a8a1483f acpi: aarch64: Implement DBG2 table
This table is listed as required in the ARM Base Boot Requirements
document. The particular need arises to make the serial debugging of
Windows guest functional.

Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
2022-01-20 09:11:21 +08:00
Michael Zhao
1db7718589 pci, vmm: Pass PCI BDF to vfio and vfio_user
On AArch64, PCI BDF is used for devId in MSI-X routing entry.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2022-01-18 18:00:00 -08:00
Rob Bradford
4ecc778efe vmm: Avoid deadlock between virtio device activation and vcpu pausing
Ensure all pending virtio activations (as triggered by MMIO write on the
vCPU threads leading to a barrier wait) are completed before pausing the
vCPUs as otherwise there will a deadlock with the VMM waiting for the
vCPU to acknowledge it's pause and the vCPU waiting for the VMM to
activate the device and release the barrier.

Fixes: #3585

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-18 17:30:06 -08:00
Wei Liu
ef05354c81 memory_manager: drop unneeded clippy suppressions
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-01-18 17:23:27 -08:00
Wei Liu
277cfd07ba device_manager: use if let to drop single match
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-01-18 17:23:27 -08:00
Henry Wang
14ba3f68d3 vmm: cpu: Remove unused import in unit tests
These are the leftovers from the commit 8155be2:
arch: aarch64: vm_memory is not required when configuring vcpu

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2022-01-18 20:34:50 +08:00
Rob Bradford
70f7f64e23 vmm: api: Add "local" option to OpenAPI YAML file
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-18 09:07:47 +00:00
Rob Bradford
88952cc500 vmm: Send FDs across unix socket for migration when in local mode
When in local migration mode send the FDs for the guest memory over the
socket along with the slot that the FD is associated with. This removes
the requirement for copying the guest RAM and gives significantly faster
live migration performance (of the order of 3s to 60ms).

Fixes: #3566

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-18 09:07:47 +00:00
Rob Bradford
715a7d9065 vmm: Add convenience API for getting slots to FDs mapping
This will be used for sending those file descriptors for local
migration.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-18 09:07:47 +00:00
Rob Bradford
1676fffaad vmm: Check shared memory is enabled for local migration
This is required so that the receiving process can access the existing
process's memory.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-18 09:07:47 +00:00
Rob Bradford
1daef5e8c9 vmm: Propagate the set of memory slots to FDs received in migration
Create the VM using the FDs (wrapped in Files) that have been received
during the migration process.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-18 09:07:47 +00:00
Rob Bradford
735658a49d vm-migration: Add MemoryFd command for setting FDs for memory
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-18 09:07:47 +00:00
Rob Bradford
b95e46565c vmm: Support using existing files for memory slots
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-18 09:07:47 +00:00
Rob Bradford
eeba1d3ad8 vmm: Support using an existing FD for memory
If this FD (wrapped in a File) is supplied when the RAM region is being
created use that over creating a new one.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-18 09:07:47 +00:00
Rob Bradford
271e17bd79 vmm: Extract code for opening a file for memory
This function is used to open an FD (wrapped in a File) that points to
guest memory from memfd_create() or backed on the filesystem.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-18 09:07:47 +00:00
Rob Bradford
b9c260c0de vmm, ch-remote: Add "local" option to send-migration API
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-18 09:07:47 +00:00
Wei Liu
8155be2e6b arch: aarch64: vm_memory is not required when configuring vcpu
Drop the unused parameter throughout the code base.

Also take the chance to drop a needless clone.

No functional change intended.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2022-01-14 16:03:12 -08:00
Fabiano Fidêncio
fb1755d85d vmm: openapi: Fix "fds" field name for NetConfig
We've been currently using "fd" as the field name, but it should be
called "fds" since  6664e5a6e7 introduced
the name change on the structure field.

Fixes: #3560

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-01-12 16:44:51 +01:00
Fabiano Fidêncio
cb15ae5462 vmm: openapi: Fix default value for tap
`tap` has its default value set to `None`, but in the openapi yaml file
we've been setting it to `""`.

When using this code on the Kata Containers side we'd be hit by a non
expected behaviour of cloud-hypervisor, as even when using a different
method to initialise the `tuntap` device the code would be treated as if
using `--net tap` (which is a valid use-case).

Related: #3554

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-01-10 13:11:33 +00:00
Rob Bradford
70af81d755 vmm: config: Fix clippy (unnecessary_to_owned) issue
warning: unnecessary use of `to_string`
    --> vmm/src/config.rs:2199:38
     |
2199 | ...                   .get(&memory_zone.to_string())
     |                            ^^^^^^^^^^^^^^^^^^^^^^^^ help: use: `memory_zone`
     |
     = note: `#[warn(clippy::unnecessary_to_owned)]` on by default
     = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_to_owned

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-07 08:16:26 -08:00
Rob Bradford
8fa3864ae8 vmm: acpi: Fix clippy (needless_late_init) issue
warning: unneeded late initalization
   --> vmm/src/acpi.rs:525:5
    |
525 |     let mut prev_tbl_len: u64;
    |     ^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
    = note: `#[warn(clippy::needless_late_init)]` on by default
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_late_init
help: declare `prev_tbl_len` here
    |
552 |     let mut prev_tbl_len: u64 = madt.len() as u64;
    |     ~~~~~~~~~~~~~~~~~~~~~~~~~

warning: unneeded late initalization
   --> vmm/src/acpi.rs:526:5
    |
526 |     let mut prev_tbl_off: GuestAddress;
    |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_late_init
help: declare `prev_tbl_off` here
    |
553 |     let mut prev_tbl_off: GuestAddress = madt_offset;
    |     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-07 08:16:26 -08:00
Rob Bradford
0fcbcea275 vmm: seccomp: Remove set_tid_address syscall from seccomp filter
The origins of the requirement for this syscall in the seccomp filter
list are unknown.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-06 09:52:39 -08:00
Rob Bradford
10d1922393 vmm: seccomp: Remove arch_prctl syscall from seccomp filter
The origins of the requirement for this syscall in the seccomp filter
list are unknown.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-06 09:52:24 -08:00
Rob Bradford
d57c49664f vmm: Fix potential deadlock in CpuManager
Remove requirement for CpuManager to lock the Vcpu when starting the
vCPU as the numerical id corresponds to the index in the the vector.
This avoids a potential lock inversion between the Vcpu and CpuManager.

WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock) (pid=30497)
  Cycle in lock order graph: M48 (0x7b0c00001aa0) => M121 (0x7b0c000022b0) => M48

  Mutex M121 acquired here while holding mutex M48 in thread T1:
    #0 pthread_mutex_lock /rustc/llvm/src/llvm-project/compiler-rt/lib/tsan/../sanitizer_common/sanitizer_common_interceptors.inc:4249:3 (cloud-hypervisor+0x9c368)
    #1 std::sys::unix::mutex::Mutex:🔒:hcd1b9aa06ff775d3 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/mutex.rs:63:17 (cloud-hypervisor+0x662ff4)
    #2 std::sys_common::mutex::MovableMutex::raw_lock::hff98d0b036469bca /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/mutex.rs:76:18 (cloud-hypervisor+0x4915de)
    #3 std::sync::mutex::Mutex$LT$T$GT$:🔒:h14cfa3c8f5ba878a /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sync/mutex.rs:267:13 (cloud-hypervisor+0x36ca3b)
    #4 vmm::cpu::CpuManager::start_vcpu::h290fdbb4b7124ec5 /home/rob/src/cloud-hypervisor/vmm/src/cpu.rs:710:22 (cloud-hypervisor+0x375023)
    #5 vmm::cpu::CpuManager::activate_vcpus::h2eab380826588391 /home/rob/src/cloud-hypervisor/vmm/src/cpu.rs:902:13 (cloud-hypervisor+0x376b7b)
    #6 vmm::cpu::CpuManager::start_boot_vcpus::hd80cafe6aa4e8279 /home/rob/src/cloud-hypervisor/vmm/src/cpu.rs:937:9 (cloud-hypervisor+0x3773af)
    #7 vmm::vm::Vm:👢:hc2ca6b16f996267b /home/rob/src/cloud-hypervisor/vmm/src/vm.rs:2063:9 (cloud-hypervisor+0x343d57)
    #8 vmm::Vmm::vm_boot::h06bdf54b95d5e14f /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:397:13 (cloud-hypervisor+0x2e5f45)
    #9 vmm::Vmm::control_loop::h40c9b48c7b800bed /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:1299:48 (cloud-hypervisor+0x2f44e0)
    #10 vmm::start_vmm_thread::_$u7b$$u7b$closure$u7d$$u7d$::h016d2f7cff698175 /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:263:17 (cloud-hypervisor+0x1ddae0)
    #11 std::sys_common::backtrace::__rust_begin_short_backtrace::h7fd2df3e7cfba503 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/backtrace.rs:123:18 (cloud-hypervisor+0x65926a)
    #12 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h89880b05fe892d7e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:477:17 (cloud-hypervisor+0x440a3e)
    #13 _$LT$core..panic..unwind_safe..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h487382524d80571f /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/panic/unwind_safe.rs:271:9 (cloud-hypervisor+0x6dd0fe)
    #14 std::panicking::try::do_call::h1d9c2ccdc39f3322 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:406:40 (cloud-hypervisor+0x6d2d71)
    #15 __rust_try 3hkmq3dzyyv5ejsx (cloud-hypervisor+0x6d3db8)
    #16 std::panicking::try::h251306df23d21913 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:370:19 (cloud-hypervisor+0x6d1af9)
    #17 std::panic::catch_unwind::h2a9ac2fb12c3c64e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panic.rs:133:14 (cloud-hypervisor+0x57b75e)
    #18 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::h10f4c340611b55e4 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:476:30 (cloud-hypervisor+0x43f343)
    #19 core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hdd9b37241caf97b3 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/ops/function.rs:227:5 (cloud-hypervisor+0x3cef45)
    #20 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::ha5022a6bb7833f62 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d692)
    #21 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h481697829cbc6746 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d692)
    #22 std::sys::unix:🧵:Thread:🆕:thread_start::h6fad62c4c393bbe7 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/thread.rs:108:17 (cloud-hypervisor+0x119d692)

  Mutex M48 previously acquired by the same thread here:
    #0 pthread_mutex_lock /rustc/llvm/src/llvm-project/compiler-rt/lib/tsan/../sanitizer_common/sanitizer_common_interceptors.inc:4249:3 (cloud-hypervisor+0x9c368)
    #1 std::sys::unix::mutex::Mutex:🔒:hcd1b9aa06ff775d3 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/mutex.rs:63:17 (cloud-hypervisor+0x662ff4)
    #2 std::sys_common::mutex::MovableMutex::raw_lock::hff98d0b036469bca /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/mutex.rs:76:18 (cloud-hypervisor+0x4915de)
    #3 std::sync::mutex::Mutex$LT$T$GT$:🔒:hecf671add5fe1762 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sync/mutex.rs:267:13 (cloud-hypervisor+0x36d1cb)
    #4 vmm::vm::Vm:👢:hc2ca6b16f996267b /home/rob/src/cloud-hypervisor/vmm/src/vm.rs:2063:9 (cloud-hypervisor+0x343cd1)
    #5 vmm::Vmm::vm_boot::h06bdf54b95d5e14f /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:397:13 (cloud-hypervisor+0x2e5f45)
    #6 vmm::Vmm::control_loop::h40c9b48c7b800bed /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:1299:48 (cloud-hypervisor+0x2f44e0)
    #7 vmm::start_vmm_thread::_$u7b$$u7b$closure$u7d$$u7d$::h016d2f7cff698175 /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:263:17 (cloud-hypervisor+0x1ddae0)
    #8 std::sys_common::backtrace::__rust_begin_short_backtrace::h7fd2df3e7cfba503 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/backtrace.rs:123:18 (cloud-hypervisor+0x65926a)
    #9 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h89880b05fe892d7e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:477:17 (cloud-hypervisor+0x440a3e)
    #10 _$LT$core..panic..unwind_safe..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h487382524d80571f /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/panic/unwind_safe.rs:271:9 (cloud-hypervisor+0x6dd0fe)
    #11 std::panicking::try::do_call::h1d9c2ccdc39f3322 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:406:40 (cloud-hypervisor+0x6d2d71)
    #12 __rust_try 3hkmq3dzyyv5ejsx (cloud-hypervisor+0x6d3db8)
    #13 std::panicking::try::h251306df23d21913 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:370:19 (cloud-hypervisor+0x6d1af9)
    #14 std::panic::catch_unwind::h2a9ac2fb12c3c64e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panic.rs:133:14 (cloud-hypervisor+0x57b75e)
    #15 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::h10f4c340611b55e4 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:476:30 (cloud-hypervisor+0x43f343)
    #16 core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hdd9b37241caf97b3 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/ops/function.rs:227:5 (cloud-hypervisor+0x3cef45)
    #17 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::ha5022a6bb7833f62 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d692)
    #18 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h481697829cbc6746 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d692)
    #19 std::sys::unix:🧵:Thread:🆕:thread_start::h6fad62c4c393bbe7 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/thread.rs:108:17 (cloud-hypervisor+0x119d692)

  Mutex M48 acquired here while holding mutex M121 in thread T4:
    #0 pthread_mutex_lock /rustc/llvm/src/llvm-project/compiler-rt/lib/tsan/../sanitizer_common/sanitizer_common_interceptors.inc:4249:3 (cloud-hypervisor+0x9c368)
    #1 std::sys::unix::mutex::Mutex:🔒:h967991d72ceb6eb0 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/mutex.rs:63:17 (cloud-hypervisor+0xd94df4)
    #2 std::sys_common::mutex::MovableMutex::raw_lock::h8779639163126a21 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/mutex.rs:76:18 (cloud-hypervisor+0xd90cce)
    #3 std::sync::mutex::Mutex$LT$T$GT$:🔒:hd85239d207beb12f /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sync/mutex.rs:267:13 (cloud-hypervisor+0xd6e9ba)
    #4 vm_device:🚌:Bus::write::hf20f991e71af3199 /home/rob/src/cloud-hypervisor/vm-device/src/bus.rs:235:16 (cloud-hypervisor+0xd8dd2d)
    #5 _$LT$vmm..vm..VmOps$u20$as$u20$hypervisor..vm..VmmOps$GT$::mmio_write::hc759194aaebc7399 /home/rob/src/cloud-hypervisor/vmm/src/vm.rs:424:15 (cloud-hypervisor+0x32db5f)
    #6 _$LT$hypervisor..kvm..KvmVcpu$u20$as$u20$hypervisor..cpu..Vcpu$GT$::run::h94762dfba6642fb2 /home/rob/src/cloud-hypervisor/hypervisor/src/kvm/mod.rs:1003:32 (cloud-hypervisor+0xcc3ed8)
    #7 vmm::cpu::Vcpu::run::hd5cf042157f95bea /home/rob/src/cloud-hypervisor/vmm/src/cpu.rs:327:9 (cloud-hypervisor+0x370234)
    #8 vmm::cpu::CpuManager::start_vcpu::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h37e4dd8619b3a5e5 /home/rob/src/cloud-hypervisor/vmm/src/cpu.rs:813:35 (cloud-hypervisor+0x47785b)
    #9 std::panicking::try::do_call::h093e4d1434150d77 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:406:40 (cloud-hypervisor+0x6d2aea)
    #10 __rust_try 3hkmq3dzyyv5ejsx (cloud-hypervisor+0x6d3db8)
    #11 std::panicking::try::hee9535cb997282b4 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:370:19 (cloud-hypervisor+0x6d268f)
    #12 std::panic::catch_unwind::he3908c4d08a8a028 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panic.rs:133:14 (cloud-hypervisor+0x57baf9)
    #13 vmm::cpu::CpuManager::start_vcpu::_$u7b$$u7b$closure$u7d$$u7d$::h29472aaa3a600231 /home/rob/src/cloud-hypervisor/vmm/src/cpu.rs:782:21 (cloud-hypervisor+0x477156)
    #14 std::sys_common::backtrace::__rust_begin_short_backtrace::hcfc2f02361c98808 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/backtrace.rs:123:18 (cloud-hypervisor+0x65932b)
    #15 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h08b82db41d7af2f2 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:477:17 (cloud-hypervisor+0x4408ef)
    #16 _$LT$core..panic..unwind_safe..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h7ebad9d94e64fa5f /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/panic/unwind_safe.rs:271:9 (cloud-hypervisor+0x6dd1df)
    #17 std::panicking::try::do_call::h121fafbdf5cf84af /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:406:40 (cloud-hypervisor+0x6d2b81)
    #18 __rust_try 3hkmq3dzyyv5ejsx (cloud-hypervisor+0x6d3db8)
    #19 std::panicking::try::h79e25f019cd90522 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:370:19 (cloud-hypervisor+0x6d1e1f)
    #20 std::panic::catch_unwind::h5a0619a53bbd611d /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panic.rs:133:14 (cloud-hypervisor+0x57b8ff)
    #21 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::h1cfd689c9d362e48 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:476:30 (cloud-hypervisor+0x43fafb)
    #22 core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::h6642b1b3a2289640 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/ops/function.rs:227:5 (cloud-hypervisor+0x3ced55)
    #23 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::ha5022a6bb7833f62 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d692)
    #24 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h481697829cbc6746 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d692)
    #25 std::sys::unix:🧵:Thread:🆕:thread_start::h6fad62c4c393bbe7 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/thread.rs:108:17 (cloud-hypervisor+0x119d692)

  Mutex M121 previously acquired by the same thread here:
    #0 pthread_mutex_lock /rustc/llvm/src/llvm-project/compiler-rt/lib/tsan/../sanitizer_common/sanitizer_common_interceptors.inc:4249:3 (cloud-hypervisor+0x9c368)
    #1 std::sys::unix::mutex::Mutex:🔒:hcd1b9aa06ff775d3 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/mutex.rs:63:17 (cloud-hypervisor+0x662ff4)
    #2 std::sys_common::mutex::MovableMutex::raw_lock::hff98d0b036469bca /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/mutex.rs:76:18 (cloud-hypervisor+0x4915de)
    #3 std::sync::mutex::Mutex$LT$T$GT$:🔒:h14cfa3c8f5ba878a /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sync/mutex.rs:267:13 (cloud-hypervisor+0x36ca3b)
    #4 vmm::cpu::CpuManager::start_vcpu::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h37e4dd8619b3a5e5 /home/rob/src/cloud-hypervisor/vmm/src/cpu.rs:813:35 (cloud-hypervisor+0x4777c9)
    #5 std::panicking::try::do_call::h093e4d1434150d77 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:406:40 (cloud-hypervisor+0x6d2aea)
    #6 __rust_try 3hkmq3dzyyv5ejsx (cloud-hypervisor+0x6d3db8)
    #7 std::panicking::try::hee9535cb997282b4 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:370:19 (cloud-hypervisor+0x6d268f)
    #8 std::panic::catch_unwind::he3908c4d08a8a028 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panic.rs:133:14 (cloud-hypervisor+0x57baf9)
    #9 vmm::cpu::CpuManager::start_vcpu::_$u7b$$u7b$closure$u7d$$u7d$::h29472aaa3a600231 /home/rob/src/cloud-hypervisor/vmm/src/cpu.rs:782:21 (cloud-hypervisor+0x477156)
    #10 std::sys_common::backtrace::__rust_begin_short_backtrace::hcfc2f02361c98808 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/backtrace.rs:123:18 (cloud-hypervisor+0x65932b)
    #11 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h08b82db41d7af2f2 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:477:17 (cloud-hypervisor+0x4408ef)
    #12 _$LT$core..panic..unwind_safe..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h7ebad9d94e64fa5f /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/panic/unwind_safe.rs:271:9 (cloud-hypervisor+0x6dd1df)
    #13 std::panicking::try::do_call::h121fafbdf5cf84af /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:406:40 (cloud-hypervisor+0x6d2b81)
    #14 __rust_try 3hkmq3dzyyv5ejsx (cloud-hypervisor+0x6d3db8)
    #15 std::panicking::try::h79e25f019cd90522 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:370:19 (cloud-hypervisor+0x6d1e1f)
    #16 std::panic::catch_unwind::h5a0619a53bbd611d /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panic.rs:133:14 (cloud-hypervisor+0x57b8ff)
    #17 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::h1cfd689c9d362e48 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:476:30 (cloud-hypervisor+0x43fafb)
    #18 core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::h6642b1b3a2289640 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/ops/function.rs:227:5 (cloud-hypervisor+0x3ced55)
    #19 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::ha5022a6bb7833f62 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d692)
    #20 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h481697829cbc6746 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d692)
    #21 std::sys::unix:🧵:Thread:🆕:thread_start::h6fad62c4c393bbe7 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/thread.rs:108:17 (cloud-hypervisor+0x119d692)

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-06 09:59:36 +01:00
Rob Bradford
2f46647ecc vmm: Fix potential deadlock in CpuManager
Delay creating the mutex on the Vcpu until later preventing a potential
lock inversion between CpuManager and the Vcpu.

WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock) (pid=28799)
  Cycle in lock order graph: M48 (0x7b0c00001aa0) => M117 (0x7b0c00002280) => M48

  Mutex M117 acquired here while holding mutex M48 in thread T1:
    #0 pthread_mutex_lock /rustc/llvm/src/llvm-project/compiler-rt/lib/tsan/../sanitizer_common/sanitizer_common_interceptors.inc:4249:3 (cloud-hypervisor+0x9c368)
    #1 std::sys::unix::mutex::Mutex:🔒:hcd1b9aa06ff775d3 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/mutex.rs:63:17 (cloud-hypervisor+0x662fc4)
    #2 std::sys_common::mutex::MovableMutex::raw_lock::hff98d0b036469bca /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/mutex.rs:76:18 (cloud-hypervisor+0x4915ae)
    #3 std::sync::mutex::Mutex$LT$T$GT$:🔒:h14cfa3c8f5ba878a /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sync/mutex.rs:267:13 (cloud-hypervisor+0x36c82b)
    #4 vmm::cpu::CpuManager::create_vcpu::hd5878da6efae8d68 /home/rob/src/cloud-hypervisor/vmm/src/cpu.rs:665:13 (cloud-hypervisor+0x3743de)
    #5 vmm::cpu::CpuManager::create_vcpus::h3c747553a1d5bc4e /home/rob/src/cloud-hypervisor/vmm/src/cpu.rs:704:13 (cloud-hypervisor+0x374d87)
    #6 vmm::cpu::CpuManager::create_boot_vcpus::he8eeca10785067c1 /home/rob/src/cloud-hypervisor/vmm/src/cpu.rs:938:9 (cloud-hypervisor+0x377305)
    #7 vmm::vm::Vm:👢:hc2ca6b16f996267b /home/rob/src/cloud-hypervisor/vmm/src/vm.rs:1986:9 (cloud-hypervisor+0x3432d3)
    #8 vmm::Vmm::vm_boot::h06bdf54b95d5e14f /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:397:13 (cloud-hypervisor+0x2e5d35)
    #9 vmm::Vmm::control_loop::h40c9b48c7b800bed /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:1299:48 (cloud-hypervisor+0x2f42d0)
    #10 vmm::start_vmm_thread::_$u7b$$u7b$closure$u7d$$u7d$::h016d2f7cff698175 /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:263:17 (cloud-hypervisor+0x1dd8d0)
    #11 std::sys_common::backtrace::__rust_begin_short_backtrace::h7fd2df3e7cfba503 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/backtrace.rs:123:18 (cloud-hypervisor+0x65923a)
    #12 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h89880b05fe892d7e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:477:17 (cloud-hypervisor+0x440a0e)
    #13 _$LT$core..panic..unwind_safe..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h487382524d80571f /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/panic/unwind_safe.rs:271:9 (cloud-hypervisor+0x6dd0ce)
    #14 std::panicking::try::do_call::h1d9c2ccdc39f3322 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:406:40 (cloud-hypervisor+0x6d2d41)
    #15 __rust_try 3hkmq3dzyyv5ejsx (cloud-hypervisor+0x6d3d88)
    #16 std::panicking::try::h251306df23d21913 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:370:19 (cloud-hypervisor+0x6d1ac9)
    #17 std::panic::catch_unwind::h2a9ac2fb12c3c64e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panic.rs:133:14 (cloud-hypervisor+0x57b72e)
    #18 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::h10f4c340611b55e4 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:476:30 (cloud-hypervisor+0x43f313)
    #19 core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hdd9b37241caf97b3 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/ops/function.rs:227:5 (cloud-hypervisor+0x3cef15)
    #20 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::ha5022a6bb7833f62 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d662)
    #21 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h481697829cbc6746 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d662)
    #22 std::sys::unix:🧵:Thread:🆕:thread_start::h6fad62c4c393bbe7 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/thread.rs:108:17 (cloud-hypervisor+0x119d662)

  Mutex M48 previously acquired by the same thread here:
    #0 pthread_mutex_lock /rustc/llvm/src/llvm-project/compiler-rt/lib/tsan/../sanitizer_common/sanitizer_common_interceptors.inc:4249:3 (cloud-hypervisor+0x9c368)
    #1 std::sys::unix::mutex::Mutex:🔒:hcd1b9aa06ff775d3 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/mutex.rs:63:17 (cloud-hypervisor+0x662fc4)
    #2 std::sys_common::mutex::MovableMutex::raw_lock::hff98d0b036469bca /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/mutex.rs:76:18 (cloud-hypervisor+0x4915ae)
    #3 std::sync::mutex::Mutex$LT$T$GT$:🔒:hecf671add5fe1762 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sync/mutex.rs:267:13 (cloud-hypervisor+0x36cfbb)
    #4 vmm::vm::Vm:👢:hc2ca6b16f996267b /home/rob/src/cloud-hypervisor/vmm/src/vm.rs:1986:9 (cloud-hypervisor+0x343237)
    #5 vmm::Vmm::vm_boot::h06bdf54b95d5e14f /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:397:13 (cloud-hypervisor+0x2e5d35)
    #6 vmm::Vmm::control_loop::h40c9b48c7b800bed /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:1299:48 (cloud-hypervisor+0x2f42d0)
    #7 vmm::start_vmm_thread::_$u7b$$u7b$closure$u7d$$u7d$::h016d2f7cff698175 /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:263:17 (cloud-hypervisor+0x1dd8d0)
    #8 std::sys_common::backtrace::__rust_begin_short_backtrace::h7fd2df3e7cfba503 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/backtrace.rs:123:18 (cloud-hypervisor+0x65923a)
    #9 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h89880b05fe892d7e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:477:17 (cloud-hypervisor+0x440a0e)
    #10 _$LT$core..panic..unwind_safe..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h487382524d80571f /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/panic/unwind_safe.rs:271:9 (cloud-hypervisor+0x6dd0ce)
    #11 std::panicking::try::do_call::h1d9c2ccdc39f3322 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:406:40 (cloud-hypervisor+0x6d2d41)
    #12 __rust_try 3hkmq3dzyyv5ejsx (cloud-hypervisor+0x6d3d88)
    #13 std::panicking::try::h251306df23d21913 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:370:19 (cloud-hypervisor+0x6d1ac9)
    #14 std::panic::catch_unwind::h2a9ac2fb12c3c64e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panic.rs:133:14 (cloud-hypervisor+0x57b72e)
    #15 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::h10f4c340611b55e4 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:476:30 (cloud-hypervisor+0x43f313)
    #16 core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hdd9b37241caf97b3 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/ops/function.rs:227:5 (cloud-hypervisor+0x3cef15)
    #17 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::ha5022a6bb7833f62 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d662)
    #18 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h481697829cbc6746 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d662)
    #19 std::sys::unix:🧵:Thread:🆕:thread_start::h6fad62c4c393bbe7 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/thread.rs:108:17 (cloud-hypervisor+0x119d662)

  Mutex M48 acquired here while holding mutex M117 in thread T3:
    #0 pthread_mutex_lock /rustc/llvm/src/llvm-project/compiler-rt/lib/tsan/../sanitizer_common/sanitizer_common_interceptors.inc:4249:3 (cloud-hypervisor+0x9c368)
    #1 std::sys::unix::mutex::Mutex:🔒:h967991d72ceb6eb0 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/mutex.rs:63:17 (cloud-hypervisor+0xd94dc4)
    #2 std::sys_common::mutex::MovableMutex::raw_lock::h8779639163126a21 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/mutex.rs:76:18 (cloud-hypervisor+0xd90c9e)
    #3 std::sync::mutex::Mutex$LT$T$GT$:🔒:hd85239d207beb12f /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sync/mutex.rs:267:13 (cloud-hypervisor+0xd6e98a)
    #4 vm_device:🚌:Bus::write::hf20f991e71af3199 /home/rob/src/cloud-hypervisor/vm-device/src/bus.rs:235:16 (cloud-hypervisor+0xd8dcfd)
    #5 _$LT$vmm..vm..VmOps$u20$as$u20$hypervisor..vm..VmmOps$GT$::mmio_write::hc759194aaebc7399 /home/rob/src/cloud-hypervisor/vmm/src/vm.rs:424:15 (cloud-hypervisor+0x32d94f)
    #6 _$LT$hypervisor..kvm..KvmVcpu$u20$as$u20$hypervisor..cpu..Vcpu$GT$::run::h94762dfba6642fb2 /home/rob/src/cloud-hypervisor/hypervisor/src/kvm/mod.rs:1003:32 (cloud-hypervisor+0xcc3ea8)
    #7 vmm::cpu::Vcpu::run::hd5cf042157f95bea /home/rob/src/cloud-hypervisor/vmm/src/cpu.rs:327:9 (cloud-hypervisor+0x3700c4)
    #8 vmm::cpu::CpuManager::start_vcpu::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h37e4dd8619b3a5e5 /home/rob/src/cloud-hypervisor/vmm/src/cpu.rs:819:35 (cloud-hypervisor+0x47782b)
    #9 std::panicking::try::do_call::h093e4d1434150d77 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:406:40 (cloud-hypervisor+0x6d2aba)
    #10 __rust_try 3hkmq3dzyyv5ejsx (cloud-hypervisor+0x6d3d88)
    #11 std::panicking::try::hee9535cb997282b4 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:370:19 (cloud-hypervisor+0x6d265f)
    #12 std::panic::catch_unwind::he3908c4d08a8a028 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panic.rs:133:14 (cloud-hypervisor+0x57bac9)
    #13 vmm::cpu::CpuManager::start_vcpu::_$u7b$$u7b$closure$u7d$$u7d$::h29472aaa3a600231 /home/rob/src/cloud-hypervisor/vmm/src/cpu.rs:788:21 (cloud-hypervisor+0x477126)
    #14 std::sys_common::backtrace::__rust_begin_short_backtrace::hcfc2f02361c98808 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/backtrace.rs:123:18 (cloud-hypervisor+0x6592fb)
    #15 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h08b82db41d7af2f2 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:477:17 (cloud-hypervisor+0x4408bf)
    #16 _$LT$core..panic..unwind_safe..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h7ebad9d94e64fa5f /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/panic/unwind_safe.rs:271:9 (cloud-hypervisor+0x6dd1af)
    #17 std::panicking::try::do_call::h121fafbdf5cf84af /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:406:40 (cloud-hypervisor+0x6d2b51)
    #18 __rust_try 3hkmq3dzyyv5ejsx (cloud-hypervisor+0x6d3d88)
    #19 std::panicking::try::h79e25f019cd90522 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:370:19 (cloud-hypervisor+0x6d1def)
    #20 std::panic::catch_unwind::h5a0619a53bbd611d /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panic.rs:133:14 (cloud-hypervisor+0x57b8cf)
    #21 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::h1cfd689c9d362e48 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:476:30 (cloud-hypervisor+0x43facb)
    #22 core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::h6642b1b3a2289640 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/ops/function.rs:227:5 (cloud-hypervisor+0x3ced25)
    #23 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::ha5022a6bb7833f62 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d662)
    #24 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h481697829cbc6746 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d662)
    #25 std::sys::unix:🧵:Thread:🆕:thread_start::h6fad62c4c393bbe7 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/thread.rs:108:17 (cloud-hypervisor+0x119d662)

  Mutex M117 previously acquired by the same thread here:
    #0 pthread_mutex_lock /rustc/llvm/src/llvm-project/compiler-rt/lib/tsan/../sanitizer_common/sanitizer_common_interceptors.inc:4249:3 (cloud-hypervisor+0x9c368)
    #1 std::sys::unix::mutex::Mutex:🔒:hcd1b9aa06ff775d3 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/mutex.rs:63:17 (cloud-hypervisor+0x662fc4)
    #2 std::sys_common::mutex::MovableMutex::raw_lock::hff98d0b036469bca /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/mutex.rs:76:18 (cloud-hypervisor+0x4915ae)
    #3 std::sync::mutex::Mutex$LT$T$GT$:🔒:h14cfa3c8f5ba878a /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sync/mutex.rs:267:13 (cloud-hypervisor+0x36c82b)
    #4 vmm::cpu::CpuManager::start_vcpu::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h37e4dd8619b3a5e5 /home/rob/src/cloud-hypervisor/vmm/src/cpu.rs:819:35 (cloud-hypervisor+0x477799)
    #5 std::panicking::try::do_call::h093e4d1434150d77 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:406:40 (cloud-hypervisor+0x6d2aba)
    #6 __rust_try 3hkmq3dzyyv5ejsx (cloud-hypervisor+0x6d3d88)
    #7 std::panicking::try::hee9535cb997282b4 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:370:19 (cloud-hypervisor+0x6d265f)
    #8 std::panic::catch_unwind::he3908c4d08a8a028 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panic.rs:133:14 (cloud-hypervisor+0x57bac9)
    #9 vmm::cpu::CpuManager::start_vcpu::_$u7b$$u7b$closure$u7d$$u7d$::h29472aaa3a600231 /home/rob/src/cloud-hypervisor/vmm/src/cpu.rs:788:21 (cloud-hypervisor+0x477126)
    #10 std::sys_common::backtrace::__rust_begin_short_backtrace::hcfc2f02361c98808 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/backtrace.rs:123:18 (cloud-hypervisor+0x6592fb)
    #11 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h08b82db41d7af2f2 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:477:17 (cloud-hypervisor+0x4408bf)
    #12 _$LT$core..panic..unwind_safe..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h7ebad9d94e64fa5f /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/panic/unwind_safe.rs:271:9 (cloud-hypervisor+0x6dd1af)
    #13 std::panicking::try::do_call::h121fafbdf5cf84af /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:406:40 (cloud-hypervisor+0x6d2b51)
    #14 __rust_try 3hkmq3dzyyv5ejsx (cloud-hypervisor+0x6d3d88)
    #15 std::panicking::try::h79e25f019cd90522 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:370:19 (cloud-hypervisor+0x6d1def)
    #16 std::panic::catch_unwind::h5a0619a53bbd611d /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panic.rs:133:14 (cloud-hypervisor+0x57b8cf)
    #17 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::h1cfd689c9d362e48 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:476:30 (cloud-hypervisor+0x43facb)
    #18 core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::h6642b1b3a2289640 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/ops/function.rs:227:5 (cloud-hypervisor+0x3ced25)
    #19 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::ha5022a6bb7833f62 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d662)
    #20 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h481697829cbc6746 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d662)
    #21 std::sys::unix:🧵:Thread:🆕:thread_start::h6fad62c4c393bbe7 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/thread.rs:108:17 (cloud-hypervisor+0x119d662)

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-06 09:59:36 +01:00
Rob Bradford
9ef1187f4a vmm, pci: Fix potential deadlock in PCI BAR allocation
The allocator is locked by both the BAR allocation code and the
interrupt allocation code. Resulting in a potential lock inversion
error.

WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock) (pid=26318)
  Cycle in lock order graph: M87 (0x7b0c00001e30) => M28 (0x7b0c00001830) => M87

  Mutex M28 acquired here while holding mutex M87 in thread T1:
    #0 pthread_mutex_lock /rustc/llvm/src/llvm-project/compiler-rt/lib/tsan/../sanitizer_common/sanitizer_common_interceptors.inc:4249:3 (cloud-hypervisor+0x9c368)
    #1 std::sys::unix::mutex::Mutex:🔒:hcd1b9aa06ff775d3 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/mutex.rs:63:17 (cloud-hypervisor+0x663954)
    #2 std::sys_common::mutex::MovableMutex::raw_lock::hff98d0b036469bca /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/mutex.rs:76:18 (cloud-hypervisor+0x491bae)
    #3 std::sync::mutex::Mutex$LT$T$GT$:🔒:hc61622e5536f5b72 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sync/mutex.rs:267:13 (cloud-hypervisor+0x36d07b)
    #4 _$LT$vmm..interrupt..MsiInterruptManager$LT$kvm_bindings..x86..bindings..kvm_irq_routing_entry$GT$$u20$as$u20$vm_device..interrupt..InterruptManager$GT$::create_group::hd412b5e1e8eeacc2 /home/rob/src/cloud-hypervisor/vmm/src/interrupt.rs:310:29 (cloud-hypervisor+0x6d1403)
    #5 virtio_devices::transport::pci_device::VirtioPciDevice:🆕:h3af603c3f00f4b3d /home/rob/src/cloud-hypervisor/virtio-devices/src/transport/pci_device.rs:376:38 (cloud-hypervisor+0x8e6137)
    #6 vmm::device_manager::DeviceManager::add_virtio_pci_device::h23608151d7668a1c /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:3333:37 (cloud-hypervisor+0x3b6339)
    #7 vmm::device_manager::DeviceManager::add_pci_devices::h136cc20cbeb6b977 /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:1236:30 (cloud-hypervisor+0x390aad)
    #8 vmm::device_manager::DeviceManager::create_devices::h29fc5b8a20e1aea5 /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:1155:9 (cloud-hypervisor+0x38f48c)
    #9 vmm::vm::Vm:🆕:h43efe7c6cd97ede5 /home/rob/src/cloud-hypervisor/vmm/src/vm.rs:799:9 (cloud-hypervisor+0x334641)
    #10 vmm::Vmm::vm_boot::h06bdf54b95d5e14f /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:379:26 (cloud-hypervisor+0x2e5ba8)
    #11 vmm::Vmm::control_loop::h40c9b48c7b800bed /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:1299:48 (cloud-hypervisor+0x2f44e0)
    #12 vmm::start_vmm_thread::_$u7b$$u7b$closure$u7d$$u7d$::h016d2f7cff698175 /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:263:17 (cloud-hypervisor+0x1dda20)
    #13 std::sys_common::backtrace::__rust_begin_short_backtrace::h7fd2df3e7cfba503 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/backtrace.rs:123:18 (cloud-hypervisor+0x659bca)
    #14 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h89880b05fe892d7e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:477:17 (cloud-hypervisor+0x44100e)
    #15 _$LT$core..panic..unwind_safe..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h487382524d80571f /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/panic/unwind_safe.rs:271:9 (cloud-hypervisor+0x6dda5e)
    #16 std::panicking::try::do_call::h1d9c2ccdc39f3322 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:406:40 (cloud-hypervisor+0x6d36d1)
    #17 __rust_try 3hkmq3dzyyv5ejsx (cloud-hypervisor+0x6d4718)
    #18 std::panicking::try::h251306df23d21913 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:370:19 (cloud-hypervisor+0x6d2459)
    #19 std::panic::catch_unwind::h2a9ac2fb12c3c64e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panic.rs:133:14 (cloud-hypervisor+0x57bdce)
    #20 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::h10f4c340611b55e4 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:476:30 (cloud-hypervisor+0x43f913)
    #21 core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hdd9b37241caf97b3 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/ops/function.rs:227:5 (cloud-hypervisor+0x3cf3f5)
    #22 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::ha5022a6bb7833f62 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d492)
    #23 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h481697829cbc6746 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d492)
    #24 std::sys::unix:🧵:Thread:🆕:thread_start::h6fad62c4c393bbe7 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/thread.rs:108:17 (cloud-hypervisor+0x119d492)

  Mutex M87 previously acquired by the same thread here:
    #0 pthread_mutex_lock /rustc/llvm/src/llvm-project/compiler-rt/lib/tsan/../sanitizer_common/sanitizer_common_interceptors.inc:4249:3 (cloud-hypervisor+0x9c368)
    #1 std::sys::unix::mutex::Mutex:🔒:h9a2d3e97e05c6430 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/mutex.rs:63:17 (cloud-hypervisor+0x9ea344)
    #2 std::sys_common::mutex::MovableMutex::raw_lock::h8abb3b5cf55c0264 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/mutex.rs:76:18 (cloud-hypervisor+0x96face)
    #3 std::sync::mutex::Mutex$LT$T$GT$:🔒:hecec128d40c6dd44 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sync/mutex.rs:267:13 (cloud-hypervisor+0x97120a)
    #4 virtio_devices::transport::pci_device::VirtioPciDevice:🆕:h3af603c3f00f4b3d /home/rob/src/cloud-hypervisor/virtio-devices/src/transport/pci_device.rs:356:29 (cloud-hypervisor+0x8e5c0e)
    #5 vmm::device_manager::DeviceManager::add_virtio_pci_device::h23608151d7668a1c /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:3333:37 (cloud-hypervisor+0x3b6339)
    #6 vmm::device_manager::DeviceManager::add_pci_devices::h136cc20cbeb6b977 /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:1236:30 (cloud-hypervisor+0x390aad)
    #7 vmm::device_manager::DeviceManager::create_devices::h29fc5b8a20e1aea5 /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:1155:9 (cloud-hypervisor+0x38f48c)
    #8 vmm::vm::Vm:🆕:h43efe7c6cd97ede5 /home/rob/src/cloud-hypervisor/vmm/src/vm.rs:799:9 (cloud-hypervisor+0x334641)
    #9 vmm::Vmm::vm_boot::h06bdf54b95d5e14f /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:379:26 (cloud-hypervisor+0x2e5ba8)
    #10 vmm::Vmm::control_loop::h40c9b48c7b800bed /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:1299:48 (cloud-hypervisor+0x2f44e0)
    #11 vmm::start_vmm_thread::_$u7b$$u7b$closure$u7d$$u7d$::h016d2f7cff698175 /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:263:17 (cloud-hypervisor+0x1dda20)
    #12 std::sys_common::backtrace::__rust_begin_short_backtrace::h7fd2df3e7cfba503 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/backtrace.rs:123:18 (cloud-hypervisor+0x659bca)
    #13 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h89880b05fe892d7e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:477:17 (cloud-hypervisor+0x44100e)
    #14 _$LT$core..panic..unwind_safe..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h487382524d80571f /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/panic/unwind_safe.rs:271:9 (cloud-hypervisor+0x6dda5e)
    #15 std::panicking::try::do_call::h1d9c2ccdc39f3322 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:406:40 (cloud-hypervisor+0x6d36d1)
    #16 __rust_try 3hkmq3dzyyv5ejsx (cloud-hypervisor+0x6d4718)
    #17 std::panicking::try::h251306df23d21913 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:370:19 (cloud-hypervisor+0x6d2459)
    #18 std::panic::catch_unwind::h2a9ac2fb12c3c64e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panic.rs:133:14 (cloud-hypervisor+0x57bdce)
    #19 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::h10f4c340611b55e4 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:476:30 (cloud-hypervisor+0x43f913)
    #20 core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hdd9b37241caf97b3 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/ops/function.rs:227:5 (cloud-hypervisor+0x3cf3f5)
    #21 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::ha5022a6bb7833f62 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d492)
    #22 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h481697829cbc6746 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d492)
    #23 std::sys::unix:🧵:Thread:🆕:thread_start::h6fad62c4c393bbe7 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/thread.rs:108:17 (cloud-hypervisor+0x119d492)

  Mutex M87 acquired here while holding mutex M28 in thread T1:
    #0 pthread_mutex_lock /rustc/llvm/src/llvm-project/compiler-rt/lib/tsan/../sanitizer_common/sanitizer_common_interceptors.inc:4249:3 (cloud-hypervisor+0x9c368)
    #1 std::sys::unix::mutex::Mutex:🔒:h9a2d3e97e05c6430 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/mutex.rs:63:17 (cloud-hypervisor+0x9ea344)
    #2 std::sys_common::mutex::MovableMutex::raw_lock::h8abb3b5cf55c0264 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/mutex.rs:76:18 (cloud-hypervisor+0x96face)
    #3 std::sync::mutex::Mutex$LT$T$GT$:🔒:hecec128d40c6dd44 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sync/mutex.rs:267:13 (cloud-hypervisor+0x97120a)
    #4 _$LT$virtio_devices..transport..pci_device..VirtioPciDevice$u20$as$u20$pci..device..PciDevice$GT$::allocate_bars::h39dc42b48fc8264c /home/rob/src/cloud-hypervisor/virtio-devices/src/transport/pci_device.rs:850:22 (cloud-hypervisor+0x8eb1a4)
    #5 vmm::device_manager::DeviceManager::add_pci_device::h561f6c8ed61db117 /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:3087:20 (cloud-hypervisor+0x3b0c62)
    #6 vmm::device_manager::DeviceManager::add_virtio_pci_device::h23608151d7668a1c /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:3359:20 (cloud-hypervisor+0x3b6707)
    #7 vmm::device_manager::DeviceManager::add_pci_devices::h136cc20cbeb6b977 /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:1236:30 (cloud-hypervisor+0x390aad)
    #8 vmm::device_manager::DeviceManager::create_devices::h29fc5b8a20e1aea5 /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:1155:9 (cloud-hypervisor+0x38f48c)
    #9 vmm::vm::Vm:🆕:h43efe7c6cd97ede5 /home/rob/src/cloud-hypervisor/vmm/src/vm.rs:799:9 (cloud-hypervisor+0x334641)
    #10 vmm::Vmm::vm_boot::h06bdf54b95d5e14f /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:379:26 (cloud-hypervisor+0x2e5ba8)
    #11 vmm::Vmm::control_loop::h40c9b48c7b800bed /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:1299:48 (cloud-hypervisor+0x2f44e0)
    #12 vmm::start_vmm_thread::_$u7b$$u7b$closure$u7d$$u7d$::h016d2f7cff698175 /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:263:17 (cloud-hypervisor+0x1dda20)
    #13 std::sys_common::backtrace::__rust_begin_short_backtrace::h7fd2df3e7cfba503 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/backtrace.rs:123:18 (cloud-hypervisor+0x659bca)
    #14 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h89880b05fe892d7e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:477:17 (cloud-hypervisor+0x44100e)
    #15 _$LT$core..panic..unwind_safe..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h487382524d80571f /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/panic/unwind_safe.rs:271:9 (cloud-hypervisor+0x6dda5e)
    #16 std::panicking::try::do_call::h1d9c2ccdc39f3322 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:406:40 (cloud-hypervisor+0x6d36d1)
    #17 __rust_try 3hkmq3dzyyv5ejsx (cloud-hypervisor+0x6d4718)
    #18 std::panicking::try::h251306df23d21913 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:370:19 (cloud-hypervisor+0x6d2459)
    #19 std::panic::catch_unwind::h2a9ac2fb12c3c64e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panic.rs:133:14 (cloud-hypervisor+0x57bdce)
    #20 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::h10f4c340611b55e4 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:476:30 (cloud-hypervisor+0x43f913)
    #21 core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hdd9b37241caf97b3 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/ops/function.rs:227:5 (cloud-hypervisor+0x3cf3f5)
    #22 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::ha5022a6bb7833f62 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d492)
    #23 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h481697829cbc6746 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d492)
    #24 std::sys::unix:🧵:Thread:🆕:thread_start::h6fad62c4c393bbe7 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/thread.rs:108:17 (cloud-hypervisor+0x119d492)

  Mutex M28 previously acquired by the same thread here:
    #0 pthread_mutex_lock /rustc/llvm/src/llvm-project/compiler-rt/lib/tsan/../sanitizer_common/sanitizer_common_interceptors.inc:4249:3 (cloud-hypervisor+0x9c368)
    #1 std::sys::unix::mutex::Mutex:🔒:hcd1b9aa06ff775d3 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/mutex.rs:63:17 (cloud-hypervisor+0x663954)
    #2 std::sys_common::mutex::MovableMutex::raw_lock::hff98d0b036469bca /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/mutex.rs:76:18 (cloud-hypervisor+0x491bae)
    #3 std::sync::mutex::Mutex$LT$T$GT$:🔒:hc61622e5536f5b72 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sync/mutex.rs:267:13 (cloud-hypervisor+0x36d07b)
    #4 vmm::device_manager::DeviceManager::add_pci_device::h561f6c8ed61db117 /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:3091:22 (cloud-hypervisor+0x3b0a95)
    #5 vmm::device_manager::DeviceManager::add_virtio_pci_device::h23608151d7668a1c /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:3359:20 (cloud-hypervisor+0x3b6707)
    #6 vmm::device_manager::DeviceManager::add_pci_devices::h136cc20cbeb6b977 /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:1236:30 (cloud-hypervisor+0x390aad)
    #7 vmm::device_manager::DeviceManager::create_devices::h29fc5b8a20e1aea5 /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:1155:9 (cloud-hypervisor+0x38f48c)
    #8 vmm::vm::Vm:🆕:h43efe7c6cd97ede5 /home/rob/src/cloud-hypervisor/vmm/src/vm.rs:799:9 (cloud-hypervisor+0x334641)
    #9 vmm::Vmm::vm_boot::h06bdf54b95d5e14f /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:379:26 (cloud-hypervisor+0x2e5ba8)
    #10 vmm::Vmm::control_loop::h40c9b48c7b800bed /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:1299:48 (cloud-hypervisor+0x2f44e0)
    #11 vmm::start_vmm_thread::_$u7b$$u7b$closure$u7d$$u7d$::h016d2f7cff698175 /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:263:17 (cloud-hypervisor+0x1dda20)
    #12 std::sys_common::backtrace::__rust_begin_short_backtrace::h7fd2df3e7cfba503 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/backtrace.rs:123:18 (cloud-hypervisor+0x659bca)
    #13 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h89880b05fe892d7e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:477:17 (cloud-hypervisor+0x44100e)
    #14 _$LT$core..panic..unwind_safe..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h487382524d80571f /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/panic/unwind_safe.rs:271:9 (cloud-hypervisor+0x6dda5e)
    #15 std::panicking::try::do_call::h1d9c2ccdc39f3322 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:406:40 (cloud-hypervisor+0x6d36d1)
    #16 __rust_try 3hkmq3dzyyv5ejsx (cloud-hypervisor+0x6d4718)
    #17 std::panicking::try::h251306df23d21913 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:370:19 (cloud-hypervisor+0x6d2459)
    #18 std::panic::catch_unwind::h2a9ac2fb12c3c64e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panic.rs:133:14 (cloud-hypervisor+0x57bdce)
    #19 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::h10f4c340611b55e4 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:476:30 (cloud-hypervisor+0x43f913)
    #20 core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hdd9b37241caf97b3 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/ops/function.rs:227:5 (cloud-hypervisor+0x3cf3f5)
    #21 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::ha5022a6bb7833f62 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d492)
    #22 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h481697829cbc6746 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d492)
    #23 std::sys::unix:🧵:Thread:🆕:thread_start::h6fad62c4c393bbe7 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/thread.rs:108:17 (cloud-hypervisor+0x119d492)

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-06 09:59:36 +01:00
Rob Bradford
e7db354c27 vmm: Fix potential deadlock in CpuManager
The lock on the config should not be held whilst calling into
CpuManager::new().

WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock) (pid=24176)
  Cycle in lock order graph: M13 (0x7b0c000001e0) => M43 (0x7b0c00001a70) => M13

  Mutex M43 acquired here while holding mutex M13 in thread T1:
    #0 pthread_mutex_lock /rustc/llvm/src/llvm-project/compiler-rt/lib/tsan/../sanitizer_common/sanitizer_common_interceptors.inc:4249:3 (cloud-hypervisor+0x9c368)
    #1 std::sys::unix::mutex::Mutex:🔒:hcd1b9aa06ff775d3 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/mutex.rs:63:17 (cloud-hypervisor+0x663984)
    #2 std::sys_common::mutex::MovableMutex::raw_lock::hff98d0b036469bca /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/mutex.rs:76:18 (cloud-hypervisor+0x491bde)
    #3 std::sync::mutex::Mutex$LT$T$GT$:🔒:h8a843a6e74b34c4a /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sync/mutex.rs:267:13 (cloud-hypervisor+0x36cd8b)
    #4 vmm::cpu::CpuManager:🆕:h1cc88224a2a50d87 /home/rob/src/cloud-hypervisor/vmm/src/cpu.rs:575:30 (cloud-hypervisor+0x372e16)
    #5 vmm::vm::Vm::new_from_memory_manager::ha2a4467be260e93c /home/rob/src/cloud-hypervisor/vmm/src/vm.rs:592:27 (cloud-hypervisor+0x330b86)
    #6 vmm::vm::Vm:🆕:h43efe7c6cd97ede5 /home/rob/src/cloud-hypervisor/vmm/src/vm.rs:784:22 (cloud-hypervisor+0x3343df)
    #7 vmm::Vmm::vm_boot::h06bdf54b95d5e14f /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:379:26 (cloud-hypervisor+0x2e5ba8)
    #8 vmm::Vmm::control_loop::h40c9b48c7b800bed /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:1299:48 (cloud-hypervisor+0x2f44e0)
    #9 vmm::start_vmm_thread::_$u7b$$u7b$closure$u7d$$u7d$::h016d2f7cff698175 /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:263:17 (cloud-hypervisor+0x1dda20)
    #10 std::sys_common::backtrace::__rust_begin_short_backtrace::h7fd2df3e7cfba503 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/backtrace.rs:123:18 (cloud-hypervisor+0x659bfa)
    #11 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h89880b05fe892d7e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:477:17 (cloud-hypervisor+0x44103e)
    #12 _$LT$core..panic..unwind_safe..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h487382524d80571f /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/panic/unwind_safe.rs:271:9 (cloud-hypervisor+0x6dda8e)
    #13 std::panicking::try::do_call::h1d9c2ccdc39f3322 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:406:40 (cloud-hypervisor+0x6d3701)
    #14 __rust_try 3hkmq3dzyyv5ejsx (cloud-hypervisor+0x6d4748)
    #15 std::panicking::try::h251306df23d21913 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:370:19 (cloud-hypervisor+0x6d2489)
    #16 std::panic::catch_unwind::h2a9ac2fb12c3c64e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panic.rs:133:14 (cloud-hypervisor+0x57bdfe)
    #17 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::h10f4c340611b55e4 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:476:30 (cloud-hypervisor+0x43f943)
    #18 core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hdd9b37241caf97b3 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/ops/function.rs:227:5 (cloud-hypervisor+0x3cf425)
    #19 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::ha5022a6bb7833f62 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d4c2)
    #20 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h481697829cbc6746 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d4c2)
    #21 std::sys::unix:🧵:Thread:🆕:thread_start::h6fad62c4c393bbe7 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/thread.rs:108:17 (cloud-hypervisor+0x119d4c2)

  Mutex M13 previously acquired by the same thread here:
    #0 pthread_mutex_lock /rustc/llvm/src/llvm-project/compiler-rt/lib/tsan/../sanitizer_common/sanitizer_common_interceptors.inc:4249:3 (cloud-hypervisor+0x9c368)
    #1 std::sys::unix::mutex::Mutex:🔒:hcd1b9aa06ff775d3 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/mutex.rs:63:17 (cloud-hypervisor+0x663984)
    #2 std::sys_common::mutex::MovableMutex::raw_lock::hff98d0b036469bca /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/mutex.rs:76:18 (cloud-hypervisor+0x491bde)
    #3 std::sync::mutex::Mutex$LT$T$GT$:🔒:ha29f58bbf496356a /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sync/mutex.rs:267:13 (cloud-hypervisor+0x36d03b)
    #4 vmm::vm::Vm::new_from_memory_manager::ha2a4467be260e93c /home/rob/src/cloud-hypervisor/vmm/src/vm.rs:593:14 (cloud-hypervisor+0x3307ee)
    #5 vmm::vm::Vm:🆕:h43efe7c6cd97ede5 /home/rob/src/cloud-hypervisor/vmm/src/vm.rs:784:22 (cloud-hypervisor+0x3343df)
    #6 vmm::Vmm::vm_boot::h06bdf54b95d5e14f /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:379:26 (cloud-hypervisor+0x2e5ba8)
    #7 vmm::Vmm::control_loop::h40c9b48c7b800bed /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:1299:48 (cloud-hypervisor+0x2f44e0)
    #8 vmm::start_vmm_thread::_$u7b$$u7b$closure$u7d$$u7d$::h016d2f7cff698175 /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:263:17 (cloud-hypervisor+0x1dda20)
    #9 std::sys_common::backtrace::__rust_begin_short_backtrace::h7fd2df3e7cfba503 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/backtrace.rs:123:18 (cloud-hypervisor+0x659bfa)
    #10 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h89880b05fe892d7e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:477:17 (cloud-hypervisor+0x44103e)
    #11 _$LT$core..panic..unwind_safe..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h487382524d80571f /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/panic/unwind_safe.rs:271:9 (cloud-hypervisor+0x6dda8e)
    #12 std::panicking::try::do_call::h1d9c2ccdc39f3322 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:406:40 (cloud-hypervisor+0x6d3701)
    #13 __rust_try 3hkmq3dzyyv5ejsx (cloud-hypervisor+0x6d4748)
    #14 std::panicking::try::h251306df23d21913 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:370:19 (cloud-hypervisor+0x6d2489)
    #15 std::panic::catch_unwind::h2a9ac2fb12c3c64e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panic.rs:133:14 (cloud-hypervisor+0x57bdfe)
    #16 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::h10f4c340611b55e4 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:476:30 (cloud-hypervisor+0x43f943)
    #17 core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hdd9b37241caf97b3 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/ops/function.rs:227:5 (cloud-hypervisor+0x3cf425)
    #18 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::ha5022a6bb7833f62 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d4c2)
    #19 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h481697829cbc6746 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d4c2)
    #20 std::sys::unix:🧵:Thread:🆕:thread_start::h6fad62c4c393bbe7 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/thread.rs:108:17 (cloud-hypervisor+0x119d4c2)

  Mutex M13 acquired here while holding mutex M43 in thread T1:
    #0 pthread_mutex_lock /rustc/llvm/src/llvm-project/compiler-rt/lib/tsan/../sanitizer_common/sanitizer_common_interceptors.inc:4249:3 (cloud-hypervisor+0x9c368)
    #1 std::sys::unix::mutex::Mutex:🔒:hcd1b9aa06ff775d3 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/mutex.rs:63:17 (cloud-hypervisor+0x663984)
    #2 std::sys_common::mutex::MovableMutex::raw_lock::hff98d0b036469bca /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/mutex.rs:76:18 (cloud-hypervisor+0x491bde)
    #3 std::sync::mutex::Mutex$LT$T$GT$:🔒:ha29f58bbf496356a /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sync/mutex.rs:267:13 (cloud-hypervisor+0x36d03b)
    #4 vmm::device_manager::DeviceManager::add_console_device::h1d2b419feef80564 /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:1839:29 (cloud-hypervisor+0x3972f4)
    #5 vmm::device_manager::DeviceManager::create_devices::h29fc5b8a20e1aea5 /home/rob/src/cloud-hypervisor/vmm/src/device_manager.rs:1143:24 (cloud-hypervisor+0x38f068)
    #6 vmm::vm::Vm:🆕:h43efe7c6cd97ede5 /home/rob/src/cloud-hypervisor/vmm/src/vm.rs:798:9 (cloud-hypervisor+0x334671)
    #7 vmm::Vmm::vm_boot::h06bdf54b95d5e14f /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:379:26 (cloud-hypervisor+0x2e5ba8)
    #8 vmm::Vmm::control_loop::h40c9b48c7b800bed /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:1299:48 (cloud-hypervisor+0x2f44e0)
    #9 vmm::start_vmm_thread::_$u7b$$u7b$closure$u7d$$u7d$::h016d2f7cff698175 /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:263:17 (cloud-hypervisor+0x1dda20)
    #10 std::sys_common::backtrace::__rust_begin_short_backtrace::h7fd2df3e7cfba503 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/backtrace.rs:123:18 (cloud-hypervisor+0x659bfa)
    #11 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h89880b05fe892d7e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:477:17 (cloud-hypervisor+0x44103e)
    #12 _$LT$core..panic..unwind_safe..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h487382524d80571f /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/panic/unwind_safe.rs:271:9 (cloud-hypervisor+0x6dda8e)
    #13 std::panicking::try::do_call::h1d9c2ccdc39f3322 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:406:40 (cloud-hypervisor+0x6d3701)
    #14 __rust_try 3hkmq3dzyyv5ejsx (cloud-hypervisor+0x6d4748)
    #15 std::panicking::try::h251306df23d21913 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:370:19 (cloud-hypervisor+0x6d2489)
    #16 std::panic::catch_unwind::h2a9ac2fb12c3c64e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panic.rs:133:14 (cloud-hypervisor+0x57bdfe)
    #17 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::h10f4c340611b55e4 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:476:30 (cloud-hypervisor+0x43f943)
    #18 core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hdd9b37241caf97b3 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/ops/function.rs:227:5 (cloud-hypervisor+0x3cf425)
    #19 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::ha5022a6bb7833f62 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d4c2)
    #20 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h481697829cbc6746 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d4c2)
    #21 std::sys::unix:🧵:Thread:🆕:thread_start::h6fad62c4c393bbe7 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/thread.rs:108:17 (cloud-hypervisor+0x119d4c2)

  Mutex M43 previously acquired by the same thread here:
    #0 pthread_mutex_lock /rustc/llvm/src/llvm-project/compiler-rt/lib/tsan/../sanitizer_common/sanitizer_common_interceptors.inc:4249:3 (cloud-hypervisor+0x9c368)
    #1 std::sys::unix::mutex::Mutex:🔒:hcd1b9aa06ff775d3 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/mutex.rs:63:17 (cloud-hypervisor+0x663984)
    #2 std::sys_common::mutex::MovableMutex::raw_lock::hff98d0b036469bca /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/mutex.rs:76:18 (cloud-hypervisor+0x491bde)
    #3 std::sync::mutex::Mutex$LT$T$GT$:🔒:h8a843a6e74b34c4a /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sync/mutex.rs:267:13 (cloud-hypervisor+0x36cd8b)
    #4 vmm::vm::Vm:🆕:h43efe7c6cd97ede5 /home/rob/src/cloud-hypervisor/vmm/src/vm.rs:798:9 (cloud-hypervisor+0x334537)
    #5 vmm::Vmm::vm_boot::h06bdf54b95d5e14f /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:379:26 (cloud-hypervisor+0x2e5ba8)
    #6 vmm::Vmm::control_loop::h40c9b48c7b800bed /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:1299:48 (cloud-hypervisor+0x2f44e0)
    #7 vmm::start_vmm_thread::_$u7b$$u7b$closure$u7d$$u7d$::h016d2f7cff698175 /home/rob/src/cloud-hypervisor/vmm/src/lib.rs:263:17 (cloud-hypervisor+0x1dda20)
    #8 std::sys_common::backtrace::__rust_begin_short_backtrace::h7fd2df3e7cfba503 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys_common/backtrace.rs:123:18 (cloud-hypervisor+0x659bfa)
    #9 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h89880b05fe892d7e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:477:17 (cloud-hypervisor+0x44103e)
    #10 _$LT$core..panic..unwind_safe..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h487382524d80571f /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/panic/unwind_safe.rs:271:9 (cloud-hypervisor+0x6dda8e)
    #11 std::panicking::try::do_call::h1d9c2ccdc39f3322 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:406:40 (cloud-hypervisor+0x6d3701)
    #12 __rust_try 3hkmq3dzyyv5ejsx (cloud-hypervisor+0x6d4748)
    #13 std::panicking::try::h251306df23d21913 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panicking.rs:370:19 (cloud-hypervisor+0x6d2489)
    #14 std::panic::catch_unwind::h2a9ac2fb12c3c64e /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/panic.rs:133:14 (cloud-hypervisor+0x57bdfe)
    #15 std:🧵:Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::h10f4c340611b55e4 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/thread/mod.rs:476:30 (cloud-hypervisor+0x43f943)
    #16 core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::hdd9b37241caf97b3 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/core/src/ops/function.rs:227:5 (cloud-hypervisor+0x3cf425)
    #17 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::ha5022a6bb7833f62 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d4c2)
    #18 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h481697829cbc6746 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/alloc/src/boxed.rs:1854:9 (cloud-hypervisor+0x119d4c2)
    #19 std::sys::unix:🧵:Thread:🆕:thread_start::h6fad62c4c393bbe7 /rustc/7d6f948173ccb18822bab13d548c65632db5f0aa/library/std/src/sys/unix/thread.rs:108:17 (cloud-hypervisor+0x119d4c2)

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-06 09:59:36 +01:00
Rob Bradford
e4763b47f1 vmm, build: Remove use of "credibility" from unit tests
This crate was used in the integration tests to allow the tests to
continue and clean up after a failure. This isn't necessary in the unit
tests and adds a large build dependency chain including an unmaintained
crate.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-05 12:35:50 +01:00
Rob Bradford
a749063c8a vmm: Don't assume that resize_pipe is initialised
If the underlying kernel is old PTY resize is disabled and this is
represented by the use of None in the provided Option<File> type. In the
virtio-console PTY path don't blindly unwrap() the value that will be
preserved across a reboot.

Fixes: #3496

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2022-01-04 12:04:50 +00:00
Sebastien Boeuf
4a47cdcebd vmm: tdx: Make sure a TDX enabled binary can be used for non-TDX
It's important to maintain the ability to run in a non-TDX environment
a Cloud Hypervisor binary with the 'tdx' feature enabled.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-12-17 12:52:40 +01:00
Rob Bradford
cbc388c7e2 vmm: Add ioctls to seccomp filter for block topology detection
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-12-17 12:42:10 +01:00
Rob Bradford
bde81405a8 vmm: seccomp: Remove fork & evecve syscalls
These were use for the self spawning vhost-user device feature that has
been removed.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-12-16 20:56:50 +01:00
Rob Bradford
afe386bc13 vmm: Only warn on error when setting up SIGWINCH handler
Setting up the SIGWINCH handler requires at least Linux 5.7. However
this functionality is not required for basic PTY operation.

Fixes: #3456

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-12-14 13:05:09 +01:00
Bo Chen
8fb64859cc vmm: openapi: Add receive/send-migration endpoints
Fixes: #3426

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-12-09 08:49:19 -08:00
Rob Bradford
50f5f43ae3 vmm: acpi: Make MBRD _CRS multi-segment aware
Advertise the PCI MMIO config spaces here so that the MMIO config space
is correctly recognised.

Tested by: --platform num_pci_segments=1 or 16 hotplug NVMe vfio-user device
works correctly with hypervisor-fw & OVMF and direct kernel boot.

Fixes: #3432

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-12-08 14:38:30 +00:00
Rob Bradford
e1c09b66ba vmm: Replace device tree value when restoring DeviceManager
When restoring replace the internal value of the device tree rather than
replacing the Arc<Mutex<DeviceTree>> itself. This is fixes an issue
where the AddressManager has a copy of the the original
Arc<Mutex<DeviceTree>> from when the DeviceManager was created. The
original restore path only replaced the DeviceManager's version of the
Arc<Mutex<DeviceTree>>. Instead replace the contents of the
Arc<Mutex<DeviceTree>> so all users see the updated version.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-12-06 15:58:37 +00:00
Rob Bradford
a29e53e436 vmm: Move KVM clock saving to common Vm::restore() method
Saving the KVM clock and restoring it is key for correct behaviour of
the VM when doing snapshot/restore or live migration. The clock is
restored to the KVM state as part of the Vm::resume() method prior to
that it must be extracted from the state object and stored for later use
by this method. This change simplifies the extraction and storage part
so that it is done in the same way for both snapshot/restore and live
migration.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-12-06 11:23:16 +00:00
Henry Wang
2f8540da70 vmm: Rename PCI_MMIO_CONFIG_SIZE and move it to arch
The constant `PCI_MMIO_CONFIG_SIZE` defined in `vmm/pci_segment.rs`
describes the MMIO configuation size for each PCI segment. However,
this name conflicts with the `PCI_MMCONFIG_SIZE` defined in `layout.rs`
in the `arch` crate, which describes the memory size of the PCI MMIO
configuration region.

Therefore, this commit renames the `PCI_MMIO_CONFIG_SIZE` to
`PCI_MMIO_CONFIG_SIZE_PER_SEGMENT` and moves this constant from `vmm`
crate to `arch` crate.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-12-06 09:29:49 +00:00
Henry Wang
07bef815cc aarch64: Introduce struct PciSpaceInfo for FDT
Currently, a tuple containing PCI space start address and PCI space
size is used to pass the PCI space information to the FDT creator.
In order to support the multiple PCI segment for FDT, more information
such as the PCI segment ID should be passed to the FDT creator. If we
still use a tuple to store these information, the code flexibility and
readablity will be harmed.

To address this issue, this commit replaces the tuple containing the
PCI space information to a structure `PciSpaceInfo` and uses a vector
of `PciSpaceInfo` to store PCI space information for each segment, so
that multiple PCI segment information can be passed to the FDT together.

Note that the scope of this commit will only contain the refactor of
original code, the actual multiple PCI segments support will be in
following series, and for now `--platform num_pci_segments` should only
be 1.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-12-06 09:29:49 +00:00
Sebastien Boeuf
7bb343dce8 vmm: Improve logging related to memory management
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-12-04 23:04:32 +01:00
Sebastien Boeuf
03a606c7ec arch, vmm: Place KVM identity map region after TSS region
In order to avoid the identity map region to conflict with a possible
firmware being placed in the last 4MiB of the 4GiB range, we must set
the address to a chosen location. And it makes the most sense to have
this region placed right after the TSS region.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-12-04 19:33:34 +00:00
Barret Rhoden
e08c747638 vmm: fix HANDLED_SIGNALS build error
The error was:

	borrow the array with `&` or call `.iter()` on it to iterate
	over it

Fixes #3348
Signed-off-by: Barret Rhoden <brho@google.com>
2021-12-04 13:45:02 +01:00
Rob Bradford
348def9dfb arch, hypervisor, vmm: Explicitly place the TSS in the 32-bit space
Place the 3 page TSS at an explicit location in the 32-bit address space
to avoid conflicting with the loaded raw firmware.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-12-03 16:53:56 +01:00
Ziye Yang
b09cbb8493 vmm: Add constant SGX_PAGE_SIZE in memory_manager.rs
Purpose: Do not directly use 0x1000 but use predefined constant value.

Signed-off-by: Ziye Yang <ziye.yang@intel.com>
2021-12-03 10:06:15 +00:00
Michael Zhao
8c88b10384 vmm: Add some missing fields in IORT table
Added fields:
- `Memory address size limit`: the missing of this field triggered
  warnings in guest kernel
- `Node ID`

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-12-01 09:24:01 +08:00
Michael Zhao
b0d245be70 vmm: Add ID mappings in IORT Root Complex Nodes
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-12-01 09:24:01 +08:00
Michael Zhao
fad29fdf1a vmm: Add PCI segment in IORT table
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-12-01 09:24:01 +08:00
Michael Zhao
c9374d87ac vmm: Update devid in kvm_irq_routing_entry
After introducing multiple PCI segments, the `devid` value in
`kvm_irq_routing_entry` exceeds the maximum supported range on AArch64.

This commit restructed the `devid` to the allowed range.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-12-01 09:24:01 +08:00
Rob Bradford
82d06c0efa vmm: Add support for booting raw binary (e.g. firmware) on x86-64
If the provided binary isn't an ELF binary assume that it is a firmware
to be loaded in directly. In this case we shouldn't program any of the
registers as KVM starts in that state.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-30 13:39:36 +01:00
Ziye Yang
61ce4b8f31 vmm: Update comments related with enum Error struct in config.rs
Make the comments style consistent

Signed-off-by: Ziye Yang <ziye.yang@intel.com>
2021-11-26 10:22:57 +01:00
Ziye Yang
896a651b5c vmm: Update some comments and error message info in config.rs
Update some comments and error message info related with TDX.

Signed-off-by: Ziye Yang <ziye.yang@intel.com>
2021-11-24 10:02:00 +01:00
Ziye Yang
51cfffd24f vmm: Make the comments consistent in 'DeviceManager'
Change  "Failed xxing" to "Failed to xx", then
we can only we one style.

Signed-off-by: Ziye Yang <ziye.yang@intel.com>
2021-11-19 08:43:23 +00:00
Bo Chen
2a312cd4fe vmm: Fix a comment typo from 'DeviceManager'
Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-11-18 12:00:39 -08:00
Wei Liu
ff0e92ab88 vmm: add a safety comment for EpollContext
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2021-11-17 23:12:11 +00:00
Wei Liu
9b3cab8c72 device_manager: check return value of dup(2)
That function call can return -1 when it fails. Wrapping -1 into File
causes the code to panic when the File is dropped.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2021-11-17 23:12:11 +00:00
Wei Liu
84630aa0b5 device_manager: provide a few safety comments
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2021-11-17 23:12:11 +00:00
Alyssa Ross
ad8ed80eb1 vmm: use the tty raw mode implementation from libc
I encountered some trouble trying to use a virtio-console hooked up to
a PTY.  Reading from the PTY would produce stuff like this
"\n\nsh-5.1# \n\nsh-5.1# " (where I'm just pressing enter at a shell
prompt), and a terminal would render that like this:

----------------------------------------------------------------

sh-5.1#

       sh-5.1#
----------------------------------------------------------------

This was because we weren't disabling the ICRNL termios iflag, which
turns carriage returns (\r) into line feeds (\n).  Other raw mode
implementations (like QEMU's) set this flag, and don't have this
problem.

Instead of fixing our raw mode implementation to just disable ICRNL,
or copy the flags from QEMU's, though, here I've changed it to use the
raw mode implementation in libc.  It seems to work correctly in my
testing, and means we don't have to worry about what exactly raw mode
looks like under the hood any more.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2021-11-17 14:41:00 +00:00
Rob Bradford
419870ae45 vmm: Add epoll_ctl() syscall to vCPU seccomp filter
Fix seccomp violation when trying to add the out FD to the epoll loop
when the serial buffer needs to be flushed.

0x00007ffff7dc093e in epoll_ctl () at ../sysdeps/unix/syscall-template.S:120
0x0000555555db9b6d in epoll::ctl (epfd=56, op=epoll::ControlOptions::EPOLL_CTL_MOD, fd=55, event=...)
    at /home/rob/.cargo/registry/src/github.com-1ecc6299db9ec823/epoll-4.3.1/src/lib.rs:155
0x00005555556f5127 in vmm::serial_buffer::SerialBuffer::add_out_poll (self=0x7fffe800b5d0) at vmm/src/serial_buffer.rs:101
0x00005555556f583d in vmm::serial_buffer::{impl#1}::write (self=0x7fffe800b5d0, buf=...) at vmm/src/serial_buffer.rs:139
0x0000555555a30b10 in std::io::Write::write_all<vmm::serial_buffer::SerialBuffer> (self=0x7fffe800b5d0, buf=...)
    at /rustc/59eed8a2aac0230a8b53e89d4e99d55912ba6b35/library/std/src/io/mod.rs:1527
0x0000555555ab82fb in devices::legacy::serial::Serial::handle_write (self=0x7fffe800b520, offset=0, v=13) at devices/src/legacy/serial.rs:217
0x0000555555ab897f in devices::legacy::serial::{impl#2}::write (self=0x7fffe800b520, _base=1016, offset=0, data=...) at devices/src/legacy/serial.rs:295
0x0000555555f30e95 in vm_device:🚌:Bus::write (self=0x7fffe8006ce0, addr=1016, data=...) at vm-device/src/bus.rs:235
0x00005555559406d4 in vmm::vm::{impl#4}::pio_write (self=0x7fffe8009640, port=1016, data=...) at vmm/src/vm.rs:459

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-16 07:27:46 -08:00
Rob Bradford
66a2045148 vmm: Fix panic in SIGWINCH listener thread when no seccomp filter set
When running with `--serial pty --console pty --seccomp=false` the
SIGWICH listener thread would panic as the seccomp filter was empty.
Adopt the mechanism used in the rest of the code and check for non-empty
filter before trying to apply it.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-16 14:28:02 +00:00
Sebastien Boeuf
a1f1dfddeb vmm: Fix CpusConfig validation error message
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-11-15 17:27:23 +01:00
Rob Bradford
3480e69ff5 vmm: Cache whether io_uring is supported in DeviceManager
Probing for whether the io_uring is supported is time consuming so cache
this value if it is known to reduce the cost for secondary block devices
that are added.

Before:

cloud-hypervisor: 3.988896ms: <vmm> INFO:vmm/src/device_manager.rs:1901 -- Creating virtio-block device: DiskConfig { path: Some("/home/rob/workloads/focal-server-cloudimg-amd64-custom-20210609-0.raw"), readonly: false, direct: false, iommu: false, num_queues: 1, queue_size: 128, vhost_user: false, vhost_socket: None, poll_queue: true, rate_limiter_config: None, id: Some("_disk0"), disable_io_uring: false, pci_segment: 0 }
cloud-hypervisor: 14.129591ms: <vmm> INFO:vmm/src/device_manager.rs:1983 -- Using asynchronous RAW disk file (io_uring)
cloud-hypervisor: 14.159853ms: <vmm> INFO:vmm/src/device_manager.rs:1901 -- Creating virtio-block device: DiskConfig { path: Some("/tmp/disk"), readonly: false, direct: false, iommu: false, num_queues: 1, queue_size: 128, vhost_user: false, vhost_socket: None, poll_queue: true, rate_limiter_config: None, id: Some("_disk1"), disable_io_uring: false, pci_segment: 0 }
cloud-hypervisor: 22.110281ms: <vmm> INFO:vmm/src/device_manager.rs:1983 -- Using asynchronous RAW disk file (io_uring)

After:

cloud-hypervisor: 4.880411ms: <vmm> INFO:vmm/src/device_manager.rs:1916 -- Creating virtio-block device: DiskConfig { path: Some("/home/rob/workloads/focal-server-cloudimg-amd64-custom-20210609-0.raw"), readonly: false, direct: false, iommu: false, num_queues: 1, queue_size: 128, vhost_user: false, vhost_socket: None, poll_queue: true, rate_limiter_config: None, id: Some("_disk0"), disable_io_uring: false, pci_segment: 0 }
cloud-hypervisor: 14.105123ms: <vmm> INFO:vmm/src/device_manager.rs:1998 -- Using asynchronous RAW disk file (io_uring)
cloud-hypervisor: 14.134837ms: <vmm> INFO:vmm/src/device_manager.rs:1916 -- Creating virtio-block device: DiskConfig { path: Some("/tmp/disk"), readonly: false, direct: false, iommu: false, num_queues: 1, queue_size: 128, vhost_user: false, vhost_socket: None, poll_queue: true, rate_limiter_config: None, id: Some("_disk1"), disable_io_uring: false, pci_segment: 0 }
cloud-hypervisor: 14.221869ms: <vmm> INFO:vmm/src/device_manager.rs:1998 -- Using asynchronous RAW disk file (io_uring)

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-12 18:09:55 +00:00
Sebastien Boeuf
932c8c9713 vmm: Add CPU affinity support
With the introduction of a new option `affinity` to the `cpus`
parameter, Cloud Hypervisor can now let the user choose the set
of host CPUs where to run each vCPU.

This is useful when trying to achieve CPU pinning, as well as making
sure the VM runs on a specific NUMA node.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-11-12 09:40:37 +00:00
Sebastien Boeuf
a4f5ad6076 option_parser: Fix inner bracket support with list of integers
Give the option parser the ability to handle tuples with inner brackets
containing list of integers. The following example can now be handled
correctly "option=[key@[v1-v2,v3,v4]]" which means the option is
assigned a tuple with a key associated with a list of integers between
the range v1 - v2, as well as v3 and v4.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-11-12 09:40:37 +00:00
Sebastien Boeuf
c8e3c1eed6 clippy: Make sure to initialize data
Always properly initialize vectors so that we don't run in undefined
behaviors when the vector gets dropped.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-11-10 10:23:43 +01:00
Sebastien Boeuf
ad521fd4e4 option_parser: Create generic type Tuple
Creates a new generic type Tuple so that the same implementation of
FromStr trait can be reused for both parsing a list of two integers and
parsing a list of one integer associated with a list of integers.

This anticipates the need for retrieving sublists, which will be needed
when trying to describe the host CPU affinity for every vCPU.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-11-09 08:59:59 +01:00
Sebastien Boeuf
b81d758c41 option_parser: Expect commas instead of colons for lists
The elements of a list should be using commas as the correct delimiter
now that it is supported. Deprecate use of colons as delimiter.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-11-09 08:59:59 +01:00
Rob Bradford
751e76db08 vmm: acpi: Use Aml::append_aml_bytes() to generate DSDT
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-08 16:46:30 +00:00
Rob Bradford
d96d98d88e vmm: Port DeviceManager to Aml::append_aml_bytes()
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-08 16:46:30 +00:00
Rob Bradford
185f0c1bf3 vmm: Port MemoryManager to Aml::append_aml_bytes()
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-08 16:46:30 +00:00
Rob Bradford
e04cbb2ad4 vmm: Port PciSegment to Aml::append_aml_bytes()
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-08 16:46:30 +00:00
Rob Bradford
986e43f899 vmm: cpu: Port CpuManager to Aml::append_aml_bytes()
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-08 16:46:30 +00:00
Rob Bradford
d0c3342c97 vmm: acpi: Report time to generate ACPI tables
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-08 16:46:30 +00:00
Rob Bradford
a2e02a8fff vmm: Add SGX section creation logging
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
def98faf37 vmm, vm-allocator: Introduce an allocator for platform devices
This allocator allocates 64-bit MMIO addresses for use with platform
devices e.g. ACPI control devices and ensures there is no overlap with
PCI address space ranges which can cause issues with PCI device
remapping.

Use this allocator the ACPI platform devices.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
9d1a7e43a7 vmm: Refactor MCFG table creation to take just the PCI segments
This matches the lock taking behaviour of other functions in this file.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
afe95e5a2a vmm: Use an allocator specifically for RAM regions
Rather than use the system MMIO allocator for RAM use an allocator that
covers the full RAM range.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
b8fee11822 vmm: Place SGX EPC region between RAM and device area
Increase the start of the device area to accomodate the SGX EPC area.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
e20be3e147 vmm: Check hotplug memory against end of RAM not start of device area
This is because the SGX region will be placed between the end of ram and
the start of the device area.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
ec81f377b6 vmm: Refactor SGX setup to inside MemoryManager::new()
This makes it possible to manually allocate the SGX region after the end
of RAM region.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
438be0dad5 vmm: api: Add pci_segment entries to OpenAPI file
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
1a5a89508b vmm: Remove segment_id from DeviceNode
With the segment id now encoded in the bdf it is not necessary to have
the separate field for it.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
ae83e3b383 vmm: Use PciBdf throughout in order to remove manual bit manipulation
In particular use the accessor for getting the device id from the bdf.
As a side effect the VIOT table is now segment aware.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
a26ce353d3 vmm: Use the PCI segment allocator for pmem and fs cache allocations
Use the MMIO address space allocator associated with the segment that
the devices are on.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
cd9d1cf8fc pci, virtio-devices, vmm: Allocate PCI 64-bit bars per segment
Since each segment must have a non-overlapping memory range associated
with it the device memory must be equally divided amongst all segments.
A new allocator is used for each segment to ensure that BARs are
allocated from the correct address ranges. This requires changes to
PciDevice::allocate/free_bars to take that allocator and when
reallocating BARs the correct allocator must be identified from the
ranges.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
7cfeefde57 vmm: Add validation logic to check user specified pci_segment is valid
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
f71f6da907 vmm: Add pci_segment option to UserDeviceConfig
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
d4f7f42800 vmm: Add pci_segment option to DeviceConfig
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
ca955a47ff vmm: Implement pci_segment options for hotpluggable virtio devices
For all the devices that support being hotplugged (disk, net, pmem, fs
and vsock) add "pci_segment" option and propagate that through to the
addition onto the PCI busses.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
88378d17a2 vmm: Take PCI segment ID into BAR size allocation
Move the decision on whether to use a 64-bit bar up to the DeviceManager
so that it can use both the device type (e.g. block) and the PCI segment
ID to decide what size bar should be used.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
cf1c2bf0e8 vmm: Use the same set of reserved PCI IRQ routes for all segments
Generate a set of 8 IRQs and round-robin distribute those over all the
slots for a bus. This same set of IRQs is then used for all PCI
segments.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
e3d6e222a1 vmm: Add the required number of PCI segments
The platform config may specify a number of PCI segments to use, if this
greater than 1 then we add supplemental PCI segments as well as the
default segment.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
f8d9c073f0 vmm: Add "--platform"
This currently contains only the number over PCI segments to create.
This is limited to 16 at the moment which should allow 496 user specified
PCI devices.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
e3c35a3579 vmm: Allow specifying the PCI segment ID when adding virtio PCI device
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
7a4606f800 vmm: Implement ACPI hotplug/unplug handling for PCI segments
For the bus scanning the GED AML code now calls into a PSCN method that
scans all buses. This approach was chosen since it handles the case
correctly where one GED interrupt is services for two hotplugs on
distinct segments.

The PCIU and PCID field values are now determined by the PSEG field that
is uses to select which segment those values should be used for.
Similarly _EJ0 will notify based on the value of _SEG.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
49f19e061b vmm: Use device's segment when removing a device
The segment ID has been stored in the DeviceTree.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
d33d254921 vmm: Remove hardcoded zero PCI segment id
Replace the hardcoded zero PCI segment id when adding devices to the bus
and extend the DeviceTree to hold the PCI segment id.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
b8b0dab1ae vmm: Add segment_id parameter to DeviceManager::add_pci_device
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
c118d7d7d3 vmm: Only fill in PIO and 32-bit MMIO space on zero segment
Since each segment must have disjoint address spaces only advertise
address space in the 32-bit range and the PIO address space on the
default (zero) segment.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
3059ba4305 vmm: Refactor PCI segment creation to support non-default segment
Split PciSegment::new_default_segment() into a separate
PciSegment::new() and those parts required only for the default segment
(PIO PCI config device.)

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
080ce9b068 vmm: Populate MCFG table with details of all PCI segments
The MCFG table holds the PCI MMIO config details for all the MMIO PCI
config devices.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
c886d71d29 vmm: Add MMIO & PIO config devices for all PCI segments
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
4f5c179b9b vmm: Construct PCI DSDT data from all segments
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
fbb385834a vmm: Use a vector to store multiple segments
For now this still contains just one segment but is expanding in
preparation for more segments.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
b4fc02857f vmm: Advertise PCI MMIO config range for PCI bus
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
b55f009b8a vmm: Calculate MMIO config address based on segment id
This means that each segment can have its own PCI MMIO config device
without overlapping.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
b59f1d90dd vmm: Expose _SEG with segment ID for PCI bus
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
a7fba8105f vmm: Customise PCI device name based on segment id
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
8b67298ad8 vmm: Move PCI bus DSDT data onto PciSegment
This commit moves the code that generates the DSDT data for the PCI bus
into PciSegment making no functional changes to the generated AML.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Wei Liu
23f63262e7 vmm: drop underscore from used variables
Variables that start with underscore are used to silence rustc.
Normally those variables are not used in code.

This patch drops the underscore from variables that are used. This is
less confusing to readers.

No functional change intended.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2021-10-28 13:38:20 +01:00
Bo Chen
455b0d12e9 vmm: Remove VFIO user device from VmConfig upon device unplug
It ensures we won't recreate the unplugged device on reboot.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-10-28 09:42:52 +01:00
Rob Bradford
beb0c0707f vmm: Move logging output for the debug (0x80) port to info!()
This makes it much easier to use since the info!() level produces far
fewer messages and thus has less overhead.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-10-26 16:48:09 +01:00
Sebastien Boeuf
0249e8641a Move Cloud Hypervisor to virtio-queue crate
Relying on the vm-virtio/virtio-queue crate from rust-vmm which has been
copied inside the Cloud Hypervisor tree, the entire codebase is moved to
the new definition of a Queue and other related structures.

The reason for this move is to follow the upstream until we get some
agreement for the patches that we need on top of that to make it
properly work with Cloud Hypervisor.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-10-22 11:38:55 +02:00
Rob Bradford
e9ea9d63f8 vmm: Use assert!() rather than if+panic
As identified by the new beta clippy.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-10-19 19:42:36 +01:00
Rob Bradford
2cccdc5ddd vmm: Naturally align PCI BARs on relocation
When allocating PCI MMIO BARs they should always be naturally aligned
(i.e. aligned to the size of the BAR itself.)

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-10-15 14:54:18 -07:00
Rob Bradford
c25bd447a1 vmm: Ensure that allocate_bars() is called before mmio_regions()
The allocate_bars method has a side effect which collates the BARs used
for the device and stores them internally. Ensure that any use of this
internal state is after the state is created otherwise no MMIO regions
will be seen and so none will be mapped.

Fixes: #3237

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-10-14 10:14:33 -07:00
Sebastien Boeuf
58d8206e2b migration: Use MemoryManager restore code path
Instead of creating a MemoryManager from scratch, let's reuse the same
code path used by snapshot/restore, so that memory regions are created
identically to what they were on the source VM.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-10-06 18:35:49 -07:00
Sebastien Boeuf
1e1e61614c vmm: memory_manager: Leverage new codepath for snapshot/restore
Now that all the pieces are in place, we can restore a VM with the new
codepath that restores properly all memory regions, allowing for ACPI
memory hotplug to work properly with snapshot/restore feature.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-10-06 18:35:49 -07:00
Sebastien Boeuf
6a55768d94 vmm: Create MemoryManager from restore data
Extending the MemoryManager::new() function to be able to create a
MemoryManager from data that have been previously stored instead of
always creating everything from scratch.

This change brings real added value as it allows a VM to be restored
respecting the proper memory layout instead of hoping the regions will
be created the way they were before.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-10-06 18:35:49 -07:00
Sebastien Boeuf
5b177b205b arch, vmm: Extend the data being snapshot
Storing multiple data coming from the MemoryManager in order to be able
to restore without creating everything from scratch.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-10-06 18:35:49 -07:00
Sebastien Boeuf
f440976a7c vmm: memory_manager: Add a way to restore memory regions properly
This new function will be able to restore memory regions and memory
zones based on the GuestMemoryMapping list that will be provided through
snapshot/restore and migration phases.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-10-06 18:35:49 -07:00
Sebastien Boeuf
0d573ae86c vmm: memory_manager: Add file_offset to GuestRamMapping
This will help restoring the region with the correct file offset for the
memory mapping.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-10-06 18:35:49 -07:00
Sebastien Boeuf
01420f5195 vmm: memory_manager: Add virtio_mem to GuestRamMapping
This will help identify if the range belongs to a virtio-mem region or
not.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-10-06 18:35:49 -07:00
Sebastien Boeuf
dfb1829f65 vmm: memory_manager: Add zone_id to GuestRamMapping
This can help identifying which zone relates to which memory range.
This is going to be useful when recreating GuestMemory regions from
the previous layout instead of having to recreate everything from
scratch.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-10-06 18:35:49 -07:00
Sebastien Boeuf
b5d11f72b3 vmm: memory_manager: Factorize allocation of ranges
Create a dedicated function to factorize the allocation of the memory
ranges, and helping with the simplification of MemoryManager::new()
function.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-10-06 18:35:49 -07:00
Sebastien Boeuf
00951f17d4 vmm: memory_manager: Simplify regions creation
By updating the list of GuestMemory regions with the virtio-mem ones
before the creation of the MemoryManager, we know the GuestMemory is up
to date and the allocation of memory ranges is simplified afterwards.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-10-06 18:35:49 -07:00
Sebastien Boeuf
63c6c78c4e vmm: memory_manager: Factorize configuration validation
In order to simplify MemoryManager::new() function. let's move the
memory configuration validation to its own function.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-10-06 18:35:49 -07:00
Rob Bradford
84fc0e093d vmm: Move PciSegment to new file
Move the PciSegment struct and the associated code to a new file. This
will allow some clearer separation between the core DeviceManager and
PCI handling.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-10-05 10:54:07 +01:00
Rob Bradford
0eb78ab177 vmm: Extract PCI related state from DeviceManager
Move the PCI related state from the DeviceManager struct to a PciSegment
struct inside the DeviceManager. This is in preparation for multiple
segment support. Currently this state is just the bus itself, the MMIO
and PIO config devices and hotplug related state.

The main change that this required is using the Arc<Mutex<PciBus>> in
the device addition logic in order to ensure that
the bus could be created earlier.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-10-05 10:54:07 +01:00
Rob Bradford
83066cf58e vmm: Set a default maximum physical address size
When using PVH for booting (which we use for all firmwares and direct
kernel boot) the Linux kernel does not configure LA57 correctly. As such
we need to limit the address space to the maximum 4-level paging address
space.

If the user knows that their guest image can take advantage of the
5-level addressing and they need it for their workload then they can
increase the physical address space appropriately.

This PR removes the TDX specific handling as the new address space limit
is below the one that that code specified.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-10-01 08:59:15 -07:00
Sebastien Boeuf
495e444ca6 vmm: Add ACPI tables to TdVmmData when running TDX
Whenever running TDX, we must pass the ACPI tables to the TDVF firmware
running in the guest. The proper way to do this is by adding the tables
to the TdHob as a TdVmmData type, so that TDVF will know how to access
these tables and expose them to the guest OS.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-30 06:35:55 -07:00
Sebastien Boeuf
b99a3a7dc9 vmm: Factorize ACPI tables creation inside boot() function
Instead of having the ACPI tables being created both in x86_64 and
aarch64 implementations of configure_system(), we can remove the
duplicated code by moving the ACPI tables creation in vm.rs inside the
boot() function.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-30 06:35:55 -07:00
Yu Li
08021087ec vmm: add prefault option in memory and memory-zone
The argument `prefault` is provided in MemoryManager, but it can
only be used by SGX and restore.
With prefault (MAP_POPULATE) been set, subsequent page faults will
decrease during running, although it will make boot slower.

This commit adds `prefault` in MemoryConfig and MemoryZoneConfig.
To resolve conflict between memory and restore, argument
`prefault` has been changed from `bool` to `Option<bool>`, when
its value is None, config from memory will be used, otherwise
argument in Option will be used.

Signed-off-by: Yu Li <liyu.yukiteru@bytedance.com>
2021-09-29 14:17:35 +02:00
Sebastien Boeuf
59031531b6 vmm: Simplify the way memory is snapshot and restored
By using a single file for storing the memory ranges, we simplify the
way snapshot/restore works by avoiding multiples files, but the main and
more important point is that we have now a way to save only the ranges
that matter. In particular, the ranges related to virtio-mem regions are
not always fully hotplugged, meaning we don't want to save the entire
region. That's where the usage of memory ranges is interesting as it
lets us optimize the snapshot/restore process when one or multiple
virtio-mem regions are involved.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-28 10:15:22 -07:00
Sebastien Boeuf
1ea63f50a1 vmm: Move MemoryRangeTable creation to the MemoryManager
The function memory_range_table() will be reused by the MemoryManager in
a following patch to describe all the ranges that we should snapshot.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-28 10:15:22 -07:00
Sebastien Boeuf
86f86c5348 vmm: Optimize migration for virtio-mem
Copy only the memory ranges that have been plugged through virtio-mem,
allowing for an interesting optimization regarding the time it takes to
migrate a large virtio-mem device. Even if the hotpluggable space is
very large (say 64GiB), if only 1GiB has been previously added to the
VM, only 1GiB will be sent to the destination VM, avoiding the transfer
of the remaining 63GiB which are unused.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-28 10:15:22 -07:00
Sebastien Boeuf
e390775bcb vmm, virtio-devices: Move BlocksState creation to the MemoryManager
By creating the BlocksState object in the MemoryManager, we can directly
provide it to the virtio-mem device when being created. This will allow
the MemoryManager through each VirtioMemZone to have a handle onto the
blocks that are plugged at any point in time.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-28 10:15:22 -07:00
Sebastien Boeuf
a1caa6549a vmm: Add page size as a parameter for MemoryRangeTable::from_bitmap()
This will be helpful to support the creation of a MemoryRangeTable from
virtio-mem, as it uses 2M pages.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-28 10:15:22 -07:00
Sebastien Boeuf
d7115ec656 virtio-devices: mem: Add snapshot/restore support
Adding the snapshot/restore support along with migration as well,
allowing a VM with virtio-mem devices attached to be properly
migrated.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-28 10:15:22 -07:00
Sebastien Boeuf
7bbcc0f849 vmm: memory_manager: Make sure the hotplugged_size is up to date
The amount of memory plugged in the virtio-mem region should always be
kept up to date in the hotplugged_size field from VirtioMemZone.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-28 10:15:22 -07:00
Sebastien Boeuf
c4dc7a583d vmm: memory_manager: Simplify the MemoryManager structure
There's no need to duplicate the GuestMemory for snapshot purpose, as we
always have a handle onto the GuestMemory through the guest_memory
field.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-28 10:15:22 -07:00
Sebastien Boeuf
74485924b1 vmm: memory_manager: Simplification to avoid unnecessary locking
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-28 10:15:22 -07:00
Rob Bradford
4889999277 vmm: Only advertise a single PCI bus
Since we only support a single PCI bus right now advertise only a single
bus in the ACPI tables. This reduces the number of VM exits from probing
substantially.

Number of PCI config I/O port exits: 17871 -> 1551 (91% reduction) with
direct kernel boot.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-28 14:10:10 +02:00
Rob Bradford
b50519651c vmm: Simplify slot eject code in PCI ACPI device code
Use a simpler method for extracting the affected slot on the eject
command. Also update the terminology to reflect that this a slot rather
than a bdf (which is what device id refers to elsewhere.)

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-28 12:03:23 +02:00
William Douglas
a8f063db7c vmm: Refactor serial buffer to allow flush on PTY when writable
Refactor the serial buffer handling in order to write the serial
buffer's output to a PTY connected after the serial device stops being
written to by the guest.

This change moves the serial buffer initialization inside the serial
manager. That is done to allow the serial buffer to be made aware of
the PTY and epoll fds needed in order to modify the
EpollDispatch::File trigger. These are then used by the serial buffer
to trigger an epoll event when the PTY fd is writable and the buffer
has content in it. They are also used to remove the trigger when the
buffer is emptied in order to avoid unnecessary wake-ups.

Signed-off-by: William Douglas <william.douglas@intel.com>
2021-09-27 14:18:21 +01:00
Sebastien Boeuf
b910a7922d vmm: Fix migration when writing/reading big chunks of data
Both read_exact_from() and write_all_to() functions from the GuestMemory
trait implementation in vm-memory are buggy. They should retry until
they wrote or read the amount of data that was expected, but instead
they simply return an error when this happens. This causes the migration
to fail when trying to send important amount of data through the
migration socket, due to large memory regions.

This should be eventually fixed in vm-memory, and here is the link to
follow up on the issue: https://github.com/rust-vmm/vm-memory/issues/174

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-27 11:13:56 +02:00
Rob Bradford
1a2d0e6dd8 build: bump linux-loader from 0.3.0 to 0.4.0
Requires manual change to command line loading.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-24 09:11:57 +00:00
Michael Zhao
d72af85c42 vmm: Add "_CCA" field to ACPI DSDT table
"_CCA" is required by DMA configuration on AArch64.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-09-24 07:57:57 +01:00
Rob Bradford
43365ade2e vmm, pci: Implement virtio-mem support for vfio-user
Implement the infrastructure that lets a virtio-mem device map the guest
memory into the device. This is necessary since with virtio-mem zones
memory can be added or removed and the vfio-user device must be
informed.

Fixes: #3025

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-21 15:42:49 +01:00
Rob Bradford
e9d67dc405 vmm: pci: Move creation of vfio_user::Client to DeviceManager
By moving this from the VfioUserPciDevice to DeviceManager the client
can be reused for handling DMA mapping behind an IOMMU.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-21 15:42:49 +01:00
Rob Bradford
fd4f32fa69 virtio-mem: Support multiple mappings
For vfio-user the mapping handler is per device and needs to be removed
when the device in unplugged.

For VFIO the mapping handler is for the default VFIO container (used
when no vIOMMU is used - using a vIOMMU does not require mappings with
virtio-mem)

To represent these two use cases use an enum for the handlers that are
stored.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-21 15:42:49 +01:00
Rob Bradford
0faa7afac2 vmm: Add fast path for PCI config IO port
Looking up devices on the port I/O bus is time consuming during the
boot at there is an O(lg n) tree lookup and the overhead from taking a
lock on the bus contents.

Avoid this by adding a fast path uses the hardcoded port address and
size and directs PCI config requests directly to the device.

Command line:
target/release/cloud-hypervisor --kernel ~/src/linux/vmlinux --cmdline "root=/dev/vda1 console=ttyS0" --serial tty --console off --disk path=~/workloads/focal-server-cloudimg-amd64-custom-20210609-0.raw --api-socket /tmp/api

PIO exit: 17913
PCI fast path: 17871
Percentage on fast path: 99.8%

perf before:

marvin:~/src/cloud-hypervisor (main *)$ perf report -g | grep resolve
     6.20%     6.20%  vcpu0            cloud-hypervisor    [.] vm_device:🚌:Bus::resolve

perf after:

marvin:~/src/cloud-hypervisor (2021-09-17-ioapic-fast-path *)$ perf report -g | grep resolve
     0.08%     0.08%  vcpu0            cloud-hypervisor    [.] vm_device:🚌:Bus::resolve

The compromise required to implement this fast path is bringing the
creation of the PciConfigIo device into the DeviceManager::new() so that
it can be used in the VmmOps struct which is created before
DeviceManager::create_devices() is called.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-17 17:09:45 +01:00
Michael Zhao
b3fa56544c virtio-devices: iommu: Support AArch64
The MSI IOVA address on X86 and AArch64 is different.

This commit refactored the code to receive the MSI IOVA address and size
from device_manager, which provides the actual IOVA space data for both
architectures.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-09-17 12:19:46 +02:00
Michael Zhao
253c06d3ba arch/aarch64: Add virtio-iommu device in FDT
Add a virtio-iommu node into FDT if iommu option is turned on. Now we
support only one virtio-iommu device.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-09-17 12:19:46 +02:00
William Douglas
46f6d9597d vmm: Switch to using the serial_manager for serial input
This change switches from handling serial input in the VMM thread to
its own thread controlled by the SerialManager.

The motivation for this change is to avoid the VMM thread being unable
to process events while serial input is happening and vice versa.

The change also makes future work flushing the serial buffer on PTY
connections easier.

Signed-off-by: William Douglas <william.douglas@intel.com>
2021-09-17 11:15:35 +01:00
William Douglas
7b4f56e372 vmm: Add new serial_manager for serial input handling
This change adds a SerialManager with its own epoll handling that
should be created and run by the DeviceManager when creating an
appropriately configured console (serial tty or pty).

Both stdin and pty input are handled by the SerialManager. The stdin
and pty specific methods used by the VMM should be removed in a future
commit.

Signed-off-by: William Douglas <william.douglas@intel.com>
2021-09-17 11:15:35 +01:00
William Douglas
d6a2f48b32 vmm: device_manager: Make PtyPair implement Clone
The clone method for PtyPair should have been an impl of the Clone
trait but the method ended up not being used. Future work will make
use of the trait however so correct the missing trait implementation.

Signed-off-by: William Douglas <william.douglas@intel.com>
2021-09-17 11:15:35 +01:00
Sebastien Boeuf
a6040d7a30 vmm: Create a single VFIO container
For most use cases, there is no need to create multiple VFIO containers
as it causes unwanted behaviors. Especially when passing multiple
devices from the same IOMMU group, we need to use the same container so
that it can properly list the groups that have been already opened. The
correct logic was already there in vfio-ioctls, but it was incorrectly
used from our VMM implementation.

For the special case where we put a VFIO device behind a vIOMMU, we must
create one container per device, as we need to control the DMA mappings
per device, which is performed at the container level. Because we must
keep one container per device, the vIOMMU use case prevents multiple
devices attached to the same IOMMU group to be passed through the VM.
But this is a limitation that we are fine with, especially since the
vIOMMU doesn't let us group multiple devices in the same group from a
guest perspective.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-15 09:08:13 -07:00
Alyssa Ross
330b5ea3be vmm: notify virtio-console of pty resizes
When a pty is resized (using the TIOCSWINSZ ioctl -- see ioctl_tty(2)),
the kernel will send a SIGWINCH signal to the pty's foreground process
group to notify it of the resize.  This is the only way to be notified
by the kernel of a pty resize.

We can't just make the cloud-hypervisor process's process group the
foreground process group though, because a process can only set the
foreground process group of its controlling terminal, and
cloud-hypervisor's controlling terminal will often be the terminal the
user is running it in.  To work around this, we fork a subprocess in a
new process group, and set its process group to be the foreground
process group of the pty.  The subprocess additionally must be running
in a new session so that it can have a different controlling
terminal.  This subprocess writes a byte to a pipe every time the pty
is resized, and the virtio-console device can listen for this in its
epoll loop.

Alternatives I considered were to have the subprocess just send
SIGWINCH to its parent, and to use an eventfd instead of a pipe.
I decided against the signal approach because re-purposing a signal
that has a very specific meaning (even if this use was only slightly
different to its normal meaning) felt unclean, and because it would
have required using pidfds to avoid race conditions if
cloud-hypervisor had terminated, which added complexity.  I decided
against using an eventfd because using a pipe instead allows the child
to be notified (via poll(2)) when nothing is reading from the pipe any
more, meaning it can be reliably notified of parent death and
terminate itself immediately.

I used clone3(2) instead of fork(2) because without
CLONE_CLEAR_SIGHAND the subprocess would inherit signal-hook's signal
handlers, and there's no other straightforward way to restore all signal
handlers to their defaults in the child process.  The only way to do
it would be to iterate through all possible signals, or maintain a
global list of monitored signals ourselves (vmm:vm::HANDLED_SIGNALS is
insufficient because it doesn't take into account e.g. the SIGSYS
signal handler that catches seccomp violations).

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2021-09-14 15:43:25 +01:00
Alyssa Ross
28382a1491 virtio-devices: determine tty size in console
This prepares us to be able to handle console resizes in the console
device's epoll loop, which we'll have to do if the output is a pty,
since we won't get SIGWINCH from it.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2021-09-14 15:43:25 +01:00
Alyssa Ross
8abe8c679b seccomp: allow mmap everywhere brk is allowed
Musl often uses mmap to allocate memory where Glibc would use brk.
This has caused seccomp violations for me on the API and signal
handling threads.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2021-09-10 12:01:31 -07:00
Rob Bradford
b6b686c71c vmm: Shutdown VMM if API thread panics
See: #3031

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-10 10:52:08 -07:00
Rob Bradford
171d12943d vmm: memory_manager: Increase robustness of MemoryManager control device
See: #1289

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-10 10:23:19 -07:00
Rob Bradford
bdc44cd8bc vmm: cpu: Increase robustness of CpuManager control device
See: #1289

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-10 10:22:05 -07:00
Bo Chen
4f37a273d9 vmm: Fix clippy issue
error: all if blocks contain the same code at the end
   --> vmm/src/memory_manager.rs:884:9
    |
884 | /             Ok(mm)
885 | |         }
    | |_________^

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-09-08 13:31:19 -07:00
Rob Bradford
d64a77a5c6 vmm: Shutdown VMM if signal thread panics
See: #3031

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-08 11:26:48 -07:00
Rob Bradford
e0d05683ab vmm: Split up functions for creating signal handler and tty setup
These are quite separate and should be in their own functions.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-08 11:26:48 -07:00
Rob Bradford
387753ae1d vmm: Remove concept of "input_enabled"
This concept ends up being broken with multiple types on input connected
e.g. console on TTY and serial on PTY. Already the code for checking for
injecting into the serial device checks that the serial is configured.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-08 11:26:48 -07:00
Rob Bradford
951ad3495e vmm: Only resize virtio-console when attached to TTY
Fixes: #3092

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-08 11:26:48 -07:00
Rob Bradford
0dbb2683e3 vmm: Consolidate duplicated code for setting up signal handler
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-08 11:26:48 -07:00
Rob Bradford
687d646c60 virtio-devices, vmm: Shutdown VMM on virtio thread panic
Shutdown the VMM in the virtio (or VMM side of vhost-user) thread
panics.

See: #3031

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-08 09:40:36 +01:00
Rob Bradford
54e523c302 virtio-devices: Use a common method for spawning virtio threads
Introduce a common solution for spawning the virtio threads which will
make it easier to add the panic handling.

During this effort I discovered that there were no seccomp filters
registered for the vhost-user-net thread nor the vhost-user-block
thread. This change also incorporates basic seccomp filters for those as
part of the refactoring.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-08 09:40:36 +01:00
Wei Liu
9c5b404415 vmm: MSHV now supports VFIO-based device passthrough
Drop a few feature gates and adjust code a bit.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2021-09-07 15:17:08 +01:00
Henry Wang
c50051a686 device_manager: Enable power button for ACPI on AArch64
Current AArch64 power button is only for device tree using a PL061
GPIO controller device. Since AArch64 now supports ACPI, this
commit extend the power button on AArch64 to:

- Using GED for ACPI+UEFI boot.
- Using PL061 for device tree boot.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-09-03 10:27:52 -07:00
Rob Bradford
e475b12cf7 virtio-devices, vmm: Upgrade restore related messages to info!()
These happen only sporadically so can be included at the info!() level.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-03 09:30:55 -07:00
Rob Bradford
968902dfec devices, vmm: Upgrade exit reasons to info!() level debugging
These statements are useful for understanding the cause of reset or
shutdown of the VM and are not spammy so should be included at info!()
level.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-03 09:30:55 -07:00
Alyssa Ross
7549149bb5 vmm: ensure signal handlers run on the right thread
Despite setting up a dedicated thread for signal handling, we weren't
making sure that the signals we were listening for there were actually
dispatched to the right thread.  While the signal-hook provides an
iterator API, so we can know that we're only processing the signals
coming out of the iterator on our signal handling thread, the actual
signal handling code from signal-hook, which pushes the signals onto
the iterator, can run on any thread.  This can lead to seccomp
violations when the signal-hook signal handler does something that
isn't allowed on that thread by our seccomp policy.

To reproduce, resize a terminal running cloud-hypervisor continuously
for a few minutes.  Eventually, the kernel will deliver a SIGWINCH to
a thread with a restrictive seccomp policy, and a seccomp violation
will trigger.

As part of this change, it's also necessary to allow rt_sigreturn(2)
on the signal handling thread, so signal handlers are actually allowed
to run on it.  The fact that this didn't seem to be needed before
makes me think that signal handlers were almost _never_ actually
running on the signal handling thread.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2021-09-02 21:33:31 +01:00
Rob Bradford
c2144b5690 vmm, virtio-console: Move input reading into virtio-console thread
Move the processing of the input from stdin, PTY or file from the VMM
thread to the existing virtio-console thread. The handling of the resize
of a virtio-console has not changed but the name of the struct used to
support that has been renamed to reflect its usage.

Fixes: #3060

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-02 21:17:33 +01:00
Henry Wang
0d01eac1d4 vmm: Do the downcast of GicDevice in a safer way for AArch64
Downcasting of GicDevice trait might fail. Therefore we try to
downcast the trait first and only if the downcasting succeeded we
can then use the object to call methods. Otherwise, do nothing and
log the failure.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-09-02 15:18:41 +01:00
Henry Wang
46c60183cd arch, vmm: Implement GIC Pausable trait
This commit implements the GIC (including both GICv3 and GICv3ITS)
Pausable trait. The pause of device manager will trigger a "pause"
of GIC, where we flush GIC pending tables and ITS tables to the
guest RAM.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-09-02 15:18:41 +01:00
Rob Bradford
66f0b5b2b6 vmm: Open the serial PTY in non-blocking mode
This prevents the boot of the guest kernel from being blocked by
blocking I/O on the serial output since the data will be buffered into
the SerialBuffer.

Fixes: #3004

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-02 13:52:18 +01:00
Rob Bradford
d92707afc5 vmm: Introduce a SerialBuffer for buffering serial output
Introduce a dynamic buffer for storing output from the serial port. The
SerialBuffer implements std::io::Write and can be used in place of the
direct output for the serial device.

The internals of the buffer is a vector that grows dynamically based on
demand up to a fixed size at which point old data will be overwritten.
Currently the buffer is only flushed upon writes.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-02 13:52:18 +01:00
Rob Bradford
63637eba31 vmm: Simplify epoll handling for VMM main loop
Remove the indirection of a dispatch table and simply use the enum as
the event data for the events.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-08-31 21:30:11 +01:00
Bo Chen
b82bb55927 vmm: openapi: use the right default values
This patch fixes couple of typos for the default values from the openapi
yaml file.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-08-27 15:58:23 +01:00
Rob Bradford
4d2a4e2805 vmm: Handle epoll events for PTYs separately
Use two separate events for the console and serial PTY and then drive
the handling of the inputs on the PTY separately. This results in the
correct behaviour when both console and serial are attached to the PTY
as they are triggered separately on the epoll so events are not lost.

Fixes: #3012

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-08-25 13:33:32 +01:00
Rob Bradford
6233f6f68e vmm: Send tty input to correct destination
Check the config to find out which device is attached to the tty and
then send the input from the user into that device (serial or
virtio-console.)

Fixes: #3005

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-08-25 10:08:25 +01:00
Fazla Mehrab
5db4dede28 block_util, vhdx: vhdx crate integration with the cloud hypervisor
vhdx_sync.rs in block_util implements traits to represent the vhdx
crate as a supported block device in the cloud hypervisor. The vhdx
is added to the block device list in device_manager.rs at the vmm
crate so that it can automatically detect a vhdx disk and invoke the
corresponding crate.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: Fazla Mehrab <akm.fazla.mehrab@intel.com>
2021-08-19 11:43:19 +02:00
Bo Chen
9aba1fdee6 virtio-devices, vmm: Use syscall definitions from the libc crate
Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-08-18 10:42:19 +02:00
Bo Chen
864a5e4fe0 virtio-devices, vmm: Simplify 'get_seccomp_rules'
Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-08-18 10:42:19 +02:00
Bo Chen
7d38a1848b virtio-devices, vmm: Fix the '--seccomp false' option
We are relying on applying empty 'seccomp' filters to support the
'--seccomp false' option, which will be treated as an error with the
updated 'seccompiler' crate. This patch fixes this issue by explicitly
checking whether the 'seccomp' filter is empty before applying the
filter.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-08-18 10:42:19 +02:00
Bo Chen
08ac3405f5 virtio-devices, vmm: Move to the seccompiler crate
Fixes: #2929

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-08-18 10:42:19 +02:00
Rob Bradford
9d35a10fd4 vmm: cpu: Shutdown VMM on vCPU thread panic
If the vCPU thread panics then catch it and trigger the shutdown of the
VMM.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-08-13 09:19:54 +02:00
Rob Bradford
53b2e19934 vmm: Add support for hotplugging user devices
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-08-12 13:19:04 +01:00
Henry Wang
bcae6c41e3 vmm, doc: Forbid same memory zone in multiple NUMA nodes
It is forbidden that the same memory zone belongs to more than one
NUMA node. This commit adds related validation to the `--numa`
parameter to prevent the user from specifying such configuration.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-08-12 10:49:02 +02:00
Henry Wang
5a0a4bc505 arch: Add optional distance-map node to FDT
The optional device tree node distance-map describes the relative
distance (memory latency) between all NUMA nodes.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-08-12 10:49:02 +02:00
Henry Wang
165364e08b vmm: Move NUMA node data structures to arch
This is to make sure the NUMA node data structures can be accessed
both from the `vmm` crate and `arch` crate.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-08-12 10:49:02 +02:00
Henry Wang
20aa811de7 vmm: Extend NUMA setup to more than ACPI
The AArch64 platform provides a NUMA binding for the device tree,
which means on AArch64 platform, the NUMA setup can be extended to
more than the ACPI feature.

Based on above, this commit extends the NUMA setup and data
structures to following scenarios:

- All AArch64 platform
- x86_64 platform with ACPI feature enabled

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
Signed-off-by: Michael Zhao <Michael.Zhao@arm.com>
2021-08-12 10:49:02 +02:00
Sebastien Boeuf
4918c1ca7f block_util, vmm: Propagate error on QcowDiskSync creation
Instead of panicking with an expect() function, the QcowDiskSync::new
function now propagates the error properly. This ensures the VMM will
not panic, which might be the source of weird errors if only one thread
exits while the VMM continues to run.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-08-11 16:44:28 -07:00
Sebastien Boeuf
4735cb8563 vmm, virtio-devices: Restore vhost-user devices in a dedicated way
We cannot let vhost-user devices connect to the backend when the Block,
Fs or Net object is being created during a restore/migration. The reason
is we can't have two VMs (source and destination) connected to the same
backend at the same time. That's why we must delay the connection with
the vhost-user backend until the restoration is performed.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-08-10 12:36:58 -07:00
Sebastien Boeuf
71c7dff32b vmm: Fix the error handling logic when migration fails
The code wasn't doing what it was expected to. The '?' was simply
returning the error to the top level function, meaning the Err() case in
the match was never hit. Moving the whole logic to a dedicated function
allows to identify when something got wrong without propagating to the
calling function, so that we can still stop the dirty logging and
unpause the VM.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-08-10 12:36:58 -07:00
Sebastien Boeuf
db444715fd vmm: Shutdown VM after migration succeeded
In case the migration succeeds, the destination VM will be correctly
running, with potential vhost-user backends attached to it. We can't let
the source VM trying to reconnect to the same backends, which is why
it's safer to shutdown the source VM.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-08-10 12:36:58 -07:00
Sebastien Boeuf
5a83ebce64 vmm: Notify Migratable objects about migration being complete
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-08-10 12:36:58 -07:00
Sebastien Boeuf
06729bb3ba vmm: Provide a restoring state to the DeviceManager
In anticipation for creating vhost-user devices in a different way when
being restored compared to a fresh start, this commit introduces a new
boolean created by the Vm depending on the use case, and passed down to
the DeviceManager. In the future, the DeviceManager will use this flag
to assess how vhost-user devices should be created.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-08-10 12:36:58 -07:00
Rob Bradford
5e74848ab4 vmm: seccomp: Permit syscalls used for vfio-user on vCPU thread
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-08-10 16:01:00 +01:00
Rob Bradford
3efccd0fef vmm: config: Ensure shared memory is enabled if using user-devices
Correct operation of user devices (vfio-user) requires shared memory so
flag this to prevent it from failing in strange ways.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-08-10 16:01:00 +01:00
Rob Bradford
b28063a7b4 vmm: Create user devices from config
Create the vfio-user / user devices from the config. Currently hotplug
of the devices is not supported nor can they be placed behind the
(virt-)iommu.

Removal of the coldplugged device is however supported.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-08-10 16:01:00 +01:00
Rob Bradford
7fbec7113e main, config: Add support for --user-device
This allows the user to specify devices that are running in a different
userspace process and communicated with vfio-user.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-08-10 16:01:00 +01:00
Markus Theil
5b0d4bb398 virtio-devices: seccomp: allow unix socket connect in vsock thread
Allow vsocks to connect to Unix sockets on the host running
cloud-hypervisor with enabled seccomp.

Reported-by: Philippe Schaaf <philippe.schaaf@secunet.com>
Tested-by: Franz Girlich <franz.girlich@tu-ilmenau.de>
Signed-off-by: Markus Theil <markus.theil@tu-ilmenau.de>
2021-08-06 08:44:47 -07:00
Henry Wang
27a285257e vmm: cpu: Add PPTT table for AArch64
The optional Processor Properties Topology Table (PPTT) table is
used to describe the topological structure of processors controlled
by the OSPM, and their shared resources, such as caches. The table
can also describe additional information such as which nodes in the
processor topology constitute a physical package.

The ACPI PPTT table supports topology descriptions for ACPI guests.
Therefore, this commit adds the PPTT table for AArch64 to enable
CPU topology feature for ACPI.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-08-05 21:19:16 +08:00
Henry Wang
7fb980f17b arch, vmm: Pass cpu topology configuation to FDT
In an Arm system, the hierarchy of CPUs is defined through three
entities that are used to describe the layout of physical CPUs in
the system:

- cluster
- core
- thread

All these three entities have their own FDT node field. Therefore,
This commit adds an AArch64-specific helper to pass the config from
the Cloud Hypervisor command line to the `configure_system`, where
eventually the `create_fdt` is called.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-08-05 21:19:16 +08:00
Sebastien Boeuf
5c6139bbff vmm: Finalize migration support for all devices
Make sure the DeviceManager is triggered for all migration operations.
The dirty pages are merged from MemoryManager and DeviceManager before
to be sent up to the Vmm in lib.rs.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-08-05 06:07:00 -07:00
Sebastien Boeuf
0411064271 vmm: Refactor migration through Migratable trait
Now that Migratable provides the methods for starting, stopping and
retrieving the dirty pages, we move the existing code to these new
functions.

No functional change intended.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-08-05 06:07:00 -07:00
Sebastien Boeuf
e9637d3733 vmm: device_manager: Fully implement Migratable trait
This patch connects the dots between the vm.rs code and each Migratable
device, in order to make sure Migratable methods are correctly invoked
when migration happens.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-08-05 06:07:00 -07:00
Sebastien Boeuf
79425b6aa8 vm-migration, vmm: Extend methods for MemoryRangeTable
In anticipation for supporting the merge of multiple dirty pages coming
from multiple devices, this patch factorizes the creation of a
MemoryRangeTable from a bitmap, as well as providing a simple method for
merging the dirty pages regions under a single MemoryRangeTable.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-08-05 06:07:00 -07:00
Muminul Islam
83c44a2411 vmm, virtio-devices: Add missing seccomp rules for MSHV
This patch adds all the seccomp rules missing for MSHV.
With this patch MSFT internal CI runs with seccomp enabled.

Signed-off-by: Muminul Islam <muislam@microsoft.com>
2021-08-03 11:09:07 -07:00
Bo Chen
902fe20d41 vmm: Add fallback handling for sending live migration
This patch adds a fallback path for sending live migration, where it
ensures the following behavior of source VM post live-migration:

1. The source VM will be paused only when the migration is completed
successfully, or otherwise it will keep running;

2. The source VM will always stop dirty pages logging.

Fixes: #2895

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-08-03 09:26:12 +01:00
Muminul Islam
3baa0c3721 vmm: Add MSHV_VP_TRANSLATE_GVA to seccomp rule
This rule is needed to boot windows guest.
This bug was introduced while we tried to boot
windows guest on MSHV.

Signed-off-by: Muminul Islam <muislam@microsoft.com>
2021-07-29 16:29:53 +01:00
Muminul Islam
81895b9b40 hypervisor: Implement start/stop_dirty_log for MSHV
This patch modify the existing live migration code
to support MSHV. Adds couple of new functions to enable
and disable dirty page tracking. Add missing IOCTL
to the seccomp rules for live migration.
Adds necessary flags for MSHV.
This changes don't affect KVM functionality at all.

In order to get better performance it is good to
enable dirty page tracking when we start live migration
and disable it when the migration is done.

Signed-off-by: Muminul Islam <muislam@microsoft.com>
2021-07-29 16:29:53 +01:00
Muminul Islam
fdecba6958 hypervisor: MSHV needs gpa to retrieve dirty logs
Right now, get_dirty_log API has two parameters,
slot and memory_size.
MSHV needs gpa to retrieve the page states. GPA is
needed as MSHV returns the state base on PFN.

Signed-off-by: Muminul Islam <muislam@microsoft.com>
2021-07-29 16:29:53 +01:00
Sebastien Boeuf
12db6e5068 vmm: Allow restoring virtio-fs with no cache region
It's totally acceptable to snapshot and restore a virtio-fs device that
has no cache region, since this is a valid mode of functioning for
virtio-fs itself.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-07-29 06:35:03 -07:00
Sebastien Boeuf
dcc646f5b1 clippy: Fix redundant allocations
With the new beta version, clippy complains about redundant allocation
when using Arc<Box<dyn T>>, and suggests replacing it simply with
Arc<dyn T>.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-07-29 13:28:57 +02:00
Bo Chen
b00a6a8519 vmm: Create guest memory regions with explicit dirty-pages-log flags
As we are now using an global control to start/stop dirty pages log from
the `hypervisor` crate, we need to explicitly tell the hypervisor (KVM)
whether a region needs dirty page tracking when it is created.

This reverts commit f063346de3.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-07-28 09:08:32 -07:00
Bo Chen
e7c9954dc1 hypervisor, vmm: Abstract the interfaces to start/stop dirty log
Following KVM interfaces, the `hypervisor` crate now provides interfaces
to start/stop the dirty pages logging on a per region basis, and asks
its users (e.g. the `vmm` crate) to iterate over the regions that needs
dirty pages log. MSHV only has a global control to start/stop dirty
pages log on all regions at once.

This patch refactors related APIs from the `hypervisor` crate to provide
a global control to start/stop dirty pages log (following MSHV's
behaviors), and keeps tracking the regions need dirty pages log for
KVM. It avoids leaking hypervisor-specific behaviors out of the
`hypervisor` crate.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-07-28 09:08:32 -07:00
Bo Chen
ca09638491 vmm: Add CPUID compatibility check for snapshot/restore
Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-07-28 09:26:02 +02:00
Bo Chen
0835198ddd vmm: Factorize CPUID check for live-migration and snapshot/restore
This patch adds a common function "Vmm::vm_check_cpuid_compatibility()"
to be shared by both live-migration and snapshot/restore.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-07-28 09:26:02 +02:00
Bo Chen
6d9c1eb638 arch, vmm: Add CPUID check to the 'Config' step of live migration
We now send not only the 'VmConfig' at the 'Command::Config' step of
live migration, but also send the 'common CPUID'. In this way, we can
check the compatibility of CPUID features between the source and
destination VMs, and abort live migration early if needed.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-07-28 09:26:02 +02:00
Bo Chen
f063346de3 vmm: Create guest memory regions without dirty-pages-log by default
With the support of dynamically turning on/off dirty-pages-log during
live-migration (only for guest RAM regions), we now can create guest
memory regions without dirty-pages-log by default both for guest RAM
regions and other regions backed by file/device.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-07-26 09:19:35 -07:00
Bo Chen
5e0d498582 hypervisor, vmm: Add dynamic control of logging dirty pages
This patch extends slightly the current live-migration code path with
the ability to dynamically start and stop logging dirty-pages, which
relies on two new methods added to the `hypervisor::vm::Vm` Trait. This
patch also contains a complete implementation of the two new methods
based on `kvm` and placeholders for `mshv` in the `hypervisor` crate.

Fixes: #2858

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-07-26 09:19:35 -07:00
Sebastien Boeuf
0ac4545c5b vmm: Extend seccomp filters with fcntl() for HTTP thread
Whenever a file descriptor is sent through the control message, it
requires fcntl() syscall to handle it, meaning we must allow it through
the list of syscalls authorized for the HTTP thread.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-07-21 15:34:22 +02:00
Sebastien Boeuf
3e482c9c74 vmm: Limit physical address space for TDX
When running TDX guest, the Guest Physical Address space is limited by
a shared bit that is located on bit 47 for 4 level paging, and on bit 51
for 5 level paging (when GPAW bit is 1). In order to keep things simple,
and since a 47 bits address space is 128TiB large, we ensure to limit
the physical addressable space to 47 bits when runnning TDX.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-07-20 15:00:04 +02:00
Sebastien Boeuf
05f7651cf5 vmm: Force VIRTIO_F_IOMMU_PLATFORM when running TDX
When running a TDX guest, we need the virtio drivers to use the DMA API
to share specific memory pages with the VMM on the host. The point is to
let the VMM get access to the pages related to the buffers pointed by
the virtqueues.

The way to force the virtio drivers to use the DMA API is by exposing
the virtio devices with the feature VIRTIO_F_IOMMU_PLATFORM. This is a
feature indicating the device will require some address translation, as
it will not deal directly with physical addresses.

Cloud Hypervisor takes care of this requirement by adding a generic
parameter called "force_iommu". This parameter value is decided based on
the "tdx" feature gate, and then passed to the DeviceManager. It's up to
the DeviceManager to use this parameter on every virtio device creation,
which will imply setting the VIRTIO_F_IOMMU_PLATFORM feature.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-07-20 14:47:01 +02:00
Bo Chen
569be6e706 arch, vmm: Move "generate_common_cpuid" from "CpuManager" to "arch"
This refactoring ensures all CPUID related operations are centralized in
`arch::x86_64` module, and exposes only two related public functions to
the vmm crate, e.g. `generate_common_cpuid` and `configure_vcpu`.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-07-19 09:59:34 -07:00
Sebastien Boeuf
d4316d0228 vmm: http: Allow file descriptor to be sent with add-net
In order to let a separate process open a TAP device and pass the file
descriptor through the control message mechanism, this patch adds the
support for sending a file descriptor over to the Cloud Hypervisor
process along with the add-net HTTP API command.

The implementation uses the NetConfig structure mutably to update the
list of fds with the one passed through control message. The list should
always be empty prior to this, as it makes no sense to provide a list of
fds once the Cloud Hypervisor process has already been started.

It is important to note that reboot is supported since the file
descriptor is duplicated upon receival, letting the VM only use the
duplicated one. The original file descriptor is kept open in order to
support a potential reboot.

Fixes #2525

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-07-19 09:51:32 +02:00
Sebastien Boeuf
d68c388cac vmm: Update seccomp filters for HTTP thread
The micro-http crate now uses recvmsg() syscall in order to receive file
descriptors through control messages. This means the syscall must be
part of the authorized list in the seccomp filters.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-07-15 08:13:48 +00:00
Sebastien Boeuf
6b710209b1 numa: Add optional sgx_epc_sections field to NumaConfig
This new option allows the user to define a list of SGX EPC sections
attached to a specific NUMA node.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-07-09 14:45:30 +02:00
Sebastien Boeuf
9aedabe11e sgx: Add mandatory id field to SgxEpcConfig
In order to uniquely identify each SGX EPC section, we introduce a
mandatory option `id` to the `--sgx-epc` parameter.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-07-09 14:45:30 +02:00
Sebastien Boeuf
17c99ae00a vmm: Enable provisioning for SGX guest
The guest can see that SGX supports provisioning as it is exposed
through the CPUID. This patch enables the proper backing of this
feature by having the host open the provisioning device and enable
this capability through the hypervisor.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-07-07 14:56:38 +02:00
Sebastien Boeuf
5b6d424a77 arch, vmm: Fix TDVF section handling
This patch fixes a few things to support TDVF correctly.

The HOB memory resources must contain EFI_RESOURCE_ATTRIBUTE_ENCRYPTED
attribute.

Any section with a base address within the already allocated guest RAM
must not be allocated.

The list of TD_HOB memory resources should contain both TempMem and
TdHob sections as well.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-07-06 11:47:43 +02:00
Henry Wang
4da3bdcd6e vmm: Split restore device_manager and devices
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-07-05 22:51:56 +02:00
Henry Wang
95ca4fb15e vmm: vm: Enable snapshot/restore of GICv3ITS
This commit enables the snapshot/restore of GICv3ITS in the process
of VM snapshot/restore.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-07-05 22:51:56 +02:00
Wei Liu
1f2915bff0 vmm: hypervisor: split set_user_memory_region to two functions
Previously the same function was used to both create and remove regions.
This worked on KVM because it uses size 0 to indicate removal.

MSHV has two calls -- one for creation and one for removal. It also
requires having the size field available because it is not slot based.

Split set_user_memory_region to {create/remove}_user_memory_region. For
KVM they still use set_user_memory_region underneath, but for MSHV they
map to different functions.

This fixes user memory region removal on MSHV.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2021-07-05 09:45:45 +02:00
Wei Liu
71bbaf556f vmm: seccomp: add seccomp rules for MSHV
Add a minimum set of rules that allow Cloud Hypervisor to run Linux on
top of Microsoft Hypervisor.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2021-07-05 09:44:02 +02:00
Wei Liu
8819bb0f21 vmm: seccomp: make use of KVM feature
The to-be-introduced MSHV rules don't need to contain KVM rules and vice
versa.

Put KVM constants into to a module. This avoids the warnings about
dead code in the future.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2021-07-05 09:44:02 +02:00
Henry Wang
054c036e81 vmm: acpi: Add AArch64 vCPUs to SRAT table
This commit introduces the `ProcessorGiccAffinity` struct for the
AArch64 platform. This struct will be created and included into
the SRAT table to enable AArch64 NUMA setup.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-06-25 10:22:40 +01:00
Michael Zhao
239e39ddbc vmm: Fix clippy warnings on AArch64
Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-06-24 08:59:53 -07:00
Bo Chen
5768dcc320 vmm: Refactor slightly vm_boot and 'control_loop'
It ensures all handlers for `ApiRequest` in `control_loop` are
consistent and minimum and should read better.

No functional changes.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-06-24 16:01:39 +02:00
Bo Chen
1075209e2a vmm: Handle ApiRequest::VmCreate in a separate function
It simplifies a bit the `Vmm::control_loop` and reads better to be
consistent with other `ApiRequest` handlers. Also, it removes the
repetitive `ApiError::VmAlreadyCreated` and makes `ApiError::VmCreate`
useful.

No functional changes.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-06-24 16:01:39 +02:00
Michael Zhao
3613b4c096 aarch64: Enable default build option
We have been building Cloud Hypervisor with command like:
`cargo build --no-default-features --features ...`.

After implementing ACPI, we donot have to use specify all features
explicitly. Default build command `cargo build` can work.

This commit fixed some build warnings with default build option and
changed github workflow correspondingly.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-06-24 13:13:27 +01:00