Commit Graph

1705 Commits

Author SHA1 Message Date
dependabot[bot]
592b8bcaaa build: bump arc-swap from 1.4.0 to 1.5.0
Bumps [arc-swap](https://github.com/vorner/arc-swap) from 1.4.0 to 1.5.0.
- [Release notes](https://github.com/vorner/arc-swap/releases)
- [Changelog](https://github.com/vorner/arc-swap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/vorner/arc-swap/compare/v1.4.0...v1.5.0)

---
updated-dependencies:
- dependency-name: arc-swap
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-11-15 23:30:55 +00:00
Sebastien Boeuf
a1f1dfddeb vmm: Fix CpusConfig validation error message
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-11-15 17:27:23 +01:00
Rob Bradford
3480e69ff5 vmm: Cache whether io_uring is supported in DeviceManager
Probing for whether the io_uring is supported is time consuming so cache
this value if it is known to reduce the cost for secondary block devices
that are added.

Before:

cloud-hypervisor: 3.988896ms: <vmm> INFO:vmm/src/device_manager.rs:1901 -- Creating virtio-block device: DiskConfig { path: Some("/home/rob/workloads/focal-server-cloudimg-amd64-custom-20210609-0.raw"), readonly: false, direct: false, iommu: false, num_queues: 1, queue_size: 128, vhost_user: false, vhost_socket: None, poll_queue: true, rate_limiter_config: None, id: Some("_disk0"), disable_io_uring: false, pci_segment: 0 }
cloud-hypervisor: 14.129591ms: <vmm> INFO:vmm/src/device_manager.rs:1983 -- Using asynchronous RAW disk file (io_uring)
cloud-hypervisor: 14.159853ms: <vmm> INFO:vmm/src/device_manager.rs:1901 -- Creating virtio-block device: DiskConfig { path: Some("/tmp/disk"), readonly: false, direct: false, iommu: false, num_queues: 1, queue_size: 128, vhost_user: false, vhost_socket: None, poll_queue: true, rate_limiter_config: None, id: Some("_disk1"), disable_io_uring: false, pci_segment: 0 }
cloud-hypervisor: 22.110281ms: <vmm> INFO:vmm/src/device_manager.rs:1983 -- Using asynchronous RAW disk file (io_uring)

After:

cloud-hypervisor: 4.880411ms: <vmm> INFO:vmm/src/device_manager.rs:1916 -- Creating virtio-block device: DiskConfig { path: Some("/home/rob/workloads/focal-server-cloudimg-amd64-custom-20210609-0.raw"), readonly: false, direct: false, iommu: false, num_queues: 1, queue_size: 128, vhost_user: false, vhost_socket: None, poll_queue: true, rate_limiter_config: None, id: Some("_disk0"), disable_io_uring: false, pci_segment: 0 }
cloud-hypervisor: 14.105123ms: <vmm> INFO:vmm/src/device_manager.rs:1998 -- Using asynchronous RAW disk file (io_uring)
cloud-hypervisor: 14.134837ms: <vmm> INFO:vmm/src/device_manager.rs:1916 -- Creating virtio-block device: DiskConfig { path: Some("/tmp/disk"), readonly: false, direct: false, iommu: false, num_queues: 1, queue_size: 128, vhost_user: false, vhost_socket: None, poll_queue: true, rate_limiter_config: None, id: Some("_disk1"), disable_io_uring: false, pci_segment: 0 }
cloud-hypervisor: 14.221869ms: <vmm> INFO:vmm/src/device_manager.rs:1998 -- Using asynchronous RAW disk file (io_uring)

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-12 18:09:55 +00:00
Sebastien Boeuf
932c8c9713 vmm: Add CPU affinity support
With the introduction of a new option `affinity` to the `cpus`
parameter, Cloud Hypervisor can now let the user choose the set
of host CPUs where to run each vCPU.

This is useful when trying to achieve CPU pinning, as well as making
sure the VM runs on a specific NUMA node.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-11-12 09:40:37 +00:00
Sebastien Boeuf
a4f5ad6076 option_parser: Fix inner bracket support with list of integers
Give the option parser the ability to handle tuples with inner brackets
containing list of integers. The following example can now be handled
correctly "option=[key@[v1-v2,v3,v4]]" which means the option is
assigned a tuple with a key associated with a list of integers between
the range v1 - v2, as well as v3 and v4.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-11-12 09:40:37 +00:00
Sebastien Boeuf
611e71826d deps: Downgrade anyhow
Because anyhow version 1.0.46 has been yanked, let's move back to the
previous version 1.0.45.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-11-10 09:43:12 +00:00
Sebastien Boeuf
c8e3c1eed6 clippy: Make sure to initialize data
Always properly initialize vectors so that we don't run in undefined
behaviors when the vector gets dropped.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-11-10 10:23:43 +01:00
Sebastien Boeuf
ad521fd4e4 option_parser: Create generic type Tuple
Creates a new generic type Tuple so that the same implementation of
FromStr trait can be reused for both parsing a list of two integers and
parsing a list of one integer associated with a list of integers.

This anticipates the need for retrieving sublists, which will be needed
when trying to describe the host CPU affinity for every vCPU.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-11-09 08:59:59 +01:00
Sebastien Boeuf
b81d758c41 option_parser: Expect commas instead of colons for lists
The elements of a list should be using commas as the correct delimiter
now that it is supported. Deprecate use of colons as delimiter.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-11-09 08:59:59 +01:00
dependabot[bot]
a362e85539 build: bump anyhow from 1.0.45 to 1.0.46
Bumps [anyhow](https://github.com/dtolnay/anyhow) from 1.0.45 to 1.0.46.
- [Release notes](https://github.com/dtolnay/anyhow/releases)
- [Commits](https://github.com/dtolnay/anyhow/compare/1.0.45...1.0.46)

---
updated-dependencies:
- dependency-name: anyhow
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-11-08 23:34:03 +00:00
Rob Bradford
751e76db08 vmm: acpi: Use Aml::append_aml_bytes() to generate DSDT
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-08 16:46:30 +00:00
Rob Bradford
d96d98d88e vmm: Port DeviceManager to Aml::append_aml_bytes()
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-08 16:46:30 +00:00
Rob Bradford
185f0c1bf3 vmm: Port MemoryManager to Aml::append_aml_bytes()
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-08 16:46:30 +00:00
Rob Bradford
e04cbb2ad4 vmm: Port PciSegment to Aml::append_aml_bytes()
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-08 16:46:30 +00:00
Rob Bradford
986e43f899 vmm: cpu: Port CpuManager to Aml::append_aml_bytes()
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-08 16:46:30 +00:00
Rob Bradford
d0c3342c97 vmm: acpi: Report time to generate ACPI tables
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-08 16:46:30 +00:00
dependabot[bot]
b4f3e1c2a1 build: bump libc from 0.2.106 to 0.2.107
Bumps [libc](https://github.com/rust-lang/libc) from 0.2.106 to 0.2.107.
- [Release notes](https://github.com/rust-lang/libc/releases)
- [Commits](https://github.com/rust-lang/libc/compare/0.2.106...0.2.107)

---
updated-dependencies:
- dependency-name: libc
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-11-08 10:01:40 +00:00
dependabot[bot]
cc1db2ea13 build: bump serde_json from 1.0.68 to 1.0.69
Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.68 to 1.0.69.
- [Release notes](https://github.com/serde-rs/json/releases)
- [Commits](https://github.com/serde-rs/json/compare/v1.0.68...v1.0.69)

---
updated-dependencies:
- dependency-name: serde_json
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-11-06 16:37:23 +00:00
dependabot[bot]
960c7027c7 build: bump anyhow from 1.0.44 to 1.0.45
Bumps [anyhow](https://github.com/dtolnay/anyhow) from 1.0.44 to 1.0.45.
- [Release notes](https://github.com/dtolnay/anyhow/releases)
- [Commits](https://github.com/dtolnay/anyhow/compare/1.0.44...1.0.45)

---
updated-dependencies:
- dependency-name: anyhow
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-11-03 09:43:57 +00:00
Rob Bradford
a2e02a8fff vmm: Add SGX section creation logging
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
def98faf37 vmm, vm-allocator: Introduce an allocator for platform devices
This allocator allocates 64-bit MMIO addresses for use with platform
devices e.g. ACPI control devices and ensures there is no overlap with
PCI address space ranges which can cause issues with PCI device
remapping.

Use this allocator the ACPI platform devices.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
9d1a7e43a7 vmm: Refactor MCFG table creation to take just the PCI segments
This matches the lock taking behaviour of other functions in this file.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
afe95e5a2a vmm: Use an allocator specifically for RAM regions
Rather than use the system MMIO allocator for RAM use an allocator that
covers the full RAM range.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
b8fee11822 vmm: Place SGX EPC region between RAM and device area
Increase the start of the device area to accomodate the SGX EPC area.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
e20be3e147 vmm: Check hotplug memory against end of RAM not start of device area
This is because the SGX region will be placed between the end of ram and
the start of the device area.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
ec81f377b6 vmm: Refactor SGX setup to inside MemoryManager::new()
This makes it possible to manually allocate the SGX region after the end
of RAM region.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
438be0dad5 vmm: api: Add pci_segment entries to OpenAPI file
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
1a5a89508b vmm: Remove segment_id from DeviceNode
With the segment id now encoded in the bdf it is not necessary to have
the separate field for it.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
ae83e3b383 vmm: Use PciBdf throughout in order to remove manual bit manipulation
In particular use the accessor for getting the device id from the bdf.
As a side effect the VIOT table is now segment aware.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
a26ce353d3 vmm: Use the PCI segment allocator for pmem and fs cache allocations
Use the MMIO address space allocator associated with the segment that
the devices are on.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
cd9d1cf8fc pci, virtio-devices, vmm: Allocate PCI 64-bit bars per segment
Since each segment must have a non-overlapping memory range associated
with it the device memory must be equally divided amongst all segments.
A new allocator is used for each segment to ensure that BARs are
allocated from the correct address ranges. This requires changes to
PciDevice::allocate/free_bars to take that allocator and when
reallocating BARs the correct allocator must be identified from the
ranges.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
7cfeefde57 vmm: Add validation logic to check user specified pci_segment is valid
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
f71f6da907 vmm: Add pci_segment option to UserDeviceConfig
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
d4f7f42800 vmm: Add pci_segment option to DeviceConfig
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
ca955a47ff vmm: Implement pci_segment options for hotpluggable virtio devices
For all the devices that support being hotplugged (disk, net, pmem, fs
and vsock) add "pci_segment" option and propagate that through to the
addition onto the PCI busses.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
88378d17a2 vmm: Take PCI segment ID into BAR size allocation
Move the decision on whether to use a 64-bit bar up to the DeviceManager
so that it can use both the device type (e.g. block) and the PCI segment
ID to decide what size bar should be used.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
cf1c2bf0e8 vmm: Use the same set of reserved PCI IRQ routes for all segments
Generate a set of 8 IRQs and round-robin distribute those over all the
slots for a bus. This same set of IRQs is then used for all PCI
segments.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
e3d6e222a1 vmm: Add the required number of PCI segments
The platform config may specify a number of PCI segments to use, if this
greater than 1 then we add supplemental PCI segments as well as the
default segment.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
f8d9c073f0 vmm: Add "--platform"
This currently contains only the number over PCI segments to create.
This is limited to 16 at the moment which should allow 496 user specified
PCI devices.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
e3c35a3579 vmm: Allow specifying the PCI segment ID when adding virtio PCI device
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
7a4606f800 vmm: Implement ACPI hotplug/unplug handling for PCI segments
For the bus scanning the GED AML code now calls into a PSCN method that
scans all buses. This approach was chosen since it handles the case
correctly where one GED interrupt is services for two hotplugs on
distinct segments.

The PCIU and PCID field values are now determined by the PSEG field that
is uses to select which segment those values should be used for.
Similarly _EJ0 will notify based on the value of _SEG.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
49f19e061b vmm: Use device's segment when removing a device
The segment ID has been stored in the DeviceTree.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
d33d254921 vmm: Remove hardcoded zero PCI segment id
Replace the hardcoded zero PCI segment id when adding devices to the bus
and extend the DeviceTree to hold the PCI segment id.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
b8b0dab1ae vmm: Add segment_id parameter to DeviceManager::add_pci_device
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
c118d7d7d3 vmm: Only fill in PIO and 32-bit MMIO space on zero segment
Since each segment must have disjoint address spaces only advertise
address space in the 32-bit range and the PIO address space on the
default (zero) segment.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
3059ba4305 vmm: Refactor PCI segment creation to support non-default segment
Split PciSegment::new_default_segment() into a separate
PciSegment::new() and those parts required only for the default segment
(PIO PCI config device.)

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
080ce9b068 vmm: Populate MCFG table with details of all PCI segments
The MCFG table holds the PCI MMIO config details for all the MMIO PCI
config devices.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
c886d71d29 vmm: Add MMIO & PIO config devices for all PCI segments
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
4f5c179b9b vmm: Construct PCI DSDT data from all segments
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
fbb385834a vmm: Use a vector to store multiple segments
For now this still contains just one segment but is expanding in
preparation for more segments.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
b4fc02857f vmm: Advertise PCI MMIO config range for PCI bus
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
b55f009b8a vmm: Calculate MMIO config address based on segment id
This means that each segment can have its own PCI MMIO config device
without overlapping.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
b59f1d90dd vmm: Expose _SEG with segment ID for PCI bus
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
a7fba8105f vmm: Customise PCI device name based on segment id
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
Rob Bradford
8b67298ad8 vmm: Move PCI bus DSDT data onto PciSegment
This commit moves the code that generates the DSDT data for the PCI bus
into PciSegment making no functional changes to the generated AML.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-11-02 16:55:42 +00:00
dependabot[bot]
21706b02b8 build: bump libc from 0.2.105 to 0.2.106
Bumps [libc](https://github.com/rust-lang/libc) from 0.2.105 to 0.2.106.
- [Release notes](https://github.com/rust-lang/libc/releases)
- [Commits](https://github.com/rust-lang/libc/compare/0.2.105...0.2.106)

---
updated-dependencies:
- dependency-name: libc
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-11-02 06:30:25 +00:00
Wei Liu
23f63262e7 vmm: drop underscore from used variables
Variables that start with underscore are used to silence rustc.
Normally those variables are not used in code.

This patch drops the underscore from variables that are used. This is
less confusing to readers.

No functional change intended.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2021-10-28 13:38:20 +01:00
Bo Chen
455b0d12e9 vmm: Remove VFIO user device from VmConfig upon device unplug
It ensures we won't recreate the unplugged device on reboot.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-10-28 09:42:52 +01:00
Rob Bradford
beb0c0707f vmm: Move logging output for the debug (0x80) port to info!()
This makes it much easier to use since the info!() level produces far
fewer messages and thus has less overhead.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-10-26 16:48:09 +01:00
dependabot[bot]
b5d5ffa969 build: bump libc from 0.2.104 to 0.2.105
Bumps [libc](https://github.com/rust-lang/libc) from 0.2.104 to 0.2.105.
- [Release notes](https://github.com/rust-lang/libc/releases)
- [Commits](https://github.com/rust-lang/libc/compare/0.2.104...0.2.105)

---
updated-dependencies:
- dependency-name: libc
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-10-26 01:32:06 +00:00
Sebastien Boeuf
0249e8641a Move Cloud Hypervisor to virtio-queue crate
Relying on the vm-virtio/virtio-queue crate from rust-vmm which has been
copied inside the Cloud Hypervisor tree, the entire codebase is moved to
the new definition of a Queue and other related structures.

The reason for this move is to follow the upstream until we get some
agreement for the patches that we need on top of that to make it
properly work with Cloud Hypervisor.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-10-22 11:38:55 +02:00
Sebastien Boeuf
7f0e7d19a6 Revert "build: bump vm-memory from 0.6.0 to 0.7.0"
This was causing some issues because of the use of 2 different versions
for the vm-memmory crate. We'll wait for all dependencies to be properly
resolved before we move to 0.7.0.

This reverts commit 76b6c62d07.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-10-22 11:09:36 +02:00
Bo Chen
76b6c62d07 build: bump vm-memory from 0.6.0 to 0.7.0
Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-10-21 06:19:02 -07:00
Rob Bradford
e9ea9d63f8 vmm: Use assert!() rather than if+panic
As identified by the new beta clippy.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-10-19 19:42:36 +01:00
dependabot[bot]
feed0efc60 build: bump libc from 0.2.103 to 0.2.104
Bumps [libc](https://github.com/rust-lang/libc) from 0.2.103 to 0.2.104.
- [Release notes](https://github.com/rust-lang/libc/releases)
- [Commits](https://github.com/rust-lang/libc/compare/0.2.103...0.2.104)

---
updated-dependencies:
- dependency-name: libc
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-10-19 11:06:10 +02:00
Rob Bradford
2cccdc5ddd vmm: Naturally align PCI BARs on relocation
When allocating PCI MMIO BARs they should always be naturally aligned
(i.e. aligned to the size of the BAR itself.)

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-10-15 14:54:18 -07:00
Rob Bradford
c25bd447a1 vmm: Ensure that allocate_bars() is called before mmio_regions()
The allocate_bars method has a side effect which collates the BARs used
for the device and stores them internally. Ensure that any use of this
internal state is after the state is created otherwise no MMIO regions
will be seen and so none will be mapped.

Fixes: #3237

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-10-14 10:14:33 -07:00
dependabot[bot]
610d694f1d build: bump thiserror from 1.0.29 to 1.0.30
Bumps [thiserror](https://github.com/dtolnay/thiserror) from 1.0.29 to 1.0.30.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](https://github.com/dtolnay/thiserror/compare/1.0.29...1.0.30)

---
updated-dependencies:
- dependency-name: thiserror
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-10-12 09:45:31 +02:00
Sebastien Boeuf
58d8206e2b migration: Use MemoryManager restore code path
Instead of creating a MemoryManager from scratch, let's reuse the same
code path used by snapshot/restore, so that memory regions are created
identically to what they were on the source VM.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-10-06 18:35:49 -07:00
Sebastien Boeuf
1e1e61614c vmm: memory_manager: Leverage new codepath for snapshot/restore
Now that all the pieces are in place, we can restore a VM with the new
codepath that restores properly all memory regions, allowing for ACPI
memory hotplug to work properly with snapshot/restore feature.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-10-06 18:35:49 -07:00
Sebastien Boeuf
6a55768d94 vmm: Create MemoryManager from restore data
Extending the MemoryManager::new() function to be able to create a
MemoryManager from data that have been previously stored instead of
always creating everything from scratch.

This change brings real added value as it allows a VM to be restored
respecting the proper memory layout instead of hoping the regions will
be created the way they were before.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-10-06 18:35:49 -07:00
Sebastien Boeuf
5b177b205b arch, vmm: Extend the data being snapshot
Storing multiple data coming from the MemoryManager in order to be able
to restore without creating everything from scratch.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-10-06 18:35:49 -07:00
Sebastien Boeuf
f440976a7c vmm: memory_manager: Add a way to restore memory regions properly
This new function will be able to restore memory regions and memory
zones based on the GuestMemoryMapping list that will be provided through
snapshot/restore and migration phases.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-10-06 18:35:49 -07:00
Sebastien Boeuf
0d573ae86c vmm: memory_manager: Add file_offset to GuestRamMapping
This will help restoring the region with the correct file offset for the
memory mapping.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-10-06 18:35:49 -07:00
Sebastien Boeuf
01420f5195 vmm: memory_manager: Add virtio_mem to GuestRamMapping
This will help identify if the range belongs to a virtio-mem region or
not.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-10-06 18:35:49 -07:00
Sebastien Boeuf
dfb1829f65 vmm: memory_manager: Add zone_id to GuestRamMapping
This can help identifying which zone relates to which memory range.
This is going to be useful when recreating GuestMemory regions from
the previous layout instead of having to recreate everything from
scratch.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-10-06 18:35:49 -07:00
Sebastien Boeuf
b5d11f72b3 vmm: memory_manager: Factorize allocation of ranges
Create a dedicated function to factorize the allocation of the memory
ranges, and helping with the simplification of MemoryManager::new()
function.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-10-06 18:35:49 -07:00
Sebastien Boeuf
00951f17d4 vmm: memory_manager: Simplify regions creation
By updating the list of GuestMemory regions with the virtio-mem ones
before the creation of the MemoryManager, we know the GuestMemory is up
to date and the allocation of memory ranges is simplified afterwards.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-10-06 18:35:49 -07:00
Sebastien Boeuf
63c6c78c4e vmm: memory_manager: Factorize configuration validation
In order to simplify MemoryManager::new() function. let's move the
memory configuration validation to its own function.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-10-06 18:35:49 -07:00
Rob Bradford
84fc0e093d vmm: Move PciSegment to new file
Move the PciSegment struct and the associated code to a new file. This
will allow some clearer separation between the core DeviceManager and
PCI handling.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-10-05 10:54:07 +01:00
Rob Bradford
0eb78ab177 vmm: Extract PCI related state from DeviceManager
Move the PCI related state from the DeviceManager struct to a PciSegment
struct inside the DeviceManager. This is in preparation for multiple
segment support. Currently this state is just the bus itself, the MMIO
and PIO config devices and hotplug related state.

The main change that this required is using the Arc<Mutex<PciBus>> in
the device addition logic in order to ensure that
the bus could be created earlier.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-10-05 10:54:07 +01:00
Bo Chen
1a4747a20f Build: Seccompiler: Move to use the released version from crate.io
Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-10-01 11:34:54 -07:00
Rob Bradford
83066cf58e vmm: Set a default maximum physical address size
When using PVH for booting (which we use for all firmwares and direct
kernel boot) the Linux kernel does not configure LA57 correctly. As such
we need to limit the address space to the maximum 4-level paging address
space.

If the user knows that their guest image can take advantage of the
5-level addressing and they need it for their workload then they can
increase the physical address space appropriately.

This PR removes the TDX specific handling as the new address space limit
is below the one that that code specified.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-10-01 08:59:15 -07:00
Sebastien Boeuf
495e444ca6 vmm: Add ACPI tables to TdVmmData when running TDX
Whenever running TDX, we must pass the ACPI tables to the TDVF firmware
running in the guest. The proper way to do this is by adding the tables
to the TdHob as a TdVmmData type, so that TDVF will know how to access
these tables and expose them to the guest OS.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-30 06:35:55 -07:00
Sebastien Boeuf
b99a3a7dc9 vmm: Factorize ACPI tables creation inside boot() function
Instead of having the ACPI tables being created both in x86_64 and
aarch64 implementations of configure_system(), we can remove the
duplicated code by moving the ACPI tables creation in vm.rs inside the
boot() function.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-30 06:35:55 -07:00
Yu Li
08021087ec vmm: add prefault option in memory and memory-zone
The argument `prefault` is provided in MemoryManager, but it can
only be used by SGX and restore.
With prefault (MAP_POPULATE) been set, subsequent page faults will
decrease during running, although it will make boot slower.

This commit adds `prefault` in MemoryConfig and MemoryZoneConfig.
To resolve conflict between memory and restore, argument
`prefault` has been changed from `bool` to `Option<bool>`, when
its value is None, config from memory will be used, otherwise
argument in Option will be used.

Signed-off-by: Yu Li <liyu.yukiteru@bytedance.com>
2021-09-29 14:17:35 +02:00
Sebastien Boeuf
59031531b6 vmm: Simplify the way memory is snapshot and restored
By using a single file for storing the memory ranges, we simplify the
way snapshot/restore works by avoiding multiples files, but the main and
more important point is that we have now a way to save only the ranges
that matter. In particular, the ranges related to virtio-mem regions are
not always fully hotplugged, meaning we don't want to save the entire
region. That's where the usage of memory ranges is interesting as it
lets us optimize the snapshot/restore process when one or multiple
virtio-mem regions are involved.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-28 10:15:22 -07:00
Sebastien Boeuf
1ea63f50a1 vmm: Move MemoryRangeTable creation to the MemoryManager
The function memory_range_table() will be reused by the MemoryManager in
a following patch to describe all the ranges that we should snapshot.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-28 10:15:22 -07:00
Sebastien Boeuf
86f86c5348 vmm: Optimize migration for virtio-mem
Copy only the memory ranges that have been plugged through virtio-mem,
allowing for an interesting optimization regarding the time it takes to
migrate a large virtio-mem device. Even if the hotpluggable space is
very large (say 64GiB), if only 1GiB has been previously added to the
VM, only 1GiB will be sent to the destination VM, avoiding the transfer
of the remaining 63GiB which are unused.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-28 10:15:22 -07:00
Sebastien Boeuf
e390775bcb vmm, virtio-devices: Move BlocksState creation to the MemoryManager
By creating the BlocksState object in the MemoryManager, we can directly
provide it to the virtio-mem device when being created. This will allow
the MemoryManager through each VirtioMemZone to have a handle onto the
blocks that are plugged at any point in time.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-28 10:15:22 -07:00
Sebastien Boeuf
a1caa6549a vmm: Add page size as a parameter for MemoryRangeTable::from_bitmap()
This will be helpful to support the creation of a MemoryRangeTable from
virtio-mem, as it uses 2M pages.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-28 10:15:22 -07:00
Sebastien Boeuf
d7115ec656 virtio-devices: mem: Add snapshot/restore support
Adding the snapshot/restore support along with migration as well,
allowing a VM with virtio-mem devices attached to be properly
migrated.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-28 10:15:22 -07:00
Sebastien Boeuf
7bbcc0f849 vmm: memory_manager: Make sure the hotplugged_size is up to date
The amount of memory plugged in the virtio-mem region should always be
kept up to date in the hotplugged_size field from VirtioMemZone.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-28 10:15:22 -07:00
Sebastien Boeuf
c4dc7a583d vmm: memory_manager: Simplify the MemoryManager structure
There's no need to duplicate the GuestMemory for snapshot purpose, as we
always have a handle onto the GuestMemory through the guest_memory
field.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-28 10:15:22 -07:00
Sebastien Boeuf
74485924b1 vmm: memory_manager: Simplification to avoid unnecessary locking
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-28 10:15:22 -07:00
Rob Bradford
4889999277 vmm: Only advertise a single PCI bus
Since we only support a single PCI bus right now advertise only a single
bus in the ACPI tables. This reduces the number of VM exits from probing
substantially.

Number of PCI config I/O port exits: 17871 -> 1551 (91% reduction) with
direct kernel boot.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-28 14:10:10 +02:00
dependabot[bot]
eda0dc20d3 build: bump libc from 0.2.102 to 0.2.103
Bumps [libc](https://github.com/rust-lang/libc) from 0.2.102 to 0.2.103.
- [Release notes](https://github.com/rust-lang/libc/releases)
- [Commits](https://github.com/rust-lang/libc/compare/0.2.102...0.2.103)

---
updated-dependencies:
- dependency-name: libc
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-28 10:45:35 +00:00
Rob Bradford
b50519651c vmm: Simplify slot eject code in PCI ACPI device code
Use a simpler method for extracting the affected slot on the eject
command. Also update the terminology to reflect that this a slot rather
than a bdf (which is what device id refers to elsewhere.)

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-28 12:03:23 +02:00
William Douglas
a8f063db7c vmm: Refactor serial buffer to allow flush on PTY when writable
Refactor the serial buffer handling in order to write the serial
buffer's output to a PTY connected after the serial device stops being
written to by the guest.

This change moves the serial buffer initialization inside the serial
manager. That is done to allow the serial buffer to be made aware of
the PTY and epoll fds needed in order to modify the
EpollDispatch::File trigger. These are then used by the serial buffer
to trigger an epoll event when the PTY fd is writable and the buffer
has content in it. They are also used to remove the trigger when the
buffer is emptied in order to avoid unnecessary wake-ups.

Signed-off-by: William Douglas <william.douglas@intel.com>
2021-09-27 14:18:21 +01:00
Sebastien Boeuf
b910a7922d vmm: Fix migration when writing/reading big chunks of data
Both read_exact_from() and write_all_to() functions from the GuestMemory
trait implementation in vm-memory are buggy. They should retry until
they wrote or read the amount of data that was expected, but instead
they simply return an error when this happens. This causes the migration
to fail when trying to send important amount of data through the
migration socket, due to large memory regions.

This should be eventually fixed in vm-memory, and here is the link to
follow up on the issue: https://github.com/rust-vmm/vm-memory/issues/174

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-27 11:13:56 +02:00
Rob Bradford
1a2d0e6dd8 build: bump linux-loader from 0.3.0 to 0.4.0
Requires manual change to command line loading.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-24 09:11:57 +00:00
Michael Zhao
d72af85c42 vmm: Add "_CCA" field to ACPI DSDT table
"_CCA" is required by DMA configuration on AArch64.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-09-24 07:57:57 +01:00
Rob Bradford
43365ade2e vmm, pci: Implement virtio-mem support for vfio-user
Implement the infrastructure that lets a virtio-mem device map the guest
memory into the device. This is necessary since with virtio-mem zones
memory can be added or removed and the vfio-user device must be
informed.

Fixes: #3025

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-21 15:42:49 +01:00
Rob Bradford
e9d67dc405 vmm: pci: Move creation of vfio_user::Client to DeviceManager
By moving this from the VfioUserPciDevice to DeviceManager the client
can be reused for handling DMA mapping behind an IOMMU.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-21 15:42:49 +01:00
Rob Bradford
fd4f32fa69 virtio-mem: Support multiple mappings
For vfio-user the mapping handler is per device and needs to be removed
when the device in unplugged.

For VFIO the mapping handler is for the default VFIO container (used
when no vIOMMU is used - using a vIOMMU does not require mappings with
virtio-mem)

To represent these two use cases use an enum for the handlers that are
stored.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-21 15:42:49 +01:00
dependabot[bot]
d826b4fbdc build: bump arc-swap from 1.3.2 to 1.4.0
Bumps [arc-swap](https://github.com/vorner/arc-swap) from 1.3.2 to 1.4.0.
- [Release notes](https://github.com/vorner/arc-swap/releases)
- [Changelog](https://github.com/vorner/arc-swap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/vorner/arc-swap/compare/v1.3.2...v1.4.0)

---
updated-dependencies:
- dependency-name: arc-swap
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-19 17:12:50 +00:00
Rob Bradford
0faa7afac2 vmm: Add fast path for PCI config IO port
Looking up devices on the port I/O bus is time consuming during the
boot at there is an O(lg n) tree lookup and the overhead from taking a
lock on the bus contents.

Avoid this by adding a fast path uses the hardcoded port address and
size and directs PCI config requests directly to the device.

Command line:
target/release/cloud-hypervisor --kernel ~/src/linux/vmlinux --cmdline "root=/dev/vda1 console=ttyS0" --serial tty --console off --disk path=~/workloads/focal-server-cloudimg-amd64-custom-20210609-0.raw --api-socket /tmp/api

PIO exit: 17913
PCI fast path: 17871
Percentage on fast path: 99.8%

perf before:

marvin:~/src/cloud-hypervisor (main *)$ perf report -g | grep resolve
     6.20%     6.20%  vcpu0            cloud-hypervisor    [.] vm_device:🚌:Bus::resolve

perf after:

marvin:~/src/cloud-hypervisor (2021-09-17-ioapic-fast-path *)$ perf report -g | grep resolve
     0.08%     0.08%  vcpu0            cloud-hypervisor    [.] vm_device:🚌:Bus::resolve

The compromise required to implement this fast path is bringing the
creation of the PciConfigIo device into the DeviceManager::new() so that
it can be used in the VmmOps struct which is created before
DeviceManager::create_devices() is called.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-17 17:09:45 +01:00
Michael Zhao
b3fa56544c virtio-devices: iommu: Support AArch64
The MSI IOVA address on X86 and AArch64 is different.

This commit refactored the code to receive the MSI IOVA address and size
from device_manager, which provides the actual IOVA space data for both
architectures.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-09-17 12:19:46 +02:00
Michael Zhao
253c06d3ba arch/aarch64: Add virtio-iommu device in FDT
Add a virtio-iommu node into FDT if iommu option is turned on. Now we
support only one virtio-iommu device.

Signed-off-by: Michael Zhao <michael.zhao@arm.com>
2021-09-17 12:19:46 +02:00
William Douglas
46f6d9597d vmm: Switch to using the serial_manager for serial input
This change switches from handling serial input in the VMM thread to
its own thread controlled by the SerialManager.

The motivation for this change is to avoid the VMM thread being unable
to process events while serial input is happening and vice versa.

The change also makes future work flushing the serial buffer on PTY
connections easier.

Signed-off-by: William Douglas <william.douglas@intel.com>
2021-09-17 11:15:35 +01:00
William Douglas
7b4f56e372 vmm: Add new serial_manager for serial input handling
This change adds a SerialManager with its own epoll handling that
should be created and run by the DeviceManager when creating an
appropriately configured console (serial tty or pty).

Both stdin and pty input are handled by the SerialManager. The stdin
and pty specific methods used by the VMM should be removed in a future
commit.

Signed-off-by: William Douglas <william.douglas@intel.com>
2021-09-17 11:15:35 +01:00
William Douglas
d6a2f48b32 vmm: device_manager: Make PtyPair implement Clone
The clone method for PtyPair should have been an impl of the Clone
trait but the method ended up not being used. Future work will make
use of the trait however so correct the missing trait implementation.

Signed-off-by: William Douglas <william.douglas@intel.com>
2021-09-17 11:15:35 +01:00
dependabot[bot]
f67b3f79ea build: bump vmm-sys-util from 0.8.0 to 0.9.0
Bumps [vmm-sys-util](https://github.com/rust-vmm/vmm-sys-util) from 0.8.0 to 0.9.0.
- [Release notes](https://github.com/rust-vmm/vmm-sys-util/releases)
- [Changelog](https://github.com/rust-vmm/vmm-sys-util/blob/main/CHANGELOG.md)
- [Commits](https://github.com/rust-vmm/vmm-sys-util/compare/v0.8.0...v0.9.0)

---
updated-dependencies:
- dependency-name: vmm-sys-util
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

This needed a bunch of manual updates as well, including vfio-ioctls and
vhost crates. The vhost crate is being patched with the latest version
from rust-vmm because the version 0.1.0 on crates.io doesn't include the
patches we need yet.

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-16 14:01:19 +01:00
dependabot[bot]
c1e896dddb build: bump libc from 0.2.101 to 0.2.102
Bumps [libc](https://github.com/rust-lang/libc) from 0.2.101 to 0.2.102.
- [Release notes](https://github.com/rust-lang/libc/releases)
- [Commits](https://github.com/rust-lang/libc/compare/0.2.101...0.2.102)

---
updated-dependencies:
- dependency-name: libc
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-15 17:23:46 +00:00
Sebastien Boeuf
a6040d7a30 vmm: Create a single VFIO container
For most use cases, there is no need to create multiple VFIO containers
as it causes unwanted behaviors. Especially when passing multiple
devices from the same IOMMU group, we need to use the same container so
that it can properly list the groups that have been already opened. The
correct logic was already there in vfio-ioctls, but it was incorrectly
used from our VMM implementation.

For the special case where we put a VFIO device behind a vIOMMU, we must
create one container per device, as we need to control the DMA mappings
per device, which is performed at the container level. Because we must
keep one container per device, the vIOMMU use case prevents multiple
devices attached to the same IOMMU group to be passed through the VM.
But this is a limitation that we are fine with, especially since the
vIOMMU doesn't let us group multiple devices in the same group from a
guest perspective.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-09-15 09:08:13 -07:00
dependabot[bot]
8836715c2d build: bump serde_json from 1.0.67 to 1.0.68
Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.67 to 1.0.68.
- [Release notes](https://github.com/serde-rs/json/releases)
- [Commits](https://github.com/serde-rs/json/compare/v1.0.67...v1.0.68)

---
updated-dependencies:
- dependency-name: serde_json
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-15 00:06:23 +00:00
Alyssa Ross
330b5ea3be vmm: notify virtio-console of pty resizes
When a pty is resized (using the TIOCSWINSZ ioctl -- see ioctl_tty(2)),
the kernel will send a SIGWINCH signal to the pty's foreground process
group to notify it of the resize.  This is the only way to be notified
by the kernel of a pty resize.

We can't just make the cloud-hypervisor process's process group the
foreground process group though, because a process can only set the
foreground process group of its controlling terminal, and
cloud-hypervisor's controlling terminal will often be the terminal the
user is running it in.  To work around this, we fork a subprocess in a
new process group, and set its process group to be the foreground
process group of the pty.  The subprocess additionally must be running
in a new session so that it can have a different controlling
terminal.  This subprocess writes a byte to a pipe every time the pty
is resized, and the virtio-console device can listen for this in its
epoll loop.

Alternatives I considered were to have the subprocess just send
SIGWINCH to its parent, and to use an eventfd instead of a pipe.
I decided against the signal approach because re-purposing a signal
that has a very specific meaning (even if this use was only slightly
different to its normal meaning) felt unclean, and because it would
have required using pidfds to avoid race conditions if
cloud-hypervisor had terminated, which added complexity.  I decided
against using an eventfd because using a pipe instead allows the child
to be notified (via poll(2)) when nothing is reading from the pipe any
more, meaning it can be reliably notified of parent death and
terminate itself immediately.

I used clone3(2) instead of fork(2) because without
CLONE_CLEAR_SIGHAND the subprocess would inherit signal-hook's signal
handlers, and there's no other straightforward way to restore all signal
handlers to their defaults in the child process.  The only way to do
it would be to iterate through all possible signals, or maintain a
global list of monitored signals ourselves (vmm:vm::HANDLED_SIGNALS is
insufficient because it doesn't take into account e.g. the SIGSYS
signal handler that catches seccomp violations).

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2021-09-14 15:43:25 +01:00
Alyssa Ross
28382a1491 virtio-devices: determine tty size in console
This prepares us to be able to handle console resizes in the console
device's epoll loop, which we'll have to do if the output is a pty,
since we won't get SIGWINCH from it.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2021-09-14 15:43:25 +01:00
dependabot[bot]
f3778a7fc7 build: bump anyhow from 1.0.43 to 1.0.44
Bumps [anyhow](https://github.com/dtolnay/anyhow) from 1.0.43 to 1.0.44.
- [Release notes](https://github.com/dtolnay/anyhow/releases)
- [Commits](https://github.com/dtolnay/anyhow/compare/1.0.43...1.0.44)

---
updated-dependencies:
- dependency-name: anyhow
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-14 00:22:00 +00:00
Alyssa Ross
8abe8c679b seccomp: allow mmap everywhere brk is allowed
Musl often uses mmap to allocate memory where Glibc would use brk.
This has caused seccomp violations for me on the API and signal
handling threads.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2021-09-10 12:01:31 -07:00
Rob Bradford
b6b686c71c vmm: Shutdown VMM if API thread panics
See: #3031

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-10 10:52:08 -07:00
Rob Bradford
171d12943d vmm: memory_manager: Increase robustness of MemoryManager control device
See: #1289

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-10 10:23:19 -07:00
Rob Bradford
bdc44cd8bc vmm: cpu: Increase robustness of CpuManager control device
See: #1289

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-10 10:22:05 -07:00
Bo Chen
4f37a273d9 vmm: Fix clippy issue
error: all if blocks contain the same code at the end
   --> vmm/src/memory_manager.rs:884:9
    |
884 | /             Ok(mm)
885 | |         }
    | |_________^

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-09-08 13:31:19 -07:00
Rob Bradford
d64a77a5c6 vmm: Shutdown VMM if signal thread panics
See: #3031

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-08 11:26:48 -07:00
Rob Bradford
e0d05683ab vmm: Split up functions for creating signal handler and tty setup
These are quite separate and should be in their own functions.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-08 11:26:48 -07:00
Rob Bradford
387753ae1d vmm: Remove concept of "input_enabled"
This concept ends up being broken with multiple types on input connected
e.g. console on TTY and serial on PTY. Already the code for checking for
injecting into the serial device checks that the serial is configured.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-08 11:26:48 -07:00
Rob Bradford
951ad3495e vmm: Only resize virtio-console when attached to TTY
Fixes: #3092

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-08 11:26:48 -07:00
Rob Bradford
0dbb2683e3 vmm: Consolidate duplicated code for setting up signal handler
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-08 11:26:48 -07:00
Rob Bradford
687d646c60 virtio-devices, vmm: Shutdown VMM on virtio thread panic
Shutdown the VMM in the virtio (or VMM side of vhost-user) thread
panics.

See: #3031

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-08 09:40:36 +01:00
Rob Bradford
54e523c302 virtio-devices: Use a common method for spawning virtio threads
Introduce a common solution for spawning the virtio threads which will
make it easier to add the panic handling.

During this effort I discovered that there were no seccomp filters
registered for the vhost-user-net thread nor the vhost-user-block
thread. This change also incorporates basic seccomp filters for those as
part of the refactoring.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-08 09:40:36 +01:00
Wei Liu
9c5b404415 vmm: MSHV now supports VFIO-based device passthrough
Drop a few feature gates and adjust code a bit.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2021-09-07 15:17:08 +01:00
Wei Liu
10b954e954 build: use vfio-ioctls that supports MSHV
Disable default features and propagate hypervisor selection where
necessary.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2021-09-07 15:17:08 +01:00
dependabot[bot]
a20041ba68 build: bump thiserror from 1.0.28 to 1.0.29
Bumps [thiserror](https://github.com/dtolnay/thiserror) from 1.0.28 to 1.0.29.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](https://github.com/dtolnay/thiserror/compare/1.0.28...1.0.29)

---
updated-dependencies:
- dependency-name: thiserror
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-07 08:35:50 +00:00
Henry Wang
c50051a686 device_manager: Enable power button for ACPI on AArch64
Current AArch64 power button is only for device tree using a PL061
GPIO controller device. Since AArch64 now supports ACPI, this
commit extend the power button on AArch64 to:

- Using GED for ACPI+UEFI boot.
- Using PL061 for device tree boot.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-09-03 10:27:52 -07:00
Rob Bradford
e475b12cf7 virtio-devices, vmm: Upgrade restore related messages to info!()
These happen only sporadically so can be included at the info!() level.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-03 09:30:55 -07:00
Rob Bradford
968902dfec devices, vmm: Upgrade exit reasons to info!() level debugging
These statements are useful for understanding the cause of reset or
shutdown of the VM and are not spammy so should be included at info!()
level.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-03 09:30:55 -07:00
Alyssa Ross
7549149bb5 vmm: ensure signal handlers run on the right thread
Despite setting up a dedicated thread for signal handling, we weren't
making sure that the signals we were listening for there were actually
dispatched to the right thread.  While the signal-hook provides an
iterator API, so we can know that we're only processing the signals
coming out of the iterator on our signal handling thread, the actual
signal handling code from signal-hook, which pushes the signals onto
the iterator, can run on any thread.  This can lead to seccomp
violations when the signal-hook signal handler does something that
isn't allowed on that thread by our seccomp policy.

To reproduce, resize a terminal running cloud-hypervisor continuously
for a few minutes.  Eventually, the kernel will deliver a SIGWINCH to
a thread with a restrictive seccomp policy, and a seccomp violation
will trigger.

As part of this change, it's also necessary to allow rt_sigreturn(2)
on the signal handling thread, so signal handlers are actually allowed
to run on it.  The fact that this didn't seem to be needed before
makes me think that signal handlers were almost _never_ actually
running on the signal handling thread.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2021-09-02 21:33:31 +01:00
Rob Bradford
c2144b5690 vmm, virtio-console: Move input reading into virtio-console thread
Move the processing of the input from stdin, PTY or file from the VMM
thread to the existing virtio-console thread. The handling of the resize
of a virtio-console has not changed but the name of the struct used to
support that has been renamed to reflect its usage.

Fixes: #3060

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-02 21:17:33 +01:00
Henry Wang
0d01eac1d4 vmm: Do the downcast of GicDevice in a safer way for AArch64
Downcasting of GicDevice trait might fail. Therefore we try to
downcast the trait first and only if the downcasting succeeded we
can then use the object to call methods. Otherwise, do nothing and
log the failure.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-09-02 15:18:41 +01:00
Henry Wang
46c60183cd arch, vmm: Implement GIC Pausable trait
This commit implements the GIC (including both GICv3 and GICv3ITS)
Pausable trait. The pause of device manager will trigger a "pause"
of GIC, where we flush GIC pending tables and ITS tables to the
guest RAM.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-09-02 15:18:41 +01:00
Rob Bradford
66f0b5b2b6 vmm: Open the serial PTY in non-blocking mode
This prevents the boot of the guest kernel from being blocked by
blocking I/O on the serial output since the data will be buffered into
the SerialBuffer.

Fixes: #3004

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-02 13:52:18 +01:00
Rob Bradford
d92707afc5 vmm: Introduce a SerialBuffer for buffering serial output
Introduce a dynamic buffer for storing output from the serial port. The
SerialBuffer implements std::io::Write and can be used in place of the
direct output for the serial device.

The internals of the buffer is a vector that grows dynamically based on
demand up to a fixed size at which point old data will be overwritten.
Currently the buffer is only flushed upon writes.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-09-02 13:52:18 +01:00
Alyssa Ross
9a634f07cb build: update Cargo for rust-vmm branch renames
The rust-vmm crates we're pulling from git have renamed their main
branches.  We need to update the branch names we're giving to Cargo,
or people who don't have these dependencies cached will get errors
like this when trying to build:

    error: failed to get `vm-fdt` as a dependency of package `arch v0.1.0 (/home/src/cloud-hypervisor/arch)`

    Caused by:
      failed to load source for dependency `vm-fdt`

    Caused by:
      Unable to update https://github.com/rust-vmm/vm-fdt?branch=master#031572a6

    Caused by:
      object not found - no match for id (031572a6edc2f566a7278f1e17088fc5308d27ab); class=Odb (9); code=NotFound (-3)

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2021-09-02 10:38:25 +01:00
dependabot[bot]
b30a95f69a build: bump signal-hook from 0.3.9 to 0.3.10
Bumps [signal-hook](https://github.com/vorner/signal-hook) from 0.3.9 to 0.3.10.
- [Release notes](https://github.com/vorner/signal-hook/releases)
- [Changelog](https://github.com/vorner/signal-hook/blob/master/CHANGELOG.md)
- [Commits](https://github.com/vorner/signal-hook/compare/v0.3.9...v0.3.10)

---
updated-dependencies:
- dependency-name: signal-hook
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-09-01 00:56:16 +00:00
Rob Bradford
63637eba31 vmm: Simplify epoll handling for VMM main loop
Remove the indirection of a dispatch table and simply use the enum as
the event data for the events.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-08-31 21:30:11 +01:00
dependabot[bot]
8841e63e2d build: bump thiserror from 1.0.26 to 1.0.28
Bumps [thiserror](https://github.com/dtolnay/thiserror) from 1.0.26 to 1.0.28.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](https://github.com/dtolnay/thiserror/compare/1.0.26...1.0.28)

---
updated-dependencies:
- dependency-name: thiserror
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-08-30 06:02:02 +00:00
dependabot[bot]
e877718b29 build: bump serde from 1.0.129 to 1.0.130
Bumps [serde](https://github.com/serde-rs/serde) from 1.0.129 to 1.0.130.
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.129...v1.0.130)

---
updated-dependencies:
- dependency-name: serde
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-08-30 05:23:16 +00:00
dependabot[bot]
b0d8b50b36 build: bump serde_derive from 1.0.129 to 1.0.130
Bumps [serde_derive](https://github.com/serde-rs/serde) from 1.0.129 to 1.0.130.
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.129...v1.0.130)

---
updated-dependencies:
- dependency-name: serde_derive
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-08-30 00:23:14 +00:00
dependabot[bot]
2349b7e753 build: bump serde_json from 1.0.66 to 1.0.67
Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.66 to 1.0.67.
- [Release notes](https://github.com/serde-rs/json/releases)
- [Commits](https://github.com/serde-rs/json/compare/v1.0.66...v1.0.67)

---
updated-dependencies:
- dependency-name: serde_json
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-08-29 23:28:14 +00:00
dependabot[bot]
8fae21c10c build: bump arc-swap from 1.3.1 to 1.3.2
Bumps [arc-swap](https://github.com/vorner/arc-swap) from 1.3.1 to 1.3.2.
- [Release notes](https://github.com/vorner/arc-swap/releases)
- [Changelog](https://github.com/vorner/arc-swap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/vorner/arc-swap/compare/v1.3.1...v1.3.2)

---
updated-dependencies:
- dependency-name: arc-swap
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-08-29 22:30:31 +00:00
Bo Chen
b82bb55927 vmm: openapi: use the right default values
This patch fixes couple of typos for the default values from the openapi
yaml file.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-08-27 15:58:23 +01:00
dependabot[bot]
f840335922 build: bump libc from 0.2.100 to 0.2.101
Bumps [libc](https://github.com/rust-lang/libc) from 0.2.100 to 0.2.101.
- [Release notes](https://github.com/rust-lang/libc/releases)
- [Commits](https://github.com/rust-lang/libc/compare/0.2.100...0.2.101)

---
updated-dependencies:
- dependency-name: libc
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-08-26 05:11:06 +00:00
Rob Bradford
4d2a4e2805 vmm: Handle epoll events for PTYs separately
Use two separate events for the console and serial PTY and then drive
the handling of the inputs on the PTY separately. This results in the
correct behaviour when both console and serial are attached to the PTY
as they are triggered separately on the epoll so events are not lost.

Fixes: #3012

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-08-25 13:33:32 +01:00
Rob Bradford
6233f6f68e vmm: Send tty input to correct destination
Check the config to find out which device is attached to the tty and
then send the input from the user into that device (serial or
virtio-console.)

Fixes: #3005

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-08-25 10:08:25 +01:00
dependabot[bot]
8f6a5f979d build: bump serde from 1.0.127 to 1.0.129
Bumps [serde](https://github.com/serde-rs/serde) from 1.0.127 to 1.0.129.
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.127...v1.0.129)

---
updated-dependencies:
- dependency-name: serde
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-08-24 07:16:49 +00:00
dependabot[bot]
b8b16c6eec build: bump serde_derive from 1.0.127 to 1.0.129
Bumps [serde_derive](https://github.com/serde-rs/serde) from 1.0.127 to 1.0.129.
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.127...v1.0.129)

---
updated-dependencies:
- dependency-name: serde_derive
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-08-24 06:50:04 +00:00
dependabot[bot]
a969d5016c build: bump libc from 0.2.99 to 0.2.100
Bumps [libc](https://github.com/rust-lang/libc) from 0.2.99 to 0.2.100.
- [Release notes](https://github.com/rust-lang/libc/releases)
- [Commits](https://github.com/rust-lang/libc/compare/0.2.99...0.2.100)

---
updated-dependencies:
- dependency-name: libc
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-08-24 05:53:20 +00:00
dependabot[bot]
722523f925 build: bump arc-swap from 1.3.0 to 1.3.1
Bumps [arc-swap](https://github.com/vorner/arc-swap) from 1.3.0 to 1.3.1.
- [Release notes](https://github.com/vorner/arc-swap/releases)
- [Changelog](https://github.com/vorner/arc-swap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/vorner/arc-swap/compare/v1.3.0...v1.3.1)

---
updated-dependencies:
- dependency-name: arc-swap
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-08-24 04:19:39 +00:00
Fazla Mehrab
5db4dede28 block_util, vhdx: vhdx crate integration with the cloud hypervisor
vhdx_sync.rs in block_util implements traits to represent the vhdx
crate as a supported block device in the cloud hypervisor. The vhdx
is added to the block device list in device_manager.rs at the vmm
crate so that it can automatically detect a vhdx disk and invoke the
corresponding crate.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: Fazla Mehrab <akm.fazla.mehrab@intel.com>
2021-08-19 11:43:19 +02:00
Bo Chen
9aba1fdee6 virtio-devices, vmm: Use syscall definitions from the libc crate
Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-08-18 10:42:19 +02:00
Bo Chen
864a5e4fe0 virtio-devices, vmm: Simplify 'get_seccomp_rules'
Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-08-18 10:42:19 +02:00
Bo Chen
7d38a1848b virtio-devices, vmm: Fix the '--seccomp false' option
We are relying on applying empty 'seccomp' filters to support the
'--seccomp false' option, which will be treated as an error with the
updated 'seccompiler' crate. This patch fixes this issue by explicitly
checking whether the 'seccomp' filter is empty before applying the
filter.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-08-18 10:42:19 +02:00
Bo Chen
08ac3405f5 virtio-devices, vmm: Move to the seccompiler crate
Fixes: #2929

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-08-18 10:42:19 +02:00
dependabot[bot]
86b2c17135 build: bump anyhow from 1.0.42 to 1.0.43
Bumps [anyhow](https://github.com/dtolnay/anyhow) from 1.0.42 to 1.0.43.
- [Release notes](https://github.com/dtolnay/anyhow/releases)
- [Commits](https://github.com/dtolnay/anyhow/compare/1.0.42...1.0.43)

---
updated-dependencies:
- dependency-name: anyhow
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-08-17 00:32:44 +00:00
dependabot[bot]
754ce37031 build: bump bitflags from 1.3.1 to 1.3.2
Bumps [bitflags](https://github.com/bitflags/bitflags) from 1.3.1 to 1.3.2.
- [Release notes](https://github.com/bitflags/bitflags/releases)
- [Changelog](https://github.com/bitflags/bitflags/blob/main/CHANGELOG.md)
- [Commits](https://github.com/bitflags/bitflags/compare/1.3.1...1.3.2)

---
updated-dependencies:
- dependency-name: bitflags
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-08-16 23:54:21 +00:00
Rob Bradford
9d35a10fd4 vmm: cpu: Shutdown VMM on vCPU thread panic
If the vCPU thread panics then catch it and trigger the shutdown of the
VMM.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-08-13 09:19:54 +02:00
Rob Bradford
53b2e19934 vmm: Add support for hotplugging user devices
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-08-12 13:19:04 +01:00
dependabot[bot]
f99462add4 build: bump bitflags from 1.2.1 to 1.3.1
Bumps [bitflags](https://github.com/bitflags/bitflags) from 1.2.1 to 1.3.1.
- [Release notes](https://github.com/bitflags/bitflags/releases)
- [Changelog](https://github.com/bitflags/bitflags/blob/main/CHANGELOG.md)
- [Commits](https://github.com/bitflags/bitflags/compare/1.2.1...1.3.1)

---
updated-dependencies:
- dependency-name: bitflags
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-08-12 09:09:39 +00:00
Henry Wang
bcae6c41e3 vmm, doc: Forbid same memory zone in multiple NUMA nodes
It is forbidden that the same memory zone belongs to more than one
NUMA node. This commit adds related validation to the `--numa`
parameter to prevent the user from specifying such configuration.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-08-12 10:49:02 +02:00
Henry Wang
5a0a4bc505 arch: Add optional distance-map node to FDT
The optional device tree node distance-map describes the relative
distance (memory latency) between all NUMA nodes.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-08-12 10:49:02 +02:00
Henry Wang
165364e08b vmm: Move NUMA node data structures to arch
This is to make sure the NUMA node data structures can be accessed
both from the `vmm` crate and `arch` crate.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-08-12 10:49:02 +02:00
Henry Wang
20aa811de7 vmm: Extend NUMA setup to more than ACPI
The AArch64 platform provides a NUMA binding for the device tree,
which means on AArch64 platform, the NUMA setup can be extended to
more than the ACPI feature.

Based on above, this commit extends the NUMA setup and data
structures to following scenarios:

- All AArch64 platform
- x86_64 platform with ACPI feature enabled

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
Signed-off-by: Michael Zhao <Michael.Zhao@arm.com>
2021-08-12 10:49:02 +02:00
Sebastien Boeuf
4918c1ca7f block_util, vmm: Propagate error on QcowDiskSync creation
Instead of panicking with an expect() function, the QcowDiskSync::new
function now propagates the error properly. This ensures the VMM will
not panic, which might be the source of weird errors if only one thread
exits while the VMM continues to run.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-08-11 16:44:28 -07:00
Sebastien Boeuf
4735cb8563 vmm, virtio-devices: Restore vhost-user devices in a dedicated way
We cannot let vhost-user devices connect to the backend when the Block,
Fs or Net object is being created during a restore/migration. The reason
is we can't have two VMs (source and destination) connected to the same
backend at the same time. That's why we must delay the connection with
the vhost-user backend until the restoration is performed.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-08-10 12:36:58 -07:00
Sebastien Boeuf
71c7dff32b vmm: Fix the error handling logic when migration fails
The code wasn't doing what it was expected to. The '?' was simply
returning the error to the top level function, meaning the Err() case in
the match was never hit. Moving the whole logic to a dedicated function
allows to identify when something got wrong without propagating to the
calling function, so that we can still stop the dirty logging and
unpause the VM.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-08-10 12:36:58 -07:00
Sebastien Boeuf
db444715fd vmm: Shutdown VM after migration succeeded
In case the migration succeeds, the destination VM will be correctly
running, with potential vhost-user backends attached to it. We can't let
the source VM trying to reconnect to the same backends, which is why
it's safer to shutdown the source VM.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-08-10 12:36:58 -07:00
Sebastien Boeuf
5a83ebce64 vmm: Notify Migratable objects about migration being complete
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-08-10 12:36:58 -07:00
Sebastien Boeuf
06729bb3ba vmm: Provide a restoring state to the DeviceManager
In anticipation for creating vhost-user devices in a different way when
being restored compared to a fresh start, this commit introduces a new
boolean created by the Vm depending on the use case, and passed down to
the DeviceManager. In the future, the DeviceManager will use this flag
to assess how vhost-user devices should be created.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-08-10 12:36:58 -07:00
Rob Bradford
5e74848ab4 vmm: seccomp: Permit syscalls used for vfio-user on vCPU thread
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-08-10 16:01:00 +01:00
Rob Bradford
3efccd0fef vmm: config: Ensure shared memory is enabled if using user-devices
Correct operation of user devices (vfio-user) requires shared memory so
flag this to prevent it from failing in strange ways.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-08-10 16:01:00 +01:00
Rob Bradford
b28063a7b4 vmm: Create user devices from config
Create the vfio-user / user devices from the config. Currently hotplug
of the devices is not supported nor can they be placed behind the
(virt-)iommu.

Removal of the coldplugged device is however supported.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-08-10 16:01:00 +01:00
Rob Bradford
7fbec7113e main, config: Add support for --user-device
This allows the user to specify devices that are running in a different
userspace process and communicated with vfio-user.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-08-10 16:01:00 +01:00
Rob Bradford
77e147f333 build: Bump dependencies
This has the side effect of also removing the vm-memory 0.5.0
dependency.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-08-10 15:24:28 +01:00
Markus Theil
5b0d4bb398 virtio-devices: seccomp: allow unix socket connect in vsock thread
Allow vsocks to connect to Unix sockets on the host running
cloud-hypervisor with enabled seccomp.

Reported-by: Philippe Schaaf <philippe.schaaf@secunet.com>
Tested-by: Franz Girlich <franz.girlich@tu-ilmenau.de>
Signed-off-by: Markus Theil <markus.theil@tu-ilmenau.de>
2021-08-06 08:44:47 -07:00
Rob Bradford
f7f2f25a57 build: Use fixed versions in Cargo.toml files
This doesn't really affect the build as we ship a Cargo.lock with fixed
versions in. However for clarity it makes sense to use fixed versions
throughout and let dependabot update them.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-08-06 12:11:39 +02:00
Rob Bradford
b4f887ea80 build: Move from patched vm-memory version to released version
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2021-08-06 10:08:58 +01:00
Henry Wang
27a285257e vmm: cpu: Add PPTT table for AArch64
The optional Processor Properties Topology Table (PPTT) table is
used to describe the topological structure of processors controlled
by the OSPM, and their shared resources, such as caches. The table
can also describe additional information such as which nodes in the
processor topology constitute a physical package.

The ACPI PPTT table supports topology descriptions for ACPI guests.
Therefore, this commit adds the PPTT table for AArch64 to enable
CPU topology feature for ACPI.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-08-05 21:19:16 +08:00
Henry Wang
7fb980f17b arch, vmm: Pass cpu topology configuation to FDT
In an Arm system, the hierarchy of CPUs is defined through three
entities that are used to describe the layout of physical CPUs in
the system:

- cluster
- core
- thread

All these three entities have their own FDT node field. Therefore,
This commit adds an AArch64-specific helper to pass the config from
the Cloud Hypervisor command line to the `configure_system`, where
eventually the `create_fdt` is called.

Signed-off-by: Henry Wang <Henry.Wang@arm.com>
2021-08-05 21:19:16 +08:00
Sebastien Boeuf
5c6139bbff vmm: Finalize migration support for all devices
Make sure the DeviceManager is triggered for all migration operations.
The dirty pages are merged from MemoryManager and DeviceManager before
to be sent up to the Vmm in lib.rs.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-08-05 06:07:00 -07:00
Sebastien Boeuf
0411064271 vmm: Refactor migration through Migratable trait
Now that Migratable provides the methods for starting, stopping and
retrieving the dirty pages, we move the existing code to these new
functions.

No functional change intended.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-08-05 06:07:00 -07:00
Sebastien Boeuf
e9637d3733 vmm: device_manager: Fully implement Migratable trait
This patch connects the dots between the vm.rs code and each Migratable
device, in order to make sure Migratable methods are correctly invoked
when migration happens.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-08-05 06:07:00 -07:00
Sebastien Boeuf
79425b6aa8 vm-migration, vmm: Extend methods for MemoryRangeTable
In anticipation for supporting the merge of multiple dirty pages coming
from multiple devices, this patch factorizes the creation of a
MemoryRangeTable from a bitmap, as well as providing a simple method for
merging the dirty pages regions under a single MemoryRangeTable.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-08-05 06:07:00 -07:00
Muminul Islam
83c44a2411 vmm, virtio-devices: Add missing seccomp rules for MSHV
This patch adds all the seccomp rules missing for MSHV.
With this patch MSFT internal CI runs with seccomp enabled.

Signed-off-by: Muminul Islam <muislam@microsoft.com>
2021-08-03 11:09:07 -07:00
Bo Chen
902fe20d41 vmm: Add fallback handling for sending live migration
This patch adds a fallback path for sending live migration, where it
ensures the following behavior of source VM post live-migration:

1. The source VM will be paused only when the migration is completed
successfully, or otherwise it will keep running;

2. The source VM will always stop dirty pages logging.

Fixes: #2895

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-08-03 09:26:12 +01:00
Muminul Islam
3baa0c3721 vmm: Add MSHV_VP_TRANSLATE_GVA to seccomp rule
This rule is needed to boot windows guest.
This bug was introduced while we tried to boot
windows guest on MSHV.

Signed-off-by: Muminul Islam <muislam@microsoft.com>
2021-07-29 16:29:53 +01:00
Muminul Islam
81895b9b40 hypervisor: Implement start/stop_dirty_log for MSHV
This patch modify the existing live migration code
to support MSHV. Adds couple of new functions to enable
and disable dirty page tracking. Add missing IOCTL
to the seccomp rules for live migration.
Adds necessary flags for MSHV.
This changes don't affect KVM functionality at all.

In order to get better performance it is good to
enable dirty page tracking when we start live migration
and disable it when the migration is done.

Signed-off-by: Muminul Islam <muislam@microsoft.com>
2021-07-29 16:29:53 +01:00
Muminul Islam
fdecba6958 hypervisor: MSHV needs gpa to retrieve dirty logs
Right now, get_dirty_log API has two parameters,
slot and memory_size.
MSHV needs gpa to retrieve the page states. GPA is
needed as MSHV returns the state base on PFN.

Signed-off-by: Muminul Islam <muislam@microsoft.com>
2021-07-29 16:29:53 +01:00
Sebastien Boeuf
12db6e5068 vmm: Allow restoring virtio-fs with no cache region
It's totally acceptable to snapshot and restore a virtio-fs device that
has no cache region, since this is a valid mode of functioning for
virtio-fs itself.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-07-29 06:35:03 -07:00
Sebastien Boeuf
dcc646f5b1 clippy: Fix redundant allocations
With the new beta version, clippy complains about redundant allocation
when using Arc<Box<dyn T>>, and suggests replacing it simply with
Arc<dyn T>.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2021-07-29 13:28:57 +02:00