Commit Graph

2238 Commits

Author SHA1 Message Date
Bo Chen
3b39c41a01 build: Bulk update rust-vmm dependencies
Bump to the latest rust-vmm crates, including vm-memory, vfio,
vfio-bindings, vfio-user, virtio-bindings, virtio-queue, linux-loader,
vhost, and vhost-user-backend,

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-06-08 13:15:25 +01:00
Omer Faruk Bayram
7a458d85d1 main: cli: add D-Bus API related CLI options
Introduces three new CLI options, `dbus-service-name`,
`dbus-object-path` and `dbus-system-bus` to configure the DBus API.

Signed-off-by: Omer Faruk Bayram <omer.faruk@sartura.hr>
2023-06-06 10:18:26 -07:00
Omer Faruk Bayram
a92d852848 vmm: dbus: apply seccomp filter
This commit applies the previously created seccomp filter
to the `DbusApi` thread.

Also encloses the main loop of the `DBusApi` thread using
`std::panic::catch_unwind` and `AssertUnwindSafe` in order to mirror
the behavior of the HTTP API.

Signed-off-by: Omer Faruk Bayram <omer.faruk@sartura.hr>
2023-06-06 10:18:26 -07:00
Omer Faruk Bayram
0664647109 vmm: seccomp: add new seccomp filter for the DBusApi thread
Signed-off-by: Omer Faruk Bayram <omer.faruk@sartura.hr>
2023-06-06 10:18:26 -07:00
Omer Faruk Bayram
f2c813e1cf vmm: seccomp: rename Thread::Api to Thread::HttpApi
Signed-off-by: Omer Faruk Bayram <omer.faruk@sartura.hr>
2023-06-06 10:18:26 -07:00
Omer Faruk Bayram
f00df25d40 vmm: dbus: graceful shutdown of the DBusApi thread
This commit adds support for graceful shutdown of the DBusApi thread
using `futures::channel::oneshot` channels. By using oneshot channels,
we ensure that the thread has enough time to send a response to the
`VmmShutdown` method call before it is terminated. Without this step,
the thread may be terminated before it can send a response, resulting
in an error message on the client side stating that the message
recipient disconnected from the message bus without providing a reply.

Also changes the default values for DBus service name, object path
and interface name.

Signed-off-by: Omer Faruk Bayram <omer.faruk@sartura.hr>
2023-06-06 10:18:26 -07:00
Omer Faruk Bayram
c016a0d4d3 vmm: dbus: implement the D-Bus API
This commit introduces three new dependencies: `zbus`, `futures`
and `blocking`. `blocking` is used to call the Internal API in zbus'
async context which is driven by `futures::executor`. They are all
behind the `dbus_api` feature flag.

The D-Bus API implementation is behind the same `dbus_api` feature
flag as well.

Signed-off-by: Omer Faruk Bayram <omer.faruk@sartura.hr>
2023-06-06 10:18:26 -07:00
Omer Faruk Bayram
5c96fbb19b vmm: move the http api into its own submodule
This commits moves the http API code into its own
submodule.

Signed-off-by: Omer Faruk Bayram <omer.faruk@sartura.hr>
2023-06-06 10:18:26 -07:00
dependabot[bot]
9014a5e59c build: Bump serde from 1.0.156 to 1.0.163
Bumps [serde](https://github.com/serde-rs/serde) from 1.0.156 to 1.0.163.
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.156...v1.0.163)

---
updated-dependencies:
- dependency-name: serde
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-06-02 00:38:59 +00:00
Anatol Belski
7bf2a2c382 vmm: arch: Make phys_bits functionality use CPU vendor API
Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
2023-05-31 23:54:33 +02:00
Rob Bradford
89e658d9ff misc: Update for beta clippy failures on x86-64
Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2023-05-30 07:18:17 -07:00
dependabot[bot]
681a30bd15 build: Bump thiserror from 1.0.39 to 1.0.40
Bumps [thiserror](https://github.com/dtolnay/thiserror) from 1.0.39 to 1.0.40.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](https://github.com/dtolnay/thiserror/compare/1.0.39...1.0.40)

---
updated-dependencies:
- dependency-name: thiserror
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-05-24 00:39:08 +00:00
Bo Chen
0b1e626fe3 vmm: Allocate guest memory address space before TDX initialization
The refactoring on deferring address space allocation (#5169) broke TDX,
as TDX initialization needs to access guest memory for encryption and
measurement of guest pages.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-05-23 09:00:00 -07:00
dependabot[bot]
79bc42f3c2 build: Bump anyhow from 1.0.70 to 1.0.71
Bumps [anyhow](https://github.com/dtolnay/anyhow) from 1.0.70 to 1.0.71.
- [Release notes](https://github.com/dtolnay/anyhow/releases)
- [Commits](https://github.com/dtolnay/anyhow/compare/1.0.70...1.0.71)

---
updated-dependencies:
- dependency-name: anyhow
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-05-23 00:42:11 +00:00
dependabot[bot]
e6208d1c0e build: Bump uuid from 1.3.0 to 1.3.3
Bumps [uuid](https://github.com/uuid-rs/uuid) from 1.3.0 to 1.3.3.
- [Release notes](https://github.com/uuid-rs/uuid/releases)
- [Commits](https://github.com/uuid-rs/uuid/compare/1.3.0...1.3.3)

---
updated-dependencies:
- dependency-name: uuid
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-05-19 00:44:43 +00:00
dependabot[bot]
b7338c96eb build: Bump serde from 1.0.152 to 1.0.156
Bumps [serde](https://github.com/serde-rs/serde) from 1.0.152 to 1.0.156.
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.152...v1.0.156)

---
updated-dependencies:
- dependency-name: serde
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-05-18 13:54:52 +01:00
Hao Xu
c56a3ce59a vmm: reduce memory copy when BFT device tree
The current implementation of breadth first traversal for device tree
uses a temporary vector, therefore causes unnecessary memory copy.
Remove it and do it within vector nodes.

Signed-off-by: Hao Xu <howeyxu@tencent.com>
2023-05-15 17:19:48 +01:00
Anatol Belski
083ce323c0 seccomp: Add filter entry for MSHV_VP_REGISTER_INTERCEPT_RESULT
Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
2023-05-08 08:50:09 -07:00
Anatol Belski
8fff4c1af3 mshv: Pass topology explicitly while constructing cpuid
Unlike KVM, there's no internal handling for topoolgy under MSHV. Thus,
if no topology has been passed during the CH launch, use the boot CPUs
count to construct the topology struct.

Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
2023-05-08 08:50:09 -07:00
Wei Liu
aa14fe214a pci: bump the number of supported PCI segments
Signed-off-by: Wei Liu <liuwe@microsoft.com>
2023-05-02 09:34:05 +01:00
Wei Liu
45e3f49bba vmm: use MAX_NUM_PCI_SEGMENTS in test cases
No functional change.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2023-05-02 09:34:05 +01:00
Wei Liu
ba1e89139a pci: aml: support up to 256 PCI segments
Originally the AML only accepted one hex number for PCI segment
numbering. Change it to accept two numbers. That makes it possible to
add up to 256 PCI segments.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
2023-05-02 09:34:05 +01:00
Alyssa Ross
21d40d7489 main: reset tty if starting the VM fails
When I refactored this to centralise resetting the tty into
DeviceManager::drop, I tested that the tty was reset if an error
happened on the vmm thread, but not on the main thread.  It turns out
that if an error happened on the main thread, the process would just
exit, so drop handlers on other threads wouldn't get run.

To fix this, I've changed start_vmm() to write to the VMM's exit
eventfd and then join the thread if an error happens after the vmm
thread is started.

Fixes: b6feae0a ("vmm: only touch the tty flags if it's being used")
Signed-off-by: Alyssa Ross <hi@alyssa.is>
2023-05-02 09:33:53 +01:00
Alyssa Ross
c90a0ffff6 vmm: reset to the original termios
Previously, we used two different functions for configuring ttys.
vmm_sys_util::terminal::Terminal::set_raw_mode() was used to configure
stdio ttys, and cfmakeraw() was used to configure ptys created by
cloud-hypervisor.  When I centralized the stdio tty cleanup, I also
switched to using cfmakeraw() everywhere, to avoid duplication.

cfmakeraw sets the OPOST flag, but when we later reset the ttys, we
used vmm_sys_util::terminal::Terminal::set_canon_mode(), which does
not unset this flag.  This meant that the terminal was getting mostly,
but not fully, reset.

To fix this without depending on the implementation of cfmakeraw(),
let's just store the original termios for stdio terminals, and restore
them to exactly the state we found them in when cloud-hypervisor exits.

Fixes: b6feae0a ("vmm: only touch the tty flags if it's being used")
Signed-off-by: Alyssa Ross <hi@alyssa.is>
2023-05-02 09:33:53 +01:00
dependabot[bot]
97fdb65012 build: Bump anyhow from 1.0.69 to 1.0.70
Bumps [anyhow](https://github.com/dtolnay/anyhow) from 1.0.69 to 1.0.70.
- [Release notes](https://github.com/dtolnay/anyhow/releases)
- [Commits](https://github.com/dtolnay/anyhow/compare/1.0.69...1.0.70)

---
updated-dependencies:
- dependency-name: anyhow
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-04-27 00:39:57 +00:00
Rob Bradford
d17d70fae1 vmm: Update for new acpi_tables version
In particular the Std::write() API requires that the value implements
AsBytes and copies the slice representation into the table data. This
avoids unaligned writes which can cause a panic with the updated
toolchain.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2023-04-26 23:25:57 +01:00
Rob Bradford
71d1296d09 vmm: Implemented zerocopy::AsBytes for SDT structures
For structures that are used in SDT ACPI tables it is necessary for them
to implement this trait for the newly safe Std::write() API.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2023-04-26 23:25:57 +01:00
dependabot[bot]
97012c511d build: Bump serde_json from 1.0.95 to 1.0.96
Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.95 to 1.0.96.
- [Release notes](https://github.com/serde-rs/json/releases)
- [Commits](https://github.com/serde-rs/json/compare/v1.0.95...v1.0.96)

---
updated-dependencies:
- dependency-name: serde_json
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-04-25 08:45:18 +00:00
Alyssa Ross
3c0b389c82 vmm: allow getdents64 in seccomp filter
This is used on older kernels where close_range() is not available.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
Fixes: 505f4dfa ("vmm: close all unused fds in sigwinch listener")
2023-04-22 11:40:17 +01:00
Rob Bradford
ceb8151747 hypervisor, vmm: Limit max number of vCPUs to hypervisor maximum
On KVM this is provided by an ioctl, on MSHV this is constant. Although
there is a HV_MAXIMUM_PROCESSORS constant the MSHV ioctl API is limited
to u8.

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>
2023-04-22 10:35:39 +01:00
Rafael Mendonca
6379074264 misc: Remove unnecessary clippy directives
Clippy passes fine without these.

Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com>
2023-04-18 10:48:31 -07:00
Bo Chen
a9623c7a28 vmm: Add valid FDs for TAP devices to 'VmConfig::preserved_fds'
In this way, valid FDs for TAP devices will be closed when the holding
VmConfig instance is destroyed.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-04-17 16:33:29 +01:00
Bo Chen
4baf85857a vmm: Add unit test for 'VmConfig::preserved_fds'
Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-04-17 16:33:29 +01:00
Bo Chen
e3d2917d5f vmm: Implement Clone and Drop for VmConfig
The custom 'clone' duplicates 'preserved_fds' so that the validation
logic can be safely carried out on the clone of the VmConfig.

The custom 'drop' ensures 'preserved_fds' are safely closed when the
holding VmConfig instance is destroyed.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-04-17 16:33:29 +01:00
Bo Chen
a84b540b65 vmm: config: Extend 'VmConfig' with 'preserved_fds'
Preserved FDs are the ones that share the same life-time as its holding
VmConfig instance, such as FDs for creating TAP devices.

Preserved FDs will stay open as long as the holding VmConfig instance is
valid, and will be closed when the holding VmConfig instance is destroyed.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-04-17 16:33:29 +01:00
Bo Chen
8eb162e3d7 Revert "vmm: config: Implement Clone for NetConfig"
This reverts commit ea4a95c4f6.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-04-17 16:33:29 +01:00
Bo Chen
2804608a1c Revert "vmm: config: Close FDs for TAP devices that are provided to VM"
This reverts commit b14427540b.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-04-17 16:33:29 +01:00
Bo Chen
e0125653b1 Revert "vmm: config: Don't close reserved FDs from NetConfig::drop()"
This reverts commit 0110fb4edc.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-04-17 16:33:29 +01:00
Bo Chen
e431a48201 Revert "vmm: config: Avoid closing invalid FDs from 'test_net_parsing()'"
This reverts commit 0567def931.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-04-17 16:33:29 +01:00
Bo Chen
c143cb3af0 Revert "vmm: config: Replace use of memfd_create with fd pointing to /dev/null"
This reverts commit 46066d6ae1.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-04-17 16:33:29 +01:00
Alyssa Ross
b6feae0ace vmm: only touch the tty flags if it's being used
When neither serial nor console are connected to the tty,
cloud-hypervisor shouldn't touch the tty at all.  One way in which
this is annoying is that if I am running cloud-hypervisor without it
using my terminal, I expect to be able to suspend it with ^Z like any
other process, but that doesn't work if it's put the terminal into raw
mode.

Instead of putting the tty into raw mode when a VM is created or
restored, do it when a serial or console device is created.  Since we
now know it can't be put into raw mode until the Vm object is created,
we can move setting it back to canon mode into the drop handler for
that object, which should always be run in normal operation.  We still
also put the tty into canon mode in the SIGTERM / SIGINT handler, but
check whether the tty was actually used, rather than whether stdin is
a tty.  This requires passing on_tty around as an atomic boolean.

I explored more of an abstraction over the tty — having an object that
encapsulated stdout and put the tty into raw mode when initialized and
into canon mode when dropped — but it wasn't practical, mostly due to
the special requirements of the signal handler.  I also investigated
whether the SIGWINCH listener process could be used here, which I
think would have worked but I'm hesitant to involve it in serial
handling as well as conosle handling.

There's no longer a check for whether the file descriptor is a tty
before setting it into canon mode — it's redundant, because if it's
not a tty it just won't respond to the ioctl.

Tested by shutting down through the API, SIGTERM, and an error
injected after setting raw mode.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2023-04-17 16:33:17 +01:00
Alyssa Ross
520aff2efc vmm: don't redundantly set the TTY to canon mode
If the VM is shut down, either it's going to be started again, in
which case we still want to be in raw mode, or the process is about to
exit, in which case canon mode will be set at the end of main.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2023-04-17 16:33:17 +01:00
Omer Faruk Bayram
346ee09e6b vmm: api: include BUILD_VERSION and CH pid in VmmPingResponse
Signed-off-by: Omer Faruk Bayram <omer.faruk@sartura.hr>
2023-04-14 12:13:46 -07:00
Alyssa Ross
9b724303ac vmm: only use KVM_ARM_VCPU_PMU_V3 if available
Having PMU in guests isn't critical, and not all hardware supports
it (e.g. Apple Silicon).

CpuManager::init_pmu already has a fallback for if PMU is not
supported by the VCPU, but we weren't getting that far, because we
would always try to initialise the VCPU with KVM_ARM_VCPU_PMU_V3, and
then bail when it returned with EINVAL.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2023-04-13 09:02:55 +08:00
Bo Chen
df2a7c1764 vmm: Ignore and warn TAP FDs sent via the HTTP request body
Valid FDs can only be sent from another process via `SCM_RIGHTS`.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-04-11 09:34:14 -07:00
Alyssa Ross
38a1b45783 vmm: use the SIGWINCH listener for TTYs too
Previously, we were only using it for PTYs, because for PTYs there's
no alternative.  But since we have to have it for PTYs anyway, if we
also use it for TTYs, we can eliminate all of the code that handled
SIGWINCH for TTYs.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2023-04-05 11:23:06 +01:00
Alyssa Ross
e9841db486 vmm: don't ignore errors from SIGWINCH listener
Now that the SIGWINCH listener has fallbacks for older kernels, we
don't expect it to routinely fail, so if there's an error setting it
up, we want to know about it.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2023-04-05 11:23:06 +01:00
Alyssa Ross
c1f555cde3 vmm: fall back if CLONE_CLEAR_SIGHAND unsupported
This will allow the SIGWINCH listener to run on kernels older than
5.5, although on those kernels it will have to make 64 syscalls to
reset all the signal handlers.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2023-04-05 11:23:06 +01:00
Alyssa Ross
505f4dfa53 vmm: close all unused fds in sigwinch listener
The PTY main file descriptor had to be introduced as a parameter to
start_sigwinch_listener, so that it could be closed in the child.
Really the SIGWINCH listener process should not have any file
descriptors open, except for the ones it needs to function, so let's
make it more robust by having it close all other file descriptors.

For recent kernels, we can do this very conveniently with
close_range(2), but for older kernels, we have to fall back to closing
open file descriptors one at a time.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
2023-04-05 11:23:06 +01:00
Ravi kumar Veeramally
a8d1849485 vmm: Remove directory support from MemoryZoneConfig::file
Fixes: #5082

Signed-off-by: Ravi kumar Veeramally <ravikumar.veeramally@intel.com>
2023-04-04 06:49:18 -07:00