Commit Graph

368 Commits

Author SHA1 Message Date
Rob Bradford
bb2e7bb942 vmm: Shutdown vCPU threads
As part of the cleanup of the VM, shut down all the vCPU threads. This is
achieved by toggling a shared atomic boolean variable which is checked
in the vCPU loop. To trigger the vCPU code to look at this boolean it is
necessary to send a signal to the vCPU which will interrupt the running
KVM_RUN ioctl.

Fixes: #229

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-03 19:18:49 +02:00
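
The shutdown mechanism described in bb2e7bb942 can be illustrated with a minimal sketch. It is not the cloud-hypervisor implementation: `VcpuThread` is a hypothetical type, `thread::park()` stands in for the blocking KVM_RUN ioctl, and `unpark()` stands in for the signal that interrupts it.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for a vCPU thread (not the cloud-hypervisor type).
pub struct VcpuThread {
    kill: Arc<AtomicBool>,
    handle: JoinHandle<()>,
}

impl VcpuThread {
    pub fn spawn() -> Self {
        let kill = Arc::new(AtomicBool::new(false));
        let kill_in_thread = Arc::clone(&kill);
        let handle = thread::spawn(move || {
            // The shared flag is checked on every pass through the loop.
            while !kill_in_thread.load(Ordering::SeqCst) {
                // Stand-in for the blocking KVM_RUN ioctl.
                thread::park();
            }
        });
        VcpuThread { kill, handle }
    }

    /// Set the flag first, then wake the thread so that it re-checks the flag.
    /// Returns true if the thread exited cleanly.
    pub fn shutdown(self) -> bool {
        self.kill.store(true, Ordering::SeqCst);
        // Stand-in for the signal that interrupts KVM_RUN.
        self.handle.thread().unpark();
        self.handle.join().is_ok()
    }
}
```

Setting the flag before waking the thread matters: a wake-up that arrives before the flag is set would let the loop go back to blocking.
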
Rob Bradford
40f9da524f tests: Add a basic direct boot test with acpi=off
Check that everything continues to function without ACPI.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-03 19:18:49 +02:00
Rob Bradford
8308e1bf25 vmm, tests: Disable reboot support
Being able to reboot requires us to identify all the resources we are
leaking and clean those up before we can enable reboot. For now, if
the user requests a reboot then shut down instead.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-03 19:18:49 +02:00
Rob Bradford
ad128bf72d vmm: Give vCPU and signal handler thread useful names
Sadly only the first few characters of the thread name are preserved, so
use a shorter name for the vCPU thread for now. Also give the signal
handling thread a name.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-03 19:18:49 +02:00
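
For context on ad128bf72d: Linux keeps at most 15 bytes of a thread name, which is why a short name is preferable. The sketch below shows how a thread can be named with the standard library; the `vcpu` prefix and both helper functions are illustrative assumptions, not the names used in the commit.

```rust
use std::thread;

/// Linux keeps at most 15 bytes of a thread name (plus a NUL terminator).
pub const LINUX_THREAD_NAME_MAX: usize = 15;

/// Build a short vCPU thread name; the "vcpu" prefix is illustrative.
pub fn vcpu_thread_name(cpu_id: u8) -> String {
    format!("vcpu{}", cpu_id)
}

/// Spawn a named thread and return the name the thread reports for itself.
pub fn spawn_named(name: String) -> Option<String> {
    thread::Builder::new()
        .name(name)
        .spawn(|| thread::current().name().map(str::to_owned))
        .expect("failed to spawn thread")
        .join()
        .expect("thread panicked")
}
```
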
Rob Bradford
7205700c5f tests: Add integration testing for VM reboot
Covers direct kernel boot as well as Ubuntu and Clear firmware boots.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-03 19:18:49 +02:00
Rob Bradford
3af5619256 tests: Use shutdown rather than reboot to shutdown the VMs
Now that we have ACPI shutdown support "reboot" will actually reboot the
VM rather than trigger the VMM to exit.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-03 19:18:49 +02:00
Rob Bradford
614eb68f16 vm: Make triple-fault and i8042 reset reboot the VM
Now we have ACPI shutdown we should reboot on these reset triggers.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-03 19:18:49 +02:00
Rob Bradford
5a187ee2c2 x86_64/devices: acpi: Add support for ACPI shutdown & reboot
Add an I/O port "device" to handle requests from the kernel to shutdown
or trigger a reboot, borrowing an I/O port used for ACPI on the Q35 platform.
The details of this I/O port are included in the FADT
(SLEEP_STATUS_REG/SLEEP_CONTROL_REG/RESET_REG) with the details of the
value to write in the FADT for reset (RESET_VALUE) and in the DSDT for
shutdown (S5 -> 0x05)

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-03 19:18:49 +02:00
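
A rough sketch of how a device behind such an I/O port might decode guest writes, based on the description in 5a187ee2c2. The sleep control register layout (SLP_TYP in bits 2 to 4, SLP_EN in bit 5) follows the ACPI specification; the reset value of 1 and all type and function names are assumptions made for this illustration.

```rust
/// Action requested by a guest write to the (hypothetical) ACPI control port.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum AcpiRequest {
    Shutdown,
    Reset,
    Ignored,
}

/// Assumed for this sketch: the value advertised as RESET_VALUE in the FADT.
pub const RESET_VALUE: u8 = 1;
/// S5 sleep type, as declared in the DSDT (S5 -> 0x05).
pub const S5_SLEEP_TYPE: u8 = 5;
/// Sleep control register layout: SLP_TYP in bits 2..=4, SLP_EN in bit 5.
const SLP_TYP_SHIFT: u8 = 2;
const SLP_EN_BIT: u8 = 5;

/// Decode a single byte written by the guest.
pub fn decode_write(value: u8) -> AcpiRequest {
    let slp_en = (value >> SLP_EN_BIT) & 1 == 1;
    let slp_typ = (value >> SLP_TYP_SHIFT) & 0b111;
    if value == RESET_VALUE {
        AcpiRequest::Reset
    } else if slp_en && slp_typ == S5_SLEEP_TYPE {
        AcpiRequest::Shutdown
    } else {
        AcpiRequest::Ignored
    }
}
```

With this layout, a guest entering S5 writes `(5 << 2) | (1 << 5)`, that is 0x34.
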
Rob Bradford
ae66a44d26 vmm: Support both reset and shutdown
Add a 2nd EventFd to the VM to control resetting (rebooting) the VM. This
supplements the EventFd used for managing shutdown of the VM.

The default behaviour on i8042 or triple-fault based reset is currently
unchanged i.e. it will trigger a shutdown.

In order to support restarting the VM it was necessary to make start()
function take a reference to the config.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-03 19:18:49 +02:00
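
The control flow that ae66a44d26 describes can be sketched as a loop that restarts the VM on reset and returns on shutdown. This is a simplification under stated assumptions: a single `std::sync::mpsc` channel carrying an enum replaces the two EventFds, and `ExitBehaviour`, `ControlLoop` and `control_channel` are hypothetical names.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// What the VM asked for when it stopped running.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum ExitBehaviour {
    Shutdown,
    Reset,
}

/// Hypothetical control loop; one channel stands in for the two EventFds.
pub struct ControlLoop {
    events: Receiver<ExitBehaviour>,
}

/// Create the sending side (cloneable) and the loop that listens to it.
pub fn control_channel() -> (Sender<ExitBehaviour>, ControlLoop) {
    let (tx, rx) = channel();
    (tx, ControlLoop { events: rx })
}

impl ControlLoop {
    /// Start the VM, wait for an event, and start it again on reset.
    /// Returns how many times the VM was started.
    pub fn run(&self, mut start_vm: impl FnMut()) -> u32 {
        let mut starts = 0;
        loop {
            start_vm();
            starts += 1;
            match self.events.recv() {
                Ok(ExitBehaviour::Reset) => continue,
                // A closed channel is treated like a shutdown request.
                Ok(ExitBehaviour::Shutdown) | Err(_) => return starts,
            }
        }
    }
}
```

Because `start_vm` is called again after every reset, it has to be callable more than once, which mirrors the commit's note that start() needed to take a reference to the config.
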
Rob Bradford
ebe8edd423 devices: i8042: Use error! macro
Now that we have the logging infrastructure in place there is no need to
use println!

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-03 19:18:49 +02:00
Sebastien Boeuf
011496bda0 arch: acpi: Fix legacy interrupt for serial device
The DSDT must declare the interrupt used by the serial device. This
helps the guest kernel match the right interrupt to the 8250 serial
device. This is mandatory in case the IRQ routing is handled by ACPI, as
we must let ACPI know what to do with pin based interrupts.

One thing to notice, if we were using acpi=noirq from the kernel command
line, this would mean ACPI is not in charge of the IRQ routing, and the
device COM1 declaration would not be needed.

One additional requirement is to provide the appropriate interrupt
source override for the legacy ISA interrupts (0-15), which will give
the right information to the guest kernel about how to allocate the
associated IRQs.

Because we want to keep the MADT as simple as possible, and given that
our only device requiring pin based interrupt is the serial device, we
choose to only define the pin 4.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-09-03 19:18:49 +02:00
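
To make the interrupt source override in 011496bda0 more concrete, the sketch below serializes one MADT entry of that kind. The field layout (type 2, length 10) follows the ACPI specification; the identity mapping of IRQ 4 to global system interrupt 4 and the zero flags are assumptions for illustration, as are the type and function names.

```rust
/// MADT "Interrupt Source Override" entry (ACPI entry type 2, 10 bytes).
pub struct InterruptSourceOverride {
    pub bus: u8,    // 0 means ISA
    pub source: u8, // legacy IRQ number
    pub gsi: u32,   // global system interrupt the IRQ maps to
    pub flags: u16, // polarity and trigger mode
}

impl InterruptSourceOverride {
    pub const TYPE: u8 = 2;
    pub const LENGTH: u8 = 10;

    /// Serialize the entry; multi-byte ACPI fields are little-endian.
    pub fn to_bytes(&self) -> [u8; 10] {
        let mut out = [0u8; 10];
        out[0] = Self::TYPE;
        out[1] = Self::LENGTH;
        out[2] = self.bus;
        out[3] = self.source;
        out[4..8].copy_from_slice(&self.gsi.to_le_bytes());
        out[8..10].copy_from_slice(&self.flags.to_le_bytes());
        out
    }
}

/// Override for the serial device on pin 4 (values assumed for this sketch).
pub fn serial_irq_override() -> InterruptSourceOverride {
    InterruptSourceOverride { bus: 0, source: 4, gsi: 4, flags: 0 }
}
```
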
Rob Bradford
2610f4353d arch: acpi: Only add ACPI COM1 device if serial is turned on
Only add the ACPI PNP device for the COM1 serial port if it is not
turned off with "--serial off"

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-03 19:18:49 +02:00
Rob Bradford
15387cd96a arch: x86_64: acpi: Add DSDT table entries for PCI and COM1
Currently this has a hardcoded range from 32GiB to 64GiB for the 64-bit PCI
range. It should range from the top of ram to 64GiB.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-03 19:18:49 +02:00
Rob Bradford
638bf0378c arch: x86_64: acpi: Generate MCFG table
The MCFG table contains some PCI configuration details, in particular
where the enhanced configuration space is located.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-03 19:18:49 +02:00
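
The enhanced configuration space that the MCFG table points to is addressed by bus, device, function and register. A small sketch of that address computation, using the standard PCIe ECAM layout, follows; the function name and any base address passed to it are hypothetical.

```rust
/// Compute the address of a PCI configuration register inside the enhanced
/// configuration space (ECAM) starting at `base`.
/// Returns None if an argument is out of range or the address overflows.
pub fn ecam_address(base: u64, bus: u8, device: u8, function: u8, register: u16) -> Option<u64> {
    if device >= 32 || function >= 8 || register >= 4096 {
        return None;
    }
    // 4 KiB per function, 8 functions per device, 32 devices per bus.
    let offset = ((bus as u64) << 20)
        | ((device as u64) << 15)
        | ((function as u64) << 12)
        | (register as u64);
    base.checked_add(offset)
}
```
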
Rob Bradford
451502b50b vm: If a VCPU thread errors out then exit the hypervisor
Currently when the VCPU thread exits on an error the VMM continues to
run with no way of shutting down the main thread.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-03 19:18:49 +02:00
Rob Bradford
98f81c36ec arch: x86_64: acpi: Generate MADT aka APIC table
This provides important APIC configuration details for the CPU. Even
though it duplicates some of the information already included in the
mptable it is necessary when booting with ACPI as the mptable is not
used.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-03 19:18:49 +02:00
Rob Bradford
ee83c2d44e arch: x86_64: Generate basic ACPI tables
Generate very basic ACPI tables for HW reduced ACPI.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-03 19:18:49 +02:00
Rob Bradford
eea6f1dc9e acpi_tables: Add initial ACPI tables support
Add a revision 2 RSDP table only supporting an XSDT along with support
for creating generic SDT based tables.

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-09-03 19:18:49 +02:00
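
As an illustration of what creating a generic SDT based table involves, the sketch below fills in the standard 36-byte ACPI table header and computes the checksum so that all bytes of the table sum to zero. It is not the acpi_tables crate API: `build_sdt`, `checksum` and the OEM strings are made up for this example.

```rust
/// Value that makes the wrapping byte sum of `bytes` plus itself equal zero.
pub fn checksum(bytes: &[u8]) -> u8 {
    let sum = bytes.iter().fold(0u8, |acc, b| acc.wrapping_add(*b));
    0u8.wrapping_sub(sum)
}

/// Build a table made of the standard 36-byte SDT header plus a payload.
pub fn build_sdt(signature: &[u8; 4], revision: u8, payload: &[u8]) -> Vec<u8> {
    const HEADER_LEN: usize = 36;
    let mut table = vec![0u8; HEADER_LEN];
    table[0..4].copy_from_slice(signature);
    let length = (HEADER_LEN + payload.len()) as u32;
    table[4..8].copy_from_slice(&length.to_le_bytes());
    table[8] = revision;
    // table[9] is the checksum byte; it stays zero while the sum is computed.
    table[10..16].copy_from_slice(b"SKETCH"); // OEM ID, placeholder
    table[16..24].copy_from_slice(b"SKETCHTB"); // OEM table ID, placeholder
    // OEM revision, creator ID and creator revision are left as zero here.
    table.extend_from_slice(payload);
    table[9] = checksum(&table);
    table
}
```
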
Yang Zhong
3e99098bf3 vhost_rs: add config messge support
The previous definitions do not cover config space read/write
and only cover the general message format, as below:

A vhost-user message consists of 3 header fields and a payload.

+---------+-------+------+---------+
| request | flags | size | payload |
+---------+-------+------+---------+

but for config space, the payload includes:

Virtio device config space
^^^^^^^^^^^^^^^^^^^^^^^^^^

+--------+------+-------+---------+
| offset | size | flags | payload |
+--------+------+-------+---------+

:offset: a 32-bit offset of virtio device's configuration space

:size: a 32-bit configuration space access size in bytes

:flags: a 32-bit value:
  - 0: Vhost master messages used for writeable fields
  - 1: Vhost master messages used for live migration

:payload: Size bytes array holding the contents of the virtio
          device's configuration space

This patch adds specific functions for config message, which can
get/set config space from/to backend.

Signed-off-by: Yang Zhong <yang.zhong@intel.com>
2019-09-03 08:39:37 -07:00
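
The config space payload laid out in 3e99098bf3 (offset, size, flags, payload) can be sketched as a small encode/decode pair. This is not the vhost_rs API: `ConfigMsg` and its methods are hypothetical, and native byte order is an assumption of this sketch.

```rust
/// Hypothetical in-memory form of the virtio device config space payload.
#[derive(Debug, PartialEq, Eq, Clone)]
pub struct ConfigMsg {
    pub offset: u32,      // offset into the device configuration space
    pub size: u32,        // access size in bytes
    pub flags: u32,       // 0: writeable fields, 1: live migration
    pub payload: Vec<u8>, // `size` bytes of configuration space contents
}

impl ConfigMsg {
    pub fn new(offset: u32, flags: u32, payload: &[u8]) -> Self {
        ConfigMsg { offset, size: payload.len() as u32, flags, payload: payload.to_vec() }
    }

    /// Three 32-bit fields followed by the payload (native byte order assumed).
    pub fn encode(&self) -> Vec<u8> {
        let mut buf = Vec::with_capacity(12 + self.payload.len());
        buf.extend_from_slice(&self.offset.to_ne_bytes());
        buf.extend_from_slice(&self.size.to_ne_bytes());
        buf.extend_from_slice(&self.flags.to_ne_bytes());
        buf.extend_from_slice(&self.payload);
        buf
    }

    /// Returns None if the buffer is shorter than the header or the payload.
    pub fn decode(buf: &[u8]) -> Option<Self> {
        if buf.len() < 12 {
            return None;
        }
        let word = |i: usize| u32::from_ne_bytes([buf[i], buf[i + 1], buf[i + 2], buf[i + 3]]);
        let (offset, size, flags) = (word(0), word(4), word(8));
        let end = 12usize.checked_add(size as usize)?;
        let payload = buf.get(12..end)?.to_vec();
        Some(ConfigMsg { offset, size, flags, payload })
    }
}
```
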
Yang Zhong
e05de4514d vhost_rs: The vhost user version we support
The vhost user version should be the same as the backend's.
define VHOST_USER_VERSION    (0x1)

Signed-off-by: Yang Zhong <yang.zhong@intel.com>
2019-09-03 08:39:37 -07:00
Yang Zhong
6fb7c3bbc2 vhost_rs: remove config space offset setting
There is one definition in message.rs file as below:
pub const VHOST_USER_CONFIG_OFFSET: u32 = 0x100

This definition is only for virtio mmio config space,
so we will add this offset on the virtio-mmio side and
not on the vhost user protocol side.

Signed-off-by: Yang Zhong <yang.zhong@intel.com>
2019-09-03 08:39:37 -07:00
Yang Zhong
a44a903587 vhost_rs: Change get_config()/set_config()
Use acked_protocol_features to replace acked_virtio_features in
get_config()/set_config() for protocol features like CONFIG.

This patch also fixes the wrong GET_CONFIG setting for set_config().

Signed-off-by: Yang Zhong <yang.zhong@intel.com>
2019-09-03 08:39:37 -07:00
Yang Zhong
b4187a1b9d vhost_rs: Change the VhostUserConfigFlags
The latest vhost user spec only defines two members in
VhostSetConfigType, master and live migration. These
changes make rust-vmm compatible with the vhost user backend.

Signed-off-by: Yang Zhong <yang.zhong@intel.com>
2019-09-03 08:39:37 -07:00
Samuel Ortiz
8718043dfc cloud-hypervisor: Bump vmm-sys-util crate version
Bump from 829d605 to fd4dcd1.

PR #225 failed because we were still using the vmm-sys-util logging
macros and the crate's syslog module got removed.

This one relies on the previous commit switching to using the
log crate macros instead.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-09-02 15:07:42 +02:00
Samuel Ortiz
add0471120 vfio: Use the log crate macros
Instead of using the syslog vmm-sys-util ones.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2019-09-02 15:07:42 +02:00
Sebastien Boeuf
772191b409 vm-virtio: vhost-user: Rely on acked features to setup backend
At this point in the code, the acked features have been provided by the
guest and they can be set back to the backend. There's no need to
retrieve the backend features one more time for this purpose.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-31 17:33:17 +01:00
Sebastien Boeuf
97699a521f vm-virtio: vhost-user: Vring should be enabled after initialization
As mentioned in the vhost-user specification, each ring is initialized
in a stopped state. This means each ring should be enabled only after
it has been correctly initialized.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-31 17:33:17 +01:00
Sebastien Boeuf
a4ebcf486d vm-virtio: vhost-user-net: Map proper error when getting features
Simple patch replacing unwrap() with appropriate map_err().

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-31 17:33:17 +01:00
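
The change in a4ebcf486d follows a common Rust pattern: convert a lower-level error into the caller's error type with map_err() instead of panicking with unwrap(). A minimal sketch, in which `BackendError`, `Error::VhostUserGetFeatures`, `get_features` and `setup` are all hypothetical names:

```rust
/// Hypothetical low-level error reported by the vhost-user backend.
#[derive(Debug, PartialEq, Eq)]
pub struct BackendError(pub String);

/// Hypothetical device-level error that wraps the backend error.
#[derive(Debug, PartialEq, Eq)]
pub enum Error {
    VhostUserGetFeatures(BackendError),
}

/// Stand-in for the call into the backend.
pub fn get_features(backend_up: bool) -> Result<u64, BackendError> {
    if backend_up {
        Ok(0x1)
    } else {
        Err(BackendError("backend unavailable".to_string()))
    }
}

/// With unwrap() a backend failure would panic; map_err() turns it into a
/// typed error that the caller can handle.
pub fn setup(backend_up: bool) -> Result<u64, Error> {
    let features = get_features(backend_up).map_err(Error::VhostUserGetFeatures)?;
    Ok(features)
}
```
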
Sebastien Boeuf
cdfe576eb1 vm-virtio: vhost-user-net: Set the right set of features
The available features are masked with the backend features, therefore
the available features should be the ones used when calling into the
set_features() API.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-31 17:33:17 +01:00
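
The feature handling discussed in cdfe576eb1 and 772191b409 comes down to bit masking. A short sketch, assuming features are 64-bit masks; the function names and the example bit positions in the usage note are illustrative only.

```rust
/// Features the device can offer: what it supports, masked with what the
/// backend supports.
pub fn available_features(device: u64, backend: u64) -> u64 {
    device & backend
}

/// Features to hand back to the backend: only those the guest acknowledged
/// out of what was offered.
pub fn acked_features(available: u64, guest_acked: u64) -> u64 {
    available & guest_acked
}
```

For example, if the device supports bits 0, 1 and 5 and the backend supports bits 1, 5 and 7, only bits 1 and 5 are offered to the guest.
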
Sebastien Boeuf
bc42420583 vm-virtio: Expand vhost-user handler to be reused from virtio-fs
In order to factorize the code between vhost-user-net and virtio-fs one
step further, this patch extends the vhost-user handler implementation
to support slave requests.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-31 17:33:17 +01:00
Sebastien Boeuf
b7d3ad9063 vm-virtio: fs: Factorize vhost-user setup
This patch factorizes the existing virtio-fs code by relying on the
common code part of the vhost_user module in the vm-virtio crate.

In detail, it factorizes the vhost-user setup, and reuses the error
types defined by the module instead of defining its own types.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-31 17:33:17 +01:00
Sebastien Boeuf
56cad00f2e vm-virtio: Move fs.rs to vhost_user module
vhost-user-net introduced a new module vhost_user inside the vm-virtio
crate. Because virtio-fs is actually vhost-user-fs, it belongs to this
new module and needs to be moved there.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-31 17:33:17 +01:00
Cathy Zhang
cc7a96e9d3 main: Add integration test
Use qemu/tests/vhost-user-bridge as the backend for integration
test for vhost-user-net.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
2019-08-30 15:00:26 +01:00
Cathy Zhang
f21d54f6b0 main: Add arguments entry for vhost-user-net
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
2019-08-30 15:00:26 +01:00
Cathy Zhang
584a2cccee vmm: Add vhost-user-net support
Update the VM configuration and device initialization process to add
vhost-user-net support.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
2019-08-30 15:00:26 +01:00
Cathy Zhang
633f51af9c vm-virtio: Add vhost-user-net implementation
The vhost-user framework can provide good performance in data intensive
scenarios thanks to its memory sharing mechanism. Implement a vhost-user-net
device to bring this benefit to networking in Rust-based VMMs.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
2019-08-30 15:00:26 +01:00
Cathy Zhang
51306555e7 vmm: Add hugetlbfs handling support
The current directory handling process, which opens a tempfile through
OpenOptions with custom_flags(O_TMPFILE), works for tmpfs
but not for hugetlbfs. Add a new directory handling process
which works for both tmpfs and hugetlbfs.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
2019-08-30 15:00:26 +01:00
dependabot-preview[bot]
ce60ff16c4 build(deps): bump vmm-sys-util from a0b3893 to 829d605
Bumps [vmm-sys-util](https://github.com/rust-vmm/vmm-sys-util) from `a0b3893` to `829d605`.
- [Release notes](https://github.com/rust-vmm/vmm-sys-util/releases)
- [Commits](a0b3893a40...829d605a07)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2019-08-30 11:35:54 +00:00
dependabot-preview[bot]
3dd329052c build(deps): bump vmm-sys-util from 2177381 to a0b3893
Bumps [vmm-sys-util](https://github.com/rust-vmm/vmm-sys-util) from `2177381` to `a0b3893`.
- [Release notes](https://github.com/rust-vmm/vmm-sys-util/releases)
- [Commits](2177381ed6...a0b3893a40)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2019-08-29 07:59:39 +00:00
Sebastien Boeuf
b2f85cbdc4 vhost_rs: Wait for full request to be satisfied
The recvmsg syscall can split a request into multiple packets unless we
use the flag MSG_WAITALL to make sure the request will wait for the
whole data to be transferred before returning.

This flag is needed to prevent the vhost crate from returning the error
PartialMessage, which occurred sporadically when using virtio-fs, and
which was detected as part of our continuous integration testing.

Fixes #182

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-29 11:01:32 +08:00
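
The partial-read problem behind b2f85cbdc4 can be reproduced without sockets. The sketch below does not use recvmsg or MSG_WAITALL; it swaps in the standard library's `Read::read` and `Read::read_exact` as an analogy, with a hypothetical `Chunked` reader that delivers data in pieces the way a socket might.

```rust
use std::io::{self, Read};

/// Reader that hands back at most `chunk` bytes per call, similar to a
/// socket delivering one message in several pieces.
pub struct Chunked<'a> {
    data: &'a [u8],
    chunk: usize,
}

impl<'a> Chunked<'a> {
    pub fn new(data: &'a [u8], chunk: usize) -> Self {
        Chunked { data, chunk }
    }
}

impl<'a> Read for Chunked<'a> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = buf.len().min(self.chunk).min(self.data.len());
        buf[..n].copy_from_slice(&self.data[..n]);
        self.data = &self.data[n..];
        Ok(n)
    }
}

/// A single read may return only part of the request.
pub fn read_once(r: &mut impl Read, len: usize) -> io::Result<Vec<u8>> {
    let mut buf = vec![0u8; len];
    let n = r.read(&mut buf)?;
    buf.truncate(n);
    Ok(buf)
}

/// read_exact keeps reading until the whole request is satisfied.
pub fn read_full(r: &mut impl Read, len: usize) -> io::Result<Vec<u8>> {
    let mut buf = vec![0u8; len];
    r.read_exact(&mut buf)?;
    Ok(buf)
}
```
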
dependabot-preview[bot]
18a8bb0072 build(deps): bump vmm-sys-util from 7222869 to 2177381
Bumps [vmm-sys-util](https://github.com/rust-vmm/vmm-sys-util) from `7222869` to `2177381`.
- [Release notes](https://github.com/rust-vmm/vmm-sys-util/releases)
- [Commits](7222869ed3...2177381ed6)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2019-08-28 11:35:54 +00:00
dependabot-preview[bot]
151637b647 build(deps): bump cc from 1.0.40 to 1.0.41
Bumps [cc](https://github.com/alexcrichton/cc-rs) from 1.0.40 to 1.0.41.
- [Release notes](https://github.com/alexcrichton/cc-rs/releases)
- [Commits](https://github.com/alexcrichton/cc-rs/compare/1.0.40...1.0.41)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2019-08-28 01:13:15 +00:00
dependabot-preview[bot]
c316c161a6 build(deps): bump vm-memory from 1635f25 to 8669369
Bumps [vm-memory](https://github.com/rust-vmm/vm-memory) from `1635f25` to `8669369`.
- [Release notes](https://github.com/rust-vmm/vm-memory/releases)
- [Commits](1635f25afc...8669369d17)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2019-08-27 12:48:10 +00:00
dependabot-preview[bot]
808fcaa43b build(deps): bump lazy_static from 1.3.0 to 1.4.0
Bumps [lazy_static](https://github.com/rust-lang-nursery/lazy-static.rs) from 1.3.0 to 1.4.0.
- [Release notes](https://github.com/rust-lang-nursery/lazy-static.rs/releases)
- [Commits](https://github.com/rust-lang-nursery/lazy-static.rs/compare/1.3.0...1.4.0)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2019-08-26 13:30:59 +00:00
dependabot-preview[bot]
bc87c9f19b build(deps): bump kvm-ioctls from 37669f6 to 30adb02
Bumps [kvm-ioctls](https://github.com/rust-vmm/kvm-ioctls) from `37669f6` to `30adb02`.
- [Release notes](https://github.com/rust-vmm/kvm-ioctls/releases)
- [Commits](37669f60a0...30adb02158)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2019-08-26 13:30:36 +00:00
dependabot-preview[bot]
66a7a94a12 build(deps): bump getrandom from 0.1.10 to 0.1.11
Bumps [getrandom](https://github.com/rust-random/getrandom) from 0.1.10 to 0.1.11.
- [Release notes](https://github.com/rust-random/getrandom/releases)
- [Changelog](https://github.com/rust-random/getrandom/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-random/getrandom/compare/v0.1.10...v0.1.11)

Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
2019-08-25 18:03:45 +00:00
Sebastien Boeuf
dfb18ef14a net: Make TAP registration functions immutable
By making the registration functions immutable, this patch prevents
self-borrowing issues with the RwLock on self.mem.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-22 08:24:15 +01:00
Sebastien Boeuf
0b8856d148 vmm: Add RwLock to the GuestMemoryMmap
Following the refactoring of the code allowing multiple threads to
access the same instance of the guest memory, this patch goes one step
further by adding RwLock to it. This anticipates the future need for
being able to modify the content of the guest memory at runtime.

The reasons for adding regions to an existing guest memory could be:
- Add virtio-pmem and virtio-fs regions after the guest memory was
  created.
- Support future hotplug of devices, memory, or anything that would
  require more memory at runtime.

Because most of the time, the lock will be taken as read only, using
RwLock instead of Mutex is the right approach.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-22 08:24:15 +01:00
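
The read-mostly locking argument in 0b8856d148 can be sketched with the standard library. `Region` and `SharedMemory` are hypothetical stand-ins for the guest memory types: many threads take the read lock concurrently, and the write lock is only needed for the rare case of adding a region.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical description of one guest memory region.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Region {
    pub start: u64,
    pub len: u64,
}

/// Stand-in for the shared, lockable guest memory.
pub type SharedMemory = Arc<RwLock<Vec<Region>>>;

/// Readers only need the shared (read) lock.
pub fn total_size(mem: &SharedMemory) -> u64 {
    let guard = mem.read().unwrap();
    guard.iter().map(|r| r.len).sum()
}

/// Adding a region at runtime requires the exclusive (write) lock.
pub fn add_region(mem: &SharedMemory, region: Region) {
    mem.write().unwrap().push(region);
}

/// Several threads can hold the read lock at the same time.
pub fn concurrent_readers(mem: &SharedMemory, readers: usize) -> Vec<u64> {
    let handles: Vec<_> = (0..readers)
        .map(|_| {
            let mem = Arc::clone(mem);
            thread::spawn(move || total_size(&mem))
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```
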
Sebastien Boeuf
ec0b5567c8 vmm: Share the guest memory instead of cloning it
The VMM guest memory was cloned (copied) everywhere the code needed to
have ownership of it. In order to clean the code, and in anticipation
for future support of modifying this guest memory instance at runtime,
it is important that every part of the code share the same instance.

Because VirtioDevice implementations need to have access to it from
different threads, Arc must be used in this case.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2019-08-22 08:24:15 +01:00
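
The difference between cloning the guest memory and sharing one instance, as described in ec0b5567c8, can be illustrated with Arc. `GuestMemory` and `Device` below are simplified, hypothetical types: cloning the Arc copies a pointer and bumps a reference count, so every holder sees the same instance, including from another thread.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical, simplified guest memory.
pub struct GuestMemory {
    pub bytes: Vec<u8>,
}

/// Hypothetical device that needs access to guest memory from its own thread.
pub struct Device {
    mem: Arc<GuestMemory>,
}

impl Device {
    /// Arc::clone copies the pointer, not the memory contents.
    pub fn new(mem: &Arc<GuestMemory>) -> Self {
        Device { mem: Arc::clone(mem) }
    }

    /// True if this device refers to the very same memory instance.
    pub fn shares(&self, other: &Arc<GuestMemory>) -> bool {
        Arc::ptr_eq(&self.mem, other)
    }

    /// Use the shared memory from a separate thread.
    pub fn sum_in_thread(self) -> u64 {
        thread::spawn(move || self.mem.bytes.iter().map(|b| *b as u64).sum::<u64>())
            .join()
            .unwrap()
    }
}
```
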
Rob Bradford
f4d41d600b virtio: net: Remove TAP fd from epoll when no available descriptors
When there are no available descriptors in the queue (observed when the
network interface hasn't been brought up by the kernel) stop waiting for
notifications that the TAP fd should be read from.

This avoids a situation where the TAP device has data available and wakes
up the virtio-net thread only for the virtio-net thread not to read that
data as it has nowhere to put it.

When there are descriptors available in the queue then we resume waiting
for the epoll event on the TAP fd.

This bug demonstrated itself as 100% CPU usage for cloud-hypervisor
binary prior to the guest network interface being brought up. The
solution was inspired by the Firecracker virtio-net code.

Fixes: #208

Signed-off-by: Rob Bradford <robert.bradford@intel.com>
2019-08-21 08:41:28 -07:00
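
The behaviour described in f4d41d600b can be modelled as a small state machine. The sketch below contains no real epoll or TAP code: `RxPath` is a hypothetical type, and the `tap_registered` flag stands in for whether the TAP fd is currently part of the epoll set.

```rust
/// Hypothetical model of the virtio-net receive path.
#[derive(Debug)]
pub struct RxPath {
    /// Whether the TAP fd is currently in the (imagined) epoll set.
    pub tap_registered: bool,
    /// Descriptors the guest has made available for received frames.
    pub free_descriptors: usize,
    /// Frames delivered to the guest so far.
    pub delivered: usize,
}

impl RxPath {
    pub fn new() -> Self {
        RxPath { tap_registered: true, free_descriptors: 0, delivered: 0 }
    }

    /// Called when the TAP fd is reported readable.
    pub fn on_tap_readable(&mut self) {
        if self.free_descriptors == 0 {
            // Nowhere to put the frame: stop listening instead of spinning.
            self.tap_registered = false;
            return;
        }
        self.free_descriptors -= 1;
        self.delivered += 1;
    }

    /// Called when the guest makes descriptors available in the queue.
    pub fn on_queue_available(&mut self, count: usize) {
        self.free_descriptors += count;
        if !self.tap_registered && self.free_descriptors > 0 {
            // Resume waiting for TAP events now that frames can be stored.
            self.tap_registered = true;
        }
    }
}
```

Without the unregister step, a level-triggered readable event would keep waking the thread for data it cannot consume, which matches the 100% CPU usage reported in the commit.
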