By passing a reference of the DeviceTree to the AddressManager, we can
now update the DeviceTree whenever a PCI BAR is reprogrammed. This is
mandatory to maintain the correct resources information related to each
virtio-pci device, which will ensure correct information will be stored
upon VM snapshot.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
We want to be able to share the same DeviceTree across multiple threads,
particularly to handle the use case where PCI BAR reprogramming might
need to update the tree while from another thread a new device is being
added to the tree.
That's why this patch moves the DeviceTree instance into an Arc<Mutex<>>
so that we can later share a reference of the same mutable tree with the
AddressManager responsible for handling PCI BAR reprogramming.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
By using the vector of resources provided by the DeviceNode, the device
manager can store the information related to PCI BARs from a virtio-pci
device. Based on this, and upon VM restoration, the device manager can
restore the BARs in the expected location in the guest address space.
One thing to note is that we only need to provide the VirtioPciDevice
with the configuration BAR (BAR 0) since the SHaredMemory BAR info comes
from the virtio device directly.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
From a VirtioPciDevice perspective, there are two types of BARs, either
the virtio configuration BAR or the SHaredMemory BAR.
The SHaredMemory BAR address comes from the virtio device directly as
the memory region had been previously allocated when the virtio device
has been created. So for this BAR, there's nothing to do when restoring
a VM, since the associated virtio device is already restored with the
appropriate resources, hence the BAR will already be at the right
address.
The remaining configuration BAR is different, as we usually get its
address from the SystemAllocator. This means in case we restore a VM,
we must provide this value, bypassing the allocator. This is what this
commit takes care of, by letting the caller set the base address for the
configuration BAR prior to allocating the BARs.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Based on the new field "pci_bdf", a virtio-pci device can be restored at
the same place on the PCI bus it was located before the VM snapshot.
This ensures consistent placement on the PCI bus, based on the stored
information related to each device.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
We need a way to store the information about where a PCI device was
placed on the PCI bus before the VM was snapshotted. The way to do this
is by adding an extra field to the DeviceNode structure.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In order to let the PciBus user choose where a device should be placed
on the bus, a new function get_device_id() is introduced. This will be
helpful in the context of snapshot/restore as the caller will be able to
place the PCI devices on the same slot they were placed before the
snapshot was taken.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Use the ACPI feature to control whether to build the mptable. This is
necessary as the mptable and ACPI RSDP table can easily overwrite each
other leading to it failing to boot.
TEST=Compile with default features and see that --cpus boot=48 now
works, try with --no-default-features --features "pci" and observe the
--cpus boot=48 also continues to work.
Fixes: #1132
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The mmio integration test build currently doesn't use ACPI so piggyback
on that test variation.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The setup_mptables() call which is not used on ACPI builds has a side
effect of testing whether there was enough RAM which one of the unit
tests was relying on. Add a similar check for the RSDP address.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
This gives Cloud-Hypervisor the possibility to snapshot and restore a
VM running with virtio-pci devices attached to it.
The VirtioPciDevice snapshot contains a vector of sub-snapshots to store
and restore information related to MsixConfig, VirtioPciCommonConfig and
PciConfiguration structures, along with snapshot data related to
VirtioPciDevice itself.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This structure contains all the virtio generic information, and as part
of restoring a VM with virtio-pci devices, it is important to restore
these values to ensure the device's proper functioning.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The PCI configuration from each PCI device is modified at runtime as we
can expect the guest OS to write to some PCI capability structure, or
move the BAR to a different location in the guest address space.
For all the reasons why such configuration might differ from the initial
configuration, we must store the registers values to be able to restore
them with the right values whenever a PCI device is being restored.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In order to restore devices relying on MSI-X, the MsixConfig structure
must be restored with the correct values. Additionally, the KVM routes
must be restored so that interrupts can be delivered through KVM the way
they were configured before the snapshot was taken.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Continue to try and track down the instability in the numbers generated
by this test by removing the (unused) network interface. It's possible
we are getting broadcast packets into the tap device and generating a
variable size of allocations.
See: #760
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Switch to using the recently added OptionParser in the code that parses
the block backend.
Fixes: #1092
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Switch to using the recently added OptionParser in the code that parses
the network backend.
Whilst doing this also update the net-backend syntax to use "sock"
rather than socket.
Fixes: #1092
Partially fixes: #1091
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Rather than repeat syntax for the vhost-user-block backend in multiple
places store it in one place and reference it from the required places.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Rather than repeat syntax for the vhost-user-net backend in multiple
places store it in one place and reference it from the required places.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Upon a virtio reset, the driver expects that available and used indexes
will be reset to 0. That's why we need to reset these values from the
VMM for any virtio device that might get reset.
This issue was not detected before because the Vec<Queue> maintained
through VirtioPciDevice or MmioDevice was never updated from the virtio
device thread after the device had been actived. For this reason, upon
reset, both available and used indexes were already at the value 0.
The issue arose when trying to reset a device after the VM was restored.
That's because during the restore, each queue is assigned with the right
available and used indexes before it is passed to the device through the
activate function. And that's why upon reset, each queue was still
assigned with these indexes while it should have been reset to 0.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
All our virtio devices support to be reset, but the virtio-mmio
transport layer was not implemented for it. This patch fixes this
lack of support.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The ch-remote usage says:
OPTIONS:
--api-socket <api-socket> HTTP API socket path (UNIX domain socket).
However it doesn't seem to actually accept that syntax, instead
requiring "--api-socket=". This may be a clap bug however it is resolved
by setting the number of arguments requires to exactly one. Which is
also the actual correct number.
Fixes: #1117Fixes: #1116
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
To aid the debugging of why the test_memory_overhead sometimes fails
print a details of all the named regions that it uses in its
calculation.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
The current tests support giving a tap name but they only test with tap
interfaces that already exist.
Fixes: #871
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Stripping the release build for glibc shrinks the size considerably:
$ du -h target/release/cloud-hypervisor
8.5M target/release/cloud-hypervisor
$ strip target/release/cloud-hypervisor
$ du -h target/release/cloud-hypervisor
5.2M target/release/cloud-hypervisor
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
To avoid a race condition where the signal might "miss" the KVM_RUN
ioctl() instead reapeatedly try sending a signal until the vCPU run is
interrupted (as indicated by setting a new per vCPU atomic.)
It important to also clear this atomic when coming out of a paused
state.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
To ensure that the DeviceManager threads (such as those used for virtio
devices) are cleaned up it is necessary to unpark them so that they get
cleanly terminated as part of the shutdown.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
After setting the kill signal flag for the vCPU thread release the pause
flag and unpark the threads. This ensures that that the vCPU thread will
wake up and check the kill signal flag if the VM is paused.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>
Rather than immediately entering the vCPU run() code check if the kill
signal is set. This allows paused VMs to be shutdown.
Signed-off-by: Rob Bradford <robert.bradford@intel.com>