9091 Commits

Author SHA1 Message Date
Cole Robinson
d79a2c079c conf: net: Add model enum, and netfront value
This adds a network model enum. The virDomainNetDef property
is named 'model' like most other devices.

When the XML parser or a driver calls NetSetModelString, if
the passed string is in the enum, we will set net->model,
otherwise we copy the string into net->modelstr

Add a single example for the 'netfront' xen model, and wire
that up, just to verify it's all working

Acked-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
2019-04-16 13:11:08 -04:00
Cole Robinson
6bf7c67699 conf: net: Add wrapper functions for <model> value
To ease converting the net->model value to an enum, add
the wrapper functions:

virDomainNetGetModelString
virDomainNetSetModelString
virDomainNetStreqModelString
virDomainNetStrcaseeqModelString

Acked-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
2019-04-16 13:11:08 -04:00
Daniel P. Berrangé
bbe2aa627f conf: simplify link from hostdev back to network device
hostdevs have a link back to the original network device. This is fairly
generic accepting any type of device, however, we don't intend to make
use of this approach in future. It can thus be specialized to network
devices.

Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2019-04-16 14:44:53 +01:00
Daniel P. Berrangé
e1d10f8ef2 network: pass a virNetworkPtr to port management APIs
The APIs for allocating/notifying/removing network ports just take
an internal domain interface struct right now. As a step towards
turning these into public facing APIs, add a virNetworkPtr argument
to all of them.

Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2019-04-16 14:44:53 +01:00
Daniel P. Berrangé
dd52444f23 network: restrict usage of port management APIs
The port allocation APIs are currently called unconditionally for all
types of NIC, but (mostly) only do anything for NICs with type=network.

The exception is the port allocate API which does some validation even
for NICs with type!=network. Relying on this validation is flawed,
however, since the network driver may not even be installed. IOW virt
drivers must not delegate validation to the network driver for NICs
with type != network.

This change allows us to report errors when the virtual network driver
is not registered.

Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2019-04-16 14:44:53 +01:00
Martin Kletzander
2b342cda72 qemu: Add support for emulatorsched
This helps in a scenarios where vCPUs run with a priority that is so high they
might starve the emulator thread.  And it also fits with the rest of the
settings.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-16 13:46:17 +02:00
Marc-André Lureau
ad32d76165 qemu: do not set wait:false for client sockets
Qemu commit 767abe7 ("chardev: forbid 'wait' option with client
sockets") effectively deprecates usage of "wait" with client sockets
starting with qemu 4.0, and earlier versions ignored the value.

Cc: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2019-04-16 12:52:03 +02:00
Jiri Denemark
673c62a3b7 qemu: Don't cache microcode version
My earlier commit be46f61326 was incomplete. It removed caching of
microcode version in the CPU driver, which means the capabilities XML
will see the correct microcode version. But it is also cached in the
QEMU capabilities cache where it is used to detect whether we need to
reprobe QEMU. By missing the second place, the original commit
be46f61326 made the situation even worse since libvirt would report
correct microcode version while still using the old host CPU model
(visible in domain capabilities XML).

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-15 14:34:49 +02:00
Ján Tomko
5dd6e7f949 Delete QEMU_CAPS_KQEMU and QEMU_CAPS_ENABLE_KQEMU
Support for kqemu was dropped in libvirt by commit 8e91a400c and even
back then we never set these capabilities when doing QMP probing.

Since no QEMU we aim to support has these, drop them completely.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2019-04-15 14:06:39 +02:00
Daniel Henrique Barboza
1a922648f6 PPC64 support for NVIDIA V100 GPU with NVLink2 passthrough
The NVIDIA V100 GPU has an onboard RAM that is mapped into the
host memory and accessible as normal RAM via an NVLink2 bridge. When
passed through in a guest, QEMU puts the NVIDIA RAM window in a
non-contiguous area, above the PCI MMIO area that starts at 32TiB.
This means that the NVIDIA RAM window starts at 64TiB and go all the
way to 128TiB.

This means that the guest might request a 64-bit window, for each PCI
Host Bridge, that goes all the way to 128TiB. However, the NVIDIA RAM
window isn't counted as regular RAM, thus this window is considered
only for the allocation of the Translation and Control Entry (TCE).
For more information about how NVLink2 support works in QEMU,
refer to the accepted implementation [1].

This memory layout differs from the existing VFIO case, requiring its
own formula. This patch changes the PPC64 code of
@qemuDomainGetMemLockLimitBytes to:

- detect if we have a NVLink2 bridge being passed through to the
guest. This is done by using the @ppc64VFIODeviceIsNV2Bridge function
added in the previous patch. The existence of the NVLink2 bridge in
the guest means that we are dealing with the NVLink2 memory layout;

- if an IBM NVLink2 bridge exists, passthroughLimit is calculated in a
different way to account for the extra memory the TCE table can alloc.
The 64TiB..128TiB window is more than enough to fit all possible
GPUs, thus the memLimit is the same regardless of passing through 1 or
multiple V100 GPUs.

Further reading explaining the background
[1] https://lists.gnu.org/archive/html/qemu-devel/2019-03/msg03700.html
[2] https://www.redhat.com/archives/libvir-list/2019-March/msg00660.html
[3] https://www.redhat.com/archives/libvir-list/2019-April/msg00527.html

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
2019-04-15 07:41:43 +02:00
Daniel Henrique Barboza
cc9f03801c qemu_domain: NVLink2 bridge detection function for PPC64
The NVLink2 support in QEMU implements the detection of NVLink2
capable devices by verifying the attributes of the VFIO mem region
QEMU allocates for the NVIDIA GPUs. To properly allocate an
adequate amount of memLock, Libvirt needs this information before
a QEMU instance is even created, thus querying QEMU is not
possible and opening a VFIO window is too much.

An alternative is presented in this patch. Making the following
assumptions:

- if we want GPU RAM to be available in the guest, an NVLink2 bridge
must be passed through;

- an unknown PCI device can be classified as a NVLink2 bridge
if its device tree node has 'ibm,gpu', 'ibm,nvlink',
'ibm,nvlink-speed' and 'memory-region'.

This patch introduces a helper called @ppc64VFIODeviceIsNV2Bridge
that checks the device tree node of a given PCI device and
check if it meets the criteria to be a NVLink2 bridge. This
new function will be used in a follow-up patch that, using the
first assumption, will set up the rlimits of the guest
accordingly.

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2019-04-15 07:06:52 +02:00
Martin Kletzander
673f805d4d qemu: Label uniqDir when probing capabilities
This does not cause a problem in usual scenarios thanks to us allowing
CAP_DAC_OVERRIDE for the qemu process, however in some scenarios this might be
an issue because the directory is created with mkdtemp(3) which explicitly
creates that with 0700 permissions and qemu running as non-root cannot access
that.

The scenarios include:
 - Builds without CAPNG
 - Running libvirtd in certain container configurations [1]
 - and possibly others.

[1] https://github.com/kubevirt/kubevirt/pull/2181#issuecomment-481840304

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2019-04-13 00:56:45 +02:00
Jiri Denemark
370177e2f6 cpu_x86: Store virCPUx86DataItem content in union
The structure can only be used for CPUID data now. Adding a type
indicator and moving the data into a union will let us store alternative
data types.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-12 22:53:39 +02:00
Jiri Denemark
8f1a8ce397 cpu_x86: Rename virCPUx86DataAddCPUID
It's called virCPUx86DataAdd now.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-12 22:53:39 +02:00
Jiri Denemark
3673269e3a cpu_x86: Introduce virCPUx86DataItem container struct
The following patches introduce CPU features read from MSR in addition
to those queried via CPUID instruction. Let's introduce a container
struct which will be able to describe either feature type.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-12 22:53:39 +02:00
Andrea Bolognani
03a07357e1 maint: Add filetype annotations to Makefile.inc.am
Vim has trouble figuring out the filetype automatically because
the name doesn't follow existing conventions; annotations like
the ones we already have in Makefile.ci help it out.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2019-04-12 16:55:38 +02:00
Michal Privoznik
e8c2c8bd07 qemu_command: Prefer '-overcommit mem-lock' over -realtime mlock'
The latter is deprecated and will be removed soon. The advised
replacement is '-overcommit mem-lock=on|off'.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-12 14:13:45 +02:00
Michal Privoznik
be51feff69 qemu_capabilities: Introduce QEMU_CAPS_OVERCOMMIT
Added in QEMU commit of v3.0.0-rc0~48^2~9 (then fixed by
v3.1.0-rc0~119^2~37) QEMU is replacing '-realtime mlock' with
'-overcommit mem-lock'. Add a capability to tell if we're dealing
new new enough qemu to use the replacement.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-12 13:42:39 +02:00
Michal Privoznik
a08c4b3741 qemu: Always assume QEMU_CAPS_REALTIME_MLOCK
The '-realtime mlock' cmd line argument was introduced in QEMU
commit v1.5.0-rc0~190 which matches minimal QEMU version we
require. Therefore, the capability will always be present.

Apparently, nearly none of our xml2argv test cases had the
capability hence slightly bigger change under qemuxml2argvdata/.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-12 13:39:42 +02:00
Cole Robinson
1d31526b52 Always put _LAST enums on second line of VIR_ENUM_IMPL
Standardize on putting the _LAST enum value on the second line
of VIR_ENUM_IMPL invocations. Later patches that add string labels
to VIR_ENUM_IMPL will push most of these to the second line anyways,
so this saves some noise.

Signed-off-by: Cole Robinson <crobinso@redhat.com>
2019-04-11 12:47:23 -04:00
Peter Krempa
0ef161c88f qemu: block: Use VIR_RETURN_PTR
Demonstrate how VIR_RETURN_PTR is used by refactoring qemu_block.c

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-10 16:34:57 +02:00
Peter Krempa
c9cec6a8b0 qemu: block: Remove unneeded cleanup jumps
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-10 16:34:57 +02:00
Peter Krempa
6542fbe2d5 qemu: block: Add and use AUTOPTR func for qemuBlockNodeNameBackingChainData
This is a locally used helper struct but we can make use of automatic
freeing for it.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-10 16:34:57 +02:00
Peter Krempa
7141bdd5bf qemu: block: Use VIR_AUTOFREE for char *
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-10 16:34:57 +02:00
Peter Krempa
ae0c36ecbb qemu: block: Use VIR_AUTOPTR for virHashTablePtr
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-10 16:34:57 +02:00
Peter Krempa
bc6eabbec3 qemu: block: Use VIR_AUTOPTR for virURIPtr
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-10 16:34:57 +02:00
Peter Krempa
e8ef1dd174 qemu: block: Use VIR_AUTOPTR for virJSONValue
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-10 16:34:57 +02:00
Peter Krempa
1d2eb86682 qemu: block: Introduce and use AUTOPTR func for qemuBlockStorageSourceAttachDataPtr
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-10 16:34:57 +02:00
Ján Tomko
e0befb78b1 qemuHotplugDiskSourceDataFree: also free backends
Also free the backends array, not just its members.

Fixes: d3f9dda2c9fd9fa7d2f7f1f1dd70ed7d83938101

Signed-off-by: Ján Tomko <jtomko@redhat.com>
2019-04-10 16:28:50 +02:00
Ján Tomko
c264cb1b1c qemu: remove qemuGetDomainDefaultHugepath
It is no longer used.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
2019-04-10 16:24:33 +02:00
Ján Tomko
07c6738460 qemu: do not fill in default pagesize in qemuGetDomainHupageMemPath
Commit 6864d8f740e2502dc7625bdf18ffde4465b14f69 moved this one level up
for qemuBuildMemoryBackendProps but left qemuBuildMemPathStr intact.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
2019-04-10 16:24:33 +02:00
Ján Tomko
b261c9c3a0 qemu: rename function for getting the default hugepage size
Use qemuBuildMemoryGetDefaultPagesize.

Fixes: 6864d8f740e2502dc7625bdf18ffde4465b14f69
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2019-04-10 16:24:33 +02:00
Michal Privoznik
5b9819eedc domain capabilities: Expose firmware auto selection feature
If a management application wants to use firmware auto selection
feature it can't currently know if the libvirtd it's talking to
support is or not. Moreover, it doesn't know which values that
are accepted for the @firmware attribute of <os/> when parsing
will allow successful start of the domain later, i.e. if the mgmt
application wants to use 'bios' whether there exists a FW
descriptor in the system that describes bios.

This commit then adds 'firmware' enum to <os/> element in
<domainCapabilities/> XML like this:

  <enum name='firmware'>
    <value>bios</value>
    <value>efi</value>
  </enum>

We can see both 'bios' and 'efi' listed which means that there
are descriptors for both found in the system (matched with the
machine type and architecture reported in the domain capabilities
earlier and not shown here).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
2019-04-10 13:58:51 +02:00
Michal Privoznik
9c0d73bf49 qemu_firmware: Introduce qemuFirmwareGetSupported
The point of this API is to fetch all FW descriptors, parse them
and return list of supported interfaces and SMM feature for given
combination of machine type and guest architecture.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
2019-04-10 13:58:30 +02:00
Michal Privoznik
2337309e04 qemu_firmware: Separate machine and arch matching into a function
This part of the code will be reused later.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
2019-04-10 13:54:07 +02:00
Michal Privoznik
15e0b76480 qemu_firmware: Separate firmware loading into a function
This piece of code will be reused later.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
2019-04-10 13:45:51 +02:00
Peter Krempa
f785318187 Revert "Include unistd.h directly by files using it"
This reverts commit a5e16020907e91bca1b0ab6c4ee5dbbdcccf6a54.

Getting rid of unistd.h from our headers will require more work than
just fixing the broken mingw build. Revert it until I have a more
complete proposal.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
2019-04-10 12:26:32 +02:00
Peter Krempa
a5e1602090 Include unistd.h directly by files using it
util/virutil.h bogously included unistd.h. Drop it and replace it by
including it directly where needed.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-10 09:12:04 +02:00
Peter Krempa
285c5f28c4 util: Move enum convertors into virenum.(c|h)
virutil.(c|h) is a very gross collection of random code. Remove the enum
handlers from there so we can limit the scope where virtutil.h is used.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-10 09:12:04 +02:00
Peter Krempa
c0abcca417 util: Don't include 'viralloc.h' into other header files
'viralloc.h' does not provide any type or macro which would be necessary
in headers. Prevent leakage of the inclusion.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-10 09:12:04 +02:00
Peter Krempa
a4bfc2521f util: Move the VIR_AUTO(CLEAN|PTR) helper macros into a separate header
Keeping them with viralloc.h forcibly pulls in the other stuff from
viralloc.h into other header files. This in turn creates a mess
as more and more headers pull in the 'viral' header file.

If we want to make 'viralloc.h' omnipresent we should pick a different
approach.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-10 09:12:03 +02:00
Andrea Bolognani
5ee5ebf453 qemu: Unify address assignment for virt guests
The rules are the same for all virt guests, regardless of the
architecture.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
2019-04-04 09:52:20 +02:00
Andrea Bolognani
20011d01d9 qemu: Require PCIe Root Port for PCI by default on ARM virt
Our PCIe topology depends on the availability of PCIe Root Ports,
so if none of the suitable devices (pcie-root-port, ioh3420) is
compiled into QEMU we should fall back to virtio-mmio rather than
trying to use PCI addresses only to fail immediately afterwards
when we realize we can't use the necessary controllers.

Note that this additional check is basically moot for ARM virt
guests, because PCIe Root Ports were enabled in QEMU builds for
the architecture well before guest OS support had been widely
available; however, the opposite is true for RISC-V, and tweaking
the code this way will allow us to share it between architectures.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
2019-04-04 09:52:14 +02:00
Nikolay Shirokovskiy
e3389d830c qemu: Don't duplicate suspend events and state changes
Since the STOP event handler can use the pausedReason as sent to
qemuProcessStopCPUs, we no longer need to send duplicate suspended
lifecycle events because we know what caused the stop along with extra
details. This processing allows us to also remove the duplicated state
change from qemuProcessStopCPUs.

Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
2019-04-04 10:36:04 +03:00
Nikolay Shirokovskiy
ab2eaa1492 qemu: Map suspended state reason to suspended event detail
Map is based on existing cases in code where we send suspended
event after changing domain state to paused.

Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
2019-04-04 10:36:03 +03:00
Nikolay Shirokovskiy
93c7d13eec qemu: Pass stop reason from qemuProcessStopCPUs to stop handler
Similar to commit [1] which saves and passes the running reason to
the RESUME event handler, during qemuProcessStopCPUs let's save and pass
the pause reason in the domain private data so that the STOP event
handler can use it.

[1] 5dab984ed : qemu: Pass running reason to RESUME event handler

Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
2019-04-04 10:36:03 +03:00
Michal Privoznik
6864d8f740 qemuBuildMemoryBackendProps: Get pagesize early
https://bugzilla.redhat.com/show_bug.cgi?id=1693066

Up until memfd introduction (in 24b74d187ca) we did not need to
know @pagesize because qemuGetDomainHupageMemPath() could deal
with it being zero (value of zero means use the default hugetlbfs
mount). But since for memfd we are not passing a path to
hugetlbfs mount rather the page size value we need to know its
value upfront.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-03 16:37:19 +02:00
Michal Privoznik
465df4771a virfile: Introduce and use virFileGetDefaultHugepage
This helper returns the default hugetlbfs mount point from given
array of mount points.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-03 16:37:19 +02:00
Nikolay Shirokovskiy
cae45f2cdd qemu: fix domain unlock/unref in qemuMigrationSrcPerform
qemuMigrationSrcPerform callers expect it to call virDomainObjEndAPI
in any case so on error paths we miss the virDomainObjEndAPI call.
To fix this let's make qemuMigrationSrcPerform callers responsible
for the virDomainObjEndAPI call.

ACKed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
2019-04-03 13:46:26 +03:00
Daniel P. Berrangé
ebe9c6eab7 qemu: don't rely on the non-portable d_type field in dirent
d_type is a non-portable extension to the struct dirent and even if it
exists, its value may be DT_UNKNOWN if the filesystem doesn't support
it. This is common with older versions of XFS which have ftype=0
feature.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2019-04-03 11:31:38 +01:00