32876 Commits

Author SHA1 Message Date
Martin Kletzander
2b342cda72 qemu: Add support for emulatorsched
This helps in a scenarios where vCPUs run with a priority that is so high they
might starve the emulator thread.  And it also fits with the rest of the
settings.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-16 13:46:17 +02:00
Martin Kletzander
842bc56ad2 conf: Add support for emulatorsched
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-16 13:46:17 +02:00
Martin Kletzander
c79a39e60c docs: Mention iothreadsched element in the docs and reword
Just one missing occurrence of iothreadsched fixed plus some rewording for this
to make more sense for the readers.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-16 13:46:17 +02:00
Martin Kletzander
3217bcc535 conf: Format thread IDs optionally
This will be used later when we want to format emulator scheduler parameters
which don't apply for multiple threads.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-16 13:46:17 +02:00
Martin Kletzander
0c010cd103 conf: Parse common scheduler attributes in separate function
This will become useful later when parsing emulatorsched parameters which don't
need the rest of the current function.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-16 13:46:17 +02:00
Marc-André Lureau
ad32d76165 qemu: do not set wait:false for client sockets
Qemu commit 767abe7 ("chardev: forbid 'wait' option with client
sockets") effectively deprecates usage of "wait" with client sockets
starting with qemu 4.0, and earlier versions ignored the value.

Cc: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2019-04-16 12:52:03 +02:00
Adrian Brzezinski
70d60b811f news: cleanup in virNetTLSContextNew
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Adrian Brzezinski <redhat@adrb.pl>
2019-04-16 11:23:10 +01:00
Adrian Brzezinski
dc4e9bfb84 rpc: cleanup in virNetTLSContextNew
Failed new gnutls context allocations in virNetTLSContextNew function
results in double free and segfault. Occasional memory leaks may also
occur.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Adrian Brzezinski <redhat@adrb.pl>
2019-04-16 11:22:50 +01:00
Michal Privoznik
c2568c1c5e news: Document firmware autoselection exposure in domcaps
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
2019-04-16 10:52:51 +02:00
Michal Privoznik
abd70ac3ae virSecurityDACRestoreChardevLabel: Restore UNIX sockets too
We're setting seclabels on unix sockets but never restoring them.
Surprisingly, we are doing so in SELinux driver.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-16 10:47:51 +02:00
Pino Toscano
3958e3d6a5 docs: document firmware attribute for VMware guests
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Pino Toscano <ptoscano@redhat.com>
2019-04-15 20:03:55 -04:00
Pino Toscano
b4e34d1083 vmx: write firmware back from autoselection
When writing the VMX file from the domain XML, write the firmware key
according to the firmware autoselection.  Though, at the moment only
'efi' is supported.

Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Pino Toscano <ptoscano@redhat.com>
2019-04-15 20:03:55 -04:00
Pino Toscano
9bb6e4e739 vmx: convert firmware config for autoselection
Convert the firmware key to a type of autoselected firmware.

Only the 'efi' firmware is allowed for now, in case the key is present.
It seems VMware (at least ESXi) does not write the key in VMX files when
setting BIOS as firmware.

Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Pino Toscano <ptoscano@redhat.com>
2019-04-15 20:03:55 -04:00
Laine Stump
fc79e73836 network: only reload firewall after firewalld is finished restarting
The network driver used to reload the firewall rules whenever a dbus
NameOwnerChanged message for org.fedoraproject.FirewallD1 was
received. Presumably at some point in the past this was successful at
reloading our rules after a firewalld restart. Recently though I
noticed that once firewalld was restarted, libvirt's logs would get this
message:

  The name org.fedoraproject.FirewallD1 was not provided by any .service files

After this point, no networks could be started until libvirtd itself
was restarted.

The problem is that the NameOwnerChanged message is sent twice during
a firewalld restart - once when the old firewalld is stopped, and
again when the new firewalld is started. If we try to reload at the
point the old firewalld is stopped, none of the firewalld dbus calls
will succeed.

The solution is to check the new_owner field of the message - we
should reload our firewall rules only if new_owner is non-empty (it is
set to "" when firewalld is stopped, and some sort of epoch number
when it is again started).

Signed-off-by: Laine Stump <laine@laine.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2019-04-15 12:53:38 -04:00
Laine Stump
687f556750 util: eliminate duplicate function virDBusMessageRead
When virDBusMessageRead() and virDBusMessageDecode were first added in
commit 834c9c94, they were identical except that virDBusMessageRead()
would unref the message after decoding it.

This difference was eliminated later in commit dc7f3ffc after it
became apparent that unref-ing the message so soon was never the right
thing to do. The two identical functions remained though, with the
tests and virDBus library itself calling the Decode variant, and all
other users calling the Read variant.

This patch eliminates the duplication, switching all users to
virDBusMessageDecode (and moving the nice API documentation comment
from the Read function up to the Decode function).

Signed-off-by: Laine Stump <laine@laine.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2019-04-15 12:47:44 -04:00
Daniel P. Berrangé
4683a609f6 vbox: drop C API definition for release 4.3.4
Support for compiling this version was dropped in an earlier commit.

Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2019-04-15 17:16:52 +01:00
Daniel P. Berrangé
9a6d16674f vbox: drop C API definition for release 4.3
Support for compiling this version was dropped in an earlier commit.

Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2019-04-15 17:16:46 +01:00
Daniel P. Berrangé
1aab36e16b vbox: drop C API definition for release 4.2.20
Support for compiling this version was dropped in an earlier commit.

Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2019-04-15 17:16:44 +01:00
Daniel P. Berrangé
3b111eddb9 vbox: drop C API definition for release 4.2
Support for compiling this version was dropped in an earlier commit.

Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2019-04-15 17:16:41 +01:00
Daniel P. Berrangé
4e65eda252 vbox: drop C API definition for release 4.1
Support for compiling this version was dropped in an earlier commit.

Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2019-04-15 17:16:37 +01:00
Daniel P. Berrangé
3e2402e8b8 vbox: drop C API definition for release 4.0
Support for compiling this version was dropped in an earlier commit.

Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2019-04-15 17:16:31 +01:00
Daniel P. Berrangé
2d1fadb44d vbox: drop support for VirtualBox 4.x releases
Support for all the 4.x releases was ended by VirtualBox maintainers in
Dec 2015. Even the "newest" 4.3.40 of those is only supported on old
versions of Linux (Ubuntu <= 13.03, RHEL <= 6, SLES <= 11), which are all
discontinued hosts from libvirt's POV.

We can thus reasonably drop all 4.x support from the libvirt VirtualBox
driver.

Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2019-04-15 17:16:21 +01:00
Daniel P. Berrangé
c1c235eb5c network: clear cached error if we successfully create firewall chains
Since:

  commit 9f4e35dc73ec9e940aa61bc7c140c2b800218ef3
  Author: Daniel P. Berrangé <berrange@redhat.com>
  Date:   Mon Mar 18 17:31:21 2019 +0000

    network: improve error report when firewall chain creation fails

We cache an error when failing to create the top level firewall chains.
This commit failed to account for fact that we may invoke
networkPreReloadFirewallRules() many times while libvirtd is running.
For example when firewalld is restarted.

When this happens the original failure may no longer occurr and we'll
successfully create our top level chains. We failed to clear the cached
error resulting in us failing to start virtual networks.

Reviewed-by: Laine Stump <laine@laine.org>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2019-04-15 17:08:47 +01:00
Andrea Bolognani
d28102e511 tools: Reduce table width in virsh(1)
The table included in the sample output for 'list --title' is
unnecessarily wide, which causes man to complain:

  warning [p 8, 0.5i]: can't break line

Make the table narrower.

Spotted by Lintian (manpage-has-errors-from-man tag).

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Acked-by: Michal Privoznik <mprivozn@redhat.com>
2019-04-15 18:07:19 +02:00
Andrea Bolognani
51d48c48e4 tools: Fix grammar
Apparently "allow(s) to frobnicate" is not correct English, and
either "allow(s) one to frobnicate" or "allow(s) frobnicating"
should be used instead.

Spotted by Lintian (spelling-error-in-{binary,manpage} tags).

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2019-04-15 17:37:52 +02:00
Andrea Bolognani
b6e6de9974 util: Fix NAME section for virkey{code,name}-*
Spotted by Lintian (manpage-has-bad-whatis-entry tag).

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2019-04-15 16:20:46 +02:00
Andrea Bolognani
4fe32dac30 keycodemapdb: Update submodule
We need commit 6280c94f306d in order to fix our generated
man pages.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2019-04-15 16:18:00 +02:00
Andrea Bolognani
2b48ab6176 Don't hardcode list of git submodules
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2019-04-15 15:12:20 +02:00
Jiri Denemark
673c62a3b7 qemu: Don't cache microcode version
My earlier commit be46f61326 was incomplete. It removed caching of
microcode version in the CPU driver, which means the capabilities XML
will see the correct microcode version. But it is also cached in the
QEMU capabilities cache where it is used to detect whether we need to
reprobe QEMU. By missing the second place, the original commit
be46f61326 made the situation even worse since libvirt would report
correct microcode version while still using the old host CPU model
(visible in domain capabilities XML).

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-15 14:34:49 +02:00
Ján Tomko
5dd6e7f949 Delete QEMU_CAPS_KQEMU and QEMU_CAPS_ENABLE_KQEMU
Support for kqemu was dropped in libvirt by commit 8e91a400c and even
back then we never set these capabilities when doing QMP probing.

Since no QEMU we aim to support has these, drop them completely.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2019-04-15 14:06:39 +02:00
Martin Kletzander
6a6453fb56 examples: Initialize @pos in domtop.c
This is a zero-cost workaround for a bug in GCC 8.3.0 which causes the
compilation to fail, because the compiler thinks that the value might be used
uninitialized even though it clearly cannot be.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2019-04-15 11:39:26 +02:00
Andrea Bolognani
9d7b9cf166 Fix spelling for macOS
Though it used to be called "Mac OS X" and "OS X" in the past,
it was never "MacOS X" nor "OS-X", and it's just "macOS" now.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
2019-04-15 11:09:10 +02:00
Andrea Bolognani
7cd70adbd2 news: Drop empty sections
We have occasionally failed to document certain categories
of changes in the release notes, yet still left the
corresponding sections in the file even though they were
completely empty.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
2019-04-15 11:08:32 +02:00
Michal Privoznik
0a97486e09 cpu_x86: Fix placement of *CheckFeature functions
In e17d10386 these functions were mistakenly moved into an #ifdef
block, but remained used outside of it leaving the build broken
for platforms where #ifdef evaluated to false.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2019-04-15 09:48:05 +02:00
Michal Privoznik
ae3d812b00 virhostcpu: Make virHostCPUGetMSR() work only on x86
Model specific registers are a thing only on x86. Also, the
/dev/cpu/0/msr path exists only on Linux and the fallback
mechanism (asking KVM) exists on Linux and FreeBSD only.

Therefore, move the function within #ifdef that checks all
aforementioned constraints and provide a dummy stub for all
other cases.

This fixes the build on my arm box, mingw-* builds, etc.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2019-04-15 09:46:27 +02:00
Michal Privoznik
b9991e8386 virhostcpu.c: Fix misalignment in virHostCPUGetMSRFromKVM comment
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2019-04-15 09:39:11 +02:00
Daniel Henrique Barboza
1a922648f6 PPC64 support for NVIDIA V100 GPU with NVLink2 passthrough
The NVIDIA V100 GPU has an onboard RAM that is mapped into the
host memory and accessible as normal RAM via an NVLink2 bridge. When
passed through in a guest, QEMU puts the NVIDIA RAM window in a
non-contiguous area, above the PCI MMIO area that starts at 32TiB.
This means that the NVIDIA RAM window starts at 64TiB and go all the
way to 128TiB.

This means that the guest might request a 64-bit window, for each PCI
Host Bridge, that goes all the way to 128TiB. However, the NVIDIA RAM
window isn't counted as regular RAM, thus this window is considered
only for the allocation of the Translation and Control Entry (TCE).
For more information about how NVLink2 support works in QEMU,
refer to the accepted implementation [1].

This memory layout differs from the existing VFIO case, requiring its
own formula. This patch changes the PPC64 code of
@qemuDomainGetMemLockLimitBytes to:

- detect if we have a NVLink2 bridge being passed through to the
guest. This is done by using the @ppc64VFIODeviceIsNV2Bridge function
added in the previous patch. The existence of the NVLink2 bridge in
the guest means that we are dealing with the NVLink2 memory layout;

- if an IBM NVLink2 bridge exists, passthroughLimit is calculated in a
different way to account for the extra memory the TCE table can alloc.
The 64TiB..128TiB window is more than enough to fit all possible
GPUs, thus the memLimit is the same regardless of passing through 1 or
multiple V100 GPUs.

Further reading explaining the background
[1] https://lists.gnu.org/archive/html/qemu-devel/2019-03/msg03700.html
[2] https://www.redhat.com/archives/libvir-list/2019-March/msg00660.html
[3] https://www.redhat.com/archives/libvir-list/2019-April/msg00527.html

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
2019-04-15 07:41:43 +02:00
Daniel Henrique Barboza
cc9f03801c qemu_domain: NVLink2 bridge detection function for PPC64
The NVLink2 support in QEMU implements the detection of NVLink2
capable devices by verifying the attributes of the VFIO mem region
QEMU allocates for the NVIDIA GPUs. To properly allocate an
adequate amount of memLock, Libvirt needs this information before
a QEMU instance is even created, thus querying QEMU is not
possible and opening a VFIO window is too much.

An alternative is presented in this patch. Making the following
assumptions:

- if we want GPU RAM to be available in the guest, an NVLink2 bridge
must be passed through;

- an unknown PCI device can be classified as a NVLink2 bridge
if its device tree node has 'ibm,gpu', 'ibm,nvlink',
'ibm,nvlink-speed' and 'memory-region'.

This patch introduces a helper called @ppc64VFIODeviceIsNV2Bridge
that checks the device tree node of a given PCI device and
check if it meets the criteria to be a NVLink2 bridge. This
new function will be used in a follow-up patch that, using the
first assumption, will set up the rlimits of the guest
accordingly.

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2019-04-15 07:06:52 +02:00
Michal Privoznik
4a0f604dd0 cpu_map: Distribute x86_Cascadelake-Server.xml
In 2878278c74cc4 we've added new cpu model but we've forgot to
distribute the XML file it comes in.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2019-04-13 21:33:22 +02:00
Martin Kletzander
673f805d4d qemu: Label uniqDir when probing capabilities
This does not cause a problem in usual scenarios thanks to us allowing
CAP_DAC_OVERRIDE for the qemu process, however in some scenarios this might be
an issue because the directory is created with mkdtemp(3) which explicitly
creates that with 0700 permissions and qemu running as non-root cannot access
that.

The scenarios include:
 - Builds without CAPNG
 - Running libvirtd in certain container configurations [1]
 - and possibly others.

[1] https://github.com/kubevirt/kubevirt/pull/2181#issuecomment-481840304

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2019-04-13 00:56:45 +02:00
Jiri Denemark
df4b46737f vircpuhost: Add support for reading MSRs
The new virHostCPUGetMSR internal API will try to read the MSR from
/dev/cpu/0/msr and if it is not possible (the device does not exist or
libvirt is running unprivileged), it will fallback to asking KVM for the
MSR using KVM_GET_MSRS ioctl.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-12 22:53:40 +02:00
Jiri Denemark
1c0ff5df07 cputest: Add support for MSR features to cpu-cpuid.py
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-12 22:53:40 +02:00
Jiri Denemark
8904492e21 cputest: Add support for MSR features to cpu-parse.sh
The script just parses whatever cpu-gather.sh printed out.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-12 22:53:39 +02:00
Jiri Denemark
ab3d6ea0da cputest: Add support for MSR features to cpu-gather.sh
This patch adds an inline python code for reading MSR features. Since
reading MSRs is a privileged operation, we have to read them from
/dev/cpu/*/msr if it is readable (i.e., the script runs as root) or
fallback to using KVM ioctl which can be done by any user that can start
virtual machines.

The python code is inlined rather than provided in a separate script
because whenever there's an issue with proper detection of CPU features,
we ask the reporter to run cpu-gather.sh script to give us all data we
need to know about the host CPU. Asking them to run several scripts
would likely result in one of them being ignored or forgotten.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-12 22:53:39 +02:00
Jiri Denemark
4dbb82a967 cputest: Generalize feature parsing in cpu-cpuid.py
The parseMapFeature for parsing features from CPU map XML can be easily
generalized to support more feature types.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-12 22:53:39 +02:00
Jiri Denemark
df9a23beee cputest: Prepare cpu-cpuid.py for MSR features
Let's make sure the current CPUID specific code is only applied to CPUID
features.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-12 22:53:39 +02:00
Jiri Denemark
6cbab502d3 cputest: Rename in_e[ac]x as e[ac]x_in in cpu-cpuid.py
This will let us simplify the code since the dictionary keys will match
attribute names in various XMLs.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-12 22:53:39 +02:00
Jiri Denemark
77f1fbaed8 cputest: Fix comparison in checkCPUIDFeature in cpu-cpuid.py
leaf["eax"] & eax > 0 check works correctly only if there's at most 1
bit set in eax. Luckily that's been always the case, but fixing this
could save us from future surprises.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-12 22:53:39 +02:00
Jiri Denemark
a7ad56edd9 cputest: Generalize function names in cpu-cpuid.py
The function will have to deal with both CPUID and MSR features.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-12 22:53:39 +02:00
Jiri Denemark
ee6185db02 cputest: Drop support for old QEMU from cpu-parse.sh
We don't really need to parse CPU data from QEMU older than 2.9 (i.e.,
before query-cpu-model-expansion) at this point. But even if there's a
need to do so, we can always use an older version of this script to do
the conversion.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-04-12 22:53:39 +02:00