Commit Graph

4788 Commits

Author SHA1 Message Date
Martin Kletzander
c69915ccaf peer2peer migration: allow connecting to local sockets
Local socket connections were outright disabled because there was no "server"
part in the URI.  However, given how requirements and usage scenarios are
evolving, some management apps might need the source libvirt daemon to connect
to the destination daemon over a UNIX socket for peer2peer migration.  Since we
cannot know where the socket leads (whether the same daemon or not) let's decide
that based on whether the socket path is non-standard, or rather explicitly
specified in the URI.  Checking non-standard path would require to ask the
daemon for configuration and the only misuse that it would prevent would be a
pretty weird one.  And that's not worth it.  The assumption is that whenever
someone uses explicit UNIX socket paths in the URI for migration they better
know what they are doing.

Partially resolves: https://bugzilla.redhat.com/1638889

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
2020-09-04 10:20:49 +02:00
Ján Tomko
ebbf8ebe4f util: virnetdevtap: stats: fix txdrop on FreeBSD
For older FreeBSD, we needed an ifdef guard to use
if_data.ifi_oqdrops, which was introduced by:

commit 61bbdbb94c
    Implement interface stats for BSD

But when we dropped the check because we deprecated
building on FreeBSD-10 in:

commit 83131d9714
    configure: drop check for unsupported FreeBSD

We started building the wrong side of the ifdef.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Fixes: 83131d9714
Reviewed-by: Roman Bogorodskiy <bogorodskiy@gmail.com>
2020-09-03 20:25:07 +02:00
Michal Privoznik
95b9db4ee2 lib: Prefer WITH_* prefix for #if conditionals
Currently, we are mixing: #if HAVE_BLAH with #if WITH_BLAH.
Things got way better with Pavel's work on meson, but apparently,
mixing these two lead to confusing and easy to miss bugs (see
31fb929eca for instance). While we were forced to use HAVE_
prefix with autotools, we are free to chose our own prefix with
meson and since WITH_ prefix appears to be more popular let's use
it everywhere.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-09-02 10:28:10 +02:00
Michal Privoznik
63b41d3f93 virfile.c: Remove some #endif comments
There are couple of conditional #includes at the beginning of
virfile.c and they try to be nice and document #endifs. But they
are mostly wrong because either they have the condition in the
comment inverted or the comment refers to a different condition
than they belong to. Just remove the comments as these #includes
are single line mostly.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-09-02 10:28:10 +02:00
Michal Privoznik
e1178d55c6 util: Check for HAVE_NET_IF_H correctly
There are two places where we try to check whether the host
system has net/if.h before including it. But the check is missing
'_H' suffix.

Fixes: 7f3eb533f4
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-09-02 10:28:10 +02:00
Ján Tomko
6fab37da59 Prefer https: everywhere where possible
Use https: links for websites that support them.

The URIs which are used as namespace identifiers
are left alone.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Neal Gompa <ngompa13@gmail.com>
2020-09-01 21:58:46 +02:00
Ján Tomko
4e7a27b610 Prefer https: for Wikipedia links
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Neal Gompa <ngompa13@gmail.com>
2020-09-01 21:58:45 +02:00
Laine Stump
95089f481e util: assign tap device names using a monotonically increasing integer
When creating a standard tap device, if provided with an ifname that
contains "%d", rather than taking that literally as the name to use
for the new device, the kernel will instead use that string as a
template, and search for the lowest number that could be put in place
of %d and produce an otherwise unused and unique name for the new
device. For example, if there is no tap device name given in the XML,
libvirt will always send "vnet%d" as the device name, and the kernel
will create new devices named "vnet0", "vnet1", etc. If one of those
devices is deleted, creating a "hole" in the name list, the kernel
will always attempt to reuse the name in the hole first before using a
name with a higher number (i.e. it finds the lowest possible unused
number).

The problem with this, as described in the previous patch dealing with
macvtap device naming, is that it makes "immediate reuse" of a newly
freed tap device name *much* more common, and in the aftermath of
deleting a tap device, there is some other necessary cleanup of things
which are named based on the device name (nwfilter rules, bandwidth
rules, OVS switch ports, to name a few) that could end up stomping
over the top of the setup of a new device of the same name for a
different guest.

Since the kernel "create a name based on a template" functionality for
tap devices doesn't exist for macvtap, this patch for standard tap
devices is a bit different from the previous patch for macvtap - in
particular there was no previous "bitmap ID reservation system" or
overly-complex retry loop that needed to be removed. We simply find
and unused name, and pass that name on to the kernel instead of
"vnet%d".

This counter is also wrapped when either it gets to INT_MAX or if the
full name would overflow IFNAMSIZ-1 characters. In the case of
"vnet%d" and a 32 bit int, we would reach INT_MAX first, but possibly
someday someone will change the name from vnet to something else.

(NB: It is still possible for a user to provide their own
parameterized template name (e.g. "mytap%d") in the XML, and libvirt
will just pass that through to the kernel as it always has.)

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-09-01 14:16:44 -04:00
Laine Stump
d7f38beb2e util: replace macvtap name reservation bitmap with a simple counter
There have been some reports that, due to libvirt always trying to
assign the lowest numbered macvtap / tap device name possible, a new
guest would sometimes be started using the same tap device name as
previously used by another guest that is in the process of being
destroyed *as the new guest is starting.

In some cases this has led to, for example, the old guest's
qemuProcessStop() code deleting a port from an OVS switch that had
just been re-added by the new guest (because the port name is based on
only the device name using the port). Similar problems can happen (and
I believe have) with nwfilter rules and bandwidth rules (which are
both instantiated based on the name of the tap device).

A couple patches have been previously proposed to change the ordering
of startup and shutdown processing, or to put a mutex around
everything related to the tap/macvtap device name usage, but in the
end no matter what you do there will still be possible holes, because
the device could be deleted outside libvirt's control (for example,
regular tap devices are automatically deleted when the qemu process
terminates, and that isn't always initiated by libvirt but could
instead happen completely asynchronously - libvirt then has no control
over the ordering of shutdown operations, and no opportunity to
protect it with a mutex.)

But this only happens if a new device is created at the same time as
one is being deleted. We can effectively eliminate the chance of this
happening if we end the practice of always looking for the lowest
numbered available device name, and instead just keep an integer that
is incremented each time we need a new device name. At some point it
will need to wrap back around to 0 (in order to avoid the IFNAMSIZ 15
character limit if nothing else), and we can't guarantee that the new
name really will be the *least* recently used name, but "math"
suggests that it will be *much* less common that we'll try to re-use
the *most* recently used name.

This patch implements such a counter for macvtap/macvlan, replacing
the existing, and much more complicated, "ID reservation" system. The
counter is set according to whatever macvtap/macvlan devices are
already in use by guests when libvirtd is started, incremented each
time a new device name is needed, and wraps back to 0 when either
INT_MAX is reached, or when the resulting device name would be longer
than IFNAMSIZ-1 characters (which actually is what happens when the
template for the device name is "maccvtap%d"). The result is that no
macvtap name will be re-used until the host has created (and possibly
destroyed) 99,999,999 devices.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-09-01 14:16:36 -04:00
Laine Stump
b546b48344 meson: link libm
On some platforms libm (needed for the pow() function) isn't being
linked in somehow. This patch adds the necessary bits to assure that
it's linked in when necessary.

Suggested-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
(cherry picked from commit 20a62b42ec001310a6329d7ee2021f0737d534ef)
2020-09-01 14:16:19 -04:00
Scott Shambarger
89f5b90a5f util: use host module suffix when loading drivers
Driver module loaders current hardcode ".so" as the file
extension.  On MacOS, meson uses ".dylib" as a module file extension.
This patch adds VIR_FILE_MODULE_EXT to virfile.h defined as the
hosts module extension, and updates driver module loaders to make
use of it.

Signed-off-by: Scott Shambarger <scott-libvirt@shambarger.net>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2020-08-26 10:30:18 +02:00
Daniel Henrique Barboza
46d88d8dba domaincapsmock: mock virHostCPUGetMicrocodeVersion()
Previous patch handled the runtime case where a non-x86 host is
fetching /proc/cpuinfo data for a microcode info that we know
it doesn't exist. This change alone speeded everything by a
bit for non-x86, but there is at least one major culprit left.

qemuxml2argvtest does several arch-specific tests, and a good
chunk of them are x86 exclusive. This means that 'hostArch'
will be seen as x86 for these tests, even when running in
non-x86 hosts. In a Power 9 server with 128 CPUs, qemuxml2argvtest
takes 298 seconds to complete in average, and 'perf record'
indicates that 95% of the time is spent in
virHostCPUGetMicrocodeVersion().

This patch mocks virHostCPUGetMicrocodeVersion() to always return
0 in the tests, avoiding /proc/cpuinfo reads. This will make all
tests behave arch-agnostic, and the microcode value being 0 has no
impact on any existing test.

This is a CI speed across the board for all archs, including x86,
given that we're not reading /proc/cpuinfo in the tests. For
a Thinkpad T480 laptop with 8 Intel i7 CPUs, qemuxml2argvtest
went from 15.50 sec to 12.50 seconds. The performance gain is even
more noticeable for huge servers with lots of CPUs. For the
Power 9 server mentioned above, this patch speeds qemuxml2argvtest
to 9 seconds, down from 298 sec.

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2020-08-25 19:44:43 +02:00
Daniel Henrique Barboza
2ba0b7497c virhostcpu.c: skip non x86 hosts in virHostCPUGetMicrocodeVersion()
Non-x86 archs does not have a 'microcode' version like x86. This is
covered already inside the function - just return 0 if no microcode
is found. Regardless of that, a read of /proc/cpuinfo is always made.
Each read will invoke the kernel to fill in the CPU details every time.

Now let's consider a non-x86 host, like a Power 9 server with 128 CPUs.
Each /proc/cpuinfo read will need to fetch data for each CPU and it
won't even matter because we know beforehand that PowerPC chips don't
have microcode information.

We can do better for non-x86 hosts by skipping this process entirely.

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-08-25 19:44:39 +02:00
Daniel Henrique Barboza
97ac16baab virhostcpu.c: modernize virHostCPUGetMicrocodeVersion()
Use g_autofree and remove the cleanup label.

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2020-08-25 19:06:19 +02:00
Ján Tomko
52cd849e62 VIR_XPATH_NODE_AUTORESTORE: remove semicolon from users
Since the macro no longer includes the 'ignore_value'
statement, stop putting another empty statement after it.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-08-25 19:03:12 +02:00
Ján Tomko
8cc177fc5d util: xml: use pragma in VIR_XPATH_NODE_AUTORESTORE
The VIR_XPATH_NODE_AUTORESTORE contains an ignore_value
statement to silence an unused variable warning on clang.

Use a pragma instead, which is not a statement.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-08-25 19:03:12 +02:00
Ján Tomko
c23c7dac9b util: cgroup: wrap BACKEND_CALL macro in a block
VIR_CGROUP_BACKEND_CALL is exclusively used at the end
of a function, but it declares a variable.

Wrap it in a do..while block.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-08-25 19:03:12 +02:00
Ján Tomko
8687408f90 util: virNetDevBridgeSet: split declarations
Declare the variables at the beginning of the function,
then fill them up.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-08-25 19:03:12 +02:00
Ján Tomko
96b4f38603 Move debug statements after declarations
Many of our functions start with a DEBUG statement.
Move the statements after declarations to appease
our coding style.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-08-25 19:03:11 +02:00
Ján Tomko
0a37e0695b Split declarations from initializations
Split those initializations that depend on a statement
above them.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-08-25 19:03:11 +02:00
Ján Tomko
a5152f23e7 Move declarations before statements
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-08-25 19:03:11 +02:00
Ján Tomko
908bcaa452 util: move declarations in virStorageFileChainLookup
Use g_autofree and move the declarations to the beginning
of the block.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-08-25 19:03:11 +02:00
Ján Tomko
d8c9584aed util: virHostMem*Parameters: split out non-Linux stubs
Repeat the whole function header instead of mixing #ifdefs
in the code.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-08-25 19:03:11 +02:00
Ján Tomko
c927b2e85a util: virHostMemSetParameters: remove pointless variable
It is only used inside the condition.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-08-25 19:03:11 +02:00
Ján Tomko
ec9b47e133 util: virRandomInt: remove temporary variable
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-08-25 19:03:11 +02:00
Peter Krempa
7a268c7c3a qemu: Move virQEMUFileOpenAs to qemu_domain.c
Commit 4362068979 moved the function to
util/virqemu.c which is compiled also on win32 and geteuid()/getegid()
doesn't exist there.

Move it to qemu_domain.c which is compiled only when the qemu driver is
enabled. Originally I didn't want to put it here as qemu_domain.c is a
code dump for helper functions but this is the least invasive fix.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-08-24 18:12:44 +02:00
Peter Krempa
4362068979 qemuOpenFileAs: Move into util/virqemu.c
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-08-24 16:40:42 +02:00
Michal Privoznik
fd6b531cb2 virfdstream: Emulate skip for block devices
This is similar to one of previous patches.

When receiving stream (on virStorageVolUpload() and subsequent
virStreamSparseSendAll()) we may receive a hole. If the volume we
are saving the incoming data into is a regular file we just
lseek() and ftruncate() to create the hole. But this won't work
if the file is a block device. If that is the case we must write
zeroes so that any subsequent reader reads nothing just zeroes
(just like they would from a hole in a regular file).

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1852528

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2020-08-24 13:40:06 +02:00
Michal Privoznik
6e0306fa26 virfdstream: Allow sparse stream vol-download
When handling sparse stream, a thread is executed. This thread
runs a read() or write() loop (depending what API is called; in
this case it's virStorageVolDownload() and  this the thread run
read() loop). The read() is handled in virFDStreamThreadDoRead()
which is then data/hole section aware, meaning it uses
virFileInData() to detect data and hole sections and sends
TYPE_DATA or TYPE_HOLE virStream messages accordingly.

However, virFileInData() does not work with block devices. Simply
because block devices don't have data and hole sections. What we
can do though, is to mimic being always in a DATA section.

Partially resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1852528

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2020-08-24 13:39:28 +02:00
Michal Privoznik
24d7d85208 virnuma: Don't work around numa_node_to_cpus() for non-existent nodes
In a very distant past, we came around machines that has not
continuous node IDs. This made us error out when constructing
capabilities XML. We resolved it by utilizing strange behaviour
of numa_node_to_cpus() in which it returned a mask with all bits
set for a non-existent node. However, this is not the only case
when it returns all ones mask - if the node exists and has enough
CPUs to fill the mask up (e.g. 128 CPUs).

The fix consists of using nodemask_isset(&numa_all_nodes, ..)
prior to calling numa_node_to_cpus() to determine if the node
exists.

Fixes: 628c935747
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1860231
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-08-21 18:47:41 +02:00
Michal Privoznik
f34642d17c virfdstream: Drop some needless labels
After previous cleanups, some labels in some functions have
nothing but 'return' statement in them. Drop the labels and
replace 'goto'-s with respective return statements.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2020-08-20 14:03:41 +02:00
Michal Privoznik
2d3ac83670 virfdstream: Use VIR_AUTOCLOSE()
Again, instead of closing FDs explicitly, we can automatically
close them when they go out of their respective scopes.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2020-08-20 14:02:33 +02:00
Michal Privoznik
4bbe816d9f virfdstream: Use g_new0() instead of VIR_ALLOC()
This switch allow us to save a few lines of code.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2020-08-20 13:53:34 +02:00
Michal Privoznik
fb27b7b9be virfdstream: Use autoptr for virFDStreamMsg
A cleanup function can be declared for virFDStreamMsg type so
that the structure doesn't have to be freed explicitly.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2020-08-20 13:51:43 +02:00
Michal Privoznik
211ea0d20c virFDStreamMsgQueuePush: Clear pointer to passed message
All callers of virFDStreamMsgQueuePush() have the same pattern:
they explicitly set @msg passed to NULL to avoid freeing it later
on. Well, the function can take address of the pointer and clear
it for them.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2020-08-20 13:51:02 +02:00
Michal Privoznik
8d5cae317e virfdstream: Use g_autofree in virFDStreamThreadDoRead()
The buffer that allocated in the virFDStreamThreadDoRead() can be
automatically freed, or if saved into the message structure it
can be stolen.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2020-08-20 13:48:45 +02:00
Michal Privoznik
53d9af1e79 virdevmapper: Ignore all errors when opening /dev/mapper/control
So far, only ENOENT is ignored (to deal with kernels without
devmapper). However, as reported on the list, under certain
scenarios a different error can occur. For instance, when libvirt
is running inside a container which doesn't have permissions to
talk to the devmapper. If this is the case, then open() returns
-1 and sets errno=EPERM.

Assuming that multipath devices are fairly narrow use case and
using them in a restricted container is even more narrow the best
fix seems to be to ignore all open errors BUT produce a warning
on failure. To avoid flooding logs with warnings on kernels
without devmapper the level is reduced to a plain debug message.

Reported-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Reviewed-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2020-08-19 17:54:04 +02:00
Michal Privoznik
feb8564a3c virdevmapper: Handle kernel without device-mapper support
In one of my latest patch (v6.6.0~30) I was trying to remove
libdevmapper use in favor of our own implementation. However, the
code did not take into account that device mapper can be not
compiled into the kernel (e.g. be a separate module that's not
loaded) in which case /proc/devices won't have the device-mapper
major number and thus virDevMapperGetTargets() and/or
virIsDevMapperDevice() fails.

However, such failure is safe to ignore, because if device mapper
is missing then there can't be any multipath devices and thus we
don't need to allow the deps in CGroups, nor create them in the
domain private namespace, etc.

Fixes: 2249455654
Reported-by: Andrea Bolognani <abologna@redhat.com>
Reported-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Tested-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
2020-08-18 12:58:27 +02:00
Michal Privoznik
82bb167f0d virdevmapper: Don't cache device-mapper major
The device mapper major is needed in virIsDevMapperDevice() which
determines whether given device is managed by device-mapper. This
number is obtained by parsing /proc/devices and then stored in a
global variable so that the file doesn't have to be parsed again.
However, as it turns out this logic is flawed - the major number
is not static and can change as it can be specified as a
parameter when loading the dm-mod module.

Unfortunately, I was not able to come up with a good solution and
thus the /proc/devices file is being parsed every time we need
the device mapper major.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Tested-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
2020-08-18 12:57:32 +02:00
Pavel Hrdina
7e574d1a07 vircgroupv2devices: fix counting entries in BPF map
BPF syscall BPF_MAP_GET_NEXT_KEY returns -1 if something fails but it
will also return -1 if trying to get next key using the last key in the
map with errno set to ENOENT.

If there are VMs running and libvirtd is restarted and user tries to
call some cgroup devices operation on a VM we need to get the count of
entries in BPF map and it fails which will result in error when trying
to attach/detech devices.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1833321

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
2020-08-11 15:11:15 +02:00
Daniel P. Berrangé
2998ba2012 util: avoid race in releasing the GSource in event thread
There is a race between  vir_event_thread_finalize and
virEventThreadWorker in releasing the last reference on
the GMainContext. If virEventThreadDataFree() runs after
vir_event_thread_finalize releases its reference, then
it will release the last reference on the GMainContext.
As a result g_autoptr cleanup on the GSource will access
free'd memory.

The race can be seen in non-deterministic crashes of the
virt-run-qemu program during its shutdown, but could
also likely affect the main libvirtd QEMU driver:

  Thread 2 (Thread 0x7f508ffff700 (LWP 222813)):
  #0  0x00007f509c8e26b0 in malloc_consolidate (av=av@entry=0x7f5088000020) at malloc.c:4488
  #1  0x00007f509c8e4b08 in _int_malloc (av=av@entry=0x7f5088000020, bytes=bytes@entry=2048) at malloc.c:3711
  #2  0x00007f509c8e6412 in __GI___libc_malloc (bytes=2048) at malloc.c:3073
  #3  0x00007f509d6e925e in g_realloc (mem=0x0, n_bytes=2048) at gmem.c:164
  #4  0x00007f509d705a57 in g_string_maybe_expand (string=string@entry=0x7f5088001f20, len=len@entry=1024) at gstring.c:102
  #5  0x00007f509d705ab6 in g_string_sized_new (dfl_size=dfl_size@entry=1024) at gstring.c:127
  #6  0x00007f509d708c5e in g_test_log_dump (len=<synthetic pointer>, msg=<synthetic pointer>) at gtestutils.c:3330
  #7  0x00007f509d708c5e in g_test_log
      (lbit=G_TEST_LOG_ERROR, string1=0x7f508800fcb0 "GLib:ERROR:ghash.c:377:g_hash_table_lookup_node: assertion failed: (hash_table->ref_count > 0)", string2=<optimized out>, n_args=0, largs=0x0) at gtestutils.c:975
  #8  0x00007f509d70af2a in g_assertion_message
      (domain=<optimized out>, file=0x7f509d7324a2 "ghash.c", line=<optimized out>, func=0x7f509d732750 <__func__.11348> "g_hash_table_lookup_node", message=<optimized out>)
      at gtestutils.c:2504
  #9  0x00007f509d70af8e in g_assertion_message_expr
      (domain=domain@entry=0x7f509d72d76e "GLib", file=file@entry=0x7f509d7324a2 "ghash.c", line=line@entry=377, func=func@entry=0x7f509d732750 <__func__.11348> "g_hash_table_lookup_node", expr=expr@entry=0x7f509d732488 "hash_table->ref_count > 0") at gtestutils.c:2555
  #10 0x00007f509d6d197e in g_hash_table_lookup_node (hash_table=0x55b70ace1760, key=<optimized out>, hash_return=<synthetic pointer>) at ghash.c:377
  #11 0x00007f509d6d197e in g_hash_table_lookup_node (hash_return=<synthetic pointer>, key=<optimized out>, hash_table=0x55b70ace1760) at ghash.c:361
  #12 0x00007f509d6d197e in g_hash_table_remove_internal (hash_table=0x55b70ace1760, key=<optimized out>, notify=1) at ghash.c:1371
  #13 0x00007f509d6e0664 in g_source_unref_internal (source=0x7f5088000b60, context=0x55b70ad87e00, have_lock=0) at gmain.c:2103
  #14 0x00007f509d6e1f64 in g_source_unref (source=<optimized out>) at gmain.c:2176
  #15 0x00007f50a08ff84c in glib_autoptr_cleanup_GSource (_ptr=<synthetic pointer>) at /usr/include/glib-2.0/glib/glib-autocleanups.h:58
  #16 0x00007f50a08ff84c in virEventThreadWorker (opaque=0x55b70ad87f80) at ../../src/util/vireventthread.c:114
  #17 0x00007f509d70bd4a in g_thread_proxy (data=0x55b70acf3850) at gthread.c:784
  #18 0x00007f509d04714a in start_thread (arg=<optimized out>) at pthread_create.c:479
  #19 0x00007f509c95cf23 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

  Thread 1 (Thread 0x7f50a1380c00 (LWP 222802)):
  #0  0x00007f509c8977ff in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
  #1  0x00007f509c881c35 in __GI_abort () at abort.c:79
  #2  0x00007f509d72a823 in g_mutex_clear (mutex=0x55b70ad87e00) at gthread-posix.c:1307
  #3  0x00007f509d72a823 in g_mutex_clear (mutex=mutex@entry=0x55b70ad87e00) at gthread-posix.c:1302
  #4  0x00007f509d6e1a84 in g_main_context_unref (context=0x55b70ad87e00) at gmain.c:582
  #5  0x00007f509d6e1a84 in g_main_context_unref (context=0x55b70ad87e00) at gmain.c:541
  #6  0x00007f50a08ffabb in vir_event_thread_finalize (object=0x55b70ad83180 [virEventThread]) at ../../src/util/vireventthread.c:50
  #7  0x00007f509d9c48a9 in g_object_unref (_object=<optimized out>) at gobject.c:3340
  #8  0x00007f509d9c48a9 in g_object_unref (_object=0x55b70ad83180) at gobject.c:3232

  #9  0x00007f509583d311 in qemuProcessQMPFree (proc=proc@entry=0x55b70ad87b90) at ../../src/qemu/qemu_process.c:8355
  #10 0x00007f5095790f58 in virQEMUCapsInitQMPSingle
      (qemuCaps=qemuCaps@entry=0x55b70ad88010, libDir=libDir@entry=0x55b70ad049e0 "/tmp/virt-qemu-run-VZC9N0/lib/qemu", runUid=runUid@entry=107, runGid=runGid@entry=107, onlyTCG=onlyTCG@entry=false) at ../../src/qemu/qemu_capabilities.c:5409
  #11 0x00007f509579108f in virQEMUCapsInitQMP (runGid=107, runUid=107, libDir=0x55b70ad049e0 "/tmp/virt-qemu-run-VZC9N0/lib/qemu", qemuCaps=0x55b70ad88010)
      at ../../src/qemu/qemu_capabilities.c:5420
  #12 0x00007f509579108f in virQEMUCapsNewForBinaryInternal
      (hostArch=VIR_ARCH_X86_64, binary=binary@entry=0x55b70ad7dc40 "/usr/libexec/qemu-kvm", libDir=0x55b70ad049e0 "/tmp/virt-qemu-run-VZC9N0/lib/qemu", runUid=107, runGid=107, hostCPUSignature=0x55b70ad01320 "GenuineIntel, Intel(R) Xeon(R) Silver 4210 CPU @ 2.20GHz, family: 6, model: 85, stepping: 7", microcodeVersion=83898113, kernelVersion=0x55b70ad00d60 "4.18.0-211.el8.x86_64 #1 SMP Thu Jun 4 08:08:16 UTC 2020") at ../../src/qemu/qemu_capabilities.c:5472
  #13 0x00007f5095791373 in virQEMUCapsNewData (binary=0x55b70ad7dc40 "/usr/libexec/qemu-kvm", privData=0x55b70ad5b8f0) at ../../src/qemu/qemu_capabilities.c:5505
  #14 0x00007f50a09a32b1 in virFileCacheNewData (name=0x55b70ad7dc40 "/usr/libexec/qemu-kvm", cache=<optimized out>) at ../../src/util/virfilecache.c:208
  #15 0x00007f50a09a32b1 in virFileCacheValidate (cache=cache@entry=0x55b70ad5c030, name=name@entry=0x55b70ad7dc40 "/usr/libexec/qemu-kvm", data=data@entry=0x7ffca39ffd90)
      at ../../src/util/virfilecache.c:277
  #16 0x00007f50a09a37ea in virFileCacheLookup (cache=cache@entry=0x55b70ad5c030, name=name@entry=0x55b70ad7dc40 "/usr/libexec/qemu-kvm") at ../../src/util/virfilecache.c:310
  #17 0x00007f5095791627 in virQEMUCapsCacheLookup (cache=0x55b70ad5c030, binary=0x55b70ad7dc40 "/usr/libexec/qemu-kvm") at ../../src/qemu/qemu_capabilities.c:5647
  #18 0x00007f50957c34c3 in qemuDomainPostParseDataAlloc (def=<optimized out>, parseFlags=<optimized out>, opaque=<optimized out>, parseOpaque=0x7ffca39ffe18)
      at ../../src/qemu/qemu_domain.c:5470
  #19 0x00007f50a0a34051 in virDomainDefPostParse
      (def=def@entry=0x55b70ad7d200, parseFlags=parseFlags@entry=258, xmlopt=xmlopt@entry=0x55b70ad5d010, parseOpaque=parseOpaque@entry=0x0)
      at ../../src/conf/domain_conf.c:5970
  #20 0x00007f50a0a464bb in virDomainDefParseNode
      (xml=xml@entry=0x55b70aced140, root=root@entry=0x55b70ad5f020, xmlopt=xmlopt@entry=0x55b70ad5d010, parseOpaque=parseOpaque@entry=0x0, flags=flags@entry=258)
      at ../../src/conf/domain_conf.c:22520
  #21 0x00007f50a0a4669b in virDomainDefParse
      (xmlStr=xmlStr@entry=0x55b70ad5f9e0 "<domain type='kvm'>\n  <name>83</name>\n  <uuid>9350639d-1c8a-4f51-a4a6-4eaf8eabe83e</uuid>\n  <metadata>\n    <libosinfo:libosinfo xmlns:libosinfo=\"http://libosinfo.org/xmlns/libvirt/domain/1.0\">\n      <"..., filename=filename@entry=0x0, xmlopt=0x55b70ad5d010, parseOpaque=parseOpaque@entry=0x0, flags=flags@entry=258) at ../../src/conf/domain_conf.c:22474
  #22 0x00007f50a0a467ae in virDomainDefParseString
      (xmlStr=xmlStr@entry=0x55b70ad5f9e0 "<domain type='kvm'>\n  <name>83</name>\n  <uuid>9350639d-1c8a-4f51-a4a6-4eaf8eabe83e</uuid>\n  <metadata>\n    <libosinfo:libosinfo xmlns:libosinfo=\"http://libosinfo.org/xmlns/libvirt/domain/1.0\">\n      <"..., xmlopt=<optimized out>, parseOpaque=parseOpaque@entry=0x0, flags=flags@entry=258)
      at ../../src/conf/domain_conf.c:22488
  #23 0x00007f50958ce112 in qemuDomainCreateXML
      (conn=0x55b70acf9090, xml=0x55b70ad5f9e0 "<domain type='kvm'>\n  <name>83</name>\n  <uuid>9350639d-1c8a-4f51-a4a6-4eaf8eabe83e</uuid>\n  <metadata>\n    <libosinfo:libosinfo xmlns:libosinfo=\"http://libosinfo.org/xmlns/libvirt/domain/1.0\">\n      <"..., flags=0) at ../../src/qemu/qemu_driver.c:1744
  #24 0x00007f50a0c268ac in virDomainCreateXML
      (conn=0x55b70acf9090, xmlDesc=0x55b70ad5f9e0 "<domain type='kvm'>\n  <name>83</name>\n  <uuid>9350639d-1c8a-4f51-a4a6-4eaf8eabe83e</uuid>\n  <metadata>\n    <libosinfo:libosinfo xmlns:libosinfo=\"http://libosinfo.org/xmlns/libvirt/domain/1.0\">\n      <"..., flags=0) at ../../src/libvirt-domain.c:176
  #25 0x000055b709547e7b in main (argc=<optimized out>, argv=<optimized out>) at ../../src/qemu/qemu_shim.c:289

The solution is to explicitly unref the GSource at a safe time instead
of letting g_autoptr unref it when leaving scope.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2020-08-07 12:44:05 +01:00
Daniel P. Berrangé
0db4743645 util: avoid crash due to race in glib event loop code
There is a fairly long standing race condition bug in glib which can hit
if you call g_source_destroy or g_source_unref from a non-main thread:

  https://gitlab.gnome.org/GNOME/glib/-/merge_requests/1358

Unfortunately it is really common for libvirt to call g_source_destroy
from a non-main thread. This glib bug is the cause of non-determinstic
crashes in eventtest, and probably in libvirtd too.

To work around the problem we need to ensure that we never release
the last reference on a GSource from a non-main thread. The previous
patch replaced our use of g_source_destroy with a pair of
g_source_remove and g_source_unref. We can now delay the g_source_unref
call by using a idle callback to invoke it from the main thread which
avoids the race condition.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2020-08-07 12:43:59 +01:00
Daniel P. Berrangé
da0a182708 util: keep track of full GSource object not source ID number
The source ID number is an alternative way to identify a source that has
been added to a GMainContext. Internally when a source ID is given, glib
will lookup the corresponding GSource and use that. The use of a source
ID is racy in some cases though, because it is invalid to continue to
use an ID number after the GSource has been removed. It is thus safer
to use the GSource object directly and have full control over the ref
counting and thus cleanup.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2020-08-07 12:43:56 +01:00
Jiri Denemark
2edd63a0db util: Fix logic in virFileSetCOW
When COW is not explicitly requested to be disabled or enabled, the
function is supposed to do nothing on non-BTRFS file systems.

Fixes commit 7230bc95aa.

https://bugzilla.redhat.com/show_bug.cgi?id=1866157

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-08-05 11:04:17 +02:00
Laine Stump
cb373a0068 util: log an error if virXMLNodeContentString will return NULL
Many of our calls to xmlNodeGetContent() (which are now all via
virXMLNodeContentString() are failing to check for a NULL return. We
need to remedy that, but in order to make the remedy simpler, let's
log an error in virXMLNodeContentString(), so that the callers don't
all individually need to (since it would be the same error message for
all of them anyway).

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-08-05 00:04:48 -04:00
Binfeng Wu
8361d335ab mdev: Fix daemon crash when reattaching mdevs on assignment conflict
If there's a list of mdevs to be assigned to a domain, but one of them
(NOT the first) is already assigned to a different domain we're going
to crash in the qemuProcessStop phase in
virMediatedDeviceListFindIndex, because some of the pointers in
mgr->activeMediatedHostdevs are dangling. This is due to
virMediatedDeviceListMarkDevices using cleanup instead of rollback when
we find out that a device is already taken.

Reproducer steps:
1. start vm1 with mdev1
2. start vm2 with mdev2, mdev1 (the order is important!)

Backtrace:
 #0  0x0000ffffb8c36250 in strcmp
 #1  0x0000ffffb9b80754 in virMediatedDeviceListFindIndex
 #2  0x0000ffffb9b80870 in virMediatedDeviceListFind
 #3  0x0000ffffb9c9e168 in virHostdevReAttachMediatedDevices
 #4  0x0000ffff9949f724 in qemuHostdevReAttachMediatedDevices
 #5  0x0000ffff9949f7f8 in qemuHostdevReAttachDomainDevices
 #6  0x0000ffff994bcd70 in qemuProcessStop
 #7  0x0000ffff994bf4e0 in qemuProcessStart

Signed-off-by: Binfeng Wu <wubinfeng@huawei.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
2020-08-04 14:03:54 +02:00
Ján Tomko
34b4b4faf0 Remove unused variables
These variables are only used for assignment and have
no other effect.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2020-08-03 15:52:09 +02:00
Ján Tomko
ef87d60120 util: cgroup: remove unused opts in virCgroupV2BindMount
In virCgroupV2BindMount there is an unused variable containing
what seem to be tmpfs mount options.

Delete it. Unlike with cgroups v1, we do not create a tmpfs
here.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2020-08-03 15:52:09 +02:00
Ján Tomko
21cd1e7254 util: delete virStringListFree
Now that everything uses g_strfreev, this function is no longer
needed.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2020-08-03 15:37:36 +02:00
Ján Tomko
8003fe0361 util: recommend g_strfreev instead of virStringListFree
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2020-08-03 15:37:36 +02:00