We no longer report any errors so all callers can be replaced by
virBitmapNew. Additionally virBitmapNew can't return NULL now so error
handling is not necessary.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We now always return a valid pointer or crash so the return value
doesn't need to be checked.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Modify the condition which would make virBitmapNewQuiet fail to possibly
overallocate by 1 rather than failing.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We now have APIs which automatically expand the bitmap and also API
which allocates a 0 size bitmap. Remove the condition from virBitmapNew.
Effectively reverts ce49cfb48a
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
virBitmapNewEmpty() can create a bitmap with 0 length. With such a
bitmap virBitmapToString will return NULL rather than an empty string.
Initialize the buffer to avoid that.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Clarify which bit is considered most significant in the bitmap and
resulting string. Also be explicit that it's a hex string.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
There's only one combination used so we can remove the rest.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When VIR_EXEC_DAEMON is true and cmd->pidfile exists, the parent
will expect the pidfile to be written before exiting, sitting
tight in a saferead() call waiting.
The child then does process tuning (via virProcessSet* functions)
before writing the pidfile. Problem is that these tunings can
fail, and trigger a 'fork_error' jump, before cmd->pidfile is
written. The result is that the process was aborted in the
child, but the parent is still hang in the saferead() call.
This behavior can be reproduced by trying to create and execute
a QEMU guest in user mode (e.g. using qemu:///session as non-root).
virProcessSetMaxMemLock() will fail if the spawned libvirtd user
process does not have CAP_SYS_RESOURCE capability. setrlimit() will
fail, and a 'fork_error' jump is triggered before cmd->pidfile
is written. The parent will hung in saferead() indefinitely. From
the user perspective, 'virsh start <guest>' will hang up
indefinitely. CTRL+C can be used to retrieve the terminal, but
any subsequent 'virsh' call will also hang because the previous
libvirtd user process is still there.
We can fix this by moving all virProcessSet*() tuning functions
to be executed after cmd->pidfile is taken care of. In the case
mentioned above, this would be the result of 'virsh start'
after this patch:
error: Failed to start domain vm1
error: internal error: Process exited prior to exec: libvirt: error :
cannot limit locked memory to 79691776: Operation not permitted
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1882093
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Lack of this one function (which is called for each active tap device
every time libvirtd is started) is the one thing preventing a
"WITHOUT_LIBNL" build of libvirt from being useful. With this
alternate implementation, guests using standard tap devices will work
properly even when libvirt is built without libnl support.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
There was one stray bit of code in virnetdev.c that required libnl to
build, but wasn't qualified by defined(WITH_LIBNL). Adding that, plus
putting a similar check around a static function only used by that
aforementioned code, makes libvirt build properly without libnl3-devel
installed.
How useful it is in that state is a separate issue :-)
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This flag was originally created to indicate that either 1) the build
platform wasn't linux, 2) the build platform was linux, but the kernel
was too old to have macvtap support. Since there was already a switch
there, the ability to also disable it when 3) the kernel supports
macvtap but the user doesn't want it, was added in. I don't think that
(3) was ever an intentional goal, just something that grew naturally
out of having the flag there in the first place (unless possibly the
original author wanted a way to quickly disable their new code in case
it caused regressions elsewhere).
Now that the check for (2) has been removed, WITH_MACVTAP is just
checking (1) and (3), but (3) is pointless (because the extra code in
libvirt itself is miniscule, and the only external library needed for
it is libnl, which is also required for other unrelated features (and
itself has no subordinate dependencies and takes up < 1MB on
disk)). We can therfore eliminate the WITH_MACVTAP flag, as it is
functionally equivalent to WITH_LIBNL (which implies __linux__).
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
macvlan support was added to the Linux kernel in 2.6.33, but
MACVLAN_MODE_PASSTHRU wasn't added until 2.6.38, so a workaround had
been put in place to define that constant on those few systems where
it was missing. It's useful like was probably 6 months at most, but
it's been there for over 10 years.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
WITH_VIRTUALPORT just checks that we are building on Linux and that
IFLA_PORT_MAX is defined in linux/if_link.h. Back when 802.11Qb[gh]
support was added, the IFLA_* stuff was new (introduced in kernel
2.6.35, backported to RHEL6 2.6.32 kernel at some point), and so this
extra check was necessary, because libvirt was being built on Linux
distros that didn't yet have IFLA_* (e.g. older RHEL6, all
RHEL5). It's been in the kernel for a *very* long time now, so all
supported versions of all Linux platforms libvirt builds on have it.
Note that the above paragraph implies that the conditional compilation
should be changed to #if defined(__linux__). However, the astute
reader will notice that the code in question is sending and receiving
netlink messages, so it really should be conditional on WITH_LIBNL
(which implies __linux__) instead, so that's what this patch does.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
WITH_LIBNL will only be defined on Linux platforms (because libnl is a
library written to encapsulate parts of netlink, which is a Linux-only
API), so it's redundant to write:
#if defined(__linux__) && defined(WITH_LIBNL)
We can just check for WITH_LIBNL.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
IFLA_VF_MAX was introduced to the Linux kernel in 2.6.35, and was even
backported to the RHEL*6* 2.6.32 kernel downstream, so it is present
in all supported versions of all Linux distros that libvirt builds
on. Additionally, it can't be conditionally compiled out of a
kernel. There is no reason to conditionalize any piece of code on
presence of IFLA_VF_MAX - if the platform is Linux, it is supported.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The former has been present since
commit f43798c27684ab925adde7d8acc34c78c6e50df8
Author: Rusty Russell <rusty@rustcorp.com.au>
Date: Thu Jul 3 03:48:02 2008 -0700
tun: Allow GSO using virtio_net_hdr
and the latter since
commit bbb009941efaece3898910a862f6d23aa55d6ba8
Author: Jason Wang <jasowang@redhat.com>
Date: Wed Oct 31 19:45:59 2012 +0000
tuntap: introduce multiqueue flags
these are old enough that they can be assumed present in all Linux
platforms we support. The tap device creation code changed is specific
to Linux, with a separate impl for non-Linux platforms.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This flag was added by Linux with:
commit f43798c27684ab925adde7d8acc34c78c6e50df8
Author: Rusty Russell <rusty@rustcorp.com.au>
Date: Thu Jul 3 03:48:02 2008 -0700
tun: Allow GSO using virtio_net_hdr
so we can assume all Linux distros we support have this flag available
and thus the compile time check is sufficient.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Add an abort() on the class/object allocation failures so that
virStorageSourceNew() always returns a virStorageSource and remove
checks from all callers.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
g_thread_join() eats a reference.
==295055== Invalid read of size 4
==295055== at 0x4DA4AE4: g_thread_unref (in /usr/lib64/libglib-2.0.so.0.6400.5)
==295055== by 0x491D5FA: vir_event_thread_finalize (vireventthread.c:47)
==295055== by 0x4E6BCFF: g_object_unref (in /usr/lib64/libgobject-2.0.so.0.6400.5)
==295055== by 0x22F35CF4: qemuProcessQMPFree (qemu_process.c:8525)
==295055== by 0x22E71B58: glib_autoptr_clear_qemuProcessQMP (qemu_process.h:237)
...
==295055== by 0x22E98A29: qemuDomainPostParseDataAlloc (qemu_domain.c:5476)
==295055== by 0x49ABF83: virDomainDefPostParse (domain_conf.c:6023)
==295055== Address 0x2acb1c68 is 24 bytes inside a block of size 88 free'd
==295055== at 0x483B9F5: free (vg_replace_malloc.c:538)
==295055== by 0x4D80A4C: g_free (in /usr/lib64/libglib-2.0.so.0.6400.5)
...
==295055== by 0x491D5F1: vir_event_thread_finalize (vireventthread.c:46)
==295055== by 0x4E6BCFF: g_object_unref (in /usr/lib64/libgobject-2.0.so.0.6400.5)
==295055== by 0x22F35CF4: qemuProcessQMPFree (qemu_process.c:8525)
==295055== by 0x22E71B58: glib_autoptr_clear_qemuProcessQMP (qemu_process.h:237)
...
==295055== Block was alloc'd at
==295055== at 0x483A809: malloc (vg_replace_malloc.c:307)
==295055== by 0x4D80958: g_malloc (in /usr/lib64/libglib-2.0.so.0.6400.5)
...
==295055== by 0x4DA4C32: g_thread_try_new (in /usr/lib64/libglib-2.0.so.0.6400.5)
==295055== by 0x491D3BC: virEventThreadStart (vireventthread.c:159)
==295055== by 0x491D3BC: virEventThreadNew (vireventthread.c:185)
...
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Fixes: f4fc3db920
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
g_variant_iter_loop() handles freeing all arguments unless we break out
of the loop, in that case we have to free them manually.
Reported-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
We used to check the format of reply data with libdbus so we should do
the same with GLib DBus as well.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We need to pass pointer to `array`.
Reported-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Tested-by: Ján Tomko <jtomko@redhat.com>
virFileComparePaths just return 0 or 1 after commit 7b48bb8
so break while after virFileComparePaths return 1
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Yi Li <yili@winhong.com>
An extra parameter was added to virQEMUBuildQemuImgKeySecretOpts in
commit ecfc4094d8
Author: Daniel P. Berrangé <berrange@redhat.com>
Date: Tue Sep 15 16:30:37 2020 +0100
storage: add support for qcow2 LUKS encryption
but the non-null pointer annotations were not adjusted to take account.
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The storage driver was wired up to support creating raw volumes in LUKS
format, but was never adapted to support LUKS-in-qcow2. This is trivial
as it merely requires the encryption properties to be prefixed with
the "encrypt." prefix, and "encrypt.format=luks" when creating the
volume.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Crypt method number 2 indicates LUKS format.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
With libdbus our wrappers had a special syntax to create the DBus
messages by defining the DBus message signature followed by list
of arguments providing data based on the signature.
There will be no similar helper with GLib implementation as they
provide same functionality via GVariant APIs. The syntax is slightly
different mostly for how arrays, variadic types and dictionaries are
created/parsed.
Additional difference is that with GLib DBus everything is wrapped in
extra tuple (struct). For more details refer to the documentation [1].
[1] <https://developer.gnome.org/glib/stable/gvariant-format-strings.html>
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The original motivation for adding virNetDevIPCheckIPv6Forwarding
(commit 00d28a78b5) was that networking routes would disappear when
ipv6 forwarding was enabled for an interface.
This is a fairly undocumented side-effect of the "accept_ra" sysctl
for an interface. 1 means the interface will accept_ra's if not
forwarding, 2 means always accept_RAs; but it is not explained that
enabling forwarding when accept_ra==1 will also clear any kernel RA
assigned routes, very likely breaking your networking.
The check to warn about this currently uses netlink to go through all
the routes and then look at the accept_ra status of the interfaces.
However, it has been noticed that this problem does not affect systems
where IPv6 RA configuration is handled in userspace, e.g. via tools
such as NetworkManager. In this case, the error message from libvirt
is spurious, and modifying the forwarding state will not affect the RA
state or disable your networking.
If you refer to the function rt6_purge_dflt_routers() in the kernel,
we can see that the routes being purged are only those with the
kernel's RTF_ADDRCONF flag set; that is, routes added by the kernel's
RA handling. Why does it do this? I think this is a Linux
implementation decision; it has always been like that and there are
some comments suggesting that it is because a router should be
statically configured, rather than accepting external configurations.
The solution implemented here is to convert the existing check into a
walk of /proc/net/ipv6_route (because RTF_ADDRCONF is apparently not
exposed in netlink) and look for routes with this flag set. We then
check the accept_ra status for the interface, and if enabling
forwarding would break things raise an error.
This should hopefully avoid "interactive" users, who are likely to be
using NetworkManager and the like, having false warnings when enabling
IPv6, but retain the error check for users relying on kernel-based
IPv6 interface auto-configuration.
Signed-off-by: Ian Wienand <iwienand@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
Reviewed-by: Cedric Bosdonnat <CBosdonnat@suse.com>
In v6.7.0-rc1~86 I've tried to fix a problem where we were not
detecting NUMA nodes properly because we misused behaviour of a
libnuma API and as it turned out the behaviour was correct for
hosts with 64 CPUs in one NUMA node. So I changed the code to use
nodemask_isset(&numa_all_nodes, ..) instead and it fixed the
problem on such hosts. However, what I did not realize is that
numa_all_nodes does not reflect all NUMA nodes visible to
userspace, it contains only those nodes that the process
(libvirtd) an allocate memory from, which can be only a subset of
all NUMA nodes. The bitmask that contains all NUMA nodes visible
to userspace and which one I should have used is: numa_nodes_ptr.
For curious ones:
4a22f22382
And as I was fixing virNumaGetNodeCPUs() I came to realize that
we already have a function that wraps the correct bitmask:
virNumaNodeIsAvailable().
Fixes: 24d7d85208
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1876956
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
This function was introduced in the 2.0.6 release which happened
in December 2010. I think it is safe to assume that all libnuma
we deal with have the function.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Since we no longer need to wait for IPv6 DAD to complete, we never
call this function.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We are currently adding -lutil and -lkvm to the linker using the
add_project_link_arguments method. On FreeBSD 11.4, this results in
build errors because the args appear too early in the command line.
We need to pass the libraries as dependencies so that they get placed
at the same point in the linker args as other dependencies.
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
When starting a domain with <numatune/> set libvirt translates
given NUMA nodes into a set of host CPUs which is then used to
QEMU process affinity. But, if the numatune contains a
non-existent NUMA node then the translation fails with no error
reported. This is because virNumaNodesetToCPUset() calls
virNumaGetNodeCPUs() and expects it to report an error on
failure. Well, it does except for non-existent NUMA nodes. While
this behaviour might look strange it is actually desired because
of how we construct host capabilities. The virNumaGetNodeCPUs()
is called from virCapabilitiesHostNUMAInitReal() where we do not
want any error reported for non-existent NUMA nodes.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1724866
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
'blockdev-create' allows us to create the image with a custom cluster
size if we wish to. Wire it up for 'qcow2'.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
It it useful to be sure no thread is running after we drop all references to
virEventThread. Otherwise in order to avoid crashes we need to synchronize some
other way or we make extra references in event handler callbacks to all the
object in use. And some of them are not prepared to be refcounted.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Stop just send signal for threads to exit when they finish with
current task. Drain waits when all threads will finish.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Even if we have no priority threads on pool creation we can add them thru
virThreadPoolSetParameters later.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
The conditional was removed in
commit ebbf8ebe4f
Author: Ján Tomko <jtomko@redhat.com>
Date: Tue Sep 1 22:56:37 2020 +0200
util: virnetdevtap: stats: fix txdrop on FreeBSD
That commit was correct about this no longer being required for FreeBSD,
but missed that the code is also built on macOS.
Rather than testing for this field in meson though, we can simply use
a platform conditional test in the code.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Local socket connections were outright disabled because there was no "server"
part in the URI. However, given how requirements and usage scenarios are
evolving, some management apps might need the source libvirt daemon to connect
to the destination daemon over a UNIX socket for peer2peer migration. Since we
cannot know where the socket leads (whether the same daemon or not) let's decide
that based on whether the socket path is non-standard, or rather explicitly
specified in the URI. Checking non-standard path would require to ask the
daemon for configuration and the only misuse that it would prevent would be a
pretty weird one. And that's not worth it. The assumption is that whenever
someone uses explicit UNIX socket paths in the URI for migration they better
know what they are doing.
Partially resolves: https://bugzilla.redhat.com/1638889
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
For older FreeBSD, we needed an ifdef guard to use
if_data.ifi_oqdrops, which was introduced by:
commit 61bbdbb94c
Implement interface stats for BSD
But when we dropped the check because we deprecated
building on FreeBSD-10 in:
commit 83131d9714
configure: drop check for unsupported FreeBSD
We started building the wrong side of the ifdef.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Fixes: 83131d9714
Reviewed-by: Roman Bogorodskiy <bogorodskiy@gmail.com>
Currently, we are mixing: #if HAVE_BLAH with #if WITH_BLAH.
Things got way better with Pavel's work on meson, but apparently,
mixing these two lead to confusing and easy to miss bugs (see
31fb929eca for instance). While we were forced to use HAVE_
prefix with autotools, we are free to chose our own prefix with
meson and since WITH_ prefix appears to be more popular let's use
it everywhere.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
There are couple of conditional #includes at the beginning of
virfile.c and they try to be nice and document #endifs. But they
are mostly wrong because either they have the condition in the
comment inverted or the comment refers to a different condition
than they belong to. Just remove the comments as these #includes
are single line mostly.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
There are two places where we try to check whether the host
system has net/if.h before including it. But the check is missing
'_H' suffix.
Fixes: 7f3eb533f4
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Use https: links for websites that support them.
The URIs which are used as namespace identifiers
are left alone.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Neal Gompa <ngompa13@gmail.com>
When creating a standard tap device, if provided with an ifname that
contains "%d", rather than taking that literally as the name to use
for the new device, the kernel will instead use that string as a
template, and search for the lowest number that could be put in place
of %d and produce an otherwise unused and unique name for the new
device. For example, if there is no tap device name given in the XML,
libvirt will always send "vnet%d" as the device name, and the kernel
will create new devices named "vnet0", "vnet1", etc. If one of those
devices is deleted, creating a "hole" in the name list, the kernel
will always attempt to reuse the name in the hole first before using a
name with a higher number (i.e. it finds the lowest possible unused
number).
The problem with this, as described in the previous patch dealing with
macvtap device naming, is that it makes "immediate reuse" of a newly
freed tap device name *much* more common, and in the aftermath of
deleting a tap device, there is some other necessary cleanup of things
which are named based on the device name (nwfilter rules, bandwidth
rules, OVS switch ports, to name a few) that could end up stomping
over the top of the setup of a new device of the same name for a
different guest.
Since the kernel "create a name based on a template" functionality for
tap devices doesn't exist for macvtap, this patch for standard tap
devices is a bit different from the previous patch for macvtap - in
particular there was no previous "bitmap ID reservation system" or
overly-complex retry loop that needed to be removed. We simply find
and unused name, and pass that name on to the kernel instead of
"vnet%d".
This counter is also wrapped when either it gets to INT_MAX or if the
full name would overflow IFNAMSIZ-1 characters. In the case of
"vnet%d" and a 32 bit int, we would reach INT_MAX first, but possibly
someday someone will change the name from vnet to something else.
(NB: It is still possible for a user to provide their own
parameterized template name (e.g. "mytap%d") in the XML, and libvirt
will just pass that through to the kernel as it always has.)
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
There have been some reports that, due to libvirt always trying to
assign the lowest numbered macvtap / tap device name possible, a new
guest would sometimes be started using the same tap device name as
previously used by another guest that is in the process of being
destroyed *as the new guest is starting.
In some cases this has led to, for example, the old guest's
qemuProcessStop() code deleting a port from an OVS switch that had
just been re-added by the new guest (because the port name is based on
only the device name using the port). Similar problems can happen (and
I believe have) with nwfilter rules and bandwidth rules (which are
both instantiated based on the name of the tap device).
A couple patches have been previously proposed to change the ordering
of startup and shutdown processing, or to put a mutex around
everything related to the tap/macvtap device name usage, but in the
end no matter what you do there will still be possible holes, because
the device could be deleted outside libvirt's control (for example,
regular tap devices are automatically deleted when the qemu process
terminates, and that isn't always initiated by libvirt but could
instead happen completely asynchronously - libvirt then has no control
over the ordering of shutdown operations, and no opportunity to
protect it with a mutex.)
But this only happens if a new device is created at the same time as
one is being deleted. We can effectively eliminate the chance of this
happening if we end the practice of always looking for the lowest
numbered available device name, and instead just keep an integer that
is incremented each time we need a new device name. At some point it
will need to wrap back around to 0 (in order to avoid the IFNAMSIZ 15
character limit if nothing else), and we can't guarantee that the new
name really will be the *least* recently used name, but "math"
suggests that it will be *much* less common that we'll try to re-use
the *most* recently used name.
This patch implements such a counter for macvtap/macvlan, replacing
the existing, and much more complicated, "ID reservation" system. The
counter is set according to whatever macvtap/macvlan devices are
already in use by guests when libvirtd is started, incremented each
time a new device name is needed, and wraps back to 0 when either
INT_MAX is reached, or when the resulting device name would be longer
than IFNAMSIZ-1 characters (which actually is what happens when the
template for the device name is "maccvtap%d"). The result is that no
macvtap name will be re-used until the host has created (and possibly
destroyed) 99,999,999 devices.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
On some platforms libm (needed for the pow() function) isn't being
linked in somehow. This patch adds the necessary bits to assure that
it's linked in when necessary.
Suggested-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
(cherry picked from commit 20a62b42ec001310a6329d7ee2021f0737d534ef)
Driver module loaders current hardcode ".so" as the file
extension. On MacOS, meson uses ".dylib" as a module file extension.
This patch adds VIR_FILE_MODULE_EXT to virfile.h defined as the
hosts module extension, and updates driver module loaders to make
use of it.
Signed-off-by: Scott Shambarger <scott-libvirt@shambarger.net>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Previous patch handled the runtime case where a non-x86 host is
fetching /proc/cpuinfo data for a microcode info that we know
it doesn't exist. This change alone speeded everything by a
bit for non-x86, but there is at least one major culprit left.
qemuxml2argvtest does several arch-specific tests, and a good
chunk of them are x86 exclusive. This means that 'hostArch'
will be seen as x86 for these tests, even when running in
non-x86 hosts. In a Power 9 server with 128 CPUs, qemuxml2argvtest
takes 298 seconds to complete in average, and 'perf record'
indicates that 95% of the time is spent in
virHostCPUGetMicrocodeVersion().
This patch mocks virHostCPUGetMicrocodeVersion() to always return
0 in the tests, avoiding /proc/cpuinfo reads. This will make all
tests behave arch-agnostic, and the microcode value being 0 has no
impact on any existing test.
This is a CI speed across the board for all archs, including x86,
given that we're not reading /proc/cpuinfo in the tests. For
a Thinkpad T480 laptop with 8 Intel i7 CPUs, qemuxml2argvtest
went from 15.50 sec to 12.50 seconds. The performance gain is even
more noticeable for huge servers with lots of CPUs. For the
Power 9 server mentioned above, this patch speeds qemuxml2argvtest
to 9 seconds, down from 298 sec.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Non-x86 archs does not have a 'microcode' version like x86. This is
covered already inside the function - just return 0 if no microcode
is found. Regardless of that, a read of /proc/cpuinfo is always made.
Each read will invoke the kernel to fill in the CPU details every time.
Now let's consider a non-x86 host, like a Power 9 server with 128 CPUs.
Each /proc/cpuinfo read will need to fetch data for each CPU and it
won't even matter because we know beforehand that PowerPC chips don't
have microcode information.
We can do better for non-x86 hosts by skipping this process entirely.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Use g_autofree and remove the cleanup label.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Since the macro no longer includes the 'ignore_value'
statement, stop putting another empty statement after it.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The VIR_XPATH_NODE_AUTORESTORE contains an ignore_value
statement to silence an unused variable warning on clang.
Use a pragma instead, which is not a statement.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
VIR_CGROUP_BACKEND_CALL is exclusively used at the end
of a function, but it declares a variable.
Wrap it in a do..while block.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Declare the variables at the beginning of the function,
then fill them up.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Many of our functions start with a DEBUG statement.
Move the statements after declarations to appease
our coding style.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Split those initializations that depend on a statement
above them.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Use g_autofree and move the declarations to the beginning
of the block.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Repeat the whole function header instead of mixing #ifdefs
in the code.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Commit 4362068979 moved the function to
util/virqemu.c which is compiled also on win32 and geteuid()/getegid()
doesn't exist there.
Move it to qemu_domain.c which is compiled only when the qemu driver is
enabled. Originally I didn't want to put it here as qemu_domain.c is a
code dump for helper functions but this is the least invasive fix.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
This is similar to one of previous patches.
When receiving stream (on virStorageVolUpload() and subsequent
virStreamSparseSendAll()) we may receive a hole. If the volume we
are saving the incoming data into is a regular file we just
lseek() and ftruncate() to create the hole. But this won't work
if the file is a block device. If that is the case we must write
zeroes so that any subsequent reader reads nothing just zeroes
(just like they would from a hole in a regular file).
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1852528
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
When handling sparse stream, a thread is executed. This thread
runs a read() or write() loop (depending what API is called; in
this case it's virStorageVolDownload() and this the thread run
read() loop). The read() is handled in virFDStreamThreadDoRead()
which is then data/hole section aware, meaning it uses
virFileInData() to detect data and hole sections and sends
TYPE_DATA or TYPE_HOLE virStream messages accordingly.
However, virFileInData() does not work with block devices. Simply
because block devices don't have data and hole sections. What we
can do though, is to mimic being always in a DATA section.
Partially resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1852528
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
In a very distant past, we came around machines that has not
continuous node IDs. This made us error out when constructing
capabilities XML. We resolved it by utilizing strange behaviour
of numa_node_to_cpus() in which it returned a mask with all bits
set for a non-existent node. However, this is not the only case
when it returns all ones mask - if the node exists and has enough
CPUs to fill the mask up (e.g. 128 CPUs).
The fix consists of using nodemask_isset(&numa_all_nodes, ..)
prior to calling numa_node_to_cpus() to determine if the node
exists.
Fixes: 628c935747
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1860231
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
After previous cleanups, some labels in some functions have
nothing but 'return' statement in them. Drop the labels and
replace 'goto'-s with respective return statements.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Again, instead of closing FDs explicitly, we can automatically
close them when they go out of their respective scopes.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
A cleanup function can be declared for virFDStreamMsg type so
that the structure doesn't have to be freed explicitly.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
All callers of virFDStreamMsgQueuePush() have the same pattern:
they explicitly set @msg passed to NULL to avoid freeing it later
on. Well, the function can take address of the pointer and clear
it for them.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The buffer that allocated in the virFDStreamThreadDoRead() can be
automatically freed, or if saved into the message structure it
can be stolen.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
So far, only ENOENT is ignored (to deal with kernels without
devmapper). However, as reported on the list, under certain
scenarios a different error can occur. For instance, when libvirt
is running inside a container which doesn't have permissions to
talk to the devmapper. If this is the case, then open() returns
-1 and sets errno=EPERM.
Assuming that multipath devices are fairly narrow use case and
using them in a restricted container is even more narrow the best
fix seems to be to ignore all open errors BUT produce a warning
on failure. To avoid flooding logs with warnings on kernels
without devmapper the level is reduced to a plain debug message.
Reported-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Reviewed-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In one of my latest patch (v6.6.0~30) I was trying to remove
libdevmapper use in favor of our own implementation. However, the
code did not take into account that device mapper can be not
compiled into the kernel (e.g. be a separate module that's not
loaded) in which case /proc/devices won't have the device-mapper
major number and thus virDevMapperGetTargets() and/or
virIsDevMapperDevice() fails.
However, such failure is safe to ignore, because if device mapper
is missing then there can't be any multipath devices and thus we
don't need to allow the deps in CGroups, nor create them in the
domain private namespace, etc.
Fixes: 2249455654
Reported-by: Andrea Bolognani <abologna@redhat.com>
Reported-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Tested-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
The device mapper major is needed in virIsDevMapperDevice() which
determines whether given device is managed by device-mapper. This
number is obtained by parsing /proc/devices and then stored in a
global variable so that the file doesn't have to be parsed again.
However, as it turns out this logic is flawed - the major number
is not static and can change as it can be specified as a
parameter when loading the dm-mod module.
Unfortunately, I was not able to come up with a good solution and
thus the /proc/devices file is being parsed every time we need
the device mapper major.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Tested-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
BPF syscall BPF_MAP_GET_NEXT_KEY returns -1 if something fails but it
will also return -1 if trying to get next key using the last key in the
map with errno set to ENOENT.
If there are VMs running and libvirtd is restarted and user tries to
call some cgroup devices operation on a VM we need to get the count of
entries in BPF map and it fails which will result in error when trying
to attach/detech devices.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1833321
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
There is a race between vir_event_thread_finalize and
virEventThreadWorker in releasing the last reference on
the GMainContext. If virEventThreadDataFree() runs after
vir_event_thread_finalize releases its reference, then
it will release the last reference on the GMainContext.
As a result g_autoptr cleanup on the GSource will access
free'd memory.
The race can be seen in non-deterministic crashes of the
virt-run-qemu program during its shutdown, but could
also likely affect the main libvirtd QEMU driver:
Thread 2 (Thread 0x7f508ffff700 (LWP 222813)):
#0 0x00007f509c8e26b0 in malloc_consolidate (av=av@entry=0x7f5088000020) at malloc.c:4488
#1 0x00007f509c8e4b08 in _int_malloc (av=av@entry=0x7f5088000020, bytes=bytes@entry=2048) at malloc.c:3711
#2 0x00007f509c8e6412 in __GI___libc_malloc (bytes=2048) at malloc.c:3073
#3 0x00007f509d6e925e in g_realloc (mem=0x0, n_bytes=2048) at gmem.c:164
#4 0x00007f509d705a57 in g_string_maybe_expand (string=string@entry=0x7f5088001f20, len=len@entry=1024) at gstring.c:102
#5 0x00007f509d705ab6 in g_string_sized_new (dfl_size=dfl_size@entry=1024) at gstring.c:127
#6 0x00007f509d708c5e in g_test_log_dump (len=<synthetic pointer>, msg=<synthetic pointer>) at gtestutils.c:3330
#7 0x00007f509d708c5e in g_test_log
(lbit=G_TEST_LOG_ERROR, string1=0x7f508800fcb0 "GLib:ERROR:ghash.c:377:g_hash_table_lookup_node: assertion failed: (hash_table->ref_count > 0)", string2=<optimized out>, n_args=0, largs=0x0) at gtestutils.c:975
#8 0x00007f509d70af2a in g_assertion_message
(domain=<optimized out>, file=0x7f509d7324a2 "ghash.c", line=<optimized out>, func=0x7f509d732750 <__func__.11348> "g_hash_table_lookup_node", message=<optimized out>)
at gtestutils.c:2504
#9 0x00007f509d70af8e in g_assertion_message_expr
(domain=domain@entry=0x7f509d72d76e "GLib", file=file@entry=0x7f509d7324a2 "ghash.c", line=line@entry=377, func=func@entry=0x7f509d732750 <__func__.11348> "g_hash_table_lookup_node", expr=expr@entry=0x7f509d732488 "hash_table->ref_count > 0") at gtestutils.c:2555
#10 0x00007f509d6d197e in g_hash_table_lookup_node (hash_table=0x55b70ace1760, key=<optimized out>, hash_return=<synthetic pointer>) at ghash.c:377
#11 0x00007f509d6d197e in g_hash_table_lookup_node (hash_return=<synthetic pointer>, key=<optimized out>, hash_table=0x55b70ace1760) at ghash.c:361
#12 0x00007f509d6d197e in g_hash_table_remove_internal (hash_table=0x55b70ace1760, key=<optimized out>, notify=1) at ghash.c:1371
#13 0x00007f509d6e0664 in g_source_unref_internal (source=0x7f5088000b60, context=0x55b70ad87e00, have_lock=0) at gmain.c:2103
#14 0x00007f509d6e1f64 in g_source_unref (source=<optimized out>) at gmain.c:2176
#15 0x00007f50a08ff84c in glib_autoptr_cleanup_GSource (_ptr=<synthetic pointer>) at /usr/include/glib-2.0/glib/glib-autocleanups.h:58
#16 0x00007f50a08ff84c in virEventThreadWorker (opaque=0x55b70ad87f80) at ../../src/util/vireventthread.c:114
#17 0x00007f509d70bd4a in g_thread_proxy (data=0x55b70acf3850) at gthread.c:784
#18 0x00007f509d04714a in start_thread (arg=<optimized out>) at pthread_create.c:479
#19 0x00007f509c95cf23 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Thread 1 (Thread 0x7f50a1380c00 (LWP 222802)):
#0 0x00007f509c8977ff in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007f509c881c35 in __GI_abort () at abort.c:79
#2 0x00007f509d72a823 in g_mutex_clear (mutex=0x55b70ad87e00) at gthread-posix.c:1307
#3 0x00007f509d72a823 in g_mutex_clear (mutex=mutex@entry=0x55b70ad87e00) at gthread-posix.c:1302
#4 0x00007f509d6e1a84 in g_main_context_unref (context=0x55b70ad87e00) at gmain.c:582
#5 0x00007f509d6e1a84 in g_main_context_unref (context=0x55b70ad87e00) at gmain.c:541
#6 0x00007f50a08ffabb in vir_event_thread_finalize (object=0x55b70ad83180 [virEventThread]) at ../../src/util/vireventthread.c:50
#7 0x00007f509d9c48a9 in g_object_unref (_object=<optimized out>) at gobject.c:3340
#8 0x00007f509d9c48a9 in g_object_unref (_object=0x55b70ad83180) at gobject.c:3232
#9 0x00007f509583d311 in qemuProcessQMPFree (proc=proc@entry=0x55b70ad87b90) at ../../src/qemu/qemu_process.c:8355
#10 0x00007f5095790f58 in virQEMUCapsInitQMPSingle
(qemuCaps=qemuCaps@entry=0x55b70ad88010, libDir=libDir@entry=0x55b70ad049e0 "/tmp/virt-qemu-run-VZC9N0/lib/qemu", runUid=runUid@entry=107, runGid=runGid@entry=107, onlyTCG=onlyTCG@entry=false) at ../../src/qemu/qemu_capabilities.c:5409
#11 0x00007f509579108f in virQEMUCapsInitQMP (runGid=107, runUid=107, libDir=0x55b70ad049e0 "/tmp/virt-qemu-run-VZC9N0/lib/qemu", qemuCaps=0x55b70ad88010)
at ../../src/qemu/qemu_capabilities.c:5420
#12 0x00007f509579108f in virQEMUCapsNewForBinaryInternal
(hostArch=VIR_ARCH_X86_64, binary=binary@entry=0x55b70ad7dc40 "/usr/libexec/qemu-kvm", libDir=0x55b70ad049e0 "/tmp/virt-qemu-run-VZC9N0/lib/qemu", runUid=107, runGid=107, hostCPUSignature=0x55b70ad01320 "GenuineIntel, Intel(R) Xeon(R) Silver 4210 CPU @ 2.20GHz, family: 6, model: 85, stepping: 7", microcodeVersion=83898113, kernelVersion=0x55b70ad00d60 "4.18.0-211.el8.x86_64 #1 SMP Thu Jun 4 08:08:16 UTC 2020") at ../../src/qemu/qemu_capabilities.c:5472
#13 0x00007f5095791373 in virQEMUCapsNewData (binary=0x55b70ad7dc40 "/usr/libexec/qemu-kvm", privData=0x55b70ad5b8f0) at ../../src/qemu/qemu_capabilities.c:5505
#14 0x00007f50a09a32b1 in virFileCacheNewData (name=0x55b70ad7dc40 "/usr/libexec/qemu-kvm", cache=<optimized out>) at ../../src/util/virfilecache.c:208
#15 0x00007f50a09a32b1 in virFileCacheValidate (cache=cache@entry=0x55b70ad5c030, name=name@entry=0x55b70ad7dc40 "/usr/libexec/qemu-kvm", data=data@entry=0x7ffca39ffd90)
at ../../src/util/virfilecache.c:277
#16 0x00007f50a09a37ea in virFileCacheLookup (cache=cache@entry=0x55b70ad5c030, name=name@entry=0x55b70ad7dc40 "/usr/libexec/qemu-kvm") at ../../src/util/virfilecache.c:310
#17 0x00007f5095791627 in virQEMUCapsCacheLookup (cache=0x55b70ad5c030, binary=0x55b70ad7dc40 "/usr/libexec/qemu-kvm") at ../../src/qemu/qemu_capabilities.c:5647
#18 0x00007f50957c34c3 in qemuDomainPostParseDataAlloc (def=<optimized out>, parseFlags=<optimized out>, opaque=<optimized out>, parseOpaque=0x7ffca39ffe18)
at ../../src/qemu/qemu_domain.c:5470
#19 0x00007f50a0a34051 in virDomainDefPostParse
(def=def@entry=0x55b70ad7d200, parseFlags=parseFlags@entry=258, xmlopt=xmlopt@entry=0x55b70ad5d010, parseOpaque=parseOpaque@entry=0x0)
at ../../src/conf/domain_conf.c:5970
#20 0x00007f50a0a464bb in virDomainDefParseNode
(xml=xml@entry=0x55b70aced140, root=root@entry=0x55b70ad5f020, xmlopt=xmlopt@entry=0x55b70ad5d010, parseOpaque=parseOpaque@entry=0x0, flags=flags@entry=258)
at ../../src/conf/domain_conf.c:22520
#21 0x00007f50a0a4669b in virDomainDefParse
(xmlStr=xmlStr@entry=0x55b70ad5f9e0 "<domain type='kvm'>\n <name>83</name>\n <uuid>9350639d-1c8a-4f51-a4a6-4eaf8eabe83e</uuid>\n <metadata>\n <libosinfo:libosinfo xmlns:libosinfo=\"http://libosinfo.org/xmlns/libvirt/domain/1.0\">\n <"..., filename=filename@entry=0x0, xmlopt=0x55b70ad5d010, parseOpaque=parseOpaque@entry=0x0, flags=flags@entry=258) at ../../src/conf/domain_conf.c:22474
#22 0x00007f50a0a467ae in virDomainDefParseString
(xmlStr=xmlStr@entry=0x55b70ad5f9e0 "<domain type='kvm'>\n <name>83</name>\n <uuid>9350639d-1c8a-4f51-a4a6-4eaf8eabe83e</uuid>\n <metadata>\n <libosinfo:libosinfo xmlns:libosinfo=\"http://libosinfo.org/xmlns/libvirt/domain/1.0\">\n <"..., xmlopt=<optimized out>, parseOpaque=parseOpaque@entry=0x0, flags=flags@entry=258)
at ../../src/conf/domain_conf.c:22488
#23 0x00007f50958ce112 in qemuDomainCreateXML
(conn=0x55b70acf9090, xml=0x55b70ad5f9e0 "<domain type='kvm'>\n <name>83</name>\n <uuid>9350639d-1c8a-4f51-a4a6-4eaf8eabe83e</uuid>\n <metadata>\n <libosinfo:libosinfo xmlns:libosinfo=\"http://libosinfo.org/xmlns/libvirt/domain/1.0\">\n <"..., flags=0) at ../../src/qemu/qemu_driver.c:1744
#24 0x00007f50a0c268ac in virDomainCreateXML
(conn=0x55b70acf9090, xmlDesc=0x55b70ad5f9e0 "<domain type='kvm'>\n <name>83</name>\n <uuid>9350639d-1c8a-4f51-a4a6-4eaf8eabe83e</uuid>\n <metadata>\n <libosinfo:libosinfo xmlns:libosinfo=\"http://libosinfo.org/xmlns/libvirt/domain/1.0\">\n <"..., flags=0) at ../../src/libvirt-domain.c:176
#25 0x000055b709547e7b in main (argc=<optimized out>, argv=<optimized out>) at ../../src/qemu/qemu_shim.c:289
The solution is to explicitly unref the GSource at a safe time instead
of letting g_autoptr unref it when leaving scope.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
There is a fairly long standing race condition bug in glib which can hit
if you call g_source_destroy or g_source_unref from a non-main thread:
https://gitlab.gnome.org/GNOME/glib/-/merge_requests/1358
Unfortunately it is really common for libvirt to call g_source_destroy
from a non-main thread. This glib bug is the cause of non-determinstic
crashes in eventtest, and probably in libvirtd too.
To work around the problem we need to ensure that we never release
the last reference on a GSource from a non-main thread. The previous
patch replaced our use of g_source_destroy with a pair of
g_source_remove and g_source_unref. We can now delay the g_source_unref
call by using a idle callback to invoke it from the main thread which
avoids the race condition.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The source ID number is an alternative way to identify a source that has
been added to a GMainContext. Internally when a source ID is given, glib
will lookup the corresponding GSource and use that. The use of a source
ID is racy in some cases though, because it is invalid to continue to
use an ID number after the GSource has been removed. It is thus safer
to use the GSource object directly and have full control over the ref
counting and thus cleanup.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
When COW is not explicitly requested to be disabled or enabled, the
function is supposed to do nothing on non-BTRFS file systems.
Fixes commit 7230bc95aa.
https://bugzilla.redhat.com/show_bug.cgi?id=1866157
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Many of our calls to xmlNodeGetContent() (which are now all via
virXMLNodeContentString() are failing to check for a NULL return. We
need to remedy that, but in order to make the remedy simpler, let's
log an error in virXMLNodeContentString(), so that the callers don't
all individually need to (since it would be the same error message for
all of them anyway).
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
If there's a list of mdevs to be assigned to a domain, but one of them
(NOT the first) is already assigned to a different domain we're going
to crash in the qemuProcessStop phase in
virMediatedDeviceListFindIndex, because some of the pointers in
mgr->activeMediatedHostdevs are dangling. This is due to
virMediatedDeviceListMarkDevices using cleanup instead of rollback when
we find out that a device is already taken.
Reproducer steps:
1. start vm1 with mdev1
2. start vm2 with mdev2, mdev1 (the order is important!)
Backtrace:
#0 0x0000ffffb8c36250 in strcmp
#1 0x0000ffffb9b80754 in virMediatedDeviceListFindIndex
#2 0x0000ffffb9b80870 in virMediatedDeviceListFind
#3 0x0000ffffb9c9e168 in virHostdevReAttachMediatedDevices
#4 0x0000ffff9949f724 in qemuHostdevReAttachMediatedDevices
#5 0x0000ffff9949f7f8 in qemuHostdevReAttachDomainDevices
#6 0x0000ffff994bcd70 in qemuProcessStop
#7 0x0000ffff994bf4e0 in qemuProcessStart
Signed-off-by: Binfeng Wu <wubinfeng@huawei.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
These variables are only used for assignment and have
no other effect.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
In virCgroupV2BindMount there is an unused variable containing
what seem to be tmpfs mount options.
Delete it. Unlike with cgroups v1, we do not create a tmpfs
here.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Now that everything uses g_strfreev, this function is no longer
needed.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Both accept a NULL value gracefully and virStringFreeList
does not zero the pointer afterwards, so a straight replace
is safe.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
The g_strdupv function from GLib provides
the same functionality.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Last usage out of virlog.c was removed by
commit 91268c715c
node_device_udev: remove deprecated logging function
Also drop the virbuffer.h include - it seems it was never used
for anything else than the transitive stdarg.h include.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
This function calls virLogVMessage. Move it below the definition
of virLogVMessage so it can call it even without a prototype.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
The XML function is needed in the C file,
not in the header.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
It was needed for virAsprintf, which is now dropped.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Fixes: 33ed622106
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
We use an array of size VIR_NODE_MEMORY_STATS_FIELD_LENGTH
to store the string read from sysfs, but pass unbound "%s"
to sscanf.
Make the array larger by one and simply stringify that
constant as the field width specifier.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
There is no distinction between Read/Write locks for resctrl from libvirt's
point of view any more.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
It was created to get rid of conditional compilation in the resctrl code and
make it usable anywhere else. However this is not something that is going to be
used in other places because it is not portable and resctrl is just very
specific in this regard. And there is no reason why there could not be a
preprocessor conditional in the resctrl code. Also the interface of
virFileFlock() was very ambiguous which lead to some issues.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
That's the way it should've been all the time. It was originally the case, but
then the rework to virFileFlock() made the function ambiguous when it was
created in commit 5a0a5f7fb5, and due to that it was misused in commit
657ddeff23 and since then the lock being taken was shared rather than
exclusive.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Meson doesn't use .libs directory, everything is placed directly into
directories where meson.build file is used.
In order to have working tests and running libvirt directly from GIT we
need to fix all the paths pointing '.libs' directory.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Neal Gompa <ngompa13@gmail.com>
With meson we no longer have .libs directory with the actual binary so
we have to take a different approach to detect if running from build
directory.
This is not as robust as for autotools because if you select --prefix
in the build directory it will incorrectly enable the override as well
but nobody should do that.
We have to modify some of the tests to not add current build path into
PATH variable and use the full path for virsh instead. Otherwise it
would be impossible to figure out that we are running virsh from build
directory.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Neal Gompa <ngompa13@gmail.com>
There is no point of having this option in libvirt because the debug
logs can be configured using log filters.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Neal Gompa <ngompa13@gmail.com>
EXTRA_DIST is not relevant because meson makes a git copy when creating
dist archive so everything tracked by git is part of dist tarball.
The remaining ones are not converted to meson files as they are
automatically tracked by meson.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Neal Gompa <ngompa13@gmail.com>
After the switch to libnl these are no longer used.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Fixes: 77e7c13b2e
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
CVE-2020-14339
When building domain's private /dev in a namespace, libdevmapper
is consulted for getting full dependency tree of domain's disks.
The reason is that for a multipath devices all dependent devices
must be created in the namespace and allowed in CGroups.
However, this approach is very fragile as building of namespace
happens in the forked off child process, after mass close of FDs
and just before dropping privileges and execing QEMU. And it so
happens that when calling libdevmapper APIs, one of them opens
/dev/mapper/control and saves the FD into a global variable. The
FD is kept open until the lib is unlinked or dm_lib_release() is
called explicitly. We are doing neither.
However, the virDevMapperGetTargets() function is called also
from libvirtd (when setting up CGroups) and thus has to be thread
safe. Unfortunately, libdevmapper APIs are not thread safe (nor
async signal safe) and thus we can't use them. Reimplement what
libdevmapper would do using plain C (ioctl()-s, /proc/devices
parsing, /dev/mapper dirwalking, and so on).
Fixes: a30078cb83
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1858260
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Since we have VIR_AUTOSTRINGLIST we can use it to free string
lists used in the function automatically.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
There are two distinct WITH_DEVMAPPER sections in the file, for
different functions each. Rearrange the code to make some of
future commits smaller.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
btrfs defaults to performing copy-on-write for files. This is often
undesirable for VM images, so we need to be able to control whether this
behaviour is used.
The virFileSetCOW() will allow for this. We use a tristate, since out of
the box, we want the default behaviour attempt to disable cow, but only
on btrfs, silently do nothing on non-btrfs. If someone explicitly asks
to disable/enable cow, then we want to raise a hard error on non-btrfs.
Reviewed-by: Neal Gompa <ngompa13@gmail.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
gcc 10.1.0 on Debian sid has a bug where the bounds checking gets
confused beteen two branches:
In file included from /usr/include/string.h:495,
from ../../src/internal.h:28,
from ../../src/util/virsocket.h:21,
from ../../src/util/virsocketaddr.h:21,
from ../../src/util/virnetdevip.h:21,
from ../../src/util/virnetdevip.c:21:
In function 'memcpy',
inlined from 'virNetDevGetifaddrsAddress' at ../../src/util/virnetdevip.c:914:13,
inlined from 'virNetDevIPAddrGet' at ../../src/util/virnetdevip.c:962:16:
/usr/include/arm-linux-gnueabihf/bits/string_fortified.h:34:10: error: '__builtin_memcpy' offset [16, 27] from the object at 'addr' is out of the bounds of referenced subobject 'inet4' with type 'struct sockaddr_in' at offset 0 [-Werror=array-bounds]
34 | return __builtin___memcpy_chk (__dest, __src, __len, __bos0 (__dest));
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from ../../src/util/virnetdevip.h:21,
from ../../src/util/virnetdevip.c:21:
../../src/util/virnetdevip.c: In function 'virNetDevIPAddrGet':
../../src/util/virsocketaddr.h:29:28: note: subobject 'inet4' declared here
29 | struct sockaddr_in inet4;
| ^~~~~
cc1: all warnings being treated as errors
Note the source location is pointing to the "inet6" / AF_INET6 branch of
the "if", but is complaining about bounds of the "inet4" field. Changing
the code into a switch() is sufficient to avoid triggering the bug and
is arguably better code too.
Reviewed-by: Laine Stump <laine@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
g_new() is used in only 3 places. Switching them to g_new0() will do
no harm, reduces confusion, and helps me sleep better at night knowing
that all allocated memory is initialized to 0 :-) (Yes, I *know* that
in all three cases the associated memory is immediately assigned some
other value. Today.)
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Historically, we've used security_context_t for variables passed
to libselinux APIs. But almost 7 years ago, libselinux developers
admitted in their API that in fact, it's just a 'char *' type
[1]. Ever since then the APIs accept 'char *' instead, but they
kept the old alias just for API stability. Well, not anymore [2].
1: 9eb9c93275
2: 7a124ca275
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Commit <f650e86703847af544762d02f79c70131ff7fbab> added check for
openpty function from util library using AC_CHECK_LIB(). However, that
macro doesn't define OPENPTY_LIBS, it only defines WITH_LIBUTIL and
prepends -lutil into LIBS for the whole project.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
It was introduced by commit <c606671aaad10a9bc87f226bc473a091e00a9629>
as a gnulib ldexp module and later removed by commit
<09fe607b4de8eb883c966e90aaf5563299a22738>.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Fixes inconsistency with macro names for external programs.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>