A memory leak was identified in
virCgroupEnableMissingControllers():
==11680== at 0x483EAE5: calloc (vg_replace_malloc.c:760)
==11680== by 0x4E51780: g_malloc0 (in /usr/lib64/libglib-2.0.so.0.6701.0)
==11680== by 0x4908618: virCgroupNew (vircgroup.c:701)
==11680== by 0x49096F4: virCgroupEnableMissingControllers (vircgroup.c:1146)
==11680== by 0x4909B17: virCgroupNewMachineSystemd (vircgroup.c:1228)
==11680== by 0x4909E94: virCgroupNewMachine (vircgroup.c:1313)
==11680== by 0x1694FDBC: qemuInitCgroup (qemu_cgroup.c:946)
==11680== by 0x1695046B: qemuSetupCgroup (qemu_cgroup.c:1083)
==11680== by 0x16A60126: qemuProcessLaunch (qemu_process.c:7077)
==11680== by 0x16A61504: qemuProcessStart (qemu_process.c:7384)
==11680== by 0x169B84C2: qemuDomainObjStart (qemu_driver.c:6590)
==11680== by 0x169B8776: qemuDomainCreateWithFlags (qemu_driver.c:6641)
What happens is that new virCgroup is created and stored into
@parent. Then, if @tokens is not empty the for() loop is entered
into where another virCgroup is created and @parent is replaced
with this new virCgroup. But nothing freed the old @parent.
Fixes: 77291414c7a8745cf4d2b06d3c38d269cfbcfe32
Reported-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The use case VIR_ALLOC_VAR deals with is very unlikely. We had just 2
legitimate uses, which were reimplemented locally using g_malloc0 and
sizeof instead as they used a static number of members of the trailing
array.
Remove VIR_ALLOC_VAR since in most cases the direct implementation is
shorter and clearer and there are no users of it currently.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
In this case we need a 'struct ethtool_gfeatures' followed by two
'struct ethtool_get_features_block' so there's no risk of overflow.
Use g_malloc0 and sizeof() to allocate the memory instead of
VIR_ALLOC_VAR.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Users were replaced with virSecureEraseString with explicit freeing of
the memory.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
There are no users any more. The replacement is to use g_auto and
virSecureEraseString explicitly.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The macros are unused now and callers who care about clearing the memory
they use should use memset() appropriately.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Clear the key and IV structs using virSecureErase.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Clear out the value using virSecureErase and free it with g_free so
that VIR_DISPOSE_N can be phased out.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The module will provide functions for disposing secrets stored in
memory.
Note that for now it's implemented using memset, which is not really
secure.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The headers weren't removed after use of VIR_STRDUP was removed.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
"f + 0.5" does not round correctly for values very close to
".5" for every integer multiple, e.g. "0.499999975".
Found by clang-tidy's "bugprone-incorrect-roundings" check.
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This was found by clang-tidy's
"clang-analyzer-security.insecureAPI.bzero" check.
bzero is marked as deprecated ("LEGACY") in POSIX.1-2001 and
removed in POSIX.1-2008.
Besides its deprecation, bzero can be unsafe to use under certain
circumstances, e.g. when used to zero-out memory containing secrects.
These calls can be optimized away by the compiler, if it concludes no
further access happens to the memory, thus leaving the secrets still
in memory. Hence its classification as "insecureAPI".
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This section is guarded by "#ifndef WIN32" in line 2109--2808.
Found by clang-tidy's "readability-redundant-preprocessor" check.
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
posix_fallocate() might be not supported by a filesystem, for example,
it's not supported by ZFS. In that case it fails with
return code 22 (EINVAL), and thus safezero_posix_fallocate() returns -1.
As safezero_posix_fallocate() is the first function tried by safezero()
and it tries other functions only when it returns -2, it fails
immediately without falling back to other methods, such as
safezero_slow().
Fix that by returning -2 if posix_fallocate() returns EINVAL, to give
safezero() a chance to try other functions.
Signed-off-by: Roman Bogorodskiy <bogorodskiy@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The current virPCIDeviceNew() signature, receiving 4 uints in sequence
(domain, bus, slot, function), is not neat.
We already have a way to represent a PCI address in virPCIDeviceAddress
that is used in the code. Aside from the test files, most of
virPCIDeviceNew() callers have access to a virPCIDeviceAddress reference,
but then we need to retrieve the 4 required uints (addr.domain, addr.bus,
addr.slot, addr.function) to satisfy virPCIDeviceNew(). The result is
that we have extra verbosity/boilerplate to retrieve an information that
is already available in virPCIDeviceAddress.
A better way is presented by virNVMEDeviceNew(), where the caller just
supplies a virPCIDeviceAddress pointer and the function handles the
details internally.
This patch changes virPCIDeviceNew() to receive a virPCIDeviceAddress
pointer instead of 4 uints.
Reviewed-by: Laine Stump <laine@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
There is no need to open code the PCI address string format
when we have a function that does exactly that.
Reviewed-by: Laine Stump <laine@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
There are no other files using it. Move it and make the functions
static.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
virPCIGetNetName is used to get the name of the netdev associated with
a particular PCI device. This is used when we have a VF name, but need
the PF name in order to send a netlink command (e.g. in order to
get/set the MAC address of the VF).
In simple cases there is a single netdev associated with any PCI
device, so it is easy to figure out the PF netdev for a VF - just look
for the PCI device that has the VF listed in its "virtfns" directory;
the only name in the "net" subdirectory of that PCI device's sysfs
directory is the PF netdev that is upstream of the VF in question.
In some cases there can be more than one netdev in a PCI device's net
directory though. In the past, the only case of this was for SR-IOV
NICs that could have multiple PF's per PCI device. In this case, all
PF netdevs associated with a PCI address would be listed in the "net"
subdirectory of the PCI device's directory in sysfs. At the same time,
all VF netdevs and all PF netdevs have a phys_port_id in their sysfs,
so the way to learn the correct PF netdev for a particular VF netdev
is to search through the list of devices in the net subdirectory of
the PF's PCI device, looking for the one netdev with a "phys_port_id"
matching that of the VF netdev.
But starting in kernel 5.8, the NVIDIA Mellanox driver began linking
the VFs' representor netdevs to the PF PCI address [1], and so the VF
representor netdevs would also show up in the net
subdirectory. However, all of the devices that do so also only have a
single PF netdev for any given PCI address.
This means that the net directory of the PCI device can still hold
multiple net devices, but only one of them will be the PF netdev (the
others are VF representors):
$ ls '/sys/bus/pci/devices/0000:82:00.0/net'
ens1f0 eth0 eth1
In this case the way to find the PF device is to look at the
"phys_port_name" attribute of each netdev in sysfs. All PF devices
have a phys_port_name matching a particular regex
(p[0-9]+$)|(p[0-9]+s[0-9]+$)
Since there can only be one PF in the entire list of devices, once we
match that regex, we've found the PF netdev.
[1] - https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/
commit/?id=123f0f53dd64b67e34142485fe866a8a581f12f1
Co-Authored-by: Moshe Levi <moshele@nvidia.com>
Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com>
Reviewed-by: Adrian Chiris <adrianc@nvidia.com>
Reviewed-by: Laine Stump <laine@redhat.com>
This commit add virNetDevGetPhysPortName to read netdevice
phys_port_name from sysfs. It also refactor the code so
virNetDevGetPhysPortName and virNetDevGetPhysPortID will use
same method to read the netdevice sysfs.
Signed-off-by: Moshe Levi <moshele@nvidia.com>
Reviewed-by: Laine Stump <laine@redhat.com>
The code handles XML bits and internal definition and should be
in conf directory.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The code handles XML bits and internal definition and should be
in conf directory.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Same as virStorageFileBackend, it doesn't belong into util directory.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
It's used only by storage file code so it doesn't make sense to have
it in util directory.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Up until now we had a runtime code and XML related code in the same
source file inside util directory.
This patch takes the runtime part and extracts it into the new
storage_file directory.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This code is not directly relevant to virStorageSource so move it to
separate file.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This will allow following patches to move virStorageSource into conf
directory and virStorageDriverData into a new storage_file directory.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Since Hyper-V allows multiple VMs to be created with the same name,
some commands produce unpredictable results due to
hypervDomainLookupByName's WMI query selecting the wrong domain.
For example, this prevents `virsh dumpxml` from outputting XML for the
wrong domain.
Signed-off-by: Matt Coleman <matt@datto.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Extract common code as helper function virNetlinkTalk, then simplify
the functions virNetlink[DumpLink|NewLink|DelLink|GetNeighbor].
Signed-off-by: Shi Lei <shi_lei@massclouds.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Introduce a macro NETLINK_MSG_APPEND to wrap nlmsg_append and
simplify code. Remove those labels 'buffer_too_small', since they
are now useless.
Signed-off-by: Shi Lei <shi_lei@massclouds.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Move macros NETLINK_MSG_[NEST_START|NEST_END|PUT] from .h into .c;
within these macros, replace 'goto' with reporting error and returning;
simplify virNetlinkDumpLink and virNetlinkDelLink by using NETLINK_MSG_PUT.
Signed-off-by: Shi Lei <shi_lei@massclouds.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
NLM_F_CREATE and NLM_F_EXCL are invalid for RTM_DELLINK,
so remove them.
Signed-off-by: Shi Lei <shi_lei@massclouds.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
So far we assumed that any vhostuser interface is plugged into an
OVS bridge and thus 'ovs-vsctl' exists. But this is not always
true. In testing scenarios it is possible to create a vhostuser
interface with this tool dpdk-testpmd (part of dpdk RPM) which
creates/connects to UNIX socket needed for vhostuser. Of course,
since there is no OVS then there is no interface name in which
case virNetDevOpenvswitchGetVhostuserIfname() should return 0.
The rest of APIs that assume OVS are not 'fixed' because we still
want them to fail (e.g. getting statistics, plugging interface
into an OVS bridge, unplugging it from an OVS bridge, ...).
The only API that is fixed is
virNetDevOpenvswitchGetVhostuserIfname() because it is called
explicitly when starting a guest (and callers are okay if no name
was found).
The other way to fix this bug seems to be to simply require
'ovs-vsctl' on spec file level, but that is too heavy gun given
that vhostuser is used by a small set of our users (assumption
made on requirements for vhostuser). Also, this way would drag in
yet another dependency for all users (even those who want minimal
libvirt).
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1913156
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
dnsmasq usually prints out a version string like this:
Dnsmasq version 2.82 [...]
but a user reported that the build of dnsmasq included with pihole has
a version string like this:
Dnsmasq version pi-hole-2.81 [...]
We parse the dnsmasq version number to figure out if the dnsmasq
binary supports certain features. Since we expect the version number
(and it must be only numbers!) to start on the first non-space after
the string "Dnsmasq version", we fail to parse this format of the
version string.
Rather than spending a bunch of time trying to get pihole to change
that, we can just make our parsing more permissive - after searching
for "Dnsmasq version", we'll skip ahead to the first decimal digit,
rather than just the first non-space.
(NB: The features we're checking for purely by looking at version
number have been in all releases of dnsmasq since at least 2012, so we
could actually just remove the reading of the version number
completely. However it's possible (although *highly* unlikely)
that some new feature would be added to dnsmasq in the future and we
would need to add that code back.)
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/29
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This function skips over the beginning of a string until it reaches a
decimal digit (0-9) or the NULL at the end of the string. The original
pointer is modified in place (similar to virSkipSpaces()).
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
In certain specific cases it might be beneficial to be able to control
the metadata caching of storage image format drivers of a hypervisor.
Introduce XML machinery to set the maximum size of the metadata cache
which will be used by qemu's qcow2 driver.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Enable parsing of backing store strings containing the native 'nfs'
protocol specification.
Signed-off-by: Ryan Gahagan <rgahagan@cs.utexas.edu>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
'nfs_user'/'nfs_group' represents the XML configuration.
'nfs_uid'/'nfs_gid' is internal store when libvirt looks up the user's
uid/gid in the system.
Signed-off-by: Ryan Gahagan <rgahagan@cs.utexas.edu>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
virJSONValueObjectRemoveKey can be used as direct replacement. Fix the
one caller and remove the duplicate function.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The lookup didn't do anything apart from comparing the sysfs paths
anyway since that's what makes each mdev unique.
The most ridiculous usage of the old logic was in
virHostdevReAttachMediatedDevices where in order to drop an mdev
hostdev from the list of active devices we first had to create a new
mdev and use it in the lookup call. Why couldn't we have used the
hostdev directly? Because the hostdev and mdev structures are
incompatible.
The way mdevs are currently removed is via a write to a specific sysfs
attribute. If you do it while the machine which has the mdev assigned
is running, the write call may block (with a new enough kernel, with
older kernels it would return a write error!) until the device
is no longer in use which is when the QEMU process exits.
The interesting part here comes afterwards when we're cleaning up and
call virHostdevReAttachMediatedDevices. The domain doesn't exist
anymore, so the list of active hostdevs needs to be updated and the
respective hostdevs removed from the list, but remember we had to
create an mdev object in the memory in order to find it in the list
first which will fail because the write to sysfs had already removed
the mdev instance from the host system.
And so the next time you try to start the same domain you'll get:
"Requested operation is not valid: mediated device <path> is in use by
driver QEMU, domain <name>"
Fixes: https://gitlab.com/libvirt/libvirt/-/issues/119
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
virDeviceHasPCIExpressLink() wasn't checking that pcie_cap_pos was
valid before attempting to use it, which could lead to reading the
byte at offset 0 + PCI_CAP_ID_EXP instead of [valid offset] +
PCI_CAP_ID_EXP. In particular, this could happen for "integrated" PCI
devices (those that are on the PCIe root complex). If it happened that
the byte from the wrong address had the "right" bit set, then it would
lead to us innappropriately believing that Express Link info was
available when it wasn't, and the node device driver would then log an
error like this:
virPCIDeviceGetLinkCapSta:2754 :
internal error: pci device 0000:00:18.0 is not a PCI-Express device
during a libvirtd restart. (this didn't ever occur until after
virPCIDeviceIsPCIExpress() was made more intelligent in commit
c00b6b1ae, which hasn't yet been in any official release)
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>