"f + 0.5" does not round correctly for values very close to
".5" for every integer multiple, e.g. "0.499999975".
Found by clang-tidy's "bugprone-incorrect-roundings" check.
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This was found by clang-tidy's
"clang-analyzer-security.insecureAPI.bzero" check.
bzero is marked as deprecated ("LEGACY") in POSIX.1-2001 and
removed in POSIX.1-2008.
Besides its deprecation, bzero can be unsafe to use under certain
circumstances, e.g. when used to zero-out memory containing secrects.
These calls can be optimized away by the compiler, if it concludes no
further access happens to the memory, thus leaving the secrets still
in memory. Hence its classification as "insecureAPI".
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This section is guarded by "#ifndef WIN32" in line 2109--2808.
Found by clang-tidy's "readability-redundant-preprocessor" check.
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
posix_fallocate() might be not supported by a filesystem, for example,
it's not supported by ZFS. In that case it fails with
return code 22 (EINVAL), and thus safezero_posix_fallocate() returns -1.
As safezero_posix_fallocate() is the first function tried by safezero()
and it tries other functions only when it returns -2, it fails
immediately without falling back to other methods, such as
safezero_slow().
Fix that by returning -2 if posix_fallocate() returns EINVAL, to give
safezero() a chance to try other functions.
Signed-off-by: Roman Bogorodskiy <bogorodskiy@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The current virPCIDeviceNew() signature, receiving 4 uints in sequence
(domain, bus, slot, function), is not neat.
We already have a way to represent a PCI address in virPCIDeviceAddress
that is used in the code. Aside from the test files, most of
virPCIDeviceNew() callers have access to a virPCIDeviceAddress reference,
but then we need to retrieve the 4 required uints (addr.domain, addr.bus,
addr.slot, addr.function) to satisfy virPCIDeviceNew(). The result is
that we have extra verbosity/boilerplate to retrieve an information that
is already available in virPCIDeviceAddress.
A better way is presented by virNVMEDeviceNew(), where the caller just
supplies a virPCIDeviceAddress pointer and the function handles the
details internally.
This patch changes virPCIDeviceNew() to receive a virPCIDeviceAddress
pointer instead of 4 uints.
Reviewed-by: Laine Stump <laine@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
There is no need to open code the PCI address string format
when we have a function that does exactly that.
Reviewed-by: Laine Stump <laine@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
There are no other files using it. Move it and make the functions
static.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
virPCIGetNetName is used to get the name of the netdev associated with
a particular PCI device. This is used when we have a VF name, but need
the PF name in order to send a netlink command (e.g. in order to
get/set the MAC address of the VF).
In simple cases there is a single netdev associated with any PCI
device, so it is easy to figure out the PF netdev for a VF - just look
for the PCI device that has the VF listed in its "virtfns" directory;
the only name in the "net" subdirectory of that PCI device's sysfs
directory is the PF netdev that is upstream of the VF in question.
In some cases there can be more than one netdev in a PCI device's net
directory though. In the past, the only case of this was for SR-IOV
NICs that could have multiple PF's per PCI device. In this case, all
PF netdevs associated with a PCI address would be listed in the "net"
subdirectory of the PCI device's directory in sysfs. At the same time,
all VF netdevs and all PF netdevs have a phys_port_id in their sysfs,
so the way to learn the correct PF netdev for a particular VF netdev
is to search through the list of devices in the net subdirectory of
the PF's PCI device, looking for the one netdev with a "phys_port_id"
matching that of the VF netdev.
But starting in kernel 5.8, the NVIDIA Mellanox driver began linking
the VFs' representor netdevs to the PF PCI address [1], and so the VF
representor netdevs would also show up in the net
subdirectory. However, all of the devices that do so also only have a
single PF netdev for any given PCI address.
This means that the net directory of the PCI device can still hold
multiple net devices, but only one of them will be the PF netdev (the
others are VF representors):
$ ls '/sys/bus/pci/devices/0000:82:00.0/net'
ens1f0 eth0 eth1
In this case the way to find the PF device is to look at the
"phys_port_name" attribute of each netdev in sysfs. All PF devices
have a phys_port_name matching a particular regex
(p[0-9]+$)|(p[0-9]+s[0-9]+$)
Since there can only be one PF in the entire list of devices, once we
match that regex, we've found the PF netdev.
[1] - https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/
commit/?id=123f0f53dd64b67e34142485fe866a8a581f12f1
Co-Authored-by: Moshe Levi <moshele@nvidia.com>
Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com>
Reviewed-by: Adrian Chiris <adrianc@nvidia.com>
Reviewed-by: Laine Stump <laine@redhat.com>
This commit add virNetDevGetPhysPortName to read netdevice
phys_port_name from sysfs. It also refactor the code so
virNetDevGetPhysPortName and virNetDevGetPhysPortID will use
same method to read the netdevice sysfs.
Signed-off-by: Moshe Levi <moshele@nvidia.com>
Reviewed-by: Laine Stump <laine@redhat.com>
The code handles XML bits and internal definition and should be
in conf directory.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The code handles XML bits and internal definition and should be
in conf directory.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Same as virStorageFileBackend, it doesn't belong into util directory.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
It's used only by storage file code so it doesn't make sense to have
it in util directory.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Up until now we had a runtime code and XML related code in the same
source file inside util directory.
This patch takes the runtime part and extracts it into the new
storage_file directory.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This code is not directly relevant to virStorageSource so move it to
separate file.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This will allow following patches to move virStorageSource into conf
directory and virStorageDriverData into a new storage_file directory.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Since Hyper-V allows multiple VMs to be created with the same name,
some commands produce unpredictable results due to
hypervDomainLookupByName's WMI query selecting the wrong domain.
For example, this prevents `virsh dumpxml` from outputting XML for the
wrong domain.
Signed-off-by: Matt Coleman <matt@datto.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Extract common code as helper function virNetlinkTalk, then simplify
the functions virNetlink[DumpLink|NewLink|DelLink|GetNeighbor].
Signed-off-by: Shi Lei <shi_lei@massclouds.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Introduce a macro NETLINK_MSG_APPEND to wrap nlmsg_append and
simplify code. Remove those labels 'buffer_too_small', since they
are now useless.
Signed-off-by: Shi Lei <shi_lei@massclouds.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Move macros NETLINK_MSG_[NEST_START|NEST_END|PUT] from .h into .c;
within these macros, replace 'goto' with reporting error and returning;
simplify virNetlinkDumpLink and virNetlinkDelLink by using NETLINK_MSG_PUT.
Signed-off-by: Shi Lei <shi_lei@massclouds.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
NLM_F_CREATE and NLM_F_EXCL are invalid for RTM_DELLINK,
so remove them.
Signed-off-by: Shi Lei <shi_lei@massclouds.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
So far we assumed that any vhostuser interface is plugged into an
OVS bridge and thus 'ovs-vsctl' exists. But this is not always
true. In testing scenarios it is possible to create a vhostuser
interface with this tool dpdk-testpmd (part of dpdk RPM) which
creates/connects to UNIX socket needed for vhostuser. Of course,
since there is no OVS then there is no interface name in which
case virNetDevOpenvswitchGetVhostuserIfname() should return 0.
The rest of APIs that assume OVS are not 'fixed' because we still
want them to fail (e.g. getting statistics, plugging interface
into an OVS bridge, unplugging it from an OVS bridge, ...).
The only API that is fixed is
virNetDevOpenvswitchGetVhostuserIfname() because it is called
explicitly when starting a guest (and callers are okay if no name
was found).
The other way to fix this bug seems to be to simply require
'ovs-vsctl' on spec file level, but that is too heavy gun given
that vhostuser is used by a small set of our users (assumption
made on requirements for vhostuser). Also, this way would drag in
yet another dependency for all users (even those who want minimal
libvirt).
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1913156
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
dnsmasq usually prints out a version string like this:
Dnsmasq version 2.82 [...]
but a user reported that the build of dnsmasq included with pihole has
a version string like this:
Dnsmasq version pi-hole-2.81 [...]
We parse the dnsmasq version number to figure out if the dnsmasq
binary supports certain features. Since we expect the version number
(and it must be only numbers!) to start on the first non-space after
the string "Dnsmasq version", we fail to parse this format of the
version string.
Rather than spending a bunch of time trying to get pihole to change
that, we can just make our parsing more permissive - after searching
for "Dnsmasq version", we'll skip ahead to the first decimal digit,
rather than just the first non-space.
(NB: The features we're checking for purely by looking at version
number have been in all releases of dnsmasq since at least 2012, so we
could actually just remove the reading of the version number
completely. However it's possible (although *highly* unlikely)
that some new feature would be added to dnsmasq in the future and we
would need to add that code back.)
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/29
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This function skips over the beginning of a string until it reaches a
decimal digit (0-9) or the NULL at the end of the string. The original
pointer is modified in place (similar to virSkipSpaces()).
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
In certain specific cases it might be beneficial to be able to control
the metadata caching of storage image format drivers of a hypervisor.
Introduce XML machinery to set the maximum size of the metadata cache
which will be used by qemu's qcow2 driver.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Enable parsing of backing store strings containing the native 'nfs'
protocol specification.
Signed-off-by: Ryan Gahagan <rgahagan@cs.utexas.edu>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
'nfs_user'/'nfs_group' represents the XML configuration.
'nfs_uid'/'nfs_gid' is internal store when libvirt looks up the user's
uid/gid in the system.
Signed-off-by: Ryan Gahagan <rgahagan@cs.utexas.edu>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
virJSONValueObjectRemoveKey can be used as direct replacement. Fix the
one caller and remove the duplicate function.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The lookup didn't do anything apart from comparing the sysfs paths
anyway since that's what makes each mdev unique.
The most ridiculous usage of the old logic was in
virHostdevReAttachMediatedDevices where in order to drop an mdev
hostdev from the list of active devices we first had to create a new
mdev and use it in the lookup call. Why couldn't we have used the
hostdev directly? Because the hostdev and mdev structures are
incompatible.
The way mdevs are currently removed is via a write to a specific sysfs
attribute. If you do it while the machine which has the mdev assigned
is running, the write call may block (with a new enough kernel, with
older kernels it would return a write error!) until the device
is no longer in use which is when the QEMU process exits.
The interesting part here comes afterwards when we're cleaning up and
call virHostdevReAttachMediatedDevices. The domain doesn't exist
anymore, so the list of active hostdevs needs to be updated and the
respective hostdevs removed from the list, but remember we had to
create an mdev object in the memory in order to find it in the list
first which will fail because the write to sysfs had already removed
the mdev instance from the host system.
And so the next time you try to start the same domain you'll get:
"Requested operation is not valid: mediated device <path> is in use by
driver QEMU, domain <name>"
Fixes: https://gitlab.com/libvirt/libvirt/-/issues/119
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
virDeviceHasPCIExpressLink() wasn't checking that pcie_cap_pos was
valid before attempting to use it, which could lead to reading the
byte at offset 0 + PCI_CAP_ID_EXP instead of [valid offset] +
PCI_CAP_ID_EXP. In particular, this could happen for "integrated" PCI
devices (those that are on the PCIe root complex). If it happened that
the byte from the wrong address had the "right" bit set, then it would
lead to us innappropriately believing that Express Link info was
available when it wasn't, and the node device driver would then log an
error like this:
virPCIDeviceGetLinkCapSta:2754 :
internal error: pci device 0000:00:18.0 is not a PCI-Express device
during a libvirtd restart. (this didn't ever occur until after
virPCIDeviceIsPCIExpress() was made more intelligent in commit
c00b6b1ae, which hasn't yet been in any official release)
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The only reason why virstoragefile.h needs to be included in virfile.h
is that virFileNBDDeviceAssociate() takes virStorageFileFormat argument.
The function doesn't need the enum value as it converts the value to
string and uses only that.
Change the argument to string which will allow us to remove that
include.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
All these headers are indirectly included provided by virfile.h having
virstoragefile.h which will be removed in the following patch.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The function doesn't take virStorageSource as argument and has nothing
in common with virStorageSource or storage file.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Function virQEMUBuildQemuImgKeySecretOpts is not used anywhere else
so there is no need to have it in util.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The last usage outside of tests was removed by commit
<780f8c94ca8b3dee7eb59c1bfbc32f672f965df8>.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The last user was removed by commit
<40f0e0348dfc84f28a500e262c4953b0d3b44fa0>.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
When adding a new lease by our leaseshelper then virLeaseNew() is
called. Here, we check for DNSMASQ_LEASE_EXPIRES environment
variable which is the expiration time for the lease. For infinite
lease time the value is zero. However, our code is not prepared
for that and adds "expiry-time" into the JSON file only if lease
expiry time is non-zero. This breaks the assumption that the
"expiry-time" attribute is always present (as can be seen in
virLeaseReadCustomLeaseFile() and virLeasePrintLeases()).
Store "expiry-time" always.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
In virLeaseNew() we are trying to remove trailing space (per
comment it may happen that older versions of dnsmasq put it into
an env variable). Well, instead of open coding it, we can use
virTrimSpaces().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
There are some variables which are used only inside the single
loop the function has. Let's declare them inside the loop body to
make that obvious. Also, fix indendation.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
During testing of my patch v6.10.0-rc1~221 it was found that
'ovs-vsctl get Interface $name name' or
'ovs-vsctl find Interface options:vhost-server-path=$path'
may return a string in double quotes, e.g. "vhost-user1". Later
investigation of openvswitch code showed, that early versions
(like 1.3.0) have somewhat restrictive set of safe characters
(isalpha() || '_' || '-' || '.'), which is then refined with
increasing version. For instance, version 2.11.4 has: isalnum()
|| '_' || '-' || '.'. If the string that ovs-vsctl wants to
output contains any other character it is escaped. You want to be
looking at ovsdb_atom_to_string() which handles outputting of a
single string and calls string_needs_quotes() and possibly
json_serialize_string() in openvswitch code base.
Since the interfaces are usually named "vhost-userN" we are
facing a problem where with one version we get the name in double
quotes and with another we get plain name without funny business.
Because of json involved I thought, let's make ovs-vsctl output
into JSON format and then use our JSON parser, but guess what -
ovs-vsctl ignores --format=json. But with a little help of
g_strdup_printf() it can be turned into JSON.
Fixes: e4c29e2904
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1767013
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
In v6.10.0-rc1~221 I wanted to make virNetDevOpenvswitchGetVhostuserIfname()
lookup interface name even for vhostuser interfaces with mode='server'. For
these, we are given a socket path which is then created by QEMU and to which
OpenVSwitch connects to and creates an interface. Because of this, we don't
know the name of the interface upfront (when starting QEMU) and have to use
the path to query OpenVSwitch later (using ovs-vsctl). What I intended to use
was:
ovs-vsctl --no-headings --columns=name find Interface options:vhost-server-path=$path
But what my code does is:
ovs-vsctl --no-headings --columns=name find Interface options:vhost-server-path=path
and it's all because the argument to the function is named "path"
which I then enclosed in double quotes while it should have been
used as a variable.
Fixes: e4c29e2904
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1767013
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
The comment about auto-generating names was obsoleted by recent
changes, and there was an unnecessary set of braces around a single
line conditional body.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Since commit 282d135ddb the parser for <interface> has cleared out
any interface name from the input XML that used the macvtap/macvlan
name as a prefix. Along with that, the switch to use the new
virNetDevGenerateName() function for auto-generating macvtap/macvlan
device names (commit 9b5d741a9), has realized two facts:
1) virNetDevGenerateName() can be called with a name already filled
in, and in that case it is an effective NOP.
2) because virNetDevGenerate() will always find an unused name, there
is no need to retry device creation in a loop - if it fails the
first time, it would fail any subsequent time as well.
that, combined with the aforementioned parser change allow us to
simplify virNetDevMacVLanCreateWithVPortProfile() - we no longer need
any extra code to determine if a template "AutoName" was requested,
and don't need a separate code path for creating the device in the
case that a specific name was given in the XML - all we need to do is
log any requested name, and then call exactly the same code as we
would if no name was given.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The Linux implementation of virNetDevCreate() doesn't require a
template ifname (e.g. "vnet%d") when it is called, but just generates
a new name if ifname is empty. The FreeBSD implementation requires
that the caller actually fill in a template ifname, and will fail if
ifname is empty. Since we want to eliminate all the special code in
callers that is setting the template name, we need to make the
behavior of the FreeBSD virNetDevCreate() match the behavior of the
Linux virNetDevCreate().
The simplest way to do this is to use the new virNetDevGenerateName()
function - if ifname is empty it generates a new name with the proper
prefix, and if it's not empty, it leaves it alone.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
When netlink is supported, use netlink to create veth device pair
rather than 'ip link' command.
Signed-off-by: Shi Lei <shi_lei@massclouds.com>
Reviewed-by: Laine Stump <laine@redhat.com>
Simplify virNetDevVethCreate by using common GenerateName/ReserveName
functions.
Signed-off-by: Shi Lei <shi_lei@massclouds.com>
Reviewed-by: Laine Stump <laine@redhat.com>
Simplify ReserveName/GenerateName for macvlan and macvtap by using
common functions.
Signed-off-by: Shi Lei <shi_lei@massclouds.com>
Reviewed-by: Laine Stump <laine@redhat.com>
Simplify GenerateName/ReserveName for netdevtap by using common
functions.
Signed-off-by: Shi Lei <shi_lei@massclouds.com>
Reviewed-by: Laine Stump <laine@redhat.com>
Extract ReserveName/GenerateName from netdevtap and netdevmacvlan as
common helper functions.
Signed-off-by: Shi Lei <shi_lei@massclouds.com>
Reviewed-by: Laine Stump <laine@redhat.com>
Until now there has been an extra bit of code in
qemuDomainDeviceCalculatePCIConnectFlag() (one of the two callers of
virPCIDeviceIsPCIExpress()) that tries to determine if a device is
PCIe by looking at the *length* of its sysfs config file; it only does
this when libvirt is running as a non-root process.
This patch takes advantage of our newfound ability to tell the
difference between "I read a 0 from the device PCI config file" and "I
couldn't read the PCI Express Capabilities because I don't have
sufficient permission" to put the file length check down in
virPCIDeviceIsPCIExpress(), and do that check any time we fail while
reading the config file (not only when the process is non-root).
Fixes: https://bugzilla.redhat.com/1901685
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Previously there was no way to differentiate between this function 1)
encountering an error while reading the pci config, and 2) determining
that the device in question is a conventional PCI device, and so has
no Express Capabilities.
The difference between these two conditions is important, because an
unprivileged libvirtd will be unable to read all of the pci config (it
can only read the first 64 bytes, and will get ENOENT when it tries to
seek past that limit) even though the device is in fact a PCIe device.
This patch changes virPCIDeviceFindCapabilityOffset() to put the
determined offset into an argument of the function (rather than
sending it back as the return value), and to return the standard "0 on
success, -1 on failure". Failure is determined by checking the value
of errno after each attemptd read of the config file (which can only
work reliably if errno is reset to 0 before each read, and after
virPCIDeviceFindCapabilityOffset() has finished examining it).
(NB: if the config file is read successfully, but no Express
Capabilities are found, then the function returns success, but the
returned offset will be 0 (which is an impossible offset for Express
Capabilities, and so easily recognizeable).
An upcoming patch will take advantage of the change made here.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The new message is more verbose/useful, but only logged at debug level
instead of as a warning (since it could easily happen in a non-error
situation).
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This function returned an int, but would only return 0 or 1, and the
one place it was called would just use !! to convert that value to a
bool. Change the function to directly return bool instead.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This function returned an int, and that int was being checked for < 0
in its solitary caller, but within the function it would only ever
return 0 or 1. Change the function itself to return a bool, and the
caller to just directly set the flag in the virPCIDevice.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
We previous added code for passing FDs which was explicitly derived from
gnulib's passfd code:
commit 17460825f3
Author: Daniel P. Berrangé <berrange@redhat.com>
Date: Fri Jan 17 11:57:17 2020 +0000
src: implement APIs for passing FDs over UNIX sockets
This is a simplified variant of gnulib's passfd module
without the portability code that we do not require.
while the license was unchanged, we mistakenly failed to copy the FSF
copyright header which is required by the license terms.
Reported-by: Bruno Haible <bruno@clisp.org>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This macro checks whether given number is an integer power of
two. At the same time, I've identified two places where we check
for pow2 and I'm replacing them with the macro.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Tested-by: Han Han <hhan@redhat.com>
The only test we do when checking for UUID validity is that
whether all bytes are the same (invalid UUID) or not (valid
UUID). The algorithm we use is needlessly complicated.
Also, the checked UUID is not modified and hence the argument can
be of 'const' type.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Tested-by: Han Han <hhan@redhat.com>
In this previous commit:
commit 65491a2dfe
Author: Martin Kletzander <mkletzan@redhat.com>
Date: Thu Nov 12 13:58:53 2020 +0100
Do not disable incompatible-pointer-types-discards-qualifiers
We selectively rewrite G_DEFINE_TYPE to avoid warnings about
mismatched volatile/non-volatile pointers that appeared with
CLang when using GLib2 >= 2.67
We have now just hit the reverse problem, GCC >= 11 has started
warning about mismatched volatile/non-volatile pointers but only
with GLib2 < 2.67. The new GLib2 avoids the warning, as does
older GCC.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Since 032548c4 @cmd was never autofree'd. Perhaps as a result of
VIR_AUTOPTR type changes occurring at roughly the same time so the
copy pasta missed this.
Found by Coverity.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Now that no one uses VIR_AUTOSTRINGLIST it can be dropped.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Glib provides g_auto(GStrv) which is in-place replacement of our
VIR_AUTOSTRINGLIST.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This will open an opportunity to modernize virDomainDiskDefParseXML()
in the next patch.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Just like virCommandPassFD, but it also returns an index of
the passed FD in the FD set.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The virJSONValueObjectGetStringArray() function is given a @key
which is supposed to be an array inside given @object. Well, if
it's not then an error state is returned (NULL), but no error
message is set.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The original logic is incorrect. We would delete the device entry
from eBPF map only if the newval would be same as current val in the
map. In case that the device was allowed only as read-only but later
we remove all permissions for that device it would remain in the table
with empty values.
The old code would still deny the device but it's not working as
intended. Instead we will update the value in advance. If the updated
value is 0 it means that we are removing all permissions so it should
be removed from the map, otherwise we will update the value in map.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1810356
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Kernel commit <d505b8af58912ae1e1a211fabc9995b19bd40828> added proper
check for cpu quota maximum limit to prevent internal overflow.
Even though this change is not present in all kernels it makes sense
to enforce the same limit in libvirt.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1750315
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Tested-by: Han Han <hhan@redhat.com>
I previously did a workaround for a glib event loop race
that causes crashes:
commit 0db4743645
Author: Daniel P. Berrangé <berrange@redhat.com>
Date: Tue Jul 28 16:52:47 2020 +0100
util: avoid crash due to race in glib event loop code
it turns out that the workaround has a significant performance
penalty on I/O intensive workloads. We thus need to avoid the
workaround if we know we have a new enough glib to avoid the
race condition.
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Tested-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
When libvirt added support for firewalld, we were unable to use
firewalld's higher level rules, because they weren't detailed enough
and could not be applied to the iptables FORWARD or OUTPUT chains
(only to the INPUT chain). Instead we changed our code so that rather
than running the iptables/ip6tables/ebtables binaries ourselves, we
would send these commands to firewalld as "passthrough commands", and
firewalld would run the appropriate program on our behalf.
This was done under the assumption that firewalld was somehow tracking
all these rules, and that this tracking was benefitting proper
operation of firewalld and the system in general.
Several years later this came up in a discussion on IRC, and we
learned from the firewalld developers that, in fact, adding iptables
and ebtables rules with firewalld's passthrough commands actually has
*no* advantage; firewalld doesn't keep track of these rules in any
way, and doesn't use them to tailor the construction of its own rules.
Meanwhile, users have been complaining for some time that whenever
firewalld is restarted on a system with libvirt virtual networks
and/or nwfilter rules active, the system logs would be flooded with
warning messages whining that [lots of different rules] could not be
deleted because they didn't exist. For example:
firewalld[3536040]: WARNING: COMMAND_FAILED:
'/usr/sbin/iptables -w10 -w --table filter --delete LIBVIRT_OUT
--out-interface virbr4 --protocol udp --destination-port 68
--jump ACCEPT' failed: iptables: Bad rule
(does a matching rule exist in that chain?).
(See https://bugzilla.redhat.com/1790837 for many more examples and a
discussion)
Note that these messages are created by iptables, but are logged by
firewalld - when an iptables/ebtables command fails, firewalld grabs
whatever is in stderr of the program, and spits it out to the system
log as a warning. We've requested that firewalld not do this (and
instead leave it up to the calling application to do the appropriate
logging), but this request has been respectfully denied.
But combining the two problems above ( 1) firewalld doesn't do
anything useful when you use it as a proxy to add/remove iptables
rules, 2) firewalld often insists on logging lots of
annoying/misleading/useless "error" messages when you use it as a
proxy to remove iptables rules that don't already exist), leads to a
solution - simply stop using firewalld to add and remove iptables
rules. Instead, exec iptables/ip6tables/ebtables directly in the same
way we do when firewalld isn't active.
We still need to keep track of whether or not firewalld is active, as
there are some things that must be done, e.g. we need to add some
actual firewalld rules in the firewalld "libvirt" zone, and we need to
take notice when firewalld restarts, so that we can reload all our
rules.
This patch doesn't remove the infrastructure that allows having
different firewall backends that perform their functions in different
ways, as that will very possibly come in handy in the future when we
want to have an nftables direct backend, and possibly a "pure"
firewalld backend (now that firewalld supports more complex rules, and
can add those rules to the FORWARD and OUTPUT chains). Instead, it
just changes the action when the selected backend is "firewalld" so
that it adds rules directly rather than through firewalld, while
leaving as much of the existing code intact as possible.
In order for tests to still pass, virfirewalltest also had to be
modified to behave in a different way (i.e. by capturing the generated
commandline as it does for the DIRECT backend, rather than capturing
dbus messages using a mocked dbus API).
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
When it is starting up, firewalld will delete all existing iptables
rules and chains before adding its own rules. If libvirtd were to try
to directly add iptables rules during the time before firewalld has
finished initializing, firewalld would end up deleting the rules that
libvirtd has just added.
Currently this isn't a problem, since libvirtd only adds iptables
rules via the firewalld "passthrough command" API, and so firewalld is
able to properly serialize everything. However, we will soon be
changing libvirtd to add its iptables and ebtables rules by directly
calling iptables/ebtables rather than via firewalld, thus removing the
serialization of libvirtd adding rules vs. firewalld deleting rules.
This will especially apparent (if we don't fix it in advance, as this
patch does) when libvirtd is responding to the dbus NameOwnerChanged
event, which is used to learn when firewalld has been restarted. In
that case, dbus sends the event before firewalld has been able to
complete its initialization, so when libvirt responds to the event by
adding back its iptables rules (with direct calls to
/usr/bin/iptables), some of those rules are added before firewalld has
a chance to do its "remove everything" startup protocol. The usual
result of this is that libvirt will successfully add its private
chains (e.g. LIBVIRT_INP, etc), but then fail when it tries to add a
rule jumping to one of those chains (because in the interim, firewalld
has deleted the new chains).
The solution is for libvirt to preface it's direct calling to iptables
with a iptables command sent via firewalld's passthrough command
API. Since commands sent to firewalld are completed synchronously, and
since firewalld won't service them until it has completed its own
initialization, this will assure that by the time libvirt starts
calling iptables to add rules, that firewalld will not be following up
by deleting any of those rules.
To minimize the amount of extra overhead, we request the simplest
iptables command possible: "iptables -V" (and aside from logging a
debug message, we ignore the result, for good measure).
(This patch is being done *before* the patch that switches to calling
iptables directly, so that everything will function properly with any
fractional part of the series applied).
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Even though *we* don't call ebtables/iptables/ip6tables (yet) when the
firewalld backend is selected, firewalld does, so these binaries need
to be there; let's check for them. (Also, the patch after this one is
going to start execing those binaries directly rather than via
firewalld).
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
iptables and ip6tables have had a "-w" commandline option to grab a
systemwide lock that prevents two iptables invocations from modifying
the iptables chains since 2013 (upstream commit 93587a04 in
iptables-1.4.20). Similarly, ebtables has had a "--concurrent"
commandline option for the same purpose since 2011 (in the upstream
ebtables commit f9b4bcb93, which was present in ebtables-2.0.10.4).
Libvirt added code to conditionally use the commandline option for
iptables/ip6tables in upstream commit ba95426d6f (libvirt-1.2.0,
November 2013), and for ebtables in upstream commit dc33e6e4a5
(libvirt-1.2.11, November 2014) (the latter actually *re*-added the
locking for iptables/ip6tables, as it had accidentally been removed
during a refactor of firewall code in the interim).
I say "conditionally" because a check was made during firewall module
initialization that tried executing a test command with the
-w/--concurrent option, and only continued using it for actual
commands if that test command completed successfully. At the time the
code was added this was a reasonable thing to do, as it had been less
than a year since introduction of -w to iptables, so many distros
supported by libvirt were still using iptables (and possibly even
ebtables) versions too old to have the new commandline options.
It is now 2020, and as far as I can discern from repology.org (and
manually examining a RHEL7.9 system), every version of every distro
that is supported by libvirt now uses new enough versions of both
iptables and ebtables that they all have support for -w/--concurrent.
That means we can finally remove the conditional code and simply
always use them.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
All the unit tests that use iptables/ip6tables/ebtables have been
written to omit the locking/exclusive use primitive on the generated
commandlines. Even though none of the tests actually execute those
commands (and so it doesn't matter for purposes of the test whether or
not the commands support these options), it still made sense when some
systems had these locking options and some didn't.
We are now at a point where every supported Linux distro has supported
the locking options on these commands for quite a long time, and are
going to make their use non-optional. As a first step, this patch uses
the virFirewallSetLockOverride() function, which is called at the
beginning of all firewall-related tests, to set all the bools
controlling whether or not the locking options are used to true. This
means that all the test cases must be updated to include the proper
locking option in their commandlines.
The change to make actual execs of the commands unconditionally use
the locking option will be in an upcoming patch - this one affects
only the unit tests.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Commit 912c6b22fc added abort() when the
'val' parameter is NULL along with setting the error variable for the
command. We don't want to abort in this case, just set the error.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Using virtCgroupNewSelf() is not correct with cgroups v2 because the
the virt-host-validate process is executed from from the same cgroup
context as the terminal and usually not all controllers are enabled
by default.
To do a proper check we need to use the root cgroup to see what
controllers are actually available. Libvirt or systemd ensures that
all controllers are available for VMs as well.
This still doesn't solve the devices controller with cgroups v2 where
there is no controller as it was replaced by eBPF. Currently libvirt
tries to query eBPF programs which usually works only for root as
regular users will get permission denied for that operation.
Fixes: https://gitlab.com/libvirt/libvirt/-/issues/94
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This reverts commit b3710e9a2a.
That check is very valuable for our code, but it causes issue with glib >=
2.67.0 when building with clang.
The reason is a combination of two commits in glib, firstly fdda405b6b1b which
adds a g_atomic_pointer_{set,get} variants that enforce stricter type
checking (by removing an extra cast) for compilers that support __typeof__, and
commit dce24dc4492d which effectively enabled the new variant of glib's atomic
code for clang. This will not be necessary when glib's issue #600 [0] (8 years
old) is fixed. Thankfully, MR #1719 [1], which is supposed to deal with this
issue was opened 3 weeks ago, so there is a slight sliver of hope.
[0] https://gitlab.gnome.org/GNOME/glib/-/issues/600
[1] https://gitlab.gnome.org/GNOME/glib/-/merge_requests/1719
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Introduced by commit <22494556542c676d1b9e7f1c1f2ea13ac17e1e3e> which
fixed a CVE.
If the @path passed to virDMSanitizepath() is not a DM name or not a
path to DM name this function could return incorrect sanitized path as
it would always be the first device under /dev/mapper/.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
While it's certainly good to log events like "failed to close fd"
and "tried to close invalid fd", which are likely to be the
consequence of some bug in libvirt, logging a message every single
time a file descriptor is closed successfully is perhaps excessive
and can lead to useful information being missed among the noise.
Log filters don't help in this situation, because filtering out all
of util.file is too big a hammer and would cause important messages
to be left out as well.
To give an idea of just how much noise this single debug statement
can cause, here's a real life example from a quite large libvirtd
log I had to look at recently:
$ grep virFile libvirt.log | wc -l
1307
$ grep virFile libvirt.log | grep -v 'Closed fd' | wc -l
343
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Before commit 24d8968c, virDirClose took a DIR**, and that was never
NULL, so its declaration included ATTRIBUTE_NONNULL(1). Since that
commit, virDirClose takes a DIR*, and it may be NULL (e.g. if the DIR*
is initialized to NULL and was never closed).
Even though virDirClose() is currently only called implicitly (as the
cleanup for a g_autoptr(DIR)), and (as I've just newly learned) the
autocleanup function g_autoptr will only be called if the pointer in
question is non-null (see the definition of
_GLIB_AUTOPTR_CLEAR_FUNC_NAME in
/usr/include/glib-2.0/glib/gmacros.h), it does still cause Coverity to
complain that it *could* be called with a NULL, and it's also possible
that in the future someone might add code that explicitly calls
virDirClose.
To eliminate the Coverity complaints, and protect against the
hypothetical future where someone both explicitly calls virDirClose()
with a potentially NULL value, *and* re-enables the nonnull directive
when not building with Coverity (disabled by commit eefb881) this
patch removes the ATTRIBUTE_NONNULL(1) from the declaration of
virDirClose().
Fixes: 24d8968cd0
Reported-by: John Ferlan <jferlan@redhat.com>
Details-Research-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Laine Stump <laine@redhat.com>
After e4c29e2904 the function has one argument more and the
argument that can't be NULL moved from second to third position.
Reported-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Every time we create new virCommand of OVS_VSCTL it must be
followed by virNetDevOpenvswitchAddTimeout() call which adds the
--timeout=X argument to freshly created cmd. Instead of having
this as two separate function calls it can be just one.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
There are two types of vhostuser ports:
dpdkvhostuser - OVS creates the socket and QEMU connects to it
dpdkvhostuserclient - QEMU creates the socket and OVS connects to it
But of course ovs-vsctl syntax for fetching ifname is different.
So far, we've implemented the former. The lack of implementation
for the latter means that we are not detecting the interface name
and thus not reporting it in domain XML, or failing to get
interface statistics.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1767013
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
We need to pass some flags in order to properly initialize the
connection otherwise it will not work. This copies what GLib does
for g_bus_get_sync() internally.
This fixes an issue with LXC driver where libvirt was not able to
register any VM with machined.
Reported-by: Matthias Maier <tamiko@gentoo.org>
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When using .path() for an argument to a python script meson will not
setup dependancies on the file. This means that changes to the generator
script will not trigger a rebiuld
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This code will be used to signal cases when the checkpoint is broken
either during backup or other operations where a user might want to make
decision based on the presence of the checkpoint, such as do a full
backup instead of an incremental one.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The 'qemu_migration_cookie' module uses these. Provide a stable override
for tests.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The virGDBusBusInit is supposed to return a reference to
requested bus type (system/session) or, if non-shared bus is
requested then create a new bus of the type. As an argument, it
gets a double pointer to GError which is passed to all g_dbus_*()
calls which allocate it on failure. Pretty standard approach.
However, since it is a double pointer we must dereference the
first level to see if the value is NULL. IOW:
if (*error)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The only caller of this function ignores failure
and just sets the unique_id to -1.
Failing to read the file is likely to the device no longer
being present, not a real error.
Stop reporting errors in this function.
https://bugzilla.redhat.com/show_bug.cgi?id=1692100
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
For functions which have reasonable replacement, let's encourage usage
of g_hash_table_ alternatives.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Matt Coleman <matt@datto.com>
Don't hide our use of GHashTable behind our typedef. This will also
promote the use of glibs hash function directly.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Matt Coleman <matt@datto.com>
Glib's hash table provides basically the same functionality as our hash
table.
In most cases the only thing that remains in the virHash* wrappers is
NULL-checks of '@table' argument as glib's hash functions don't tolerate
NULL.
In case of iterators, we adapt the existing API of iterators to glibs to
prevent having rewrite all callers at this point.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Matt Coleman <matt@datto.com>
We didn't use it rigorously and some helpers even cast it away. Remove
const from all hash utility functions.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Matt Coleman <matt@datto.com>
Convert all calls to virHashForEach where it's not obvious that the
callback is _not_ deleting the current element from the hash to
virHashForEachSafe which will be deemed safe to do such operation.
Now that no iterator used with virHashForEach deletes current element we
can document that virHashForEach must not touch the hash table in any
way.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Matt Coleman <matt@datto.com>
'virHashForEach' historically allowed deletion of the current element as
'virHashRemoveSet' didn't exist. To prevent us from having to deeply
analyse all iterators add virHashForEachSafe which first gets a list of
elements and iterates them outside of the hash table.
This will allow replace the internals of the hash table with other
implementation which don't allow such operation.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Matt Coleman <matt@datto.com>
The simplest way to write tests is to check the output against expected
output, but we must ensure that the output is stable. We can use
virHashForEachSorted as a hash iterator to ensure stable ordering.
This patch fixes 3 instances of hash iteration which is tested in
various parts, including test output changes in appropriate places.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Matt Coleman <matt@datto.com>
Iterate the hash elements sorted by key. This is useful to provide a
stable ordering such as in cases when the output is checked in tests.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Matt Coleman <matt@datto.com>
All but one of the callers either use the list in arbitrary order or
sorted by key. Rewrite the function so that it supports sorting by key
natively and make it return the element count. This in turn allows to
rewrite the only caller to sort by value internally.
This allows to remove multiple sorting functions which were sorting by
key and the function will be also later reused for some hash operations
internally.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Matt Coleman <matt@datto.com>
Commit <99d2c6519ad18651b5959fa0a3366bcb2c1e44f3> removed parameter
from the function but did not modified ATTRIBUTE_NONNULL.
Reported-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Remove mix of array length and error code in the return code.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.ibm.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Extract virPCIGetMdevTypes from PCI as virMediatedDeviceGetMdevTypes
into mdev for later reuse.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.ibm.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Now that every caller to copyPlacement doesn't pass absolute path there
is no need to have a condition to handle that case.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The old code passed an absolute path to virCgroupNewFromParent() which
is not necessary. The code can take the current placement of parent
cgroup and append a relative path.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Use virStringSplit() to get the list of directories needed to be
created. This improves readability of the code and stops passing
absolute path to virCgroupNewFromParent().
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Currently this task is done by virCgroupCopyPlacement when the @path
starts with "/".
virCgroupNew is always called with @path starting with "/" and there is
no parent to copy path from. To make it obvious what the code is doing
introduce new helper.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This function is relevant only with cgroups v1 where it creates
hierarchy for controllers that are not managed by systemd. PID is used
to detect a placement of current process but in this situation we are
building the hierarchy for already known placement.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The current code uses virCgroupNew() as a single point of entry and
calls into virCgroupDetect() as well. Both have logic for several paths
which is difficult to figure out.
Extract the actually used code path from the two functions to make
it obvious what's happening in this case.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The current code uses virCgroupNew() as a single point of entry and
calls into virCgroupDetect() as well. Both have logic for several paths
which is difficult to figure out.
Extract the actually used code path from the two functions to make
it obvious what's happening in this case.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
It is only used for debug and error purposes which can be easily
replaced by @placement.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
With cgroups v2 working with controllers is a bit more complicated then
with cgroups v1 where the controller had to be mounted.
There are two files, cgroups.controllers and cgroup.subtree_control.
The file cgroup.controllers lists all controllers enabled in the current
cgroup and cgroups.subtree_control, as the name suggest, controls which
controllers are enabled for a subtree of cgroups.
Now the issue here is that the current code doesn't make any difference
if the @parent variable is NULL or not because ../cgroup.subtree_control
will list the same controllers as ./cgroup.controllers.
The whole point of the @parent variable is when we are building the
cgroup topology ourselves without systemd help we need to detect which
controllers are enabled in the parent cgroup in order to enable them for
the current cgroup as well and for that we need to check
cgroup.controllers of the parent group.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
When libvirtd starts a VM it internally stores a path to the main
cgroup. When we restart libvirtd we should get to the same state.
When we start a VM on host with systemd the cgroup is created for us and
the process is already placed into that cgroup and we detect the path
created by systemd using /proc/$PID/cgroup. After that we create
sub-cgroups and move all threads there.
Once libvirtd is restarted we again detect the cgroup path using
/proc/$PID/cgroup, but in this case we will get a different path because
the main thread was moved to a "emulator" cgroup.
Instead of ignoring the "emulator" directory when validating cgroups
remove it completely when detecting cgroup otherwise cgroups will not
work properly when libvirtd is restarted.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
With cgroups v2 the file cgroup.procs will never be empty if threading
is enabled as it will always have ID of all processes even if all
threads of the processes are moved to sub-cgroups. If that happens the
file cgroup.threads will be empty.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
After converting all DIR* to g_autoptr(DIR), many cleanup: labels
ended up just having "return ret", and every place that set ret would
just immediately goto cleanup. Remove the cleanup label and its
return, and just return the set value immediately, thus eliminating
the need for the return variable itself.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Once the DIR* in virPCIGetName() was made g_autoptr, the cleanup:
label just had a "return ret;", but the rest of the function was more
compilcated than it needed to be, doing funky things with the value of
ret inside multi-level conditionals and a while loop that might exit
early via a break with ret == 0 or exit early via a goto cleanup with
ret == -1.
It really didn't need to be nearly as complicated. After doing the
trivial replacements of "goto cleanup" with appropriate direct
returns, it became obvious that:
1) the outermost level of the nested conditional at the end of the
function ("if (ret < 0)") was now redundant, since ret is now
*always* < 0 by that point (otherwise the function has returned).
2) by switching the sense of the next level of the conditional (making
it "if (!physPortID)", the "else" (which is now just "return 0;"
becomes the "if", and the new "else" no longer needs to be inside
the conditional.
3) the value of firstEntryName can be moved into *netname with
g_steal_pointer()
Once that is all done, ret is no longer used and can be removed.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Since every single use of DIR* was converted to use g_autoptr, this
function is not currently needed. Even if someone comes up with a
usage for a non-g_autoptr DIR* in the future, they can just use
virDirClose(), since there is no longer a semantic difference between
the two (VIR_DIR_CLOSE() previously had an extra & on the pointer so
that it could be transparently passed as a DIR** to virDirClose(), but
that was removed several commits back.)
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
All of these conversions are trivial - VIR_DIR_CLOSE() (aka
virDirClose()) is called only once on the DIR*, and it happens just
before going out of scope.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
In order to make a usable g_autoptr(DIR), we need to have a close
function that is a NOP when the pointer is NULL, but takes a simple
DIR*. But virDirClose() (candidate to be the g_autoptr cleanup
function) currently takes a DIR**, not DIR*. It does this so that it
can clear the pointer, thus making it safe to call virDirClose on the
same DIR multiple times.
In the past the clearing of the DIR* was essential in a few places,
but those few places have now been changed, so we can modify
virDirClose() to take a DIR*, and remove the side effect of clearing
the DIR*. This will make it directly usable as the g_autoptr cleanup,
and will mean that this:
{
DIR *dirp = NULL;
blah blah ...
VIR_DIR_CLOSE(dirp)
}
is functionally identical to
{
g_autoptr(DIR) dirp = NULL;
blah blah ...
}
which will make conversion to using g_autoptr mechanical and simple to review.
(Note that virDirClose() will still check for NULL before attempting
to close, so that it can always be safely called, as long as the DIR*
was initialized to NULL (another prerequisite of becoming a g_autoptr
cleanup function)
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
DIR *dh is being re-used each time through the for loop of this
function, so it must be closed and then re-opened, which means we
can't convert it to g_autoptr. By moving the definition of dh inside
the for loop, we make it possible to trivially convert to g_autoptr
(which will happen in a subsequent patch)
NB: VIR_DIR_CLOSE() is already called at the bottom of the for loop,
so removing the VIR_DIR_CLOSE() at the end of the function is *not*
creating a leak of a DIR*!
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
This will make it easier to review upcoming patches that use g_autoptr
to auto-close all DIRs.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
The cpu mask was free()'d immediately on any error and at the end of the
function, where it was expected that it would either error out and return or
goto another allocation if the code was to fail. However since commit
9514e24984 the error path did not return in one new case which caused
double-free in such situation. In order to make the code more straightforward
just free the mask after it's been used even before checking the return code of
the call.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1819801
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Since we use virHashTable for string-keyed values only, we can remove
all the callbacks which allowed universal keys.
Code which wishes to use non-string keys should use glib's GHashTable.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
All users of virHashTable pass strings as the name/key of the entry.
Make this an official requirement by turning the variables to 'const
char *'.
For any other case it's better to use glib's GHashTable.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The only place we call it is in virHashNew. Move the code to virHashNew
and remove virHashCreateFull.
Code wishing to use non-strings as hash table keys will be better off
using glib's GHashTable directly.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
It doesn't make much sense to configure the bucket count in the hash
table for each case specifically. Replace all calls of virHashCreate
with virHashNew which has a pre-set size and remove virHashCreate
completely.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Remove 'cleanup' label and simplify remembering of the returned value
from the callback.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Rewrite using GHashTable which already has interfaces for using a number
as hash key. Glib's implementation doesn't copy the key by default, so
we need to allocate it, but overal the interface is more suited for this
case.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
It's used only in one place in tests which isn't even automatically
evaluated.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
virCgroupKillRecursive sneakily initializes 'ret' to 0 rather than the
usual -1. 401030499b moved an error condition but didn't actually
modify 'ret' return the proper error code.
Fixes: 401030499b
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
virPCIDeviceAddressGetSysfsFile() is simpler to call.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
These were nops once enough cleanup was g_auto'd.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
thisPhysPortID is only used inside a conditional, so reduce its scope
to just the body of that conditional, which will eliminate the need
for the undesirable manual VIR_FREE().
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This function had a loop that was only executed twice; it was
artificially constructed with a label, a goto, and a boolean to tell
that it had already been executed once. Aside from that, the body of
the loop contained only two lines that needed to be repeated (the
second time through, everything beyond those two lines would be
skipped).
One side effect of this strange loop was that a g_autofree string was
manually freed and re-initialized; I've been told that manually
freeing a g_auto_free object is highly discouraged.
This patch refactors the function to simply repeat the 2 lines that
might possibly be executed twice, thus eliminating the ugly use of
goto to construct a loop, and also takes advantage of the fact that
virPCIDriverDir() was previously returning *exactly* the same string
both times it was called to eliminate the manual VIR_FREE of drvpath.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
There is no need for a temporary variable in this function, and since
it can't return NULL, no need for callers to check for it.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
There is no need for a temporary variable in this function, and ever
since we switched to glib for memory allocation, there is no possibility
it can return NULL, so callers don't need to check for it.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
When this function was recently changed to add in parsing of
IFLA_VF_STATS, I noticed that the checks for existence of IFLA_VF_MAC
and IFLA_VF_VLAN were looking in the *wrong array*. The array that
contains the results of parsing each IFLA_VFINFO in
tb[IFLA_VFINFO_LIST] is tb_vf, but we were checking for these in tb
(which is the array containing the results of the toplevel parsing of
the netlink message, *not* the results of parsing one of the nested
IFLA_VFINFO's.
This incorrect code has been here since the function was originally
written in 2012. It has only worked all these years due to coincidence
- the items at those indexes in tb are IFLA_ADDRESS and IFLA_BROADCAST
(of the *PF*, not of any of its VFs), and those happen to always be
present in the toplevel netlink message; since we are only looking in
the incorrect place to check for mere existence of the attribute (but
are doing the actual retrieval of the attribute from the correct
place), this bug has no real consequences other than confusing anyone
trying to understand the code.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
virNetDevParseVfConfig has became a multifunctional helper function,
rename it to virNetDevParseVfInfo.
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Reviewed-by: Laine Stump <laine@redhat.com>
libvirt can retrieve traffic stats for emulated interfaces that are
backed by tap or macvtap devices, but this information wasn't
available for hostdev interfaces (those that are implemented by
assigning an SR-IOV VF device to a guest using vfio):
#virsh domifstat instance --interface=52:54:00:2d:b2:35
error: Failed to get interface stats instance 52:54:00:2d:b2:35
error: internal error: Interface name not provided
For some SR-IOV VF devices this information is available via the
netlink VFINFO_LIST request/response, and that is what this patch uses
to implement stats retrieval for VF. Not that this is dependent on
support in the PF driver - for example, the Mellanox ConnectX-4 Lx
(mlx5) driver reports usable stats, while Intel 82599 (ixgbe) and
82576 (igb) just report all stats as 0. (this is the same result as
"ip -s link show").
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Reviewed-by: Laine Stump <laine@redhat.com>
By default, pfifo_fast queueing discipline (qdisc) is set on
newly created interfaces (including TAPs). This qdisc has three
queues and packets that want to be sent through given NIC are
placed into one of the queues based on TOS field. Queues are then
emptied based on their priority allowing interactive sessions
stay interactive whilst something else is downloading a large
file.
Obviously, this means that kernel has to be involved and some
locking has to happen (when placing packets into queues). If
virtualization is taken into account then the above algorithm
happens twice - once in the guest and the second time in the
host.
This is arguably not optimal as it burns host CPU cycles
needlessly. Guest already made it choice and sent packets in the
order it wants.
To resolve this, Linux kernel offers 'noqueue' qdisc which can be
applied on virtual interfaces and in fact for 'lo' it is by
default:
lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue
Set it for other TAP devices we create for domains too. With this
change I was able to squeeze 1Mbps more from a macvtap attached
to a guest and to my 1Gbps LAN (as measured by iperf3).
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1329644
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
This helper changes the root qdisc on given interface.
Ideally, it would be written using netlink but my attempts to
write the code were not successful and thus I've fallen back to
virCommand() + tc.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Currently setting max_len=0 causes virtlogd to spin in a busy loop. It
is natural to allow this to disable log rollover which can be useful for
developers debugging things.
Note disabling rollover exposes the host to denial of service from a
malicious guest, so must be used with care.
Closes https://gitlab.com/libvirt/libvirt/-/issues/85
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The aim of virSocketAddrPrefixToNetmask() is to initialize passed
virSocketAddr structure based on prefix length and family.
However, it doesn't set all members in the struct which may lead
to reads of uninitialized values:
==15421== Use of uninitialised value of size 8
==15421== at 0x50F297A: _itoa_word (in /lib64/libc-2.31.so)
==15421== by 0x510C8FE: __vfprintf_internal (in /lib64/libc-2.31.so)
==15421== by 0x5120295: __vsnprintf_internal (in /lib64/libc-2.31.so)
==15421== by 0x50F8969: snprintf (in /lib64/libc-2.31.so)
==15421== by 0x51BB602: getnameinfo (in /lib64/libc-2.31.so)
==15421== by 0x496DEE0: virSocketAddrFormatFull (virsocketaddr.c:486)
==15421== by 0x496DD9F: virSocketAddrFormat (virsocketaddr.c:444)
==15421== by 0x11871F: networkDnsmasqConfContents (bridge_driver.c:1404)
==15421== by 0x1118F5: testCompareXMLToConfFiles (networkxml2conftest.c:48)
==15421== by 0x111BAF: testCompareXMLToConfHelper (networkxml2conftest.c:112)
==15421== by 0x112679: virTestRun (testutils.c:142)
==15421== by 0x111D09: mymain (networkxml2conftest.c:144)
==15421== Uninitialised value was created by a stack allocation
==15421== at 0x1175D2: networkDnsmasqConfContents (bridge_driver.c:1056)
All callers expect the function to initialize the structure
fully.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
As preparation for g_autoptr() we need to change the function to take
only virCgroupPtr.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
We don't use the lib prefix for all libraries but in these cases it
makes sense to use the prefix.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Validation is usually performed on an entire document. If we are only
interested in validating a single nested node that can occur in
different contexts, this would require writing different schemas for
any of those different contexts.
By temporarily replacing the document's root node, we can validate the
relevant node only.
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>