This reverts commit a5e1602090.
Getting rid of unistd.h from our headers will require more work than
just fixing the broken mingw build. Revert it until I have a more
complete proposal.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
util/virutil.h bogously included unistd.h. Drop it and replace it by
including it directly where needed.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
'viralloc.h' does not provide any type or macro which would be necessary
in headers. Prevent leakage of the inclusion.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Only PCI devices have '/sys/class/net/<ifname>/device/resource' so we
need to skip this check for all other network devices.
Without this patch and RDMA enabled libvirt will not detect any network
device that doesn't have the path above which includes 'lo', 'virbr',
'tun', etc.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1639258
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
For consistency, let's use the semicolon for all definitions.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Missing semicolon at the end of macros can confuse some analyzers
(like cppcheck <filename>), and we have a mix of semicolon and
non-semicolon usage through the code. Let's standardize on using
a semicolon for VIR_ENUM_IMPL calls.
Move the verify() statement to the end of the macro and drop
the semicolon, so the compiler will require callers to add a
semicolon.
While we are touching these call sites, standardize on putting
the closing parenth on its own line, as discussed here:
https://www.redhat.com/archives/libvir-list/2019-January/msg00750.html
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
This is a return argument that is to be compared against NULL on
successful return. However, it is not initialized and therefore
relies on callers setting it to NULL prior calling the function.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Removing redundant sections of the code
Signed-off-by: Radoslaw Biernacki <radoslaw.biernacki@linaro.org>
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
libvirt wrongly assumes that VF netdev has to have the
netdev assigned to PF. There is no such requirement in SRIOV standard.
This patch change the virNetDevSwitchdevFeature() function to deal
with SRIOV devices which does not have netdev on PF. Also corrects
one comment about PF netdev assumption.
One example of such devices is ThunderX VNIC.
By applying this change, VF device is used for virNetlinkCommand() as
it is the only netdev assigned to VNIC.
Signed-off-by: Radoslaw Biernacki <radoslaw.biernacki@linaro.org>
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In many files there are header comments that contain an Author:
statement, supposedly reflecting who originally wrote the code.
In a large collaborative project like libvirt, any non-trivial
file will have been modified by a large number of different
contributors. IOW, the Author: comments are quickly out of date,
omitting people who have made significant contribitions.
In some places Author: lines have been added despite the person
merely being responsible for creating the file by moving existing
code out of another file. IOW, the Author: lines give an incorrect
record of authorship.
With this all in mind, the comments are useless as a means to identify
who to talk to about code in a particular file. Contributors will always
be better off using 'git log' and 'git blame' if they need to find the
author of a particular bit of code.
This commit thus deletes all Author: comments from the source and adds
a rule to prevent them reappearing.
The Copyright headers are similarly misleading and inaccurate, however,
we cannot delete these as they have legal meaning, despite being largely
inaccurate. In addition only the copyright holder is permitted to change
their respective copyright statement.
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
There seems to be no need to add the ignore_value wrapper or
caste with (void) to the unlink() calls, so let's just remove
them. I assume at one point in time Coverity complained. So,
let's just be consistent - those that care to check the return
status can and those that don't can just have the naked unlink.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
nlmsg_append from the libnl library provides exactly the same
functionality, so we should rely on that instead. This also allows us to
drop the aforementioned function completely.
Signed-off-by: Shi Lei <shi_lei@massclouds.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
There's a single user for it which takes an existing
virPCIDeviceAddress, passes its various bits to the
function which in turn constructs a virPCIDevice and
then copies the string representation for the caller
to use: we can use virPCIDeviceAddressAsString()
instead and avoid creating the virPCIDevice in the
first place. Since the function ends up having no
users after the change, we can just drop it.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Commits 7b706f33ac and 4acb7887e4 introduced some compound type *Free
wrappers in order to use them with VIR_DEFINE_AUTOPTR_FUNC. However,
since those were not used in the code right away, Clang complained about
unused functions (static ones that are defined by the macro above).
This patch puts the defined functions in use.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Using the new VIR_DEFINE_AUTOPTR_FUNC macro defined in
src/util/viralloc.h, define a new wrapper around an existing
cleanup function which will be called when a variable declared
with VIR_AUTOPTR macro goes out of scope. Also, drop the redundant
viralloc.h include, since that has moved from the source module into
the header.
When variables of type virNetDevRxFilterPtr and virNetDevMcastEntryPtr
are declared using VIR_AUTOPTR, the functions virNetDevRxFilterFree
and virNetDevMcastEntryFree, respectively, will be run
automatically on them when they go out of scope.
Signed-off-by: Sukrit Bhatnagar <skrtbhtngr@gmail.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Currently, the functions return a pointer to the
destination buffer on success or NULL on failure.
Not only does this kind of error handling look quite
alien in the context of libvirt, where most functions
return zero on success and a negative int on failure,
but it's also somewhat pointless because unless there's
been a failure the returned pointer will be the same
one passed in by the user, thus offering no additional
value.
Change the functions so that they return an int
instead.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
virStrncpy() allows us to copy a substring, but if we're
going to copy the entire thing it's much more convenient
to use virStrcpy() instead.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
This convenience macro was created for the simple cases
where the length of the source string and the size of the
destination buffer can be figued out with strlen() and
sizeof() respectively, so we should use it wherever
possible instead of open-coding parts of it.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
This makes it easier to see why libvirt has decided it must re-attach
a tap device to its bridge.
Signed-off-by: Laine Stump <laine@laine.org>
ACKed-by: Michal Privoznik <mprivozn@redhat.com>
Commit 8708ca01c added virNetDevSwitchdevFeature() to check if a network
device has Switchdev capabilities. virNetDevSwitchdevFeature() attempts
to retrieve the PCI device associated with the network device, ignoring
non-PCI devices. It does so via the following call chain
virNetDevSwitchdevFeature()->virNetDevGetPCIDevice()->
virPCIGetDeviceAddressFromSysfsLink()
For non-PCI network devices (qeth, Xen vif, etc),
virPCIGetDeviceAddressFromSysfsLink() will report an error when
virPCIDeviceAddressParse() fails. virPCIDeviceAddressParse() also
logs an error. After commit 8708ca01c there are now two errors reported
for each non-PCI network device even though the errors are harmless.
To avoid the errors, introduce virNetDevIsPCIDevice() and use it in
virNetDevGetPCIDevice() before attempting to retrieve the associated
PCI device. virNetDevIsPCIDevice() uses the 'subsystem' property of the
device to determine if it is PCI. See the sysfs rules in kernel
documentation for more details
https://www.kernel.org/doc/html/latest/admin-guide/sysfs-rules.html
Right-aligning backslashes when defining macros or using complex
commands in Makefiles looks cute, but as soon as any changes is
required to the code you end up with either distractingly broken
alignment or unnecessarily big diffs where most of the changes
are just pushing all backslashes a few characters to one side.
Generated using
$ git grep -El '[[:blank:]][[:blank:]]\\$' | \
grep -E '*\.([chx]|am|mk)$$' | \
while read f; do \
sed -Ei 's/[[:blank:]]*[[:blank:]]\\$/ \\/g' "$f"; \
done
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Commit id '8708ca01c' added a check to determine whether the NIC had
Switchdev capabilities; however, in doing so inadvertently would cause
network devices without a PCI device to not be added to the node device
database. Thus, network devices having a "computer" as a parent, such
as "net_lo*", "net_virbr*", "net_tun*", "net_vnet*", etc. were not added.
Alter the check to not even check for Switchdev bits if no PCI device found.
After commit 8708ca01c0 libvirtd consistently aborts with "stack
smashing detected" when nodedev driver is initialized.
This is caused by nlmsg_parse() being told that its array of nlattr*
has CTRL_CMD_MAX (10) entries, when in fact it is declared to have
CTRL_ATTR_MAX (8) entries. Since all the entries are initialized to
NULL, the result is that nlmsg_parse is overwriting 2*(sizof(nlattr*))
bytes outside the array.
Signed-off-by: Laine Stump <laine@laine.org>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Instead of checking for all possible constants that every
kernel header with devlink support should have (and defining
HAVE_DECL_DEVLINK as 1 if any of them is present due to the
way AC_CHECK_DECLS works), only check for DEVLINK_CMD_ESWITCH_GET.
This is the name of the constant since kernel 4.11. Between 4.8
and 4.11, the now deprecated spelling DEVLINK_CMD_ESWITCH_MODE_GET
was used.
Assume DEVLINK_ESWITCH_MODE_SWITCHDEV is available, since it was
introduced along with the deprecated spelling.
Adding functionality to libvirt that will allow querying the interface
for the availability of switchdev Offloading NIC capabilities.
The switchdev mode was introduced in kernel 4.8, the iproute2-devlink
command to retrieve the switchdev NIC feature with command example:
devlink dev eswitch show pci/0000:03:00.0
This feature is needed for Openstack so we can do a scheduling decision
if the NIC is in Hardware Offload (switchdev) or regular SR-IOV (legacy) mode.
And select the appropriate hypervisors with the requested capability see [1].
[1] - https://specs.openstack.org/openstack/nova-specs/specs/pike/approved/enable-sriov-nic-features.html
Reviewed-by: Laine Stump <laine@laine.org>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Commit 81fb440b further qualified an if statement by adding the
boolean saveVlan to the condition. Coverity pointed out that this
change in the logic eliminated the need to check saveVlan in an
argument to virAsprintf().
When using a VF from an SRIOV-capable network card in a guest (either
in macvtap passthrough mode, or via VFIO PCI device assignment), The
associated PF netdev must be online in order for the VF to be usable
by the guest. The guest, however, is not able to change the state of
the PF. And libvirt *could* set the PF online as needed, but that
could lead to the host receiving unexpected IPv6 traffic (since the
default for an unconfigured interface is to participate in IPv6
autoconf). For this reason, before assigning a VF to a guest, libvirt
verifies that the related PF netdev is online - if it isn't, then we
log an error and don't allow the guest startup to continue.
Until now, this check was done during virNetDevSetNetConfig(). This
works nicely because the same function is called both for macvtap
passthrough and for VFIO device assignment. But in the case of VFIO,
the VF has already been unbound from its netdev driver by the time we
get to virNetDevSetNetConfig(), and in the case of dual port Mellanox
NICs that have their VFs setup in single port mode, the *only* way to
determine the proper PF netdev to query for online status is via the
"phys_port_id" file that is in the VF netdev's sysfs directory. *BUT*
if we've unbound the VF from the netdev driver, then it doesn't *have*
a netdev sysfs directory.
So, in order to check the correct PF netdev for online status, this
patch moved the check earlier in the setup, into
virNetDevSaveNetConfig(), which is called *before* unbinding the VF
from its netdev driver.
(Note that this implies that if you are using VFIO device assignment
for the VFs of a Mellanox NIC that has the VFs programmed in single
port mode, you must let the VFs be bound to their net driver and use
"managed='yes'" in the device definition. To be more specific, this is
only true if the VFs in single port mode are using port *2* of the PF
- if the VFs are using only port 1, then the correct PF netdev will be
arrived at by default/chance))
This resolves: https://bugzilla.redhat.com/267191
virHostdevRestoreNetConfig() calls virNetDevReadNetConfig() to try and
read the "original config" of a netdev, and if that fails, it tries
again with a different directory/netdev name. This achieves the
desired effect (we end up finding the config wherever it may be), but
for each failure, virNetDevReadNetConfig() places a nice error message
in the system logs. Experience has shown that false-positive error
logs like this lead to erroneous bug reports, and can often mislead
those searching for *real* bugs.
This patch changes virNetDevReadNetConfig() to explicitly check if the
file exists before calling virFileReadAll(); if it doesn't exist,
virNetDevReadNetConfig() returns a success, but leaves all the
variables holding the results as NULL. (This makes sense if you define
the purpose of the function as "read a netdev's config from its config
file *if that file exists*).
To take advantage of that change, the caller,
virHostdevRestoreNetConfig() is modified to fail immediately if
virNetDevReadNetConfig() returns an error, and otherwise to try the
different directory/netdev name if adminMAC & vlan & MAC are all NULL
after the preceding attempt.
This patch updates functions in netdev.c to pay attention to
phys_port_id. It uses the new function virNetDevGetPhysPortID() to
learn the phys_port_id of a VF or PF, then sends that info to
virPCIGetNetName(), which has newly been modified to take an optional
phys_port_id.
A single PCI device may have multiple netdevs associated with it. Each
of those netdevs will have a different phys_port_id entry in
sysfs. This patch modifies virPCIGetNetName() to allow selecting one
of the potential many netdevs in two different ways:
1) by setting the "idx" argument, the caller can select the 1st (0),
2nd (1), etc. netdev from the PCI device's net subdirectory.
2) If the physPortID arg is set (to a null-terminated string) then
virPCIGetNetName() returns the netdev that has that phys_port_id in
the sysfs file of the same name in the netdev's directory.
On Linux each network device *can* (but not necessarily *does*) have
an attribute called phys_port_id which can be read from the file of
that name in the netdev's sysfs directory. The examples I've seen have
been a many-digit hexadecimal number (as an ASCII string).
This value can be useful when a single PCI device is associated with
multiple netdevs (e.g a dual port Mellanox SR-IOV NIC - this card has
a single PCI Physical Function (PF), and that PF has two netdevs
associated with it (the "net" subdirectory of the PF in sysfs has two
links rather than the usual single link to a netdev directory). Each
of the PF netdevs has a different phys_port_id. The Virtual Functions
(VF) are similar - the PF (a PCI device) has "n" VFs (also each of
these is a PCI device), each VF has two netdevs, and each of the VF
netdevs points back to the VF PCI device (with the "device" entry in
its sysfs directory) as well as having a phys_port_id matching the PF
netdev it is associated with.
virNetDevGetPhysPortID() simply attempts to read the phys_port_id for
the given netdev and return it to the caller. If this particular
netdev driver doesn't support phys_port_id, it returns NULL (*not* a
NULL-terminated string, but a NULL pointer) but still counts it as a
success.
Change the settings from qemuDomainUpdateDeviceLive() as otherwise the
call would succeed even though nothing has changed.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1414627
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Currently, virNetDevSetCoalesce() stub is always returning error. As
it's used by virNetDevTapCreateInBridgePort(), it essentially breaks
bridged networking if coalesce is not supported.
To make it work, relax the stub to trigger error only when its
coalesce argument is not NULL, otherwise report success.
Commit f4ef3a71 made a variation of virNetDevSetMAC that would return
without logging an error message if errno was set to
EADDRNOTAVAIL. This errno is set by some SRIOV VF drivers (in
particular igbvf) when they fail to set the device's MAC address due
to the PF driver refusing the request. This is useful if we want to
try a different method of setting the VF MAC address before giving up
(Commit 86556e16 actually does this, setting the desired MAC address
to the "admin MAC in the PF, then detaching and reattaching the VF
netdev driver to force a reinit of the MAC address).
During testing of Bug 1442040 t was discovered that the ixgbe driver
returns EPERM in this situation, so this patch changes the exception
case for silent+non-terminal failure to account for this difference.
Completes resolution to: https://bugzilla.redhat.com/1415609 (RHEL 7.4)
https://bugzilla.redhat.com/1442040 (RHEL 7.3.z)
The current fallback stub for virNetDevSetCoalesce is inside an
earlier conditional block. This deals with the feature being
missing on older Linux platforms. We need a second fallback stub
though, outside the top level conditional, to ensure builds work
on Win32/FreeBSD platforms too.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
That function is able to configure coalesce settings for an interface,
similarly to 'ethtool -C'. This function also updates back the
structure so that it contains actual data on the device (if the device
doesn't support some settings kernel might just return 0 and not set
whatever is not supported), so this way we'll have up-to-date
information in the live domain XML.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
If if_indextoname is not defined, the whole function using it should
not be defined either. Add stub to fix build on mingw.
Caused by 5dd607059d
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Having this information available will make it easier to determine the
culprit when MAC or vlan tag appear to not be set, eg.:
https://bugzilla.redhat.com/1364073
(This patch doesn't fix that bug, just makes it easier to diagnose)
If an SRIOV VF has previously been used for VFIO device assignment,
the "admin MAC" that is stored in the PF driver's table of VF info
will have been set to the MAC address that the virtual machine wanted
the device to have. Setting the admin MAC for a VF also sets a flag in
the PF that is loosely called the "administratively set" flag. Once
that flag is set, it is no longer possible for the net driver of the
VF (either on the host or in a virtual machine) to directly set the
VF's MAC again; this flag isn't reset until the *PF* driver is
restarted, and that requires taking *all* VFs offline, so it's not
really feasible to do.
If the same SRIOV VF is later used for macvtap passthrough mode, the
VF's MAC address must be set, but normally we don't unbind the VF from
its host net driver (since we actually need the host net driver in
this case). Since setting the VF MAC directly will fail, in the past
"we" ("I") had tried to fix the problem by simply setting the admin MAC
(via the PF) instead. This *appeared* to work (and might have at one
time, due to promiscuous mode being turned on somewhere or something),
but it currently creates a non-working interface because only the
value for admin MAC is set to the desired value, *not* the actual MAC
that the VF is using.
Earlier patches in this series reverted that behavior, so that we once
again set the MAC of the VF itself for macvtap passthrough operation,
not the admin MAC. But that brings back the original bug - if the
interface has been used for VFIO device assignment, you can no longer
use it for macvtap passthrough.
This patch solves that problem by noticing when virNetDevSetMAC()
fails for a VF, and in that case it sets the desired MAC to the admin
MAC via the PF, then "bounces" the VF driver (by unbinding and the
immediately rebinding it to the VF). This causes the VF's MAC to be
reinitialized from the admin MAC, and everybody is happy (until the
*next* time someone wants to set the VF's MAC address, since the
"administratively set" bit is still turned on).
Some PF drivers allow setting the admin MAC (that is the MAC address
that the VF will be initialized to the next time the VF's driver is
loaded) to 00:00:00:00:00:00, and some don't. Multiple drivers
initialize the admin MACs to all 0, but don't allow setting it to that
very same value. It has been an uphill battle convincing the driver
people that it's reasonable to expect The argument that's used is
that an all 0 device MAC address on a device is invalid; however, from
an outsider's point of view, when the admin MAC is set to 0 at the
time the VF driver is loaded, the VF's MAC is *not* set to 0, but to a
random non-0 value. But that's beside the point - even if I could
convince one or two SRIOV driver maintainers to permit setting the
admin MAC to 0, there are still several other drivers.
So rather than fighting that losing battle, this patch checks for a
failure to set the admin MAC due to an all 0 value, and retries it
with 02:00:00:00:00:00. That won't result in a random value being set
in the VF MAC at next VF driver init, but that's okay, because we
always want to set a specific value anyway. Rather, the "almost 0"
setting makes it easy to visually detect from the output of "ip link
show" which VFs are currently in use and which are free.
The global functions virNetDevReplaceMacAddress(),
virNetDevReplaceNetConfig(), virNetDevRestoreMacAddress(), and
virNetDevRestoreNetConfig() are no longer used, as their functionality
has been replaced by virNetDev(Save|Read|Set)NetConfig().
The static functions virNetDevReplaceVfConfig() and
virNetDevRestoreVfConfig() were only used by the above-named global
functions that were removed.
These three functions are destined to replace
virNetDev(Replace|Restore)NetConfig() and
virNetDev(Replace|Restore)MacAddress(), which both do the save and set
together as a single step. We need to separate the save, read, and set
steps because there will be situations where we need to do something
else in between (in particular, we will need to rebind a VF's driver
after save but before set).
This patch creates the new functions, but doesn't call them - that
will come in a subsequent patch. Note that the new functions to
read/write the file that stores the original network config now uses
JSON rather than plaintext (it still recognizes the old format as well
though, so it won't get confused during an upgrade).
Fix typo in virNetDevPFGetVF() stub:
ATTRUBUTE_UNUSED -> ATTRIBUTE_UNUSED.
While here, use common indent style for arguments in
virNetDevGetVirtualFunctionIndex() stub.
Given an SRIOV PF netdev name (e.g. "enp2s0f0") and VF#, this new
function returns the netdev name of the referenced VF device
(e.g. "enp2s11f6"), or NULL if the device isn't bound to a net driver.