Having this information available will make it easier to determine the
culprit when MAC or vlan tag appear to not be set, eg.:
https://bugzilla.redhat.com/1364073
(This patch doesn't fix that bug, just makes it easier to diagnose)
If an SRIOV VF has previously been used for VFIO device assignment,
the "admin MAC" that is stored in the PF driver's table of VF info
will have been set to the MAC address that the virtual machine wanted
the device to have. Setting the admin MAC for a VF also sets a flag in
the PF that is loosely called the "administratively set" flag. Once
that flag is set, it is no longer possible for the net driver of the
VF (either on the host or in a virtual machine) to directly set the
VF's MAC again; this flag isn't reset until the *PF* driver is
restarted, and that requires taking *all* VFs offline, so it's not
really feasible to do.
If the same SRIOV VF is later used for macvtap passthrough mode, the
VF's MAC address must be set, but normally we don't unbind the VF from
its host net driver (since we actually need the host net driver in
this case). Since setting the VF MAC directly will fail, in the past
"we" ("I") had tried to fix the problem by simply setting the admin MAC
(via the PF) instead. This *appeared* to work (and might have at one
time, due to promiscuous mode being turned on somewhere or something),
but it currently creates a non-working interface because only the
value for admin MAC is set to the desired value, *not* the actual MAC
that the VF is using.
Earlier patches in this series reverted that behavior, so that we once
again set the MAC of the VF itself for macvtap passthrough operation,
not the admin MAC. But that brings back the original bug - if the
interface has been used for VFIO device assignment, you can no longer
use it for macvtap passthrough.
This patch solves that problem by noticing when virNetDevSetMAC()
fails for a VF, and in that case it sets the desired MAC to the admin
MAC via the PF, then "bounces" the VF driver (by unbinding and the
immediately rebinding it to the VF). This causes the VF's MAC to be
reinitialized from the admin MAC, and everybody is happy (until the
*next* time someone wants to set the VF's MAC address, since the
"administratively set" bit is still turned on).
Some PF drivers allow setting the admin MAC (that is the MAC address
that the VF will be initialized to the next time the VF's driver is
loaded) to 00:00:00:00:00:00, and some don't. Multiple drivers
initialize the admin MACs to all 0, but don't allow setting it to that
very same value. It has been an uphill battle convincing the driver
people that it's reasonable to expect The argument that's used is
that an all 0 device MAC address on a device is invalid; however, from
an outsider's point of view, when the admin MAC is set to 0 at the
time the VF driver is loaded, the VF's MAC is *not* set to 0, but to a
random non-0 value. But that's beside the point - even if I could
convince one or two SRIOV driver maintainers to permit setting the
admin MAC to 0, there are still several other drivers.
So rather than fighting that losing battle, this patch checks for a
failure to set the admin MAC due to an all 0 value, and retries it
with 02:00:00:00:00:00. That won't result in a random value being set
in the VF MAC at next VF driver init, but that's okay, because we
always want to set a specific value anyway. Rather, the "almost 0"
setting makes it easy to visually detect from the output of "ip link
show" which VFs are currently in use and which are free.
The global functions virNetDevReplaceMacAddress(),
virNetDevReplaceNetConfig(), virNetDevRestoreMacAddress(), and
virNetDevRestoreNetConfig() are no longer used, as their functionality
has been replaced by virNetDev(Save|Read|Set)NetConfig().
The static functions virNetDevReplaceVfConfig() and
virNetDevRestoreVfConfig() were only used by the above-named global
functions that were removed.
It takes longer to explain this than to fix it...
In the past we weren't able to save the VF's own MAC address *at all*
when using it for hostdev assignment, because we had already unbound
the VF from the host net driver prior to saving its config. With the
previous patch, that problem has been solved, so we now have the VF's
MAC address saved and can move on to the *next* problem, which is twofold:
1) during teardown we restore the config before we've re-bound, so the
VF doesn't have a net driver, and thus we can't set its MAC address
directly.
2) even if we delay restoring the config until the VF is bound to a
net driver, the request to set its MAC address would fail, since
(during device setup) we had set the "admin MAC" for the VF via an
RTM_SETLINK to the PF - once you've set the admin MAC for a VF, the
VF driver (either on host or on guest) is not allowed to change the
VF's MAC address "forever" (well, until you reload the PF driver,
but that requires destroying and recreating every single VF, which
isn't something you can require).
The solution is to keep the restoration of config at the same place,
but to set the *admin MAC* to the address you want the VF to have -
when the VF net driver is later initialized (as a part of re-binding
to the VF net driver) its MAC will be initialized to the current value
of the admin MAC.
In order to properly restore the original state of an SRIOV VF when
we're finished with it, we need to save the MAC address of the VF
itself (not just the admin MAC address for the VF that is stored in
the PF). But that can only be done when the VF is still bound to the
host's netdev driver, and we have always done the saving of device
config after the VF is already bound to vfio-pci. This patch prepares
us for adding a save of the VF's MAC by calling the function that
saves netconfig earlier in the device preparation, before we've
unbound it from the host netdev driver.
These two operations will need to be separated so that saving of the
original config is done before detaching the host net driver, and
setting the new config is done after attaching vfio-pci. This patch
splits the single function into two, but for now calls them together
(to make bisecting easier if there is a regression).
virHostdevNetConfigReplace() and virHostdevNetConfigRestore() are
modified to use the new virNetDev*NetConfig() functions.
Note that due to the VF's original MAC addresses being saved after it
has already been un-bound from the host net driver, the actual current
VF MAC address won't be saved (because it no longer exists) - only the
"admin MAC" will be saved. This reflects existing behavior that will
be fixed in an upcoming patch.
This patch modifies the macvtap passthrough setup to use
virNetDevSaveNetConfig()+virNetDevSetConfig() instead of
virNetDevReplaceNetConfig() or virNetDevReplaceMacAddress(), and the
teardown to use virNetDevReadNetConfig()+virNetDevSetConfig() instead
of virNetDevRestoreNetConfig() or virNetDevRestoreMacAddress().
Since the older functions only saved/restored the admin MAC and vlan
tag (which is incorrect) and the new functions save/restore the VF's
own MAC address and vlan tag (correct), this actually fixes a bug
(which was introduced by commit cb3fe38c7, which was itself supposed
to be a fix for https://bugzilla.redhat.com/1113474 ).
The downside to this patch is that it causes an *apparent* regression
in that bug (because there will once again be an error reported if the
interface had previously been used for VFIO device assignment), but in
reality, the code hasn't been working for *any* case before this
current patch (at least not with any recent kernel). Anyway, that
"regression" will be fixed with an upcoming patch that fixes it the
*right* way.
These three functions are destined to replace
virNetDev(Replace|Restore)NetConfig() and
virNetDev(Replace|Restore)MacAddress(), which both do the save and set
together as a single step. We need to separate the save, read, and set
steps because there will be situations where we need to do something
else in between (in particular, we will need to rebind a VF's driver
after save but before set).
This patch creates the new functions, but doesn't call them - that
will come in a subsequent patch. Note that the new functions to
read/write the file that stores the original network config now uses
JSON rather than plaintext (it still recognizes the old format as well
though, so it won't get confused during an upgrade).
Keep track of the assigned mediated devices the same way we do it for
the rest of hostdevs. Methods like 'Prepare', 'Update', and 'ReAttach'
are introduced by this patch.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Beside creation, disposal, getter, and setter methods the module exports
methods to work with lists of mediated devices.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
We keep forgetting that older setups don't like 'index':
CC util/libvirt_util_la-virsysinfo.lo
cc1: warnings being treated as errors
util/virstoragefile.c: In function 'virStorageSourceFindByNodeName':
util/virstoragefile.c:3804: error: declaration of 'index' shadows a global declaration [-Wshadow]
/usr/include/string.h:489: error: shadowed declaration is here [-Wshadow]
Signed-off-by: Eric Blake <eblake@redhat.com>
That file has only two exported files and each one of them has
different naming. virNode is what all the other files use, so let's
use it. It wasn't used before because the clash with public API
naming, so let's fix that by shortening the name (there is no other
private variant of it anyway).
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
There is no reason for it not to be in the utils, all global symbols
under that file already have prefix vir* and there is no reason for it
to be part of DRIVER_SOURCES because that is just a leftover from
older days (pre-driver modules era, I believe).
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
While on that, drop support for kernels from RHEL-5 era (missing
cpu/present file). Also add some useful functions and export them.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
By using this we are able to easily switch the sysfs path being
used (fake it). This will not only help tests in the future but can
be also used from files where the code is duplicated currently.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
These helpers are doing just a read and covert the value, but they
properly size the read limit, handle additional whitespace characters,
and unify error reporting.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
STREQ_NULLABLE returns true if both parameters are NULL. And that's
not what we want here. We just want to skop comparing source nodes
that don't have that info set. The function wouldn't make much sense
with nodeName == NULL, so we don't need to check that. Moreover, the
function's declaration uses ATTRIBUDE_NONNULL for nodeName, which not
only means that function expects the parameter not to be NULL, but
actually tells the compiler that it can optimize out the NULL checks.
That way it could end up calling strcmp on NULL (either nodeformat or
nodebacking). GCC figures this out if libvirt is compiled with
lv_cv_static_analysis=yes, unfortunately not everyone uses that.
Caused by cbc6d53513568c9c9613b3eaae1c8a8230fd6aab.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The function has very specific semantics. Split out the part that parses
the backing store specification string into a separate helper so that it
can be reused later while keeping the wrapper with existing semantics.
Note that virStorageFileParseChainIndex is pretty well covered by the
test suite.
Fix typo in virNetDevPFGetVF() stub:
ATTRUBUTE_UNUSED -> ATTRIBUTE_UNUSED.
While here, use common indent style for arguments in
virNetDevGetVirtualFunctionIndex() stub.
Given an SRIOV PF netdev name (e.g. "enp2s0f0") and VF#, this new
function returns the netdev name of the referenced VF device
(e.g. "enp2s11f6"), or NULL if the device isn't bound to a net driver.
We will want to allow silent failure of virNetDevSetMAC() in the case
that the SIOSIFHWADDR ioctl fails with errno == EADDRNOTAVAIL. (Yes,
that is very specific, but we really *do* want a logged failure in all
other circumstances, and don't want to duplicate code in the caller
for the other possibilities).
This patch renames the 3 different virNetDevSetMAC() functions to
virNetDevSetMACInternal(), adding a 3rd arg called "quiet" and making
them static (because this extra control will only be needed within
virnetdev.c). A new global virNetDevSetMAC() is defined that calls
whichever of the three *Internal() functions gets compiled with quiet
= false. Callers in virnetdev.c that want to notice a failure with
errno == EADDRNOTAVAIL and retry with a different strategy rather than
immediately failing, can call virNetDevSetMACInternal(..., true).
This function unbinds a device from its driver, then immediately
rebinds it to its driver again. The code for this new function is just
the 2nd half of virPCIDeviceBindWithDriverOverride(), so that
function's 2nd half is replaced with a call to virPCIDeviceRebind().
...and cleanup the callers to report it when it *is* an error.
In many cases It's useful for virPCIGetNetName() to not log an error
and simply return a NULL pointer when the given device isn't bound to
a net driver (e.g. we're looking at a VF that is permanently bound to
vfio-pci). The existing code would silently return an error in this
case, which could eventually lead to the dreaded "An error occurred
but the cause is unknown" log message.
This patch changes virPCIGetNetName() to still return success if the
device simply isn't bound to a net driver, and adjusts all the callers
that require a non-null netname to check for that condition and log an
error when it happens.
vf in virNetDevMacVLanDeleteWithVPortProfile() is initialized to -1
and never set. It's not set for a good reason - because it doesn't
make sense during macvtap device setup to refer to a VF device as
"PF:VF#". This patch replaces the two uses of "vf" with "-1", and
removes the local variable, so that it's more clear we are always
calling the utility functions with vf set to -1.
This function is only called in two places, and the ifindex,
nltarget_kernel, and getPidFunc args are never used (and never will
be).
ifindex - we always know the name of the device, and never know the
ifindex - if we really did need the ifindex we would have to get it
from the name using virNetDevGetIndex(). In practice, we just send -1
to virNetDevSetVfConfig(), which doesn't bother to learn the real
ifindex (you only need a name *or* an ifindex for the netlink command
to succeed, not both).
nltarget_kernel - messages to set the config of an SRIOV VF will
always go to netlink in the kernel, not to another user process, so
this arg is always true (there are other uses of netlink messages
where the message might need to go to another user process, but never
in the case of RTM_SETLINK for SRIOV).
getPidFunc - this arg is only used if nltarget_kernel is false, and it
never is.
None of this has any functional effect, it just makes it easier to
follow what's happening when virNetDevSetVfConfig() is called.
virNetDevParseVfConfig() assumed that both the MAC address and VLAN
tag pointers were valid, so even if you only wanted one or the other,
you would need a variable to hold the returned value for both. This
patch checks each for a NULL pointer before filling it in.
The source code will check for NULL arguments for 'macvtap_macaddr' and
'vmuuid', so no need for the NONNULL in the prototypes. Following the stack
for both arguments to virNetDevVPortProfileOpSetLink also shows called
functions would handle a NULL value.
Additionally, modified the prototype to use the same 'macvtap_macaddr'
name as the source code for consistency.
Since the code checks 'mgr == NULL' anyway, no need for the prototype
to have the NONNULL arg check. Also add an error message to indicate what
the failure is so that there isn't a failed for some reason error.
Signed-off-by: John Ferlan <jferlan@redhat.com>
The comparison code used STREQ_NULLABLE anyway for both 'drv_name' and
'dom_name', so no need. Add a NULLSTR on the 'dom_name' too.
Signed-off-by: John Ferlan <jferlan@redhat.com>
The comparison code used STREQ_NULLABLE anyway for both 'drv_name' and
'dom_name', so no need. Add a NULLSTR on the 'dom_name' too.
Signed-off-by: John Ferlan <jferlan@redhat.com>
The code checks and handles a NULL 'str', so just remove the NONNULL.
Update the error message to add the NULLSTR() around 'str' also.
Signed-off-by: John Ferlan <jferlan@redhat.com>