In f08e6883cb I've made @pcidevs in
virHostdevReAttachPCIDevices() to be automatically unrefed using
VIR_AUTOUNREF() but I forgot to remove the line that explicitly
unrefs the object at the end of the function.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
All code using LOCALSTATEDIR "/run" is updated to use RUNSTATEDIR
instead. The exception is the remote driver client which still
uses LOCALSTATEDIR "/run". The client needs to connect to remote
machines which may not be using /run, so /var/run is more portable
due to the /var/run -> /run symlink.
Some duplicate paths in the apparmor code are also purged.
There's no functional change by default yet since both expressions
expand to the same value.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The KVM assignment is going to be removed shortly. Don't let the
hostdev module configure it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com>
There are two places where we need to create virPCIDevice from
given virDomainHostdevDef. In both places the code is duplicated.
Move them into a single function and call it from those two
places.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com>
The virHostdevPreparePCIDevices() function works in several
steps. In the very first one, it checks if devices we want to
detach from the host are not taken already by some other domain.
However, this piece of code returns different results depending
on the stub driver used (which is not wrong per se, but keep on
reading). If the stub driver is KVM then
virHostdevIsPCINodeDeviceUsed() is called which basically checks
if a PCI device from the detach list is not used by any domain
(including the one we are preparing the device for). If that is
the case, an error is reported ("device in use") and -1 is
returned.
However, that is not what happens if the stub driver is VFIO. If
the stub driver is VFIO, then we iterate over all PCI devices
from the same IOMMU group and check if they are taken by some
other domain (because a PCI device, well IOMMU group, can't be
shared between two or more qemu processes). But we fail to check,
if the device we are trying to detach from the host is not
already taken by a domain. That is, calling
virHostdevPreparePCIDevices() over a hostdev device twice
succeeds the first time and fails too late in the second run
(fortunately, virHostdevResetAllPCIDevices() will throw an error,
but this is already too late because the PCI device in question
was moved to the list of inactive PCI devices and now it appears
in both lists).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
It may happen that there are two domains with the same name in
two separate drivers (e.g. qemu and lxc). That is why for PCI
devices we track both names of driver and domain combination
which has taken the device. However, when we check if given PCI
device is in use (or PCI devices from the same IOMMU group) we
compare only domain name. This means that we can mistakenly claim
device as free to use while in fact it isn't.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
This code that executes virPCIDeviceReattach in all
virPCIDevicePtr objects of a given virPCIDeviceListPtr
list is replicated twice in the code. Putting it in a helper
function helps with readability.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
virHostdevReattachPCIDevice() is a static that simply does
a wait loop with virPCIDeviceWaitForCleanup() before
calling virPCIDeviceReattach().
This loop traces back to commit d1e5676c0d, aiming to
solve a race condition between Libvirt returning the
device back to the host and QEMU trying to access it in
the meantime, which resulted in QEMU exiting on error
and killing the guest. This happens because device_del
is asynchronous, returning OK even if the guest didn't
release the device. Commit 01abc8a1b8 moved this code
to qemu_hostdev.c, 82e8dd4cf8 added the pci-stub conditional
for the loop, 899b261127 moved the code to virhostdev.c
where it stood until now.
The intent of this wait loop is still valid: device_del
is still not bullet proof into preventing the conditions
that commit d1e5676c0d aimed to fix, especially when considering
all the architectures we must support. However, this loop
is executed only in virHostdevReattachPCIDevice(), leaving
every other virPCIDeviceReattach() call prone to that error.
Let's move the wait loop code to virPCIDeviceReattach(). This
will:
- make every reattach call safe from this race condition
with the pci-stub;
- allow for a bit of code cleanup (virHostdevReattachPCIDevice()
can be erased, and virHostdevReAttachPCIDevices() can use
virPCIDeviceReattach() directly);
- make it easier to understand the overall reattach mechanisms in
Libvirt, without the risk of a newcomer wondering why reattach
is done slightly different in some instances.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This code that executes virPCIDeviceReset in all virPCIDevicePtr
objects of a given virPCIDeviceListPtr list is replicated twice
in the code. Putting it in a helper function helps with
readability.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
There are couple of functions which get shorter after the
treatment.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Couple of things happening in this patch:
1) We can mark the device we're adding onto active list as used
way before - when adding it onto temporary list.
2) When actually moving device from a temporary helper list onto
the list of active devices we check if the device isn't
already there. The same check is performed by
virSCSIVHostDeviceListAdd() later. Drop this duplicity.
3) The 'error' label is renamed to 'rollback' to reflect what it
is actually doing. While in the rest of the code we don't
allow random label names, this source file is different.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When looking up a USB device by vendor the
virUSBDeviceFindByVendor() is used. The function returns number
of items found. But the logic in caller to process it is
needlessly complicated.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
There are couple of functions which get shorter after the
treatment.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
There's no need to translate virDomainHostdevDef-s into
virPCIDevice-s with locked list of PCI devices.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
There's no need to translate virDomainHostdevDef-s into
virPCIDevice-s with locked list of PCI devices.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This function is a good candidate for VIR_AUTOPTR() conversion.
But this conversion will be easier if we only add @pci device
onto @pcidevs list after it was all set up.
This is no functional change.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Due to the way that our virObjectUnref() is written it's not
possible that a NULL is passed into *Dispose() function. However,
some functions check for that regardless.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
hostdevs have a link back to the original network device. This is fairly
generic accepting any type of device, however, we don't intend to make
use of this approach in future. It can thus be specialized to network
devices.
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Missing semicolon at the end of macros can confuse some analyzers
(like cppcheck <filename>). VIR_ONCE_GLOBAL_INIT is almost
exclusively called without an ending semicolon, but let's
standardize on using one like the other macros.
Add a dummy struct definition at the end of the macro, so
the compiler will require callers to add a semicolon.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
The @linkdev is In/Out function parameter as second order
reference pointer so requires first order dereference for
checking NULL which can be the result of virPCIGetNetName().
Fixes: d6ee56d723 (util: change virPCIGetNetName() to not return error if device has no net name)
Signed-off-by: Radoslaw Biernacki <radoslaw.biernacki@linaro.org>
Signed-off-by: dann frazier <dann.frazier@canonical.com>
In many files there are header comments that contain an Author:
statement, supposedly reflecting who originally wrote the code.
In a large collaborative project like libvirt, any non-trivial
file will have been modified by a large number of different
contributors. IOW, the Author: comments are quickly out of date,
omitting people who have made significant contribitions.
In some places Author: lines have been added despite the person
merely being responsible for creating the file by moving existing
code out of another file. IOW, the Author: lines give an incorrect
record of authorship.
With this all in mind, the comments are useless as a means to identify
who to talk to about code in a particular file. Contributors will always
be better off using 'git log' and 'git blame' if they need to find the
author of a particular bit of code.
This commit thus deletes all Author: comments from the source and adds
a rule to prevent them reappearing.
The Copyright headers are similarly misleading and inaccurate, however,
we cannot delete these as they have legal meaning, despite being largely
inaccurate. In addition only the copyright holder is permitted to change
their respective copyright statement.
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
All of the ones being removed are pulled in by internal.h. The only
exception is sanlock which expects the application to include <stdint.h>
before sanlock's headers, because sanlock prototypes use fixed width
int, but they don't include stdint.h themselves, so we have to leave
that one in place.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Acked-by: Michal Privoznik <mprivozn@redhat.com>
It doesn't really make sense for us to have stdlib.h and string.h but
not stdio.h in the internal.h header.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Acked-by: Michal Privoznik <mprivozn@redhat.com>
By making use of GNU C's cleanup attribute handled by the
VIR_AUTOPTR macro for declaring aggregate pointer variables,
majority of the calls to *Free functions can be dropped, which
in turn leads to getting rid of most of our cleanup sections.
Signed-off-by: Sukrit Bhatnagar <skrtbhtngr@gmail.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
By making use of GNU C's cleanup attribute handled by the
VIR_AUTOFREE macro for declaring scalar variables, majority
of the VIR_FREE calls can be dropped, which in turn leads to
getting rid of most of our cleanup sections.
Signed-off-by: Sukrit Bhatnagar <skrtbhtngr@gmail.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Modify virUSBDeviceListAdd to take a double pointer to
virUSBDevicePtr as the second argument. This will enable usage
of cleanup macros upon the virUSBDevicePtr item which is to be
added to the list as it will be cleared by virInsertElementsN
upon success.
Signed-off-by: Sukrit Bhatnagar <skrtbhtngr@gmail.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Add the function virHostdevIsMdevDevice() which detects whether a
hostdev is a mediated device or not. Also, replace all existing
conditionals.
Signed-off-by: Shalini Chellathurai Saroja <shalini@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
So far we are repeating the following lines over and over:
if (!(virSomeObjectClass = virClassNew(virClassForObject(),
"virSomeObject",
sizeof(virSomeObject),
virSomeObjectDispose)))
return -1;
While this works, it is impossible to do some checking. Firstly,
the class name (the 2nd argument) doesn't match the name in the
code in all cases (the 3rd argument). Secondly, the current style
is needlessly verbose. This commit turns example into following:
if (!(VIR_CLASS_NEW(virSomeObject,
virClassForObject)))
return -1;
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Commit 9a94af6d restructured virHostdevReadNetConfig() so that it
would manually set ret = 0 after successfully reading the device's
config, but Coverity pointed out that "ret = 0" was erroneously placed
outside of an "else" clause, meaning that the the value of ret set in
the "if" clause was unnecessarily and incorrectly overwritten.
This patch moves ret = 0 into the else clause, which should silence
Coverity.
virHostdevRestoreNetConfig() calls virNetDevReadNetConfig() to try and
read the "original config" of a netdev, and if that fails, it tries
again with a different directory/netdev name. This achieves the
desired effect (we end up finding the config wherever it may be), but
for each failure, virNetDevReadNetConfig() places a nice error message
in the system logs. Experience has shown that false-positive error
logs like this lead to erroneous bug reports, and can often mislead
those searching for *real* bugs.
This patch changes virNetDevReadNetConfig() to explicitly check if the
file exists before calling virFileReadAll(); if it doesn't exist,
virNetDevReadNetConfig() returns a success, but leaves all the
variables holding the results as NULL. (This makes sense if you define
the purpose of the function as "read a netdev's config from its config
file *if that file exists*).
To take advantage of that change, the caller,
virHostdevRestoreNetConfig() is modified to fail immediately if
virNetDevReadNetConfig() returns an error, and otherwise to try the
different directory/netdev name if adminMAC & vlan & MAC are all NULL
after the preceding attempt.
Mellanox ConnectX-3 dual port SRIOV NICs present a bit of a challenge
when assigning one of their VFs to a guest using VFIO device
assignment.
These NICs have only a single PCI PF device, and that single PF has
two netdevs sharing the single PCI address - one for port 1 and one
for port 2. When a VF is created it can also have 2 netdevs, or it can
be setup in "single port" mode, where the VF has only a single netdev,
and that netdev is connected either to port 1 or to port 2.
When the VF is created in dual port mode, you get/set the MAC
address/vlan tag for the port 1 VF by sending a netlink message to the
PF's port1 netdev, and you get/set the MAC address/vlan tag for the
port 2 VF by sending a netlink message to the PF's port 2 netdev. (Of
course libvirt doesn't have any way to describe MAC/vlan info for 2
ports in a single hostdev interface, so that's a bit of a moot point)
When the VF is created in single port mode, you can *set* the MAC/vlan
info by sending a netlink message to *either* PF netdev - the driver
is smart enough to understand that there's only a single netdev, and
set the MAC/vlan for that netdev. When you want to *get* it, however,
the driver is more accurate - it will return 00:00:00:00:00:00 for the
MAC if you request it from the port 1 PF netdev when the VF was
configured to be single port on port 2, or if you request if from the
port 2 PF netdev when the VF was configured to be single port on port
1.
Based on this information, when *getting* the MAC/vlan info (to save
the original setting prior to assignment), we determine the correct PF
netdev by matching phys_port_id between VF and PF.
(IMPORTANT NOTE: this implies that to do PCI device assignment of the
VFs on dual port Mellanox cards using <interface type='hostdev'>
(i.e. if you want the MAC address/vlan tag to be set), not only must
the VFs be configured in single port mode, but also the VFs *must* be
bound to the host VF net driver, and libvirt must use managed='yes')
By the time libvirt is ready to set the new MAC/vlan tag, the VF has
already been unbound from the host net driver and bound to
vfio-pci. This isn't problematic though because, as stated earlier,
when a VF is created in single port mode, commands to configure it can
be sent to either the port 1 PF netdev or the port 2 PF netdev.
When it is time to restore the original MAC/vlan tag, again the VF
will *not* be bound to a host net driver, so it won't be possible to
learn from sysfs whether to use the port 1 or port 2 PF netdev for the
netlink commands. And again, it doesn't matter which netdev you
use. However, we must keep in mind that we saved the original settings
to a file called "${PF}_${VFNUM}". To solve this problem, we just
check for the existence of ${PF1}_${VFNUM} and ${PF2}_${VFNUM}, and
use whichever one we find (since we know that only one can be there)
A single PCI device may have multiple netdevs associated with it. Each
of those netdevs will have a different phys_port_id entry in
sysfs. This patch modifies virPCIGetNetName() to allow selecting one
of the potential many netdevs in two different ways:
1) by setting the "idx" argument, the caller can select the 1st (0),
2nd (1), etc. netdev from the PCI device's net subdirectory.
2) If the physPortID arg is set (to a null-terminated string) then
virPCIGetNetName() returns the netdev that has that phys_port_id in
the sysfs file of the same name in the netdev's directory.
The problem resides in virHostdevUpdateActiveMediatedDevices which gets
called during qemuProcessReconnect. The issue here is that
virMediatedDeviceListAdd takes a pointer to the item to be added to the
list to which VIR_APPEND_ELEMENT is used, which also clears the pointer.
However, in this case only the local copy of the pointer got cleared,
leaving the original pointing to valid memory. To sum it up, during
cleanup phase, the original pointer is freed and the daemon crashes
basically any time it would access it.
Backtrace:
0x00007ffff3ccdeba in __strcmp_sse2_unaligned
0x00007ffff72a444a in virMediatedDeviceListFindIndex
0x00007ffff7241446 in virHostdevReAttachMediatedDevices
0x00007fffc60215d9 in qemuHostdevReAttachMediatedDevices
0x00007fffc60216dc in qemuHostdevReAttachDomainDevices
0x00007fffc6046e6f in qemuProcessStop
0x00007fffc6091596 in processMonitorEOFEvent
0x00007fffc6091793 in qemuProcessEventHandler
0x00007ffff7294bf5 in virThreadPoolWorker
0x00007ffff7294184 in virThreadHelper
0x00007ffff3fdc3c4 in start_thread () from /lib64/libpthread.so.0
0x00007ffff3d269cf in clone () from /lib64/libc.so.6
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1446455
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
It takes longer to explain this than to fix it...
In the past we weren't able to save the VF's own MAC address *at all*
when using it for hostdev assignment, because we had already unbound
the VF from the host net driver prior to saving its config. With the
previous patch, that problem has been solved, so we now have the VF's
MAC address saved and can move on to the *next* problem, which is twofold:
1) during teardown we restore the config before we've re-bound, so the
VF doesn't have a net driver, and thus we can't set its MAC address
directly.
2) even if we delay restoring the config until the VF is bound to a
net driver, the request to set its MAC address would fail, since
(during device setup) we had set the "admin MAC" for the VF via an
RTM_SETLINK to the PF - once you've set the admin MAC for a VF, the
VF driver (either on host or on guest) is not allowed to change the
VF's MAC address "forever" (well, until you reload the PF driver,
but that requires destroying and recreating every single VF, which
isn't something you can require).
The solution is to keep the restoration of config at the same place,
but to set the *admin MAC* to the address you want the VF to have -
when the VF net driver is later initialized (as a part of re-binding
to the VF net driver) its MAC will be initialized to the current value
of the admin MAC.
In order to properly restore the original state of an SRIOV VF when
we're finished with it, we need to save the MAC address of the VF
itself (not just the admin MAC address for the VF that is stored in
the PF). But that can only be done when the VF is still bound to the
host's netdev driver, and we have always done the saving of device
config after the VF is already bound to vfio-pci. This patch prepares
us for adding a save of the VF's MAC by calling the function that
saves netconfig earlier in the device preparation, before we've
unbound it from the host netdev driver.
These two operations will need to be separated so that saving of the
original config is done before detaching the host net driver, and
setting the new config is done after attaching vfio-pci. This patch
splits the single function into two, but for now calls them together
(to make bisecting easier if there is a regression).
virHostdevNetConfigReplace() and virHostdevNetConfigRestore() are
modified to use the new virNetDev*NetConfig() functions.
Note that due to the VF's original MAC addresses being saved after it
has already been un-bound from the host net driver, the actual current
VF MAC address won't be saved (because it no longer exists) - only the
"admin MAC" will be saved. This reflects existing behavior that will
be fixed in an upcoming patch.
Keep track of the assigned mediated devices the same way we do it for
the rest of hostdevs. Methods like 'Prepare', 'Update', and 'ReAttach'
are introduced by this patch.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
...and cleanup the callers to report it when it *is* an error.
In many cases It's useful for virPCIGetNetName() to not log an error
and simply return a NULL pointer when the given device isn't bound to
a net driver (e.g. we're looking at a VF that is permanently bound to
vfio-pci). The existing code would silently return an error in this
case, which could eventually lead to the dreaded "An error occurred
but the cause is unknown" log message.
This patch changes virPCIGetNetName() to still return success if the
device simply isn't bound to a net driver, and adjusts all the callers
that require a non-null netname to check for that condition and log an
error when it happens.
Since the code checks 'mgr == NULL' anyway, no need for the prototype
to have the NONNULL arg check. Also add an error message to indicate what
the failure is so that there isn't a failed for some reason error.
Signed-off-by: John Ferlan <jferlan@redhat.com>
The comparison code used STREQ_NULLABLE anyway for both 'drv_name' and
'dom_name', so no need. Add a NULLSTR on the 'dom_name' too.
Signed-off-by: John Ferlan <jferlan@redhat.com>
The comparison code used STREQ_NULLABLE anyway for both 'drv_name' and
'dom_name', so no need. Add a NULLSTR on the 'dom_name' too.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Non-endpoint devices like pci-bridges cannot be assigned to guests.
Prevent such attempts.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
For a new hostdev type='scsi_host' we have a number of
required functions for managing, adding, and removing the
host device to/from guests. Provide the basic infrastructure
for these tasks.
The name "SCSIVHost" (and its variants) is chosen to avoid
conflicts with existing code named "SCSIHost" to refer to
a hostdev type='scsi' protcol='none'.
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Use the util function virHostdevIsSCSIDevice() to simplify if
statements.
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>