While on that, drop support for kernels from RHEL-5 era (missing
cpu/present file). Also add some useful functions and export them.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
By using this we are able to easily switch the sysfs path being
used (fake it). This will not only help tests in the future but can
be also used from files where the code is duplicated currently.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
These helpers are doing just a read and covert the value, but they
properly size the read limit, handle additional whitespace characters,
and unify error reporting.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
STREQ_NULLABLE returns true if both parameters are NULL. And that's
not what we want here. We just want to skop comparing source nodes
that don't have that info set. The function wouldn't make much sense
with nodeName == NULL, so we don't need to check that. Moreover, the
function's declaration uses ATTRIBUDE_NONNULL for nodeName, which not
only means that function expects the parameter not to be NULL, but
actually tells the compiler that it can optimize out the NULL checks.
That way it could end up calling strcmp on NULL (either nodeformat or
nodebacking). GCC figures this out if libvirt is compiled with
lv_cv_static_analysis=yes, unfortunately not everyone uses that.
Caused by cbc6d53513.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The function has very specific semantics. Split out the part that parses
the backing store specification string into a separate helper so that it
can be reused later while keeping the wrapper with existing semantics.
Note that virStorageFileParseChainIndex is pretty well covered by the
test suite.
Fix typo in virNetDevPFGetVF() stub:
ATTRUBUTE_UNUSED -> ATTRIBUTE_UNUSED.
While here, use common indent style for arguments in
virNetDevGetVirtualFunctionIndex() stub.
Given an SRIOV PF netdev name (e.g. "enp2s0f0") and VF#, this new
function returns the netdev name of the referenced VF device
(e.g. "enp2s11f6"), or NULL if the device isn't bound to a net driver.
We will want to allow silent failure of virNetDevSetMAC() in the case
that the SIOSIFHWADDR ioctl fails with errno == EADDRNOTAVAIL. (Yes,
that is very specific, but we really *do* want a logged failure in all
other circumstances, and don't want to duplicate code in the caller
for the other possibilities).
This patch renames the 3 different virNetDevSetMAC() functions to
virNetDevSetMACInternal(), adding a 3rd arg called "quiet" and making
them static (because this extra control will only be needed within
virnetdev.c). A new global virNetDevSetMAC() is defined that calls
whichever of the three *Internal() functions gets compiled with quiet
= false. Callers in virnetdev.c that want to notice a failure with
errno == EADDRNOTAVAIL and retry with a different strategy rather than
immediately failing, can call virNetDevSetMACInternal(..., true).
This function unbinds a device from its driver, then immediately
rebinds it to its driver again. The code for this new function is just
the 2nd half of virPCIDeviceBindWithDriverOverride(), so that
function's 2nd half is replaced with a call to virPCIDeviceRebind().
...and cleanup the callers to report it when it *is* an error.
In many cases It's useful for virPCIGetNetName() to not log an error
and simply return a NULL pointer when the given device isn't bound to
a net driver (e.g. we're looking at a VF that is permanently bound to
vfio-pci). The existing code would silently return an error in this
case, which could eventually lead to the dreaded "An error occurred
but the cause is unknown" log message.
This patch changes virPCIGetNetName() to still return success if the
device simply isn't bound to a net driver, and adjusts all the callers
that require a non-null netname to check for that condition and log an
error when it happens.
vf in virNetDevMacVLanDeleteWithVPortProfile() is initialized to -1
and never set. It's not set for a good reason - because it doesn't
make sense during macvtap device setup to refer to a VF device as
"PF:VF#". This patch replaces the two uses of "vf" with "-1", and
removes the local variable, so that it's more clear we are always
calling the utility functions with vf set to -1.
This function is only called in two places, and the ifindex,
nltarget_kernel, and getPidFunc args are never used (and never will
be).
ifindex - we always know the name of the device, and never know the
ifindex - if we really did need the ifindex we would have to get it
from the name using virNetDevGetIndex(). In practice, we just send -1
to virNetDevSetVfConfig(), which doesn't bother to learn the real
ifindex (you only need a name *or* an ifindex for the netlink command
to succeed, not both).
nltarget_kernel - messages to set the config of an SRIOV VF will
always go to netlink in the kernel, not to another user process, so
this arg is always true (there are other uses of netlink messages
where the message might need to go to another user process, but never
in the case of RTM_SETLINK for SRIOV).
getPidFunc - this arg is only used if nltarget_kernel is false, and it
never is.
None of this has any functional effect, it just makes it easier to
follow what's happening when virNetDevSetVfConfig() is called.
virNetDevParseVfConfig() assumed that both the MAC address and VLAN
tag pointers were valid, so even if you only wanted one or the other,
you would need a variable to hold the returned value for both. This
patch checks each for a NULL pointer before filling it in.
The source code will check for NULL arguments for 'macvtap_macaddr' and
'vmuuid', so no need for the NONNULL in the prototypes. Following the stack
for both arguments to virNetDevVPortProfileOpSetLink also shows called
functions would handle a NULL value.
Additionally, modified the prototype to use the same 'macvtap_macaddr'
name as the source code for consistency.
Since the code checks 'mgr == NULL' anyway, no need for the prototype
to have the NONNULL arg check. Also add an error message to indicate what
the failure is so that there isn't a failed for some reason error.
Signed-off-by: John Ferlan <jferlan@redhat.com>
The comparison code used STREQ_NULLABLE anyway for both 'drv_name' and
'dom_name', so no need. Add a NULLSTR on the 'dom_name' too.
Signed-off-by: John Ferlan <jferlan@redhat.com>
The comparison code used STREQ_NULLABLE anyway for both 'drv_name' and
'dom_name', so no need. Add a NULLSTR on the 'dom_name' too.
Signed-off-by: John Ferlan <jferlan@redhat.com>
The code checks and handles a NULL 'str', so just remove the NONNULL.
Update the error message to add the NULLSTR() around 'str' also.
Signed-off-by: John Ferlan <jferlan@redhat.com>
This patch splits out the part of virNetDevTapCreateInBridgePort()
that would need to be re-done if an existing tap device had to be
re-attached to a bridge, and puts it into a separate function. This
can be used both when an existing domain interface config is updated
to change its connection, and also to re-attach to the "same" bridge
when a network has been stopped and restarted. So far it is used for
nothing.
This function provides the bridge/bond device that the given network
device is attached to. The return value is 0 or -1, and the master
device is a char** argument to the function - this is needed in order
to allow for a "success" return from a device that has no master.
The only reason that the ethtool features weren't being retrieved in
an unprivileged libvirtd was because they required ioctl(), and the
ioctl was using an AF_PACKET socket, which requires root. Now that we
are using AF_UNIX for ioctl(), this restriction can be removed.
The exact family of the socket created for the fd used by ioctl(7)
doesn't matter, it just needs to be a socket and not a file. But for
some reason when macvtap support was added, it used
AF_PACKET/SOCK_DGRAM sockets for its ioctls; we later used the same
AF_PACKET/SOCK_DGRAM socket for new ioctls we added, and eventually
modified the other pre-existing ioctl sockets (for creating/deleting
bridges) to also use AF_PACKET/SOCK_DGRAM (that code originally used
AF_UNIX/SOCK_STREAM).
The problem with using AF_PACKET (intended for sending/receiving "raw"
packets, i.e. packets that can be some protocol other than TCP or UDP)
is that it requires root privileges. This meant that none of the
ioctls in virnetdev.c or virnetdevip.c would work when running
libvirtd unprivileged.
This packet solves that problem by changing the family to AF_UNIX when
creating the socket used for any ioctl().
When enabling IPv6 on all interfaces, we may get the host Router
Advertisement routes discarded. To avoid this, the user needs to set
accept_ra to 2 for the interfaces with such routes.
See https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt
on this topic.
To avoid user mistakenly losing routes on their hosts, check
accept_ra values before enabling IPv6 forwarding. If a RA route is
detected, but neither the corresponding device nor global accept_ra
is set to 2, the network will fail to start.
virNetlinkCommand() processes only one response message, while some
netlink commands, like route dumping, need to process several.
Add virNetlinkDumpCommand() as a virNetlinkCommand() sister.
Allow to reuse as much as possible from virNetlinkCommand(). This
comment prepares for the introduction of virNetlinkDumpCommand()
only differing by how it handles the responses.
While connecting to qemu monitor, the first thing we do is wait
for it to show up. However, we are doing it with some timeout to
avoid indefinite waits (e.g. when qemu doesn't create the monitor
socket at all). After beaa447a29 we are using exponential back
off timeout meaning, after the first connection attempt we wait
1ms, then 2ms, then 4 and so on. This allows us to bring down
wait time for small domains where qemu initializes quickly.
However, on the other end of this scale are some domains with
huge amounts of guest memory. Now imagine that we've gotten up to
wait time of 15 seconds. The next one is going to be 30 seconds,
and the one after that whole minute. Well, okay - with current
code we are not going to wait longer than 30 seconds in total,
but this is going to change in the next commit.
The exponential back off is usable only for first few iterations.
Then it needs to be caped (one second was chosen as the limit)
and switch to constant wait time.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The function is actually in virutil.c, but prototyped in virfile.h.
This patch fixes that by renaming the function to virWaitForDevices,
adding the prototype in virutil.h and libvirt_private.syms, and then
changing the callers to use the new name.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Found by Coverity. Because there's an "if ((cur = strstr(base, "revision"))
!= NULL) {" followed by a "base = cur" coverity notes that 'base' could
then be NULL causing the return to the top of the "while ((tmp_base =
strstr(base, "processor")) != NULL) {" to have strstr deref a NULL 'base'
pointer because the setting of base at the bottom of the loop is unconditional.
Alter the code to set "base = cur" after processing each key. That will
"ensure" that base doesn't get set to NULL if both "cpu" and "revision"
do no follow a "processor".
While a /proc/cpuinfo file that has a "processor" key but with neither
a "cpu" nor a "revision" doesn't seem feasible, the code is written as if
it could happen, so we have to account for it.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Calls to virFileReadAll after a VIR_ALLOC that return NULL all show
a memory leak since 'ret' isn't virSysinfoDefFree'd and normal path
"return ret" doesn't free outbuf.
Reported by Coverity
Signed-off-by: John Ferlan <jferlan@redhat.com>
Whole implementations along with helper totalling screens of code were
conditionally compiled. That made the code totally unreadable and
untestable. Rename functions to have the architecture in the name so
that all can be compiled at the same time and introduce header to allow
testing them all.
Proposed formal coding conventions encourage defining typedefs for
vir[Blah] and vir[Blah]Ptr separately from the associated struct named
_vir[Blah]:
typedef struct _virBlah virBlah;
typedef virBlah *virBlahPtr;
struct _virBlah {
...
};
At some point in the past, I had submitted several patches using a
more compact style that I prefer, and they were accepted:
typedef struct _virBlah {
...
} virBlah, *virBlahPtr;
Since these are by far a minority among all struct definitions, this
patch changes all those definitions to reflect the style prefered by
the proposal so that there is 100% consistency.
After the system has been booted, it should not change.
Cache the return value of virSystemdHasMachined.
Allow starting and terminating machines with just one
DBus call, instead of three, reducing the chance of
the call timing out.
Also introduce a small function for resetting the cache
to be used in tests.
Both virSystemdTerminateMachine and virSystemdCreateMachine
propagate the error to tell between a non-systemd system
and a hard error.
In virSystemdGetMachineNameByPID both are treated the same,
but an error is ignored by the callers.
Split out the checks into a separate function.
Arguably though, function returning only on success is a very
interesting, although quite impractical concept. Also, the errno isn't
and shouldn't be preserved in this case, since the errno can be directly
fed to the virReportSystemError.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
This will eventually replace virQEMUBuildBufferEscapeComma, however
it's not possible right now. Some parts of the code that uses the
old function needs to be refactored.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
GCC 7 gets upset by
if (!tmp && (size * count))
warning
util/viralloc.c: In function 'virReallocN':
util/viralloc.c:246:23: error: '*' in boolean context, suggest '&&' instead [-Werror=int-in-bool-context]
if (!tmp && (size * count)) {
~~~~~~^~~~~~~~
Keep it happy by adding != 0 to the right hand expression
so it realizes we really are wanting to treat the result
of the arithmetic expression as a boolean
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The 'raw' block driver in Qemu is not directly interesting from
libvirt's perspective, but it can be layered above some other block
drivers and this may be interesting for the user.
The patch adds support for the 'raw' block driver. The driver is treated
simply as a pass-through and child driver in JSON is queried to get the
necessary information.
Signed-off-by: Tomáš Golembiovský <tgolembi@redhat.com>
Split virStorageSourceParseBackingJSON into two functions so that the
core can be reused by other functions. The new function called
virStorageSourceParseBackingJSONInternal accepts virJSONValuePtr.
Signed-off-by: Tomáš Golembiovský <tgolembi@redhat.com>
The utils code should stay separated from other code (except for very
well justified cases). Unfortunately commit 272769becc
made it trivial to break the separation (and not get slapped by the
syntax-check rule) by adding -I src/conf to the CFLAGS for utils.
Remove this shortcut and except the two offenders from the syntax check
so that the codebase can be kept separated.
Create a virscsihost.c and place the functions there. That removes the
last #ifdef __linux__ from virutil.c.
Take the opporunity to also change the function names and in one case
the parameters slightly
Rather than have them mixed in with the virutil apis, create a separate
virvhba.c module and move the vHBA related calls into there. Soon there
will be more added.
Also modify the names of the functions and some arguments to be more
indicative of what is really happening. Adjust the callers respectively.
While I was changing fchosttest, rather than the non-descriptive names
test1...test6, rename them to match what the test is doing.
Before 9c17d665fd (v1.3.2 - I know, right?) it was possible to
have the following interface configuration:
<interface type='ethernet'/>
<script path=''/>
</interface>
This resulted in -netdev tap,script=,.. Fortunately, qemu helped
us to get away with this as it just ignored the empty script
path. However, after the commit mentioned above it's libvirtd
who is executing the script. Unfortunately without special
case-ing empty script path.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Currently disk names do not follow the
(regex) /^[fhv]d[a-z]+[0-9]*$/ completely
and hence one can assign disk names like
vd2 etc. This patch ensures that the
disk names follow the regex mentioned.
This patch also adds a testcase.
Signed-off-by: Nitesh Konkar <nitkon12@linux.vnet.ibm.com>
To make sure bit 'b' fits into the bitmap, we need to allocate b+1
bits, since we number from 0.
Adjust the bitmap test to set a bit at a multiple of 16.
That way the test fails without this fix, because the VIR_REALLOC
call clears the newly added memory even if the original pointer
has not changed.
So rather than comparing 2 paths (strings) as they are, which can very
easily lead to unnecessary errors (e.g. in storage driver) that the paths
are not the same when in fact they'd be e.g. just symlinks to the same
location, we should put our best effort into resolving any symlinks and
canonicalizing the path and only then compare the 2 paths for equality.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Although currently this is documented in virsh man page
and virsh help, the expicit mention in the error message
is helful for tools using the API directly.
Signed-off-by: Nitesh Konkar <nitkon12@linux.vnet.ibm.com>
After freeing the data structures we have to reset the counters to
zero. This fixes a segmentation fault when virNetDevIPInfoClear is
called twice (e.g. this is possible in virDomainNetDefParseXML() if
virDomainNetIPInfoParseXML(...) fails with ret < 0 (this leads to the
first call of 'virNetDevIPInfoClear(&def->guestIP)') and the resulting
call of virDomainNetDefFree(def) in the error path of
virDomainNetDefParseXML() (this leads to the second call of
virNetDevIPInfoClear(&def->guestIP), and finally to the segmentation
fault).
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
This patchs allows to set the timeout value used for all
openvswitch calls. The default timeout value remains as
before at 5 seconds.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Provide the ability to specify a default timeout value for
successful completion of openvswitch calls in the libvirtd
configuration file.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
virNetDevTapCreateInBridgePort() has always set the new tap device to
the current MTU of the bridge it's being attached to. There is one
case where we will want to set the new tap device to a different
(usually larger) MTU - if that's done with the very first device added
to the bridge, the bridge's MTU will be set to the device's MTU. This
patch allows for that possibility by adding "int mtu" to the arg list
for virNetDevTapCreateInBridgePort(), but all callers are sending -1,
so it doesn't yet have any effect.
Since the requested MTU isn't necessarily what is used in the end (for
example, if there is no MTU requested, the tap device will be set to
the current MTU of the bridge), and the hypervisor may want to know
the actual MTU used, we also return the actual MTU to the caller (if
actualMTU is non-NULL).
We will need to traverse the symlinks one step at the time.
Therefore we need to see where a symlink is pointing to.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The comment to the function states that the errors from the child
process are reported. Well, the error buffer is filled with
possible error messages. But then it is thrown away. Among with
important error message from the child process.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Originally/discovered proposed by "Wang King <king.wang@huawei.com>"
When the virCloseCallbacksSet is first called, it increments the refcnt
on the domain object to ensure it doesn't get deleted before the callback
is called. The refcnt would be decremented in virCloseCallbacksUnset once
the entry is removed from the closeCallbacks has table.
When (mostly) normal shutdown occurs, the qemuProcessStop will end up
calling qemuProcessAutoDestroyRemove and will remove the callback from
the list and hash table normally and decrement the refcnt.
However, when qemuConnectClose calls virCloseCallbacksRun, it will scan
the (locked) closeCallbacks list for matching domain and callback function.
If an entry is found, it will be removed from the closeCallbacks list and
placed into a lookaside list to be processed when the closeCallbacks lock
is dropped. The callback function (e.g. qemuProcessAutoDestroy) is called
and will run qemuProcessStop. That code will fail to find the callback
in the list when qemuProcessAutoDestroyRemove is called and thus not decrement
the domain refcnt. Instead since the entry isn't found the code will just
return (mostly) harmlessly.
This patch will resolve the issue by taking another ref during the
search UUID process during virCloseCallackRun, decrementing the refcnt
taken by virCloseCallbacksSet, calling the callback routine and returning
overwriting the vm (since it could return NULL). Finally, it will call the
virDomainObjEndAPI to lower the refcnt and remove the lock taken during
the search UUID processing. This may cause the vm to be destroyed.
Currently, on every --enable perf_event command,
a new event->fd is created and counting of perf
event counter starts from zero and previous
event->fd is lost. This patch prevents this
behaviour.
Signed-off-by: Nitesh Konkar <nitkon12@linux.vnet.ibm.com>
It is destructive to attempt reset on a pci- or cardbus-bridge, the
host can crash. The bridges won't contain any guest data and neither
they can be passed through using vfio/stub. So, no point in allowing a
reset on them.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Non-endpoint devices like pci-bridges cannot be assigned to guests.
Prevent such attempts.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Avoid return with the closeCallbacks locked when get callbacks list for connect fail.
Signed-off-by: Wang King <king.wang@huawei.com>
Signed-off-by: John Ferlan <jferlan@redhat.com>
Based on work of Mehdi Abaakouk <sileht@sileht.net>.
When parsing vhost-user interface XML and no ifname is found we
can try to fill it in in post parse callback. The way this works
is we try to make up interface name from given socket path and
then ask openvswitch whether it knows the interface.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
File open errors are prevented by a file exists check before
virFileReadAll is called since all callers of the virReadFCHost
method handle errors themselves based on the NULL return anyway.
Also included is a minor spelling correction in a comment.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
This is a simple wrapper over mount(). However, not every system
out there is capable of moving a mount point. Therefore, instead
of having to deal with this fact in all the places of our code we
can have a simple wrapper and deal with this fact at just one
place.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Other drivers (like qemu) would like to know if the namespaces
are available therefore it makes sense to move this function to
a shared module.
At the same time, this function had some default namespaces that
are checked with every call. It is not necessary - let callers
pass just those namespaces they are interested in.
With the move the function is renamed to
virProcessNamespaceAvailable.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The code at the very bottom of the DAC secdriver that calls
chown() should be fine with read-only data. If something needs to
be prepared it should have been done beforehand.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This patch adds support and documentation for
a generalized hardware cache event called cache_l1d
perf event.
Signed-off-by: Nitesh Konkar <nitkon12@linux.vnet.ibm.com>
Currently when spawning containers with systemd, the container PID 1
will get moved into the systemd machine slice. Libvirt then manually
moves the libvirt_lxc and qemu-nbd processes into the cgroups associated
with the slice, but skips the systemd controller cgroup. This means that
from systemd's POV, libvirt_lxc and qemu-nbd are still part of the
libvirtd.service unit.
On systemctl daemon-reload, it will notice that libvirt_lxc & qemu-nbd
are in the libvirtd.service unit for the systemd controller, but in the
machine cgroups for resources. Systemd will thus move them back into
the libvirtd.service resource cgroups next time libvirtd is restarted.
This causes libvirtd to kill off the container due to incorrect cgroup
placement.
The solution is to ensure that when moving libvirt_lxc & qemu-nbd, we
also move the systemd cgroup controller placement. Normally this is
not something we ever want todo, but this is a special case as we are
intentionally wanting to move them to a different systemd unit.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>