Add an optional virTristateBool haveTLS to virStorageSource to
manage whether a storage source will be using TLS.
Sample XML for a VxHS disk:
<disk type='network' device='disk'>
<driver name='qemu' type='raw' cache='none'/>
<source protocol='vxhs' name='eb90327c-8302-4725-9e1b-4e85ed4dc251' tls='yes'>
<host name='192.168.0.1' port='9999'/>
</source>
<target dev='vda' bus='virtio'/>
</disk>
Additionally add a tlsFromConfig boolean to control whether the TLS
setting was due to domain configuration or qemu.conf global setting
in order to decide whether to Format the haveTLS setting for either
a live or saved domain configuration file.
Update the qemuxml2xmltest in order to add a test to show the proper
parsing.
Also update the docs to describe the tls attribute.
Signed-off-by: Ashish Mittal <Ashish.Mittal@veritas.com>
Signed-off-by: John Ferlan <jferlan@redhat.com>
Calling fallocate on the new (smaller) capacity ensures
that the whole file is allocated, but it does not reduce
the file size.
Also call ftruncate after fallocate.
https://bugzilla.redhat.com/show_bug.cgi?id=1366446
We have been trying to implement the ALLOCATE flag to mean
"the volume should be fully allocated after the resize".
Since commit b0579ed9 we do not allocate from the existing
capacity, but from the existing allocation value.
However this value is a total of all the allocated bytes,
not an offset.
For a sparsely allocated file:
$ perl -e 'print "x"x8192;' > vol1
$ fallocate -p -o 0 -l 4096 vol1
$ virsh vol-info vol1 default
Capacity: 8.00 KiB
Allocation: 4.00 KiB
Treating allocation as an offset would result in an incompletely
allocated file:
$ virsh vol-resize vol1 --pool default 16384 --allocate
Capacity: 16.00 KiB
Allocation: 12.00 KiB
Call fallocate from zero on the whole requested capacity to fully
allocate the file. After that, the volume is fully allocated
after the resize:
$ virsh vol-resize vol1 --pool default 16384 --allocate
$ virsh vol-info vol1 default
Capacity: 16.00 KiB
Allocation: 16.00 KiB
Introduce a new function virFileAllocate that will call the
non-destructive variants of safezero, essentially reverting
my commit 1390c268
safezero: fall back to writing zeroes even when resizing
back to the state as of commit 18f0316
virstoragefile: Have virStorageFileResize use safezero
This means that _ALLOCATE flag will no longer work on platforms
without the allocate syscalls, but it will not overwrite data
either.
Seeing a log message saying 'flags=93' is ambiguous & confusing unless
you happen to know that libvirt always prints flags as hex. Change our
debug messages so that they always add a '0x' prefix when printing flags,
and '0' prefix when printing mode. A few other misc places gain a '0x'
prefix in error messages too.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
After commit 8708ca01c0 libvirtd consistently aborts with "stack
smashing detected" when nodedev driver is initialized.
This is caused by nlmsg_parse() being told that its array of nlattr*
has CTRL_CMD_MAX (10) entries, when in fact it is declared to have
CTRL_ATTR_MAX (8) entries. Since all the entries are initialized to
NULL, the result is that nlmsg_parse is overwriting 2*(sizof(nlattr*))
bytes outside the array.
Signed-off-by: Laine Stump <laine@laine.org>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Commit id '5604c056' used the wrong API to generate the
<secret type='%s'..." field. The previous code used the
correct API as was done in commit id '6887af39'. The data
is actually a usage type not an auth type even though the
result is the same.
The iohelper currently calls saferead() to get data from the
underlying file. This has a problem with O_DIRECT when hitting
end-of-file. saferead() is asked to read 1MB, but the first
read() it does may return only a few KB, so it'll try another
read() to fill the remaining buffer. Unfortunately the buffer
pointer passed into this 2nd read() is likely not aligned
to the extent that O_DIRECT requires, so rather than seeing
'0' for end-of-file, we'll get -1 + EINVAL due to misaligned
buffer.
The way the iohelper is currently written, it already handles
getting short reads, so there is actually no need to use
saferead() at all. We can simply call read() directly. The
benefit of this is that we can now write() the data immediately
so when we go into the subsequent reads() we'll always have a
correctly aligned buffer.
Technically the file position ought to be aligned for O_DIRECT
too, but this does not appear to matter when at end-of-file.
Tested-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add the backing parse and a test case to verify parsing of VxHS
backing storage.
Signed-off-by: Ashish Mittal <Ashish.Mittal@veritas.com>
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add a new virStorageNetProtocol for Veritas HyperScale (VxHS) disks
Signed-off-by: Ashish Mittal <Ashish.Mittal@veritas.com>
Signed-off-by: John Ferlan <jferlan@redhat.com>
The mlx4 (Mellanox) netdev driver implements the sysfs phys_port_id
file for both VFs and PFs, so you can find the VF netdev plugged into
the same physical port as any given PF netdev by comparing the
contents of phys_port_id of the respective netdevs. That's what
libvirt does when attempting to find the PF netdev for a given VF
netdev (or vice versa).
Most other netdev's drivers don't implement phys_port_id, so the file
is visible in sysfs directory listing, but attempts to read it result
in ENOTSUPP. In these cases, libvirt is unable to read phys_port_id of
either the PF or the VF, so it just returns the first entry in the
PF/VF's list of netdevs.
But we've found that the i40e driver is in between those two
situations - it implements phys_port_id for PF netdevs, but doesn't
implement it for VF netdevs. So libvirt would successfully read the
phys_port_id of the PF netdev, then try to find a VF netdev with
matching phys_port_id, but would fail because phys_port_id is NULL for
all VFs. This would result in a message like the following:
Could not find network device with phys_port_id '3cfdfe9edc39'
under PCI device at /sys/class/net/ens4f1/device/virtfn0
To solve this problem in a way that won't break functionality for
anyone else, this patch saves the first netdev name we find for the
device, and returns that if we fail to find a netdev with the desired
phys_port_id.
Instead of checking for all possible constants that every
kernel header with devlink support should have (and defining
HAVE_DECL_DEVLINK as 1 if any of them is present due to the
way AC_CHECK_DECLS works), only check for DEVLINK_CMD_ESWITCH_GET.
This is the name of the constant since kernel 4.11. Between 4.8
and 4.11, the now deprecated spelling DEVLINK_CMD_ESWITCH_MODE_GET
was used.
Assume DEVLINK_ESWITCH_MODE_SWITCHDEV is available, since it was
introduced along with the deprecated spelling.
Adding functionality to libvirt that will allow querying the interface
for the availability of switchdev Offloading NIC capabilities.
The switchdev mode was introduced in kernel 4.8, the iproute2-devlink
command to retrieve the switchdev NIC feature with command example:
devlink dev eswitch show pci/0000:03:00.0
This feature is needed for Openstack so we can do a scheduling decision
if the NIC is in Hardware Offload (switchdev) or regular SR-IOV (legacy) mode.
And select the appropriate hypervisors with the requested capability see [1].
[1] - https://specs.openstack.org/openstack/nova-specs/specs/pike/approved/enable-sriov-nic-features.html
Reviewed-by: Laine Stump <laine@laine.org>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Some cleanup paths overwrite a usefull error message with a less useful
one and we then try to preserve the original message. The handlers added
in this patch will simplify the operations since they are designed right
for the purpose.
TPM 2 does not implement sysfs files for cancellation of commands.
We therefore use /dev/null for the cancel path passed to QEMU.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Tested-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Although not previously explicitly documented, the expectation for
the libvirt event loop is that an implementation is registered early
in application startup, before calling any libvirt APIs and then
run forever after. Replacing a previously registered event loop is
not safe & subject to races even if virConnectClose has been called
on open handles, due to delayed deregistration of callbacks during
conenction close.
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
We can now check for the error and not care about the return value as
it will be properly handled in virBufferContentAndReset() anyway.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The function is useful even without using the return value. And if
needed, the return value can be obtained by other calls as well. The
potential for clean-up can be seen in the following patch.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Commit 94c465d0 refactored the logging setup phase but introduced an
issue, where the daemon ignores verbose mode when there are no outputs
defined and the default must be used. The problem is that the default
output was determined too early, thus ignoring the potential '--verbose'
option taking effect. This patch postpones the creation of the default
output to the very last moment when nothing else can change. Since the
default output is only created during the init phase, it's safe to leave
the pointer as NULL for a while, but it will be set eventually, thus not
affecting runtime.
Patch also adjusts both the other daemons.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1442947
Signed-off-by: Erik Skultety <eskultet@redhat.com>
This helper allows you to better structurize the code if some element
may or may not contains attributes and/or child elements.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
The helper returns true if a string contains any of the given chars.
virStringHasControlChars can be reimplemented using that helper.
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
It's equivalent of calling virXPathString("string(.)", ctxt) but it
doesn't have to use the XPath resolving and parsing.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
The virXMLPropStringLimit is an equivalent of virXPathStringLimit
which should be preferred if you already have a XML dom node or
if you need to parse more than one property.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Move networkMacMgrFileName into src/util/virmacmap.c and rename to
virMacMapFileName. We're about to move some more MacMgr processing
files into virnetworkobj and it doesn't make sense to have this helper
in the driver or in virnetworkobj.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Rather than assuming that what's passed to virObject{Ref|Unref}
would be a virObjectPtr as long as it's not NULL, let's do the
similar checks virObjectIsClass in order to prevent a possible
increment or decrement to some field at the obj->u.s.refs offset.
Signed-off-by: John Ferlan <jferlan@redhat.com>
The virObjectIsClass API has only ever checked object validity
based on if the @obj is not NULL and it was derived from some class.
While this has worked well in general, there is one additional
check that could be made prior to calling virClassIsDerivedFrom
which loops through the classes checking the magic number against
the klass expected magic number.
If by chance a non virObject is passed, rather than assuming the
void * @obj is a _virObject and thus offsetting to obj->klass,
obj->magic, and obj->parent, let's check that the void * @obj
has at least the "base part" of the magic number in the right
place and generate a more specific VIR_WARN message if not.
There are many consumers to virObjectIsClass, include the locking
primitives virObject{Lock|Unlock}, virObjectRWLock{Read|Write},
and virObjectRWUnlock. For those callers, the locking call will
not fail, but it also will not attempt a virMutex* call which
will "most likely" fail since the &obj->lock is used.
In order to avoid some possible future wrap on the 0xCAFExxxx
value, add a check during initialization that some new class
won't cause the wrap. Should be good for a few years at least!
It is still left up to the caller to handle the failed API calls
just as it would be if it passed a NULL opaque pointer anyobj.
If virObjectIsClass fails "internally" to virobject.c, create a
macro to generate the VIR_WARN describing what the problem is.
Also improve the checks and message a bit to indicate which was
the failure - whether the obj was NULL or just not the right class
Signed-off-by: John Ferlan <jferlan@redhat.com>
Rather than overload virObjectUnlock as commit id '77f4593b' has
done, create a separate virObjectRWUnlock API that will force the
consumers to make the proper decision regarding unlocking the
RWLock's. Similar to the RWLockRead and RWLockWrite, use the
virObjectGetRWLockableObj helper. This restores the virObjectUnlock
code to using the virObjectGetLockableObj.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Introduce a helper to handle the error path more cleanly. The same
as virObjectGetLockableObj in order to essentially follow the original
logic of commit 'b545f65d' to ensure that the input argument at least
has some validity before using.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Now that virObjectRWLockWrite exists to handle the virObjectRWLockable
objects, let's restore virObjectLock to only handle virObjectLockable
class locks. There still exists the possibility that the input @anyobj
isn't a valid object and the resource isn't truly locked, but that
also exists before commit id '77f4593b'.
This also restores some logic that commit id '77f4593b' removed
with respect to a common code path that commit id '10c2bb2b' had
introduced as virObjectGetLockableObj. This code path merely does
the same checks as the original virObjectLock commit 'b545f65d',
but in callable/reusable helper to ensure the @obj at least has
some validity before using.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Instead of making virObjectLock be the entry point for two
different types of locks, let's create a virObjectRWLockWrite API
which will only handle the virObjectRWLockableClass objects.
Use the new virObjectRWLockWrite for the virdomainobjlist code
in order to handle the Add, Remove, Rename, and Load operations
that need to be very synchronous.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Since the class it represents is based on virObjectRWLockableClass
and in order to make sure we differentiate just in case anyone somehow
believes they could use virObjectLockRead for a virObjectLockableClass,
let's rename the API to use the RW in the name. Besides the RW locks
refer to pthread_rwlock_{init|rdlock|wrlock|unlock|destroy} while the
other locks refer to pthread_mutex_{init|lock|unlock|destroy}.
Signed-off-by: John Ferlan <jferlan@redhat.com>
This way later patches can add another structures with virResctrl
prefix without the meaning being even more confusing than it needs to
be.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
That means that returning negative values means error and non-negative
values differ in meaning, but are all successful.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
It doesn't access anything from conf/ and ti will be needed to use
from other util/ places. This split makes the separation clearer.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Commit 81fb440b further qualified an if statement by adding the
boolean saveVlan to the condition. Coverity pointed out that this
change in the logic eliminated the need to check saveVlan in an
argument to virAsprintf().
Commit 9a94af6d restructured virHostdevReadNetConfig() so that it
would manually set ret = 0 after successfully reading the device's
config, but Coverity pointed out that "ret = 0" was erroneously placed
outside of an "else" clause, meaning that the the value of ret set in
the "if" clause was unnecessarily and incorrectly overwritten.
This patch moves ret = 0 into the else clause, which should silence
Coverity.
When using a VF from an SRIOV-capable network card in a guest (either
in macvtap passthrough mode, or via VFIO PCI device assignment), The
associated PF netdev must be online in order for the VF to be usable
by the guest. The guest, however, is not able to change the state of
the PF. And libvirt *could* set the PF online as needed, but that
could lead to the host receiving unexpected IPv6 traffic (since the
default for an unconfigured interface is to participate in IPv6
autoconf). For this reason, before assigning a VF to a guest, libvirt
verifies that the related PF netdev is online - if it isn't, then we
log an error and don't allow the guest startup to continue.
Until now, this check was done during virNetDevSetNetConfig(). This
works nicely because the same function is called both for macvtap
passthrough and for VFIO device assignment. But in the case of VFIO,
the VF has already been unbound from its netdev driver by the time we
get to virNetDevSetNetConfig(), and in the case of dual port Mellanox
NICs that have their VFs setup in single port mode, the *only* way to
determine the proper PF netdev to query for online status is via the
"phys_port_id" file that is in the VF netdev's sysfs directory. *BUT*
if we've unbound the VF from the netdev driver, then it doesn't *have*
a netdev sysfs directory.
So, in order to check the correct PF netdev for online status, this
patch moved the check earlier in the setup, into
virNetDevSaveNetConfig(), which is called *before* unbinding the VF
from its netdev driver.
(Note that this implies that if you are using VFIO device assignment
for the VFs of a Mellanox NIC that has the VFs programmed in single
port mode, you must let the VFs be bound to their net driver and use
"managed='yes'" in the device definition. To be more specific, this is
only true if the VFs in single port mode are using port *2* of the PF
- if the VFs are using only port 1, then the correct PF netdev will be
arrived at by default/chance))
This resolves: https://bugzilla.redhat.com/267191
virHostdevRestoreNetConfig() calls virNetDevReadNetConfig() to try and
read the "original config" of a netdev, and if that fails, it tries
again with a different directory/netdev name. This achieves the
desired effect (we end up finding the config wherever it may be), but
for each failure, virNetDevReadNetConfig() places a nice error message
in the system logs. Experience has shown that false-positive error
logs like this lead to erroneous bug reports, and can often mislead
those searching for *real* bugs.
This patch changes virNetDevReadNetConfig() to explicitly check if the
file exists before calling virFileReadAll(); if it doesn't exist,
virNetDevReadNetConfig() returns a success, but leaves all the
variables holding the results as NULL. (This makes sense if you define
the purpose of the function as "read a netdev's config from its config
file *if that file exists*).
To take advantage of that change, the caller,
virHostdevRestoreNetConfig() is modified to fail immediately if
virNetDevReadNetConfig() returns an error, and otherwise to try the
different directory/netdev name if adminMAC & vlan & MAC are all NULL
after the preceding attempt.
Mellanox ConnectX-3 dual port SRIOV NICs present a bit of a challenge
when assigning one of their VFs to a guest using VFIO device
assignment.
These NICs have only a single PCI PF device, and that single PF has
two netdevs sharing the single PCI address - one for port 1 and one
for port 2. When a VF is created it can also have 2 netdevs, or it can
be setup in "single port" mode, where the VF has only a single netdev,
and that netdev is connected either to port 1 or to port 2.
When the VF is created in dual port mode, you get/set the MAC
address/vlan tag for the port 1 VF by sending a netlink message to the
PF's port1 netdev, and you get/set the MAC address/vlan tag for the
port 2 VF by sending a netlink message to the PF's port 2 netdev. (Of
course libvirt doesn't have any way to describe MAC/vlan info for 2
ports in a single hostdev interface, so that's a bit of a moot point)
When the VF is created in single port mode, you can *set* the MAC/vlan
info by sending a netlink message to *either* PF netdev - the driver
is smart enough to understand that there's only a single netdev, and
set the MAC/vlan for that netdev. When you want to *get* it, however,
the driver is more accurate - it will return 00:00:00:00:00:00 for the
MAC if you request it from the port 1 PF netdev when the VF was
configured to be single port on port 2, or if you request if from the
port 2 PF netdev when the VF was configured to be single port on port
1.
Based on this information, when *getting* the MAC/vlan info (to save
the original setting prior to assignment), we determine the correct PF
netdev by matching phys_port_id between VF and PF.
(IMPORTANT NOTE: this implies that to do PCI device assignment of the
VFs on dual port Mellanox cards using <interface type='hostdev'>
(i.e. if you want the MAC address/vlan tag to be set), not only must
the VFs be configured in single port mode, but also the VFs *must* be
bound to the host VF net driver, and libvirt must use managed='yes')
By the time libvirt is ready to set the new MAC/vlan tag, the VF has
already been unbound from the host net driver and bound to
vfio-pci. This isn't problematic though because, as stated earlier,
when a VF is created in single port mode, commands to configure it can
be sent to either the port 1 PF netdev or the port 2 PF netdev.
When it is time to restore the original MAC/vlan tag, again the VF
will *not* be bound to a host net driver, so it won't be possible to
learn from sysfs whether to use the port 1 or port 2 PF netdev for the
netlink commands. And again, it doesn't matter which netdev you
use. However, we must keep in mind that we saved the original settings
to a file called "${PF}_${VFNUM}". To solve this problem, we just
check for the existence of ${PF1}_${VFNUM} and ${PF2}_${VFNUM}, and
use whichever one we find (since we know that only one can be there)
This patch updates functions in netdev.c to pay attention to
phys_port_id. It uses the new function virNetDevGetPhysPortID() to
learn the phys_port_id of a VF or PF, then sends that info to
virPCIGetNetName(), which has newly been modified to take an optional
phys_port_id.
A single PCI device may have multiple netdevs associated with it. Each
of those netdevs will have a different phys_port_id entry in
sysfs. This patch modifies virPCIGetNetName() to allow selecting one
of the potential many netdevs in two different ways:
1) by setting the "idx" argument, the caller can select the 1st (0),
2nd (1), etc. netdev from the PCI device's net subdirectory.
2) If the physPortID arg is set (to a null-terminated string) then
virPCIGetNetName() returns the netdev that has that phys_port_id in
the sysfs file of the same name in the netdev's directory.
On Linux each network device *can* (but not necessarily *does*) have
an attribute called phys_port_id which can be read from the file of
that name in the netdev's sysfs directory. The examples I've seen have
been a many-digit hexadecimal number (as an ASCII string).
This value can be useful when a single PCI device is associated with
multiple netdevs (e.g a dual port Mellanox SR-IOV NIC - this card has
a single PCI Physical Function (PF), and that PF has two netdevs
associated with it (the "net" subdirectory of the PF in sysfs has two
links rather than the usual single link to a netdev directory). Each
of the PF netdevs has a different phys_port_id. The Virtual Functions
(VF) are similar - the PF (a PCI device) has "n" VFs (also each of
these is a PCI device), each VF has two netdevs, and each of the VF
netdevs points back to the VF PCI device (with the "device" entry in
its sysfs directory) as well as having a phys_port_id matching the PF
netdev it is associated with.
virNetDevGetPhysPortID() simply attempts to read the phys_port_id for
the given netdev and return it to the caller. If this particular
netdev driver doesn't support phys_port_id, it returns NULL (*not* a
NULL-terminated string, but a NULL pointer) but still counts it as a
success.
We are given a string in @machinename, we never allocate it, just
merely use it for reading. We should not free it otherwise it
leads to double free:
==32191== Thread 17:
==32191== Invalid free() / delete / delete[] / realloc()
==32191== at 0x4C2D1A0: free (vg_replace_malloc.c:530)
==32191== by 0x54BBB84: virFree (viralloc.c:582)
==32191== by 0x2BC04499: qemuProcessStop (qemu_process.c:6313)
==32191== by 0x2BC500FF: processMonitorEOFEvent (qemu_driver.c:4724)
==32191== by 0x2BC502FC: qemuProcessEventHandler (qemu_driver.c:4769)
==32191== by 0x5550640: virThreadPoolWorker (virthreadpool.c:167)
==32191== by 0x554FBCF: virThreadHelper (virthread.c:206)
==32191== by 0x8F913D3: start_thread (in /lib64/libpthread-2.23.so)
==32191== by 0x928DE3C: clone (in /lib64/libc-2.23.so)
==32191== Address 0x31893d70 is 0 bytes inside a block of size 1,100 free'd
==32191== at 0x4C2D1A0: free (vg_replace_malloc.c:530)
==32191== by 0x54BBB84: virFree (viralloc.c:582)
==32191== by 0x54C1936: virCgroupValidateMachineGroup (vircgroup.c:343)
==32191== by 0x54C4B29: virCgroupNewDetectMachine (vircgroup.c:1550)
==32191== by 0x2BBDDA29: qemuConnectCgroup (qemu_cgroup.c:972)
==32191== by 0x2BC05DA7: qemuProcessReconnect (qemu_process.c:6822)
==32191== by 0x554FBCF: virThreadHelper (virthread.c:206)
==32191== by 0x8F913D3: start_thread (in /lib64/libpthread-2.23.so)
==32191== by 0x928DE3C: clone (in /lib64/libc-2.23.so)
==32191== Block was alloc'd at
==32191== at 0x4C2BE80: malloc (vg_replace_malloc.c:298)
==32191== by 0x4C2E35F: realloc (vg_replace_malloc.c:785)
==32191== by 0x54BB492: virReallocN (viralloc.c:245)
==32191== by 0x54BEDF2: virBufferGrow (virbuffer.c:150)
==32191== by 0x54BF3B9: virBufferVasprintf (virbuffer.c:408)
==32191== by 0x54BF324: virBufferAsprintf (virbuffer.c:381)
==32191== by 0x55BB271: virDomainGenerateMachineName (domain_conf.c:27078)
==32191== by 0x2BBD5B8F: qemuDomainGetMachineName (qemu_domain.c:9595)
==32191== by 0x2BBDD9B4: qemuConnectCgroup (qemu_cgroup.c:966)
==32191== by 0x2BC05DA7: qemuProcessReconnect (qemu_process.c:6822)
==32191== by 0x554FBCF: virThreadHelper (virthread.c:206)
==32191== by 0x8F913D3: start_thread (in /lib64/libpthread-2.23.so)
Moreover, make the @machinename 'const char *' to mark it
explicitly that we are not changing the passed string.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The new virFileCache will nicely handle the caching logic for any data
that we would like to cache. For each type of data we will just need
to implement few handlers that will take care of creating, validating,
loading and saving the cached data.
The cached data must be an instance of virObject.
Currently we cache QEMU capabilities which will start using
virFileCache.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
It is more related to a domain as we might use it even when there is
no systemd and it does not use any dbus/systemd functions. In order
not to use code from conf/ in util/ pass machineName in cgroups code
as a parameter. That also fixes a leak of machineName in the lxc
driver and cleans up and de-duplicates some code.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This reverts commit 328bd24443.
As it turns out, this is not portable and very Linux & glibc
specific. Worse, this may lead to not starving writers on Linux
but everywhere else. Revert this and if the starvation occurs
resolve it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Up until now we only had virObjectLockable which uses mutexes for
mutually excluding each other in critical section. Well, this is
not enough. Future work will require RW locks so we might as well
have virObjectRWLockable which is introduced here.
Moreover, polymorphism is introduced to our code for the first
time. Yay! More specifically, virObjectLock will grab a write
lock, virObjectLockRead will grab a read lock then (what a
surprise right?). This has great advantage that an object can be
made derived from virObjectRWLockable in a single line and still
continue functioning properly (mutexes can be viewed as grabbing
write locks only). Then just those critical sections that can
grab a read lock need fixing. Therefore the resulting change is
going to be way smaller.
In order to avoid writer starvation, the object initializes RW
lock that prefers writers.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
We already have virRWLockInit. But this uses pthread defaults
which prefer reader to initialize the RW lock. This may lead to
writer starvation. Therefore we need to have the counterpart that
prefers writers. Now, according to the
pthread_rwlockattr_setkind_np() man page setting
PTHREAD_RWLOCK_PREFER_WRITER_NP attribute is no-op. Therefore we
need to use PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP
attribute. So much for good enum value names.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Currently, @port is type of string. Well, that's overkill and
waste of memory. Port is always an integer. Use it as such.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
A new function virNetDevOpenvswitchUpdateVlan has been created to instruct
OVS of the changes. qemuDomainChangeNet has been modified to handle the
update of the VLAN configuration for a running guest and rely on
virNetDevOpenvswitchUpdateVlan to do the actual update if needed.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
While searching for an element using a function it may be
desirable to know the element key for future operation.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
The purpose of this function is to tell if the current position
in given FD is in data section or a hole and how much bytes there
is remaining until the end of the section. This is achieved by
couple of lseeks(). The most important part is that we reposition
the FD back, so that the position is unchanged from the caller
POV. And until now the final lseek() back to the original
position was done with no check for errors. And I was convinced
that that's okay since nothing can go wrong. However, review
feedback from a related series persuaded me, that it's better to
be safe than sorry. Therefore, lets check if the final lseek()
succeeded and if it doesn't report an error.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Fill them in right away rather than having to figure out at runtime
whether they are necessary or not.
virStorageSourceNetworkDefaultPort does not need to be exported any
more.
This reverts commit e4b980c853.
When a binary links against a .a archive (as opposed to a shared library),
any symbols which are marked as 'weak' get silently dropped. As a result
when the binary later runs, those 'weak' functions have an address of
0x0 and thus crash when run.
This happened with virtlogd and virtlockd because they don't link to
libvirt.so, but instead just libvirt_util.a and libvirt_rpc.a. The
virRandomBits symbols was weak and so left out of the virtlogd &
virtlockd binaries, despite being required by virHashTable functions.
Various other binaries like libvirt_lxc, libvirt_iohelper, etc also
link directly to .a files instead of libvirt.so, so are potentially
at risk of dropping symbols leading to a later runtime crash.
This is normal linker behaviour because a weak symbol is not treated
as undefined, so nothing forces it to be pulled in from the .a You
have to force the linker to pull in weak symbols using -u$SYMNAME
which is not a practical approach.
This risk is silent bad linkage that affects runtime behaviour is
not acceptable for a fix that was merely trying to fix the test
suite. So stop using __weak__ again.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the scan of the /proc/mounts file used to find cgroup mount
points doesn't take into account that mount points may hidden by other
mount points. For, example in certain Kubernetes environments the
/proc/mounts contains the following lines:
cgroup /sys/fs/cgroup/net_prio,net_cls cgroup ...
tmpfs /sys/fs/cgroup tmpfs ...
cgroup /sys/fs/cgroup/net_cls,net_prio cgroup ...
In this particular environment the first mount point is hidden by the
second one. The correct mount point is the third one, but libvirt will
never process it because it only checks the first mount point for each
controller (net_cls in this case). So libvirt will try to use the first
mount point, which doesn't actually exist, and the complete detection
process will fail.
To avoid that issue this patch changes the virCgroupDetectMountsFromFile
function so that when there are duplicates it takes the information from
the last line in /proc/mounts. This requires removing the previous
explicit condition to skip duplicates, and adding code to free the
memory used by the processing of duplicated lines.
Related-To: https://bugzilla.redhat.com/1468214
Related-To: https://github.com/kubevirt/libvirt/issues/4
Signed-off-by: Juan Hernandez <jhernand@redhat.com>
Currently all mockable functions are annotated with the 'noinline'
attribute. This is insufficient to guarantee that a function can
be reliably mocked with an LD_PRELOAD. The C language spec allows
the compiler to assume there is only a single implementation of
each function. It can thus do things like propagating constant
return values into the caller at compile time, or creating
multiple specialized copies of the function body each optimized
for a different caller. To prevent these optimizations we must
also set the 'noclone' and 'weak' attributes.
This fixes the test suite when libvirt.so is built with CLang
with optimization enabled.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The HOST_NAME_MAX, INET_ADDRSTRLEN and VIR_LOOPBACK_IPV4_ADDR
constants are only used by a handful of files, so are better
kept in virsocketaddr.h or the source file that uses them.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If a value of the first level object contains more objects needing
deflattening which would be wrapped in an actual object the function
would not recurse into them.
By this simple addition we can fully deflatten the objects.
As it turns out sometimes users pass in an arbitrarily nested structure
e.g. for the qemu backing chains JSON pseudo protocol. This new
implementation deflattens now a single object fully even with nested
keys.
Additionally it's not necessary now to stick with the "file." prefix for
the properties.
Currently the function would deflatten the object by dropping the 'file'
prefix from the attributes. This does not really scale well or adhere to
the documentation.
Until we refactor the worker to properly deflatten everything we at
least simulate it by adding the "file" wrapper object back.
virCommand is a version of virExec that doesn't fork, however it is
just calling execve and doesn't honors setting uid/gid and pwd.
This commit extrac those pieces from virExec() to a virExecCommon()
function that is called from both virExec() and virCommandExec().
Problem with our error reporting is that the error object is a
thread local variable. That means if there's an error reported
within the I/O thread it gets logged and everything, but later
when the event loop aborts the stream it doesn't see the original
error. So we are left with some generic error. We can do better
if we copy the error message between the threads.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
When the I/O thread quits (e.g. due to an I/O error, lseek()
error, whatever), any subsequent virFDStream API should return
error too. Moreover, when invoking stream event callback, we must
set the VIR_STREAM_EVENT_ERROR flag so that the callback knows
something bad happened.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1371892
As it turns out the volume create, build, and refresh path was not peeking
at the meta data, so immediately after a create operation the value displayed
for capacity was still incorrect. However, if a pool refresh was done the
correct value was fetched as a result of a meta data peek.
The reason is it seems historically if the file type is RAW then peeking
at the file just took the physical value for the capacity. However, since
we know if it's an encrypted file, then peeking at the meta data will be
required in order to get a true capacity value.
So check for encryption in the source and if present, use the meta data
in order to fill in the capacity value and set the payload_offset.
https://bugzilla.redhat.com/show_bug.cgi?id=1461270
When fetching stats for a vhost-user type of interface, we run
couple of ovs-vsctl commands and parse their output. However, not
all stats exist at all times, for instance "rx_dropped" or
"tx_errors" can be missing. Thing is, we ask for a bulk of
statistics and if one of them is missing an error is reported
instead of returning the rest. Since we ignore errors, we fail to
set statistics. Fix this by asking for each piece alone.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit 5c54d29aae forgot to do that when moving the only function
using it and it broke the build on some platforms.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This commit fixes a locale problem with locales that use comma as a mantissa
separator. Example: 12.34 en_US = 12,34 pt_BR. Since strtod() is a non-safe
function, virStrToDouble() will have problems to parse double numbers from
kernel settings and other double numbers from static files (XMLs, JSONs, etc).
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1457634
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1457481
Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
The function virDoubleToStr() is defined in virutil.* and virStrToDouble() is
defined in virstring.*. Joining both functions into the same file makes more
sense.
Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
Starting from qemu 2.9, more granular options are supported. Add parser
for the relevant bits.
With this patch libvirt is able to parse the host and target IQN of from
the JSON pseudo-protocol specification.
This corresponds to BlockdevOptionsIscsi in qemu qapi.
'SocketAddress' structure was changed to contain 'inet' instead of
'tcp' since qemu commit c5f1ae3ae7b. Existing entries have a backward
compatibility layer.
Libvirt will parse 'inet' and 'tcp' as equivalents.
The same json strucutre is used for NBD and sheepdog volumes for
specifying of the host. Rename the function and fix up error messages to
be more universal.
Change the settings from qemuDomainUpdateDeviceLive() as otherwise the
call would succeed even though nothing has changed.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1414627
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
With glibc >= 2.25.90 writev() is only available if you explicitly
include sys/uio.h. This matches the documented requirements, but
older glibc and other *NIX pulled in writev indirectly so the bug
wasn't noticed previously.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Use ATTRIBUTE_FALLTHROUGH, introduced by commit
5d84f5961b, instead of comments to
indicate that the fall through is an intentional behavior.
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1459091
We try to get the last element of the passed path by calling
strrch(path, '/'). However, the pointer that strrchr() returns
points at the slash, We want string that starts right after that.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
There's a problem with current streams after I switched them from
iohelper to thread implementation. Previously, iohelper made sure
not to exceed specified @length resulting in the pipe EOF
appearing at the exact right moment (the pipe was used to tunnel
the data from the iohelper to the daemon). Anyway, when switching
to thread I had to write the I/O code from scratch. Whilst doing
that I took an inspiration from the iohelper code, but since the
usage of pipe switched to slightly different meaning, there was
no 1:1 relationship between the codes.
Moreover, after introducing VIR_FDSTREAM_MSG_TYPE_HOLE, the
condition that should made sure we won't exceed @length was
completely wrong.
The fix is to:
a) account for holes for @length
b) cap not just data sections but holes too (if @length would be
exceeded)
For this purpose, the condition needs to be brought closer to the
code that handles holes and data sections.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The host address or the socket path have already been checked at the
begining of the function virStorageSourceParseNBDColonString(). So,
when the parameter is not a unix socket, there is no reason to check
the address again because if it does not exists, the logic will fail
in the first IF conditional.
Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
VIR_STRDUP returns -1 if the string copy was not successful. So, the
current comparison/logic is throwing an error when VIR_STRDUP() returns
1. Only when source is NULL, it is considering as a success which is
not right.
Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
Callers expect the return value to be the total number of vcpus in the
host (including offline vcpus). The refactor in c67e04e25f
broke this assumption by using virHostCPUGetOnlineBitmap which only
creates a bitmap long enough to hold the last online vcpu.
Report the full number of host vcpus by returning value from
virHostCPUGetCount().
Signed-off-by: Nitesh Konkar <nitkon12@linux.vnet.ibm.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
We will need some convenient helper functions for managing sysfs-entries
for fibre channel-backed devices. Let's implement them and make them
available in the private API.
Signed-off-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
The @ipv6_host allocated in virAsprintf may be lost when virAsprintf
addrstr failed.
Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Many vendor id's and product id's have leading zeros. We should show
them in the logs.
Signed-off-by: Chen Hanxiao <chenhanxiao@gmail.com>
Reviewed-by: Laine Stump <laine@laine.org>
@reply is a DBusMessage object returned by virDBusCallMethod in
get machine object call path, dereference it before calling
virDBusCallMethod again to get machine name.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Some older systems (such as RHEL6) lack SEEK_HOLE and SEEK_DATA
which virFileInData relies on. Provide a stub for these systems.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Currently, we don't assign any meaning to that. Our current view
on virStream is that it's merely a pipe. And pipes don't support
seeking.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit e3ba4025 introduced srv->handles and VIR_RESIZE_N to allocate
@handles as necessary, but did not free the handles during when calling
virNetlinkEventServiceStop.
Commit 15a71e60 introduced the virNetlinkEventServiceStopAll function, and
the code in virNetlinkEventServiceStop is copied to this function, so just
call virNetlinkEventServiceStop instead.
Namely, this patch is about virMediatedDeviceGetIOMMUGroup{Dev,Num}
functions. There's no compelling reason why these functions should take
an object, on the contrary, having to create an object every time one
needs to query the IOMMU group number, discarding the object afterwards,
seems odd.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Commit 8e09663 "pci: recognize/report GEN4 (PCIe 4.0) card 16GT/s Link
speed" introduced another speed into enum, but mistakenly also altered
field width, so one bit of link width was included there.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Basically, what is needed here is to introduce new message type
for the messages passed between the event loop callbacks and the
worker thread that does all the I/O. The idea is that instead of
a queue of read buffers we will have a queue where "hole of size
X" messages appear. That way the event loop callbacks can just
check the head of the queue and see if the worker thread is in
data or a hole section and how long the section is.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This function takes a FD and determines whether the current
position is in data section or in a hole. In addition to that,
it also determines how much bytes are there remaining till the
current section ends.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
One big downside of using the pipe to transfer the data is that
we can really transfer just bare data. No metadata can be carried
through unless some formatted messages are introduced. That would
be quite painful to achieve so let's use a message queue. It's
fairly easy to exchange info between threads now that iohelper is
no longer used.
The reason why we cannot use the FD for plain files directly is
that despite us setting noblock flag on the FD, any
read()/write() blocks regardless (which is a show stopper since
those parts of the code are run from the event loop) and poll()
reports such FD as always readable/writable - even though the
subsequent operation might block.
The pipe is still not gone though. It is used to signal the event
loop that an event occurred (e.g. data is available for reading
in the queue, or vice versa).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The QEMU default is GICv2, and some of the code in libvirt
relies on the exact value. Stop pretending that's not the
case and use GICv2 explicitly where needed.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
The metadata libvirt cares about is identical for version 3
as for previous versions, so we merely need list the new
version number.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Simply tries to match the provided regex on a string and returns
the result. Useful if caller don't care about the matched substring
and want to just test if some pattern patches a string.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
GCC complains that inlining virStringTrimOptionalNewline is not
likely on some platforms:
cc1: warnings being treated as errors
../../src/util/virfile.c: In function 'virFileReadValueBitmap':
../../src/util/virstring.h:292: error: inlining failed in call to 'virStringTrimOptionalNewline': call is unlikely and code size would grow [-Winline]
../../src/util/virfile.c:3987: error: called from here [-Winline]
Inlining this function is not going to be a measurable performance
benefit either, since the time required to execute it is going to
be dominated by running of strlen() over the string, not by the
function call overhead.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
It is no longer needed thanks to the great virfilewrapper.c. And this
way we don't have to add a new set of functions for each prefixed
path.
While on that, add two functions that weren't there before, string and
scaled integer reading ones. Also increase the length of the string
being read by one to accompany for the optional newline at the
end (i.e. change INT_STRLEN_BOUND to INT_BUFSIZE_BOUND).
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
So, because mingw is somehow OK with dereferencing a pointer within a
VIR_DEBUG macro, compared to outside of it to which it complained with a
"potential NULL pointer dereference" error (still a false positive), we
can make the code a tiny bit cleaner.
Sighed-by: Erik Skultety <eskultet@redhat.com>
Signed-off-by: Erik Skultety <eskultet@redhat.com>
After bdcf6e481 there is a crasher in libvirt. The commit assumes
that priv->perf is always set. That is not true. For inactive
domains, the priv->perf is not allocated as it is set in
qemuProcessLaunch(). Now, usually we differentiate between
accesses to inactive and active definition and it works just
fine. Except for 'domstats'. There priv->perf is accessed without
prior check for domain inactivity. While we could check for that,
more robust solution is to make virPerfEventIsEnabled() accept
NULL.
How to reproduce:
1) ensure you have at least one inactive domain
2) virsh domstats
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
This patch fixes the following MinGW error (although actually being a
false positive):
../../src/util/virmdev.c: In function 'virMediatedDeviceListMarkDevices':
../../src/util/virmdev.c:453:21: error: potential null pointer
dereference [-Werror=null-dereference]
const char *mdev_path = mdev->path;
^~~~~~~~~
Signed-off-by: Erik Skultety <eskultet@redhat.com>
The problem resides in virHostdevUpdateActiveMediatedDevices which gets
called during qemuProcessReconnect. The issue here is that
virMediatedDeviceListAdd takes a pointer to the item to be added to the
list to which VIR_APPEND_ELEMENT is used, which also clears the pointer.
However, in this case only the local copy of the pointer got cleared,
leaving the original pointing to valid memory. To sum it up, during
cleanup phase, the original pointer is freed and the daemon crashes
basically any time it would access it.
Backtrace:
0x00007ffff3ccdeba in __strcmp_sse2_unaligned
0x00007ffff72a444a in virMediatedDeviceListFindIndex
0x00007ffff7241446 in virHostdevReAttachMediatedDevices
0x00007fffc60215d9 in qemuHostdevReAttachMediatedDevices
0x00007fffc60216dc in qemuHostdevReAttachDomainDevices
0x00007fffc6046e6f in qemuProcessStop
0x00007fffc6091596 in processMonitorEOFEvent
0x00007fffc6091793 in qemuProcessEventHandler
0x00007ffff7294bf5 in virThreadPoolWorker
0x00007ffff7294184 in virThreadHelper
0x00007ffff3fdc3c4 in start_thread () from /lib64/libpthread.so.0
0x00007ffff3d269cf in clone () from /lib64/libc.so.6
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1446455
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
Use a local variable to hold data, rather than accessing the pointer
after calling virMediatedDeviceListAdd (therefore VIR_APPEND_ELEMENT).
Although not causing an issue at the moment, this change is a necessary
prerequisite for tweaking virMediatedDeviceListAdd in a separate patch,
which will take a reference for the source pointer (instead of pointer
value) and will clear it along the way.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
If we are encoding a block of data that is 16 bytes in length,
we cannot leave it as 16 bytes, we must pad it out to the next
block boundary, 32 bytes. Without this padding, the decoder will
incorrectly treat the last byte of plain text as the padding
length, as it can't distinguish padded from non-padded data.
The problem exhibited itself when using a 16 byte passphrase
for a LUKS volume
$ virsh secret-set-value 55806c7d-8e93-456f-829b-607d8c198367 \
$(echo -n 1234567812345678 | base64)
Secret value set
$ virsh start demo
error: Failed to start domain demo
error: internal error: process exited while connecting to monitor: >>>>>>>>>>Len 16
2017-05-02T10:35:40.016390Z qemu-system-x86_64: -object \
secret,id=virtio-disk1-luks-secret0,data=SEtNi5vDUeyseMKHwc1c1Q==,\
keyid=masterKey0,iv=zm7apUB1A6dPcH53VW960Q==,format=base64: \
Incorrect number of padding bytes (56) found on decrypted data
Notice how the padding '56' corresponds to the ordinal value of
the character '8'.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
... with VIR_NET_GENERATED_MACV???_PREFIX, which is defined in
util/virnetdevmacvlan.h.
Since VIR_NET_GENERATED_PREFIX is used for plain tap devices, it is
renamed to VIR_NET_GENERATED_TAP_PREFIX and moved to virnetdev.h
MACVTAP_NAME_PREFIX and MACVLAN_NAME_PREFIX could be useful to other
files if they were defined in virnetdevmacvlan.h instead of
virnetdevmacvlan.c, so do that (while slightly renaming them and also
adding yet another #define that chooses between macvlan/macvtap based
on flags).
This is a prerequisite to fix: https://bugzilla.redhat.com/1335798
After 1eb6647979 nobody calls the iohelper with 6 arguments.
Everybody uses the other mode. Well, the only user of iohelper
after the previous commit is virFileWrapperFd really.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Currently we use iohelper for virFDStream implementation. This is
because UNIX I/O can lie sometimes: even though a FD for a
file/block device is set as unblocking, actual read()/write() can
block. To avoid this, a pipe is created and one end is kept for
read/write while the other is handed over to iohelper to
write/read the data for us. Thus it's iohelper which gets blocked
and not our event loop.
This approach has two problems:
1) we are spawning a new process.
2) any exchange of information between daemon and iohelper can be
done only through the pipe.
Therefore, iohelper is replaced with an implementation in thread
which is created just for the stream lifetime. The data are still
transferred through pipe (for now), but both problems described
above are solved.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
While this is no functional change, it makes the code look a bit
nicer. Moreover, it prepares ground for future work.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
There is really no reason why we should have to have 'struct'
everywhere.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
A long time ago we imported the keymaps.csv file from GTK-VNC so we
can do conversions between keycode sets. Meanwhile lots of bug fixes
have gone into this CSV file and libvirt hasn't kept in sync. The
keymaps.csv file and associated generator script has been pulled out
of GTK-VNC into a dedicated GIT repo for use as a submodule. This
allows GTK-VNC, SPICE-GTK, QEMU and libvirt to share the same master
database and tools and pushing updates merely requires a submodule
commit update as with gnulib.
The test suite is updated to cover some extra boundary conditions.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently, virNetDevSetCoalesce() stub is always returning error. As
it's used by virNetDevTapCreateInBridgePort(), it essentially breaks
bridged networking if coalesce is not supported.
To make it work, relax the stub to trigger error only when its
coalesce argument is not NULL, otherwise report success.
Commit f4ef3a71 made a variation of virNetDevSetMAC that would return
without logging an error message if errno was set to
EADDRNOTAVAIL. This errno is set by some SRIOV VF drivers (in
particular igbvf) when they fail to set the device's MAC address due
to the PF driver refusing the request. This is useful if we want to
try a different method of setting the VF MAC address before giving up
(Commit 86556e16 actually does this, setting the desired MAC address
to the "admin MAC in the PF, then detaching and reattaching the VF
netdev driver to force a reinit of the MAC address).
During testing of Bug 1442040 t was discovered that the ixgbe driver
returns EPERM in this situation, so this patch changes the exception
case for silent+non-terminal failure to account for this difference.
Completes resolution to: https://bugzilla.redhat.com/1415609 (RHEL 7.4)
https://bugzilla.redhat.com/1442040 (RHEL 7.3.z)
The current fallback stub for virNetDevSetCoalesce is inside an
earlier conditional block. This deals with the feature being
missing on older Linux platforms. We need a second fallback stub
though, outside the top level conditional, to ensure builds work
on Win32/FreeBSD platforms too.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This patch makes use of the virNetDevSetCoalesce() function to make
appropriate settings effective for devices that support them.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1414627
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
That function is able to configure coalesce settings for an interface,
similarly to 'ethtool -C'. This function also updates back the
structure so that it contains actual data on the device (if the device
doesn't support some settings kernel might just return 0 and not set
whatever is not supported), so this way we'll have up-to-date
information in the live domain XML.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reported by Rafał Wojciechowski <it@rafalwojciechowski.pl>.
Thread 1 (Thread 0x7f194b99d700 (LWP 5631)):
0 virNetDevGetifaddrsAddress (addr=0x7f194b99c7c0, ifname=0x7f193400e2b0 "ovirtmgmt") at util/virnetdevip.c:738
1 virNetDevIPAddrGet (ifname=0x7f193400e2b0 "ovirtmgmt", addr=addr@entry=0x7f194b99c7c0) at util/virnetdevip.c:795
2 0x00007f19467800d6 in networkGetNetworkAddress (netname=<optimized out>, netaddr=netaddr@entry=0x7f1924013f18) at network/bridge_driver.c:4780
3 0x00007f193e43a33c in qemuProcessGraphicsSetupNetworkAddress (listenAddr=0x7f19340f7650 "127.0.0.1", glisten=0x7f1924013f10) at qemu/qemu_process.c:4062
4 qemuProcessGraphicsSetupListen (vm=<optimized out>, graphics=0x7f1924014f10, cfg=0x7f1934119f00) at qemu/qemu_process.c:4133
5 qemuProcessSetupGraphics (flags=17, vm=0x7f19240155d0, driver=0x7f193411f1d0) at qemu/qemu_process.c:4196
6 qemuProcessPrepareDomain (conn=conn@entry=0x7f192c00ab50, driver=driver@entry=0x7f193411f1d0, vm=vm@entry=0x7f19240155d0, flags=flags@entry=17) at qemu/qemu_process.c:4969
7 0x00007f193e4417c0 in qemuProcessStart (conn=conn@entry=0x7f192c00ab50, driver=driver@entry=0x7f193411f1d0, vm=0x7f19240155d0,asyncJob=asyncJob@entry=QEMU_ASYNC_JOB_START, migrateFrom=migrateFrom@entry=0x0, migrateFd=migrateFd@entry=-1, migratePath=migratePath@entry=0x0,snapshot=snapshot@entry=0x0, vmop=vmop@entry=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=17, flags@entry=1) at qemu/qemu_process.c:5553
Man page for getifaddrs also states that the "ifa_addr" may contain
a null pointer which happens if there is an existing network interface
on the host without IP address.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
CLang's optimizer is more aggressive at inlining functions than
gcc and so will often inline functions that our tests want to
mock-override. This causes the test to fail in bizarre ways.
We don't want to disable inlining completely, but we must at
least prevent inlining of mocked functions. Fortunately there
is a 'noinline' attribute that lets us control this per function.
A syntax check rule is added that parses tests/*mock.c to extract
the list of functions that are mocked (restricted to names starting
with 'vir' prefix). It then checks that src/*.h header file to
ensure it has a 'ATTRIBUTE_NOINLINE' annotation. This should prevent
use from bit-rotting in future.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Without this added enum value, nodedev-dumpxml of a GEN4 (PCIe 4.0)
card will fail (due to the unrecognized link speed), and since
nodedev-detach and nodedev-reattach internally do a dumpxml+parse,
they will also fail. With this patch, all those operations succeed.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Fix xlconfigtest runs build for --enable-test-oom on
Xen XL-2-XML Parse channel-pty
Program received signal SIGSEGV, Segmentation fault.
#0 0x00007ffff3c2b373 in __strchr_sse2 () from /lib64/libc.so.6
==> #1 0x00007ffff7875701 in virConfSaveValue (buf=buf@entry=0x7fffffffd8a0, val=val@entry=0x674750) at util/virconf.c:290
#2 0x00007ffff7875668 in virConfSaveValue (buf=buf@entry=0x7fffffffd8a0, val=<optimized out>) at util/virconf.c:306
#3 0x00007ffff78757ef in virConfSaveEntry (buf=buf@entry=0x7fffffffd8a0, cur=cur@entry=0x674780) at util/virconf.c:338
#4 0x00007ffff78783eb in virConfWriteMem (memory=0x665570 "", len=len@entry=0x7fffffffd910, conf=conf@entry=0x65b940)
at util/virconf.c:1543
#5 0x000000000040eccb in testCompareParseXML (replaceVars=<optimized out>, xml=<optimized out>,
xlcfg=0x662c00 "/home/wtenhave/WORK/libvirt/OOMtesting/libvirt-devel/tests/xlconfigdata/test-channel-pty.cfg")
at xlconfigtest.c:108
#6 testCompareHelper (data=<optimized out>) at xlconfigtest.c:205
#7 0x0000000000410b3a in virTestRun (title=title@entry=0x432cc0 "Xen XL-2-XML Parse channel-pty",
body=body@entry=0x40e9b0 <testCompareHelper>, data=data@entry=0x7fffffffd9f0) at testutils.c:247
#8 0x000000000040f322 in mymain () at xlconfigtest.c:278
#9 0x0000000000411410 in virTestMain (argc=1, argv=0x7fffffffdba8, func=0x40f660 <mymain>) at testutils.c:992
#10 0x00007ffff3bc0401 in __libc_start_main () from /lib64/libc.so.6
#11 0x000000000040e86a in _start ()
(gdb) frame 1
#1 0x00007ffff7875701 in virConfSaveValue (buf=buf@entry=0x7fffffffd8a0, val=val@entry=0x674750) at util/virconf.c:290
290 if (strchr(val->str, '\n') != NULL) {
(gdb) print *val
$1 = {type = VIR_CONF_STRING, next = 0x0, l = 0, str = 0x0, list = 0x0}
Signed-off-by: Wim ten Have <wim.ten.have@oracle.com>
Fix the build with clang:
util/virperf.c:86:27: error: use of GNU 'missing =' extension
in designator [-Werror,-Wgnu-designator]
[VIR_PERF_EVENT_MBML] {
^
=
The virPerfGetEvent method pointlessly checks for a NULL
parameter and the range of an enum value. The whole point
of using an enum is that we can avoid such checks. Just
replace calls to virPerfGetEvent, with perf->events[type]
array access.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virPerfGetEventAttr method contains a totally pointless
loop. Remove it, verify the array size statically, and then
just use an array index to access the perf event.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This function runs an iscsi command and parses its output.
However, due to the nature of things, virISCSIExtractSession()
callback can be called multiple times. In each run it would
allocate new memory and overwrite the variable where we keep
pointer to it and thus leaking old allocations.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Imagine that this function is called twice over the same disk
source. While in the first run all allocated memory is freed, not
all pointers are set to NULL (e.g. def->srcpool). So when called
again, these poitners are freed again resulting in double free.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Similar to commit b202c39 ignore the warning that breaks the build
with clang:
util/virnetlink.c:365:52: error: cast from 'char *' to 'struct nlmsghdr *'
increases required alignment from 1 to 4 [-Werror,-Wcast-align]
for (msg = resp; NLMSG_OK(msg, len); msg = NLMSG_NEXT(msg, len)) {
^~~~~~~~~~~~~~~~~~~~
/usr/include/linux/netlink.h:87:7: note: expanded from macro 'NLMSG_NEXT'
(struct nlmsghdr*)(((char*)(nlh)) + NLMSG_ALIGN((nlh)->nlmsg_len)))
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
https://bugzilla.redhat.com/show_bug.cgi?id=1371892
The 'capacity' value (e.g. guest logical size) for a LUKS volume is
smaller than the 'physical' value of the file in the file system, so
we need to account for that.
When peeking at the encryption information about the volume add a fetch
of the payload_offset which is described as the offset to the start of
the volume data (in 512 byte sectors) in QEMU's QCryptoBlockLUKSHeader.
Then adjust the ->capacity appropriately when we determine that the
volume target encryption has a payload_offset value.
Add check for more than one RTA_OIF, even though this is rather
unlikely.
Get rid of the buggy switch / break as this code won't need to
handle more attributes.
Use VIR_WARNINGS_NO_CAST_ALIGN to fix impossible to fix
util/virnetdevip.c:560:17: error: cast increases required alignment of target type [-Werror=cast-align]
There was an unhandled 'open' call which resulted in:
"error: Library function returned error but did not set virError"
Even if this happens during the daemon's start when we still don't have
any set of outputs defined yet, we can safely report an error, since we
automatically fallback to stderr which is fine even for both
running as a daemonized process, since this happens before the daemon
forks into the background, and running as a systemd service, since
systemd re-directs std outputs to journald by default.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1436060
Signed-off-by: Erik Skultety <eskultet@redhat.com>
The value we use internally to represent the lack of a memory
locking limit, VIR_DOMAIN_MEMORY_PARAM_UNLIMITED, doesn't
match the value setrlimit() and prlimit() use for the same
purpose, RLIM_INFINITY, so we have to handle the translation
ourselves.
Partially-resolves: https://bugzilla.redhat.com/1431793
If if_indextoname is not defined, the whole function using it should
not be defined either. Add stub to fix build on mingw.
Caused by 5dd607059d
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
- Make virMediatedDeviceNew() stub args match its prototype
- Fix typo: virRerportError -> virReportError
- Move MDEV_SYSFS_DEVICES definition out of the #ifdef __linux__ block
so we don't have to stub virMediatedDeviceGetSysfsPath()
Previously, this function must've been called only on Linux in order
to fail gracefully. That lead to #ifdef mess in callers, so the
function was redesigned so it failed gracefully on non-existing
files. However that commit forgot to define the function outside the
__linux__ ifdef, it broke non-Linux builds.
Caused by c67e04e25f.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Having this information available will make it easier to determine the
culprit when MAC or vlan tag appear to not be set, eg.:
https://bugzilla.redhat.com/1364073
(This patch doesn't fix that bug, just makes it easier to diagnose)
If an SRIOV VF has previously been used for VFIO device assignment,
the "admin MAC" that is stored in the PF driver's table of VF info
will have been set to the MAC address that the virtual machine wanted
the device to have. Setting the admin MAC for a VF also sets a flag in
the PF that is loosely called the "administratively set" flag. Once
that flag is set, it is no longer possible for the net driver of the
VF (either on the host or in a virtual machine) to directly set the
VF's MAC again; this flag isn't reset until the *PF* driver is
restarted, and that requires taking *all* VFs offline, so it's not
really feasible to do.
If the same SRIOV VF is later used for macvtap passthrough mode, the
VF's MAC address must be set, but normally we don't unbind the VF from
its host net driver (since we actually need the host net driver in
this case). Since setting the VF MAC directly will fail, in the past
"we" ("I") had tried to fix the problem by simply setting the admin MAC
(via the PF) instead. This *appeared* to work (and might have at one
time, due to promiscuous mode being turned on somewhere or something),
but it currently creates a non-working interface because only the
value for admin MAC is set to the desired value, *not* the actual MAC
that the VF is using.
Earlier patches in this series reverted that behavior, so that we once
again set the MAC of the VF itself for macvtap passthrough operation,
not the admin MAC. But that brings back the original bug - if the
interface has been used for VFIO device assignment, you can no longer
use it for macvtap passthrough.
This patch solves that problem by noticing when virNetDevSetMAC()
fails for a VF, and in that case it sets the desired MAC to the admin
MAC via the PF, then "bounces" the VF driver (by unbinding and the
immediately rebinding it to the VF). This causes the VF's MAC to be
reinitialized from the admin MAC, and everybody is happy (until the
*next* time someone wants to set the VF's MAC address, since the
"administratively set" bit is still turned on).
Some PF drivers allow setting the admin MAC (that is the MAC address
that the VF will be initialized to the next time the VF's driver is
loaded) to 00:00:00:00:00:00, and some don't. Multiple drivers
initialize the admin MACs to all 0, but don't allow setting it to that
very same value. It has been an uphill battle convincing the driver
people that it's reasonable to expect The argument that's used is
that an all 0 device MAC address on a device is invalid; however, from
an outsider's point of view, when the admin MAC is set to 0 at the
time the VF driver is loaded, the VF's MAC is *not* set to 0, but to a
random non-0 value. But that's beside the point - even if I could
convince one or two SRIOV driver maintainers to permit setting the
admin MAC to 0, there are still several other drivers.
So rather than fighting that losing battle, this patch checks for a
failure to set the admin MAC due to an all 0 value, and retries it
with 02:00:00:00:00:00. That won't result in a random value being set
in the VF MAC at next VF driver init, but that's okay, because we
always want to set a specific value anyway. Rather, the "almost 0"
setting makes it easy to visually detect from the output of "ip link
show" which VFs are currently in use and which are free.
The global functions virNetDevReplaceMacAddress(),
virNetDevReplaceNetConfig(), virNetDevRestoreMacAddress(), and
virNetDevRestoreNetConfig() are no longer used, as their functionality
has been replaced by virNetDev(Save|Read|Set)NetConfig().
The static functions virNetDevReplaceVfConfig() and
virNetDevRestoreVfConfig() were only used by the above-named global
functions that were removed.
It takes longer to explain this than to fix it...
In the past we weren't able to save the VF's own MAC address *at all*
when using it for hostdev assignment, because we had already unbound
the VF from the host net driver prior to saving its config. With the
previous patch, that problem has been solved, so we now have the VF's
MAC address saved and can move on to the *next* problem, which is twofold:
1) during teardown we restore the config before we've re-bound, so the
VF doesn't have a net driver, and thus we can't set its MAC address
directly.
2) even if we delay restoring the config until the VF is bound to a
net driver, the request to set its MAC address would fail, since
(during device setup) we had set the "admin MAC" for the VF via an
RTM_SETLINK to the PF - once you've set the admin MAC for a VF, the
VF driver (either on host or on guest) is not allowed to change the
VF's MAC address "forever" (well, until you reload the PF driver,
but that requires destroying and recreating every single VF, which
isn't something you can require).
The solution is to keep the restoration of config at the same place,
but to set the *admin MAC* to the address you want the VF to have -
when the VF net driver is later initialized (as a part of re-binding
to the VF net driver) its MAC will be initialized to the current value
of the admin MAC.
In order to properly restore the original state of an SRIOV VF when
we're finished with it, we need to save the MAC address of the VF
itself (not just the admin MAC address for the VF that is stored in
the PF). But that can only be done when the VF is still bound to the
host's netdev driver, and we have always done the saving of device
config after the VF is already bound to vfio-pci. This patch prepares
us for adding a save of the VF's MAC by calling the function that
saves netconfig earlier in the device preparation, before we've
unbound it from the host netdev driver.
These two operations will need to be separated so that saving of the
original config is done before detaching the host net driver, and
setting the new config is done after attaching vfio-pci. This patch
splits the single function into two, but for now calls them together
(to make bisecting easier if there is a regression).
virHostdevNetConfigReplace() and virHostdevNetConfigRestore() are
modified to use the new virNetDev*NetConfig() functions.
Note that due to the VF's original MAC addresses being saved after it
has already been un-bound from the host net driver, the actual current
VF MAC address won't be saved (because it no longer exists) - only the
"admin MAC" will be saved. This reflects existing behavior that will
be fixed in an upcoming patch.
This patch modifies the macvtap passthrough setup to use
virNetDevSaveNetConfig()+virNetDevSetConfig() instead of
virNetDevReplaceNetConfig() or virNetDevReplaceMacAddress(), and the
teardown to use virNetDevReadNetConfig()+virNetDevSetConfig() instead
of virNetDevRestoreNetConfig() or virNetDevRestoreMacAddress().
Since the older functions only saved/restored the admin MAC and vlan
tag (which is incorrect) and the new functions save/restore the VF's
own MAC address and vlan tag (correct), this actually fixes a bug
(which was introduced by commit cb3fe38c7, which was itself supposed
to be a fix for https://bugzilla.redhat.com/1113474 ).
The downside to this patch is that it causes an *apparent* regression
in that bug (because there will once again be an error reported if the
interface had previously been used for VFIO device assignment), but in
reality, the code hasn't been working for *any* case before this
current patch (at least not with any recent kernel). Anyway, that
"regression" will be fixed with an upcoming patch that fixes it the
*right* way.
These three functions are destined to replace
virNetDev(Replace|Restore)NetConfig() and
virNetDev(Replace|Restore)MacAddress(), which both do the save and set
together as a single step. We need to separate the save, read, and set
steps because there will be situations where we need to do something
else in between (in particular, we will need to rebind a VF's driver
after save but before set).
This patch creates the new functions, but doesn't call them - that
will come in a subsequent patch. Note that the new functions to
read/write the file that stores the original network config now uses
JSON rather than plaintext (it still recognizes the old format as well
though, so it won't get confused during an upgrade).
Keep track of the assigned mediated devices the same way we do it for
the rest of hostdevs. Methods like 'Prepare', 'Update', and 'ReAttach'
are introduced by this patch.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Beside creation, disposal, getter, and setter methods the module exports
methods to work with lists of mediated devices.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
We keep forgetting that older setups don't like 'index':
CC util/libvirt_util_la-virsysinfo.lo
cc1: warnings being treated as errors
util/virstoragefile.c: In function 'virStorageSourceFindByNodeName':
util/virstoragefile.c:3804: error: declaration of 'index' shadows a global declaration [-Wshadow]
/usr/include/string.h:489: error: shadowed declaration is here [-Wshadow]
Signed-off-by: Eric Blake <eblake@redhat.com>
That file has only two exported files and each one of them has
different naming. virNode is what all the other files use, so let's
use it. It wasn't used before because the clash with public API
naming, so let's fix that by shortening the name (there is no other
private variant of it anyway).
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
There is no reason for it not to be in the utils, all global symbols
under that file already have prefix vir* and there is no reason for it
to be part of DRIVER_SOURCES because that is just a leftover from
older days (pre-driver modules era, I believe).
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
While on that, drop support for kernels from RHEL-5 era (missing
cpu/present file). Also add some useful functions and export them.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
By using this we are able to easily switch the sysfs path being
used (fake it). This will not only help tests in the future but can
be also used from files where the code is duplicated currently.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
These helpers are doing just a read and covert the value, but they
properly size the read limit, handle additional whitespace characters,
and unify error reporting.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
STREQ_NULLABLE returns true if both parameters are NULL. And that's
not what we want here. We just want to skop comparing source nodes
that don't have that info set. The function wouldn't make much sense
with nodeName == NULL, so we don't need to check that. Moreover, the
function's declaration uses ATTRIBUDE_NONNULL for nodeName, which not
only means that function expects the parameter not to be NULL, but
actually tells the compiler that it can optimize out the NULL checks.
That way it could end up calling strcmp on NULL (either nodeformat or
nodebacking). GCC figures this out if libvirt is compiled with
lv_cv_static_analysis=yes, unfortunately not everyone uses that.
Caused by cbc6d53513.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The function has very specific semantics. Split out the part that parses
the backing store specification string into a separate helper so that it
can be reused later while keeping the wrapper with existing semantics.
Note that virStorageFileParseChainIndex is pretty well covered by the
test suite.
Fix typo in virNetDevPFGetVF() stub:
ATTRUBUTE_UNUSED -> ATTRIBUTE_UNUSED.
While here, use common indent style for arguments in
virNetDevGetVirtualFunctionIndex() stub.
Given an SRIOV PF netdev name (e.g. "enp2s0f0") and VF#, this new
function returns the netdev name of the referenced VF device
(e.g. "enp2s11f6"), or NULL if the device isn't bound to a net driver.
We will want to allow silent failure of virNetDevSetMAC() in the case
that the SIOSIFHWADDR ioctl fails with errno == EADDRNOTAVAIL. (Yes,
that is very specific, but we really *do* want a logged failure in all
other circumstances, and don't want to duplicate code in the caller
for the other possibilities).
This patch renames the 3 different virNetDevSetMAC() functions to
virNetDevSetMACInternal(), adding a 3rd arg called "quiet" and making
them static (because this extra control will only be needed within
virnetdev.c). A new global virNetDevSetMAC() is defined that calls
whichever of the three *Internal() functions gets compiled with quiet
= false. Callers in virnetdev.c that want to notice a failure with
errno == EADDRNOTAVAIL and retry with a different strategy rather than
immediately failing, can call virNetDevSetMACInternal(..., true).
This function unbinds a device from its driver, then immediately
rebinds it to its driver again. The code for this new function is just
the 2nd half of virPCIDeviceBindWithDriverOverride(), so that
function's 2nd half is replaced with a call to virPCIDeviceRebind().
...and cleanup the callers to report it when it *is* an error.
In many cases It's useful for virPCIGetNetName() to not log an error
and simply return a NULL pointer when the given device isn't bound to
a net driver (e.g. we're looking at a VF that is permanently bound to
vfio-pci). The existing code would silently return an error in this
case, which could eventually lead to the dreaded "An error occurred
but the cause is unknown" log message.
This patch changes virPCIGetNetName() to still return success if the
device simply isn't bound to a net driver, and adjusts all the callers
that require a non-null netname to check for that condition and log an
error when it happens.
vf in virNetDevMacVLanDeleteWithVPortProfile() is initialized to -1
and never set. It's not set for a good reason - because it doesn't
make sense during macvtap device setup to refer to a VF device as
"PF:VF#". This patch replaces the two uses of "vf" with "-1", and
removes the local variable, so that it's more clear we are always
calling the utility functions with vf set to -1.
This function is only called in two places, and the ifindex,
nltarget_kernel, and getPidFunc args are never used (and never will
be).
ifindex - we always know the name of the device, and never know the
ifindex - if we really did need the ifindex we would have to get it
from the name using virNetDevGetIndex(). In practice, we just send -1
to virNetDevSetVfConfig(), which doesn't bother to learn the real
ifindex (you only need a name *or* an ifindex for the netlink command
to succeed, not both).
nltarget_kernel - messages to set the config of an SRIOV VF will
always go to netlink in the kernel, not to another user process, so
this arg is always true (there are other uses of netlink messages
where the message might need to go to another user process, but never
in the case of RTM_SETLINK for SRIOV).
getPidFunc - this arg is only used if nltarget_kernel is false, and it
never is.
None of this has any functional effect, it just makes it easier to
follow what's happening when virNetDevSetVfConfig() is called.
virNetDevParseVfConfig() assumed that both the MAC address and VLAN
tag pointers were valid, so even if you only wanted one or the other,
you would need a variable to hold the returned value for both. This
patch checks each for a NULL pointer before filling it in.
The source code will check for NULL arguments for 'macvtap_macaddr' and
'vmuuid', so no need for the NONNULL in the prototypes. Following the stack
for both arguments to virNetDevVPortProfileOpSetLink also shows called
functions would handle a NULL value.
Additionally, modified the prototype to use the same 'macvtap_macaddr'
name as the source code for consistency.
Since the code checks 'mgr == NULL' anyway, no need for the prototype
to have the NONNULL arg check. Also add an error message to indicate what
the failure is so that there isn't a failed for some reason error.
Signed-off-by: John Ferlan <jferlan@redhat.com>
The comparison code used STREQ_NULLABLE anyway for both 'drv_name' and
'dom_name', so no need. Add a NULLSTR on the 'dom_name' too.
Signed-off-by: John Ferlan <jferlan@redhat.com>
The comparison code used STREQ_NULLABLE anyway for both 'drv_name' and
'dom_name', so no need. Add a NULLSTR on the 'dom_name' too.
Signed-off-by: John Ferlan <jferlan@redhat.com>
The code checks and handles a NULL 'str', so just remove the NONNULL.
Update the error message to add the NULLSTR() around 'str' also.
Signed-off-by: John Ferlan <jferlan@redhat.com>
This patch splits out the part of virNetDevTapCreateInBridgePort()
that would need to be re-done if an existing tap device had to be
re-attached to a bridge, and puts it into a separate function. This
can be used both when an existing domain interface config is updated
to change its connection, and also to re-attach to the "same" bridge
when a network has been stopped and restarted. So far it is used for
nothing.
This function provides the bridge/bond device that the given network
device is attached to. The return value is 0 or -1, and the master
device is a char** argument to the function - this is needed in order
to allow for a "success" return from a device that has no master.
The only reason that the ethtool features weren't being retrieved in
an unprivileged libvirtd was because they required ioctl(), and the
ioctl was using an AF_PACKET socket, which requires root. Now that we
are using AF_UNIX for ioctl(), this restriction can be removed.
The exact family of the socket created for the fd used by ioctl(7)
doesn't matter, it just needs to be a socket and not a file. But for
some reason when macvtap support was added, it used
AF_PACKET/SOCK_DGRAM sockets for its ioctls; we later used the same
AF_PACKET/SOCK_DGRAM socket for new ioctls we added, and eventually
modified the other pre-existing ioctl sockets (for creating/deleting
bridges) to also use AF_PACKET/SOCK_DGRAM (that code originally used
AF_UNIX/SOCK_STREAM).
The problem with using AF_PACKET (intended for sending/receiving "raw"
packets, i.e. packets that can be some protocol other than TCP or UDP)
is that it requires root privileges. This meant that none of the
ioctls in virnetdev.c or virnetdevip.c would work when running
libvirtd unprivileged.
This packet solves that problem by changing the family to AF_UNIX when
creating the socket used for any ioctl().
When enabling IPv6 on all interfaces, we may get the host Router
Advertisement routes discarded. To avoid this, the user needs to set
accept_ra to 2 for the interfaces with such routes.
See https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt
on this topic.
To avoid user mistakenly losing routes on their hosts, check
accept_ra values before enabling IPv6 forwarding. If a RA route is
detected, but neither the corresponding device nor global accept_ra
is set to 2, the network will fail to start.
virNetlinkCommand() processes only one response message, while some
netlink commands, like route dumping, need to process several.
Add virNetlinkDumpCommand() as a virNetlinkCommand() sister.
Allow to reuse as much as possible from virNetlinkCommand(). This
comment prepares for the introduction of virNetlinkDumpCommand()
only differing by how it handles the responses.
While connecting to qemu monitor, the first thing we do is wait
for it to show up. However, we are doing it with some timeout to
avoid indefinite waits (e.g. when qemu doesn't create the monitor
socket at all). After beaa447a29 we are using exponential back
off timeout meaning, after the first connection attempt we wait
1ms, then 2ms, then 4 and so on. This allows us to bring down
wait time for small domains where qemu initializes quickly.
However, on the other end of this scale are some domains with
huge amounts of guest memory. Now imagine that we've gotten up to
wait time of 15 seconds. The next one is going to be 30 seconds,
and the one after that whole minute. Well, okay - with current
code we are not going to wait longer than 30 seconds in total,
but this is going to change in the next commit.
The exponential back off is usable only for first few iterations.
Then it needs to be caped (one second was chosen as the limit)
and switch to constant wait time.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The function is actually in virutil.c, but prototyped in virfile.h.
This patch fixes that by renaming the function to virWaitForDevices,
adding the prototype in virutil.h and libvirt_private.syms, and then
changing the callers to use the new name.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Found by Coverity. Because there's an "if ((cur = strstr(base, "revision"))
!= NULL) {" followed by a "base = cur" coverity notes that 'base' could
then be NULL causing the return to the top of the "while ((tmp_base =
strstr(base, "processor")) != NULL) {" to have strstr deref a NULL 'base'
pointer because the setting of base at the bottom of the loop is unconditional.
Alter the code to set "base = cur" after processing each key. That will
"ensure" that base doesn't get set to NULL if both "cpu" and "revision"
do no follow a "processor".
While a /proc/cpuinfo file that has a "processor" key but with neither
a "cpu" nor a "revision" doesn't seem feasible, the code is written as if
it could happen, so we have to account for it.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Calls to virFileReadAll after a VIR_ALLOC that return NULL all show
a memory leak since 'ret' isn't virSysinfoDefFree'd and normal path
"return ret" doesn't free outbuf.
Reported by Coverity
Signed-off-by: John Ferlan <jferlan@redhat.com>
Whole implementations along with helper totalling screens of code were
conditionally compiled. That made the code totally unreadable and
untestable. Rename functions to have the architecture in the name so
that all can be compiled at the same time and introduce header to allow
testing them all.
Proposed formal coding conventions encourage defining typedefs for
vir[Blah] and vir[Blah]Ptr separately from the associated struct named
_vir[Blah]:
typedef struct _virBlah virBlah;
typedef virBlah *virBlahPtr;
struct _virBlah {
...
};
At some point in the past, I had submitted several patches using a
more compact style that I prefer, and they were accepted:
typedef struct _virBlah {
...
} virBlah, *virBlahPtr;
Since these are by far a minority among all struct definitions, this
patch changes all those definitions to reflect the style prefered by
the proposal so that there is 100% consistency.
After the system has been booted, it should not change.
Cache the return value of virSystemdHasMachined.
Allow starting and terminating machines with just one
DBus call, instead of three, reducing the chance of
the call timing out.
Also introduce a small function for resetting the cache
to be used in tests.
Both virSystemdTerminateMachine and virSystemdCreateMachine
propagate the error to tell between a non-systemd system
and a hard error.
In virSystemdGetMachineNameByPID both are treated the same,
but an error is ignored by the callers.
Split out the checks into a separate function.
Arguably though, function returning only on success is a very
interesting, although quite impractical concept. Also, the errno isn't
and shouldn't be preserved in this case, since the errno can be directly
fed to the virReportSystemError.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
This will eventually replace virQEMUBuildBufferEscapeComma, however
it's not possible right now. Some parts of the code that uses the
old function needs to be refactored.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
GCC 7 gets upset by
if (!tmp && (size * count))
warning
util/viralloc.c: In function 'virReallocN':
util/viralloc.c:246:23: error: '*' in boolean context, suggest '&&' instead [-Werror=int-in-bool-context]
if (!tmp && (size * count)) {
~~~~~~^~~~~~~~
Keep it happy by adding != 0 to the right hand expression
so it realizes we really are wanting to treat the result
of the arithmetic expression as a boolean
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The 'raw' block driver in Qemu is not directly interesting from
libvirt's perspective, but it can be layered above some other block
drivers and this may be interesting for the user.
The patch adds support for the 'raw' block driver. The driver is treated
simply as a pass-through and child driver in JSON is queried to get the
necessary information.
Signed-off-by: Tomáš Golembiovský <tgolembi@redhat.com>
Split virStorageSourceParseBackingJSON into two functions so that the
core can be reused by other functions. The new function called
virStorageSourceParseBackingJSONInternal accepts virJSONValuePtr.
Signed-off-by: Tomáš Golembiovský <tgolembi@redhat.com>
The utils code should stay separated from other code (except for very
well justified cases). Unfortunately commit 272769becc
made it trivial to break the separation (and not get slapped by the
syntax-check rule) by adding -I src/conf to the CFLAGS for utils.
Remove this shortcut and except the two offenders from the syntax check
so that the codebase can be kept separated.
Create a virscsihost.c and place the functions there. That removes the
last #ifdef __linux__ from virutil.c.
Take the opporunity to also change the function names and in one case
the parameters slightly
Rather than have them mixed in with the virutil apis, create a separate
virvhba.c module and move the vHBA related calls into there. Soon there
will be more added.
Also modify the names of the functions and some arguments to be more
indicative of what is really happening. Adjust the callers respectively.
While I was changing fchosttest, rather than the non-descriptive names
test1...test6, rename them to match what the test is doing.
Before 9c17d665fd (v1.3.2 - I know, right?) it was possible to
have the following interface configuration:
<interface type='ethernet'/>
<script path=''/>
</interface>
This resulted in -netdev tap,script=,.. Fortunately, qemu helped
us to get away with this as it just ignored the empty script
path. However, after the commit mentioned above it's libvirtd
who is executing the script. Unfortunately without special
case-ing empty script path.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Currently disk names do not follow the
(regex) /^[fhv]d[a-z]+[0-9]*$/ completely
and hence one can assign disk names like
vd2 etc. This patch ensures that the
disk names follow the regex mentioned.
This patch also adds a testcase.
Signed-off-by: Nitesh Konkar <nitkon12@linux.vnet.ibm.com>
To make sure bit 'b' fits into the bitmap, we need to allocate b+1
bits, since we number from 0.
Adjust the bitmap test to set a bit at a multiple of 16.
That way the test fails without this fix, because the VIR_REALLOC
call clears the newly added memory even if the original pointer
has not changed.
So rather than comparing 2 paths (strings) as they are, which can very
easily lead to unnecessary errors (e.g. in storage driver) that the paths
are not the same when in fact they'd be e.g. just symlinks to the same
location, we should put our best effort into resolving any symlinks and
canonicalizing the path and only then compare the 2 paths for equality.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Although currently this is documented in virsh man page
and virsh help, the expicit mention in the error message
is helful for tools using the API directly.
Signed-off-by: Nitesh Konkar <nitkon12@linux.vnet.ibm.com>
After freeing the data structures we have to reset the counters to
zero. This fixes a segmentation fault when virNetDevIPInfoClear is
called twice (e.g. this is possible in virDomainNetDefParseXML() if
virDomainNetIPInfoParseXML(...) fails with ret < 0 (this leads to the
first call of 'virNetDevIPInfoClear(&def->guestIP)') and the resulting
call of virDomainNetDefFree(def) in the error path of
virDomainNetDefParseXML() (this leads to the second call of
virNetDevIPInfoClear(&def->guestIP), and finally to the segmentation
fault).
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
This patchs allows to set the timeout value used for all
openvswitch calls. The default timeout value remains as
before at 5 seconds.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Provide the ability to specify a default timeout value for
successful completion of openvswitch calls in the libvirtd
configuration file.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
virNetDevTapCreateInBridgePort() has always set the new tap device to
the current MTU of the bridge it's being attached to. There is one
case where we will want to set the new tap device to a different
(usually larger) MTU - if that's done with the very first device added
to the bridge, the bridge's MTU will be set to the device's MTU. This
patch allows for that possibility by adding "int mtu" to the arg list
for virNetDevTapCreateInBridgePort(), but all callers are sending -1,
so it doesn't yet have any effect.
Since the requested MTU isn't necessarily what is used in the end (for
example, if there is no MTU requested, the tap device will be set to
the current MTU of the bridge), and the hypervisor may want to know
the actual MTU used, we also return the actual MTU to the caller (if
actualMTU is non-NULL).
We will need to traverse the symlinks one step at the time.
Therefore we need to see where a symlink is pointing to.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The comment to the function states that the errors from the child
process are reported. Well, the error buffer is filled with
possible error messages. But then it is thrown away. Among with
important error message from the child process.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Originally/discovered proposed by "Wang King <king.wang@huawei.com>"
When the virCloseCallbacksSet is first called, it increments the refcnt
on the domain object to ensure it doesn't get deleted before the callback
is called. The refcnt would be decremented in virCloseCallbacksUnset once
the entry is removed from the closeCallbacks has table.
When (mostly) normal shutdown occurs, the qemuProcessStop will end up
calling qemuProcessAutoDestroyRemove and will remove the callback from
the list and hash table normally and decrement the refcnt.
However, when qemuConnectClose calls virCloseCallbacksRun, it will scan
the (locked) closeCallbacks list for matching domain and callback function.
If an entry is found, it will be removed from the closeCallbacks list and
placed into a lookaside list to be processed when the closeCallbacks lock
is dropped. The callback function (e.g. qemuProcessAutoDestroy) is called
and will run qemuProcessStop. That code will fail to find the callback
in the list when qemuProcessAutoDestroyRemove is called and thus not decrement
the domain refcnt. Instead since the entry isn't found the code will just
return (mostly) harmlessly.
This patch will resolve the issue by taking another ref during the
search UUID process during virCloseCallackRun, decrementing the refcnt
taken by virCloseCallbacksSet, calling the callback routine and returning
overwriting the vm (since it could return NULL). Finally, it will call the
virDomainObjEndAPI to lower the refcnt and remove the lock taken during
the search UUID processing. This may cause the vm to be destroyed.
Currently, on every --enable perf_event command,
a new event->fd is created and counting of perf
event counter starts from zero and previous
event->fd is lost. This patch prevents this
behaviour.
Signed-off-by: Nitesh Konkar <nitkon12@linux.vnet.ibm.com>
It is destructive to attempt reset on a pci- or cardbus-bridge, the
host can crash. The bridges won't contain any guest data and neither
they can be passed through using vfio/stub. So, no point in allowing a
reset on them.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Non-endpoint devices like pci-bridges cannot be assigned to guests.
Prevent such attempts.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Avoid return with the closeCallbacks locked when get callbacks list for connect fail.
Signed-off-by: Wang King <king.wang@huawei.com>
Signed-off-by: John Ferlan <jferlan@redhat.com>
Based on work of Mehdi Abaakouk <sileht@sileht.net>.
When parsing vhost-user interface XML and no ifname is found we
can try to fill it in in post parse callback. The way this works
is we try to make up interface name from given socket path and
then ask openvswitch whether it knows the interface.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
File open errors are prevented by a file exists check before
virFileReadAll is called since all callers of the virReadFCHost
method handle errors themselves based on the NULL return anyway.
Also included is a minor spelling correction in a comment.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
This is a simple wrapper over mount(). However, not every system
out there is capable of moving a mount point. Therefore, instead
of having to deal with this fact in all the places of our code we
can have a simple wrapper and deal with this fact at just one
place.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Other drivers (like qemu) would like to know if the namespaces
are available therefore it makes sense to move this function to
a shared module.
At the same time, this function had some default namespaces that
are checked with every call. It is not necessary - let callers
pass just those namespaces they are interested in.
With the move the function is renamed to
virProcessNamespaceAvailable.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The code at the very bottom of the DAC secdriver that calls
chown() should be fine with read-only data. If something needs to
be prepared it should have been done beforehand.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This patch adds support and documentation for
a generalized hardware cache event called cache_l1d
perf event.
Signed-off-by: Nitesh Konkar <nitkon12@linux.vnet.ibm.com>
Currently when spawning containers with systemd, the container PID 1
will get moved into the systemd machine slice. Libvirt then manually
moves the libvirt_lxc and qemu-nbd processes into the cgroups associated
with the slice, but skips the systemd controller cgroup. This means that
from systemd's POV, libvirt_lxc and qemu-nbd are still part of the
libvirtd.service unit.
On systemctl daemon-reload, it will notice that libvirt_lxc & qemu-nbd
are in the libvirtd.service unit for the systemd controller, but in the
machine cgroups for resources. Systemd will thus move them back into
the libvirtd.service resource cgroups next time libvirtd is restarted.
This causes libvirtd to kill off the container due to incorrect cgroup
placement.
The solution is to ensure that when moving libvirt_lxc & qemu-nbd, we
also move the systemd cgroup controller placement. Normally this is
not something we ever want todo, but this is a special case as we are
intentionally wanting to move them to a different systemd unit.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently, there's only linux implementation for
virGetFCHostNameByFabricWWN(). Since the symbol is exported in
our private symbols we ought to have implementation for other
platforms too. This also triggers compilation error on FreeBSD:
../src/.libs/libvirt_driver_storage_impl.a(libvirt_driver_storage_impl_la-storage_backend_scsi.o): In function `createVport':
/usr/home/jenkins/libvirt-master/systems/libvirt-freebsd/build/src/../../src/storage/storage_backend_scsi.c:740: undefined reference to `virGetFCHostNameByFabricWWN'
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The resulting function virFileGetMountSubtreeImpl() just uses
virStringSortRevCompare or virStringSortCompare which uses strcmp().
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Rather than extraneous VIR_FREE's depending on where we are in the code,
move them to the top of the loop and in the cleanup path.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Clang 3.9 refuses to compile the existing code with the
following error:
util/virfirewall.c:425:20: error: passing an object that undergoes
default argument promotion to 'va_start'
has undefined behavior [-Werror,-Wvarargs]
va_start(args, layer);
^
util/virfirewall.c:420:37: note: parameter of type 'virFirewallLayer'
is declared here
virFirewallLayer layer,
^
This happens because 'layer' is of type virFirewallLayer, which
is an enum type and not a standard type such as eg. void* or int.
To solve the issue, turn virFirewallAddRule() from a very thin
wrapper around virFirewallAddRuleFullV() to a macro that expands
to a call to virFirewallAddRuleFull() - itself a very thin wrapper
around the aforementioned virFirewallAddRuleFullV() - with no loss
of functionality or type safety.
Due to nature of operations we do over the string list (more
precisely due to how virStringListRemove() works), it is not the
best idea to use dataFree callback. Problem is, on MAC address
remove, the string list remove function modifies the original
list in place. Then, virHashUpdateEntry() is called which frees
all the data stored in the list rendering @newMacsList point to
freed data.
==16002== Invalid read of size 8
==16002== at 0x50BC083: virFree (viralloc.c:582)
==16002== by 0x513DC39: virStringListFree (virstring.c:251)
==16002== by 0x51089B4: virMacMapHashFree (virmacmap.c:67)
==16002== by 0x50EF30B: virHashAddOrUpdateEntry (virhash.c:352)
==16002== by 0x50EF4FD: virHashUpdateEntry (virhash.c:415)
==16002== by 0x5108BED: virMacMapRemoveLocked (virmacmap.c:129)
==16002== by 0x51092D5: virMacMapRemove (virmacmap.c:346)
==16002== by 0x402F02: testMACRemove (virmacmaptest.c:107)
==16002== by 0x403F15: virTestRun (testutils.c:180)
==16002== by 0x4032C4: mymain (virmacmaptest.c:205)
==16002== by 0x405A3B: virTestMain (testutils.c:992)
==16002== by 0x403D87: main (virmacmaptest.c:237)
==16002== Address 0xdd5a4d0 is 0 bytes inside a block of size 24 free'd
==16002== at 0x4C2AD6F: realloc (vg_replace_malloc.c:693)
==16002== by 0x50BB99B: virReallocN (viralloc.c:245)
==16002== by 0x513DC0B: virStringListRemove (virstring.c:235)
==16002== by 0x5108BA6: virMacMapRemoveLocked (virmacmap.c:124)
==16002== by 0x51092D5: virMacMapRemove (virmacmap.c:346)
==16002== by 0x402F02: testMACRemove (virmacmaptest.c:107)
==16002== by 0x403F15: virTestRun (testutils.c:180)
==16002== by 0x4032C4: mymain (virmacmaptest.c:205)
==16002== by 0x405A3B: virTestMain (testutils.c:992)
==16002== by 0x403D87: main (virmacmaptest.c:237)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In virMacMapRemoveLocked() we have two variables: @macsList and
@newMacsList. Obviously, @newMacsList is supposed to hold pointer
to modified list but in fact it holds pointer to the old list.
It's confusing.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This patch reduces the complexity of the filtering algorithm in
virCgroupDetect by first correcting the controller mask and then
checking for potential co-mounts without any correlating
controller mask modifications.
If you agree that this patch removes complexity and improves
readability it could simply be squashed into the first patch
of this series.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
The cgroup controller filtering in virCgroupDetect does not work
properly if the following conditions are met:
1) the host system does not have a cgroup controller which
libvirt requests (unavailable controller) and
2) libvirt is configured to disable a controller (disabled controller) and
3) the disabled controller is located before the unavailable controller
in virCgroupController.
As an example: The memory controller is unavailable and the cpuset
controller is configured to be disabled.
In this scenario trying to start a domain results in the error
error: Controller 'cpuset' is not wanted, but 'memory' is co-mounted: Invalid argument
This error occurs when virCgroupDetect is called with a valid parent group.
The resulting group created by virCgroupCopyMounts holds for cpuset and
memory controller empty mount points. The filtering of disabled controllers
checks for co-mounts by comparing the mount points. The cpuset controller
causes the filtering to occur before the memory controller is marked as to be
ignored by modifying the controller mask since it is unavailable.
Therefore the co-mount detection logic compares the cpuset and memory controller
mount points and since both are empty the memory controller is regarded
erroneously as being co-mounted.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The API creates PTR domain which corresponds to a given addr/prefix.
Both IPv4 and IPv6 addresses are supported, but the prefix must be
divisible by 8 for IPv4 and divisible by 4 for IPv6.
The generated PTR domain has the following format
IPv4: 1.2.3.4.in-addr.arpa
IPv6: 0.1.2.3.4.5.6.7.8.9.a.b.c.d.e.f.0.1.2.3.4.5.6.7.8.9.a.b.c.d.e.f.ip6.arpa
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The PERF_COUNT_HW_REF_CPU_CYCLES constant is not available
on all Linux distros libvirt targets, so its use must be
made conditional. Other constant have existed long enough
that we can assume they exist, as we don't support very
old distros like RHEL-5 any more.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Along with an empty string, it should also be possible for users to pass
NULL to the public APIs which in turn would trigger a routine(future
work) responsible for defining an appropriate default logging output
given the current circumstances.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
These helpers will manage the log destination defaults (fetch/set). The reason
for this is to stay consistent with the current daemon's behaviour with respect
to /etc/libvirt/<daemon>.conf file, since both assignment of an empty string
or not setting the log output variable at all trigger the daemon's decision on
the default log destination which depends on whether the daemon runs daemonized
or not.
This patch also changes the logic of the selection of the default
logging output compared to how it is done now. The main difference though is
that we should only really care if we're running daemonized or not, disregarding
the fact of (not) having a TTY completely (introduced by commit eba36a3878) as
that should be of the libvirtd's parent concern (what FD it will pass to it).
Before:
if (godaemon || !hasTTY):
if (journald):
use journald
if (godaemon):
if (privileged):
use SYSCONFIG/libvirtd.log
else:
use XDG_CONFIG_HOME/libvirtd.log
else:
use stderr
After:
if (godaemon):
if (journald):
use journald
else:
if (privileged):
use SYSCONFIG/libvirtd.log
else:
use XDG_CONFIG_HOME/libvirtd.log
else:
use stderr
Signed-off-by: Erik Skultety <eskultet@redhat.com>
We will need this function in near future so that we know what
/dev device corresponds to the SCSI device.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
We will need this function in near future so that we know what
/dev device corresponds to the SCSI device.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
We will need this function in near future so that we know what
/dev device corresponds to the USB device.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Namely, virFileGetACLs, virFileSetACLs, virFileFreeACLs and
virFileCopyACLs. These functions are going to be required when we
are creating /dev for qemu. We have copy anything that's in
host's /dev exactly as is. Including ACLs.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Almost none of our virJSONValue*Get* functions accept const virJSONValue
pointers and it wouldn't even make sense since we sometimes modify what
we get. And because there is no reason for preventing callers of
virJSONValueObjectForeachKeyValue from modifying the values they get in
each iteration we can just stop doing it.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The functions to retrieve online and present host CPU information
are only supported on Linux for the time being.
This leads to runtime errors if these function are used on other
platforms. To avoid that, code in higher levels using the functions
must replicate the conditional compilation in higher level which
is error prone (and is plainly spoken ugly).
Adding a function virHostCPUHasBitmap that can be used to check
for host CPU bitmap support.
NB: There are other functions including the host CPU count that
are lacking support on all platforms, but they are too essential
in order to be bypassed.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Instead of having duplicated code in qemuStorageLimitsRefresh and
virStorageBackendUpdateVolTargetInfo to get capacity specific data
about the storage backing source or volume -- create a common API
to handle the details for both.
As a side effect, virStorageFileProbeFormatFromBuf returns to being
a local/static helper to virstoragefile.c
For the QEMU code - if the probe is done, then the format is saved so
as to avoid future such probes.
For the storage backend code, there is no need to deal with the probe
since we cannot call the new API if target->format == NONE.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Instead of having duplicated code in qemuStorageLimitsRefresh and
virStorageBackendUpdateVolTargetInfoFD to fill in the storage backing
source or volume allocation, capacity, and physical values - create a
common API that will handle the details for both.
The common API will fill in "default" capacity values as well - although
those more than likely will be overridden by subsequent code. Having just
one place to make the determination of what the values should be will
make things be more consistent.
For the QEMU code - the data filled in will be for inactive domains
for the GetBlockInfo and DomainGetStatsOneBlock API's. For the storage
backend code - the data will be filled in during the volume updates.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Commit id '8dc27259' introduced virStorageSourceUpdateBlockPhysicalSize
in order to retrieve the physical size for a block backed source device
for an active domain since commit id '15fa84ac' changed to use the
qemuMonitorGetAllBlockStatsInfo and qemuMonitorBlockStatsUpdateCapacity
API's to (essentially) retrieve the "actual-size" from a 'query-block'
operation for the source device.
However, the code only was made functional for a BLOCK backing type
and it neglected to use qemuOpenFile, instead using just open. After
the open the block lseek would find the end of the block and set the
physical value, close the fd and return.
Since the code would return 0 immediately if the source device wasn't
a BLOCK backed device, the physical would be displayed incorrectly,
such as follows in domblkinfo for a file backed source device:
Capacity: 1073741824
Allocation: 0
Physical: 0
This patch will modify the algorithm to get the physical size for other
backing types and it will make use of the qemuDomainStorageOpenStat
helper in order to open/stat the source file depending on its type.
The qemuDomainGetStatsOneBlock will no longer inhibit printing errors,
but it will still ignore them leaving the physical value set to 0.
Signed-off-by: John Ferlan <jferlan@redhat.com>
In preparation to the code move to virnetdevtap.c, this change:
* renames virNetInterfaceStats to virNetDevTapInterfaceStats
* changes 'path' to 'ifname', to use the same vocable as other
method in virnetdevtap.c.
* Add the attributes checker
When vhostuser interfaces are used, the interface statistics
are not available in /proc/net/dev.
This change looks at the openvswitch interfaces statistics
tables to provide this information for vhostuser interface.
Note that in openvswitch world drop/error doesn't always make sense
for some interface type. When these informations are not available we
set them to 0 on the virDomainInterfaceStats.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
With current perf framework, this patch adds support and documentation
for the branch_instructions perf event.
Signed-off-by: Nitesh Konkar <nitkon12@linux.vnet.ibm.com>
With kernel 3.18 (since commit 3e32cb2e0a12b6915056ff04601cf1bb9b44f967)
the "unlimited" value for cgroup memory limits has changed once again as
its byte value is now computed from a page counter.
The new "unlimited" value reported by the cgroup fs is therefore 2**51-1
pages which is (VIR_DOMAIN_MEMORY_PARAM_UNLIMITED - 3072). This results
e.g. in virsh memtune displaying 9007199254740988 instead of unlimited
for the limits.
This patch uses the value of memory.limit_in_bytes from the cgroup
memory root which is the system's "real" unlimited value for comparison.
See also libvirt commit 231656bbeb for the
history for kernel 3.12 and before.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
This module will be used to track:
<domain, mac address list>
pairs. It will be important to know these mappings without
libvirt connection (that is from a JSON file), because NSS
module will use those to provide better host name translation.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
There are couple of places where we have a string and want to
save it to a file. Atomically. In all those places we use
virFileRewrite() but also implement the very same callback which
takes the string and write it into temp file. This makes no
sense. Unify the callbacks and move them to one place.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The path to the config file for a PCI device is conventiently stored
in a virPCIDevice object, but that object's contents aren't directly
visible outside of virpci.c, so we need to have an accessor function
for it if anyone needs to look at it.
This new function just calls fstat() (if provided with a valid fd) or
stat() (if fd is -1) and returns st_size (or -1 if there is an
error). We may decide we want this function to be more complex, and
handle things like block devices - this is a placeholder (that works)
for any more complicated function.
We have couple of functions that operate over NULL terminated
lits of strings. However, our naming sucks:
virStringJoin
virStringFreeList
virStringFreeListCount
virStringArrayHasString
virStringGetFirstWithPrefix
We can do better:
virStringListJoin
virStringListFree
virStringListFreeCount
virStringListHasString
virStringListGetFirstWithPrefix
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
For a new hostdev type='scsi_host' we have a number of
required functions for managing, adding, and removing the
host device to/from guests. Provide the basic infrastructure
for these tasks.
The name "SCSIVHost" (and its variants) is chosen to avoid
conflicts with existing code named "SCSIHost" to refer to
a hostdev type='scsi' protcol='none'.
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
The path prefixes for sysfs trees are always prepended by paths
beginning with a slash, making the trailing slash in the prefix
redundant.
Signed-off-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Only generate a warning if there is something to report.
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
It is already discussed in "[RFC] daemon: remove hardcode dep on libvirt-guests" [1].
Mgmt can use means to save/restore domains on system shutdown/boot other than
libvirt-guests.service. Thus we need to specify appropriate ordering dependency between
libvirtd, domains and save/restore service. This patch takes approach suggested
in RFC and introduces a systemd target, so that ordering can be built next way:
libvirtd -> domain -> virt-guest-shutdown.target -> save-restore.service.
This way domains are decoupled from specific shutdown service via intermediate
target.
[1] https://www.redhat.com/archives/libvir-list/2016-September/msg01353.html
Use the util function virHostdevIsSCSIDevice() to simplify if
statements.
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Add the function virHostdevIsSCSIDevice() which detects whether a
hostdev is a SCSI device or not.
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
After commit f2bf5fbb04, MinGW strikes again. Simply print pid as any
other place after commit b7d2d4af2b.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Commit 94cc577807 tried fixing build on systems that did not have
SCHED_BATCH or SCHED_IDLE defined. But instead of changing it to
conditional support, it rather completely disabled the support for
setting any scheduler. Since then, such old systems are not
supported, but rather than reverting that commit, let's change that to
the conditional support. That way any addition to the list of
schedulers can follow the same style so that we're consistent in the
future.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This new function can be used to check if e.g. name of XML
node don't contains forbidden chars like "/" or "\n".
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1357416
Rather than return a 0 or -1 and the *result string, return just the result
string to the caller. Alter all the callers to handle the different return.
As a side effect or result of this, it's much clearer that we cannot just
assign the returned string into the scsi_host wwnn, wwpn, and fabric_wwn
fields - rather we should fetch a temporary string, then as long as our
fetch was good, VIR_FREE what may have been there, and STEAL what we just got.
This fixes a memory leak in the virNodeDeviceCreateXML code path through
find_new_device and nodeDeviceLookupSCSIHostByWWN which will continually
call nodeDeviceSysfsGetSCSIHostCaps until the expected wwnn/wwpn is found
in the device object capabilities.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Not every system out there has syslog, that's why we check for it
in our configure script. However, in 640b58abdf while fixing
another issue, some variables and functions are called that are
defined only when syslog.h is present. But these function
calls/variables were not guarded by #ifdef-s.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This initially started as a fix of some debug printing in
virCgroupDetect. However it turned out that other places suffer
from the similar problem. While dealing with pids, esp. in cases
where we cannot use pid_t for ABI stability reasons, we often
chose an unsigned integer type. This makes no sense as pid_t is
signed.
Also, new syntax-check rule is introduced so we won't repeat this
mistake.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In both virLogParseOutput and virLogParseFilter, rather than returning
NULL, goto cleanup since it's possible that for each the first condition
passes, but the || condition doesn't and thus we leak memory.
Handling of outputs and filters has been changed in a way that splits
parsing and defining. Do the same thing for logging priority as well, this
however, doesn't need much of a preparation.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
This is mainly virLogAddOutputTo* which were replaced by virLogNewOutputTo* and
the previously poorly named ones virLogParseAndDefine* functions. All of these
are unnecessary now, since all the original callers were transparently switched
to the new model of separate parsing and defining logic.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Similar to outputs, parser should do parsing only, thus the 'define' logic
is going to be stripped from virLogParseAndDefineFilters by replacing calls to
this method to virLogSetFilters instead.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Since virLogParseAndDefineOutputs is going to be stripped from 'output defining'
logic, replace all relevant occurrences with virLogSetOutputs call to make the
change transparent to all original callers (daemons mostly).
Signed-off-by: Erik Skultety <eskultet@redhat.com>
This method will eventually replace virLogParseAndDefineFilters which
currently does both parsing and defining.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
This API is the entry point to output modification of the logger. Currently,
everything is done by virLogParseAndDefineOutputs. Parsing and defining will be
split into two operations both handled by this method transparently.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Abstraction added over parsing a single filter. The method parses potentially a
set of logging filters, while adding each filter logging object to a
caller-provided array.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Another abstraction added on the top of parsing a single logging output. This
method takes and parses the whole set of outputs, adding each single output
that has already been parsed into a caller-provided array. If the user-supplied
string contained duplicate outputs, only the last occurrence is taken into
account (all the others are removed from the list), so we silently avoid
duplicate logs.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Same as for outputs, introduce a new method, that is basically the same as
virLogParseAndDefineFilter with the difference that it does not define the
filter. It rather returns a newly created object that needs to be inserted into
a list and then defined separately.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Introduce a method to parse an individual logging output. The difference
compared to the virLogParseAndDefineOutput is that this method does not define
the output, instead it makes use of the virLogNewOutputTo* methods introduced
in the previous patch and just returns the virLogOutput object that has to be
added to a list of object which then can be defined as a whole via
virLogDefineOutputs. The idea remains still the same - split parsing and
defining of the logging primitives (outputs, filters).
Additionally, since virLogNewOutputTo* methods are now finally used,
ATTRIBUTE_UNUSED can be successfully removed from the methods' definitions,
since that was just to avoid compiler complaints about unused static functions.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Now that we're in the critical section, syslog connection can be re-opened
by issuing openlog, which is something that cannot be done beforehand, since
syslog keeps its file descriptor private and changing the tag earlier might
introduce a log inconsistency if something went wrong with preparing a new set
of logging outputs in order to replace the existing one.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Continuing with the effort to split output parsing and defining, these new
functions return a logging object reference instead of defining the output.
Eventually, these functions will replace the existing ones (virLogAddOutputTo*)
which will then be dropped.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Prepare a method that only defines a set of filters. It takes a list of
filters, preferably created by virLogParseFilters. The original set of filters
is reset and replaced by the new user-provided set of filters.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Prepare a method that only defines a set of outputs. It takes a list of
outputs, preferably created by virLogParseOutputs. The original set of outputs
is reset and replaced by the new user-provided set of outputs.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Outputs are a bit trickier than filters, since the user(config)-specified
set of outputs can contain duplicates. That would lead to logging the same
message twice. For compatibility reasons, we cannot just error out and forbid
the daemon to start if we find duplicate outputs which do not make sense.
Instead, we could silently take into account only the last occurrence of the
duplicate output and remove all the previous ones, so that the logger will not
try to use them when it is looping over all of its registered outputs.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
In order to later split output parsing and output defining, introduce a new
function which will create a new virLogOutput object which the parser will
insert into a list with the list being eventually defined.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
There is really no reason why we could not keep journald's fd within the
journald output object the same way as we do for regular file-based outputs.
By doing this we later won't have to special case the journald-based output
(due to the fd being globally shared) when replacing the existing set of outputs
with a new one. Additionally, by making this change, we don't need the
virLogCloseJournald routine anymore, plain virLogCloseFd will suffice.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Right now virLogParse* functions are doing both parsing and defining of filters
and outputs which should be two separate operations. Since the naming is
apparently a bit poor this patch renames these functions to
virLogParseAndDefine* which eventually will be replaced by virLogSet*.
Additionally, virLogParse{Filter,Output} will be later (after the split) reused,
so that these functions do exactly what the their name suggests.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
During first stage of virlog.c refactor, commit 0b231195 forgot to remove the
macro definition along with its usage.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
If the event is already disabled, then don't bother with setting it
disabled again. Causes unnecessary error on systems that don't support
the feature anyway.
Provide the Steal API for any code paths that will desire to grab the
object array and then free it afterwards rather than relying to freeing
the whole chain from the reply.
Libvirt, on its own, shouldn't decide whether an expired lease should
stay in the custom leases database or not. It should rather rely on
the 'DEL' event from dnsmasq.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
There is nothing Linux-specific in that function. Also since commit
8c3b5bf481 mingw build is broken due to
the fact that this function is not compiled in the library.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
We will need this function shortly when implementing
nodeGetCPUStats in the test driver.
Signed-off-by: Tomáš Ryšavý <tom.rysavy.0@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Name it virNumaGetHostMemoryNodeset and return only NUMA nodes which
have memory installed. This is necessary as the kernel is not very happy
to set the memory cgroup setting for nodes which do not have any memory.
This would break vcpu hotplug with following message on such
configruation:
Invalid value '0,8' for 'cpuset.mems': Invalid argument
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1375268
Commit id 'a48c7141' altered how to determine if a volume was encrypted
by adding a peek at an offset into the file at a specific buffer location.
Unfortunately, all that was compared was the first "char" of the buffer
against the expect "int" value.
Restore the virReadBufInt32BE to get the complete field in order to
compare against the expected value from the qcow2EncryptionInfo or
qcow1EncryptionInfo "modeValue" field.
This restores the capability to create a volume with encryption, then
refresh the pool, and still find the encryption for the volume.