The fix for CVE-2018-6764 introduced a potential deadlock scenario
that gets triggered by the NSS module when virGetHostname() calls
getaddrinfo to resolve the hostname:
#0 0x00007f6e714b57e7 in futex_wait
#1 futex_wait_simple
#2 __pthread_once_slow
#3 0x00007f6e71d16e7d in virOnce
#4 0x00007f6e71d0997c in virLogInitialize
#5 0x00007f6e71d0a09a in virLogVMessage
#6 0x00007f6e71d09ffd in virLogMessage
#7 0x00007f6e71d0db22 in virObjectNew
#8 0x00007f6e71d0dbf1 in virObjectLockableNew
#9 0x00007f6e71d0d3e5 in virMacMapNew
#10 0x00007f6e71cdc50a in findLease
#11 0x00007f6e71cdcc56 in _nss_libvirt_gethostbyname4_r
#12 0x00007f6e724631fc in gaih_inet
#13 0x00007f6e72464697 in __GI_getaddrinfo
#14 0x00007f6e71d19e81 in virGetHostnameImpl
#15 0x00007f6e71d1a057 in virGetHostnameQuiet
#16 0x00007f6e71d09936 in virLogOnceInit
#17 0x00007f6e71d09952 in virLogOnce
#18 0x00007f6e714b5829 in __pthread_once_slow
#19 0x00007f6e71d16e7d in virOnce
#20 0x00007f6e71d0997c in virLogInitialize
#21 0x00007f6e71d0a09a in virLogVMessage
#22 0x00007f6e71d09ffd in virLogMessage
#23 0x00007f6e71d0db22 in virObjectNew
#24 0x00007f6e71d0dbf1 in virObjectLockableNew
#25 0x00007f6e71d0d3e5 in virMacMapNew
#26 0x00007f6e71cdc50a in findLease
#27 0x00007f6e71cdc839 in _nss_libvirt_gethostbyname3_r
#28 0x00007f6e71cdc724 in _nss_libvirt_gethostbyname2_r
#29 0x00007f6e7248f72f in __gethostbyname2_r
#30 0x00007f6e7248f494 in gethostbyname2
#31 0x000056348c30c36d in hosts_keys
#32 0x000056348c30b7d2 in main
Fortunately the extra work virGetHostname() does is irrelevant to
the needs of the logging code, so we can just call the native
gethostname() function directly.
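As a minimal sketch of the idea (the helper name log_get_hostname is hypothetical and this is not libvirt's actual code), fetching the name with gethostname() alone never goes through getaddrinfo(), so no NSS module can be pulled in during log initialization:

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper, not libvirt's actual code: fetch the host name
 * with gethostname() only, so no NSS module is ever consulted.
 * Returns 0 on success, -1 on failure. */
int
log_get_hostname(char *buf, size_t buflen)
{
    assert(buf != NULL && buflen > 0);

    if (gethostname(buf, buflen) < 0)
        return -1;

    /* POSIX does not guarantee NUL termination if the name is truncated. */
    buf[buflen - 1] = '\0';
    return 0;
}
```

The trade-off is that the result is whatever the kernel has configured, with no attempt to expand it to a fully qualified name.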
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Some function comments don't name their parameters correctly,
and others align their descriptions inconsistently.
This patch fixes both.
Signed-off-by: Chen Hanxiao <chenhanxiao@gmail.com>
The QEMU driver loadable module needs to be able to resolve all ELF
symbols it references against libvirt.so. Some of its symbols can only
be resolved against the storage_driver.so loadable module which creates
a hard dependency between them. By moving the storage file backend
framework into the util directory, this gets included directly in the
libvirt.so library. The actual backend implementations are still done as
loadable modules, so this doesn't re-add deps on gluster libraries.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
At a later point it might not be possible or even safe to use getaddrinfo(). It
can in turn result in an NSS module being loaded.
Notably, on a LXC container startup we may find ourselves with the guest
filesystem already having replaced the host one. Loading an NSS module
from the guest tree would allow a malicious guest to escape the
confinement of its container environment because libvirt will not yet
have locked it down.
Note the fact that the unused portion of the last element in the bitmap
needs to be cleared, since we use functions which process only full-size
elements and don't really deal with individual bits.
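The clearing step can be sketched roughly as follows. The helper name and the BITS_PER_UNIT macro are assumptions made for illustration and are not taken from the virBitmap code; the sketch only shows how the bits at or beyond the new size are masked off in the last element:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_UNIT (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper, not libvirt's virBitmap code: zero the bits of the
 * last map element that lie at or beyond nbits. */
void
bitmap_clear_tail(unsigned long *map, size_t nbits)
{
    size_t tail = nbits % BITS_PER_UNIT;

    assert(map != NULL);

    /* A multiple of the element size leaves no unused bits. */
    if (tail == 0)
        return;

    /* Keep only the low 'tail' bits of the last element in use. */
    map[nbits / BITS_PER_UNIT] &= (1UL << tail) - 1;
}
```

With the tail zeroed, functions that walk whole elements (counting bits, comparing maps) give the same answer as if they had looked at individual bits.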
The function only reduces the size of the bitmap thus we can use the
appropriate shrinking function which also does not have any return
value.
Since virBitmapShrink now does not return any value callers need to be
fixed as well.
The virBitmap code uses VIR_RESIZE_N to do quadratic scaling, which
means that along with the number of requested map elements we also need
to keep the number of actually allocated elements for the scaling
algorithm to work properly.
The shrinking code did not update 'map_alloc', so virResizeN might
not expand the bitmap properly when called on a previously
shrunk bitmap.
'max_bit' is misleading as the value is set to the first invalid bit
as it's used as the number of bits in the bitmap. Rename it to a more
descriptive name.
Just in case someone re-mounted /sys/fs/resctrl with different mount
options (cdp), add a check here.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1540780
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Some of the other functions depend on the fact that unused bits and longs are
always zero, and it's less error-prone to clear them than to fix the other functions.
It's enough to zero out one piece of the map since we're calling realloc() to
get rid of the rest (and updating map_len).
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1540817
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Add a new error code so that consumers (such as Nova) can
key off a specific error code rather than needing to search the
error message.
Signed-off-by: Chen Hanxiao <chenhanxiao@gmail.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Some platforms/toolchains will complain about casting
sockaddr_storage to sockaddr_un because it breaks the strict
aliasing rule:
../../src/util/virutil.c: In function 'virGetUNIXSocketPath':
../../src/util/virutil.c:2005: error: dereferencing pointer 'un' does break strict-aliasing rules [-Wstrict-aliasing]
Change the code to use a union, in the same way that the
virsocketaddr.h header does.
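A rough sketch of the union approach is shown below. The union and its member names are assumptions chosen for illustration, not copied from virsocketaddr.h, and unix_socket_path is a hypothetical helper. Accessing the address through union members avoids the pointer cast that triggers the warning:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Illustrative union; the member names are assumptions, not copied
 * from virsocketaddr.h. */
union sock_addr {
    struct sockaddr sa;
    struct sockaddr_storage ss;
    struct sockaddr_un un;
};

/* Hypothetical helper: copy the path of a UNIX socket address into buf.
 * Returns 0 on success, -1 if the address is not AF_UNIX or buf is too
 * small. */
int
unix_socket_path(const union sock_addr *addr, char *buf, size_t buflen)
{
    size_t len;

    assert(addr != NULL && buf != NULL);

    if (addr->sa.sa_family != AF_UNIX)
        return -1;

    /* sun_path is not guaranteed to be NUL-terminated. */
    len = strnlen(addr->un.sun_path, sizeof(addr->un.sun_path));
    if (len >= buflen)
        return -1;

    memcpy(buf, addr->un.sun_path, len);
    buf[len] = '\0';
    return 0;
}
```

A caller would pass &addr.sa together with sizeof(addr.ss) to getsockname() and then read the result through addr.un.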
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When starting an LXC container, the /dev entries are created
under temp root (/var/run/libvirt/lxc/$name.dev), relabelled and
then the root is pivoted. However, when it comes to USB devices
which keep path to the device in the structure we need a way to
override the default /dev/usb/... path because we want to work
with the one under temp root. That's what @vroot argument is for
in virUSBDeviceNew. However, what is being passed there is:
vroot = /var/run/libvirt/lxc/lxc_0.dev/bus/usb
Therefore, constructed path is wrong:
dev->path = //var/run/libvirt/lxc/lxc_0.dev/bus/usb//dev/bus/usb/002/002
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
When receiving multiple socket FDs from systemd, it is critical to know
what socket address each corresponds to so we can setup the right
protocols on each.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Due to confusing naming the pointer to the mask got copied which must not
happen, so use UpdateMask instead of SetMask. That also means we can get
completely rid of SetMask.
Also don't clear the free bits since it is not used again (leftover from
previous versions).
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Introduce virResctrlAllocCopyMasks() and use that to initially copy the default
group schemata to the allocation before reserving any parts of the cache. The
reason for this is that when a new group is created the schemata will have unknown
data in it. If there was previously a group with the same CLoS ID, it will have
the previous values; if not, it will have all bits set. And we need to set all
unspecified (in the XML) allocations to the same one as the default group.
Some non-Linux functions now need to be made public due to this change.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1289368
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
While the QEMU QAPI schema describes 'lun' as a number, the code dealing
with JSON strings does not strictly adhere to this schema and thus
formats the number back as a string. Use the new helper to retrieve both
possibilities.
Note that the formatting code is okay and qemu will accept it as an int.
Tweak also one of the test strings to verify that both formats work
with libvirt.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1540290
The helper is useful in cases when the JSON we have to parse may contain
one of the two due to historical reasons and the number value itself
would be stored as a string.
We are skipping non-directories under /sys/fs/resctrl/(info/) since those are not
interesting for us. However in tests it can sometimes happen that ent->d_type
is 0 instead of 4 (DT_DIR) for directories.
I've seen it fail on two machines. Different machines, different systems, and I
cannot reproduce it even using the same setup. One way to work
around this is to call stat() on the entry. The other is to not check whether it is a
directory at all, since we'll find out eventually when we want to read some files
underneath it.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This will be used in the future, but it makes sense for now as well. It makes
sure there is no mask leftover that would leak.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Pointed out during review in one or two places, but it actually appears in a lot
more places. So let's be consistent.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
When working on the CAT series one of the changes was that the pointer got
allocated in another part of the code, even when resctrl was not available on
the host system. However this one particular place neglected that so it needs
to be fixed in order to get the proper error message when requesting
<cachetune/> on HW with no support for it.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Commits f83c7c88 and 6eb1f2b9 broke the build on FreeBSD and OSX because
of symbols being undefined for those platforms.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
This is a replacement for the existing udevPCIGetMdevTypesCap which is
static to the udev backend. This simple helper constructs the sysfs path
from the device's base path for each mdev type and queries the
corresponding attributes of that type.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
This should serve as a replacement for the existing udevFillMdevType
which is responsible for fetching the device type's attributes from the
sysfs interface. The problem with the existing solution is that it's
tied to the udev backend.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
This is later going to replace the existing virNodeDevCapMdevType, since:
1) it's going to couple related stuff in a single module
2) util is supposed to contain helpers that are widely accessible across
the whole repository.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
With this commit we finally have a way to read and manipulate basic resctrl
settings. Locking is done only on exposed functions that read/write from/to
resctrlfs. Not in functions that are exposed in virresctrlpriv.h as those are
only supposed to be used from tests.
More information about how resctrl works:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/x86/intel_rdt_ui.txt
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This will make the current functions obsolete and it will provide more
information to the virresctrl module so that it can be used later.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The OEM strings table in SMBIOS allows the vendor to pass arbitrary
strings into the guest OS. This can be used as a way to pass data to an
application like cloud-init, or potentially as an alternative to the
kernel command line for OS installers where you can't modify the install
ISO image to change the kernel args.
As an example, if cloud-init and anaconda supported OEM strings,
you could use something like:
  <oemStrings>
    <entry>cloud-init:ds=nocloud-net;s=http://10.10.0.1:8000/</entry>
    <entry>anaconda:method=http://dl.fedoraproject.org/pub/fedora/linux/releases/25/x86_64/os</entry>
  </oemStrings>
Use of an application-specific prefix as illustrated above is
recommended, but not mandated, so that an app can reliably identify
which of the many OEM strings are targeted at it.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Commit 8708ca01c added virNetDevSwitchdevFeature() to check if a network
device has Switchdev capabilities. virNetDevSwitchdevFeature() attempts
to retrieve the PCI device associated with the network device, ignoring
non-PCI devices. It does so via the following call chain
virNetDevSwitchdevFeature()->virNetDevGetPCIDevice()->
virPCIGetDeviceAddressFromSysfsLink()
For non-PCI network devices (qeth, Xen vif, etc),
virPCIGetDeviceAddressFromSysfsLink() will report an error when
virPCIDeviceAddressParse() fails. virPCIDeviceAddressParse() also
logs an error. After commit 8708ca01c there are now two errors reported
for each non-PCI network device even though the errors are harmless.
To avoid the errors, introduce virNetDevIsPCIDevice() and use it in
virNetDevGetPCIDevice() before attempting to retrieve the associated
PCI device. virNetDevIsPCIDevice() uses the 'subsystem' property of the
device to determine if it is PCI. See the sysfs rules in kernel
documentation for more details
https://www.kernel.org/doc/html/latest/admin-guide/sysfs-rules.html
Let's also parse the available processor frequency information on S390
so that it can be utilized by virsh sysinfo:
# virsh sysinfo
<sysinfo type='smbios'>
...
<processor>
<entry name='family'>2964</entry>
<entry name='manufacturer'>IBM/S390</entry>
<entry name='version'>00</entry>
<entry name='max_speed'>5000</entry>
<entry name='serial_number'>145F07</entry>
</processor>
...
</sysinfo>
Reviewed-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Signed-off-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Since kernel version 4.7, processor frequency information is available
on S390. Let's adjust the parser so this information shows up for virsh
nodeinfo:
# virsh nodeinfo
CPU model: s390x
CPU(s): 8
CPU frequency: 5000 MHz
CPU socket(s): 1
Core(s) per socket: 8
Thread(s) per core: 1
NUMA cell(s): 1
Memory size: 16273908 KiB
Reviewed-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Signed-off-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Some ARM platforms, such as the original Raspberry Pi, report the
CPU frequency in the BogoMIPS field of /proc/cpuinfo, so libvirt
parsed that field and returned it through its API.
However, not only do many more boards report no value there at all,
but several - including ARMv8-based server hardware, and even the
more recent Raspberry Pi 3 - use this field as originally intended:
to report the BogoMIPS value instead of the CPU frequency.
Since we have no way of detecting how the field is being used,
it's better to report no information at all rather than something
ludicrous like "your shiny 96-core aarch64 virtualization host's
CPUs are running at a whopping 100 MHz".
Partially-resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1206353
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Make the parser both more strict, by not ignoring errors reported
by virStrToLong_ui(), and more permissive, by not failing due to
unrelated fields which just happen to have a known prefix and
accepting any amount of whitespace before the numeric value.
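To illustrate what "more strict and more permissive" can mean in practice, here is a sketch using plain strtoul() instead of virStrToLong_ui(). The function name, the return convention and the sample field names are assumptions for illustration only:

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch of a "strict but permissive" field parser; the name
 * and return convention are assumptions, not libvirt's code.
 * Returns 1 and stores the value if line is "<prefix> : <number>...",
 * 0 if the line is some other field (even one sharing the prefix),
 * -1 if the field matches but its value is not a valid unsigned int. */
int
parse_mhz_field(const char *line, const char *prefix, unsigned int *mhz)
{
    const char *p;
    char *end;
    unsigned long val;

    assert(line != NULL && prefix != NULL && mhz != NULL);

    if (strncmp(line, prefix, strlen(prefix)) != 0)
        return 0;
    p = line + strlen(prefix);

    /* Permissive: any amount of whitespace may surround the colon. */
    while (isspace((unsigned char)*p))
        p++;
    if (*p != ':')
        return 0;   /* a different field that merely shares the prefix */
    p++;
    while (isspace((unsigned char)*p))
        p++;

    /* Strict: reject signs, empty values, overflow and trailing junk. */
    if (!isdigit((unsigned char)*p))
        return -1;
    errno = 0;
    val = strtoul(p, &end, 10);
    if (errno != 0 || val > UINT_MAX)
        return -1;
    if (*end != '\0' && *end != '.' && !isspace((unsigned char)*end))
        return -1;

    *mhz = (unsigned int)val;
    return 1;
}
```

Separating "not my field" (0) from "my field, bad value" (-1) lets the caller skip unrelated lines while still reporting real parse errors.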
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Instead of a generic "your architecture", print the actual
architecture name.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>