Unlike the host devices of other types, SCSI host device XML supports
"shareable" tag. This patch introduces it for the virSCSIDevice struct
for a later patch use (to detect if the SCSI device is shareable when
preparing the SCSI host device in QEMU driver).
There are 2 issues here: First we shouldn't add "1" to the return
value of numa_max_node(), since the semanteme of the error message
was changed, it's not saying about the number of total NUMA nodes
anymore. Second, the value of "bit" is the position of the first
bit which exceeds either numa_max_node() or NUMA_NUM_NODES, it can
be any number in the range, so saying "bigger than $bit" is quite
confused now. For example, assuming there is a NUMA machine which
has 10 NUMA nodes, and one specifies the "nodeset" as "0,5,88",
the error message will be like:
Nodeset is out of range, host cannot support NUMA node bigger than 88
It sounds like all NUMA node number less than 88 is fine, but
actually the maximum NUMA node number the machine supports is 9.
This patch fixes the issues by removing the addition with "1" and
simplifies the error message as "NUMA node $bit is out of range".
Also simplifies the comparision in the while loop by getting the
smaller one of numa_max_node() and NUMA_NUM_NODES up front.
This is useful in certain circumstances, for example when
libvirtd is being executed by FreeBSD rc script, it cannot find
dmidecode installed from FreeBSD ports because it doesn't have
/usr/local (default prefix for ports) in PATH.
https://bugzilla.redhat.com/show_bug.cgi?id=1046919
Since commit v0.9.0-47-g4e8969e (released in 0.9.1) some failures during
device detach were reported to callers of virPCIDeviceBindToStub as
success. For example, even though a device seemed to be detached
virsh # nodedev-detach pci_0000_07_05_0 --driver vfio
Device pci_0000_07_05_0 detached
one could find similar message in libvirt logs:
Failed to bind PCI device '0000:07:05.0' to vfio-pci: No such device
This patch fixes these paths and also avoids overwriting real errors
with errors encountered during a cleanup phase.
https://bugzilla.redhat.com/show_bug.cgi?id=1046919
When a PCI device is not bound to any driver, reattach should just
trigger driver probe rather than failing with
Invalid device 0000:00:19.0 driver file
/sys/bus/pci/devices/0000:00:19.0/driver is not a symlink
While virPCIDeviceGetDriverPathAndName was documented to return success
and NULL driver and path when a device is not attached to any driver but
didn't do so. Thus callers could not distinguish unbound devices from
failures.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
This patch introduces virCgroupSetBlkioDeviceReadIops,
virCgroupSetBlkioDeviceWriteIops,
virCgroupSetBlkioDeviceReadBps and
virCgroupSetBlkioDeviceWriteBps,
we can use these interfaces to set up throttle
blkio cgroup for domain.
This patch also adds the new throttle blkio cgroup
elements to the test xml.
Signed-off-by: Guan Qiang <hzguanqiang@corp.netease.com>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
A "xmlstr" string may not be assigned into a "doc" pointer and it
could cause memory leak. To fix it if the "doc" pointer is NULL and
the "xmlstr" string is not assigned we should free it.
This has been found by coverity.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
On my Fedora 20 box with mingw cross-compiler, the build failed with:
../../src/rpc/virnetclient.c: In function 'virNetClientSetTLSSession':
../../src/rpc/virnetclient.c:745:14: error: unused variable 'oldmask' [-Werror=unused-variable]
sigset_t oldmask, blockedsigs;
^
I traced it to the fact that mingw64-winpthreads installs a header
that does #define pthread_sigmask(...) 0, which means any argument
only ever passed to pthread_sigmask is reported as unused. This
patch works around the compilation failure, with behavior no worse
than what mingw already gives us regarding the function being a
no-op.
* configure.ac (pthread_sigmask): Probe for broken mingw macro.
* src/util/virutil.h (pthread_sigmask): Rewrite to something that
avoids unused variables.
Signed-off-by: Eric Blake <eblake@redhat.com>
Like commit 94a26c7e from Eric Blake, the old fuzzy code should
be replaced by the new array management macros now.
And the type of scsi->count should be changed into "size_t", and
thus virSCSIDeviceListCount should return size_t instead, similar
for vir{PCI,USB}DeviceListCount.
When the host is configured with very restrictive firewall (default policy
is DROP for all chains, including OUTPUT), the bridge driver for Linux
adds netfilter entries to allow DHCP and DNS requests to go from the VM
to the dnsmasq of the host.
The issue that this commit fixes is the fact that a DROP policy on the OUTPUT
chain blocks the DHCP replies from the host’s dnsmasq to the VM.
As DHCP replies are sent in UDP, they are not caught by any --ctstate ESTABLISHED
rule and so, need to be explicitly allowed.
Signed-off-by: Lénaïc Huard <lenaic@lhuard.fr.eu.org>
When determining if a device is behind a PCI bridge, the PCI device
class is checked by reading the config space. However, there are some
devices which have the wrong class on the config space, but the class is
initialized by Linux correctly as a PCI BRIDGE. This class can be read
by the sysfs file '/sys/bus/pci/devices/xxxx:xx:xx.x/class'.
One example of such bridge is IBM PCI Bridge 1014:03b9, which is
identified as a Host Bridge when reading the config space.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Some of our operation denied messages are outright stupid; for
example, if virIdentitySetAttr fails:
error: operation Identity attribute is already set forbidden for read only access
This patch fixes things to a saner:
error: operation forbidden: Identity attribute is already set
It also consolidates the most common usage pattern for operation
denied errors: read-only connections preventing a public API. In
this case, 'virsh -r -c test:///default destroy test' changes from:
error: operation virDomainDestroy forbidden for read only access
to:
error: operation forbidden: read only access prevents virDomainDestroy
Note that we were previously inconsistent on which APIs used
VIR_FROM_DOM (such as virDomainDestroy) vs. VIR_FROM_NONE (such as
virDomainPMSuspendForDuration). After this patch, all uses
consistently use VIR_FROM_NONE, on the grounds that it is unlikely
that a caller learning that a call is denied can do anything in
particular with extra knowledge which error domain the call belongs
to (similar to what we did in commit baa7244).
* src/util/virerror.c (virErrorMsg): Rework OPERATION_DENIED error
message.
* src/internal.h (virCheckReadOnlyGoto): New macro.
* src/util/virerror.h (virReportRestrictedError): New macro.
* src/libvirt-lxc.c: Use new macros.
* src/libvirt-qemu.c: Likewise.
* src/libvirt.c: Likewise.
* src/locking/lock_daemon.c (virLockDaemonClientNew): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
We weren't very consistent in our use of VIR_ERR_NO_SUPPORT; many
users just passed __FUNCTION__ on, while others passed "%s" to
silence over-eager compilers that warn about __FUNCTION__ not
containing any %. It's nicer to route all these uses through
a single macro, so that if we ever need to change the reporting,
we can do it in one place.
I verified that 'virsh -c test:///default qemu-monitor-command test foo'
gives the same error message before and after this patch:
error: this function is not supported by the connection driver: virDomainQemuMonitorCommand
Note that in libvirt.c, we were inconsistent on whether virDomain*
API used virLibConnError() (with VIR_FROM_NONE) or virLibDomainError()
(with VIR_FROM_DOMAIN); this patch unifies these errors to all use
VIR_FROM_NONE, on the grounds that it is unlikely that a caller
learning that a call is unimplemented can do anything in particular
with extra knowledge of which error domain it belongs to.
One particular change to note is virDomainOpenGraphics which was
trying to fail with VIR_ERR_NO_SUPPORT after a failed
VIR_DRV_SUPPORTS_FEATURE check; all other places that fail a
feature check report VIR_ERR_ARGUMENT_UNSUPPORTED.
* src/util/virerror.h (virReportUnsupportedError): New macro.
* src/libvirt-qemu.c: Use new macro.
* src/libvirt-lxc.c: Likewise.
* src/lxc/lxc_driver.c: Likewise.
* src/security/security_manager.c: Likewise.
* src/util/virinitctl.c: Likewise.
* src/libvirt.c: Likewise.
(virDomainOpenGraphics): Use correct error for unsupported feature.
Signed-off-by: Eric Blake <eblake@redhat.com>
Having one API call into another is generally not good; among
other issues, it gives confusing logs, and is not quite as
efficient.
This fixes several instances, but not all: we still have instances
in both libvirt.c and in backend hypervisors (lxc and qemu) calling
the public virTypedParamsGetString and friends, which dispatch
errors immediately. I'm not sure if it is worth trying to clean
that up in a separate patch (such a cleanup may be easiest by
separating the public function into a wrapper around the internal,
then tweaking internal.h so that internal users directly use the
internal function).
* src/libvirt.c (virDomainGetUUIDString, virNetworkGetUUIDString)
(virStoragePoolGetUUIDString, virSecretGetUUIDString)
(virNWFilterGetUUIDString): Avoid nested public API call.
* src/util/virtypedparam.c (virTypedParamsReplaceString): Don't
dispatch errors here.
(virTypedParamsGet): No need to reset errors.
(virTypedParamsGetBoolean): Use consistent ordering.
Signed-off-by: Eric Blake <eblake@redhat.com>
I noticed that the virDomainQemuMonitorCommand debug output wasn't
telling me the name of the domain it was working on. While it was
easy enough to determine which pointer matches the domain based on
other log messages, it is nicer to be consistent.
* src/util/viruuid.h (VIR_UUID_DEBUG): Moved here from...
* src/libvirt.c (VIR_UUID_DEBUG): ...here.
(VIR_ARG15, VIR_HAS_COMMA, VIR_DOMAIN_DEBUG_EXPAND)
(VIR_DOMAIN_DEBUG_PASTE, VIR_DOMAIN_DEBUG_0, VIR_DOMAIN_DEBUG_1)
(VIR_DOMAIN_DEBUG_2, VIR_DOMAIN_DEBUG): Move...
* src/datatypes.h: ...here.
* src/libvirt-qemu.c (virDomainQemuMonitorCommand)
(virDomainQemuAgentCommand): Better debug messages.
* src/libvirt-lxc.c (virDomainLxcOpenNamespace): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Since libvirt 0.9.3, the entire virevent.c file has been a public
API, so improve the documentation in this file. Also, fix a
potential core dump - it could only be triggered by bogus use of
the API and would only affect the caller (not libvirtd), but we
might as well be nice.
* src/libvirt.c (virConnectSetKeepAlive)
(virConnectDomainEventRegister, virConnectDomainEventRegisterAny)
(virConnectNetworkEventRegisterAny): Document event loop requirement.
* src/util/virevent.c (virEventAddHandle, virEventRemoveHandle)
(virEventAddTimeout, virEventRemoveTimeout): Likewise.
(virEventUpdateHandle, virEventUpdateTimeout): Likewise, and avoid
core dump if caller didn't register handler.
(virEventRunDefaultImpl): Expand example, and set up code block in
html docs.
(virEventRegisterImpl, virEventRegisterDefaultImpl): Document more
on the use of the event loop.
Signed-off-by: Eric Blake <eblake@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1044806
Currently, sending the ANSI_A keycode from os_x codepage doesn't work as
it has a special value of 0x0. Our internal code handles that no
different to other not defined keycodes. Hence, in order to allow it we
must change all the undefined keycodes from 0 to -1 and adapt some code
too.
# virsh send-key guestname --codeset os_x ANSI_A
error: invalid keycode: 'ANSI_A'
# virsh send-key guestname --codeset os_x ANSI_B
# virsh send-key guestname --codeset os_x ANSI_C
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Currently the virDBusAddWatch does
virEventAddHandle(fd, flags,
virDBusWatchCallback,
watch, NULL);
dbus_watch_set_data(watch, info, virDBusWatchFree);
Unfortunately this is racy - since the event loop is in a
different thread, the virDBusWatchCallback method may be
run before we get to calling dbus_watch_set_data. We must
reverse the order of these calls
See https://bugzilla.redhat.com/show_bug.cgi?id=885445
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Recent changes to events (commit 8a29ffcf) resulted in new compile
failures on some targets (such as ARM OMAP5):
conf/domain_event.c: In function 'virDomainEventDispatchDefaultFunc':
conf/domain_event.c:1198:30: error: cast increases required alignment of
target type [-Werror=cast-align]
conf/domain_event.c:1314:34: error: cast increases required alignment of
target type [-Werror=cast-align]
cc1: all warnings being treated as errors
The error is due to alignment; the base class is merely aligned
to the worst of 'int' and 'void*', while the child class must
be aligned to a 'long long'. The solution is to include a
'long long' (and for good measure, a function pointer) in the
base class to ensure correct alignment regardless of what a
child class may add, but to wrap the inclusion in a union so
as to not incur any wasted space. On a typical x86_64 platform,
the base class remains 16 bytes; on i686, the base class remains
12 bytes; and on the impacted ARM platform, the base class grows
from 12 bytes to 16 bytes due to the increase of alignment from
4 to 8 bytes.
Reported by Michele Paolino and others.
* src/util/virobject.h (_virObject): Use a union to ensure that
subclasses never have stricter alignment than the parent.
* src/util/virobject.c (virObjectNew, virObjectUnref)
(virObjectRef): Adjust clients.
* src/libvirt.c (virConnectRef, virDomainRef, virNetworkRef)
(virInterfaceRef, virStoragePoolRef, virStorageVolRef)
(virNodeDeviceRef, virSecretRef, virStreamRef, virNWFilterRef)
(virDomainSnapshotRef): Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorOpenInternal)
(qemuMonitorClose): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Since kernel 3.12 (commit 34ff8dc08956098563989d8599840b130be81252 in
linux-stable.git in particular) the value for 'unlimited' in cgroup
memory limits changed from LLONG_MAX to ULLONG_MAX. Due to rather
unfortunate choice of our VIR_DOMAIN_MEMORY_PARAM_UNLIMITED constant
(which we transfer as an unsigned long long in Kibibytes), we ended up
with the situation described below (applies to x86_64):
- 2^64-1 (ULLONG_MAX) -- "unlimited" in kernel = 3.12
- 2^63-1 (LLONG_MAX) -- "unlimited" in kernel < 3.12
- 2^63-1024 -- our PARAM_UNLIMITED scaled to Bytes
- 2^53-1 -- our PARAM_UNLIMITED unscaled (in Kibibytes)
This means that when any number within (2^63-1, 2^64-1] is read from
memory cgroup, we are transferring that number instead of "unlimited".
Unfortunately, changing VIR_DOMAIN_MEMORY_PARAM_UNLIMITED would break
ABI compatibility and thus we have to resort to a different solution.
With this patch every value greater than PARAM_UNLIMITED means
"unlimited". Even though this may seem misleading, we are already in
such unclear situation when running 3.12 kernel with memory limits set
to 2^63.
One example showing most of the problems at once (with kernel 3.12.2):
# virsh memtune asdf --hard-limit 9007199254740991 --swap-hard-limit -1
# echo 12345678901234567890 >\
/sys/fs/cgroup/memory/machine/asdf.libvirt-qemu/memory.soft_limit_in_bytes
# virsh memtune asdf
hard_limit : 18014398509481983
soft_limit : 12056327051986884
swap_hard_limit: 18014398509481983
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
In 78839da I am trying to join the worker threads. However, I can't
sipmly reuse pool->nWorkers (same applies for pool->nPrioWorkers),
because of the following flow that is currently implemented:
1) the main thread executing virThreadPoolFree sets pool->quit = true,
wakes up all the workers and wait on pool->quit_cond.
2) A worker is woken up and see quit request. It immediately jumps of
the while() loop and decrements pool->nWorkers (or pool->nPrioWorkers in
case of priority worker). The last thread signalizes pool->quit_cond.
3) Main thread is woken up, with both pool->nWorkers and
pool->nPrioWorkers being zero.
So there's a need to copy the original value of worker thread counts
into local variables. However, these need to set *after* the check for
pool being NULL (dereferencing a NULL is no no). And for safety they can
be set right after the pool is locked.
Reported-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Even though currently we are freeing the pool of worker threads at the
daemon very end, nothing holds us back in joining the worker threads.
Moreover, we avoid leaks like this:
==26697== 1,680 bytes in 5 blocks are possibly lost in loss record 913 of 942
==26697== at 0x4C2BDE4: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==26697== by 0x4011131: allocate_dtv (in /lib64/ld-2.16.so)
==26697== by 0x401176D: _dl_allocate_tls (in /lib64/ld-2.16.so)
==26697== by 0x8499602: pthread_create@@GLIBC_2.2.5 (in /lib64/libpthread-2.16.so)
==26697== by 0x52F53E9: virThreadCreate (virthreadpthread.c:188)
==26697== by 0x52F5D4F: virThreadPoolNew (virthreadpool.c:221)
==26697== by 0x53F30DB: virNetServerNew (virnetserver.c:377)
==26697== by 0x11C6ED: main (libvirtd.c:1366)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The code for extracting sub-mounts would just do a STRPREFIX
check on the mount. This was flawed because if there were
the following mounts
/etc/aliases
/etc/aliases.db
and '/etc/aliases' was asked for, it would return both even
though the latter isn't a sub-mount.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Move the code for lxcContainerGetSubtree into the virfile
module creating 2 new functions
int virFileGetMountSubtree(const char *mtabpath,
const char *prefix,
char ***mountsret,
size_t *nmountsret);
int virFileGetMountReverseSubtree(const char *mtabpath,
const char *prefix,
char ***mountsret,
size_t *nmountsret);
Add a new virfiletest.c test case to validate the new code.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Most of our code base uses space after comma but not before;
fix the remaining uses before adding a syntax check.
* src/util/vircommand.c: Consistently use commas.
* src/util/virlog.c: Likewise.
* src/util/virnetdevbandwidth.c: Likewise.
* src/util/virnetdevmacvlan.c: Likewise.
* src/util/virnetdevvportprofile.c: Likewise.
* src/util/virnetlink.c: Likewise.
* src/util/virpci.c: Likewise.
* src/util/virsysinfo.c: Likewise.
* src/util/virusb.c: Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Report the error in virPortAllocatorAcquire instead
of doing it in every caller.
The error contains the port range name instead of the intended
use for the port, e.g.:
Unable to find an unused port in range 'display' (65534-65535)
instead of:
Unable to find an unused port for SPICE
This also adds error reporting when the QEMU driver could not
find an unused port for VNC, VNC WebSockets or NBD migration.
These two chunks had to be part of df4283a55b. But for some unclear
reason, the weren't. Anyway, these two variables are not used anywhere
within function. They're initialized to NULL and then VIR_FREE()-d. And
there's no reason do do two NOPs, right?
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=1025397
When virPCIGetVirtualFunctions created the list of an SRIOV Physical
Function's (PF) Virtual Functions (VF), it had assumed that the order
of "virtfn*" links returned by readdir() from the PF's sysfs directory
was already in the correct order. Experience has shown that this is
not always the case - it can be in alphabetical order (which would
e.g. place virtfn11 before virtfn2) or even some seemingly random
order (see the example in the bugzilla report)
This results in 1) incorrect assumptions made by consumers of the
output of the virt_functions list of virsh nodedev-dumpxml, and 2)
setting MAC address and vlan tag on the wrong VF (since libvirt uses
netlink to set mac address and vlan tag, netlink requires the VF#, and
the function virPCIGetVirtualFunctionIndex() returns the wrong index
due to the improperly ordered VF list).
The solution provided by this patch is for virPCIGetVirtualFunctions
to no longer scan the entire device directory in its natural order,
but instead to check for links individually by name "virtfn%d" where
%d starts at 0 and increases with each success. Since VFs are created
contiguously by the kernel, this will guarantee that all VFs are
found, and placed in the arry in the correct order.
One note of use to the uninitiated is that VIR_APPEND_ELEMENT always
either increments *num_virtual_functions or fails, so no this isn't an
endless loop.
(NB: the SRIOV_* defines at the top of virpci.c were removed
because they are unnecessary and/or not used.)
This gets rid of another stat() per volume, as well as cutting
bytes read in half, when populating the volumes of a directory
pool during a pool refresh. Not to mention that it provides an
interface that can let gluster pools also probe file types.
* src/util/virstoragefile.h (virStorageFileProbeFormatFromFD):
Delete.
(virStorageFileProbeFormatFromBuf): New prototype.
(VIR_STORAGE_MAX_HEADER): New constant, based on...
* src/util/virstoragefile.c (STORAGE_MAX_HEAD): ...old name.
(vmdk4GetBackingStore, virStorageFileGetMetadataInternal)
(virStorageFileProbeFormat): Adjust clients.
(virStorageFileProbeFormatFromFD): Delete.
(virStorageFileProbeFormatFromBuf): Export.
* src/storage/storage_backend_fs.c (virStorageBackendProbeTarget):
Adjust client.
* src/libvirt_private.syms (virstoragefile.h): Adjust exports.
Signed-off-by: Eric Blake <eblake@redhat.com>
Future patches will want to learn metadata about a file using
a buffer that was already parsed in order to probe the file's
format. Rather than reopening and re-reading the file, it makes
sense to separate getting file contents from actually parsing
those contents.
* src/util/virstoragefile.c (virStorageFileGetMetadataFromBuf)
(virStorageFileGetMetadataFromFDInternal): New functions.
(virStorageFileGetMetadataInternal): Hoist fstat() and read() into
callers.
(virStorageFileGetMetadataFromFD)
(virStorageFileGetMetadataRecurse): Rework clients.
* src/util/virstoragefile.h (virStorageFileGetMetadataFromBuf):
New prototype.
* src/libvirt_private.syms (virstoragefile.h): Export it.
Signed-off-by: Eric Blake <eblake@redhat.com>
Our backing file chain code was not very robust to an ill-timed
EINTR, which could lead to a short read causing us to randomly
treat metadata differently than usual. But the existing
virFileReadLimFD forces an error if we don't read the entire
file, even though we only care about the header of the file.
So add a new virFile function that does what we want.
* src/util/virfile.h (virFileReadHeaderFD): New prototype.
* src/util/virfile.c (virFileReadHeaderFD): New function.
* src/libvirt_private.syms (virfile.h): Export it.
* src/util/virstoragefile.c (virStorageFileGetMetadataInternal)
(virStorageFileProbeFormatFromFD): Use it.
Signed-off-by: Eric Blake <eblake@redhat.com>
'unsigned char *' makes sense if you are doing math on bytes and
don't want to worry about wraparound from a signed 'char'; but
since all we are doing is memcmp() or virReadBufInt*[LB]E(), which
are both safe on either type of char, and since read() prefers to
operate on 'char *', it's simpler to avoid casts by just typing
things as 'char *' from the get-go. [Technically, read can
operate on an 'unsigned char *' thanks to the C rule that any
pointer can be implicitly converted to 'char *' for legacy K&R
compatibility; but where this patch saves us is if we try to use
virfile.h functions that take 'char **' in order to allocate the
buffer, where the compiler would barf on type mismatch.]
* src/util/virstoragefile.c (FileTypeInfo): Avoid unsigned char.
(cowGetBackingStore, qcow2GetBackingStoreFormat)
(qcowXGetBackingStore, qcow1GetBackingStore)
(qcow2GetBackingStore, vmdk4GetBackingStore, qedGetBackingStore)
(virStorageFileMatchesMagic, virStorageFileMatchesVersion)
(virStorageFileProbeFormatFromBuf, qcow2GetFeatures)
(virStorageFileGetMetadataInternal)
(virStorageFileProbeFormatFromFD): Simplify clients.
Signed-off-by: Eric Blake <eblake@redhat.com>
A qcow2 file with a backing file of 'gluster://host/vol/file' should
not try to look for a directory named './gluster:/' in the file system.
* src/util/virstoragefile.c (virBackingStoreIsFile): Broaden check
to include all protocols.
Signed-off-by: Eric Blake <eblake@redhat.com>
Add a function for efficiently checking if a path is a filesystem
mount point.
NB will not work for bind mounts, only true filesystem mounts.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1018897
If a PCI deivce is not binded to any driver (e.g. there's yet no PCI
driver in the linux kernel) but still users want to passthru the device
we fail the whole operation as we fail to resolve the 'driver' link
under the PCI device sysfs tree. Obviously, this is not a fatal error
and it shouldn't be error at all.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Most of the usage of getuid()/getgid() is in cases where we are
considering what privileges we have. As such the code should be
using the effective IDs, not real IDs.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
We already have stubs for getuid, geteuid, getgid but
not for getegid. Something in gnulib already does a
check for it during configure, so we already have the
HAVE_GETEGID macro defined.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The use of getenv is typically insecure, and we want people
to use our wrappers, to force them to think about setuid
needs.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Unconditional use of getenv is not secure in setuid env.
While not all libvirt code runs in a setuid env (since
much of it only exists inside libvirtd) this is not always
clear to developers. So make all the code paranoid, even
if it only ever runs inside libvirtd.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When running setuid, we must be careful about what env vars
we allow commands to inherit from us. Replace the
virCommandAddEnvPass function with two new ones which do
filtering
virCommandAddEnvPassAllowSUID
virCommandAddEnvPassBlockSUID
And make virCommandAddEnvPassCommon use the appropriate
ones
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
We must not allow file/syslog/journald log outputs when running
setuid since they can be abused to do bad things. In particular
the 'file' output can be used to overwrite files.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Care must be taken accessing env variables when running
setuid. Introduce a virGetEnvAllowSUID for env vars which
are safe to use in a setuid environment, and another
virGetEnvBlockSUID for vars which are not safe. Also add
a virIsSUID helper method for any other non-env var code
to use.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
In fact, the suffix should be _QUIET not _QUIT to stress the
fact, that no OOM error is reported on error.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The recent patch series proposing the addition of PPC little endian
arch support to Linux defines new arch names 'ppcle' and 'ppc64le':
https://lists.ozlabs.org/pipermail/linuxppc-dev/2013-August/109908.html
This just makes libvirt know about these arch names, so it doesn't
immediately trip up if it seems these new names from uname.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Implement the bare minimal sysinfo for AArch64 platforms by
reading the CPU models from /proc/cpuinfo.
Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Adding AArch64(ARMv8 64bit) to the current list of valid architectures.
For now, AArch64 name would imply AArch64 LE mode only. In future,
we might have separate names for AArch64 LE and BE.
Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
The range of valid values for cgroup tunables has
changed in the past and may change again in future
kernels. Avoid hardcoding range checks in libvirt
code, delegating range checking to the kernel itself.
Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>
When EINVAL is returned while changing a cgroups value, tell
user that what values are invalid for the field.
Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>
'const fooPtr' is the same as 'foo * const' (the pointer won't
change, but it's contents can). But in general, if an interface
is trying to be const-correct, it should be using 'const foo *'
(the pointer is to data that can't be changed).
Fix up offenders in src/util outside of the virnet namespace.
Also, make a few virSocketAddr functions const-correct, for easier
conversions in future patches.
* src/util/virbuffer.h (virBufferError, virBufferUse)
(virBufferGetIndent): Use intended type.
* src/util/virmacaddr.h (virMacAddrCmp, virMacAddrCmpRaw)
(virMacAddrSet, virMcAddrFormat, virMacAddrIsUnicast)
(virMacAddrIsMulticast): Likewise.
* src/util/virebtables.h (ebtablesAddForwardAllowIn)
(ebtablesRemoveForwardAllowIn): Likewise.
* src/util/virsocketaddr.h (virSocketAddrSetIPv4Addr): Drop
incorrect const.
(virMacAddrGetRaw, virSocketAddrFormat, virSocketAddrFormatFull):
Make const-correct.
(virSocketAddrMask, virSocketAddrMaskByPrefix)
(virSocketAddrBroadcast, virSocketAddrBroadcastByPrefix)
(virSocketAddrGetNumNetmaskBits, virSocketAddrGetIpPrefix)
(virSocketAddrEqual, virSocketAddrIsPrivate)
(virSocketAddrIsWildcard): Use intended type.
* src/util/virbuffer.c (virBufferError, virBufferUse)
(virBufferGetIndent): Fix fallout.
* src/util/virmacaddr.c (virMacAddrCmp, virMacAddrCmpRaw)
(virMacAddrSet, virMcAddrFormat, virMacAddrIsUnicast)
(virMacAddrIsMulticast): Likewise.
* src/util/virebtables.c (ebtablesAddForwardAllowIn)
(ebtablesRemoveForwardAllowIn): Likewise.
* src/util/virsocketaddr.c (virSocketAddrMask, virMacAddrGetRaw)
(virSocketAddrMaskByPrefix, virSocketAddrBroadcast)
(virSocketAddrBroadcastByPrefix, virSocketAddrGetNumNetmaskBits)
(virSocketAddrGetIpPrefix, virSocketAddrEqual)
(virSocketAddrIsPrivate, virSocketAddrIsWildcard)
(virSocketAddrGetIPv4Addr, virSocketAddrGetIPv6Addr)
(virSocketAddrFormat, virSocketAddrFormatFull): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
'const fooPtr' is the same as 'foo * const' (the pointer won't
change, but it's contents can). But in general, if an interface
is trying to be const-correct, it should be using 'const foo *'
(the pointer is to data that can't be changed).
Fix up virhash to provide a const-correct interface: all actions
that don't modify the table take a const table. Note that in
one case (virHashSearch), we actually strip const away - we aren't
modifying the contents of the table, so much as associated data
for ensuring that the code uses the table correctly (if this were
C++, it would be a case for the 'mutable' keyword).
* src/util/virhash.h (virHashKeyComparator, virHashEqual): Use
intended type.
(virHashSize, virHashTableSize, virHashLookup, virHashSearch):
Make const-correct.
* src/util/virhash.c (virHashEqualData, virHashEqual)
(virHashLookup, virHashSize, virHashTableSize, virHashSearch)
(virHashComputeKey): Fix fallout.
* src/conf/nwfilter_params.c
(virNWFilterFormatParameterNameSorter): Likewise.
* src/nwfilter/nwfilter_ebiptables_driver.c
(ebiptablesFilterOrderSort): Likewise.
* tests/virhashtest.c (testHashGetItemsCompKey)
(testHashGetItemsCompValue): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
In Fedora 20, libvirt_lxc crashes immediately at startup with a
trace
#0 0x00007f0cddb653ec in free () from /lib64/libc.so.6
#1 0x00007f0ce0e16f4a in virFree (ptrptr=ptrptr@entry=0x7f0ce1830058) at util/viralloc.c:580
#2 0x00007f0ce0e2764b in virResetError (err=0x7f0ce1830030) at util/virerror.c:354
#3 0x00007f0ce0e27a5a in virResetLastError () at util/virerror.c:387
#4 0x00007f0ce0e28858 in virEventRegisterDefaultImpl () at util/virevent.c:233
#5 0x00007f0ce0db47c6 in main (argc=11, argv=0x7fff4596c328) at lxc/lxc_controller.c:2352
Normally virInitialize calls virErrorInitialize and
virThreadInitialize, but we don't link to libvirt.so
in libvirt_lxc, and nor did we ever call the error
or thread initializers.
I have absolutely no idea how this has ever worked, let alone
what caused it to stop working in Fedora 20.
In addition not all code paths from virLogSetFromEnv will
ensure virLogInitialize is called correctly, which is another
possible crash scenario.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Previous commit
commit 7ada155cdf
Author: Gao feng <gaofeng@cn.fujitsu.com>
Date: Wed Sep 11 11:15:02 2013 +0800
DBus: introduce virDBusIsServiceEnabled
Made the cgroups code fallback to non-systemd based setup
when dbus is not running. It was too big a hammer though,
as it did not check what error code was received when the
dbus connection failed. Thus it silently ignored serious
errors from dbus such as "too many client connections",
which should always be treated as fatal.
We only want to ignore errors if the dbus unix socket does
not exist, or if nothing is listening on it.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The log message regex has been
[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3}\+[0-9]{4}: [0-9]+: debug|info|warning|error :
The precedence of '|' is high though, so this is equivalent to matching
[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3}\+[0-9]{4}: [0-9]+: debug
Or
info
Or
warning
Or
error :
Which is clearly not what it should have done. This caused the code to
skip over things which are not log messages. The solution is to simply
add brackets.
A test case is also added to validate correctness.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The dbus_bus_get() function returns a shared bus connection that
all libraries in a process can use. You are forbidden from calling
close on this connection though, since you can never know if any
other code might be using it.
Add an option to use private dbus bus connections, if the app
wants to be able to close the connection.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The helper function virCompareLimitUlong compares limit values,
where value of 0 is equal to unlimited. If the latter parameter is 0,
it should return -1 instead of 1, hence the user can only set hard_limit when
swap_hard_limit currently is unlimited.
Worse, all callers pass 2 64-bit values, but on 32-bit platforms,
the second argument was silently truncated to 32 bits, which
could lead to incorrect computations.
Signed-off-by: Bing Bu Cao <mars@linux.vnet.ibm.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
The enum for virNetDevVPort is declared in the header file
virnetdevvportprofile.h, but for some reason the impl is
in netdev_vport_profile_conf.c.
This causes a dep from src/util onto src/conf which is not
allowed. Move the enum impl into virnetdevvportprofile.c
to break the circle.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This function takes exactly one argument: an address to check.
It returns true, if the address is an IPv4 or IPv6 address in numeric
format, false otherwise (e.g. for "examplehost").
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
We currently have other error codes in singular form, e.g.
VIR_ERR_NETWORK_EXIST. Cleanup the previous patch to match the form.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
I created a storage volume(eg: test) from a storage pool(eg:vg10) using
the following command:"virsh vol-create-as --pool vg10 --name test --capacity 300M."
When I re-executed the above command, the output was as the following:
"error: Failed to create vol test
error: Storage volume not found: storage vol 'test' already exists"
I think the output "Storage volume not found" is not appropriate. Because in fact storage
vol test has been found at this time. And then I think virErrorNumber should includes
VIR_ERR_STORAGE_EXIST which can also be used elsewhere. So I make this patch. The result
is as following:
"error: Failed to create vol test
error: storage volume 'test' exists already"
My previous commit 7dc1d4ab was supposed to change safezero to allocate
1 megabyte at maximum, but had the logic reversed and will allocate 1
megabyte at minimum (and a lot more at maximum.)
Signed-off-by: Oskari Saarenmaa <os@ohmu.fi>
mmap can fail on 32-bit systems if we're trying to zero out a lot of data.
Fall back to using block-by-block writing in that case. While we could map
smaller blocks it's unlikely that this code is used a lot and its easier to
just fall back to one of the existing methods.
Also modified the block-by-block zeroing to not allocate a megabyte of
zeroes if we're writing less than that.
Signed-off-by: Oskari Saarenmaa <os@ohmu.fi>
The XML parser reserves 'vnet' as a prefix for automatically
generated NIC device names. Switch the veth device creation
to use this prefix, so it does not have to worry about clashes
with user specified names in the XML.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The veth device creation code run in two steps, first it looks
for two free veth device names, then it runs ip link to create
the veth pair. There is an obvious race between finding free
names and creating them, when guests are started in parallel.
Rewrite the code to loop and re-try creation if it fails, to
deal with the race condition.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The kernel automatically destroys veth devices when cleaning
up the container network namespace. During normal shutdown, it
is thus likely that the attempt to run 'ip link del vethN'
will fail. If it fails, check if the device exists, and avoid
reporting an error if it has gone. This switches to use the
virCommand APIs instead of virRun too.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
So far the virNetDevBandwidthEqual() expected both ->in and ->out items
to be allocated for both @a and @b compared. This is not necessary true
for all our code. For instance, running 'update-device' twice over a NIC
with the very same XML results in SIGSEGV-ing in this function.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This should resolve:
https://bugzilla.redhat.com/show_bug.cgi?id=1012085
libvirt previously recognized NFS, GFS2, OCFS2, and AFS filesystems as
"shared", and thus eligible for exceptions to certain rules/actions
about chowning image files before handing them off to a guest. This
patch widens the definition of "shared filesystem" to include SMB and
CIFS filesystems (aka "Windows file sharing"); both of these use the
same protocol, but different drivers so there are different magic
numbers for each.
The problem is described by [0] but its effect on libvirt is that
starting a container with a full distro running systemd after having
stopped it simply fails.
The container cleanup now calls the machined Terminate function to make
sure that everything is in order for the next run.
[0]: https://bugs.freedesktop.org/show_bug.cgi?id=68370
mmap's offset must be aligned to page size or mapping will fail.
mmap-based safezero is only used if posix_fallocate isn't available.
Signed-off-by: Oskari Saarenmaa <os@ohmu.fi>
Fixed the retrieval of the AdapterId from the AdapterName of the
hostdev source so it does return an error instead of leaving the
adapter_id uninitialized.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
The debug message said there was a timeout of 0 pending for -1 ms which
made me think this is where a hang was coming from but according to the
function comments this case means that there is no timeout pending so
make the debug message say that instead of saying there's a -1 ms
timeout.
Normally a lockspace resource is not freed while there are
active owners. During initial resource creation though, an
OOM error will trigger this scenario. virLockSpaceResourceFree
was not freeing the 'owners' field in this case.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If OOM or another error occurs in virJSONValueFromString the
parser state object will be leaked.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If OOM occurs in virJSONParserHandleStartMap it will free
a variable that is owned by another object. This leads to
a later double-free.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If virDBusMessageIterEncode hits an OOM condition it often
leaks the memory associated with the dbus iterator object
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The code parsing comments in config files called virConfAddEntry
but did not check for failure. This caused the comment string to
leak on OOM.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The functions
- iptablesAddForwardDontMasquerade(),
- iptablesRemoveForwardDontMasquerade
handle exceptions in the masquerading implemented in the POSTROUTING chain
of the "nat" table. Such exceptions should be added as chronologically
latest, logically top-most rules.
The bridge driver will call these functions beginning with the next patch:
some special destination IP addresses always refer to the local
subnetwork, even though they don't match any practical subnetwork's
netmask. Packets from virbrN targeting such IP addresses are never routed
outwards, but the current rules treat them as non-virbrN-destined packets
and masquerade them. This causes problems for some receivers on virbrN.
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
The virCommandAddEnvPassCommon method ignored the failure to
pre-allocate the env variable array with VIR_RESIZE_N. While
this is harmless, it confuses the test harness which is trying
to validate OOM handling of every individual allocation call.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>