Like commit 94a26c7e from Eric Blake, the old fuzzy code should
be replaced by the new array management macros now.
And the type of scsi->count should be changed into "size_t", and
thus virSCSIDeviceListCount should return size_t instead, similar
for vir{PCI,USB}DeviceListCount.
When the host is configured with very restrictive firewall (default policy
is DROP for all chains, including OUTPUT), the bridge driver for Linux
adds netfilter entries to allow DHCP and DNS requests to go from the VM
to the dnsmasq of the host.
The issue that this commit fixes is the fact that a DROP policy on the OUTPUT
chain blocks the DHCP replies from the host’s dnsmasq to the VM.
As DHCP replies are sent in UDP, they are not caught by any --ctstate ESTABLISHED
rule and so, need to be explicitly allowed.
Signed-off-by: Lénaïc Huard <lenaic@lhuard.fr.eu.org>
When determining if a device is behind a PCI bridge, the PCI device
class is checked by reading the config space. However, there are some
devices which have the wrong class on the config space, but the class is
initialized by Linux correctly as a PCI BRIDGE. This class can be read
by the sysfs file '/sys/bus/pci/devices/xxxx:xx:xx.x/class'.
One example of such bridge is IBM PCI Bridge 1014:03b9, which is
identified as a Host Bridge when reading the config space.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Some of our operation denied messages are outright stupid; for
example, if virIdentitySetAttr fails:
error: operation Identity attribute is already set forbidden for read only access
This patch fixes things to a saner:
error: operation forbidden: Identity attribute is already set
It also consolidates the most common usage pattern for operation
denied errors: read-only connections preventing a public API. In
this case, 'virsh -r -c test:///default destroy test' changes from:
error: operation virDomainDestroy forbidden for read only access
to:
error: operation forbidden: read only access prevents virDomainDestroy
Note that we were previously inconsistent on which APIs used
VIR_FROM_DOM (such as virDomainDestroy) vs. VIR_FROM_NONE (such as
virDomainPMSuspendForDuration). After this patch, all uses
consistently use VIR_FROM_NONE, on the grounds that it is unlikely
that a caller learning that a call is denied can do anything in
particular with extra knowledge which error domain the call belongs
to (similar to what we did in commit baa7244).
* src/util/virerror.c (virErrorMsg): Rework OPERATION_DENIED error
message.
* src/internal.h (virCheckReadOnlyGoto): New macro.
* src/util/virerror.h (virReportRestrictedError): New macro.
* src/libvirt-lxc.c: Use new macros.
* src/libvirt-qemu.c: Likewise.
* src/libvirt.c: Likewise.
* src/locking/lock_daemon.c (virLockDaemonClientNew): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
We weren't very consistent in our use of VIR_ERR_NO_SUPPORT; many
users just passed __FUNCTION__ on, while others passed "%s" to
silence over-eager compilers that warn about __FUNCTION__ not
containing any %. It's nicer to route all these uses through
a single macro, so that if we ever need to change the reporting,
we can do it in one place.
I verified that 'virsh -c test:///default qemu-monitor-command test foo'
gives the same error message before and after this patch:
error: this function is not supported by the connection driver: virDomainQemuMonitorCommand
Note that in libvirt.c, we were inconsistent on whether virDomain*
API used virLibConnError() (with VIR_FROM_NONE) or virLibDomainError()
(with VIR_FROM_DOMAIN); this patch unifies these errors to all use
VIR_FROM_NONE, on the grounds that it is unlikely that a caller
learning that a call is unimplemented can do anything in particular
with extra knowledge of which error domain it belongs to.
One particular change to note is virDomainOpenGraphics which was
trying to fail with VIR_ERR_NO_SUPPORT after a failed
VIR_DRV_SUPPORTS_FEATURE check; all other places that fail a
feature check report VIR_ERR_ARGUMENT_UNSUPPORTED.
* src/util/virerror.h (virReportUnsupportedError): New macro.
* src/libvirt-qemu.c: Use new macro.
* src/libvirt-lxc.c: Likewise.
* src/lxc/lxc_driver.c: Likewise.
* src/security/security_manager.c: Likewise.
* src/util/virinitctl.c: Likewise.
* src/libvirt.c: Likewise.
(virDomainOpenGraphics): Use correct error for unsupported feature.
Signed-off-by: Eric Blake <eblake@redhat.com>
Having one API call into another is generally not good; among
other issues, it gives confusing logs, and is not quite as
efficient.
This fixes several instances, but not all: we still have instances
in both libvirt.c and in backend hypervisors (lxc and qemu) calling
the public virTypedParamsGetString and friends, which dispatch
errors immediately. I'm not sure if it is worth trying to clean
that up in a separate patch (such a cleanup may be easiest by
separating the public function into a wrapper around the internal,
then tweaking internal.h so that internal users directly use the
internal function).
* src/libvirt.c (virDomainGetUUIDString, virNetworkGetUUIDString)
(virStoragePoolGetUUIDString, virSecretGetUUIDString)
(virNWFilterGetUUIDString): Avoid nested public API call.
* src/util/virtypedparam.c (virTypedParamsReplaceString): Don't
dispatch errors here.
(virTypedParamsGet): No need to reset errors.
(virTypedParamsGetBoolean): Use consistent ordering.
Signed-off-by: Eric Blake <eblake@redhat.com>
I noticed that the virDomainQemuMonitorCommand debug output wasn't
telling me the name of the domain it was working on. While it was
easy enough to determine which pointer matches the domain based on
other log messages, it is nicer to be consistent.
* src/util/viruuid.h (VIR_UUID_DEBUG): Moved here from...
* src/libvirt.c (VIR_UUID_DEBUG): ...here.
(VIR_ARG15, VIR_HAS_COMMA, VIR_DOMAIN_DEBUG_EXPAND)
(VIR_DOMAIN_DEBUG_PASTE, VIR_DOMAIN_DEBUG_0, VIR_DOMAIN_DEBUG_1)
(VIR_DOMAIN_DEBUG_2, VIR_DOMAIN_DEBUG): Move...
* src/datatypes.h: ...here.
* src/libvirt-qemu.c (virDomainQemuMonitorCommand)
(virDomainQemuAgentCommand): Better debug messages.
* src/libvirt-lxc.c (virDomainLxcOpenNamespace): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Since libvirt 0.9.3, the entire virevent.c file has been a public
API, so improve the documentation in this file. Also, fix a
potential core dump - it could only be triggered by bogus use of
the API and would only affect the caller (not libvirtd), but we
might as well be nice.
* src/libvirt.c (virConnectSetKeepAlive)
(virConnectDomainEventRegister, virConnectDomainEventRegisterAny)
(virConnectNetworkEventRegisterAny): Document event loop requirement.
* src/util/virevent.c (virEventAddHandle, virEventRemoveHandle)
(virEventAddTimeout, virEventRemoveTimeout): Likewise.
(virEventUpdateHandle, virEventUpdateTimeout): Likewise, and avoid
core dump if caller didn't register handler.
(virEventRunDefaultImpl): Expand example, and set up code block in
html docs.
(virEventRegisterImpl, virEventRegisterDefaultImpl): Document more
on the use of the event loop.
Signed-off-by: Eric Blake <eblake@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1044806
Currently, sending the ANSI_A keycode from os_x codepage doesn't work as
it has a special value of 0x0. Our internal code handles that no
different to other not defined keycodes. Hence, in order to allow it we
must change all the undefined keycodes from 0 to -1 and adapt some code
too.
# virsh send-key guestname --codeset os_x ANSI_A
error: invalid keycode: 'ANSI_A'
# virsh send-key guestname --codeset os_x ANSI_B
# virsh send-key guestname --codeset os_x ANSI_C
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Currently the virDBusAddWatch does
virEventAddHandle(fd, flags,
virDBusWatchCallback,
watch, NULL);
dbus_watch_set_data(watch, info, virDBusWatchFree);
Unfortunately this is racy - since the event loop is in a
different thread, the virDBusWatchCallback method may be
run before we get to calling dbus_watch_set_data. We must
reverse the order of these calls
See https://bugzilla.redhat.com/show_bug.cgi?id=885445
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Recent changes to events (commit 8a29ffcf) resulted in new compile
failures on some targets (such as ARM OMAP5):
conf/domain_event.c: In function 'virDomainEventDispatchDefaultFunc':
conf/domain_event.c:1198:30: error: cast increases required alignment of
target type [-Werror=cast-align]
conf/domain_event.c:1314:34: error: cast increases required alignment of
target type [-Werror=cast-align]
cc1: all warnings being treated as errors
The error is due to alignment; the base class is merely aligned
to the worst of 'int' and 'void*', while the child class must
be aligned to a 'long long'. The solution is to include a
'long long' (and for good measure, a function pointer) in the
base class to ensure correct alignment regardless of what a
child class may add, but to wrap the inclusion in a union so
as to not incur any wasted space. On a typical x86_64 platform,
the base class remains 16 bytes; on i686, the base class remains
12 bytes; and on the impacted ARM platform, the base class grows
from 12 bytes to 16 bytes due to the increase of alignment from
4 to 8 bytes.
Reported by Michele Paolino and others.
* src/util/virobject.h (_virObject): Use a union to ensure that
subclasses never have stricter alignment than the parent.
* src/util/virobject.c (virObjectNew, virObjectUnref)
(virObjectRef): Adjust clients.
* src/libvirt.c (virConnectRef, virDomainRef, virNetworkRef)
(virInterfaceRef, virStoragePoolRef, virStorageVolRef)
(virNodeDeviceRef, virSecretRef, virStreamRef, virNWFilterRef)
(virDomainSnapshotRef): Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorOpenInternal)
(qemuMonitorClose): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Since kernel 3.12 (commit 34ff8dc08956098563989d8599840b130be81252 in
linux-stable.git in particular) the value for 'unlimited' in cgroup
memory limits changed from LLONG_MAX to ULLONG_MAX. Due to rather
unfortunate choice of our VIR_DOMAIN_MEMORY_PARAM_UNLIMITED constant
(which we transfer as an unsigned long long in Kibibytes), we ended up
with the situation described below (applies to x86_64):
- 2^64-1 (ULLONG_MAX) -- "unlimited" in kernel = 3.12
- 2^63-1 (LLONG_MAX) -- "unlimited" in kernel < 3.12
- 2^63-1024 -- our PARAM_UNLIMITED scaled to Bytes
- 2^53-1 -- our PARAM_UNLIMITED unscaled (in Kibibytes)
This means that when any number within (2^63-1, 2^64-1] is read from
memory cgroup, we are transferring that number instead of "unlimited".
Unfortunately, changing VIR_DOMAIN_MEMORY_PARAM_UNLIMITED would break
ABI compatibility and thus we have to resort to a different solution.
With this patch every value greater than PARAM_UNLIMITED means
"unlimited". Even though this may seem misleading, we are already in
such unclear situation when running 3.12 kernel with memory limits set
to 2^63.
One example showing most of the problems at once (with kernel 3.12.2):
# virsh memtune asdf --hard-limit 9007199254740991 --swap-hard-limit -1
# echo 12345678901234567890 >\
/sys/fs/cgroup/memory/machine/asdf.libvirt-qemu/memory.soft_limit_in_bytes
# virsh memtune asdf
hard_limit : 18014398509481983
soft_limit : 12056327051986884
swap_hard_limit: 18014398509481983
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
In 78839da I am trying to join the worker threads. However, I can't
sipmly reuse pool->nWorkers (same applies for pool->nPrioWorkers),
because of the following flow that is currently implemented:
1) the main thread executing virThreadPoolFree sets pool->quit = true,
wakes up all the workers and wait on pool->quit_cond.
2) A worker is woken up and see quit request. It immediately jumps of
the while() loop and decrements pool->nWorkers (or pool->nPrioWorkers in
case of priority worker). The last thread signalizes pool->quit_cond.
3) Main thread is woken up, with both pool->nWorkers and
pool->nPrioWorkers being zero.
So there's a need to copy the original value of worker thread counts
into local variables. However, these need to set *after* the check for
pool being NULL (dereferencing a NULL is no no). And for safety they can
be set right after the pool is locked.
Reported-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Even though currently we are freeing the pool of worker threads at the
daemon very end, nothing holds us back in joining the worker threads.
Moreover, we avoid leaks like this:
==26697== 1,680 bytes in 5 blocks are possibly lost in loss record 913 of 942
==26697== at 0x4C2BDE4: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==26697== by 0x4011131: allocate_dtv (in /lib64/ld-2.16.so)
==26697== by 0x401176D: _dl_allocate_tls (in /lib64/ld-2.16.so)
==26697== by 0x8499602: pthread_create@@GLIBC_2.2.5 (in /lib64/libpthread-2.16.so)
==26697== by 0x52F53E9: virThreadCreate (virthreadpthread.c:188)
==26697== by 0x52F5D4F: virThreadPoolNew (virthreadpool.c:221)
==26697== by 0x53F30DB: virNetServerNew (virnetserver.c:377)
==26697== by 0x11C6ED: main (libvirtd.c:1366)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The code for extracting sub-mounts would just do a STRPREFIX
check on the mount. This was flawed because if there were
the following mounts
/etc/aliases
/etc/aliases.db
and '/etc/aliases' was asked for, it would return both even
though the latter isn't a sub-mount.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Move the code for lxcContainerGetSubtree into the virfile
module creating 2 new functions
int virFileGetMountSubtree(const char *mtabpath,
const char *prefix,
char ***mountsret,
size_t *nmountsret);
int virFileGetMountReverseSubtree(const char *mtabpath,
const char *prefix,
char ***mountsret,
size_t *nmountsret);
Add a new virfiletest.c test case to validate the new code.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Most of our code base uses space after comma but not before;
fix the remaining uses before adding a syntax check.
* src/util/vircommand.c: Consistently use commas.
* src/util/virlog.c: Likewise.
* src/util/virnetdevbandwidth.c: Likewise.
* src/util/virnetdevmacvlan.c: Likewise.
* src/util/virnetdevvportprofile.c: Likewise.
* src/util/virnetlink.c: Likewise.
* src/util/virpci.c: Likewise.
* src/util/virsysinfo.c: Likewise.
* src/util/virusb.c: Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Report the error in virPortAllocatorAcquire instead
of doing it in every caller.
The error contains the port range name instead of the intended
use for the port, e.g.:
Unable to find an unused port in range 'display' (65534-65535)
instead of:
Unable to find an unused port for SPICE
This also adds error reporting when the QEMU driver could not
find an unused port for VNC, VNC WebSockets or NBD migration.
These two chunks had to be part of df4283a55b. But for some unclear
reason, the weren't. Anyway, these two variables are not used anywhere
within function. They're initialized to NULL and then VIR_FREE()-d. And
there's no reason do do two NOPs, right?
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=1025397
When virPCIGetVirtualFunctions created the list of an SRIOV Physical
Function's (PF) Virtual Functions (VF), it had assumed that the order
of "virtfn*" links returned by readdir() from the PF's sysfs directory
was already in the correct order. Experience has shown that this is
not always the case - it can be in alphabetical order (which would
e.g. place virtfn11 before virtfn2) or even some seemingly random
order (see the example in the bugzilla report)
This results in 1) incorrect assumptions made by consumers of the
output of the virt_functions list of virsh nodedev-dumpxml, and 2)
setting MAC address and vlan tag on the wrong VF (since libvirt uses
netlink to set mac address and vlan tag, netlink requires the VF#, and
the function virPCIGetVirtualFunctionIndex() returns the wrong index
due to the improperly ordered VF list).
The solution provided by this patch is for virPCIGetVirtualFunctions
to no longer scan the entire device directory in its natural order,
but instead to check for links individually by name "virtfn%d" where
%d starts at 0 and increases with each success. Since VFs are created
contiguously by the kernel, this will guarantee that all VFs are
found, and placed in the arry in the correct order.
One note of use to the uninitiated is that VIR_APPEND_ELEMENT always
either increments *num_virtual_functions or fails, so no this isn't an
endless loop.
(NB: the SRIOV_* defines at the top of virpci.c were removed
because they are unnecessary and/or not used.)
This gets rid of another stat() per volume, as well as cutting
bytes read in half, when populating the volumes of a directory
pool during a pool refresh. Not to mention that it provides an
interface that can let gluster pools also probe file types.
* src/util/virstoragefile.h (virStorageFileProbeFormatFromFD):
Delete.
(virStorageFileProbeFormatFromBuf): New prototype.
(VIR_STORAGE_MAX_HEADER): New constant, based on...
* src/util/virstoragefile.c (STORAGE_MAX_HEAD): ...old name.
(vmdk4GetBackingStore, virStorageFileGetMetadataInternal)
(virStorageFileProbeFormat): Adjust clients.
(virStorageFileProbeFormatFromFD): Delete.
(virStorageFileProbeFormatFromBuf): Export.
* src/storage/storage_backend_fs.c (virStorageBackendProbeTarget):
Adjust client.
* src/libvirt_private.syms (virstoragefile.h): Adjust exports.
Signed-off-by: Eric Blake <eblake@redhat.com>
Future patches will want to learn metadata about a file using
a buffer that was already parsed in order to probe the file's
format. Rather than reopening and re-reading the file, it makes
sense to separate getting file contents from actually parsing
those contents.
* src/util/virstoragefile.c (virStorageFileGetMetadataFromBuf)
(virStorageFileGetMetadataFromFDInternal): New functions.
(virStorageFileGetMetadataInternal): Hoist fstat() and read() into
callers.
(virStorageFileGetMetadataFromFD)
(virStorageFileGetMetadataRecurse): Rework clients.
* src/util/virstoragefile.h (virStorageFileGetMetadataFromBuf):
New prototype.
* src/libvirt_private.syms (virstoragefile.h): Export it.
Signed-off-by: Eric Blake <eblake@redhat.com>
Our backing file chain code was not very robust to an ill-timed
EINTR, which could lead to a short read causing us to randomly
treat metadata differently than usual. But the existing
virFileReadLimFD forces an error if we don't read the entire
file, even though we only care about the header of the file.
So add a new virFile function that does what we want.
* src/util/virfile.h (virFileReadHeaderFD): New prototype.
* src/util/virfile.c (virFileReadHeaderFD): New function.
* src/libvirt_private.syms (virfile.h): Export it.
* src/util/virstoragefile.c (virStorageFileGetMetadataInternal)
(virStorageFileProbeFormatFromFD): Use it.
Signed-off-by: Eric Blake <eblake@redhat.com>
'unsigned char *' makes sense if you are doing math on bytes and
don't want to worry about wraparound from a signed 'char'; but
since all we are doing is memcmp() or virReadBufInt*[LB]E(), which
are both safe on either type of char, and since read() prefers to
operate on 'char *', it's simpler to avoid casts by just typing
things as 'char *' from the get-go. [Technically, read can
operate on an 'unsigned char *' thanks to the C rule that any
pointer can be implicitly converted to 'char *' for legacy K&R
compatibility; but where this patch saves us is if we try to use
virfile.h functions that take 'char **' in order to allocate the
buffer, where the compiler would barf on type mismatch.]
* src/util/virstoragefile.c (FileTypeInfo): Avoid unsigned char.
(cowGetBackingStore, qcow2GetBackingStoreFormat)
(qcowXGetBackingStore, qcow1GetBackingStore)
(qcow2GetBackingStore, vmdk4GetBackingStore, qedGetBackingStore)
(virStorageFileMatchesMagic, virStorageFileMatchesVersion)
(virStorageFileProbeFormatFromBuf, qcow2GetFeatures)
(virStorageFileGetMetadataInternal)
(virStorageFileProbeFormatFromFD): Simplify clients.
Signed-off-by: Eric Blake <eblake@redhat.com>
A qcow2 file with a backing file of 'gluster://host/vol/file' should
not try to look for a directory named './gluster:/' in the file system.
* src/util/virstoragefile.c (virBackingStoreIsFile): Broaden check
to include all protocols.
Signed-off-by: Eric Blake <eblake@redhat.com>
Add a function for efficiently checking if a path is a filesystem
mount point.
NB will not work for bind mounts, only true filesystem mounts.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1018897
If a PCI deivce is not binded to any driver (e.g. there's yet no PCI
driver in the linux kernel) but still users want to passthru the device
we fail the whole operation as we fail to resolve the 'driver' link
under the PCI device sysfs tree. Obviously, this is not a fatal error
and it shouldn't be error at all.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Most of the usage of getuid()/getgid() is in cases where we are
considering what privileges we have. As such the code should be
using the effective IDs, not real IDs.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
We already have stubs for getuid, geteuid, getgid but
not for getegid. Something in gnulib already does a
check for it during configure, so we already have the
HAVE_GETEGID macro defined.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The use of getenv is typically insecure, and we want people
to use our wrappers, to force them to think about setuid
needs.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Unconditional use of getenv is not secure in setuid env.
While not all libvirt code runs in a setuid env (since
much of it only exists inside libvirtd) this is not always
clear to developers. So make all the code paranoid, even
if it only ever runs inside libvirtd.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When running setuid, we must be careful about what env vars
we allow commands to inherit from us. Replace the
virCommandAddEnvPass function with two new ones which do
filtering
virCommandAddEnvPassAllowSUID
virCommandAddEnvPassBlockSUID
And make virCommandAddEnvPassCommon use the appropriate
ones
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
We must not allow file/syslog/journald log outputs when running
setuid since they can be abused to do bad things. In particular
the 'file' output can be used to overwrite files.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Care must be taken accessing env variables when running
setuid. Introduce a virGetEnvAllowSUID for env vars which
are safe to use in a setuid environment, and another
virGetEnvBlockSUID for vars which are not safe. Also add
a virIsSUID helper method for any other non-env var code
to use.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
In fact, the suffix should be _QUIET not _QUIT to stress the
fact, that no OOM error is reported on error.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The recent patch series proposing the addition of PPC little endian
arch support to Linux defines new arch names 'ppcle' and 'ppc64le':
https://lists.ozlabs.org/pipermail/linuxppc-dev/2013-August/109908.html
This just makes libvirt know about these arch names, so it doesn't
immediately trip up if it seems these new names from uname.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>