CVE-2013-6458
Generally, every API that is going to begin a job should do that before
fetching data from vm->def. However, qemuDomainGetBlockInfo does not
know whether it will have to start a job or not before checking vm->def.
To avoid using disk alias that might have been freed while we were
waiting for a job, we use its copy. In case the disk was removed in the
meantime, we will fail with "cannot find statistics for device '...'"
error message.
(cherry picked from commit b799259583)
CVE-2013-6458
https://bugzilla.redhat.com/show_bug.cgi?id=1043069
When virDomainDetachDeviceFlags is called concurrently to
virDomainBlockStats: libvirtd may crash because qemuDomainBlockStats
finds a disk in vm->def before getting a job on a domain and uses the
disk pointer after getting the job. However, the domain in unlocked
while waiting on a job condition and thus data behind the disk pointer
may disappear. This happens when thread 1 runs
virDomainDetachDeviceFlags and enters monitor to actually remove the
disk. Then another thread starts running virDomainBlockStats, finds the
disk in vm->def, and while it's waiting on the job condition (owned by
the first thread), the first thread finishes the disk removal. When the
second thread gets the job, the memory pointed to be the disk pointer is
already gone.
That said, every API that is going to begin a job should do that before
fetching data from vm->def.
(cherry picked from commit db86da5ca2)
by, in libxlDomainGetNumaParameters(), calling libxl_bitmap_init() as soon as
possible, which avoids getting to 'cleanup:', where libxl_bitmap_dispose()
happens, without having initialized the nodemap, and hence crashing after some
invalid free()-s:
# ./daemon/libvirtd -v
*** Error in `/home/xen/libvirt.git/daemon/.libs/lt-libvirtd': munmap_chunk(): invalid pointer: 0x00007fdd42592666 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x7bbe7)[0x7fdd3f767be7]
/lib64/libxenlight.so.4.3(libxl_bitmap_dispose+0xd)[0x7fdd2c88c045]
/home/xen/libvirt.git/daemon/.libs/../../src/.libs/libvirt_driver_libxl.so(+0x12d26)[0x7fdd2caccd26]
/home/xen/libvirt.git/src/.libs/libvirt.so.0(virDomainGetNumaParameters+0x15c)[0x7fdd4247898c]
/home/xen/libvirt.git/daemon/.libs/lt-libvirtd(+0x1d9a2)[0x7fdd42ecc9a2]
/home/xen/libvirt.git/src/.libs/libvirt.so.0(virNetServerProgramDispatch+0x3da)[0x7fdd424e9eaa]
/home/xen/libvirt.git/src/.libs/libvirt.so.0(+0x1a6f38)[0x7fdd424e3f38]
/home/xen/libvirt.git/src/.libs/libvirt.so.0(+0xa81e5)[0x7fdd423e51e5]
/home/xen/libvirt.git/src/.libs/libvirt.so.0(+0xa783e)[0x7fdd423e483e]
/lib64/libpthread.so.0(+0x7c53)[0x7fdd3febbc53]
/lib64/libc.so.6(clone+0x6d)[0x7fdd3f7e1dbd]
Signed-off-by: Dario Faggili <dario.faggioli@citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
Cc: Ian Jackson <Ian.Jackson@eu.citrix.com>
(cherry picked from commit f9ee91d355)
Conflicts:
src/libxl/libxl_driver.c
The function doesn't check whether the request is made for active or
inactive domain. Thus when the domain is not running it still tries
accessing non-existing cgroups (priv->cgroup, which is NULL).
I re-made the function in order for it to work the same way it's qemu
counterpart does.
Reproducer:
1) Define an LXC domain
2) Do 'virsh memtune <domain> --hard-limit 133T'
Backtrace:
Thread 6 (Thread 0x7fffec8c0700 (LWP 26826)):
#0 0x00007ffff70edcc4 in virCgroupPathOfController (group=0x0, controller=3,
key=0x7ffff75734bd "memory.limit_in_bytes", path=0x7fffec8bf718) at util/vircgroup.c:1764
#1 0x00007ffff70e9206 in virCgroupSetValueStr (group=0x0, controller=3,
key=0x7ffff75734bd "memory.limit_in_bytes", value=0x7fffe409f360 "1073741824")
at util/vircgroup.c:669
#2 0x00007ffff70e98b4 in virCgroupSetValueU64 (group=0x0, controller=3,
key=0x7ffff75734bd "memory.limit_in_bytes", value=1073741824) at util/vircgroup.c:740
#3 0x00007ffff70ee518 in virCgroupSetMemory (group=0x0, kb=1048576) at util/vircgroup.c:1904
#4 0x00007ffff70ee675 in virCgroupSetMemoryHardLimit (group=0x0, kb=1048576)
at util/vircgroup.c:1944
#5 0x00005555557d54c8 in lxcDomainSetMemoryParameters (dom=0x7fffe40cc420,
params=0x7fffe409f100, nparams=1, flags=0) at lxc/lxc_driver.c:774
#6 0x00007ffff72c20f9 in virDomainSetMemoryParameters (domain=0x7fffe40cc420,
params=0x7fffe409f100, nparams=1, flags=0) at libvirt.c:4051
#7 0x000055555561365f in remoteDispatchDomainSetMemoryParameters (server=0x555555eb7e00,
client=0x555555ec4b10, msg=0x555555eb94e0, rerr=0x7fffec8bfb70, args=0x7fffe40b8510)
at remote_dispatch.h:7621
#8 0x00005555556133fd in remoteDispatchDomainSetMemoryParametersHelper (server=0x555555eb7e00,
client=0x555555ec4b10, msg=0x555555eb94e0, rerr=0x7fffec8bfb70, args=0x7fffe40b8510,
ret=0x7fffe40b84f0) at remote_dispatch.h:7591
#9 0x00007ffff73b293f in virNetServerProgramDispatchCall (prog=0x555555ec3ae0,
server=0x555555eb7e00, client=0x555555ec4b10, msg=0x555555eb94e0)
at rpc/virnetserverprogram.c:435
#10 0x00007ffff73b207f in virNetServerProgramDispatch (prog=0x555555ec3ae0,
server=0x555555eb7e00, client=0x555555ec4b10, msg=0x555555eb94e0)
at rpc/virnetserverprogram.c:305
#11 0x00007ffff73a4d2c in virNetServerProcessMsg (srv=0x555555eb7e00, client=0x555555ec4b10,
prog=0x555555ec3ae0, msg=0x555555eb94e0) at rpc/virnetserver.c:165
#12 0x00007ffff73a4e8d in virNetServerHandleJob (jobOpaque=0x555555ec3e30, opaque=0x555555eb7e00)
at rpc/virnetserver.c:186
#13 0x00007ffff7187f3f in virThreadPoolWorker (opaque=0x555555eb7ac0) at util/virthreadpool.c:144
#14 0x00007ffff718733a in virThreadHelper (data=0x555555eb7890) at util/virthreadpthread.c:161
#15 0x00007ffff468ed89 in start_thread (arg=0x7fffec8c0700) at pthread_create.c:308
#16 0x00007ffff3da26bd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
(cherry picked from commit 9faf3f2950)
The function doesn't check whether the request is made for active or
inactive domain. Thus when the domain is not running it still tries
accessing non-existing cgroups (priv->cgroup, which is NULL).
I re-made the function in order for it to work the same way it's qemu
counterpart does.
Reproducer:
1) Define an LXC domain
2) Do 'virsh memtune <domain>'
Backtrace:
Thread 6 (Thread 0x7fffec8c0700 (LWP 13387)):
#0 0x00007ffff70edcc4 in virCgroupPathOfController (group=0x0, controller=3,
key=0x7ffff75734bd "memory.limit_in_bytes", path=0x7fffec8bf750) at util/vircgroup.c:1764
#1 0x00007ffff70e958c in virCgroupGetValueStr (group=0x0, controller=3,
key=0x7ffff75734bd "memory.limit_in_bytes", value=0x7fffec8bf7c0) at util/vircgroup.c:705
#2 0x00007ffff70e9d29 in virCgroupGetValueU64 (group=0x0, controller=3,
key=0x7ffff75734bd "memory.limit_in_bytes", value=0x7fffec8bf810) at util/vircgroup.c:804
#3 0x00007ffff70ee706 in virCgroupGetMemoryHardLimit (group=0x0, kb=0x7fffec8bf8a8)
at util/vircgroup.c:1962
#4 0x00005555557d590f in lxcDomainGetMemoryParameters (dom=0x7fffd40024a0,
params=0x7fffd40027a0, nparams=0x7fffec8bfa24, flags=0) at lxc/lxc_driver.c:826
#5 0x00007ffff72c28d3 in virDomainGetMemoryParameters (domain=0x7fffd40024a0,
params=0x7fffd40027a0, nparams=0x7fffec8bfa24, flags=0) at libvirt.c:4137
#6 0x000055555563714d in remoteDispatchDomainGetMemoryParameters (server=0x555555eb7e00,
client=0x555555ebaef0, msg=0x555555ebb3e0, rerr=0x7fffec8bfb70, args=0x7fffd40024e0,
ret=0x7fffd4002420) at remote.c:1895
#7 0x00005555556052c4 in remoteDispatchDomainGetMemoryParametersHelper (server=0x555555eb7e00,
client=0x555555ebaef0, msg=0x555555ebb3e0, rerr=0x7fffec8bfb70, args=0x7fffd40024e0,
ret=0x7fffd4002420) at remote_dispatch.h:4050
#8 0x00007ffff73b293f in virNetServerProgramDispatchCall (prog=0x555555ec3ae0,
server=0x555555eb7e00, client=0x555555ebaef0, msg=0x555555ebb3e0)
at rpc/virnetserverprogram.c:435
#9 0x00007ffff73b207f in virNetServerProgramDispatch (prog=0x555555ec3ae0,
server=0x555555eb7e00, client=0x555555ebaef0, msg=0x555555ebb3e0)
at rpc/virnetserverprogram.c:305
#10 0x00007ffff73a4d2c in virNetServerProcessMsg (srv=0x555555eb7e00, client=0x555555ebaef0,
prog=0x555555ec3ae0, msg=0x555555ebb3e0) at rpc/virnetserver.c:165
#11 0x00007ffff73a4e8d in virNetServerHandleJob (jobOpaque=0x555555ebc7e0, opaque=0x555555eb7e00)
at rpc/virnetserver.c:186
#12 0x00007ffff7187f3f in virThreadPoolWorker (opaque=0x555555eb7ac0) at util/virthreadpool.c:144
#13 0x00007ffff718733a in virThreadHelper (data=0x555555eb7890) at util/virthreadpthread.c:161
#14 0x00007ffff468ed89 in start_thread (arg=0x7fffec8c0700) at pthread_create.c:308
#15 0x00007ffff3da26bd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
(cherry picked from commit f8c1cb9021)
Since 86d90b3a (yes, my patch; again) we are supporting NBD storage
migration. However, on error recovery path we got the steps reversed.
The correct order is: return NBD port to the virPortAllocator and then
either unlock the vm or remove it from the driver. Not vice versa.
==11192== Invalid write of size 4
==11192== at 0x11488559: qemuMigrationPrepareAny (qemu_migration.c:2459)
==11192== by 0x11488EA6: qemuMigrationPrepareDirect (qemu_migration.c:2652)
==11192== by 0x114D1509: qemuDomainMigratePrepare3Params (qemu_driver.c:10332)
==11192== by 0x519075D: virDomainMigratePrepare3Params (libvirt.c:7290)
==11192== by 0x1502DA: remoteDispatchDomainMigratePrepare3Params (remote.c:4798)
==11192== by 0x12DECA: remoteDispatchDomainMigratePrepare3ParamsHelper (remote_dispatch.h:5741)
==11192== by 0x5212127: virNetServerProgramDispatchCall (virnetserverprogram.c:435)
==11192== by 0x5211C86: virNetServerProgramDispatch (virnetserverprogram.c:305)
==11192== by 0x520A8FD: virNetServerProcessMsg (virnetserver.c:165)
==11192== by 0x520A9E1: virNetServerHandleJob (virnetserver.c:186)
==11192== by 0x50DA78F: virThreadPoolWorker (virthreadpool.c:144)
==11192== by 0x50DA11C: virThreadHelper (virthreadpthread.c:161)
==11192== Address 0x1368baa0 is 576 bytes inside a block of size 688 free'd
==11192== at 0x4A07F5C: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==11192== by 0x5079A2F: virFree (viralloc.c:580)
==11192== by 0x11456C34: qemuDomainObjPrivateFree (qemu_domain.c:267)
==11192== by 0x50F41B4: virDomainObjDispose (domain_conf.c:2034)
==11192== by 0x50C2991: virObjectUnref (virobject.c:262)
==11192== by 0x50F4CFC: virDomainObjListRemove (domain_conf.c:2361)
==11192== by 0x1145C125: qemuDomainRemoveInactive (qemu_domain.c:2087)
==11192== by 0x11488520: qemuMigrationPrepareAny (qemu_migration.c:2456)
==11192== by 0x11488EA6: qemuMigrationPrepareDirect (qemu_migration.c:2652)
==11192== by 0x114D1509: qemuDomainMigratePrepare3Params (qemu_driver.c:10332)
==11192== by 0x519075D: virDomainMigratePrepare3Params (libvirt.c:7290)
==11192== by 0x1502DA: remoteDispatchDomainMigratePrepare3Params (remote.c:4798)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
(cherry picked from commit 1f2f879ed1)
When opening a new connection to the driver, nwfilterOpen
only succeeds if the driverState has been allocated.
Move the privilege check in driver initialization before
the state allocation to disable the driver.
This changes the nwfilter-define error from:
error: cannot create config directory (null): Bad address
To:
this function is not supported by the connection driver:
virNWFilterDefineXML
https://bugzilla.redhat.com/show_bug.cgi?id=1029266
(cherry picked from commit b7829f959b)
Avoid people introducing security flaws in their apps by
forbidding the use of libvirt.so in setuid programs, with
a check in virInitialize.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 9cd6a57db6)
Most of the usage of getuid()/getgid() is in cases where we are
considering what privileges we have. As such the code should be
using the effective IDs, not real IDs.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 9b0af09240)
We already have stubs for getuid, geteuid, getgid but
not for getegid. Something in gnulib already does a
check for it during configure, so we already have the
HAVE_GETEGID macro defined.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit c566fa1ad0)
We don't want setuid programs automatically spawning libvirtd,
so disable any use of autostart when setuid.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 171bb12911)
We don't know enough about quality of external libraries used
for non-UNIX transports, nor do we want to spawn external
commands when setuid. Restrict to the bare minimum which is
UNIX transport for local usage. Users shouldn't need to be
running setuid if connecting to remote hypervisors in any
case.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit e22b0232c7)
The use of getenv is typically insecure, and we want people
to use our wrappers, to force them to think about setuid
needs.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 71b21f12be)
Unconditional use of getenv is not secure in setuid env.
While not all libvirt code runs in a setuid env (since
much of it only exists inside libvirtd) this is not always
clear to developers. So make all the code paranoid, even
if it only ever runs inside libvirtd.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 1e4a02bdfe)
When running setuid, we must be careful about what env vars
we allow commands to inherit from us. Replace the
virCommandAddEnvPass function with two new ones which do
filtering
virCommandAddEnvPassAllowSUID
virCommandAddEnvPassBlockSUID
And make virCommandAddEnvPassCommon use the appropriate
ones
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 9b8f307c6a)
Conflicts:
src/qemu/qemu_command.c
In Fedora 20, libvirt_lxc crashes immediately at startup with a
trace
#0 0x00007f0cddb653ec in free () from /lib64/libc.so.6
#1 0x00007f0ce0e16f4a in virFree (ptrptr=ptrptr@entry=0x7f0ce1830058) at util/viralloc.c:580
#2 0x00007f0ce0e2764b in virResetError (err=0x7f0ce1830030) at util/virerror.c:354
#3 0x00007f0ce0e27a5a in virResetLastError () at util/virerror.c:387
#4 0x00007f0ce0e28858 in virEventRegisterDefaultImpl () at util/virevent.c:233
#5 0x00007f0ce0db47c6 in main (argc=11, argv=0x7fff4596c328) at lxc/lxc_controller.c:2352
Normally virInitialize calls virErrorInitialize and
virThreadInitialize, but we don't link to libvirt.so
in libvirt_lxc, and nor did we ever call the error
or thread initializers.
I have absolutely no idea how this has ever worked, let alone
what caused it to stop working in Fedora 20.
In addition not all code paths from virLogSetFromEnv will
ensure virLogInitialize is called correctly, which is another
possible crash scenario.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 97973ebb7a)
The log message regex has been
[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3}\+[0-9]{4}: [0-9]+: debug|info|warning|error :
The precedence of '|' is high though, so this is equivalent to matching
[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3}\+[0-9]{4}: [0-9]+: debug
Or
info
Or
warning
Or
error :
Which is clearly not what it should have done. This caused the code to
skip over things which are not log messages. The solution is to simply
add brackets.
A test case is also added to validate correctness.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 5787f0b95e)
After commit 3e2f27e1, I've noticed build failures of virt-login-shell
when libapparmor-devel is installed on the build host
CCLD virt-login-shell
../src/.libs/libvirt-setuid-rpc-client.a(libvirt_setuid_rpc_client_la-vircommand.o):
In function `virExec':
/home/jfehlig/virt/upstream/libvirt/src/util/vircommand.c:653: undefined
reference to `aa_change_profile'
collect2: error: ld returned 1 exit status
I was about to commit an easy fix under the build-breaker rule
(build-fix-1.patch), but thought to extend the notion of SECDRIVER_LIBS
to SECDRIVER_CFLAGS, and use both throughout src/Makefile.am where it
makes sense (build-fix-2.patch).
Should I just stick with the simple fix, or is something along the lines
of patch 2 preferred?
Regards,
Jim
>From a0f35945f3127ab70d051101037e821b1759b4bb Mon Sep 17 00:00:00 2001
From: Jim Fehlig <jfehlig@suse.com>
Date: Mon, 21 Oct 2013 15:30:02 -0600
Subject: [PATCH] build: fix virt-login-shell build with apparmor
With libapparmor-devel installed, virt-login-shell fails to link
CCLD virt-login-shell
../src/.libs/libvirt-setuid-rpc-client.a(libvirt_setuid_rpc_client_la-vircommand.o): In function `virExec':
/home/jfehlig/virt/upstream/libvirt/src/util/vircommand.c:653: undefined reference to `aa_change_profile'
collect2: error: ld returned 1 exit status
Fix by linking libvirt_setuid_rpc_client with previously determined
SECDRIVER_LIBS in src/Makefile.am. While at it, introduce SECDRIVER_CFLAGS
and use both throughout src/Makefile.am where it makes sense.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Conflicts:
src/Makefile.am: Context
The libvirt.so library has far too many library deps to allow
linking against it from setuid programs. Those libraries can
do stuff in __attribute__((constructor) functions which is
not setuid safe.
The virt-login-shell needs to link directly against individual
files that it uses, with all library deps turned off except
for libxml2 and libselinux.
Create a libvirt-setuid-rpc-client.la library which is linked
to by virt-login-shell. A config-post.h file allows this library
to disable all external deps except libselinux and libxml2.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 3e2f27e13b)
We must not allow file/syslog/journald log outputs when running
setuid since they can be abused to do bad things. In particular
the 'file' output can be used to overwrite files.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 8c3586ea75)
Care must be taken accessing env variables when running
setuid. Introduce a virGetEnvAllowSUID for env vars which
are safe to use in a setuid environment, and another
virGetEnvBlockSUID for vars which are not safe. Also add
a virIsSUID helper method for any other non-env var code
to use.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit ae53e5d10e)
The virConnectDomainXMLToNative API should require 'connect:write'
not 'connect:read', since it will trigger execution of the QEMU
binaries listed in the XML.
Also make virConnectDomainXMLFromNative API require a full
read-write connection and 'connect:write' permission. Although the
current impl doesn't trigger execution of QEMU, we should not
rely on that impl detail from an API permissioning POV.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 57687fd6bf)
Introduced by 7b87a3
When I quit the process which only register VIR_DOMAIN_EVENT_ID_REBOOT,
I got error like:
"libvirt: XML-RPC error : internal error: domain event 0 not registered".
Then I add the following code, it fixed.
Signed-off-by: Zhou Yimin <zhouyimin@huawei.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 9712c2510e)
Since 76b644c when the support for RAM filesystems was introduced,
libvirt accepted the following XML:
<source usage='1024' unit='KiB'/>
This was parsed correctly and internally stored in bytes, but it
was formatted as (with an extra 's'):
<source usage='1024' units='KiB'/>
When read again, this was treated as if the units were missing,
meaning libvirt was unable to parse its own XML correctly.
The usage attribute was documented as being in KiB, but it was not
scaled if the unit was missing. Transient domains still worked,
because this was balanced by an extra 'k' in the mount options.
This patch:
Changes the parser to use 'units' instead of 'unit', as the latter
was never documented (fixing persistent domains) and some programs
(libvirt-glib, libvirt-sandbox) already parse the 'units' attribute.
Removes the extra 'k' from the tmpfs mount options, which is needed
because now we parse our own XML correctly.
Changes the default input unit to KiB to match documentation, fixing:
https://bugzilla.redhat.com/show_bug.cgi?id=1015689
(cherry picked from commit 3f029fb531)
After freeing the bitmap pointer, it must set the pointer to NULL.
This will avoid any other use of the freed memory of the bitmap pointer.
https://bugzilla.redhat.com/show_bug.cgi?id=1006710
Signed-off-by: Liuji (Jeremy) <jeremy.liu@huawei.com>
(cherry picked from commit ef5d51d491)
https://bugzilla.redhat.com/show_bug.cgi?id=1008619
1,003 bytes in 1 blocks are definitely lost in loss record 599 of 635
==404== by 0x50728A7: virBufferAddChar (virbuffer.c:185)
==404== by 0x50BC466: virSystemdEscapeName (virsystemd.c:67)
==404== by 0x50BC6B2: virSystemdMakeSliceName (virsystemd.c:108)
==404== by 0x50BC870: virSystemdCreateMachine (virsystemd.c:169)
==404== by 0x5078267: virCgroupNewMachine (vircgroup.c:1498)
(cherry picked from commit 09b48562aa)
Commit 27e81517a8 set the payload size to 256 KB, which is
actually the max packet size, including the size of the header.
Reduce this by VIR_NET_MESSAGE_HEADER_MAX (24) and set
VIR_NET_MESSAGE_LEGACY_PAYLOAD_MAX to 262120, which was the original
value before increasing the limit in commit eb635de1fe.
(cherry picked from commit 609eb987c6)
The libvirtd server pushes data out to clients. It does not
know what protocol version the client might have, so must be
conservative and use the old payload limits. ie send no more
than 256kb of data per packet.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 27e81517a8)
Since the wait is done during migration (still inside
QEMU_ASYNC_JOB_MIGRATION_OUT), the code should enter the monitor as such
in order to prohibit all other jobs from interfering in the meantime.
This patch fixes bug #1009886 in which qemuDomainGetBlockInfo was
waiting on the monitor condition and after GetSpiceMigrationStatus
mangled its internal data, the daemon crashed.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1009886
(cherry picked from commit 484cc3217b)
The fix for CVE-2013-4311 had a pre-requisite enhancement
to the identity code
commit db7a5688c0
Author: Daniel P. Berrange <berrange@redhat.com>
Date: Thu Aug 22 16:00:01 2013 +0100
Also store user & group ID values in virIdentity
This had a typo which caused the group ID to overwrite the
user ID string. This meant any checks using this would have
the wrong ID value. This only affected the ACL code, not the
initial polkit auth. It also leaked memory.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit e4697b92ab)
With the existing pkcheck (pid, start time) tuple for identifying
the process, there is a race condition, where a process can make
a libvirt RPC call and in another thread exec a setuid application,
causing it to change to effective UID 0. This in turn causes polkit
to do its permission check based on the wrong UID.
To address this, libvirt must get the UID the caller had at time
of connect() (from SO_PEERCRED) and pass a (pid, start time, uid)
triple to the pkcheck program.
This fix requires that libvirt is re-built against a version of
polkit that has the fix for its CVE-2013-4288, so that libvirt
can see 'pkg-config --variable pkcheck_supports_uid polkit-gobject-1'
Signed-off-by: Colin Walters <walters@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 922b7fda77)
The polkit access driver will want to use the process start
time field. This was already set for network identities, but
not for the system identity.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit e65667c0c6)
Future improvements to the polkit code will require access to
the numeric user ID, not merely user name.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit db7a5688c0)
Bother those kernel developers. In the latest rawhide, kernel
and glibc have now been unified so that <netinet/in.h> and
<linux/in6.h> no longer clash; but <linux/if_bridge.h> is still
not self-contained. Because of the latest header change, the
build is failing with:
checking for linux/param.h... no
configure: error: You must install kernel-headers in order to compile libvirt with QEMU or LXC support
with details:
In file included from conftest.c:561:0:
/usr/include/linux/in6.h:71:18: error: field 'flr_dst' has incomplete type
struct in6_addr flr_dst;
We need a workaround to avoid our workaround :)
* configure.ac (NETINET_LINUX_WORKAROUND): New test.
* src/util/virnetdevbridge.c (includes): Use it.
Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit e62e0094dc)
Since virtlockd is only built when libvirtd is built, we should
not install its auxiliary files unconditionally. This solves
two failures. 1. 'make distcheck' complains:
rm -f Makefile
ERROR: files left in build directory after distclean:
./src/virtlockd.8
2. './autobuild.sh' complains:
Checking for unpackaged file(s): /usr/lib/rpm/check-files
/home/eblake/rpmbuild/BUILDROOT/mingw-libvirt-1.1.1-1.fc19.eblake1377879911.x86_64
error: Installed (but unpackaged) file(s) found:
/usr/i686-w64-mingw32/sys-root/mingw/etc/libvirt/virtlockd.conf
/usr/i686-w64-mingw32/sys-root/mingw/share/augeas/lenses/tests/test_virtlockd.aug
/usr/i686-w64-mingw32/sys-root/mingw/share/augeas/lenses/virtlockd.aug
/usr/i686-w64-mingw32/sys-root/mingw/share/man/man8/virtlockd.8
/usr/x86_64-w64-mingw32/sys-root/mingw/etc/libvirt/virtlockd.conf
/usr/x86_64-w64-mingw32/sys-root/mingw/share/augeas/lenses/tests/test_virtlockd.aug
/usr/x86_64-w64-mingw32/sys-root/mingw/share/augeas/lenses/virtlockd.aug
/usr/x86_64-w64-mingw32/sys-root/mingw/share/man/man8/virtlockd.8
* src/Makefile.am (CLEANFILES): Add virtlockd.8.
(man8_MANS, conf_DATA, augeas_DATA, augeastest_DATA): Only install
virtlockd files when daemon is built.
Signed-off-by: Eric Blake <eblake@redhat.com>
vhost only works in KVM mode at the moment, and is infact compiled
out if the emulator is built for non-native architecture. While it
may work at some point in the future for plain qemu, for now it's
just noise on the command line (and which contributes to arm cli
breakage).
FreeBSD 10 recently changed their definition of RAND_MAX, to try
and cover the fact that their evenly distributed results of rand()
really are a smaller range than a full power of 2. As a result,
I did some investigation, and learned:
1. POSIX requires random() to be evenly distributed across exactly
31 bits. glibc also guarantees this for rand(), but the two are
unrelated, and POSIX only associates RAND_MAX with rand().
Avoiding RAND_MAX altogether thus avoids a build failure on
FreeBSD 10.
2. Concatenating random bits from a PRNG will NOT provide uniform
coverage over the larger value UNLESS the period of the original
PRNG is at least as large as the number of bits being concatenated.
Simple example: suppose that RAND_MAX were 1 with a period of 2**1
(which means that the PRNG merely alternates between 0 and 1).
Concatenating two successive rand() calls would then invariably
result in 01 or 10, which is a rather non-uniform distribution
(00 and 11 are impossible) and an even worse period (2**0, since
our second attempt will get the same number as our first attempt).
But a RAND_MAX of 1 with a period of 2**2 (alternating between
0, 1, 1, 0) provides sane coverage of all four values, if properly
tempered. (Back-to-back calls would still only see half the values
if we don't do some tempering). We therefore want to guarantee a
period of at least 2**64, preferably larger (as a tempering factor);
POSIX only makes this guarantee for random() with 256 bytes of info.
* src/util/virrandom.c (virRandomBits): Use constants that are
accurate for the PRNG we are using, not an unrelated PRNG.
(randomState): Ensure the period of our PRNG exceeds our usage.
Signed-off-by: Eric Blake <eblake@redhat.com>
Commit 29fe5d7 (released in 1.1.1) introduced a latent problem
for any caller of virSecurityManagerSetProcessLabel and where
the domain already had a uid:gid label to be parsed. Such a
setup would collect the list of supplementary groups during
virSecurityManagerPreFork, but then ignores that information,
and thus fails to call setgroups() to adjust the supplementary
groups of the process.
Upstream does not use virSecurityManagerSetProcessLabel for
qemu (it uses virSecurityManagerSetChildProcessLabel instead),
so this problem remained latent until backporting the initial
commit into v0.10.2-maint (commit c061ff5, released in 0.10.2.7),
where virSecurityManagerSetChildProcessLabel has not been
backported. As a result of using a different code path in the
backport, attempts to start a qemu domain that runs as qemu:qemu
will end up with supplementary groups unchanged from the libvirtd
parent process, rather than the desired supplementary groups of
the qemu user. This can lead to failure to start a domain
(typical Fedora setup assigns user 107 'qemu' to both group 107
'qemu' and group 36 'kvm', so a disk image that is only readable
under kvm group rights is locked out). Worse, it is a security
hole (the qemu process will inherit supplemental group rights
from the parent libvirtd process, which means it has access
rights to files owned by group 0 even when such files should
not normally be visible to user qemu).
LXC does not use the DAC security driver, so it is not vulnerable
at this time. Still, it is better to plug the latent hole on
the master branch first, before cherry-picking it to the only
vulnerable branch v0.10.2-maint.
* src/security/security_dac.c (virSecurityDACGetIds): Always populate
groups and ngroups, rather than only when no label is parsed.
Signed-off-by: Eric Blake <eblake@redhat.com>
The return values for the virConnectListAllSecrets call were not
bounds checked. This is a robustness issue for clients if
something where to cause corruption of the RPC stream data.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The return values for the virConnectListAllNWFilters call were not
bounds checked. This is a robustness issue for clients if
something where to cause corruption of the RPC stream data.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The return values for the virConnectListAllNodeDevices call were not
bounds checked. This is a robustness issue for clients if
something where to cause corruption of the RPC stream data.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The return values for the virConnectListAllInterfaces call were not
bounds checked. This is a robustness issue for clients if
something where to cause corruption of the RPC stream data.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The return values for the virConnectListAllNetworks call were not
bounds checked. This is a robustness issue for clients if
something where to cause corruption of the RPC stream data.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The return values for the virStoragePoolListAllVolumes call were not
bounds checked. This is a robustness issue for clients if
something where to cause corruption of the RPC stream data.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The return values for the virConnectListAllStoragePools call were not
bounds checked. This is a robustness issue for clients if
something where to cause corruption of the RPC stream data.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The return values for the virConnectListAllDomains call were not
bounds checked. This is a robustness issue for clients if
something where to cause corruption of the RPC stream data.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The return values for the virDomain{SnapshotListAllChildren,ListAllSnapshots}
calls were not bounds checked. This is a robustness issue for clients if
something where to cause corruption of the RPC stream data.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The return values for the virDomainGetJobStats call were not
bounds checked. This is a robustness issue for clients if
something where to cause corruption of the RPC stream data.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The parameters for the virDomainMigrate*Params RPC calls were
not bounds checks, meaning a malicious client can cause libvirtd
to consume arbitrary memory
This issue was introduced in the 1.1.0 release of libvirt
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Similarly to qemu_driver.c, we can join often repeating code of looking
up network into one function: networkObjFromNetwork.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When using a <interface type="network"> that points to a network with
hostdev forwarding mode a hostdev alias is created for the network. This
allias is inserted into the hostdev list, but is backed with a part of
the network object that it is connected to.
When a VM is being stopped qemuProcessStop() calls
networkReleaseActualDevice() which eventually frees the memory for the
hostdev object. Afterwards when the domain definition is being freed by
virDomainDefFree() an invalid pointer is accessed by
virDomainHostdevDefFree() and may cause a crash of the daemon.
This patch removes the entry in the hostdev list before freeing the
depending memory to avoid this issue.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1000973
QEMU commit 3984890 introduced the "pci-hole64-size" property,
to i440FX-pcihost and q35-pcihost with a default setting of 2 GB.
Translate <pcihole64>x<pcihole64/> to:
-global q35-pcihost.pci-hole64-size=x for q35 machines and
-global i440FX-pcihost.pci-hole64-size=x for i440FX-based machines.
Error out on other machine types or if the size was specified
but the pcihost device lacks 'pci-hole64-size' property.
https://bugzilla.redhat.com/show_bug.cgi?id=990418
<controller type='pci' index='0' model='pci-root'>
<pcihole64 unit='KiB'>1048576</pcihole64>
</controller>
It can be used to adjust (or disable) the size of the 64-bit
PCI hole. The size attribute is in kilobytes (different unit
can be specified on input), but it gets rounded up to
the nearest GB by QEMU.
Disabling it will be needed for guests that crash with the
64-bit PCI hole (like Windows XP), see:
https://bugzilla.redhat.com/show_bug.cgi?id=990418
The ftp protocol is already recognized by qemu/KVM so add this support to
libvirt as well.
The xml should be as following:
<disk type='network' device='cdrom'>
<source protocol='ftp' name='/url/path'>
<host name='host.name' port='21'/>
</source>
</disk>
Signed-off-by: Aline Manera <alinefm@br.ibm.com>
QEMU/KVM already allows a HTTP URL for the cdrom ISO image so add this support
to libvirt as well.
The xml should be as following:
<disk type='network' device='cdrom'>
<source protocol='http' name='/url/path'>
<host name='host.name' port='80'/>
</source>
</disk>
Signed-off-by: Aline Manera <alinefm@br.ibm.com>
qemu-img is going to switch the default for QCOW2
to QCOW2v3 (compat=1.1)
Extend the probing for qemu-img command line options to check
if -o compat is supported. If the volume definition specifies
the qcow2 format but no compat level and -o compat is supported,
specify -o compat=0.10 to create a QCOW2v2 image.
https://bugzilla.redhat.com/show_bug.cgi?id=997977
If there's no hard_limit set and domain uses VFIO we still must lock
the guest memory (prerequisite from qemu). Hence, we should compute
the amount to be locked from max_balloon.
When cpu hotplug fails without reporting an error, we would fail the
command but update the count of vCPUs anyways.
Commit 761fc48136 fixed the case when CPU
hot-unplug failed silently, but forgot to fix up the value in this case.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1000357
The virDomainOpenGraphics method accepts a UNIX socket FD from
the client app. It must set the label on this FD otherwise QEMU
will be prevented from receiving it with recvmsg.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If user requested multiqueue networking, beside multiple /dev/tap and
/dev/vhost-net openings, we forgot to pass mq=on onto the -device
virtio-net-pci command line. This is advised at:
http://www.linux-kvm.org/page/Multiqueue#Enable_MQ_feature
Re-arrange the code so that the returned bitmap is always initialized to
NULL even on early failures and return an error message as some callers
are already expecting it. Fix up the rest not to shadow the error.
Previously the error message showed the following:
error: internal error: Invalid or not yet handled value 'auto detect'
for VMX entry 'ide0:0.fileName'
This left the user unsure if it was a CD-ROM or a disk device that they
needed to fix. Now the error shows:
error: internal error: Invalid or not yet handled value 'auto detect'
for VMX entry 'ide0:0.fileName' for device type 'cdrom-raw'
Which should hopefully make it easier to see the issue with the VMX
configuration.
More fallout from commit d72ef888. When reconnecting to running
domains, the libxl_ctx in libxlDomainObjPrivate was used before
initializing it, causing a segfault in libxl and consequently
crashing libvirtd.
Initialize the libxlDomainObjPrivate libxl_ctx in libxlReconnectDomain,
and while at it use this ctx in libxlReconnectDomain instead of the
driver-wide ctx.
https://bugzilla.redhat.com/show_bug.cgi?id=822052
When doing a live migration, if the destination fails for any
reason after the point in which files should be labeled, then
the cleanup of the destination would restore the labels to their
defaults, even though the source is still trying to continue
running with the image open. Bug 822052 mentioned one source
of live migration failure - a mismatch in SELinux virt_use_nfs
settings (on for source, off for destination); but I found other
situations that would also trigger it (for example, having a
graphics device tied to port 5999 on the source, and a different
domain on the destination already using that port, so that the
destination cannot reuse the port).
In short, just as cleanup of the source on a successful migration
must not relabel files (because the destination would be crippled
by the relabel), cleanup of the destination on a failed migration
must not relabel files (because the source would be crippled).
* src/qemu/qemu_process.c (qemuProcessStart): Set flag to avoid
label restoration when cleaning up on failed migration.
Signed-off-by: Eric Blake <eblake@redhat.com>
Introduced by commit e0139e3044. virStorageVolDefFree free'ed the
pointers that are still used by the added volume object, this changes
it back to VIR_FREE.
Each of the modules handled reporting error messages from the secret fetching
slightly differently with respect to the error. Provide a similar message
for each error case and provide as much data as possible.
Following XML would fail :
<disk type='network' device='lun'>
<driver name='qemu' type='raw'/>
<source protocol='iscsi' name='iqn.2013-07.com.example:iscsi/1'>
<host name='example.com' port='3260'/>
</source>
<target dev='sda' bus='scsi'/>
</disk>
With the message:
error: Failed to start domain iscsilun
error: Unable to get device ID 'iqn.2013-07.com.example:iscsi/1': No such fi
Cause was commit id '1f49b05a' which added 'virDomainDiskSourceIsBlockType'
If we reached cleanup: prior to allocating cpus, it was possible that
'nr_nodes' had a value, but cpus was NULL leading to a possible NULL
deref. Add a 'cpus' as an end condition to for loop
https://bugzilla.redhat.com/show_bug.cgi?id=924153
Commit 904e05a2 (v0.9.9) added a per-<disk> seclabel element with
an attribute relabel='no' in order to try and minimize the
impact of shutdown delays when an NFS server disappears. The idea
was that if a disk is on NFS and can't be labeled in the first
place, there is no need to attempt the (no-op) relabel on domain
shutdown. Unfortunately, the way this was implemented was by
modifying the domain XML so that the optimization would survive
libvirtd restart, but in a way that is indistinguishable from an
explicit user setting. Furthermore, once the setting is turned
on, libvirt avoids attempts at labeling, even for operations like
snapshot or blockcopy where the chain is being extended or pivoted
onto non-NFS, where SELinux labeling is once again possible. As
a result, it was impossible to do a blockcopy to pivot from an
NFS image file onto a local file.
The solution is to separate the semantics of a chain that must
not be labeled (which the user can set even on persistent domains)
vs. the optimization of not attempting a relabel on cleanup (a
live-only annotation), and using only the user's explicit notation
rather than the optimization as the decision on whether to skip
a label attempt in the first place. When upgrading an older
libvirtd to a newer, an NFS volume will still attempt the relabel;
but as the avoidance of a relabel was only an optimization, this
shouldn't cause any problems.
In the ideal future, libvirt will eventually have XML describing
EVERY file in the backing chain, with each file having a separate
<seclabel> element. At that point, libvirt will be able to track
more closely which files need a relabel attempt at shutdown. But
until we reach that point, the single <seclabel> for the entire
<disk> chain is treated as a hint - when a chain has only one
file, then we know it is accurate; but if the chain has more than
one file, we have to attempt relabel in spite of the attribute,
in case part of the chain is local and SELinux mattered for that
portion of the chain.
* src/conf/domain_conf.h (_virSecurityDeviceLabelDef): Add new
member.
* src/conf/domain_conf.c (virSecurityDeviceLabelDefParseXML):
Parse it, for live images only.
(virSecurityDeviceLabelDefFormat): Output it.
(virDomainDiskDefParseXML, virDomainChrSourceDefParseXML)
(virDomainDiskSourceDefFormat, virDomainChrDefFormat)
(virDomainDiskDefFormat): Pass flags on through.
* src/security/security_selinux.c
(virSecuritySELinuxRestoreSecurityImageLabelInt): Honor labelskip
when possible.
(virSecuritySELinuxSetSecurityFileLabel): Set labelskip, not
norelabel, if labeling fails.
(virSecuritySELinuxSetFileconHelper): Fix indentation.
* docs/formatdomain.html.in (seclabel): Document new xml.
* docs/schemas/domaincommon.rng (devSeclabel): Allow it in RNG.
* tests/qemuxml2argvdata/qemuxml2argv-seclabel-*-labelskip.xml:
* tests/qemuxml2argvdata/qemuxml2argv-seclabel-*-labelskip.args:
* tests/qemuxml2xmloutdata/qemuxml2xmlout-seclabel-*-labelskip.xml:
New test files.
* tests/qemuxml2argvtest.c (mymain): Run the new tests.
* tests/qemuxml2xmltest.c (mymain): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
If there's no hard_limit set and domain uses VFIO we still must lock the
guest memory (prerequisite from qemu). Hence, we should compute the
amount to be locked from max_balloon.
Since 16bcb3 we have a regression. The hard_limit is set
unconditionally. By default the limit is zero. Hence, if user hasn't
configured any, we set the zero in cgroup subsystem making the kernel
kill the corresponding qemu process immediately. The proper fix is to
set hard_limit iff user has configured any.
From: Dario Faggioli <dario.faggioli@citrix.com>
Starting from Xen 4.2, libxl has all the bits and pieces in place
for retrieving an adequate amount of information about the host
NUMA topology. It is therefore possible, after a bit of shuffling,
to arrange those information in the way libvirt wants to present
them to the outside world.
Therefore, with this patch, the <topology> section of the host
capabilities is properly populated, when running on Xen, so that
we can figure out whether or not we're running on a NUMA host,
and what its characteristics are.
[raistlin@Zhaman ~]$ sudo virsh --connect xen:/// capabilities
<capabilities>
<host>
<cpu>
....
<topology>
<cells num='2'>
<cell id='0'>
<memory unit='KiB'>6291456</memory>
<cpus num='8'>
<cpu id='0' socket_id='1' core_id='0' siblings='0-1'/>
<cpu id='1' socket_id='1' core_id='0' siblings='0-1'/>
<cpu id='2' socket_id='1' core_id='1' siblings='2-3'/>
<cpu id='3' socket_id='1' core_id='1' siblings='2-3'/>
<cpu id='4' socket_id='1' core_id='9' siblings='4-5'/>
<cpu id='5' socket_id='1' core_id='9' siblings='4-5'/>
<cpu id='6' socket_id='1' core_id='10' siblings='6-7'/>
<cpu id='7' socket_id='1' core_id='10' siblings='6-7'/>
</cpus>
</cell>
<cell id='1'>
<memory unit='KiB'>6881280</memory>
<cpus num='8'>
<cpu id='8' socket_id='0' core_id='0' siblings='8-9'/>
<cpu id='9' socket_id='0' core_id='0' siblings='8-9'/>
<cpu id='10' socket_id='0' core_id='1' siblings='10-11'/>
<cpu id='11' socket_id='0' core_id='1' siblings='10-11'/>
<cpu id='12' socket_id='0' core_id='9' siblings='12-13'/>
<cpu id='13' socket_id='0' core_id='9' siblings='12-13'/>
<cpu id='14' socket_id='0' core_id='10' siblings='14-15'/>
<cpu id='15' socket_id='0' core_id='10' siblings='14-15'/>
</cpus>
</cell>
</cells>
</topology>
</host>
....
When the daemon is compiled with firewalld support but the DBus message
bus isn't started in the system, the initialization of the nwfilter
driver fails even if there are fallback options.
On hosts that don't have the DBus service running or installed the new
systemd cgroups code failed with hard error instead of falling back to
"manual" cgroup creation.
Use the new helper to check for the system bus and use the fallback code
in case it isn't available.
Some systems may not use DBus in their system. Add a method to check if
the system bus is available that doesn't print error messages so that
code can later check for this condition and use an alternative approach.