CVE-2016-5008
Setting an empty graphics password is documented as a way to disable
VNC/SPICE access, but QEMU does not always behaves like that. VNC would
happily accept the empty password. Let's enforce the behavior by setting
password expiration to "now".
https://bugzilla.redhat.com/show_bug.cgi?id=1180092
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
(cherry picked from commit bb848feec0f3f10e92dd8e5231ae7aa89b5598f3)
The libvirt file system storage driver determines what file to
act on by concatenating the pool location with the volume name.
If a user is able to pick names like "../../../etc/passwd", then
they can escape the bounds of the pool. For that matter,
virStoragePoolListVolumes() doesn't descend into subdirectories,
so a user really shouldn't use a name with a slash.
Normally, only privileged users can coerce libvirt into creating
or opening existing files using the virStorageVol APIs; and such
users already have full privilege to create any domain XML (so it
is not an escalation of privilege). But in the case of
fine-grained ACLs, it is feasible that a user can be granted
storage_vol:create but not domain:write, and it violates
assumptions if such a user can abuse libvirt to access files
outside of the storage pool.
Therefore, prevent all use of volume names that contain "/",
whether or not such a name is actually attempting to escape the
pool.
This changes things from:
$ virsh vol-create-as default ../../../../../../etc/haha --capacity 128
Vol ../../../../../../etc/haha created
$ rm /etc/haha
to:
$ virsh vol-create-as default ../../../../../../etc/haha --capacity 128
error: Failed to create vol ../../../../../../etc/haha
error: Requested operation is not valid: volume name '../../../../../../etc/haha' cannot contain '/'
Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 034e47c338b13a95cf02106a3af912c1c5f818d7)
Well, in 8ad126e6 we tried to fix a memory corruption problem.
However, the fix was not as good as it could be. I mean, the
commit has one line more than it should. I've noticed this output
just recently:
# ./run valgrind --leak-check=full --show-reachable=yes ./tools/virsh domblklist gentoo
==17019== Memcheck, a memory error detector
==17019== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==17019== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==17019== Command: /home/zippy/work/libvirt/libvirt.git/tools/.libs/virsh domblklist gentoo
==17019==
Target Source
------------------------------------------------
fda /var/lib/libvirt/images/fd.img
vda /var/lib/libvirt/images/gentoo.qcow2
hdc /home/zippy/tmp/install-amd64-minimal-20150402.iso
==17019== Thread 2:
==17019== Invalid read of size 4
==17019== at 0x4EFF5B4: virObjectUnref (virobject.c:258)
==17019== by 0x5038CFF: remoteClientCloseFunc (remote_driver.c:552)
==17019== by 0x5069D57: virNetClientCloseLocked (virnetclient.c:685)
==17019== by 0x506C848: virNetClientIncomingEvent (virnetclient.c:1852)
==17019== by 0x5082136: virNetSocketEventHandle (virnetsocket.c:1913)
==17019== by 0x4ECD64E: virEventPollDispatchHandles (vireventpoll.c:509)
==17019== by 0x4ECDE02: virEventPollRunOnce (vireventpoll.c:658)
==17019== by 0x4ECBF00: virEventRunDefaultImpl (virevent.c:308)
==17019== by 0x130386: vshEventLoop (vsh.c:1864)
==17019== by 0x4F1EB07: virThreadHelper (virthread.c:206)
==17019== by 0xA8462D3: start_thread (in /lib64/libpthread-2.20.so)
==17019== by 0xAB441FC: clone (in /lib64/libc-2.20.so)
==17019== Address 0x139023f4 is 4 bytes inside a block of size 240 free'd
==17019== at 0x4C2B1F0: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==17019== by 0x4EA8949: virFree (viralloc.c:582)
==17019== by 0x4EFF6D0: virObjectUnref (virobject.c:273)
==17019== by 0x4FE74D6: virConnectClose (libvirt.c:1390)
==17019== by 0x13342A: virshDeinit (virsh.c:406)
==17019== by 0x134A37: main (virsh.c:950)
The problem is, when registering remoteClientCloseFunc(), it's
conn->closeCallback which is ref'd. But in the function itself
it's conn->closeCallback->conn what is unref'd. This is causing
imbalance in reference counting. Moreover, there's no need for
the remote driver to increase/decrease conn refcount since it's
not used anywhere. It's just merely passed to client registered
callback. And for that purpose it's correctly ref'd in
virConnectRegisterCloseCallback() and then unref'd in
virConnectUnregisterCloseCallback().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
(cherry picked from commit e68930077034f786e219bdb015f8880dbc5a246f)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit id '155ca616' added the 'refreshVol' API. In an NFS root-squash
environment it was possible that if the just created volume from XML wasn't
properly created with the right uid/gid and/or mode, then the followup
refreshVol will fail to open the volume in order to get the allocation/
capacity values. This would leave the volume still on the server and
cause a libvirtd crash because 'voldef' would be in the pool list, but
the cleanup code would free it.
(cherry picked from commit db9277a39bc364806e8d3e08a08fc128d59b7094)
In an NFS root-squashed environment the 'vol-delete' command will fail to
'unlink' the target volume since it was created under a different uid:gid.
This code continues the concepts introduced in virFileOpenForked and
virDirCreate[NoFork] with respect to running the unlink command under
the uid/gid of the child. Unlike the other two, don't retry on EACCES
(that's why we're here doing this now).
(cherry picked from commit 35847860f65f92e444db9730e00cdaef45198e0c)
This reverts commit 1ce7c1d20cfd5afb26d2dbc88201085d52415d0e,
which introduced a significant semantic change to the
virDomainGetInfo() API. Additionally, the change was only
made to 2 of the 15 virt drivers.
Conflicts:
src/qemu/qemu_driver.c
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
(cherry picked from commit 60acb38abbee1636a9cddf8d296f700d115c8f77)
So, recently I was testing the LXC driver. You know, startup some
domains. But to my surprise, I was not able to start a single one:
virsh # start --console test
error: Reconnected to the hypervisor
error: Failed to start domain test
error: internal error: guest failed to start: unexpected exit status 125
So I've start digging. It turns out, that in virExec(), when I printed
out the @cmd, I got strange values: *(cmd->outfdptr) was certainly not
valid FD number: it has random value of several millions. This
obviously made prepareStdFd(childout, STDOUT_FILENO) fail (line 611).
But outfdptr is set in virCommandSetOutputFD(). The only place within
LXC driver where the function is called is in
virLXCProcessBuildControllerCmd(). If you take a closer look at the
function it looks like this:
static virCommandPtr
virLXCProcessBuildControllerCmd(virLXCDriverPtr driver,
..
int logfd,
const char *pidfile)
{
...
virCommandSetOutputFD(cmd, &logfd);
virCommandSetErrorFD(cmd, &logfd);
...
}
Yes, you guessed it. @logfd is passed into the function by value.
However, in the function we try to get its address (an address of a
local variable) which is no longer valid once function is finished and
stack is cleaned. Therefore when cmd->outfdptr is evaluated at any
point after this function, we may get a random number, depending on
what's currently on the stack. Of course, this may work sometimes too
- it depends on the compiler how it arranges the code, when the stack
is wiped out.
In order to fix this, lets pass a pointer to @logfd instead of
figuring out (wrong) its value in a function.
The bug was introduced in e1de5521.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
(cherry picked from commit 302146b16d204e3feb8f04a9f81b8f587c827302)
Future kernels will mandate the use of nosuid+nodev+noexec
flags when mounting the /proc/sys filesystem. Unconditionally
add them now since they don't harm things regardless and could
mitigate future security attacks.
(cherry picked from commit 24710414d403f1040794299f5304fee160d0fc23)
Found by Laine and discussed a bit on internal IRC.
Commit id c56fe7f1d6 added support for creating a command line to support
scsi-disk.channel.
Series was here:
http://www.redhat.com/archives/libvir-list/2012-February/msg01052.html
Which pointed to a design proposal here:
http://permalink.gmane.org/gmane.comp.emulators.libvirt/50428
Which states (in part):
Libvirt should check for the QEMU "scsi-disk.channel" property. If it
is unavailable, QEMU will only support channel=lun=0 and 0<=target<=7.
However, the check added was ensuring that bus != lun *and* bus != 0. So
if bus == lun and both were non zero, we'd never make the second check.
Changing this to an *or* check fixes the check, but still is less readable
than the just checking each for 0
Recent commit 198cc1d3 introduced integration of lockd and sanlock into
libxl, but forget to update libvirt.spec.in to also list new files
distributed via package.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Since the qemu capabilities are not initialized for offline VMs the
caller might get suboptimal error message:
$ virsh blockjob VM PATH --bandwidth 1
error: unsupported configuration: block jobs not supported with this QEMU binary
Move the checks after we make sure that the VM is alive.
Just as we allow stopping filesystem pools when they were unmounted
externally, do not fail to stop an iscsi pool when someone else
closed the session externally.
Reported at:
https://bugzilla.redhat.com/show_bug.cgi?id=1171984
The phyp driver stuffed it into a DomainDefPtr during its attachdevice
routine, but the value is never advertised via capabilities so it should
be safe to drop.
Have the phyp driver use OSTYPE_LINUX, which is what it advertises via
capabilities.
In qemuMigrationDriveMirror we can start all disk mirrors in parallel.
We wait until they are all ready, or one of them aborts.
In qemuMigrationCancelDriveMirror, we wait until all mirrors are
properly stopped. This is necessary to ensure that destination VM is
fully in sync with the (paused) source VM.
If a drive mirror can not be cancelled, then the destination is not in a
consistent state. In this case it is not safe to continue with the
migration.
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
The !modern code path needs to call qemuBlockJobEventProcess directly.
the modern code path will call it via qemuBlockJobSyncWait.
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
Other threads may be blocked in qemuBlockJobSyncWait. Ensure that
they're woken up when the domain is stopped.
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
qemuBlockJobSyncBegin and qemuBlockJobSyncEnd delimit a region of code
where block job events are processed "synchronously".
qemuBlockJobSyncWait and qemuBlockJobSyncWaitWithTimeout wait for an
event generated by a block job.
The Wait* functions may be called multiple times while the synchronous
block job is active. Any pending block job event will be processed by
only when Wait* or End is called. disk->blockJobStatus is reset by
these functions, so if it is needed a pointer to a
virConnectDomainEventBlockJobStatus variable should be passed as the
last argument. It is safe to pass NULL if you do not care about the
block job status.
All functions assume the VM object is locked. The Wait* functions will
unlock the object for as long as they are waiting. They will return -1
and report an error if the domain exits before an event is received.
Typical use is as follows:
virQEMUDriverPtr driver;
virDomainObjPtr vm; /* locked */
virDomainDiskDefPtr disk;
virConnectDomainEventBlockJobStatus status;
qemuBlockJobSyncBegin(disk);
... start block job ...
if (qemuBlockJobSyncWait(driver, vm, disk, &status) < 0) {
/* domain died while waiting for event */
ret = -1;
goto error;
}
... possibly start other block jobs
or wait for further events ...
qemuBlockJobSyncEnd(driver, vm, disk, NULL);
To perform other tasks periodically while waiting for an event:
virQEMUDriverPtr driver;
virDomainObjPtr vm; /* locked */
virDomainDiskDefPtr disk;
virConnectDomainEventBlockJobStatus status;
unsigned long long timeout = 500 * 1000ull; /* milliseconds */
qemuBlockJobSyncBegin(disk);
... start block job ...
do {
... do other task ...
if (qemuBlockJobSyncWaitWithTimeout(driver, vm, disk,
timeout, &status) < 0) {
/* domain died while waiting for event */
ret = -1;
goto error;
}
} while (status == -1);
qemuBlockJobSyncEnd(driver, vm, disk, NULL);
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
We will want to use synchronous block jobs from qemu_migration as well,
so split this function out into a new source file.
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
The documentation states that for shallow block copy the image has to
have the same guest visible content as backing file of the current
image if the file is being reused. This condition can be achieved also
with a raw file (or a qcow without a backing file) so remove the
condition that would disallow it.
(This patch additionally fixes crash described in
https://bugzilla.redhat.com/show_bug.cgi?id=1215569 )
The free callback should be qemuMonitorChardevInfoFree rather
than just 'free' when virHashCreate'ing the chardevInfo hash.
==29959== 24 bytes in 2 blocks are definitely lost in loss record 19 of 53
==29959== at 0x4C29F80: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==29959== by 0xB95C679: strdup (in /lib64/libc-2.20.so)
==29959== by 0x63C6546: virStrdup (virstring.c:709)
==29959== by 0x4805ED: qemuMonitorJSONExtractChardevInfo (qemu_monitor_json.c:3429)
==29959== by 0x4807A5: qemuMonitorJSONGetChardevInfo (qemu_monitor_json.c:3479)
==29959== by 0x434AEC: testQemuMonitorJSONqemuMonitorJSONGetChardevInfo (qemumonitorjsontest.c:1824)
==29959== by 0x436F2F: virtTestRun (testutils.c:211)
==29959== by 0x436932: mymain (qemumonitorjsontest.c:2404)
==29959== by 0x4382EA: virtTestMain (testutils.c:863)
==29959== by 0x436B27: main (qemumonitorjsontest.c:2423)
Signed-off-by: Zhang Bo <oscar.zhangbo@huawei.com>
Signed-off-by: Zhou Yimin <zhouyimin@huawei.com>
It would be used in qemumonitorjsontest, thus we make it non-static.
Signed-off-by: Zhang Bo <oscar.zhangbo@huawei.com>
Signed-off-by: Zhou Yimin <zhouyimin@huawei.com>
Trying to use qemu:///session to create a storage pool pointing at
/tmp will usually fail with something like:
$ virsh pool-start tmp
error: Failed to start pool tmp
error: cannot open volume '/tmp/systemd-private-c38cf0418d7a4734a66a8175996c384f-colord.service-kEyiTA': Permission denied
If any volume in an FS pool can't be opened by the daemon, the refresh
fails, and the pool can't be used.
This causes pain for virt-install/virt-manager though. Imaging a user
downloads a disk image to /tmp. virt-manager wants to import /tmp as
a storage pool, so we can detect what disk format it is, and set the
XML correctly. However this case will likely fail as explained above.
Change the logic here to skip volumes that fail to open. This could
conceivably cause user complaints along the lines of 'why doesn't
libvirt show $ROOT-OWNED-VOLUME-FOO', but figuring that currently
the pool won't even startup, I don't think there are any current
users that care about that case.
https://bugzilla.redhat.com/show_bug.cgi?id=1103308
If you end up with a state file for a pool that no longer starts up
or refreshes correctly, the state file is never removed and adds
noise to the logs everytime libvirtd is started.
If the initial state syncing fails, delete the statefile.
After pool startup we call refreshPool(). If that fails, we leave
a stale pool state file hanging around.
Hit this trying to create a pool with qemu:///session containing
root owned files.
If we received zero iothreads from the monitor, but were perhaps
expecting to receive something, then the code was skipping the check
to ensure what's in the monitor matches our expectations. So invert
the checks to check that what we get back matches expectations and
then check there are zero iothreads returned.
Rather than have a separate routine to parse the alias of an iothread
returned from qemu in order to get the iothread_id value, parse the alias
when returning and just return the iothread_id in qemuMonitorIOThreadInfoPtr
This set of patches removes the function, changes the "char *name" to
"unsigned int" and handles all the fallout.
Build with clang fails with:
CC conf/libvirt_conf_la-domain_conf.lo
conf/domain_conf.c:13377:9: error: variable 'cpumask' is used
uninitialized whenever 'if' condition is true
[-Werror,-Wsometimes-uninitialized]
if (!(tmp = virXMLPropString(node, "cpuset"))) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
and many other similar errors regarding the 'cpuset' variable.
Fix by explicitly initializing it with NULL.
If someone has updated a network to change its bridge name, but the
network is still active (so that bridge name hasn't taken effect yet),
we still want to disallow another network from taking that new name.
Since some people use the same naming convention as libvirt for bridge
devices they create outside the context of libvirt, it is much nicer
if we check for those devices when looking for a bridge device name to
auto-assign to a new network.
We already check that any auto-assigned bridge device name for a
virtual network (e.g. "virbr1") doesn't conflict with the bridge name
for any existing libvirt network (via virNetworkSetBridgeName() in
conf/network_conf.c).
We also want to check that the name doesn't conflict with any bridge
device created on the host system outside the control of libvirt
(history: possibly due to the ploriferation of references to libvirt's
bridge devices in HOWTO documents all around the web, it is not
uncommon for an admin to manually create a bridge in their host's
system network config and name it "virbrX"). To add such a check to
virNetworkBridgeInUse() (which is called by virNetworkSetBridgeName())
we would have to call virNetDevExists() (from util/virnetdev.c); this
function calls ioctl(SIOCGIFFLAGS), which everyone on the mailing list
agreed should not be done from an XML parsing function in the conf
directory.
To remedy that problem, this patch removes virNetworkSetBridgeName()
from conf/network_conf.c and puts an identically functioning
networkBridgeNameValidate() in network/bridge_driver.c (because it's
reasonable for the bridge driver to call virNetDevExists(), although
we don't do that yet because I wanted this patch to have as close to 0
effect on function as possible).
There are a couple of inevitable changes though:
1) We no longer check the bridge name during
virNetworkLoadConfig(). Close examination of the code shows that
this wasn't necessary anyway - the only *correct* way to get XML
into the config files is via networkDefine(), and networkDefine()
will always call networkValidate(), which previously called
virNetworkSetBridgeName() (and now calls
networkBridgeNameValidate()). This means that the only way the
bridge name can be unset during virNetworkLoadConfig() is if
someone edited the config file on disk by hand (which we explicitly
prohibit).
2) Just on the off chance that somebody *has* edited the file by hand,
rather than crashing when they try to start their malformed
network, a check for non-NULL bridge name has been added to
networkStartNetworkVirtual().
(For those wondering why I don't instead call
networkValidateBridgeName() there to set a bridge name if one
wasn't present - the problem is that during
networkStartNetworkVirtual(), the lock for the network being
started has already been acquired, but the lock for the network
list itself *has not* (because we aren't adding/removing a
network). But virNetworkBridgeInuse() iterates through *all*
networks (including this one) and locks each network as it is
checked for a duplicate entry; it is necessary to lock each network
even before checking if it is the designated "skip" network because
otherwise some other thread might acquire the list lock and delete
the very entry we're examining. In the end, permitting a setting of
the bridge name during network start would require that we lock the
entire network list during any networkStartNetwork(), which
eliminates a *lot* of parallelism that we've worked so hard to
achieve (it can make a huge difference during libvirtd startup). So
rather than try to adjust for someone playing against the rules, I
choose to instead give them the error they deserve.)
3) virNetworkAllocateBridge() (now removed) would leak any "template"
string set as the bridge name. Its replacement
networkFindUnusedBridgeName() doesn't leak the template string - it
is properly freed.
Commit ca329299 added a utility function virtTestCompareFiles() to
eliminate repetitive code in several test programs. It unfortunately
calls virtTestDifference() with the arguments in the wrong order -
strcontent is the "actual" output gathered by the test rig, while
filecontent is the "expected", and virtTestDifference() wants expected
(filecontent) followed by actual (strcontent), but
virtTestCompareFiles() does the opposite, which can make the output a
bit confusing when there is a failure.
Coverity notes that the switch() used to check 'connected' values has
two DEADCODE paths (_DEFAULT & _LAST). Since 'connected' is a boolean
it can only be one or the other (CONNECTED or DISCONNECTED), so it just
seems pointless to use a switch to get "all" values. Convert to if-else
https://bugzilla.redhat.com/show_bug.cgi?id=1161617
Add command to allow adding and removing IOThreads from the domain including
the configuration and live domain.
$ virsh iothreadadd --help
NAME
iothreadadd - add an IOThread to the guest domain
SYNOPSIS
iothreadadd <domain> <id> [--config] [--live] [--current]
DESCRIPTION
Add an IOThread to the guest domain.
OPTIONS
[--domain] <string> domain name, id or uuid
[--id] <number> iothread for the new IOThread
--config affect next boot
--live affect running domain
--current affect current domain
$ virsh iothreaddel --help
NAME
iothreaddel - delete an IOThread from the guest domain
SYNOPSIS
iothreaddel <domain> <id> [--config] [--live] [--current]
DESCRIPTION
Delete an IOThread from the guest domain.
OPTIONS
[--domain] <string> domain name, id or uuid
[--id] <number> iothread_id for the IOThread to delete
--config affect next boot
--live affect running domain
--current affect current domain
Assuming a running $dom with multiple IOThreads assigned and that
that the $dom has disks assigned to IOThread 1 and IOThread 2:
$ virsh iothreadinfo $dom
IOThread ID CPU Affinity
---------------------------------------------------
1 2
2 3
3 0-1
$ virsh iothreadadd $dom 1
error: invalid argument: an IOThread is already using iothread_id '1' in iothreadpids
$ virsh iothreadadd $dom 1 --config
error: invalid argument: an IOThread is already using iothread_id '1' in persistent iothreadids
$ virsh iothreadadd $dom 4
$ virsh iothreadinfo $dom
IOThread ID CPU Affinity
---------------------------------------------------
1 2
2 3
3 0-1
4 0-3
$ virsh iothreadinfo $dom --config
IOThread ID CPU Affinity
---------------------------------------------------
1 2
2 3
3 0-1
$ virsh iothreadadd $dom 4 --config
$ virsh iothreadinfo $dom --config
IOThread ID CPU Affinity
---------------------------------------------------
1 2
2 3
3 0-1
4 0-3
Assuming the same original configuration
$ virsh iothreaddel $dom 1
error: invalid argument: cannot remove IOThread 1 since it is being used by disk 'vde'
$ virsh iothreaddel $dom 3
$ virsh iothreadinfo $dom
IOThread ID CPU Affinity
---------------------------------------------------
1 2
2 3
$ virsh iothreadinfo $dom --config
IOThread ID CPU Affinity
---------------------------------------------------
1 2
2 3
3 0-1
Add qemuDomainAddIOThread and qemuDomainDelIOThread in order to add or
remove an IOThread to/from the host either for live or config optoins
The implementation for the 'live' option will use the iothreadpids list
in order to make decision, while the 'config' option will use the
iothreadids list. Additionally, for deletion each may have to adjust
the iothreadpin list.
IOThreads are implemented by qmp objects, the code makes use of the existing
qemuMonitorAddObject or qemuMonitorDelObject APIs.
Signed-off-by: John Ferlan <jferlan@redhat.com>
We're about to allow IOThreads to be deleted, but an iothreadid may be
included in some domain thread sched, so add a new API to allow removing
an iothread from some entry.
Then during the writing of the threadsched data and an additional check
to determine whether the bitmap is all clear before writing it out.