virRunWithHook is now unused, so we can drop it. Tested w/ raw + qcow2
volume creation and copying.
v2:
Use opaque data to skip hook second time around
Simply command building
v3:
Drop explicit FindFileInPath
This patch enables the relative backing file path support provided by
qemu-img create.
If a relative path is specified for the backing file, it is converted
to an absolute path using the storage pool path. The absolute path is
used to verify that the backing file exists. If the backing file exists,
the relative path is allowed and will be provided to qemu-img create.
This patch intentionally doesn't change indentation, in order to
make it easier to review the real changes.
* src/util/util.h (VIR_FILE_OP_RETURN_FD, virFileOperationHook):
Delete.
(virFileOperation): Rename...
(virFileOpenAs): ...and reduce parameters.
* src/util/util.c (virFileOperationNoFork, virFileOperation):
Rename and simplify.
* src/qemu/qemu_driver.c (qemudDomainSaveFlag): Adjust caller.
* src/storage/storage_backend.c (virStorageBackendCreateRaw):
Likewise.
* src/libvirt_private.syms: Reflect rename.
Child processes don't always reach _exit(); if they die from a
signal, then any messages should still be accurate. Most users
either expect a 0 status (thankfully, if status==0, then
WIFEXITED(status) is true and WEXITSTATUS(status)==0 for all
known platforms) or were filtering on WIFEXITED before printing
a status, but a few were missing this check. Additionally,
nwfilter_ebiptables_driver was making an assumption that works
on Linux (where WEXITSTATUS shifts and WTERMSIG just masks)
but fails on other platforms (where WEXITSTATUS just masks and
WTERMSIG shifts).
* src/util/command.h (virCommandTranslateStatus): New helper.
* src/libvirt_private.syms (command.h): Export it.
* src/util/command.c (virCommandTranslateStatus): New function.
(virCommandWait): Use it to also diagnose status from signals.
* src/security/security_apparmor.c (load_profile): Likewise.
* src/storage/storage_backend.c
(virStorageBackendQEMUImgBackingFormat): Likewise.
* src/util/util.c (virExecDaemonize, virRunWithHook)
(virFileOperation, virDirCreate): Likewise.
* daemon/remote.c (remoteDispatchAuthPolkit): Likewise.
* src/nwfilter/nwfilter_ebiptables_driver.c (ebiptablesExecCLI):
Likewise.
For newer qemu-img, the help string for "backing file format" is
"[-F backing_fmt]".
Fix the wrong logic error by commit e997c268.
* src/storage/storage_backend.c
virRun gives pretty useful error output, let's not overwrite it unless there
is a good reason. Some places were providing more information about what
the commands were _attempting_ to do, however that's usually less useful from
a debugging POV than what actually happened.
Use it in all places where a memory or storage request size is converted
to a larger granularity. This avoids requesting too small memory or storage
sizes that could result from the truncation done by a simple division.
This extends the round up fix in 6002e0406c
to the whole codebase.
Instead of reporting errors for odd values in the VMX code round them up.
Update the QEMU Argv tests accordingly as the original memory size 219200
isn't a even multiple of 1024 and is rounded up to 215 megabyte now. Change
it to 219100 and 219136. Use two different values intentionally to make
sure that rounding up works.
Update virsh.pod accordingly, as rounding down and rejecting are replaced
by rounding up.
If there is a dangling symbolic link in filesystem pool, the pool
will fail to start or refresh, this patch is to fix it by ignoring
it with a warning log.
security_context_t happens to be a typedef for char*, and happens to
begin with a string usable as a raw context string. But in reality,
it is an opaque type that may or may not have additional information
after the first NUL byte, where that additional information can
include pointers that can only be freed via freecon().
Proof is from this valgrind run of daemon/libvirtd:
==6028== 839,169 (40 direct, 839,129 indirect) bytes in 1 blocks are definitely lost in loss record 274 of 274
==6028== at 0x4A0515D: malloc (vg_replace_malloc.c:195)
==6028== by 0x3022E0D48C: selabel_open (label.c:165)
==6028== by 0x3022E11646: matchpathcon_init_prefix (matchpathcon.c:296)
==6028== by 0x3022E1190D: matchpathcon (matchpathcon.c:317)
==6028== by 0x4F9D842: SELinuxRestoreSecurityFileLabel (security_selinux.c:382)
800k is a lot of memory to be leaking.
* src/storage/storage_backend.c
(virStorageBackendUpdateVolTargetInfoFD): Avoid leak on error.
* src/security/security_selinux.c
(SELinuxReserveSecurityLabel, SELinuxGetSecurityProcessLabel)
(SELinuxRestoreSecurityFileLabel): Use correct function to free
security_context_t.
Similarly to deprecating close(), I am now deprecating fclose() and
introduce VIR_FORCE_FCLOSE() and VIR_FCLOSE(). Also, fdopen() is replaced with
VIR_FDOPEN().
Most of the files are opened in read-only mode, so usage of
VIR_FORCE_CLOSE() seemed appropriate. Others that are opened in write
mode already had the fclose()< 0 check and I converted those to
VIR_FCLOSE()< 0.
I did not find occurrences of possible double-closed files on the way.
Using automated replacement with sed and editing I have now replaced all
occurrences of close() with VIR_(FORCE_)CLOSE() except for one, of
course. Some replacements were straight forward, others I needed to pay
attention. I hope I payed attention in all the right places... Please
have a look. This should have at least solved one more double-close
error.
Found by clang. Clang complained that virStorageBackendProbeTarget
could dereference NULL if backingStoreFormat was NULL, but since all
callers passed a valid pointer, I added attributes instead of null
checks.
* src/storage/storage_backend.c
(virStorageBackendQEMUImgBackingFormat): Kill dead store.
* src/storage/storage_backend_fs.c (virStorageBackendProbeTarget):
Likewise. Skip null checks, by adding attributes.
One error exit in virStorageBackendCreateBlockFrom was setting the
return value to errno. The convention for volume build functions is to
return 0 on success or -1 on failure. Not only was it not necessary to
set the return value (it defaults to -1, and is set to 0 when
everything has been successfully completed), in the case that some
caller were checking for < 0 rather than != 0, they would incorrectly
believe that it completed successfully.
Previously virStorageBackendCopyToFD would simply return -1 on
error. This made the error return from one of its callers inconsistent
(createRawFileOpHook is supposed to return -errno, but if
virStorageBackendCopyToFD failed, createRawFileOpHook would just
return -1). Since there is a useful errno in every case of error
return from virStorageBackendCopyToFD, and since the other uses of
that function ignore the return code (beyond simply checking to see if
it is < 0), this is a safe change.
virFileOperation previously returned 0 on success, or the value of
errno on failure. Although there are other functions in libvirt that
use this convention, the preferred (and more common) convention is to
return 0 on success and -errno (or simply -1 in some cases) on
failure. This way the check for failure is always (ret < 0).
* src/util/util.c - change virFileOperation and virFileOperationNoFork to
return -errno on failure.
* src/storage/storage_backend.c, src/qemu/qemu_driver.c
- change the hook functions passed to virFileOperation to return
-errno on failure.
Originally the storage volume files were opened with O_DSYNC to make
sure they were flushed to disk immediately. It turned out that this
was extremely slow in some cases, so the O_DSYNC was removed in favor
of just calling fsync() after all the data had been written. However,
this call to fsync was inside the block that is executed to zero-fill
the end of the volume file. In cases where the new volume is copied
from an old volume, and they are the same length, this fsync would
never take place.
Now the fsync is *always* done, unless there is an error (in which
case it isn't important, and is most likely inappropriate.
A missing set of braces around an error condition caused us to skip
zero'ing out the remainder of a new volume file if the new volume was
longer than the original (the goto was supposed to be taken only in
the case of error, but was always being taken).
When creating qcow2 files with a backing store, it is important
to set an explicit format to prevent QEMU probing. The storage
backend was only doing this if it found a 'kvm-img' binary. This
is wrong because plenty of kvm-img binaries don't support an
explicit format, and plenty of 'qemu-img' binaries do support
a format. The result was that most qcow2 files were not getting
a backing store format.
This patch runs 'qemu-img -h' to check for the two support
argument formats
'-o backing_format=raw'
'-F raw'
and use whichever option it finds
* src/storage/storage_backend.c: Query binary to determine
how to set the backing store format
If a directory pool contains pipes or sockets, a pool start can fail or hang:
https://bugzilla.redhat.com/show_bug.cgi?id=589577
We already try to avoid these special files, but only attempt after
opening the path, which is where the problems lie. Unify volume opening
into helper functions, which use the proper open() flags to avoid error,
followed by fstat to validate storage mode.
Previously, virStorageBackendUpdateVolTargetInfoFD attempted to enforce the
storage mode check, but allowed callers to detect this case and silently
continue. In practice, only the FS backend was using this feature, the rest
were treating unknown mode as an error condition. Unfortunately the InfoFD
function wasn't raising an error message here, so error reporting was
busted.
This patch adds 2 functions: virStorageBackendVolOpen, and
virStorageBackendVolOpenModeSkip. The latter retains the original opt out
semantics, the former now throws an explicit error.
This patch maintains the previous volume mode checks: allowing specific
modes for specific pool types requires a bit of surgery, since VolOpen
is called through several different helper functions.
v2: Use ATTRIBUTE_NONNULL. Drop stat check, just open with
O_NONBLOCK|O_NOCTTY.
v3: Move mode check logic back to VolOpen. Use 2 VolOpen functions with
different error semantics.
v4: Make second VolOpen function more extensible. Didn't opt to change
FS backend defaults, this can just be to fix the original bug.
v5: Prefix default flags with VIR_, use ATTRIBUTE_RETURN_CHECK
Volume detection in the scsi backend was duplicating code already
present in storage_backend.c. Let's drop the duplicate code.
Also, change the shared function name to be less generic, and remove
some error squashing in the other call site.
WIN32 is always defined when __MINGW32__ is defined, but the
converse is not true. WIN32 is more generic, if someone were
to ever attempt porting to a microsoft compiler. This does
not affect Cygwin, which intentionally does not define WIN32.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Use more
generic flag macro.
* src/storage/storage_backend.c
(virStorageBackendUpdateVolTargetInfoFD)
(virStorageBackendRunProgRegex): Likewise.
* tools/console.h (vshRunConsole): Likewise.
This allows the config to have a setting that means "leave it alone",
eg when building a pool where the directory already exists the user
may want the current uid/gid of the directory left intact. This
actually gets us back to older behavior - before recent changes to the
pool building code, we weren't as insistent about honoring the uid/gid
settings in the XML, and virt-manager was taking advantage of this
behavior.
As a side benefit, removing calls to getuid/getgid from the XML
parsing functions also seems like a good idea. And having a default
that is different from a common/useful value (0 == root) is a good
thing in general, as it removes ambiguity from decisions (at least one
place in the code was checking for (perms.uid == 0) to see if a
special uid was requested).
Note that this will only affect newly created pools and volumes. Due
to the way that the XML is parsed, then formatted for newly created
volumes, all existing pools/volumes already have an explicit uid and
gid set.
src/conf/storage_conf.c: Remove calls to setuid/setgid for default values
of uid/gid, and set them to -1 instead
src/storage/storage_backend.c:
src/storage/storage_backend_fs.c:
Make account for the new default values of perms.uid
and perms.gid.
Various safezero() implementations used either -1, errno or -errno
return values. This patch fixes them all to return -1 and set errno
appropriately.
There was also a bug in size parameter passed to safewrite() which could
result in an attempt to write gigabytes out of a megabyte buffer.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Recently we introduced O_DSYNC flag when creating raw storage files to
avoid filling all disk cache with dirty pages. However, the patch got
lost when virStorageBackendCreateRaw was reworked using
virFileOperation. Let's use O_DSYNC again.
There were a few operations on the storage volume file that were still
being done as root, which will fail if the file is on a root-squashed
NFS share. The result was that attempts to create a storage volume of
type "raw" on a root-squashed NFS share would fail.
This patch uses the newly introduced "hook" function in
virFileOperation to execute all those file operations in the child
process that's run under the uid that owns the file (and, presumably,
has permission to write to the NFS share)
* src/storage/storage_backend.c: use virFileOperation() in
virStorageBackendCreateRaw, turning virStorageBackendCreateRaw()
into a new createRawFileOpHook() hook
It turns out it is also useful to be able to perform other operations
on a file created while running as a different uid (eg, write things
to that file), and possibly to do this to a file that already
exists. This patch adds an optional hook function to the renamed (for
more accuracy of purpose) virFileOperation; the hook will be called
after the file has been opened (possibly created) and gid/mode
checked/set, before closing it.
As with the other operations on the file, if the VIR_FILE_OP_AS_UID
flag is set, this hook function will be called in the context of a
child process forked from the process that called virFileOperation.
The implication here is that, while all data in memory is available to
this hook function, any modification to that data will not be seen by
the caller - the only indication in memory of what happened in the
hook will be the return value (which the hook should set to 0 on
success, or one of the standard errno values on failure).
Another piece of making the function more flexible was to add an
"openflags" argument. This arg should contain exactly the flags to be
passed to open(2), eg O_RDWR | O_EXCL, etc.
In the process of adding the hook to virFileOperation, I also realized
that the bits to fix up file owner/group/mode settings after creation
were being done in the parent process, which could fail, so I moved
them to the child process where they should be.
* src/util/util.[ch]: rename and rework virFileCreate-->virFileOperation,
and redo flags in virDirCreate
* storage/storage_backend.c, storage/storage_backend_fs.c: update the
calls to virFileOperation/virDirCreate to reflect changes in the API,
but don't yet take advantage of the hook.
The virConnectPtr is no longer required for error reporting since
that is recorded in a thread local. Remove use of virConnectPtr
from all APIs in secret_conf.{h,c} and update all callers to
match
The virConnectPtr is no longer required for error reporting since
that is recorded in a thread local. Remove use of virConnectPtr
from all APIs in storage_conf.{h,c} and storage_encryption_conf.{h,c}
and update all callers to match
When creating preallocated large raw files opening them with O_DSYNC
prevents long delays in reading because cache pages can be immediately
reused without writing them on a disk first.