If we cannot stat/open a file on pool refresh, returning -1 aborts
the refresh and the pool is undefined.
Only treat missing files as fatal unless VolOpenCheckMode is called
with the VIR_STORAGE_VOL_OPEN_ERROR flag. If this flag is missing
(when it's called from virStorageBackendProbeTarget in
virStorageBackendFileSystemRefresh), only emit a warning and return
-2 to let the caller skip over the file.
https://bugzilla.redhat.com/show_bug.cgi?id=977706
Any source file which calls the logging APIs now needs
to have a VIR_LOG_INIT("source.name") declaration at
the start of the file. This provides a static variable
of the virLogSource type.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add APIs that will allow to use the storage driver to assist in
operations on files even for remote filesystems without native
representation as files in the host.
When attempting to backport gluster pools to an older versoin
where there is no VIR_STRDUP, I got a crash from calling
strdup(,NULL). Rather than relying on the current else branch
safely doing nothing when there is no fd, it is easier to just
skip it. While at it, there's no need to explicitly set
perms.label to NULL after a VIR_FREE().
* src/storage/storage_backend.c
(virStorageBackendUpdateVolTargetInfoFD): Minor optimization.
Signed-off-by: Eric Blake <eblake@redhat.com>
Commit 348b4e2 introduced a potential problem (thankfully not
in any release): we are attempting to use virFileReadHeaderFD()
on a file that was opened with O_NONBLOCK. While this
shouldn't be a problem in practice (because O_NONBLOCK
typically doesn't affect regular or block files, and fifos and
sockets cannot be storage volumes), it's better to play it safe
to avoid races from opening an unexpected file type while also
avoiding problems with having to handle EAGAIN while read()ing.
Based on a report by Dan Berrange.
* src/storage/storage_backend.c
(virStorageBackendVolOpenCheckMode): Fix up fd after avoiding race.
Signed-off-by: Eric Blake <eblake@redhat.com>
We already had code for handling allocation different than
capacity for sparse files; we just had to wire it up to be
used when inspecting gluster images.
* src/storage/storage_backend.c
(virStorageBackendUpdateVolTargetInfoFD): Handle no fd.
* src/storage/storage_backend_gluster.c
(virStorageBackendGlusterRefreshVol): Handle sparse files.
Signed-off-by: Eric Blake <eblake@redhat.com>
We support gluster volumes in domain XML, so we also ought to
support them as a storage pool. Besides, a future patch will
want to take advantage of libgfapi to handle the case of a
gluster device holding qcow2 rather than raw storage, and for
that to work, we need a storage backend that can read gluster
storage volume contents. This sets up the framework.
Note that the new pool is named 'gluster' to match a
<disk type='network'><source protocol='gluster'> image source
already supported in a <domain>; it does NOT match the
<pool type='netfs'><source><target type='glusterfs'>,
since that uses a FUSE mount to a local file name rather than
a network name.
This and subsequent patches have been tested against glusterfs
3.4.1 (available on Fedora 19); there are likely bugs in older
versions that may prevent decent use of gfapi, so this patch
enforces the minimum version tested. A future patch may lower
the minimum. On the other hand, I hit at least two bugs in
3.4.1 that will be fixed in 3.5/3.4.2, where it might be worth
raising the minimum: glfs_readdir is nicer to use than
glfs_readdir_r [1], and glfs_fini should only return failure on
an actual failure [2].
[1] http://lists.gnu.org/archive/html/gluster-devel/2013-10/msg00085.html
[2] http://lists.gnu.org/archive/html/gluster-devel/2013-10/msg00086.html
* configure.ac (WITH_STORAGE_GLUSTER): New conditional.
* m4/virt-gluster.m4: new file.
* libvirt.spec.in (BuildRequires): Support gluster in spec file.
* src/conf/storage_conf.h (VIR_STORAGE_POOL_GLUSTER): New pool
type.
* src/conf/storage_conf.c (poolTypeInfo): Treat similar to
sheepdog and rbd.
(virStoragePoolDefFormat): Don't output target for gluster.
* src/storage/storage_backend_gluster.h: New file.
* src/storage/storage_backend_gluster.c: Likewise.
* po/POTFILES.in: Add new file.
* src/storage/storage_backend.c (backends): Register new type.
* src/Makefile.am (STORAGE_DRIVER_GLUSTER_SOURCES): Build new files.
* src/storage/storage_backend.h (_virStorageBackend): Documet
assumption.
Signed-off-by: Eric Blake <eblake@redhat.com>
vol-clone reports out of memory error with disk type on ppc64.
Currently, wbytes is defined as size_t type (8 bytes), but
args's value in ioctl(fd, args..) in kernel is int (4 bytes).
This makes wbytes 2^32 times larger, causing an out of memory error.
This patch changes size_t to int to synchronize with kernel.
[1] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/block/ioctl.c?id=5e01dc7b#n363
[2] https://lkml.org/lkml/2013/11/1/620
Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
We are calling fstat() at least twice per storage volume in
a directory storage pool; this is rather wasteful. Refactoring
this is also a step towards making code reusable for gluster,
where gluster can provide struct stat but cannot use fstat().
* src/storage/storage_backend.h
(virStorageBackendVolOpenCheckMode)
(virStorageBackendUpdateVolTargetInfoFD): Update signature.
* src/storage/storage_backend.c
(virStorageBackendVolOpenCheckMode): Pass stat results back.
(virStorageBackendUpdateVolTargetInfoFD): Use existing stats.
(virStorageBackendVolOpen, virStorageBackendUpdateVolTargetInfo):
Update callers.
* src/storage/storage_backend_fs.c (virStorageBackendProbeTarget):
Likewise.
* src/storage/storage_backend_scsi.c
(virStorageBackendSCSIUpdateVolTargetInfo): Likewise.
* src/storage/storage_backend_mpath.c
(virStorageBackendMpathUpdateVolTargetInfo): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Most of the usage of getuid()/getgid() is in cases where we are
considering what privileges we have. As such the code should be
using the effective IDs, not real IDs.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Commit id '532fef36' added a call to fallocate() and some error
handling based on whether or not the function existed. This new
call resulted in libvirt-cim/cimtest failures when attempting to
create a volume with "0" (zero) allocation value. The failure is
logged as:
Oct 9 07:51:33 localhost libvirtd[8030]: cannot allocate 0 bytes in
file '/var/lib/libvirt/images/cimtest-vol.img': Invalid argument
This can also be seen with virsh vol-create-as:
error: Failed to create vol test
error: cannot allocate 0 bytes in file '/home/vm-images/test': Invalid
argument
error: Failed to create vol test
error: cannot allocate 0 bytes in file '/home/vm-images/test': Invalid
argument
It turns out fallocate() will return EINVAL when the incoming 'len'
(or allocation) value is 0 (or less).
On RHEL 5, compilation fails with:
storage/storage_backend.c: In function 'createRawFile':
storage/storage_backend.c:339: warning: implicit declaration of function 'fallocate'
storage/storage_backend.c:339: warning: nested extern declaration of 'fallocate' [-Wnested-externs]
But:
$ grep HAVE_FALLOCATE config.h
/* #undef HAVE_FALLOCATE */
Huh? It turns out that in kernels that old, fallocate() is not
implemented (config.h is correct), but <linux/fs.h> defines
HAVE_FALLOCATE as an empty witness macro for a completely
different purpose. Since storage_backend.c is including
<linux/fs.h> on RHEL 5, we are hosed by the kernel definition.
Newer kernels no longer pollute the namespace, and it's fairly
easy to convert to an expression that works with both the old
kernel witness and the new-style config.h (undefined or 1).
Problem introduced in commit 532fef3.
* src/storage/storage_backend.c (createRawFile): Avoid namespace
pollution from kernel, by checking HAVE_FALLOCATE for a value.
Signed-off-by: Eric Blake <eblake@redhat.com>
Fixed the safezero call for allocating the rest of the file after cloning
an existing volume; it used to always use a zero offset, causing it to
only allocate the beginning of the file.
Also modified file creation to try to use fallocate(2) to pre-allocate
disk space before copying any data to make sure it fails early on if disk
is full and makes sure we can skip zero blocks when copying file contents.
If fallocate isn't available we will zero out the rest of the file after
cloning and only use sparse cloning if client requested a lower allocation
than the input volume's capacity.
Signed-off-by: Oskari Saarenmaa <os@ohmu.fi>
The VIR_FREE() macro will cast away any const-ness. This masked a
number of places where we passed a 'const char *' string to
VIR_FREE. Fortunately in all of these cases, the variable was not
in fact const data, but a heap allocated string. Fix all the
variable declarations to reflect this.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
qemu-img is going to switch the default for QCOW2
to QCOW2v3 (compat=1.1)
Extend the probing for qemu-img command line options to check
if -o compat is supported. If the volume definition specifies
the qcow2 format but no compat level and -o compat is supported,
specify -o compat=0.10 to create a QCOW2v2 image.
https://bugzilla.redhat.com/show_bug.cgi?id=997977
The switch statement in 'virStorageBackendCreateQemuImgOpts' used the
for loop end condition 'VIR_STORAGE_FILE_FEATURE_LAST' as a possible value,
but since that cannot happen Coverity spits out a DEADCODE message. Adding
the Coverity tag just removes the Coverity message
Convert the type of loop iterators named 'i', 'j', k',
'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or
'unsigned int', also santizing 'ii', 'jj', 'kk' to use
the normal 'i', 'j', 'k' naming
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Commit id '53d5967c' introduced the following:
TEST: storagevolxml2argvtest
.............. 14 OK
==25636== 358 (264 direct, 94 indirect) bytes in 1 blocks are definitely lost in loss record 67 of 75
==25636== at 0x4A06B6F: calloc (vg_replace_malloc.c:593)
==25636== by 0x4C95791: virAlloc (viralloc.c:124)
==25636== by 0x4CA0BB4: virCommandNewArgs (vircommand.c:805)
==25636== by 0x4CA0C88: virCommandNew (vircommand.c:789)
==25636== by 0x408602: virStorageBackendCreateQemuImgCmd (storage_backend.c:849)
==25636== by 0x405427: testCompareXMLToArgvHelper (storagevolxml2argvtest.c:61)
==25636== by 0x4064DF: virtTestRun (testutils.c:158)
==25636== by 0x40516F: mymain (storagevolxml2argvtest.c:195)
==25636== by 0x406B1A: virtTestMain (testutils.c:722)
==25636== by 0x37C1021A04: (below main) (libc-start.c:225)
==25636==
PASS: storagevolxml2argvtest
It's not used anywhere except for the switch in
virStorageBackendCreateQemuImgOpts, where leaving it in causes
a dead code coverity warning and omitting it breaks compilation
because of unhandled enum value.
Introduced by 6298f74.
Setting of local variables in virStorageBackendCreateQemuImgCmd was
unnecessarily cluttered with ternary operators and repeated testing of
of conditions.
This patch refactors the function to use if statements and improves
error reporting in case inputvol is specified but does not contain
target path. Previously we would complain about "unknown storage vol
type 0" instead of the actual problem.
This resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=851411https://bugzilla.redhat.com/show_bug.cgi?id=955500
The first problem was that virFileOpenAs was returning fd (-1) in one
of the error cases rather than ret (-errno), so the caller thought
that the error was EPERM rather than ENOENT.
The second problem was that some log messages in the general purpose
qemuOpenFile() function would always say "Failed to create" even if
the caller hadn't included O_CREAT (i.e. they were trying to open an
existing file).
This fixes virFileOpenAs to jump down to the error return (which
returns ret instead of fd) in the previously mentioned incorrect
failure case of virFileOpenAs(), removes all error logging from
virFileOpenAs() (since the callers report it), and modifies
qemuOpenFile to appropriately use "open" or "create" in its log
messages.
NB: I seriously considered removing logging from all callers of
virFileOpenAs(), but there is at least one case where the caller
doesn't want virFileOpenAs() to log any errors, because it's just
going to try again (qemuOpenFile()). We can't simply make a silent
variation of virFileOpenAs() though, because qemuOpenFile() can't make
the decision about whether or not it wants to retry until after
virFileOpenAs() has already returned an error code.
Likewise, I also considered changing virFileOpenAs() to return -1 with
errno set on return, and may still do that, but only as a separate
patch, as it obscures the intent of this patch too much.
The source code base needs to be adapted as well. Some files
include virutil.h just for the string related functions (here,
the include is substituted to match the new file), some include
virutil.h without any need (here, the include is removed), and
some require both.
Ensure that the driver struct field names match the public
API names. For an API virXXXX we must have a driver struct
field xXXXX. ie strip the leading 'vir' and lowercase any
leading uppercase letters.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When creating a logical volume with virStorageVolCreateXMLFrom,
"qemu-img convert" is called internally if clonevol is a file volume.
Then, vol->target.format is used as output_fmt parameter but the
target.format of logical volumes is always 0 because logical volumes
haven't the volume format type element.
Fortunately, 0 was treated as RAW file format before commit f772b3d9,
so there was no problem. But now, 0 is treated as the type of none,
qemu-img fails with "Unknown file format 'none'".
This patch fixes this issue by treating output block devices as RAW
file format like for input block devices.
Signed-off-by: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp>
uid_t and gid_t are opaque types, ranging from s32 to u32 to u64.
Explicitly cast the magic -1 to the appropriate type.
Signed-off-by: Philipp Hahn <hahn@univention.de>
uid_t and gid_t are opaque types, ranging from s32 to u32 to u64.
Explicitly cast them to unsigned int for printing.
Signed-off-by: Philipp Hahn <hahn@univention.de>
For really old qemu-img binaries which do not support specifying
the format of the backing file, display a DEBUG message instead of
INFO that this can't be done.
No need to use HAVE_REGEX_H - our use of gnulib guarantees that
the header exists and works, regardless of platform. Similarly,
we can unconditionally assume a compiling <sys/wait.h> (although
the mingw version of this header is not full-featured).
* src/storage/storage_backend.c: Drop useless conditional.
* tests/testutils.c: Likewise.
Add VIR_STORAGE_VOL_CREATE_PREALLOC_METADATA flag to virStorageVolCreateXML
and virStorageVolCreateXMLFrom. This flag requests metadata
preallocation when creating/cloning qcow2 images, resulting in creating
a sparse file with qcow2 metadata. It has only slightly larger disk usage
compared to new image with no allocation, but offers higher performance.
The libvirt coding standard is to use 'function(...args...)'
instead of 'function (...args...)'. A non-trivial number of
places did not follow this rule and are fixed in this patch.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Rename the 'wait' parameter to 'loop'.
This silences the warning:
storage/storage_backend.c:1348:34: error: declaration of 'wait' shadows
a global declaration [-Werror=shadow]
and fixes the build with -Werror.
--
Note: loop is pool backwards.
virStorageVolLookupByPath is an API call that virt-manager uses
quite a bit when dealing with storage. This call use BackendStablePath
which has several usleep() heuristics that can be tripped up
and hang virt-manager for a while.
Current example: an empty mpath pool pointing to /dev/mapper makes
_any_ calls to virStorageVolLookupByPath take 5 seconds.
The sleep heuristics are actually only needed in certain cases
when we are waiting for new storage to appear, so let's skip the
timeout steps when calling from LookupByPath.
We are currently able to work only with non-translated SELinux
contexts, but we are using functions that work with translated
contexts throughout the code. This patch swaps all SELinux context
translation relative calls with their raw sisters to avoid parsing
problems.
The problems can be experienced with mcstrans for example. The
difference is that if you have translations enabled (yum install
mcstrans; service mcstrans start), fgetfilecon_raw() will get you
something like 'system_u:object_r:virt_image_t:s0', whereas
fgetfilecon() will return 'system_u:object_r:virt_image_t:SystemLow'
that we cannot parse.
I was trying to confirm that the _raw variants were here since the dawn of
time, but the only thing I see now is that it was imported together in
the upstream repo [1] from svn, so before 2008.
Thanks Laurent Bigonville for finding this out.
[1] http://oss.tresys.com/git/selinux.git
It might need some time till the LUN's stable path shows up on
initiator host, and although the time window is not foreseeable,
as a better than nothing fix, this patch adds timeout for the
stable path discovery process.
https://www.gnu.org/licenses/gpl-howto.html recommends that
the 'If not, see <url>.' phrase be a separate sentence.
* tests/securityselinuxhelper.c: Remove doubled line.
* tests/securityselinuxtest.c: Likewise.
* globally: s/; If/. If/
Otherwise, in locations like virobject.c where PROBE is used,
for certain configure options, the compiler warns:
util/virobject.c:110:1: error: 'intptr_t' undeclared (first use in this function)
As long as we are making this header always available, we can
clean up several other files.
* src/internal.h (includes): Pull in <stdint.h>.
* src/conf/nwfilter_conf.h: Rely on internal.h.
* src/storage/storage_backend.c: Likewise.
* src/storage/storage_backend.h: Likewise.
* src/util/cgroup.c: Likewise.
* src/util/sexpr.h: Likewise.
* src/util/virhashcode.h: Likewise.
* src/util/virnetdevvportprofile.h: Likewise.
* src/util/virnetlink.h: Likewise.
* src/util/virrandom.h: Likewise.
* src/vbox/vbox_driver.c: Likewise.
* src/xenapi/xenapi_driver.c: Likewise.
* src/xenapi/xenapi_utils.c: Likewise.
* src/xenapi/xenapi_utils.h: Likewise.
* src/xenxs/xenxs_private.h: Likewise.
* tests/storagebackendsheepdogtest.c: Likewise.
The access, birth, modification and change times are added to
storage volumes and corresponding xml representations. This
shows up in the XML in this format:
<timestamps>
<atime>1341933637.027319099</atime>
<mtime>1341933637.027319099</mtime>
</timestamps>
Signed-off-by: Eric Blake <eblake@redhat.com>
Per the FSF address could be changed from time to time, and GNU
recommends the following now: (http://www.gnu.org/licenses/gpl-howto.html)
You should have received a copy of the GNU General Public License
along with Foobar. If not, see <http://www.gnu.org/licenses/>.
This patch removes the explicit FSF address, and uses above instead
(of course, with inserting 'Lesser' before 'General').
Except a bunch of files for security driver, all others are changed
automatically, the copyright for securify files are not complete,
that's why to do it manually:
src/security/security_selinux.h
src/security/security_driver.h
src/security/security_selinux.c
src/security/security_apparmor.h
src/security/security_apparmor.c
src/security/security_driver.c
This patch brings support to manage sheepdog pools and volumes to libvirt.
It uses the "collie" command-line utility that comes with sheepdog for that.
A sheepdog pool in libvirt maps to a sheepdog cluster.
It needs a host and port to connect to, which in most cases
is just going to be the default of localhost on port 7000.
A sheepdog volume in libvirt maps to a sheepdog vdi.
To create one specify the pool, a name and the capacity.
Volumes can also be resized later.
In the volume XML the vdi name has to be put into the <target><path>.
To use the volume as a disk source for virtual machines specify
the vdi name as "name" attribute of the <source>.
The host and port information from the pool are specified inside the host tag.
<disk type='network'>
...
<source protocol="sheepdog" name="vdi_name">
<host name="localhost" port="7000"/>
</source>
</disk>
To work right this patch parses the output of collie,
so it relies on the raw output option. There recently was a bug which caused
size information to be reported wrong. This is fixed upstream already and
will be in the next release.
Signed-off-by: Sebastian Wiedenroth <wiedi@frubar.net>
Update the storage driver to use virReportError instead of
the virStorageReportError custom macro
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
To make it easier to dynamically change the command line ARGV,
switch all storage code over to use virCommandPtr APIs for
running programs
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This patch adds support for a new storage backend with RBD support.
RBD is the RADOS Block Device and is part of the Ceph distributed storage
system.
It comes in two flavours: Qemu-RBD and Kernel RBD, this storage backend only
supports Qemu-RBD, thus limiting the use of this storage driver to Qemu only.
To function this backend relies on librbd and librados being present on the
local system.
The backend also supports Cephx authentication for safe authentication with
the Ceph cluster.
For storing credentials it uses the built-in secret mechanism of libvirt.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
Input to the volume cloning code is a source volume and an XML
descriptor for the new volume. It is possible for the new volume
to have a greater size than source volume, at which point libvirt
will just stick 0s on the end of the new image (for raw format
anyways).
Unfortunately a logic error messed up our tracking of the of the
excess amount that needed to be written: end result is that sparse
clones were made very much non-sparse, and cloning regular disk
images could end up excessively sized (though data unaltered).
Drop the 'remain' variable entriely here since it's redundant, and
track actual allocation directly against the desired 'total'.
virFileOpenAs previously would only try opening a file as the current
user, or as a different user, but wouldn't try both methods in a
single call. This made it cumbersome to use as a replacement for
open(2). Additionally, it had a lot of historical baggage that led to
it being difficult to understand.
This patch refactors virFileOpenAs in the following ways:
* reorganize the code so that everything dealing with both the parent
and child sides of the "fork+setuid+setgid+open" method are in a
separate function. This makes the public function easier to understand.
* Allow a single call to virFileOpenAs() to first attempt the open as
the current user, and if that fails to automatically re-try after
doing fork+setuid (if deemed appropriate, i.e. errno indicates it
would now be successful, and the file is on a networkFS). This makes
it possible (in many, but possibly not all, cases) to drop-in
virFileOpenAs() as a replacement for open(2).
(NB: currently qemuOpenFile() calls virFileOpenAs() twice, once
without forking, then again with forking. That unfortunately can't
be changed without at least some discussion of the ramifications,
because the requested file permissions are different in each case,
which is something that a single call to virFileOpenAs() can't deal
with.)
* Add a flag so that any fchown() of the file to a different uid:gid
is explicitly requested when the function is called, rather than it
being implied by the presence of the O_CREAT flag. This just makes
for less subtle surprises to consumers. (Commit
b1643dc15c added the check for O_CREAT
before forcing ownership. This patch just makes that restriction
more explicit.)
* If either the uid or gid is specified as "-1", virFileOpenAs will
interpret this to mean "the current [gu]id".
All current consumers of virFileOpenAs should retain their present
behavior (after a few minor changes to their setup code and
arguments).
On F16 at least, empty volume groups don't have a directory under /dev.
The directory only appears once a logical volume is created.
This tickles some behavior in BackendStablePath which ends with
libvirt sleeping for 5 seconds while waiting for the directory to appear.
This causes all sorts of problems for the virStorageVolLookupByPath API
which virtinst uses, even if trying to resolve a path that is independent
of the logical pool.
In reality we don't even need to do that checking since logical pools
always have a stable target path. Short circuit the polling in that
case.
Fixes bug 782261
This partly reverts my previous patch f88de3eb. We need to
get file status after open, as given path could have been symlink,
so fstat() will operate on different file than lstat().
Detected by Coverity. Only possible if qemu-img gives bogus output,
but we might as well be robust.
* src/storage/storage_backend.c
(virStorageBackendQEMUImgBackingFormat): Check for strstr failure.
If the regexes supported (?:pvs)?, then we could handle this by
optionally matching but not returning the initial command name. But it
doesn't. So add a new char* argument to
virStorageBackendRunProgRegex(). If that argument is NULL then we act
as usual. Otherwise, if the string at that argument is found at the
start of a returned line, we drop that before running the regex.
With this patch, virt-manager shows me lvs with command_names 1 or 0.
The definitions of PVS_BASE etc may want to be moved into the configure
scripts (though given how PVS is found, IIUC that could only happen if
pvs was a link to pvs_real), but in any case no sense dealing with that
until we're sure this is an ok way to handle it.
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Although we are flushing cache after some critical writes (e.g.
volume creation), after some others we do not (e.g. volume cloning).
This patch fix this issue. That is for volume cloning, writing
header of logical volume, and storage wipe.
No caller was using the flags argument, and this function is internal
only, so we might as well skip it.
* src/util/util.h (safezero): Update signature.
* src/util/util.c (safezero): Update function.
* src/locking/lock_driver_sanlock.c
(virLockManagerSanlockSetupLockspace)
(virLockManagerSanlockCreateLease): Update all callers.
* src/storage/storage_backend.c (createRawFile): Likewise.
virStorageBackendCreateRaw: createRawFile already reported the
exact error.
Before the fix:
error: Failed to create vol vol-create.img
error: cannot create path '/var/lib/libvirt/images/vol-create.img': Unknown error 18446744073709551597
After the fix:
error: Failed to create vol vol-create.img
error: cannot fill file '/var/lib/libvirt/images/vol-create.img': No space left on device
Most of the safezero() implementations return -1 on error,
setting errno. The safezero() impl using posix_fallocate()
though returned a positive errno value on error (due to
the unusual API contract of posix_fallocate() compared to
most syscall APIs).
* src/util/util.c: Ensure safezero() returns -1 and sets
errno on error.
* src/storage/storage_backend.c: Change safezero != 0 to
< 0 for detecting errors
Since directories can be used for <filesystem> passthrough, they are
basically storage volumes.
v2:
Skip ., .., lost+found dirs
v3:
Use gnulib last_component
v4:
Use gnulib "dirname.h", not system <dirname.h>
Don't skip lost+found
Two additional places need initgroups call to properly work in an
environment where the UID is allowed to open/create stuff through its
supplementary groups.
virRunWithHook is now unused, so we can drop it. Tested w/ raw + qcow2
volume creation and copying.
v2:
Use opaque data to skip hook second time around
Simply command building
v3:
Drop explicit FindFileInPath
This patch enables the relative backing file path support provided by
qemu-img create.
If a relative path is specified for the backing file, it is converted
to an absolute path using the storage pool path. The absolute path is
used to verify that the backing file exists. If the backing file exists,
the relative path is allowed and will be provided to qemu-img create.
This patch intentionally doesn't change indentation, in order to
make it easier to review the real changes.
* src/util/util.h (VIR_FILE_OP_RETURN_FD, virFileOperationHook):
Delete.
(virFileOperation): Rename...
(virFileOpenAs): ...and reduce parameters.
* src/util/util.c (virFileOperationNoFork, virFileOperation):
Rename and simplify.
* src/qemu/qemu_driver.c (qemudDomainSaveFlag): Adjust caller.
* src/storage/storage_backend.c (virStorageBackendCreateRaw):
Likewise.
* src/libvirt_private.syms: Reflect rename.
Child processes don't always reach _exit(); if they die from a
signal, then any messages should still be accurate. Most users
either expect a 0 status (thankfully, if status==0, then
WIFEXITED(status) is true and WEXITSTATUS(status)==0 for all
known platforms) or were filtering on WIFEXITED before printing
a status, but a few were missing this check. Additionally,
nwfilter_ebiptables_driver was making an assumption that works
on Linux (where WEXITSTATUS shifts and WTERMSIG just masks)
but fails on other platforms (where WEXITSTATUS just masks and
WTERMSIG shifts).
* src/util/command.h (virCommandTranslateStatus): New helper.
* src/libvirt_private.syms (command.h): Export it.
* src/util/command.c (virCommandTranslateStatus): New function.
(virCommandWait): Use it to also diagnose status from signals.
* src/security/security_apparmor.c (load_profile): Likewise.
* src/storage/storage_backend.c
(virStorageBackendQEMUImgBackingFormat): Likewise.
* src/util/util.c (virExecDaemonize, virRunWithHook)
(virFileOperation, virDirCreate): Likewise.
* daemon/remote.c (remoteDispatchAuthPolkit): Likewise.
* src/nwfilter/nwfilter_ebiptables_driver.c (ebiptablesExecCLI):
Likewise.
For newer qemu-img, the help string for "backing file format" is
"[-F backing_fmt]".
Fix the wrong logic error by commit e997c268.
* src/storage/storage_backend.c
virRun gives pretty useful error output, let's not overwrite it unless there
is a good reason. Some places were providing more information about what
the commands were _attempting_ to do, however that's usually less useful from
a debugging POV than what actually happened.
Use it in all places where a memory or storage request size is converted
to a larger granularity. This avoids requesting too small memory or storage
sizes that could result from the truncation done by a simple division.
This extends the round up fix in 6002e0406c
to the whole codebase.
Instead of reporting errors for odd values in the VMX code round them up.
Update the QEMU Argv tests accordingly as the original memory size 219200
isn't a even multiple of 1024 and is rounded up to 215 megabyte now. Change
it to 219100 and 219136. Use two different values intentionally to make
sure that rounding up works.
Update virsh.pod accordingly, as rounding down and rejecting are replaced
by rounding up.
If there is a dangling symbolic link in filesystem pool, the pool
will fail to start or refresh, this patch is to fix it by ignoring
it with a warning log.
security_context_t happens to be a typedef for char*, and happens to
begin with a string usable as a raw context string. But in reality,
it is an opaque type that may or may not have additional information
after the first NUL byte, where that additional information can
include pointers that can only be freed via freecon().
Proof is from this valgrind run of daemon/libvirtd:
==6028== 839,169 (40 direct, 839,129 indirect) bytes in 1 blocks are definitely lost in loss record 274 of 274
==6028== at 0x4A0515D: malloc (vg_replace_malloc.c:195)
==6028== by 0x3022E0D48C: selabel_open (label.c:165)
==6028== by 0x3022E11646: matchpathcon_init_prefix (matchpathcon.c:296)
==6028== by 0x3022E1190D: matchpathcon (matchpathcon.c:317)
==6028== by 0x4F9D842: SELinuxRestoreSecurityFileLabel (security_selinux.c:382)
800k is a lot of memory to be leaking.
* src/storage/storage_backend.c
(virStorageBackendUpdateVolTargetInfoFD): Avoid leak on error.
* src/security/security_selinux.c
(SELinuxReserveSecurityLabel, SELinuxGetSecurityProcessLabel)
(SELinuxRestoreSecurityFileLabel): Use correct function to free
security_context_t.
Similarly to deprecating close(), I am now deprecating fclose() and
introduce VIR_FORCE_FCLOSE() and VIR_FCLOSE(). Also, fdopen() is replaced with
VIR_FDOPEN().
Most of the files are opened in read-only mode, so usage of
VIR_FORCE_CLOSE() seemed appropriate. Others that are opened in write
mode already had the fclose()< 0 check and I converted those to
VIR_FCLOSE()< 0.
I did not find occurrences of possible double-closed files on the way.
Using automated replacement with sed and editing I have now replaced all
occurrences of close() with VIR_(FORCE_)CLOSE() except for one, of
course. Some replacements were straight forward, others I needed to pay
attention. I hope I payed attention in all the right places... Please
have a look. This should have at least solved one more double-close
error.
Found by clang. Clang complained that virStorageBackendProbeTarget
could dereference NULL if backingStoreFormat was NULL, but since all
callers passed a valid pointer, I added attributes instead of null
checks.
* src/storage/storage_backend.c
(virStorageBackendQEMUImgBackingFormat): Kill dead store.
* src/storage/storage_backend_fs.c (virStorageBackendProbeTarget):
Likewise. Skip null checks, by adding attributes.
One error exit in virStorageBackendCreateBlockFrom was setting the
return value to errno. The convention for volume build functions is to
return 0 on success or -1 on failure. Not only was it not necessary to
set the return value (it defaults to -1, and is set to 0 when
everything has been successfully completed), in the case that some
caller were checking for < 0 rather than != 0, they would incorrectly
believe that it completed successfully.
Previously virStorageBackendCopyToFD would simply return -1 on
error. This made the error return from one of its callers inconsistent
(createRawFileOpHook is supposed to return -errno, but if
virStorageBackendCopyToFD failed, createRawFileOpHook would just
return -1). Since there is a useful errno in every case of error
return from virStorageBackendCopyToFD, and since the other uses of
that function ignore the return code (beyond simply checking to see if
it is < 0), this is a safe change.
virFileOperation previously returned 0 on success, or the value of
errno on failure. Although there are other functions in libvirt that
use this convention, the preferred (and more common) convention is to
return 0 on success and -errno (or simply -1 in some cases) on
failure. This way the check for failure is always (ret < 0).
* src/util/util.c - change virFileOperation and virFileOperationNoFork to
return -errno on failure.
* src/storage/storage_backend.c, src/qemu/qemu_driver.c
- change the hook functions passed to virFileOperation to return
-errno on failure.
Originally the storage volume files were opened with O_DSYNC to make
sure they were flushed to disk immediately. It turned out that this
was extremely slow in some cases, so the O_DSYNC was removed in favor
of just calling fsync() after all the data had been written. However,
this call to fsync was inside the block that is executed to zero-fill
the end of the volume file. In cases where the new volume is copied
from an old volume, and they are the same length, this fsync would
never take place.
Now the fsync is *always* done, unless there is an error (in which
case it isn't important, and is most likely inappropriate.
A missing set of braces around an error condition caused us to skip
zero'ing out the remainder of a new volume file if the new volume was
longer than the original (the goto was supposed to be taken only in
the case of error, but was always being taken).
When creating qcow2 files with a backing store, it is important
to set an explicit format to prevent QEMU probing. The storage
backend was only doing this if it found a 'kvm-img' binary. This
is wrong because plenty of kvm-img binaries don't support an
explicit format, and plenty of 'qemu-img' binaries do support
a format. The result was that most qcow2 files were not getting
a backing store format.
This patch runs 'qemu-img -h' to check for the two support
argument formats
'-o backing_format=raw'
'-F raw'
and use whichever option it finds
* src/storage/storage_backend.c: Query binary to determine
how to set the backing store format
If a directory pool contains pipes or sockets, a pool start can fail or hang:
https://bugzilla.redhat.com/show_bug.cgi?id=589577
We already try to avoid these special files, but only attempt after
opening the path, which is where the problems lie. Unify volume opening
into helper functions, which use the proper open() flags to avoid error,
followed by fstat to validate storage mode.
Previously, virStorageBackendUpdateVolTargetInfoFD attempted to enforce the
storage mode check, but allowed callers to detect this case and silently
continue. In practice, only the FS backend was using this feature, the rest
were treating unknown mode as an error condition. Unfortunately the InfoFD
function wasn't raising an error message here, so error reporting was
busted.
This patch adds 2 functions: virStorageBackendVolOpen, and
virStorageBackendVolOpenModeSkip. The latter retains the original opt out
semantics, the former now throws an explicit error.
This patch maintains the previous volume mode checks: allowing specific
modes for specific pool types requires a bit of surgery, since VolOpen
is called through several different helper functions.
v2: Use ATTRIBUTE_NONNULL. Drop stat check, just open with
O_NONBLOCK|O_NOCTTY.
v3: Move mode check logic back to VolOpen. Use 2 VolOpen functions with
different error semantics.
v4: Make second VolOpen function more extensible. Didn't opt to change
FS backend defaults, this can just be to fix the original bug.
v5: Prefix default flags with VIR_, use ATTRIBUTE_RETURN_CHECK
Volume detection in the scsi backend was duplicating code already
present in storage_backend.c. Let's drop the duplicate code.
Also, change the shared function name to be less generic, and remove
some error squashing in the other call site.
WIN32 is always defined when __MINGW32__ is defined, but the
converse is not true. WIN32 is more generic, if someone were
to ever attempt porting to a microsoft compiler. This does
not affect Cygwin, which intentionally does not define WIN32.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Use more
generic flag macro.
* src/storage/storage_backend.c
(virStorageBackendUpdateVolTargetInfoFD)
(virStorageBackendRunProgRegex): Likewise.
* tools/console.h (vshRunConsole): Likewise.
This allows the config to have a setting that means "leave it alone",
eg when building a pool where the directory already exists the user
may want the current uid/gid of the directory left intact. This
actually gets us back to older behavior - before recent changes to the
pool building code, we weren't as insistent about honoring the uid/gid
settings in the XML, and virt-manager was taking advantage of this
behavior.
As a side benefit, removing calls to getuid/getgid from the XML
parsing functions also seems like a good idea. And having a default
that is different from a common/useful value (0 == root) is a good
thing in general, as it removes ambiguity from decisions (at least one
place in the code was checking for (perms.uid == 0) to see if a
special uid was requested).
Note that this will only affect newly created pools and volumes. Due
to the way that the XML is parsed, then formatted for newly created
volumes, all existing pools/volumes already have an explicit uid and
gid set.
src/conf/storage_conf.c: Remove calls to setuid/setgid for default values
of uid/gid, and set them to -1 instead
src/storage/storage_backend.c:
src/storage/storage_backend_fs.c:
Make account for the new default values of perms.uid
and perms.gid.
Various safezero() implementations used either -1, errno or -errno
return values. This patch fixes them all to return -1 and set errno
appropriately.
There was also a bug in size parameter passed to safewrite() which could
result in an attempt to write gigabytes out of a megabyte buffer.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Recently we introduced O_DSYNC flag when creating raw storage files to
avoid filling all disk cache with dirty pages. However, the patch got
lost when virStorageBackendCreateRaw was reworked using
virFileOperation. Let's use O_DSYNC again.
There were a few operations on the storage volume file that were still
being done as root, which will fail if the file is on a root-squashed
NFS share. The result was that attempts to create a storage volume of
type "raw" on a root-squashed NFS share would fail.
This patch uses the newly introduced "hook" function in
virFileOperation to execute all those file operations in the child
process that's run under the uid that owns the file (and, presumably,
has permission to write to the NFS share)
* src/storage/storage_backend.c: use virFileOperation() in
virStorageBackendCreateRaw, turning virStorageBackendCreateRaw()
into a new createRawFileOpHook() hook
It turns out it is also useful to be able to perform other operations
on a file created while running as a different uid (eg, write things
to that file), and possibly to do this to a file that already
exists. This patch adds an optional hook function to the renamed (for
more accuracy of purpose) virFileOperation; the hook will be called
after the file has been opened (possibly created) and gid/mode
checked/set, before closing it.
As with the other operations on the file, if the VIR_FILE_OP_AS_UID
flag is set, this hook function will be called in the context of a
child process forked from the process that called virFileOperation.
The implication here is that, while all data in memory is available to
this hook function, any modification to that data will not be seen by
the caller - the only indication in memory of what happened in the
hook will be the return value (which the hook should set to 0 on
success, or one of the standard errno values on failure).
Another piece of making the function more flexible was to add an
"openflags" argument. This arg should contain exactly the flags to be
passed to open(2), eg O_RDWR | O_EXCL, etc.
In the process of adding the hook to virFileOperation, I also realized
that the bits to fix up file owner/group/mode settings after creation
were being done in the parent process, which could fail, so I moved
them to the child process where they should be.
* src/util/util.[ch]: rename and rework virFileCreate-->virFileOperation,
and redo flags in virDirCreate
* storage/storage_backend.c, storage/storage_backend_fs.c: update the
calls to virFileOperation/virDirCreate to reflect changes in the API,
but don't yet take advantage of the hook.
The virConnectPtr is no longer required for error reporting since
that is recorded in a thread local. Remove use of virConnectPtr
from all APIs in secret_conf.{h,c} and update all callers to
match
The virConnectPtr is no longer required for error reporting since
that is recorded in a thread local. Remove use of virConnectPtr
from all APIs in storage_conf.{h,c} and storage_encryption_conf.{h,c}
and update all callers to match
When creating preallocated large raw files opening them with O_DSYNC
prevents long delays in reading because cache pages can be immediately
reused without writing them on a disk first.
In order to avoid problems trying to chown files that were created by
root on a root-squashing nfs server, fork a new process that setuid's
to the desired uid before creating the file. (It's only done this way
if the pool containing the new volume is of type 'netfs', otherwise
the old method of creating the file followed by chown() is used.)
This changes the semantics of the "create_func" slightly - previously
it was assumed that this function just created the file, then the
caller would chown it to the desired uid. Now, create_func does both
operations.
There are multiple functions that can take on the role of create_func:
createFileDir - previously called mkdir(), now calls virDirCreate().
virStorageBackendCreateRaw - previously called open(),
now calls virFileCreate().
virStorageBackendCreateQemuImg - use virRunWithHook() to setuid/gid.
virStorageBackendCreateQcowCreate - same.
virStorageBackendCreateBlockFrom - preserve old behavior (but attempt
chown when necessary even if not root)
* src/storage/storage_backend.[ch] src/storage/storage_backend_disk.c
src/storage/storage_backend_fs.c src/storage/storage_backend_logical.c
src/storage/storage_driver.c: change the create_func implementations,
also propagate the pool information to be able to detect NETFS ones.
Replace free(virBufferContentAndReset()) with virBufferFreeAndReset().
Update documentation and replace all remaining calls to free() with
calls to VIR_FREE(). Also add missing calls to virBufferFreeAndReset()
and virReportOOMError() in OOM error cases.
When building with --disable-nls, I got a few messages like this:
storage/storage_backend.c: In function 'virStorageBackendCreateQemuImg':
storage/storage_backend.c:571: warning: format not a string literal and no format arguments
Fix these up.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Rename virStorageVolFormatFileSystem to virStorageFileFormat and
move to src/util/storage_file.[ch]
* src/Makefile.am: add src/util/storage_file.[ch]
* src/conf/storage_conf.[ch]: move enum from here ...
* src/util/storage_file.[ch]: .. to here
* src/libvirt_private.syms: update To/FromString exports
* src/storage/storage_backend.c, src/storage/storage_backend_fs.c,
src/vbox/vbox_tmpl.c: update for above changes