The SCSI volumes get a better 'key' field based on the fully
qualified volume path. All SCSI volumes have a unique serial
available in hardware which can be obtained by sending a
suitable SCSI command. Call out to udev's 'scsi_id' command
to fetch this value
* src/storage/storage_backend_scsi.c: Improve volume key
field value stability and uniqueness
This patch enables the relative backing file path support provided by
qemu-img create.
If a relative path is specified for the backing file, it is converted
to an absolute path using the storage pool path. The absolute path is
used to verify that the backing file exists. If the backing file exists,
the relative path is allowed and will be provided to qemu-img create.
Even with -Wuninitialized (which is part of autobuild.sh
--enable-compile-warnings=error), gcc does NOT catch this
use of an uninitialized variable:
{
if (cond)
goto error;
int a = 1;
error:
printf("%d", a);
}
which prints 0 (supposing the stack started life wiped) if
cond was true. Clang will catch it, but we don't use clang
as often. Using gcc -Wjump-misses-init catches it, but also
gives false positives:
{
if (cond)
goto error;
int a = 1;
return a;
error:
return 0;
}
Here, a was never used in the scope of the error block, so
declaring it after goto is technically fine (and clang agrees).
However, given that our HACKING already documents a preference
to C89 decl-before-statement, the false positive warning is
enough of a prod to comply with HACKING.
[Personally, I'd _really_ rather use C99 decl-after-statement
to minimize scope, but until gcc can efficiently and reliably
catch scoping and uninitialized usage bugs, I'll settle with
the compromise of enforcing a coding standard that happens to
reject false positives if it can also detect real bugs.]
* acinclude.m4 (LIBVIRT_COMPILE_WARNINGS): Add -Wjump-misses-init.
* src/util/util.c (__virExec): Adjust offenders.
* src/conf/domain_conf.c (virDomainTimerDefParseXML): Likewise.
* src/remote/remote_driver.c (doRemoteOpen): Likewise.
* src/phyp/phyp_driver.c (phypGetLparNAME, phypGetLparProfile)
(phypGetVIOSFreeSCSIAdapter, phypVolumeGetKey)
(phypGetStoragePoolDevice)
(phypVolumeGetPhysicalVolumeByStoragePool)
(phypVolumeGetPath): Likewise.
* src/vbox/vbox_tmpl.c (vboxNetworkUndefineDestroy)
(vboxNetworkCreate, vboxNetworkDumpXML)
(vboxNetworkDefineCreateXML): Likewise.
* src/xenapi/xenapi_driver.c (getCapsObject)
(xenapiDomainDumpXML): Likewise.
* src/xenapi/xenapi_utils.c (createVMRecordFromXml): Likewise.
* src/security/security_selinux.c (SELinuxGenNewContext):
Likewise.
* src/qemu/qemu_command.c (qemuBuildCommandLine): Likewise.
* src/qemu/qemu_hotplug.c (qemuDomainChangeEjectableMedia):
Likewise.
* src/qemu/qemu_process.c (qemuProcessWaitForMonitor): Likewise.
* src/qemu/qemu_monitor_text.c (qemuMonitorTextGetPtyPaths):
Likewise.
* src/qemu/qemu_driver.c (qemudDomainShutdown)
(qemudDomainBlockStats, qemudDomainMemoryPeek): Likewise.
* src/storage/storage_backend_iscsi.c
(virStorageBackendCreateIfaceIQN): Likewise.
* src/node_device/node_device_udev.c (udevProcessPCI): Likewise.
Currently libvirt's default logging is limited and it is difficult to
determine what was happening when a proglem occurred (especially on a
machines where one don't know the detail.) This patch helps to do that
by making additional logging available for the following events:
creating/defining/undefining domains
creating/defining/undefining/starting/stopping networks
creating/defining/undefining/starting/stopping storage pools
creating/defining/undefining/starting/stopping storage volumes.
* AUTHORS: add Naoya Horiguchi
* src/network/bridge_driver.c src/qemu/qemu_driver.c
src/storage/storage_driver.c: provide more VIR_INFO logging
New APIs are added allowing streaming of content to/from
storage volumes.
* include/libvirt/libvirt.h.in: Add virStorageVolUpload and
virStorageVolDownload APIs
* src/driver.h, src/libvirt.c, src/libvirt_public.syms: Stub
code for new APIs
* src/storage/storage_driver.c, src/esx/esx_storage_driver.c:
Add dummy entries in driver table for new APIs
This patch intentionally doesn't change indentation, in order to
make it easier to review the real changes.
* src/util/util.h (VIR_FILE_OP_RETURN_FD, virFileOperationHook):
Delete.
(virFileOperation): Rename...
(virFileOpenAs): ...and reduce parameters.
* src/util/util.c (virFileOperationNoFork, virFileOperation):
Rename and simplify.
* src/qemu/qemu_driver.c (qemudDomainSaveFlag): Adjust caller.
* src/storage/storage_backend.c (virStorageBackendCreateRaw):
Likewise.
* src/libvirt_private.syms: Reflect rename.
Child processes don't always reach _exit(); if they die from a
signal, then any messages should still be accurate. Most users
either expect a 0 status (thankfully, if status==0, then
WIFEXITED(status) is true and WEXITSTATUS(status)==0 for all
known platforms) or were filtering on WIFEXITED before printing
a status, but a few were missing this check. Additionally,
nwfilter_ebiptables_driver was making an assumption that works
on Linux (where WEXITSTATUS shifts and WTERMSIG just masks)
but fails on other platforms (where WEXITSTATUS just masks and
WTERMSIG shifts).
* src/util/command.h (virCommandTranslateStatus): New helper.
* src/libvirt_private.syms (command.h): Export it.
* src/util/command.c (virCommandTranslateStatus): New function.
(virCommandWait): Use it to also diagnose status from signals.
* src/security/security_apparmor.c (load_profile): Likewise.
* src/storage/storage_backend.c
(virStorageBackendQEMUImgBackingFormat): Likewise.
* src/util/util.c (virExecDaemonize, virRunWithHook)
(virFileOperation, virDirCreate): Likewise.
* daemon/remote.c (remoteDispatchAuthPolkit): Likewise.
* src/nwfilter/nwfilter_ebiptables_driver.c (ebiptablesExecCLI):
Likewise.
Currently a single storage volume with a broken backing file will disable the
whole storage pool. This can happen when the backing file is on some
unavailable network storage or if the backing volume is deleted, while the
storage volumes using it remain.
Since the storage pool can not be re-activated, re-creating the missing
or deleting the now useless volumes using libvirt only is not possible.
Fixing this is a little bit tricky:
1. virStorageBackendProbeTarget() only detects the missing backing file,
if the backing file format is not explicitly specified. If the
backing file is created using
kvm-img create -f qcow2 -o backing_fmt=qcow2,backing_file=... ...
no error is detected at this stage.
The new return code -3 signals that the backing file could not be
opened.
2. The backingStore.format must be >= 0, since values < 0 would break
virStorageVolTargetDefFormat() when dumping the XML data such as
<format type='...'/>
Because of this the format is faked as VIR_STORAGE_FILE_RAW.
3. virStorageBackendUpdateVolTargetInfo() always opens the backing file
and thus always detects a missing backing file.
Since it "only" updates the capacity, allocation, owner, group, mode
and SELinux label, just ignore errors at this stage, print an error
message and continue.
4. Using vol-dump on a broken volume still doesn't work, but at least
vol-destroy and pool-refresh do work now.
To reproduce:
dir=$(mktemp -d)
virsh pool-create-as tmp dir '' '' '' '' "$dir"
virsh vol-create-as --format qcow2 tmp back 1G
virsh vol-create-as --format qcow2 --backing-vol-format qcow2 --backing-vol back tmp cow 1G
virsh vol-delete --pool tmp back
virsh pool-refresh tmp
After the last step, the pool will be gone (because it was not persistent). As
long as the now broken image stays in the directory, you will not be able to
re-create or re-start the pool.
Signed-off-by: Philipp Hahn <hahn@univention.de>
For newer qemu-img, the help string for "backing file format" is
"[-F backing_fmt]".
Fix the wrong logic error by commit e997c268.
* src/storage/storage_backend.c
virRun gives pretty useful error output, let's not overwrite it unless there
is a good reason. Some places were providing more information about what
the commands were _attempting_ to do, however that's usually less useful from
a debugging POV than what actually happened.
Done mechanically with:
$ git grep -l '\bDEBUG0\? *(' | xargs -L1 sed -i 's/\bDEBUG0\? *(/VIR_&/'
followed by manual deletion of qemudDebug in daemon/libvirtd.c, along
with a single 'make syntax-check' fallout in the same file, and the
actual deletion in src/util/logging.h.
* src/util/logging.h (DEBUG, DEBUG0): Delete.
* daemon/libvirtd.h (qemudDebug): Likewise.
* global: Change remaining clients over to VIR_DEBUG counterpart.
The name convention of device mapper disk is different, and 'parted'
can't be used to delete a device mapper disk partition. e.g.
Name Path
-----------------------------------------
3600a0b80005ad1d7000093604cae912fp1 /dev/mapper/3600a0b80005ad1d7000093604cae912fp1
Error: Expecting a partition number.
This patch introduces 'dmsetup' to fix it.
Changes:
- New function "virIsDevMapperDevice" in "src/utils/utils.c"
- remove "is_dm_device" in "src/storage/parthelper.c", use
"virIsDevMapperDevice" instead.
- Requires "device-mapper" for 'with-storage-disk" in "libvirt.spec.in"
- Check "dmsetup" in 'configure.ac' for "with-storage-disk"
- Changes on "src/Makefile.am" to link against libdevmapper
- New entry for "virIsDevMapperDevice" in "src/libvirt_private.syms"
Changes from v1 to v3:
- s/virIsDeviceMapperDevice/virIsDevMapperDevice/g
- replace "virRun" with "virCommand"
- sort the list of util functions in "libvirt_private.syms"
- ATTRIBUTE_NONNULL(1) for virIsDevMapperDevice declaration.
e.g.
Name Path
-----------------------------------------
3600a0b80005ad1d7000093604cae912fp1 /dev/mapper/3600a0b80005ad1d7000093604cae912fp1
Vol /dev/mapper/3600a0b80005ad1d7000093604cae912fp1 deleted
Name Path
-----------------------------------------
The SCSI storage backend leaks a string containing the pathname
for each block device it discovers
* src/storage/storage_backend_scsi.c: Free the device name
"virStorageBackendCreateVols":
"names->next" serves as condition expression for "do...while",
however, "names" was shifted before, it then results in one less
loop, and thus, one less volume will be created for mpath pool,
the patch is to fix it.
* src/storage/storage_backend_mpath.c
Use it in all places where a memory or storage request size is converted
to a larger granularity. This avoids requesting too small memory or storage
sizes that could result from the truncation done by a simple division.
This extends the round up fix in 6002e0406c
to the whole codebase.
Instead of reporting errors for odd values in the VMX code round them up.
Update the QEMU Argv tests accordingly as the original memory size 219200
isn't a even multiple of 1024 and is rounded up to 215 megabyte now. Change
it to 219100 and 219136. Use two different values intentionally to make
sure that rounding up works.
Update virsh.pod accordingly, as rounding down and rejecting are replaced
by rounding up.
If vol->capacity is odd, the capacity will be rounded down
by devision, this patch is to round it up instead of rounding
down, to be safer in case of one writes to the volume with the
size he used to create.
- src/storage/storage_backend_logical.c: make sure size is not rounded down
If there is a dangling symbolic link in filesystem pool, the pool
will fail to start or refresh, this patch is to fix it by ignoring
it with a warning log.
* .gnulib: Update to latest, for at least a stdint.h fix
* src/storage/storage_driver.c (storageVolumeZeroSparseFile)
(storageWipeExtent): Use better type, although it still triggers
spurious -Wformat warning on MacOS's gcc.
security_context_t happens to be a typedef for char*, and happens to
begin with a string usable as a raw context string. But in reality,
it is an opaque type that may or may not have additional information
after the first NUL byte, where that additional information can
include pointers that can only be freed via freecon().
Proof is from this valgrind run of daemon/libvirtd:
==6028== 839,169 (40 direct, 839,129 indirect) bytes in 1 blocks are definitely lost in loss record 274 of 274
==6028== at 0x4A0515D: malloc (vg_replace_malloc.c:195)
==6028== by 0x3022E0D48C: selabel_open (label.c:165)
==6028== by 0x3022E11646: matchpathcon_init_prefix (matchpathcon.c:296)
==6028== by 0x3022E1190D: matchpathcon (matchpathcon.c:317)
==6028== by 0x4F9D842: SELinuxRestoreSecurityFileLabel (security_selinux.c:382)
800k is a lot of memory to be leaking.
* src/storage/storage_backend.c
(virStorageBackendUpdateVolTargetInfoFD): Avoid leak on error.
* src/security/security_selinux.c
(SELinuxReserveSecurityLabel, SELinuxGetSecurityProcessLabel)
(SELinuxRestoreSecurityFileLabel): Use correct function to free
security_context_t.
The SCSI volumes currently get a name like '17:0:0:1' based
on $host:$bus:$target:$lun. The names are intended to be unique
per pool and stable across pool restarts. The inclusion of the
$host component breaks this, because the $host number for iSCSI
pools is dynamically allocated by the kernel at time of login.
This changes the name to be 'unit:0:0:1', ie removes the leading
host component. The 'unit:' prefix is just to ensure the volume
name doesn't start with a number and make it clearer when seen
out of context.
* src/storage/storage_backend_scsi.c: Improve volume name
field value stability and uniqueness
Many operations are not valid on inactive storage pools. The
storage driver is currently returning VIR_ERR_INTERNAL_ERROR
in these cases, rather than the more suitable error code
VIR_ERR_OPERATION_INVALID
* src/storage/storage_driver.c: Fix error code when pool
is not active
When libvirt starts up all storage pools default to the inactive
state, even if the underlying storage is already active on the
host. This introduces a new API into the internal storage backend
drivers that checks whether a storage pool is already active. If
the pool is active at libvirtd startup, the volume list will be
immediately populated.
* src/storage/storage_backend.h: New internal API for checking
storage pool state
* src/storage/storage_driver.c: Check whether a pool is active
upon driver startup
* src/storage/storage_backend_fs.c, src/storage/storage_backend_iscsi.c,
src/storage/storage_backend_logical.c, src/storage/storage_backend_mpath.c,
src/storage/storage_backend_scsi.c: Add checks for pool state
Since the previous patch added support for parsing the output of
the 'sendtargets' command, it is now trivial to support the
storage pool discovery API.
Given a hostname and optional portnumber and initiator IQN,
the code can return a full list of storage pool source docs,
each one representing a iSCSI target.
* src/storage/storage_backend_iscsi.c: Wire up target
auto-discovery
The Linux iSCSI initiator toolchain has the dubious feature that
if you ever run the 'sendtargets' command to merely query what
targets are available from a server, the results will be recorded
in /var/lib/iscsi. Any time the '/etc/init.d/iscsi' script runs
in the future, it will then automatically login to all those
targets. /etc/init.d/iscsi is automatically run whenever a NIC
comes online.
So from the moment you ask a server what targets are available,
your client will forever more automatically try to login to all
targets without ever asking if you actually want it todo this.
To stop this stupid behaviour, we need to run
iscsiadm --portal $PORTAL --target $TARGET
--op update --name node.startup --value manual
For every target on the server.
* src/storage/storage_backend_iscsi.c: Disable automatic login
for targets found as a result of a 'sendtargets' command
The following series of patches are adding significant
extra functionality to the iSCSI driver. THe current
internal helper methods are not sufficiently flexible
to cope with these changes. This patch refactors the
code to avoid needing to have a virStoragePoolObjPtr
instance as a parameter, instead passing individual
target, portal and initiatoriqn parameters.
It also removes hardcoding of port 3260 in the portal
address, instead using the XML value if any.
* src/storage/storage_backend_iscsi.c: Refactor internal
helper methods
Per the gettext developer:
http://lists.gnu.org/archive/html/bug-gnu-utils/2010-10/msg00019.htmlhttp://lists.gnu.org/archive/html/bug-gnu-utils/2010-10/msg00021.html
gettext() doesn't work correctly on all platforms unless you have
called setlocale(). Furthermore, gnulib's gettext.h has provisions
for setting up a default locale, which is the preferred method for
libraries to use gettext without having to call textdomain() and
override the main program's default domain (virInitialize already
calls bindtextdomain(), but this is insufficient without the
setlocale() added in this patch; and a redundant bindtextdomain()
in this patch doesn't hurt, but serves as a good example for other
packages that need to bind a second translation domain).
This patch is needed to silence a new gnulib 'make syntax-check'
rule in the next patch.
* daemon/libvirtd.c (main): Setup locale and gettext.
* src/lxc/lxc_controller.c (main): Likewise.
* src/security/virt-aa-helper.c (main): Likewise.
* src/storage/parthelper.c (main): Likewise.
* tools/virsh.c (main): Fix exit status.
* src/internal.h (DEFAULT_TEXT_DOMAIN): Define, for gettext.h.
(_): Simplify definition accordingly.
* po/POTFILES.in: Add src/storage/parthelper.c.
Similarly to deprecating close(), I am now deprecating fclose() and
introduce VIR_FORCE_FCLOSE() and VIR_FCLOSE(). Also, fdopen() is replaced with
VIR_FDOPEN().
Most of the files are opened in read-only mode, so usage of
VIR_FORCE_CLOSE() seemed appropriate. Others that are opened in write
mode already had the fclose()< 0 check and I converted those to
VIR_FCLOSE()< 0.
I did not find occurrences of possible double-closed files on the way.
Using automated replacement with sed and editing I have now replaced all
occurrences of close() with VIR_(FORCE_)CLOSE() except for one, of
course. Some replacements were straight forward, others I needed to pay
attention. I hope I payed attention in all the right places... Please
have a look. This should have at least solved one more double-close
error.
The 2nd and 3rd hunk show the only double-closed file descriptor code part that I found while trying to clean up close(). The first hunk seems a harmless cleanup in that same file.
Found by clang. Clang complained that virStorageBackendProbeTarget
could dereference NULL if backingStoreFormat was NULL, but since all
callers passed a valid pointer, I added attributes instead of null
checks.
* src/storage/storage_backend.c
(virStorageBackendQEMUImgBackingFormat): Kill dead store.
* src/storage/storage_backend_fs.c (virStorageBackendProbeTarget):
Likewise. Skip null checks, by adding attributes.
One error exit in virStorageBackendCreateBlockFrom was setting the
return value to errno. The convention for volume build functions is to
return 0 on success or -1 on failure. Not only was it not necessary to
set the return value (it defaults to -1, and is set to 0 when
everything has been successfully completed), in the case that some
caller were checking for < 0 rather than != 0, they would incorrectly
believe that it completed successfully.
virDirCreate also previously returned 0 on success and errno on
failure. This makes it fit the recommended convention of returning 0
on success, -errno (ie a negative number) on failure.
Previously virStorageBackendCopyToFD would simply return -1 on
error. This made the error return from one of its callers inconsistent
(createRawFileOpHook is supposed to return -errno, but if
virStorageBackendCopyToFD failed, createRawFileOpHook would just
return -1). Since there is a useful errno in every case of error
return from virStorageBackendCopyToFD, and since the other uses of
that function ignore the return code (beyond simply checking to see if
it is < 0), this is a safe change.
virFileOperation previously returned 0 on success, or the value of
errno on failure. Although there are other functions in libvirt that
use this convention, the preferred (and more common) convention is to
return 0 on success and -errno (or simply -1 in some cases) on
failure. This way the check for failure is always (ret < 0).
* src/util/util.c - change virFileOperation and virFileOperationNoFork to
return -errno on failure.
* src/storage/storage_backend.c, src/qemu/qemu_driver.c
- change the hook functions passed to virFileOperation to return
-errno on failure.
Originally the storage volume files were opened with O_DSYNC to make
sure they were flushed to disk immediately. It turned out that this
was extremely slow in some cases, so the O_DSYNC was removed in favor
of just calling fsync() after all the data had been written. However,
this call to fsync was inside the block that is executed to zero-fill
the end of the volume file. In cases where the new volume is copied
from an old volume, and they are the same length, this fsync would
never take place.
Now the fsync is *always* done, unless there is an error (in which
case it isn't important, and is most likely inappropriate.