Commit id 'df1011ca8' modified virStorageBackendDiskDeleteVol to use
"dmsetup remove --force" to remove the volume, but left things in an
inconsistent state since the partition still existed on the disk and
only the device mapper device (/dev/dm-#) was removed.
Prior to commit '1895b421' (or '1ffd82bb' and '471e1c4e'), this could
go unnoticed since virStorageBackendDiskRefreshPool wasn't called.
However, the pool would be unusable since the /dev/dm-# device would
be removed even though the partition was not removed unless a multipathd
restart reset the link. That would of course make the volume appear again
in the pool after a refresh or pool start after libvirt reload.
This patch removes the 'dmsetup' logic and re-implements the partition
deletion logic for device mapper devices. The removal of the partition
via 'parted rm --script #' will cause udev device change logic to allow
multipathd to handle removing the dm-* device associated with the partition.
https://bugzilla.redhat.com/show_bug.cgi?id=1265694
Commit id '020135dc' didn't quite get the algorithm correct when a
device mapper source ended with a non numeric value (e.g. ends with
an alphabet value).
This patch modifies the 'part_separator' logic to add the "p" separator
to the attempted target path name only when specified as part_separator='yes'.
For a source name that already ends with a number, the logic doesn't change
as the part separator would need to be there.
For a source name that ends with something other than a number, this allows
the possibility that a "p" separator can be added. The default for one of
these source devices is to not add the separator.
The key for device mapper and the need for a partition separator "p" is
the presence of a number in the last character of the device name link
in /dev/mapper. A name such as "/dev/mapper/mpatha1" would generate
a "/dev/mapper/mpatha1p1" partition, while "/dev/mapper/mpatha" would
generate partition "/dev/mapper/mpatha1". Similarly for a device
mapper entry not using friendly names or an alias, a device such as
"/dev/mapper/3600a0b80005b10ca00005ad656fd8d93" would generate a
paritition "/dev/mapper/3600a0b80005b10ca00005ad656fd8d93p1", while
a device such as "/dev/mapper/3600a0b80005b10ca00005e115729093f" would
generate a partition "/dev/mapper/3600a0b80005b10ca00005e115729093f1".
The long number is the WWID of the device. It's also possible to assign
an alias for a device mapper entry, that alias follows the same rules
with respect to ending with a number or not when adding a "p" to create
the target device path.
Prior to calling the 'refreshPool' during CreatePool or UploadPool
operations, we need to clear the pool; otherwise, the pool will
have duplicated entries.
https://bugzilla.redhat.com/show_bug.cgi?id=1318993
Commit id 'dd519a294' caused a regression cloning a volume into a
logical pool by removing just the 'allocation' adjustment during
storageVolCreateXMLFrom. Combined with the change to not require the
new volume input XML to have a capacity listed (commit id 'e3f1d2a8')
left the possibility that a zero allocation value (e.g., not provided)
would create a thin/sparse logical volume. When a thin lv becomes fully
populated, then LVM sets the partition 'inactive' and the subsequent
fdatasync() fails.
Add a new 'has_allocation' flag to be set at XML parse time to indicate
that allocation was provided. This is done so that if it's not provided
the create-from code uses the capacity value since we document that if
omitted, the volume will be fully allocated at time of creation.
For a logical backend, that creation time is 'createVol', while for a
file backend, creation doesn't set the size, but the 'createRaw' called
during buildVolFrom will decide whether the file is sparse or not based
on the provided capacity and allocation value.
For volume clones that provide different allocation and capacity values
to allow for sparse files, there is no change.
We had both and the only difference was that the latter also included
information about multifunction setting. The problem with that was that
we couldn't use functions made for only one of the structs (e.g.
parsing). To consolidate those two structs, use the one in virpci.h,
include that in domain_conf.h and add the multifunction member in it.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Remove all the plumbing needed for the different qcow-create/kvm-img
non-raw file creation.
We can drop the error messages because CreateQemuImg will thrown an
error for us but with slightly less fidelity (unable to find qemu-img),
which I think is acceptable given the unlikeliness of that error in
practice.
This an ubuntu/debian packaging convention. At one point it may have
been an actually different binary, but at least as of ubuntu precise
(the oldest supported ubuntu distro, released april 2012) kvm-img is
just a symlink to qemu-img for back compat.
I think it's safe to drop support for it
qcow-create was a crippled qemu-img impl that shipped with xen. I
think supporting this was only relevant for really old distros
that didn't have a proper qemu package, like early RHEL5. I think
it's fair to drop support
By default, `zfs create -V ...` reserves space for the entire volsize,
plus some extra (which attempts to account for overhead).
If `zfs create -s -V ...` is used instead, zvols are (fully) sparse.
A middle ground (partial allocation) can be achieved with
`zfs create -s -o refreservation=... -V ...`. Both libvirt and ZFS
support this approach, so the ZFS storage backend should support it.
Signed-off-by: Richard Laager <rlaager@wiktel.com>
In case of ploop volume, target path of the volume is the path to the
directory that contains image file named root.hds and DiskDescriptor.xml.
While using uploadVol and downloadVol callbacks we need to open root.hds
itself.
Upload or download operations with ploop volume are only allowed when
images do not have snapshots. Otherwise operation fails.
Signed-off-by: Olga Krishtal <okrishtal@virtuozzo.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Refreshes meta-information such as allocation, capacity, format, etc.
Ploop volumes differ from other volume types. Path to volume is the path
to directory with image file root.hds and DiskDescriptor.xml.
https://openvz.org/Ploop/format
Due to this fact, operations of opening the volume have to be done once
again. get the information.
To decide whether the given volume is ploops one, it is necessary to check
the presence of root.hds and DiskDescriptor.xml files in volumes' directory.
Only in this case the volume can be manipulated as the ploops one.
Such strategy helps us to resolve problems that might occure, when we
upload some other volume type from ploop source.
Signed-off-by: Olga Krishtal <okrishtal@virtuozzo.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Recursively deletes whole directory of a ploop volume.
To delete ploop image it has to be unmounted.
Signed-off-by: Olga Krishtal <okrishtal@virtuozzo.com>
These callbacks let us to create ploop volumes in dir, fs and etc. pools.
If a ploop volume was created via buildVol callback, then this volume
is an empty ploop device with DiskDescriptor.xml.
If the volume was created via .buildFrom - then its content is similar to
input volume content.
Signed-off-by: Olga Krishtal <okrishtal@virtuozzo.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Ploop image consists of directory with two files: ploop image itself,
called root.hds and DiskDescriptor.xml that contains information about
ploop device: https://openvz.org/Ploop/format.
Such volume are difficult to manipulate in terms of existing volume types
because they are neither a single files nor a directory.
This patch introduces new volume type - ploop. This volume type is used
by ploop volume's exclusively.
Signed-off-by: Olga Krishtal <okrishtal@virtuozzo.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
We use device-mapper to enumerate all dm devices, and filter out
the list of multipath devices by checking the target_type string
name. The code however cancels all scanning if we encounter
target_type=NULL
I don't know how to reproduce that situation, but a user was hitting
it in their setup, and inspecting the lvm2/device-mapper code shows
many places where !target_type is explicitly ignored and processing
continues on to the next device. So I think we should do the same
https://bugzilla.redhat.com/show_bug.cgi?id=1069317
I tried compiling libvirt with older gcc and probably because I used
different configure options I got some shadowed declarations.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
If the pool creation thread happens to detect the luns in
the scsi target, the size parameters will be calculated as
part of the refreshPool called from storagePoolCreate().
This means the virStoragePoolFCRefreshThread (commit id
'512b874') waiting to run and "refresh" the pool will
essentially double the allocation and capacity values.
A separate refresh would correct the values.
To avoid this, the FCRefreshThread needs to reinitialize
the pool size values prior to calling virStorageBackendSCSIFindLUs
which eventually calls virStorageBackendSCSINewLun and
updates the size values for each volume found.
After the recent commits the build didn't work for me. Fix it by
using size_t as the callback argument is using and the correct
formatter. The attempted fixup to use %llu as a formatter was wrong.
This reverts commit bb5f2dc91f.
The "if (vol->target.format != VIR_STORAGE_FILE_RAW)" check in the
createVol backend. This check is bogus because virStorageVolDefParseXML()
in conf/storage_conf.c sets target.format only if volOptions in
virStoragePoolTypeInfo has formatFromString set, and that's not the
case the zfs backend.
So the check always fails and breaks volume creation.
This reverts commit 6682d6219d.
The "if (vol->target.format != VIR_STORAGE_FILE_RAW)" check in the
createVol backend. This check is bogus because virStorageVolDefParseXML()
in conf/storage_conf.c sets target.format only if volOptions in
virStoragePoolTypeInfo has formatFromString set, and that's not the
case the logical backend.
So the check always fails and breaks volume creation.
While trying to build with -Os couple of compile errors showed
up.
conf/domain_conf.c: In function 'virDomainChrRemove':
conf/domain_conf.c:13666:24: error: 'ret' may be used uninitialized in this function [-Werror=maybe-uninitialized]
virDomainChrDefPtr ret, **arrPtr = NULL;
^
Compiler fails to see that @ret is used only if set in the loop,
but whatever, there's no harm in initializing the variable.
In vboxAttachDrivesNew and _vboxAttachDrivesOld compiler thinks
that @rc may be used uninitialized. Well, not directly, but maybe
after some optimization. Yet again, no harm in initializing a
variable.
In file included from ./util/virthread.h:26:0,
from ./datatypes.h:28,
from vbox/vbox_tmpl.c:43,
from vbox/vbox_V3_1.c:37:
vbox/vbox_tmpl.c: In function '_vboxAttachDrivesOld':
./util/virerror.h:181:5: error: 'rc' may be used uninitialized in this function [-Werror=maybe-uninitialized]
virReportErrorHelper(VIR_FROM_THIS, code, __FILE__, \
^
In file included from vbox/vbox_V3_1.c:37:0:
vbox/vbox_tmpl.c:1041:14: note: 'rc' was declared here
nsresult rc;
^
Yet again, one uninitialized variable:
qemu/qemu_driver.c: In function 'qemuDomainBlockCommit':
qemu/qemu_driver.c:17194:9: error: 'baseSource' may be used uninitialized in this function [-Werror=maybe-uninitialized]
qemuDomainPrepareDiskChainElement(driver, vm, baseSource,
^
And another one:
storage/storage_backend_logical.c: In function 'virStorageBackendLogicalMatchPoolSource.isra.2':
storage/storage_backend_logical.c:618:33: error: 'thisSource' may be used uninitialized in this function [-Werror=maybe-uninitialized]
thisSource->devices[j].path))
^
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Found by inspection - after calling virStoragePoolObjAssignDef the
pool is part of the driver->pools.objs list and the failure path
for the virStoragePoolObjSaveDef will use virStoragePoolObjRemove
to remove the pool from the objs list which will unlock and free
the pool pointer (as pools->objs[i] during the loop). Since the call
doesn't clear the pool address from the callee, we need to set it
to NULL; otherwise, the virStoragePoolObjUnlock in the cleanup: code
will fail miserably.
Generates a false positive for Coverity, but it turns out there's no need
to check ret == -1 since if VIR_APPEND_ELEMENT is successful, the local
vol pointer is cleared anyway.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Found by my Coverity checker - virCheckFlags call could return -1, but
not virCommandFree(destroy_cmd).
Signed-off-by: John Ferlan <jferlan@redhat.com>
%zu is not always synonymous with uint64_t; on 32-bit machines,
size_t is only 32 bits. Prefer "%lld"/'unsigned long long' when
the variable is under our control, and "%"PRIu64 when we are
stuck with 'uint64_t' from RBD.
Fixes errors such as:
../../src/storage/storage_backend_rbd.c: In function 'virStorageBackendRBDVolWipe':
../../src/storage/storage_backend_rbd.c:1281:15: error: format '%zu' expects argument of type 'size_t', but argument 8 has type 'uint64_t {aka long long unsigned int}' [-Werror=format=]
VIR_DEBUG("Need to wipe %zu bytes from RBD image %s/%s",
^
../../src/util/virlog.h:90:73: note: in definition of macro 'VIR_DEBUG_INT'
virLogMessage(src, VIR_LOG_DEBUG, filename, linenr, funcname, NULL, __VA_ARGS__)
^
../../src/storage/storage_backend_rbd.c:1281:5: note: in expansion of macro 'VIR_DEBUG'
VIR_DEBUG("Need to wipe %zu bytes from RBD image %s/%s",
^
Signed-off-by: Eric Blake <eblake@redhat.com>
Checking whether x > 0 before looping over [0..x] items doesn't make
sense and multi-line body must have curly brackets around it.
Best viewed with '-w'.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This reverts commit 611a278fa4.
According to the original commit message, this is dead code:
It is highly unlikely that a backend will know how to create a
volume from a different volume (buildVolFrom) and not know how to
create an empty volume (createVol).
Since Ceph version Infernalis (9.2.0) the new fast-diff mechanism
of RBD allows for querying actual volume usage.
Prior to this version there was no easy and fast way to query how
much allocation a RBD volume had inside a Ceph cluster.
To use the fast-diff feature it needs to be enabled per RBD image
and is only supported by Ceph cluster running version Infernalis
(9.2.0) or newer.
Without the fast-diff feature enabled libvirt will report an allocation
identical to the image capacity. This is how libvirt behaves currently.
'virsh vol-info rbd/image2' might output for example:
Name: image2
Type: network
Capacity: 1,00 GiB
Allocation: 124,00 MiB
Newly created volumes will have the fast-diff feature enabled if the
backing Ceph cluster supports it.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
In commit 0b15f920 there is a #ifdef which requires LIBRBD_VERSION_CODE
266 or newer for rbd_diff_iterate2()
rbd_diff_iterate2() is available since 266, so this if-statement should
require anything newer than 265.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
As more and more features are added to RBD volumes we will need to
call this method more often.
By moving it into a internal function we can re-use code inside the
storage backend.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
It is highly unlikely that a backend will know how to create a
volume from a different volume (buildVolFrom) and not know how to
create an empty volume (createVol). But:
1) we call the function without any prior check so if that's the
case we would SIGSEGV immediatelly
2) it's better to be safe than sorry.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Firstly, we realloc internal list to hold new item (=volume that
will be potentially created) and then we check whether we
actually know how to create it. If we don't we consume more
memory than we really need for no good reason.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Do not store the return value of called functions in the same variable
as the (future) return value of the current function.
This makes tracking the origin of the value easier and reduces
the chance of introducing a new point of exit without resetting
the return value back to -1.
The virStringListLength function does not ever modify the passed
string list. It merely counts the items in it. Make sure that we
reflect this bit in the function header.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
(crobinso: fix up spacing and squash in sheepdog bit suggested
by Andrea)
This was only used in debugging messages and not in any real code.
Ceph/RBD uses uint64_t for sizes internally and they can be printed
with %zu without any need for casting.
Signed-off-by: Wido den Hollander <wido@widodh.nl>