Right now, grabbing blockinfo always calls stat on the disk, then
opens the image to determine the capacity, using a throw-away
virStorageSourcePtr. This has a couple of drawbacks:
1. We are calling stat and opening a file on every invocation of
the API. However, there are cases where the stats should NOT be
changing between successive calls (if a domain is running, no
one should be changing the physical size of a block device or raw
image behind our backs; capacity of read-only files should not
be changing; and we are the gateway to the block-resize command
to know when the capacity of read-write files should be changing).
True, we still have to use stat in some cases (a sparse raw file
changes allocation if it is read-write and the amount of holes is
changing, and a read-write qcow2 image stored in a file changes
physical size if it was not fully pre-allocated). But for
read-only images, even this should be something we can remember
from the previous time, rather than repeating every call.
2. We want to enhance the power of virDomainListGetStats, by
sharing code. But we already have a virStorageSourcePtr for
each disk, and it would be easier to reuse the common structure
than to have to worry about the one-off virDomainBlockInfoPtr.
While this patch does not optimize reuse of information in point
1, it does get us closer to being able to do so; by updating a
structure that survives between consecutive calls.
* src/util/virstoragefile.h (_virStorageSource): Add physical, to
mirror virDomainBlockInfo; rearrange fields to match public struct.
(virStorageSourceCopy): Copy the new field.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Store into
storage source, then copy to block info.
Signed-off-by: Eric Blake <eblake@redhat.com>
Currently virStorageFileResize() function uses build conditionals to
choose either the posix_fallocate() or syscall(SYS_fallocate) with no
fallback in order to preallocate the space in the newly resized file.
Since the safezero code has a similar set of conditionals modify the
resize and safezero code in order to allow the resize logic to make use
of safezero to unify the look/feel of the code paths.
Add a new boolean (resize) to safezero() to make the optional decision
whether to try syscall(SYS_fallocate) if the posix_fallocate fails because
HAVE_POSIX_FALLOCATE is not defined (eg, return -1 and errno == 0).
Create a local safezero_sys_fallocate in order to handle the resize
code paths that support that. If not present, the set errno = ENOSYS
in order to allow the caller to handle the failure scenarios.
Signed-off-by: John Ferlan <jferlan@redhat.com>
To be able to express some use cases of the RBD backing with libvirt, we
need to be able to specify a config file for the RBD client to qemu as
that is one of the commonly used options.
Some storage systems have internal support for snapshots. Libvirt should
be able to select a correct snapshot when starting a VM.
This patch adds a XML element to select a storage source snapshot for
the RBD protocol which supports this feature.
As we now have a common function to parse backing store string for RBD
backing store we can reuse it in the backing store walker so that we
don't fail on files backed by RBD storage.
This patch also adds a few tests to verify that the parsing works as
expected.
To allow reuse this non-trivial parser code in the backing store parser
this part of the command line parser needs to be split out into a
separate funciton.
If there are no hosts for a storage source virStorageSourceCopy and
virStorageSourceNewFromBackingRelative would try to copy them anyways.
As the success of virStorageNetHostDefCopy is determined by returning
a pointer and malloc of 0 elements might return NULL according to the
implementation, the result of the copy function may vary.
Fix this by copying the hosts array only if there are hosts defined.
When creating a disk image snapshot the libvirt code would blindly copy
the parents label to the newly created image. This runs into problems
when you start a VM from an image hosted on NFS (or other storage system
that doesn't support selinux labels) and the snapshot destination is on
a storage system that does support selinux labels. Libvirt's code in
that case generates a different security label for the image hosted on
NFS. This label is valid only for NFS images and doesn't allow access in
case of a locally stored image.
To fix this issue libvirt needs to refrain from copying security
information in cases where the default domain seclabel is a better
choice.
This patch repurposes the now unused @force argument of
virStorageSourceInitChainElement to denote whether a copy of the
security labelling stuff should be attempted or not. This allows to
fine-control the copy operation for cases where we need to keep the
label of the old disk vs. the cases where we need to keep the label
unset to use the default domain imagelabel.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1151718
The code that parses the schema from the URI touches the "hosts[0]"
member of the storage file source structure in case the URI contains a
schema. The hosts array was not yet allocated at the point in the code
where the transport protocol was parsed and set. This lead to a crash of
libvirtd.
Fix the code by allocating the "hosts" array upfront and add a test case
to verify this scenario. (Unfortunately this requires shuffling the test
case numbers too).
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1156288
virStorageSourceInitChainElement initializes a new storage chain element
for use as a new disk source. If the new element doesn't contain the
driver name, copy it from the old source.
This fixes issue where a disk would forget the driver after a snapshot.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1140984
The backing store string location offset 0 determines that the file
isn't present. The string size shouldn't be then checked:
from qemu.git/docs/specs/qcow2.txt
== Header ==
The first cluster of a qcow2 image contains the file header:
Byte 0 - 3: magic
QCOW magic string ("QFI\xfb")
4 - 7: version
Version number (valid values are 2 and 3)
8 - 15: backing_file_offset
Offset into the image file at which the backing file name
is stored (NB: The string is not null terminated). 0 if the
image doesn't have a backing file.
16 - 19: backing_file_size
Length of the backing file name in bytes. Must not be
longer than 1023 bytes. Undefined if the image doesn't have
a backing file. ^^^^^^^^^
This patch intentionally leaves the backing file string size check in
place in case a malformatted file would be presented to libvirt. Also
according to the docs the string size is maximum 1023 bytes, thus this
patch adds a check to verify that.
I was also able to verify that the check was done the same way in the
legacy qcow fromat (in qemu's code).
To express empty drive we historically use storage source with empty
path. Unfortunately NBD disks may be declared without a path.
Add a helper to wrap this logic.
Valgrind caught a memory leak:
==2018== 9 bytes in 1 blocks are definitely lost in loss record 143 of 927
==2018== at 0x4A0645D: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==2018== by 0x8C42369: strdup (strdup.c:42)
==2018== by 0x50EACC9: virStrdup (virstring.c:676)
==2018== by 0x50E79E5: virStorageSourceCopy (virstoragefile.c:1845)
==2018== by 0x20A3FAA7: qemuDomainBlockCommit (qemu_driver.c:15620)
==2018== by 0x51DC6B2: virDomainBlockCommit (libvirt.c:20092)
I traced it to the fact that blockcopy and blockcommit end up
reparsing a backing chain on pivot, but the chain parsing code
doesn't gracefully handle the case where the backing file is
already known.
I'm not exactly sure when this was introduced, but suspect that the
refactoring in commit 9944b71 and friends that moved towards probing
in-place rather than into a temporary structure are part of the cause.
* src/util/virstoragefile.c (virStorageFileGetMetadataInternal):
Don't leak any prior value.
Signed-off-by: Eric Blake <eblake@redhat.com>
The commit referenced above changed function arguments of
virStorageFileGetMetadataFromBuf() but didn't tweak the
ATTRIBUTE_NONNULL tied to them. This was caught by coverity as it
actually obeys them. We disabled them for GCC and thus it didn't show
up.
Additionally in commit 3ea661deea I passed
NULL to the backingFormat argument which was also marked as nonnull. Use
a dummy int's address when the argument isn't supplied so that the code
doesn't need to change much.
When discovering a disk backing chain the parent disk's metadata need to
be populated into the guest images so that each piece of the backing
chain contains a copy of those. This will allow us to refactor the
security driver so that it will not need to carry around the original
disk definition.
We are going to modify storage source chains in place. Add a helper that
will copy relevant information such as security labels to the new
element if that doesn't contain it.
The qemu block info function relied on working with local storage. Break
this assumption by adding support for remote volumes. Unfortunately we
still need to take a hybrid approach as some of the operations require a
filedescriptor.
Previously you'd get:
$ virsh domblkinfo gl vda
error: cannot stat file '/img10': Bad file descriptor
Now you get some stats:
$ virsh domblkinfo gl vda
Capacity: 10485760
Allocation: 197120
Physical: 197120
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1110198
To allow reusing this function in the qemu driver we need to allow
specifying the storage format. Also separate return of the backing store
path now isn't necessary.
There's a lot of places where we skip doing actions based on the
locality of given storage type. The usual pattern is to skip it if:
virStorageSourceGetActualType(src) == VIR_STORAGE_TYPE_NETWORK
Add a simple helper to simplify the pattern to
virStorageSourceIsLocalStorage(src)
Replace the inline "auth" struct in virStorageSource with a pointer
to a virStorageAuthDefPtr and utilize between the domain_conf, qemu_conf,
and qemu_command sources for finding the auth data for a domain disk
Introduce virStorageAuthDef and friends. Future patches will merge/utilize
their view of storage source/pool auth/secret definitions.
New API's include:
virStorageAuthDefParse: Parse the "<auth/>" XML data for either the
domain disk or storage pool returning a
virStorageAuthDefPtr
virStorageAuthDefCopy: Copy a virStorageAuthDefPtr - to be used by
the qemuTranslateDiskSourcePoolAuth when it
copies storage pool auth data into domain
disk auth data
virStorageAuthDefFormat: Common output of the "<auth" in the domain
disk or storage pool XML
virStorageAuthDefFree: Free memory associated with virStorageAuthDef
Subsequent patches will utilize the new functions for the domain disk and
storage pools.
Future work in the hostdev pass through can then make use of common data
structures and code.
Replace:
if (virBufferError(&buf)) {
virBufferFreeAndReset(&buf);
virReportOOMError();
...
}
with:
if (virBufferCheckError(&buf) < 0)
...
This should not be a functional change (unless some callers
misused the virBuffer APIs - a different error would be reported
then)
The parent directory doesn't necessarily need to be stored after we
don't mangle the path stored in the image. Remove it and tweak the code
to avoid using it.
Store backing chain paths as non-canonical. The canonicalization step
will be already taken. This will allow to avoid storing unnecessary
amounts of data.
Now that we store only relative names in virStorageSource's member
relPath the backingRelative member is obsolete. Remove it and adapt the
code to the removal.
Due to various refactors and compatibility with the virstoragetest the
relPath field of the virStorageSource structure was always filled either
with the relative name or the full path in case of absolutely backed
storage. Return its original purpose to store only the relative name of
the disk if it is backed relatively and tweak the tests.
This patch introduces a function that will allow us to resolve a
relative difference between two elements of a disk backing chain. This
function will be used to allow relative block commit and block pull
where we need to specify the new relative name of the image to qemu.
This patch also adds unit tests for the function to verify that it works
correctly.
Introduce a common function that will take a callback to resolve links
that will be used to canonicalize paths on various storage systems and
add extensive tests.
The image labels are stored in the virStorageSource struct. Convert the
virDomainDiskDefGetSecurityLabelDef helper not to use the full disk def
and move it appropriately.
The block commit code looks for an explicit base file relative
to the discovered top file; so for a chain of:
base <- snap1 <- snap2 <- snap3
and a command of:
virsh blockcommit $dom vda --base snap2 --top snap1
we got a sane message (here from libvirt 1.0.5):
error: invalid argument: could not find base 'snap2' below 'snap1' in chain for 'vda'
Meanwhile, recent refactoring has slightly reduced the quality of the
libvirt error messages, by losing the phrase 'below xyz':
error: invalid argument: could not find image 'snap2' in chain for 'snap3'
But we had a one-off, where we were not excluding the top file
itself in searching for the base; thankfully qemu still reports
the error, but the quality is worse:
virsh blockcommit $dom vda --base snap2 --top snap2
error: internal error unable to execute QEMU command 'block-commit': Base '/snap2' not found
Fix the one-off in blockcommit by changing the semantics of name
lookup - if a starting point is specified, then the result must
be below that point, rather than including that point. The only
other call to chain lookup was blockpull code, which was already
forcing the lookup to omit the active layer and only needs a
tweak to use the new semantics.
This also fixes the bug exposed in the testsuite, where when doing
a lookup pinned to an intermediate point in the chain, we were
unable to return the name of the parent also in the chain.
* src/util/virstoragefile.c (virStorageFileChainLookup): Change
semantics for non-NULL startFrom.
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Adjust caller,
to keep existing semantics.
* tests/virstoragetest.c (mymain): Adjust to expose new semantics.
Signed-off-by: Eric Blake <eblake@redhat.com>
Jim Fehlig reported a regression found by libvirt-TCK tests:
> ~ # perl /usr/share/libvirt-tck/tests/qemu/100-disk-encryption.t
...
> ok 4 - defined persistent domain config
> # Starting inactive domain config
> libvirt error code: 1, message: internal error: unable to execute QEMU command
> 'cont': 'drive-ide0-0-1'
> (/var/cache/libvirt-tck/300-disk-encryption/demo.qcow2) is encrypted
Commit 2279d560 converted a boolean into a pointer with the intent of
transferring that pointer out of a temporary object into the caller's
data structure. The temporary structure meant that meta->encryption
was always NULL on entry, so we could get away with blindly allocating
the pointer when the header said so. But later, commit 8823272d
tweaked things to do backing chain detection in-place, rather than via
a temporary object; this has the net result that meta->encryption can
be non-NULL on entry. Not only did this turn the latent behavior into
a memory leak, it is also a behavior regression: blindly allocating a
new pointer wipes out what secrets we already knew about the chain,
making it impossible to restart the domain.
Of course, no one in their right mind should be relying on qcow2
encryption - it is fundamentally flawed. And sadly, the TCK tests
don't get run often enough, and this shows that our virstoragetest
does not exercise encrypted images at all. Otherwise, we could
have avoided a release containing this regression.
* src/util/virstoragefile.c (virStorageFileGetMetadataInternal):
Don't nuke an already-existing encryption.
Signed-off-by: Eric Blake <eblake@redhat.com>
Add parsers for relative and absolute backing names for local and remote
storage files.
This parser parses relative paths as relative to their parents and
absolute paths according to the protocol or local access.
For remote storage volumes, all URI based backing file names are
supported and for the qemu colon syntax the NBD protocol is supported.
Use virStorageFileReadHeader() to read headers of storage files possibly
on remote storage to retrieve the image metadata.
The backend information is now parsed by
virStorageFileGetMetadataInternal which is now exported from the util
source and virStorageFileGetMetadataFromFDInternal now doesn't need to
be exported.
My future work will modify the metadata crawler function to use the
storage driver file APIs to access the files instead of accessing them
directly so that we will be able to request the metadata for remote
files too. To avoid linking the storage driver to every helper file
using the utils code, the backing chain traversal function needs to be
moved to the storage driver source.
Additionally the virt-aa-helper and virstoragetest programs need to be
linked with the storage driver as a result of this change.
For guests backed by gluster volumes (or other network storage) we don't
fill the backing chain (see qemuDomainDetermineDiskChain). This leaves
the "relPath" field of the top image NULL. This causes a crash in
virStorageFileChainLookup() when looking up a backing element for such a
disk.
Since I'm working on adding support for network storage and one of the
steps will make the "relPath" field optional let's use STREQ_NULLABLE
instead of STREQ in virStorageFileChainLookup() to avoid the problem.
Add argument to return backing file format of a file probed by
virStorageFileGetMetadataFromFD so that it can be used in place of
virStorageFileGetMetadataFromBuf.
Currently the protocol type with index 0 was NBD which made it hard to
distinguish whether the protocol type was actually assigned. Add a new
protocol type with index 0 to distinguish it explicitly.
The gluster volume name was previously stored as part of the source path
string. This is unfortunate when we want to do operations on the path as
the volume is used separately.
Parse and store the volume name separately for gluster storage volumes
and use the newly stored variable appropriately.
Add VIR_STORAGE_FILE_PLOOP format. This format is used
to store disk images for virtual machines in PCS and containers
in PCS, OpenVZ and also in Parallels Desktop for Mac.
This format is described on OpenVZ site -
https://openvz.org/Ploop (together with ploop devices). It
consists of XML descriptor and one or more image files: base
image and deltas. Format of the image files described here:
https://openvz.org/Ploop/format.
This patch only adds VIR_STORAGE_FILE_PLOOP constant, consequent
patches will use it in parallels driver.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
All callers of virStorageFileGetMetadataFromBuf were first calling
virStorageFileProbeFormatFromBuf, to learn what format to pass in.
But this function is already wired to do the exact same probe if
the incoming format is VIR_STORAGE_FILE_AUTO, so it's simpler to
just refactor the probing into the central function.
* src/util/virstoragefile.h (virStorageFileGetMetadataFromBuf):
Drop parameter.
(virStorageFileProbeFormatFromBuf): Drop declaration.
* src/util/virstoragefile.c (virStorageFileGetMetadataFromBuf):
Do probe here instead of in callers.
(virStorageFileProbeFormatFromBuf): Make static.
* src/libvirt_private.syms (virstoragefile.h): Drop function.
* src/storage/storage_backend_fs.c (virStorageBackendProbeTarget):
Update caller.
* src/storage/storage_backend_gluster.c
(virStorageBackendGlusterRefreshVol): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>