Commit Graph

45 Commits

Author SHA1 Message Date
Peter Krempa
d46512fc95 backup: Move file format check from parser to qemu driver
It's a technical detail in qemu that QCOW2 is needed for a pull-mode
backup.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-12-08 15:12:33 +01:00
Peter Krempa
0316c28a45 qemu: backup: Install bitmap for incremental backup to appropriate node only
Libvirt's backup code has two modes:

1) push - where qemu actively writes the difference since the checkpoint
          into the output file

2) pull - where we instruct qemu to expose a frozen disk state along
          with a bitmap of blocks which changed since the checkpoint

For push mode qemu needs the temporary bitmap we use where we calculate
the actual changes to be present on the block node backing the disk.

For pull mode where we expose the bitmap via NBD qemu actually wants the
bitmap to be present for the exported block node which is the scratch
file.

Until now we've calculated the bitmap twice and installed it both to the
scratch file and to the disk node, but we don't need to since we know
when it's needed.

Pass in the 'pull' flag and decide where to install the bitmap according
to it and also when to register the bitmap name with the blockjob.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-11-16 14:12:38 +01:00
Peter Krempa
5ab8cc78c4 qemu: backup: Add partial validation of incremental backup checkpoint
Verify that the checkpoint requested by an incremental backup exists.
Unfortunately validating whether the checkpoint configuration actually
matches the disk may not be reasonably feasible as the disk may have
been renamed/snapshotted/etc. We still rely on bitmap presence.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-11-09 12:25:50 +01:00
Peter Krempa
e33e89d839 qemu: backup: Use VIR_ERR_CHECKPOINT_INCONSISTENT when starting a backup
If we don't have a consistent chain of bitmaps for the backup to proceed
we'd report VIR_ERR_INVALID_ARG error code, which makes it hard to
decide whether an incremental backup makes even sense.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-11-09 12:25:49 +01:00
Peter Krempa
62a01d84a3 util: hash: Retire 'virHashTable' in favor of 'GHashTable'
Don't hide our use of GHashTable behind our typedef. This will also
promote the use of glibs hash function directly.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Matt Coleman <matt@datto.com>
2020-11-06 10:40:51 +01:00
Peter Krempa
7c38a0dc9a qemu: block: Extract code for adding NBD exports to 'qemuBlockExportAddNBD'
Centralize the logic deciding which arguments to use when exporting a
block backend via NBD to a single place so that it can be centrally
fixed in upcoming commits to support the new export method via
'block-export-add'.

Additionally this allows simplification of the caller from migration as
the logic deciding which arguments to use is extracted too.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-10-19 13:36:58 +02:00
Peter Krempa
1a5f35dbd2 qemu: backup: Write TLS cert and secret object aliases into status XML
We've put the aliases into the backup job definition after the status
XML was already written so they didn't appear in the on-disk state.

Move the code putting them into the private definition earlier, so that
the status XML update done by saving blockjobs already writes them out.

Also add a note notifying that the block job status update writes the
status XML.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1870488
Fixes: 423576679a
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-09-15 15:25:22 +02:00
Peter Krempa
5058062b5d qemu: backup: Remove note that TLS should be implemented
Commit 423576679a implementing TLS forgot to remove the comment.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-09-15 15:25:22 +02:00
Laine Stump
cc5da62bbd replace g_new() with g_new0() for consistency
g_new() is used in only 3 places. Switching them to g_new0() will do
no harm, reduces confusion, and helps me sleep better at night knowing
that all allocated memory is initialized to 0 :-) (Yes, I *know* that
in all three cases the associated memory is immediately assigned some
other value. Today.)

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-07-20 19:08:07 -04:00
Peter Krempa
423576679a qemu: backup: Setup TLS environment for pull-mode backup jobs
Use the configured TLS env to setup encryption of the TLS transport.

https://bugzilla.redhat.com/show_bug.cgi?id=1822631

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-07-07 12:58:19 +02:00
Peter Krempa
d55b0811ff qemuBackupDiskDataCleanupOne: Free 'incrementalBitmap'
The bitmap name used for the incremental backup would be leaked
otherwise.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-06-23 11:23:01 +02:00
Peter Krempa
165b430eb9 qemuBackupDiskDataCleanupOne: Don't exit early when the job has started
Originally the function was cleaning up a failed job only but now
there's other stuff that needs to be cleared too.

Make only steps which clean up after a failed job depend on the
'started' field and execute the rest of the code always.

This fixes a leak of the backup job tracking object and the blockdev-add
helper data.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-06-23 11:23:01 +02:00
Peter Krempa
214faa0b04 qemuBackupDiskStarted: Fix improper dereference of array
The code would repeatedly mark the first disk's blockjob as started
rather than accessing all the blockjobs. Fix the dereferencing operator.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-06-23 11:23:01 +02:00
Peter Krempa
2d26f8b710 qemu: backup: Initialize 'store' source properly and just once
Two functions called in sequence both initialized the virStorageSource
backing 'store' leading to a memleak.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-06-23 11:23:01 +02:00
Peter Krempa
b5212365b6 qemuBackupBegin: Don't leak 'def' on early failures
The cleanup path expects that 'def' is assigned to 'priv->backup', but
that's not the case for early failures. Add a check to stop overwriting
of 'def' so that it can be freed.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-06-23 11:23:01 +02:00
Peter Krempa
e0d8d989e2 qemu: backup: Rewrite backup bitmap handling to the new bitmap semantics
Reuse qemuBlockGetBitmapMergeActions which allows removal of the ad-hoc
implementation of bitmap merging for backup. The new approach is simpler
and also more robust in case some of the bitmaps break as they remove
the dependency on the whole chain of bitmaps working.

The new approach also allows backups if a snapshot is created outside of
libvirt.

Additionally the code is greatly simplified.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-06-22 16:04:30 +02:00
Peter Krempa
15c5ed8ba6 qemu: backup: Move fetching of checkpoint list for incremental backup
Fetch the checkpoint list for every disk specifically based on the new
per-disk 'incremental' field.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-06-22 16:04:29 +02:00
Peter Krempa
c89a44777f qemu: backup: Fix backup of disk skipped in an intermediate checkpoint
If a disk is not captured by one of the intermediate checkpoints the
code would fail, but we can easily calculate the bitmaps to merge
correctly by skipping over checkpoints which don't describe the disk.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-06-22 16:04:29 +02:00
Peter Krempa
562511afa6 qemu: backup: Split up code traversing checkpoint list looking for bitmaps
The algorithm is getting quite complex. Split out the lookup of range of
backing chain storage sources and bitmaps contained in them which
correspond to one checkpoint.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-06-22 16:04:29 +02:00
Peter Krempa
b50a8354f6 qemu: Refuse blockjobs on disk bus='sd' with -blockdev
We still have to use -drive to instantiate sd disks. Combining that with
the new logic for blockjobs would be very complicated and not worth it
given that 'sd' cards work only on few rarely used machine types of
non-common architectures and libvirt didn't implement support for 'sd'
bus controllers. This will allow us to use -blockdev for other kinds on
such machines while sacrificing block jobs.

Note: this is currently no-op as we mask-out the QEMU_CAPS_BLOCKDEV
capability if any of the disks has bus='sd'.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-05-12 06:55:00 +02:00
Peter Krempa
b37fdfb9d4 backup: Store error message for failed backups
If a backup job fails midway it's hard to figure out what happened as
it's running asynchronous. Use the VIR_DOMAIN_JOB_ERRMSG job statistics
field to pass through the error from the first failed backup-blockjob
so that both the consumer of the virDomainGetJobStats and the
corresponding event can see the error.

event 'job-completed' for domain backup-test:
	operation: 9
	time_elapsed: 46
	disk_total: 104857600
	disk_processed: 10158080
	disk_remaining: 94699520
	success: 0
	errmsg: No space left on device

virsh domjobinfo backup-test --completed --anystats
Job type:         Failed
Operation:        Backup
Time elapsed:     46           ms
File processed:   9.688 MiB
File remaining:   90.312 MiB
File total:       100.000 MiB
Error message:    No space left on device

https://bugzilla.redhat.com/show_bug.cgi?id=1812827

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-04-24 08:56:57 +02:00
Peter Krempa
72186f9c8c qemu: Add free and copy function for qemuDomainJobInfo and use it
In order to add a string to qemuDomainJobInfo we must ensure that it's
freed and copied properly. Add helpers to copy and free the structure
and adjust the code to use them properly for the new semantics.

Additionally also allocation is changed to g_new0 as it includes the
type and thus it's very easy to grep for all the allocations of a given
type.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-04-24 08:56:57 +02:00
Peter Krempa
4aea6f42fe qemu: backup: Fix handling of backing store for backup target images
We always tried to install backing store for the image even if it didn't
make sense, e.g. for a full backup into a raw image. Additionally we
didn't record the backing file into the qcow2 metadata so the image
itself contained the diff of data but reading from it would be
incomplete as it depends on the backing image.

This patch fixes both issues by carefully installing the correct backing
file when appropriate and also recording it into the metadata when
creating the image.

https://bugzilla.redhat.com/show_bug.cgi?id=1813310

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-04-14 18:37:05 +02:00
Peter Krempa
e060b0624d qemuBackupBegin: Fix monitor access when rolling back due to failure
The code attempting to clean up after a failed pull mode backup job
wrongly entered monitor but didn't clean up nor exit monitor due to a
logic bug. Fix the condition.

Introduced in a1521f84a5

https://bugzilla.redhat.com/show_bug.cgi?id=1817327

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-03-26 17:56:30 +01:00
Michal Privoznik
13eb6c1468 qemu: Tell secdrivers which images are top parent
When preparing images for block jobs we modify their seclabels so
that QEMU can open them. However, as mentioned in the previous
commit, secdrivers base some it their decisions whether the image
they are working on is top of of the backing chain. Fortunately,
in places where we call secdrivers we know this and the
information can be passed to secdrivers.

The problem is the following: after the first blockcommit from
the base to one of the parents the XATTRs on the base image are
not cleared and therefore the second attempt to do another
blockcommit fails. This is caused by blockcommit code calling
qemuSecuritySetImageLabel() over the base image, possibly
multiple times (to ensure RW/RO access). A naive fix would be to
call the restore function. But this is not possible, because that
would deny QEMU the access to the base image.  Fortunately, we
can use the fact that seclabels are remembered only for the top
of the backing chain and not for the rest of the backing chain.
And thanks to the previous commit we can tell secdrivers which
images are top of the backing chain.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1803551

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2020-03-09 14:14:55 +01:00
Peter Krempa
d69470a18a virJSONValueNewArray: Use g_new0 to allocate and remove NULL checks from callers
Use the glib allocation function that never returns NULL and remove the
now dead-code checks from all callers.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-02-04 13:45:33 +01:00
Peter Krempa
5ddfac1169 qemu: block: Extract calls of qemuBlockGetNamedNodeData into a helper function
Create a wrapper for qemuBlockGetNamedNodeData named
qemuBlockGetNamedNodeData. The purpose of the wrapper is to integrate
the monitor handling functionality and in the future possible
qemuCaps-based flags.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-02-04 13:45:32 +01:00
Peter Krempa
0c3792a155 qemu: backup: Implement support for backup disk bitmap name configuration
Use the user-configured name of the bitmap when merging the appropriate
bitmaps for an incremental backup so that the user can see it as
configured. Additionally expose the default bitmap name if nothing is
configured.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2020-01-24 13:40:53 +01:00
Peter Krempa
bce4ac55f8 qemu: backup: Implement support for backup disk export name configuration
Pass the exportname as configured when exporting the image via NBD and
fill it with the default if it's not configured.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-01-24 13:40:48 +01:00
Peter Krempa
c314222a01 qemu: backup: Move capability check after inactive check
Inactive VM doesn't have qemuCaps set thus we'd never properly report
that VM backups are supported only for running VMs.

Move the capability check after the active check.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
2020-01-08 07:10:46 +01:00
Peter Krempa
a4877192a1 qemu: backup: roll-back checkpoint metadata if the checkpoint wasn't taken
We insert the checkpoint metadata into the list of checkpoints prior to
actually creating the on-disk bits. If the 'transaction' or any other
steps done between inserting the checkpoint and creating the on-disk
data fail we'd end up with an unusable checkpoint that would vanish
after libvirtd restart.

Prevent this by rolling back the metadata if we didn't actually take and
record the checkpoint.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-07 15:20:14 +01:00
Peter Krempa
5632ed8bad qemu: process: Terminate backup job on VM destroy
Commit d75f865fb9 caused a job-deadlock if
a VM is running the backup job and being destroyed as it removed the
cleanup of the async job type and there was nothing to clean up the
backup job.

Add an explicit cleanup of the backup job when destroying a VM.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-06 10:15:36 +01:00
Peter Krempa
bc8b159cb1 qemu: backup: Properly propagate async job type when cancelling the job
When cancelling the blockjobs as part of failed backup job startup
recover we didn't pass in the correct async job type. Luckily the block
job handler and cancellation code paths use no block job at all
currently so those were correct.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-06 10:15:36 +01:00
Peter Krempa
3a98fe9db3 qemu: blockjob: Remove infrastructure for remembering to delete image
Now that we delete the images elsewhere it's not required. Additionally
it's safe to do as we never released an upstream version which required
this being in place.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-06 10:15:36 +01:00
Peter Krempa
40485059ab qemu: backup: Move deletion of backup images to job termination
While qemu is running both locations are identical in semantics, but the
move will allow us to fix the scenario when the VM is destroyed or
crashes where we'd leak the images.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-06 10:15:35 +01:00
Peter Krempa
d6b994bafd qemu: backup: Configure backup store image with backing file
In contrast to snapshots the backup job does not complain when the
backup job's store file has backing pre-configured. It's actually
required so that the NBD server exposes all the data properly.

Remove our fake termination and use the existing disk source as backing.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-06 10:15:35 +01:00
Daniel P. Berrangé
f5e9bdb87f src: replace clock_gettime()/gettimeofday() with g_get_real_time()
g_get_real_time() returns the time since epoch in microseconds.
It uses gettimeofday() internally while libvirt used clock_gettime
because it is declared async signal safe. In practice gettimeofday
is also async signal safe *provided* the timezone parameter is
NULL. This is indeed the case in g_get_real_time().

Reviewed-by: Fabiano Fidêncio <fidencio@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-03 15:42:13 +00:00
Peter Krempa
450888d96b qemu: backup: Merge bitmaps accross the backing chain
To allow backups work across external snapshots we need to improve the
algorithm which calculates which bitmaps to merge.

The algorithm must look for appropriately named bitmaps in the image and
possibly descend into a backing image if the current image does not have
the bitmap.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2019-12-13 13:22:55 +01:00
Peter Krempa
59999670f2 qemu: backup: Export qemuBackupDiskPrepareOneBitmapsChain for tests
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2019-12-13 13:22:55 +01:00
Peter Krempa
775228dccf qemu: backup: Propagate bitmap metadata into qemuBackupDiskPrepareOneBitmapsChain
The function will require the bitmap topology for the full
implementation. To facilitate testing, add the propagation of the
necessary data beforehand so that the test code can stay unchanged
during the changes.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2019-12-13 13:22:55 +01:00
Peter Krempa
3323e85bf6 qemu: backup: Extract calculations of bitmaps to merge for incremental backup
Separate the for now incomplete code that collects the bitmaps to be
merged for an incremental backup into a separate function. This will
allow adding testing prior to the improvement of the algorithm to
include snapshots.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2019-12-13 13:22:55 +01:00
Peter Krempa
d0e829e232 qemu: backup: Return 'def' instead of 'obj' from qemuBackupBeginCollectIncrementalCheckpoints
The object itself has no extra value and it would make testing the code
harder. Refactor it to remove just the definition pointer.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2019-12-13 13:22:55 +01:00
Peter Krempa
f1bc1f0fe5 qemu: monitor: Add 'granularity' parameter for block-dirty-bitmap-add
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2019-12-13 13:22:55 +01:00
Peter Krempa
5ea6cec9ef qemu: backup: Implement stats gathering while the job is running
We can use the output of 'query-jobs' to figure out some useful
information about a backup job. That is progress in case of a push job
and scratch file use in case of a pull job.

Add a worker which will total up the data and call it from
qemuDomainGetJobStatsInternal.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-12-10 12:41:58 +01:00
Peter Krempa
a1521f84a5 qemu: Implement backup job APIs and qemu handling
This allows to start and manage the backup job.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-12-10 12:41:58 +01:00