Commit Graph

34 Commits

Author SHA1 Message Date
Peter Krempa
165b430eb9 qemuBackupDiskDataCleanupOne: Don't exit early when the job has started
Originally the function was cleaning up a failed job only but now
there's other stuff that needs to be cleared too.

Make only steps which clean up after a failed job depend on the
'started' field and execute the rest of the code always.

This fixes a leak of the backup job tracking object and the blockdev-add
helper data.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-06-23 11:23:01 +02:00
Peter Krempa
214faa0b04 qemuBackupDiskStarted: Fix improper dereference of array
The code would repeatedly mark the first disk's blockjob as started
rather than accessing all the blockjobs. Fix the dereferencing operator.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-06-23 11:23:01 +02:00
Peter Krempa
2d26f8b710 qemu: backup: Initialize 'store' source properly and just once
Two functions called in sequence both initialized the virStorageSource
backing 'store' leading to a memleak.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-06-23 11:23:01 +02:00
Peter Krempa
b5212365b6 qemuBackupBegin: Don't leak 'def' on early failures
The cleanup path expects that 'def' is assigned to 'priv->backup', but
that's not the case for early failures. Add a check to stop overwriting
of 'def' so that it can be freed.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-06-23 11:23:01 +02:00
Peter Krempa
e0d8d989e2 qemu: backup: Rewrite backup bitmap handling to the new bitmap semantics
Reuse qemuBlockGetBitmapMergeActions which allows removal of the ad-hoc
implementation of bitmap merging for backup. The new approach is simpler
and also more robust in case some of the bitmaps break as they remove
the dependency on the whole chain of bitmaps working.

The new approach also allows backups if a snapshot is created outside of
libvirt.

Additionally the code is greatly simplified.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-06-22 16:04:30 +02:00
Peter Krempa
15c5ed8ba6 qemu: backup: Move fetching of checkpoint list for incremental backup
Fetch the checkpoint list for every disk specifically based on the new
per-disk 'incremental' field.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-06-22 16:04:29 +02:00
Peter Krempa
c89a44777f qemu: backup: Fix backup of disk skipped in an intermediate checkpoint
If a disk is not captured by one of the intermediate checkpoints the
code would fail, but we can easily calculate the bitmaps to merge
correctly by skipping over checkpoints which don't describe the disk.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-06-22 16:04:29 +02:00
Peter Krempa
562511afa6 qemu: backup: Split up code traversing checkpoint list looking for bitmaps
The algorithm is getting quite complex. Split out the lookup of range of
backing chain storage sources and bitmaps contained in them which
correspond to one checkpoint.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-06-22 16:04:29 +02:00
Peter Krempa
b50a8354f6 qemu: Refuse blockjobs on disk bus='sd' with -blockdev
We still have to use -drive to instantiate sd disks. Combining that with
the new logic for blockjobs would be very complicated and not worth it
given that 'sd' cards work only on few rarely used machine types of
non-common architectures and libvirt didn't implement support for 'sd'
bus controllers. This will allow us to use -blockdev for other kinds on
such machines while sacrificing block jobs.

Note: this is currently no-op as we mask-out the QEMU_CAPS_BLOCKDEV
capability if any of the disks has bus='sd'.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2020-05-12 06:55:00 +02:00
Peter Krempa
b37fdfb9d4 backup: Store error message for failed backups
If a backup job fails midway it's hard to figure out what happened as
it's running asynchronous. Use the VIR_DOMAIN_JOB_ERRMSG job statistics
field to pass through the error from the first failed backup-blockjob
so that both the consumer of the virDomainGetJobStats and the
corresponding event can see the error.

event 'job-completed' for domain backup-test:
	operation: 9
	time_elapsed: 46
	disk_total: 104857600
	disk_processed: 10158080
	disk_remaining: 94699520
	success: 0
	errmsg: No space left on device

virsh domjobinfo backup-test --completed --anystats
Job type:         Failed
Operation:        Backup
Time elapsed:     46           ms
File processed:   9.688 MiB
File remaining:   90.312 MiB
File total:       100.000 MiB
Error message:    No space left on device

https://bugzilla.redhat.com/show_bug.cgi?id=1812827

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-04-24 08:56:57 +02:00
Peter Krempa
72186f9c8c qemu: Add free and copy function for qemuDomainJobInfo and use it
In order to add a string to qemuDomainJobInfo we must ensure that it's
freed and copied properly. Add helpers to copy and free the structure
and adjust the code to use them properly for the new semantics.

Additionally also allocation is changed to g_new0 as it includes the
type and thus it's very easy to grep for all the allocations of a given
type.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-04-24 08:56:57 +02:00
Peter Krempa
4aea6f42fe qemu: backup: Fix handling of backing store for backup target images
We always tried to install backing store for the image even if it didn't
make sense, e.g. for a full backup into a raw image. Additionally we
didn't record the backing file into the qcow2 metadata so the image
itself contained the diff of data but reading from it would be
incomplete as it depends on the backing image.

This patch fixes both issues by carefully installing the correct backing
file when appropriate and also recording it into the metadata when
creating the image.

https://bugzilla.redhat.com/show_bug.cgi?id=1813310

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-04-14 18:37:05 +02:00
Peter Krempa
e060b0624d qemuBackupBegin: Fix monitor access when rolling back due to failure
The code attempting to clean up after a failed pull mode backup job
wrongly entered monitor but didn't clean up nor exit monitor due to a
logic bug. Fix the condition.

Introduced in a1521f84a5

https://bugzilla.redhat.com/show_bug.cgi?id=1817327

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-03-26 17:56:30 +01:00
Michal Privoznik
13eb6c1468 qemu: Tell secdrivers which images are top parent
When preparing images for block jobs we modify their seclabels so
that QEMU can open them. However, as mentioned in the previous
commit, secdrivers base some it their decisions whether the image
they are working on is top of of the backing chain. Fortunately,
in places where we call secdrivers we know this and the
information can be passed to secdrivers.

The problem is the following: after the first blockcommit from
the base to one of the parents the XATTRs on the base image are
not cleared and therefore the second attempt to do another
blockcommit fails. This is caused by blockcommit code calling
qemuSecuritySetImageLabel() over the base image, possibly
multiple times (to ensure RW/RO access). A naive fix would be to
call the restore function. But this is not possible, because that
would deny QEMU the access to the base image.  Fortunately, we
can use the fact that seclabels are remembered only for the top
of the backing chain and not for the rest of the backing chain.
And thanks to the previous commit we can tell secdrivers which
images are top of the backing chain.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1803551

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2020-03-09 14:14:55 +01:00
Peter Krempa
d69470a18a virJSONValueNewArray: Use g_new0 to allocate and remove NULL checks from callers
Use the glib allocation function that never returns NULL and remove the
now dead-code checks from all callers.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-02-04 13:45:33 +01:00
Peter Krempa
5ddfac1169 qemu: block: Extract calls of qemuBlockGetNamedNodeData into a helper function
Create a wrapper for qemuBlockGetNamedNodeData named
qemuBlockGetNamedNodeData. The purpose of the wrapper is to integrate
the monitor handling functionality and in the future possible
qemuCaps-based flags.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2020-02-04 13:45:32 +01:00
Peter Krempa
0c3792a155 qemu: backup: Implement support for backup disk bitmap name configuration
Use the user-configured name of the bitmap when merging the appropriate
bitmaps for an incremental backup so that the user can see it as
configured. Additionally expose the default bitmap name if nothing is
configured.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2020-01-24 13:40:53 +01:00
Peter Krempa
bce4ac55f8 qemu: backup: Implement support for backup disk export name configuration
Pass the exportname as configured when exporting the image via NBD and
fill it with the default if it's not configured.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2020-01-24 13:40:48 +01:00
Peter Krempa
c314222a01 qemu: backup: Move capability check after inactive check
Inactive VM doesn't have qemuCaps set thus we'd never properly report
that VM backups are supported only for running VMs.

Move the capability check after the active check.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
2020-01-08 07:10:46 +01:00
Peter Krempa
a4877192a1 qemu: backup: roll-back checkpoint metadata if the checkpoint wasn't taken
We insert the checkpoint metadata into the list of checkpoints prior to
actually creating the on-disk bits. If the 'transaction' or any other
steps done between inserting the checkpoint and creating the on-disk
data fail we'd end up with an unusable checkpoint that would vanish
after libvirtd restart.

Prevent this by rolling back the metadata if we didn't actually take and
record the checkpoint.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-07 15:20:14 +01:00
Peter Krempa
5632ed8bad qemu: process: Terminate backup job on VM destroy
Commit d75f865fb9 caused a job-deadlock if
a VM is running the backup job and being destroyed as it removed the
cleanup of the async job type and there was nothing to clean up the
backup job.

Add an explicit cleanup of the backup job when destroying a VM.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-06 10:15:36 +01:00
Peter Krempa
bc8b159cb1 qemu: backup: Properly propagate async job type when cancelling the job
When cancelling the blockjobs as part of failed backup job startup
recover we didn't pass in the correct async job type. Luckily the block
job handler and cancellation code paths use no block job at all
currently so those were correct.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-06 10:15:36 +01:00
Peter Krempa
3a98fe9db3 qemu: blockjob: Remove infrastructure for remembering to delete image
Now that we delete the images elsewhere it's not required. Additionally
it's safe to do as we never released an upstream version which required
this being in place.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-06 10:15:36 +01:00
Peter Krempa
40485059ab qemu: backup: Move deletion of backup images to job termination
While qemu is running both locations are identical in semantics, but the
move will allow us to fix the scenario when the VM is destroyed or
crashes where we'd leak the images.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-06 10:15:35 +01:00
Peter Krempa
d6b994bafd qemu: backup: Configure backup store image with backing file
In contrast to snapshots the backup job does not complain when the
backup job's store file has backing pre-configured. It's actually
required so that the NBD server exposes all the data properly.

Remove our fake termination and use the existing disk source as backing.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-06 10:15:35 +01:00
Daniel P. Berrangé
f5e9bdb87f src: replace clock_gettime()/gettimeofday() with g_get_real_time()
g_get_real_time() returns the time since epoch in microseconds.
It uses gettimeofday() internally while libvirt used clock_gettime
because it is declared async signal safe. In practice gettimeofday
is also async signal safe *provided* the timezone parameter is
NULL. This is indeed the case in g_get_real_time().

Reviewed-by: Fabiano Fidêncio <fidencio@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-03 15:42:13 +00:00
Peter Krempa
450888d96b qemu: backup: Merge bitmaps accross the backing chain
To allow backups work across external snapshots we need to improve the
algorithm which calculates which bitmaps to merge.

The algorithm must look for appropriately named bitmaps in the image and
possibly descend into a backing image if the current image does not have
the bitmap.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2019-12-13 13:22:55 +01:00
Peter Krempa
59999670f2 qemu: backup: Export qemuBackupDiskPrepareOneBitmapsChain for tests
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2019-12-13 13:22:55 +01:00
Peter Krempa
775228dccf qemu: backup: Propagate bitmap metadata into qemuBackupDiskPrepareOneBitmapsChain
The function will require the bitmap topology for the full
implementation. To facilitate testing, add the propagation of the
necessary data beforehand so that the test code can stay unchanged
during the changes.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2019-12-13 13:22:55 +01:00
Peter Krempa
3323e85bf6 qemu: backup: Extract calculations of bitmaps to merge for incremental backup
Separate the for now incomplete code that collects the bitmaps to be
merged for an incremental backup into a separate function. This will
allow adding testing prior to the improvement of the algorithm to
include snapshots.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2019-12-13 13:22:55 +01:00
Peter Krempa
d0e829e232 qemu: backup: Return 'def' instead of 'obj' from qemuBackupBeginCollectIncrementalCheckpoints
The object itself has no extra value and it would make testing the code
harder. Refactor it to remove just the definition pointer.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2019-12-13 13:22:55 +01:00
Peter Krempa
f1bc1f0fe5 qemu: monitor: Add 'granularity' parameter for block-dirty-bitmap-add
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2019-12-13 13:22:55 +01:00
Peter Krempa
5ea6cec9ef qemu: backup: Implement stats gathering while the job is running
We can use the output of 'query-jobs' to figure out some useful
information about a backup job. That is progress in case of a push job
and scratch file use in case of a pull job.

Add a worker which will total up the data and call it from
qemuDomainGetJobStatsInternal.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-12-10 12:41:58 +01:00
Peter Krempa
a1521f84a5 qemu: Implement backup job APIs and qemu handling
This allows to start and manage the backup job.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2019-12-10 12:41:58 +01:00