pve-qemu-qoup/debian/patches/pve/0051-backup-add-minimum-cluster-size-to-performance-optio.patch

implement support for backup fleecing

Excerpt from Fiona's v3 cover-letter [0]:

When a backup for a VM is started, QEMU will install a
"copy-before-write" filter in its block layer. This filter ensures
that upon new guest writes, old data still needed for the backup is
sent to the backup target first. The guest write blocks until this
operation is finished so guest IO to not-yet-backed-up sectors will be
limited by the speed of the backup target.

With backup fleecing, such old data is cached in a fleecing image
rather than sent directly to the backup target. This can help guest IO
performance and even prevent hangs in certain scenarios, at the cost
of requiring more storage space.

With this series it will be possible to enable backup-fleecing via
e.g. `vzdump 123 --fleecing enabled=1,storage=local-lvm` with fleecing
images created on the storage `local-lvm`.

The fleecing storage should be a fast local storage which supports
thin-provisioning and discard. If the storage supports qcow2, that is
used as the fleecing image format. If the underlying file system does
not support discard, with qcow2 and preallocation=off, at least
already allocated parts of the image can be re-used later.

Fleecing images are created by qemu-server via pve-storage and
attached to QEMU before the backup starts, and cleaned up after the
backup finished or failed. The naming schema for fleecing images is
'vm-ID-fleece-N(.FORMAT)'. The allocated images are recorded in the
guest configuration, so that even after a hard failure, clean-up can
be re-attempted. While not too bad, it's a non-trivial amount of code
and I'm not 100% sure about the cost-benefit, so sending those as RFC.

The fleecing image needs to be the exact same size as the source, but
luckily, an explicit size can be specified when attaching a raw image
to QEMU so there are no size issues when using storages that have
coarser allocation/round up. For qcow2, it seems that virtual size can
be nearly arbitrary (i.e. modulo 512 byte granularity) during
allocation.

[0]: https://lists.proxmox.com/pipermail/pve-devel/2024-April/062815.html

Originally-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2024-04-11 18:38:26 +03:00
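
To illustrate the naming schema quoted above (the concrete names are
hypothetical, not taken from the series): for `vzdump 123 --fleecing
enabled=1,storage=local-lvm`, the first fleecing image on an LVM-thin
storage would be named along the lines of

    vm-123-fleece-0

while a file-based storage that supports qcow2 would get something like

    vm-123-fleece-0.qcow2

Either way, the volume is recorded in the guest configuration until the
post-backup clean-up removes it again.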
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Thu, 11 Apr 2024 11:29:27 +0200
Subject: [PATCH] backup: add minimum cluster size to performance options
Useful to make discard-source work in the context of backup fleecing
when the fleecing image has a larger granularity than the backup
target.
Backup/block-copy will use at least this granularity for copy operations
and in particular, discard requests to the backup source will too. If
the granularity is too small, they will just be aligned down in
cbw_co_pdiscard_snapshot() and thus effectively ignored.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
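
For context, a hedged sketch (not part of the patch) of how a QMP
client could use the new knob once this is applied: @min-cluster-size
slots into the existing @x-perf argument of the backup commands. The
node names and the 1 MiB value below are made up for illustration:

    { "execute": "blockdev-backup",
      "arguments": {
        "job-id": "backup0",
        "device": "drive0",
        "target": "backup-target",
        "sync": "full",
        "x-perf": { "min-cluster-size": 1048576 } } }

In the fleecing setup from the cover letter, qemu-server would
presumably pass the fleecing image's cluster size (e.g. the qcow2
cluster size) here, so that discard requests to the backup source are
not aligned down in cbw_co_pdiscard_snapshot() and effectively ignored.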
---
block/backup.c | 2 +-
block/copy-before-write.c | 2 ++
block/copy-before-write.h | 1 +
blockdev.c | 3 +++
qapi/block-core.json | 9 +++++++--
5 files changed, 14 insertions(+), 3 deletions(-)
diff --git a/block/backup.c b/block/backup.c
index 3dc955f625..ac5bd81338 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -430,7 +430,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
}
cbw = bdrv_cbw_append(bs, target, filter_node_name, discard_source,
- &bcs, errp);
+ perf->min_cluster_size, &bcs, errp);
if (!cbw) {
goto error;
}
diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index 4a8c5bdb62..9ca5ec5e5c 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -555,6 +555,7 @@ BlockDriverState *bdrv_cbw_append(BlockDriverState *source,
BlockDriverState *target,
const char *filter_node_name,
bool discard_source,
+ int64_t min_cluster_size,
BlockCopyState **bcs,
Error **errp)
{
@@ -573,6 +574,7 @@ BlockDriverState *bdrv_cbw_append(BlockDriverState *source,
}
qdict_put_str(opts, "file", bdrv_get_node_name(source));
qdict_put_str(opts, "target", bdrv_get_node_name(target));
+ qdict_put_int(opts, "min-cluster-size", min_cluster_size);
top = bdrv_insert_node(source, opts, flags, errp);
if (!top) {
diff --git a/block/copy-before-write.h b/block/copy-before-write.h
index 01af0cd3c4..dc6cafe7fa 100644
--- a/block/copy-before-write.h
+++ b/block/copy-before-write.h
@@ -40,6 +40,7 @@ BlockDriverState *bdrv_cbw_append(BlockDriverState *source,
BlockDriverState *target,
const char *filter_node_name,
bool discard_source,
+ int64_t min_cluster_size,
BlockCopyState **bcs,
Error **errp);
void bdrv_cbw_drop(BlockDriverState *bs);
diff --git a/blockdev.c b/blockdev.c
index ce3fef924c..5ae1dde73c 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2729,6 +2729,9 @@ static BlockJob *do_backup_common(BackupCommon *backup,
if (backup->x_perf->has_max_chunk) {
perf.max_chunk = backup->x_perf->max_chunk;
}
+ if (backup->x_perf->has_min_cluster_size) {
+ perf.min_cluster_size = backup->x_perf->min_cluster_size;
+ }
}
if ((backup->sync == MIRROR_SYNC_MODE_BITMAP) ||
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 33e7e3c090..58fd637e86 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1757,11 +1757,16 @@
# it should not be less than job cluster size which is calculated
# as maximum of target image cluster size and 64k. Default 0.
#
+# @min-cluster-size: Minimum size of blocks used by copy-before-write
+# and background copy operations. Has to be a power of 2. No
+# effect if smaller than the maximum of the target's cluster size
+# and 64 KiB. Default 0. (Since 8.1)
+#
# Since: 6.0
##
{ 'struct': 'BackupPerf',
- 'data': { '*use-copy-range': 'bool',
- '*max-workers': 'int', '*max-chunk': 'int64' } }
+ 'data': { '*use-copy-range': 'bool', '*max-workers': 'int',
+ '*max-chunk': 'int64', '*min-cluster-size': 'uint32' } }
##
# @BackupCommon:
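
In other words, per the documentation added above, the effective copy
granularity amounts to the maximum of the target's cluster size,
64 KiB and @min-cluster-size: the new option can only raise the
granularity block-copy would pick on its own, never lower it, and
smaller values are accepted but have no effect.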