pve-qemu-qoup/debian/patches/pve/0007-PVE-Up-glusterfs-allow-partial-reads.patch

From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Wolfgang Bumiller <w.bumiller@proxmox.com>
Date: Mon, 6 Apr 2020 12:16:38 +0200
Subject: [PATCH] PVE: [Up] glusterfs: allow partial reads

This should deal with qemu bug #1644754 until upstream
decides which way to go. The general direction seems to be
away from sector based block APIs and with that in mind, and
when comparing to other network block backends (eg. nfs)
treating partial reads as errors doesn't seem to make much
sense.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
 block/gluster.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/block/gluster.c b/block/gluster.c
index d0011085c4..2df3d6e35d 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -58,6 +58,7 @@ typedef struct GlusterAIOCB {
     int ret;
     Coroutine *coroutine;
     AioContext *aio_context;
+    bool is_write;
 } GlusterAIOCB;
 
 typedef struct BDRVGlusterState {
@@ -753,8 +754,10 @@ static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret,
         acb->ret = 0; /* Success */
     } else if (ret < 0) {
         acb->ret = -errno; /* Read/Write failed */
+    } else if (acb->is_write) {
+        acb->ret = -EIO; /* Partial write - fail it */
     } else {
-        acb->ret = -EIO; /* Partial read/write - fail it */
+        acb->ret = 0; /* Success */
     }
 
     aio_co_schedule(acb->aio_context, acb->coroutine);
@@ -1021,6 +1024,7 @@ static coroutine_fn int qemu_gluster_co_pwrite_zeroes(BlockDriverState *bs,
     acb.ret = 0;
     acb.coroutine = qemu_coroutine_self();
     acb.aio_context = bdrv_get_aio_context(bs);
+    acb.is_write = true;
 
     ret = glfs_zerofill_async(s->fd, offset, bytes, gluster_finish_aiocb, &acb);
     if (ret < 0) {
@@ -1201,9 +1205,11 @@ static coroutine_fn int qemu_gluster_co_rw(BlockDriverState *bs,
     acb.aio_context = bdrv_get_aio_context(bs);
 
     if (write) {
+        acb.is_write = true;
         ret = glfs_pwritev_async(s->fd, qiov->iov, qiov->niov, offset, 0,
                                  gluster_finish_aiocb, &acb);
     } else {
+        acb.is_write = false;
         ret = glfs_preadv_async(s->fd, qiov->iov, qiov->niov, offset, 0,
                                 gluster_finish_aiocb, &acb);
     }
@@ -1266,6 +1272,7 @@ static coroutine_fn int qemu_gluster_co_flush_to_disk(BlockDriverState *bs)
     acb.ret = 0;
     acb.coroutine = qemu_coroutine_self();
     acb.aio_context = bdrv_get_aio_context(bs);
+    acb.is_write = true;
 
     ret = glfs_fsync_async(s->fd, gluster_finish_aiocb, &acb);
     if (ret < 0) {
@@ -1314,6 +1321,7 @@ static coroutine_fn int qemu_gluster_co_pdiscard(BlockDriverState *bs,
     acb.ret = 0;
     acb.coroutine = qemu_coroutine_self();
     acb.aio_context = bdrv_get_aio_context(bs);
+    acb.is_write = true;
 
     ret = glfs_discard_async(s->fd, offset, bytes, gluster_finish_aiocb, &acb);
     if (ret < 0) {
patch cleanup Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com> 2018-02-19 12:38:54 +03:00			`From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001`
import stable-4 build files 2017-04-05 11:49:19 +03:00			`From: Wolfgang Bumiller <w.bumiller@proxmox.com>`
import QEMU 5.0.0-rc2 and rebase patches Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com> 2020-04-07 17:53:19 +03:00			`Date: Mon, 6 Apr 2020 12:16:38 +0200`
			`Subject: [PATCH] PVE: [Up] glusterfs: allow partial reads`
import stable-4 build files 2017-04-05 11:49:19 +03:00
			`This should deal with qemu bug #1644754 until upstream`
			`decides which way to go. The general direction seems to be`
			`away from sector based block APIs and with that in mind, and`
			`when comparing to other network block backends (eg. nfs)`
			`treating partial reads as errors doesn't seem to make much`
			`sense.`
update patches for v4.0.0 Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com> 2019-06-06 13:58:15 +03:00
			`Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>`
import stable-4 build files 2017-04-05 11:49:19 +03:00			`---`
			`block/gluster.c \| 10 +++++++++-`
			`1 file changed, 9 insertions(+), 1 deletion(-)`

			`diff --git a/block/gluster.c b/block/gluster.c`
update submodule and patches to QEMU 8.1.2 Bigger notable changes: * Commit 1a30b0f5d7 ("block: .bdrv_open is non-coroutine and unlocked") broke the PVE backup patches, in particular setting up the backup dump block driver, because bdrv_new_open_driver() cannot be called from a coroutine. To fix it, bdrv_co_open() is used instead, and while it's a much more involved function, the result should be essentially the same. The only difference I noticed is that the BDRV_O_ALLOW_RDWR flag is also set in the resulting bds (block driver state), but that shouldn't hurt. Smaller notable changes: * aio_set_fd_handler() dropped its 'is_external' parameter stating that all callers now pass false in 60f782b6b7 ("aio: remove aio_disable_external() API"). The calls in the PVE patches also passed false, so just drop the parameter too. * global_state_store() does not have a return value anymore, so the user in the PVE savevm-async patch was adapted. For context, see c33f1829f8 ("migration: never fail in global_state_store()"). * Renames affecting the PVE savevm-async patch: migrate_use_block() -> migrate_block() and ram_counters -> mig_stats 9d4b1e5f22 ("migration: Move migrate_use_block() to options.c") aff3f6606d ("migration: Rename ram_counters to mig_stats") Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com> 2023-10-17 15:10:09 +03:00			`index d0011085c4..2df3d6e35d 100644`
import stable-4 build files 2017-04-05 11:49:19 +03:00			`--- a/block/gluster.c`
			`+++ b/block/gluster.c`
update submodule and patches to QEMU 8.0.0 Many changes were necessary this time around: * QAPI was changed to avoid redundant has_* variables, see commit 44ea9d9be3 ("qapi: Start to elide redundant has_FOO in generated C") for details. This affected many QMP commands added by Proxmox too. * Pending querying for migration got split into two functions, one to estimate, one for exact value, see commit c8df4a7aef ("migration: Split save_live_pending() into state_pending_") for details. Relevant for savevm-async and PBS dirty bitmap. Some block (driver) functions got converted to coroutines, so the Proxmox block drivers needed to be adapted. * Alloc track auto-detaching during PBS live restore got broken by AioContext-related changes resulting in a deadlock. The current, hacky method was replaced by a simpler one. Stefan apparently ran into a problem with that when he wrote the driver, but there were improvements in the stream job code since then and I didn't manage to reproduce the issue. It's a separate patch "alloc-track: fix deadlock during drop" for now, you can find the details there. * Async snapshot-related changes: - The pending querying got adapted to the above-mentioned split and a patch is added to optimize it/make it more similar to what upstream code does. - Added initialization of the compression counters (for future-proofing). - It's necessary the hold the BQL (big QEMU lock = iothread mutex) during the setup phase, because block layer functions are used there and not doing so leads to racy, hard-to-debug crashes or hangs. It's necessary to change some upstream code too for this, a version of the patch "migration: for snapshots, hold the BQL during setup callbacks" is intended to be upstreamed. - Need to take the bdrv graph read lock before flushing. * hmp_info_balloon was moved to a different file. * Needed to include a new headers from time to time to still get the correct functions. Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> 2023-05-15 16:39:53 +03:00			`@@ -58,6 +58,7 @@ typedef struct GlusterAIOCB {`
update to 2.9.0-rc2 build files 2017-04-05 12:38:26 +03:00			`int ret;`
import stable-4 build files 2017-04-05 11:49:19 +03:00			`Coroutine *coroutine;`
			`AioContext *aio_context;`
			`+ bool is_write;`
			`} GlusterAIOCB;`

			`typedef struct BDRVGlusterState {`
update submodule and patches to QEMU 8.0.0 Many changes were necessary this time around: * QAPI was changed to avoid redundant has_* variables, see commit 44ea9d9be3 ("qapi: Start to elide redundant has_FOO in generated C") for details. This affected many QMP commands added by Proxmox too. * Pending querying for migration got split into two functions, one to estimate, one for exact value, see commit c8df4a7aef ("migration: Split save_live_pending() into state_pending_") for details. Relevant for savevm-async and PBS dirty bitmap. Some block (driver) functions got converted to coroutines, so the Proxmox block drivers needed to be adapted. * Alloc track auto-detaching during PBS live restore got broken by AioContext-related changes resulting in a deadlock. The current, hacky method was replaced by a simpler one. Stefan apparently ran into a problem with that when he wrote the driver, but there were improvements in the stream job code since then and I didn't manage to reproduce the issue. It's a separate patch "alloc-track: fix deadlock during drop" for now, you can find the details there. * Async snapshot-related changes: - The pending querying got adapted to the above-mentioned split and a patch is added to optimize it/make it more similar to what upstream code does. - Added initialization of the compression counters (for future-proofing). - It's necessary the hold the BQL (big QEMU lock = iothread mutex) during the setup phase, because block layer functions are used there and not doing so leads to racy, hard-to-debug crashes or hangs. It's necessary to change some upstream code too for this, a version of the patch "migration: for snapshots, hold the BQL during setup callbacks" is intended to be upstreamed. - Need to take the bdrv graph read lock before flushing. * hmp_info_balloon was moved to a different file. * Needed to include a new headers from time to time to still get the correct functions. Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> 2023-05-15 16:39:53 +03:00			`@@ -753,8 +754,10 @@ static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret,`
import stable-4 build files 2017-04-05 11:49:19 +03:00			`acb->ret = 0; /* Success */`
			`} else if (ret < 0) {`
			`acb->ret = -errno; /* Read/Write failed */`
			`+ } else if (acb->is_write) {`
			`+ acb->ret = -EIO; /* Partial write - fail it */`
			`} else {`
			`- acb->ret = -EIO; /* Partial read/write - fail it */`
			`+ acb->ret = 0; /* Success */`
			`}`

update to 2.9.0-rc2 build files 2017-04-05 12:38:26 +03:00			`aio_co_schedule(acb->aio_context, acb->coroutine);`
update submodule and patches to QEMU 8.0.0 Many changes were necessary this time around: * QAPI was changed to avoid redundant has_* variables, see commit 44ea9d9be3 ("qapi: Start to elide redundant has_FOO in generated C") for details. This affected many QMP commands added by Proxmox too. * Pending querying for migration got split into two functions, one to estimate, one for exact value, see commit c8df4a7aef ("migration: Split save_live_pending() into state_pending_") for details. Relevant for savevm-async and PBS dirty bitmap. Some block (driver) functions got converted to coroutines, so the Proxmox block drivers needed to be adapted. * Alloc track auto-detaching during PBS live restore got broken by AioContext-related changes resulting in a deadlock. The current, hacky method was replaced by a simpler one. Stefan apparently ran into a problem with that when he wrote the driver, but there were improvements in the stream job code since then and I didn't manage to reproduce the issue. It's a separate patch "alloc-track: fix deadlock during drop" for now, you can find the details there. * Async snapshot-related changes: - The pending querying got adapted to the above-mentioned split and a patch is added to optimize it/make it more similar to what upstream code does. - Added initialization of the compression counters (for future-proofing). - It's necessary the hold the BQL (big QEMU lock = iothread mutex) during the setup phase, because block layer functions are used there and not doing so leads to racy, hard-to-debug crashes or hangs. It's necessary to change some upstream code too for this, a version of the patch "migration: for snapshots, hold the BQL during setup callbacks" is intended to be upstreamed. - Need to take the bdrv graph read lock before flushing. * hmp_info_balloon was moved to a different file. * Needed to include a new headers from time to time to still get the correct functions. Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> 2023-05-15 16:39:53 +03:00			`@@ -1021,6 +1024,7 @@ static coroutine_fn int qemu_gluster_co_pwrite_zeroes(BlockDriverState *bs,`
import stable-4 build files 2017-04-05 11:49:19 +03:00			`acb.ret = 0;`
			`acb.coroutine = qemu_coroutine_self();`
			`acb.aio_context = bdrv_get_aio_context(bs);`
			`+ acb.is_write = true;`

update submodule and patches to 6.2.0 Notable changes: * bdrv_co_p{discard,readv,writev,write_zeroes} function signatures changed, to using int64_t for offsets/bytes and some still had int rather than BrdvRequestFlags for the flags. * job_cancel_sync now has a force parameter. Commit messages in 73895f3838cd7fdaf185cf1dbc47be58844a966f 4cfb3f05627ad82af473e7f7ae113c3884cd04e3 sound like using force=true makes more sense. * Added 3 patches coming in via qemu-stable tag, most important one is to work around a librbd issue. * Added another 3 patches from qemu-devel to fix issue leading to crash when live migrating with iothread. * cluster_size calculation helper changed (see patch pve/0026). * QAPI's if conditionals now use 'CONFIG_FOO' rather than 'defined(CONFIG_FOO)' Signed-off-by: Fabian Ebner <f.ebner@proxmox.com> 2022-02-11 12:24:33 +03:00			`ret = glfs_zerofill_async(s->fd, offset, bytes, gluster_finish_aiocb, &acb);`
import stable-4 build files 2017-04-05 11:49:19 +03:00			`if (ret < 0) {`
update submodule and patches to QEMU 8.0.0 Many changes were necessary this time around: * QAPI was changed to avoid redundant has_* variables, see commit 44ea9d9be3 ("qapi: Start to elide redundant has_FOO in generated C") for details. This affected many QMP commands added by Proxmox too. * Pending querying for migration got split into two functions, one to estimate, one for exact value, see commit c8df4a7aef ("migration: Split save_live_pending() into state_pending_") for details. Relevant for savevm-async and PBS dirty bitmap. Some block (driver) functions got converted to coroutines, so the Proxmox block drivers needed to be adapted. * Alloc track auto-detaching during PBS live restore got broken by AioContext-related changes resulting in a deadlock. The current, hacky method was replaced by a simpler one. Stefan apparently ran into a problem with that when he wrote the driver, but there were improvements in the stream job code since then and I didn't manage to reproduce the issue. It's a separate patch "alloc-track: fix deadlock during drop" for now, you can find the details there. * Async snapshot-related changes: - The pending querying got adapted to the above-mentioned split and a patch is added to optimize it/make it more similar to what upstream code does. - Added initialization of the compression counters (for future-proofing). - It's necessary the hold the BQL (big QEMU lock = iothread mutex) during the setup phase, because block layer functions are used there and not doing so leads to racy, hard-to-debug crashes or hangs. It's necessary to change some upstream code too for this, a version of the patch "migration: for snapshots, hold the BQL during setup callbacks" is intended to be upstreamed. - Need to take the bdrv graph read lock before flushing. * hmp_info_balloon was moved to a different file. * Needed to include a new headers from time to time to still get the correct functions. Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> 2023-05-15 16:39:53 +03:00			`@@ -1201,9 +1205,11 @@ static coroutine_fn int qemu_gluster_co_rw(BlockDriverState *bs,`
import stable-4 build files 2017-04-05 11:49:19 +03:00			`acb.aio_context = bdrv_get_aio_context(bs);`

			`if (write) {`
			`+ acb.is_write = true;`
			`ret = glfs_pwritev_async(s->fd, qiov->iov, qiov->niov, offset, 0,`
			`gluster_finish_aiocb, &acb);`
			`} else {`
			`+ acb.is_write = false;`
			`ret = glfs_preadv_async(s->fd, qiov->iov, qiov->niov, offset, 0,`
			`gluster_finish_aiocb, &acb);`
			`}`
update submodule and patches to QEMU 8.0.0 Many changes were necessary this time around: * QAPI was changed to avoid redundant has_* variables, see commit 44ea9d9be3 ("qapi: Start to elide redundant has_FOO in generated C") for details. This affected many QMP commands added by Proxmox too. * Pending querying for migration got split into two functions, one to estimate, one for exact value, see commit c8df4a7aef ("migration: Split save_live_pending() into state_pending_") for details. Relevant for savevm-async and PBS dirty bitmap. Some block (driver) functions got converted to coroutines, so the Proxmox block drivers needed to be adapted. * Alloc track auto-detaching during PBS live restore got broken by AioContext-related changes resulting in a deadlock. The current, hacky method was replaced by a simpler one. Stefan apparently ran into a problem with that when he wrote the driver, but there were improvements in the stream job code since then and I didn't manage to reproduce the issue. It's a separate patch "alloc-track: fix deadlock during drop" for now, you can find the details there. * Async snapshot-related changes: - The pending querying got adapted to the above-mentioned split and a patch is added to optimize it/make it more similar to what upstream code does. - Added initialization of the compression counters (for future-proofing). - It's necessary the hold the BQL (big QEMU lock = iothread mutex) during the setup phase, because block layer functions are used there and not doing so leads to racy, hard-to-debug crashes or hangs. It's necessary to change some upstream code too for this, a version of the patch "migration: for snapshots, hold the BQL during setup callbacks" is intended to be upstreamed. - Need to take the bdrv graph read lock before flushing. * hmp_info_balloon was moved to a different file. * Needed to include a new headers from time to time to still get the correct functions. Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> 2023-05-15 16:39:53 +03:00			`@@ -1266,6 +1272,7 @@ static coroutine_fn int qemu_gluster_co_flush_to_disk(BlockDriverState *bs)`
import stable-4 build files 2017-04-05 11:49:19 +03:00			`acb.ret = 0;`
			`acb.coroutine = qemu_coroutine_self();`
			`acb.aio_context = bdrv_get_aio_context(bs);`
			`+ acb.is_write = true;`

			`ret = glfs_fsync_async(s->fd, gluster_finish_aiocb, &acb);`
			`if (ret < 0) {`
update submodule and patches to QEMU 8.0.0 Many changes were necessary this time around: * QAPI was changed to avoid redundant has_* variables, see commit 44ea9d9be3 ("qapi: Start to elide redundant has_FOO in generated C") for details. This affected many QMP commands added by Proxmox too. * Pending querying for migration got split into two functions, one to estimate, one for exact value, see commit c8df4a7aef ("migration: Split save_live_pending() into state_pending_") for details. Relevant for savevm-async and PBS dirty bitmap. Some block (driver) functions got converted to coroutines, so the Proxmox block drivers needed to be adapted. * Alloc track auto-detaching during PBS live restore got broken by AioContext-related changes resulting in a deadlock. The current, hacky method was replaced by a simpler one. Stefan apparently ran into a problem with that when he wrote the driver, but there were improvements in the stream job code since then and I didn't manage to reproduce the issue. It's a separate patch "alloc-track: fix deadlock during drop" for now, you can find the details there. * Async snapshot-related changes: - The pending querying got adapted to the above-mentioned split and a patch is added to optimize it/make it more similar to what upstream code does. - Added initialization of the compression counters (for future-proofing). - It's necessary the hold the BQL (big QEMU lock = iothread mutex) during the setup phase, because block layer functions are used there and not doing so leads to racy, hard-to-debug crashes or hangs. It's necessary to change some upstream code too for this, a version of the patch "migration: for snapshots, hold the BQL during setup callbacks" is intended to be upstreamed. - Need to take the bdrv graph read lock before flushing. * hmp_info_balloon was moved to a different file. * Needed to include a new headers from time to time to still get the correct functions. Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> 2023-05-15 16:39:53 +03:00			`@@ -1314,6 +1321,7 @@ static coroutine_fn int qemu_gluster_co_pdiscard(BlockDriverState *bs,`
import stable-4 build files 2017-04-05 11:49:19 +03:00			`acb.ret = 0;`
			`acb.coroutine = qemu_coroutine_self();`
			`acb.aio_context = bdrv_get_aio_context(bs);`
			`+ acb.is_write = true;`

update submodule and patches to 6.2.0 Notable changes: * bdrv_co_p{discard,readv,writev,write_zeroes} function signatures changed, to using int64_t for offsets/bytes and some still had int rather than BrdvRequestFlags for the flags. * job_cancel_sync now has a force parameter. Commit messages in 73895f3838cd7fdaf185cf1dbc47be58844a966f 4cfb3f05627ad82af473e7f7ae113c3884cd04e3 sound like using force=true makes more sense. * Added 3 patches coming in via qemu-stable tag, most important one is to work around a librbd issue. * Added another 3 patches from qemu-devel to fix issue leading to crash when live migrating with iothread. * cluster_size calculation helper changed (see patch pve/0026). * QAPI's if conditionals now use 'CONFIG_FOO' rather than 'defined(CONFIG_FOO)' Signed-off-by: Fabian Ebner <f.ebner@proxmox.com> 2022-02-11 12:24:33 +03:00			`ret = glfs_discard_async(s->fd, offset, bytes, gluster_finish_aiocb, &acb);`
import stable-4 build files 2017-04-05 11:49:19 +03:00			`if (ret < 0) {`