2020-04-07 17:53:19 +03:00
|
|
|
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
|
2020-03-17 10:55:24 +03:00
|
|
|
From: John Snow <jsnow@redhat.com>
|
2020-04-07 17:53:19 +03:00
|
|
|
Date: Mon, 6 Apr 2020 12:17:03 +0200
|
|
|
|
Subject: [PATCH] drive-mirror: add support for sync=bitmap mode=never
|
2020-03-17 10:55:24 +03:00
|
|
|
MIME-Version: 1.0
|
|
|
|
Content-Type: text/plain; charset=UTF-8
|
|
|
|
Content-Transfer-Encoding: 8bit
|
|
|
|
|
|
|
|
This patch adds support for the "BITMAP" sync mode to drive-mirror and
|
|
|
|
blockdev-mirror. It adds support only for the BitmapSyncMode "never,"
|
|
|
|
because it's the simplest mode.
|
|
|
|
|
|
|
|
This mode simply uses a user-provided bitmap as an initial copy
|
|
|
|
manifest, and then does not clear any bits in the bitmap at the
|
|
|
|
conclusion of the operation.
|
|
|
|
|
|
|
|
Any new writes dirtied during the operation are copied out, in contrast
|
|
|
|
to backup. Note that whether these writes are reflected in the bitmap
|
|
|
|
at the conclusion of the operation depends on whether that bitmap is
|
|
|
|
actually recording!
|
|
|
|
|
|
|
|
This patch was originally based on one by Ma Haocong, but it has since
|
|
|
|
been modified pretty heavily.
|
|
|
|
|
|
|
|
Suggested-by: Ma Haocong <mahaocong@didichuxing.com>
|
|
|
|
Signed-off-by: Ma Haocong <mahaocong@didichuxing.com>
|
|
|
|
Signed-off-by: John Snow <jsnow@redhat.com>
|
|
|
|
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
|
2022-01-13 12:34:33 +03:00
|
|
|
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
[FE: rebased for 8.2.2]
|
update submodule and patches to QEMU 8.0.0
Many changes were necessary this time around:
* QAPI was changed to avoid redundant has_* variables, see commit
44ea9d9be3 ("qapi: Start to elide redundant has_FOO in generated C")
for details. This affected many QMP commands added by Proxmox too.
* Pending querying for migration got split into two functions, one to
estimate, one for exact value, see commit c8df4a7aef ("migration:
Split save_live_pending() into state_pending_*") for details. Relevant
for savevm-async and PBS dirty bitmap.
* Some block (driver) functions got converted to coroutines, so the
Proxmox block drivers needed to be adapted.
* Alloc track auto-detaching during PBS live restore got broken by
AioContext-related changes resulting in a deadlock. The current, hacky
method was replaced by a simpler one. Stefan apparently ran into a
problem with that when he wrote the driver, but there were
improvements in the stream job code since then and I didn't manage to
reproduce the issue. It's a separate patch "alloc-track: fix deadlock
during drop" for now, you can find the details there.
* Async snapshot-related changes:
- The pending querying got adapted to the above-mentioned split and
a patch is added to optimize it/make it more similar to what
upstream code does.
- Added initialization of the compression counters (for
future-proofing).
- It's necessary the hold the BQL (big QEMU lock = iothread mutex)
during the setup phase, because block layer functions are used there
and not doing so leads to racy, hard-to-debug crashes or hangs. It's
necessary to change some upstream code too for this, a version of
the patch "migration: for snapshots, hold the BQL during setup
callbacks" is intended to be upstreamed.
- Need to take the bdrv graph read lock before flushing.
* hmp_info_balloon was moved to a different file.
* Needed to include a new headers from time to time to still get the
correct functions.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-05-15 16:39:53 +03:00
|
|
|
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
|
2020-03-17 10:55:24 +03:00
|
|
|
---
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
block/mirror.c | 99 ++++++++++++++++++++------
|
update submodule and patches to QEMU 8.0.0
Many changes were necessary this time around:
* QAPI was changed to avoid redundant has_* variables, see commit
44ea9d9be3 ("qapi: Start to elide redundant has_FOO in generated C")
for details. This affected many QMP commands added by Proxmox too.
* Pending querying for migration got split into two functions, one to
estimate, one for exact value, see commit c8df4a7aef ("migration:
Split save_live_pending() into state_pending_*") for details. Relevant
for savevm-async and PBS dirty bitmap.
* Some block (driver) functions got converted to coroutines, so the
Proxmox block drivers needed to be adapted.
* Alloc track auto-detaching during PBS live restore got broken by
AioContext-related changes resulting in a deadlock. The current, hacky
method was replaced by a simpler one. Stefan apparently ran into a
problem with that when he wrote the driver, but there were
improvements in the stream job code since then and I didn't manage to
reproduce the issue. It's a separate patch "alloc-track: fix deadlock
during drop" for now, you can find the details there.
* Async snapshot-related changes:
- The pending querying got adapted to the above-mentioned split and
a patch is added to optimize it/make it more similar to what
upstream code does.
- Added initialization of the compression counters (for
future-proofing).
- It's necessary the hold the BQL (big QEMU lock = iothread mutex)
during the setup phase, because block layer functions are used there
and not doing so leads to racy, hard-to-debug crashes or hangs. It's
necessary to change some upstream code too for this, a version of
the patch "migration: for snapshots, hold the BQL during setup
callbacks" is intended to be upstreamed.
- Need to take the bdrv graph read lock before flushing.
* hmp_info_balloon was moved to a different file.
* Needed to include a new headers from time to time to still get the
correct functions.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-05-15 16:39:53 +03:00
|
|
|
blockdev.c | 38 +++++++++-
|
2022-06-27 14:05:40 +03:00
|
|
|
include/block/block_int-global-state.h | 4 +-
|
2023-10-17 15:10:09 +03:00
|
|
|
qapi/block-core.json | 25 ++++++-
|
2022-06-27 14:05:40 +03:00
|
|
|
tests/unit/test-block-iothread.c | 4 +-
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
5 files changed, 142 insertions(+), 28 deletions(-)
|
2020-03-17 10:55:24 +03:00
|
|
|
|
|
|
|
diff --git a/block/mirror.c b/block/mirror.c
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
index abbddb39e4..ed14c8b498 100644
|
2020-03-17 10:55:24 +03:00
|
|
|
--- a/block/mirror.c
|
|
|
|
+++ b/block/mirror.c
|
2022-06-27 14:05:40 +03:00
|
|
|
@@ -51,7 +51,7 @@ typedef struct MirrorBlockJob {
|
2020-03-17 10:55:24 +03:00
|
|
|
BlockDriverState *to_replace;
|
|
|
|
/* Used to block operations on the drive-mirror-replace target */
|
|
|
|
Error *replace_blocker;
|
|
|
|
- bool is_none_mode;
|
|
|
|
+ MirrorSyncMode sync_mode;
|
|
|
|
BlockMirrorBackingMode backing_mode;
|
|
|
|
/* Whether the target image requires explicit zero-initialization */
|
|
|
|
bool zero_target;
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
@@ -73,6 +73,8 @@ typedef struct MirrorBlockJob {
|
2020-03-17 10:55:24 +03:00
|
|
|
size_t buf_size;
|
|
|
|
int64_t bdev_length;
|
|
|
|
unsigned long *cow_bitmap;
|
|
|
|
+ BdrvDirtyBitmap *sync_bitmap;
|
|
|
|
+ BitmapSyncMode bitmap_mode;
|
|
|
|
BdrvDirtyBitmap *dirty_bitmap;
|
|
|
|
BdrvDirtyBitmapIter *dbi;
|
|
|
|
uint8_t *buf;
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
@@ -724,7 +726,8 @@ static int mirror_exit_common(Job *job)
|
2020-03-17 10:55:24 +03:00
|
|
|
&error_abort);
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
|
2020-03-17 10:55:24 +03:00
|
|
|
if (!abort && s->backing_mode == MIRROR_SOURCE_BACKING_CHAIN) {
|
|
|
|
- BlockDriverState *backing = s->is_none_mode ? src : s->base;
|
|
|
|
+ BlockDriverState *backing;
|
|
|
|
+ backing = s->sync_mode == MIRROR_SYNC_MODE_NONE ? src : s->base;
|
2021-02-11 19:11:11 +03:00
|
|
|
BlockDriverState *unfiltered_target = bdrv_skip_filters(target_bs);
|
|
|
|
|
|
|
|
if (bdrv_cow_bs(unfiltered_target) != backing) {
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
@@ -831,6 +834,16 @@ static void mirror_abort(Job *job)
|
2020-03-17 10:55:24 +03:00
|
|
|
assert(ret == 0);
|
|
|
|
}
|
|
|
|
|
|
|
|
+/* Always called after commit/abort. */
|
|
|
|
+static void mirror_clean(Job *job)
|
|
|
|
+{
|
|
|
|
+ MirrorBlockJob *s = container_of(job, MirrorBlockJob, common.job);
|
|
|
|
+
|
|
|
|
+ if (s->sync_bitmap) {
|
|
|
|
+ bdrv_dirty_bitmap_set_busy(s->sync_bitmap, false);
|
|
|
|
+ }
|
|
|
|
+}
|
|
|
|
+
|
|
|
|
static void coroutine_fn mirror_throttle(MirrorBlockJob *s)
|
|
|
|
{
|
|
|
|
int64_t now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
@@ -1027,7 +1040,8 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
|
2020-03-17 10:55:24 +03:00
|
|
|
mirror_free_init(s);
|
|
|
|
|
|
|
|
s->last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
|
|
|
|
- if (!s->is_none_mode) {
|
|
|
|
+ if ((s->sync_mode == MIRROR_SYNC_MODE_TOP) ||
|
|
|
|
+ (s->sync_mode == MIRROR_SYNC_MODE_FULL)) {
|
|
|
|
ret = mirror_dirty_init(s);
|
|
|
|
if (ret < 0 || job_is_cancelled(&s->common.job)) {
|
|
|
|
goto immediate_exit;
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
@@ -1323,6 +1337,7 @@ static const BlockJobDriver mirror_job_driver = {
|
2020-03-17 10:55:24 +03:00
|
|
|
.run = mirror_run,
|
|
|
|
.prepare = mirror_prepare,
|
|
|
|
.abort = mirror_abort,
|
|
|
|
+ .clean = mirror_clean,
|
|
|
|
.pause = mirror_pause,
|
|
|
|
.complete = mirror_complete,
|
2021-05-27 13:43:32 +03:00
|
|
|
.cancel = mirror_cancel,
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
@@ -1341,6 +1356,7 @@ static const BlockJobDriver commit_active_job_driver = {
|
2020-03-17 10:55:24 +03:00
|
|
|
.run = mirror_run,
|
|
|
|
.prepare = mirror_prepare,
|
|
|
|
.abort = mirror_abort,
|
|
|
|
+ .clean = mirror_clean,
|
|
|
|
.pause = mirror_pause,
|
|
|
|
.complete = mirror_complete,
|
2022-02-11 12:24:33 +03:00
|
|
|
.cancel = commit_active_cancel,
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
@@ -1733,7 +1749,10 @@ static BlockJob *mirror_start_job(
|
2020-03-17 10:55:24 +03:00
|
|
|
BlockCompletionFunc *cb,
|
|
|
|
void *opaque,
|
|
|
|
const BlockJobDriver *driver,
|
|
|
|
- bool is_none_mode, BlockDriverState *base,
|
|
|
|
+ MirrorSyncMode sync_mode,
|
|
|
|
+ BdrvDirtyBitmap *bitmap,
|
|
|
|
+ BitmapSyncMode bitmap_mode,
|
|
|
|
+ BlockDriverState *base,
|
|
|
|
bool auto_complete, const char *filter_node_name,
|
|
|
|
bool is_mirror, MirrorCopyMode copy_mode,
|
|
|
|
Error **errp)
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
@@ -1747,10 +1766,39 @@ static BlockJob *mirror_start_job(
|
|
|
|
|
|
|
|
GLOBAL_STATE_CODE();
|
2020-03-17 10:55:24 +03:00
|
|
|
|
|
|
|
- if (granularity == 0) {
|
2022-06-27 14:05:40 +03:00
|
|
|
- granularity = bdrv_get_default_bitmap_granularity(target);
|
2020-03-17 10:55:24 +03:00
|
|
|
+ if (sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) {
|
|
|
|
+ error_setg(errp, "Sync mode '%s' not supported",
|
|
|
|
+ MirrorSyncMode_str(sync_mode));
|
|
|
|
+ return NULL;
|
|
|
|
+ } else if (sync_mode == MIRROR_SYNC_MODE_BITMAP) {
|
|
|
|
+ if (!bitmap) {
|
|
|
|
+ error_setg(errp, "Must provide a valid bitmap name for '%s'"
|
|
|
|
+ " sync mode",
|
|
|
|
+ MirrorSyncMode_str(sync_mode));
|
|
|
|
+ return NULL;
|
|
|
|
+ } else if (bitmap_mode != BITMAP_SYNC_MODE_NEVER) {
|
|
|
|
+ error_setg(errp,
|
|
|
|
+ "Bitmap Sync Mode '%s' is not supported by Mirror",
|
|
|
|
+ BitmapSyncMode_str(bitmap_mode));
|
|
|
|
+ }
|
|
|
|
+ } else if (bitmap) {
|
|
|
|
+ error_setg(errp,
|
|
|
|
+ "sync mode '%s' is not compatible with bitmaps",
|
|
|
|
+ MirrorSyncMode_str(sync_mode));
|
|
|
|
+ return NULL;
|
2022-06-27 14:05:40 +03:00
|
|
|
}
|
|
|
|
|
2020-03-17 10:55:24 +03:00
|
|
|
+ if (bitmap) {
|
|
|
|
+ if (granularity) {
|
|
|
|
+ error_setg(errp, "granularity (%d)"
|
|
|
|
+ "cannot be specified when a bitmap is provided",
|
|
|
|
+ granularity);
|
|
|
|
+ return NULL;
|
|
|
|
+ }
|
|
|
|
+ granularity = bdrv_dirty_bitmap_granularity(bitmap);
|
|
|
|
+ } else if (granularity == 0) {
|
2022-06-27 14:05:40 +03:00
|
|
|
+ granularity = bdrv_get_default_bitmap_granularity(target);
|
|
|
|
+ }
|
2020-03-17 10:55:24 +03:00
|
|
|
assert(is_power_of_2(granularity));
|
|
|
|
|
|
|
|
if (buf_size < 0) {
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
@@ -1890,7 +1938,9 @@ static BlockJob *mirror_start_job(
|
2020-03-17 10:55:24 +03:00
|
|
|
s->replaces = g_strdup(replaces);
|
|
|
|
s->on_source_error = on_source_error;
|
|
|
|
s->on_target_error = on_target_error;
|
|
|
|
- s->is_none_mode = is_none_mode;
|
|
|
|
+ s->sync_mode = sync_mode;
|
|
|
|
+ s->sync_bitmap = bitmap;
|
|
|
|
+ s->bitmap_mode = bitmap_mode;
|
|
|
|
s->backing_mode = backing_mode;
|
|
|
|
s->zero_target = zero_target;
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
qatomic_set(&s->copy_mode, copy_mode);
|
|
|
|
@@ -1916,6 +1966,18 @@ static BlockJob *mirror_start_job(
|
|
|
|
*/
|
|
|
|
bdrv_disable_dirty_bitmap(s->dirty_bitmap);
|
2020-03-17 10:55:24 +03:00
|
|
|
|
|
|
|
+ if (s->sync_bitmap) {
|
|
|
|
+ bdrv_dirty_bitmap_set_busy(s->sync_bitmap, true);
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+ if (s->sync_mode == MIRROR_SYNC_MODE_BITMAP) {
|
|
|
|
+ bdrv_merge_dirty_bitmap(s->dirty_bitmap, s->sync_bitmap,
|
|
|
|
+ NULL, &local_err);
|
|
|
|
+ if (local_err) {
|
|
|
|
+ goto fail;
|
|
|
|
+ }
|
|
|
|
+ }
|
|
|
|
+
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
bdrv_graph_wrlock(bs);
|
2020-03-17 10:55:24 +03:00
|
|
|
ret = block_job_add_bdrv(&s->common, "source", bs, 0,
|
|
|
|
BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE |
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
@@ -1998,6 +2060,9 @@ fail:
|
2020-03-17 10:55:24 +03:00
|
|
|
if (s->dirty_bitmap) {
|
|
|
|
bdrv_release_dirty_bitmap(s->dirty_bitmap);
|
|
|
|
}
|
|
|
|
+ if (s->sync_bitmap) {
|
|
|
|
+ bdrv_dirty_bitmap_set_busy(s->sync_bitmap, false);
|
|
|
|
+ }
|
|
|
|
job_early_fail(&s->common.job);
|
|
|
|
}
|
|
|
|
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
@@ -2020,35 +2085,28 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
|
2020-03-17 10:55:24 +03:00
|
|
|
BlockDriverState *target, const char *replaces,
|
|
|
|
int creation_flags, int64_t speed,
|
|
|
|
uint32_t granularity, int64_t buf_size,
|
|
|
|
- MirrorSyncMode mode, BlockMirrorBackingMode backing_mode,
|
|
|
|
+ MirrorSyncMode mode, BdrvDirtyBitmap *bitmap,
|
|
|
|
+ BitmapSyncMode bitmap_mode,
|
|
|
|
+ BlockMirrorBackingMode backing_mode,
|
|
|
|
bool zero_target,
|
|
|
|
BlockdevOnError on_source_error,
|
|
|
|
BlockdevOnError on_target_error,
|
|
|
|
bool unmap, const char *filter_node_name,
|
|
|
|
MirrorCopyMode copy_mode, Error **errp)
|
|
|
|
{
|
|
|
|
- bool is_none_mode;
|
|
|
|
BlockDriverState *base;
|
|
|
|
|
2022-06-27 14:05:40 +03:00
|
|
|
GLOBAL_STATE_CODE();
|
|
|
|
|
2020-03-17 10:55:24 +03:00
|
|
|
- if ((mode == MIRROR_SYNC_MODE_INCREMENTAL) ||
|
|
|
|
- (mode == MIRROR_SYNC_MODE_BITMAP)) {
|
|
|
|
- error_setg(errp, "Sync mode '%s' not supported",
|
|
|
|
- MirrorSyncMode_str(mode));
|
|
|
|
- return;
|
|
|
|
- }
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
-
|
|
|
|
bdrv_graph_rdlock_main_loop();
|
2020-03-17 10:55:24 +03:00
|
|
|
- is_none_mode = mode == MIRROR_SYNC_MODE_NONE;
|
2021-02-11 19:11:11 +03:00
|
|
|
base = mode == MIRROR_SYNC_MODE_TOP ? bdrv_backing_chain_next(bs) : NULL;
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
bdrv_graph_rdunlock_main_loop();
|
|
|
|
|
2020-03-17 10:55:24 +03:00
|
|
|
mirror_start_job(job_id, bs, creation_flags, target, replaces,
|
|
|
|
speed, granularity, buf_size, backing_mode, zero_target,
|
|
|
|
on_source_error, on_target_error, unmap, NULL, NULL,
|
|
|
|
- &mirror_job_driver, is_none_mode, base, false,
|
|
|
|
- filter_node_name, true, copy_mode, errp);
|
|
|
|
+ &mirror_job_driver, mode, bitmap, bitmap_mode, base,
|
|
|
|
+ false, filter_node_name, true, copy_mode, errp);
|
|
|
|
}
|
|
|
|
|
|
|
|
BlockJob *commit_active_start(const char *job_id, BlockDriverState *bs,
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
@@ -2075,7 +2133,8 @@ BlockJob *commit_active_start(const char *job_id, BlockDriverState *bs,
|
2020-03-17 10:55:24 +03:00
|
|
|
job_id, bs, creation_flags, base, NULL, speed, 0, 0,
|
|
|
|
MIRROR_LEAVE_BACKING_CHAIN, false,
|
|
|
|
on_error, on_error, true, cb, opaque,
|
|
|
|
- &commit_active_job_driver, false, base, auto_complete,
|
|
|
|
+ &commit_active_job_driver, MIRROR_SYNC_MODE_FULL,
|
|
|
|
+ NULL, 0, base, auto_complete,
|
|
|
|
filter_node_name, false, MIRROR_COPY_MODE_BACKGROUND,
|
2021-05-27 13:43:32 +03:00
|
|
|
errp);
|
|
|
|
if (!job) {
|
2020-03-17 10:55:24 +03:00
|
|
|
diff --git a/blockdev.c b/blockdev.c
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
index c91f49e7b6..8c8e8b604a 100644
|
2020-03-17 10:55:24 +03:00
|
|
|
--- a/blockdev.c
|
|
|
|
+++ b/blockdev.c
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
@@ -2903,6 +2903,9 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
|
2020-03-17 10:55:24 +03:00
|
|
|
BlockDriverState *target,
|
update submodule and patches to QEMU 8.0.0
Many changes were necessary this time around:
* QAPI was changed to avoid redundant has_* variables, see commit
44ea9d9be3 ("qapi: Start to elide redundant has_FOO in generated C")
for details. This affected many QMP commands added by Proxmox too.
* Pending querying for migration got split into two functions, one to
estimate, one for exact value, see commit c8df4a7aef ("migration:
Split save_live_pending() into state_pending_*") for details. Relevant
for savevm-async and PBS dirty bitmap.
* Some block (driver) functions got converted to coroutines, so the
Proxmox block drivers needed to be adapted.
* Alloc track auto-detaching during PBS live restore got broken by
AioContext-related changes resulting in a deadlock. The current, hacky
method was replaced by a simpler one. Stefan apparently ran into a
problem with that when he wrote the driver, but there were
improvements in the stream job code since then and I didn't manage to
reproduce the issue. It's a separate patch "alloc-track: fix deadlock
during drop" for now, you can find the details there.
* Async snapshot-related changes:
- The pending querying got adapted to the above-mentioned split and
a patch is added to optimize it/make it more similar to what
upstream code does.
- Added initialization of the compression counters (for
future-proofing).
- It's necessary the hold the BQL (big QEMU lock = iothread mutex)
during the setup phase, because block layer functions are used there
and not doing so leads to racy, hard-to-debug crashes or hangs. It's
necessary to change some upstream code too for this, a version of
the patch "migration: for snapshots, hold the BQL during setup
callbacks" is intended to be upstreamed.
- Need to take the bdrv graph read lock before flushing.
* hmp_info_balloon was moved to a different file.
* Needed to include a new headers from time to time to still get the
correct functions.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-05-15 16:39:53 +03:00
|
|
|
const char *replaces,
|
2020-03-17 10:55:24 +03:00
|
|
|
enum MirrorSyncMode sync,
|
|
|
|
+ const char *bitmap_name,
|
|
|
|
+ bool has_bitmap_mode,
|
|
|
|
+ BitmapSyncMode bitmap_mode,
|
|
|
|
BlockMirrorBackingMode backing_mode,
|
|
|
|
bool zero_target,
|
|
|
|
bool has_speed, int64_t speed,
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
@@ -2921,6 +2924,7 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
|
2020-03-17 10:55:24 +03:00
|
|
|
{
|
2021-02-11 19:11:11 +03:00
|
|
|
BlockDriverState *unfiltered_bs;
|
2020-03-17 10:55:24 +03:00
|
|
|
int job_flags = JOB_DEFAULT;
|
|
|
|
+ BdrvDirtyBitmap *bitmap = NULL;
|
|
|
|
|
2023-10-17 15:10:09 +03:00
|
|
|
GLOBAL_STATE_CODE();
|
|
|
|
GRAPH_RDLOCK_GUARD_MAINLOOP();
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
@@ -2975,6 +2979,29 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
|
2020-03-17 10:55:24 +03:00
|
|
|
sync = MIRROR_SYNC_MODE_FULL;
|
|
|
|
}
|
|
|
|
|
2023-06-15 14:39:00 +03:00
|
|
|
+ if (bitmap_name) {
|
2020-03-17 10:55:24 +03:00
|
|
|
+ if (granularity) {
|
|
|
|
+ error_setg(errp, "Granularity and bitmap cannot both be set");
|
|
|
|
+ return;
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+ if (!has_bitmap_mode) {
|
|
|
|
+ error_setg(errp, "bitmap-mode must be specified if"
|
|
|
|
+ " a bitmap is provided");
|
|
|
|
+ return;
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+ bitmap = bdrv_find_dirty_bitmap(bs, bitmap_name);
|
|
|
|
+ if (!bitmap) {
|
|
|
|
+ error_setg(errp, "Dirty bitmap '%s' not found", bitmap_name);
|
|
|
|
+ return;
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+ if (bdrv_dirty_bitmap_check(bitmap, BDRV_BITMAP_ALLOW_RO, errp)) {
|
|
|
|
+ return;
|
|
|
|
+ }
|
|
|
|
+ }
|
|
|
|
+
|
update submodule and patches to QEMU 8.0.0
Many changes were necessary this time around:
* QAPI was changed to avoid redundant has_* variables, see commit
44ea9d9be3 ("qapi: Start to elide redundant has_FOO in generated C")
for details. This affected many QMP commands added by Proxmox too.
* Pending querying for migration got split into two functions, one to
estimate, one for exact value, see commit c8df4a7aef ("migration:
Split save_live_pending() into state_pending_*") for details. Relevant
for savevm-async and PBS dirty bitmap.
* Some block (driver) functions got converted to coroutines, so the
Proxmox block drivers needed to be adapted.
* Alloc track auto-detaching during PBS live restore got broken by
AioContext-related changes resulting in a deadlock. The current, hacky
method was replaced by a simpler one. Stefan apparently ran into a
problem with that when he wrote the driver, but there were
improvements in the stream job code since then and I didn't manage to
reproduce the issue. It's a separate patch "alloc-track: fix deadlock
during drop" for now, you can find the details there.
* Async snapshot-related changes:
- The pending querying got adapted to the above-mentioned split and
a patch is added to optimize it/make it more similar to what
upstream code does.
- Added initialization of the compression counters (for
future-proofing).
- It's necessary the hold the BQL (big QEMU lock = iothread mutex)
during the setup phase, because block layer functions are used there
and not doing so leads to racy, hard-to-debug crashes or hangs. It's
necessary to change some upstream code too for this, a version of
the patch "migration: for snapshots, hold the BQL during setup
callbacks" is intended to be upstreamed.
- Need to take the bdrv graph read lock before flushing.
* hmp_info_balloon was moved to a different file.
* Needed to include a new headers from time to time to still get the
correct functions.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-05-15 16:39:53 +03:00
|
|
|
if (!replaces) {
|
2021-02-11 19:11:11 +03:00
|
|
|
/* We want to mirror from @bs, but keep implicit filters on top */
|
|
|
|
unfiltered_bs = bdrv_skip_implicit_filters(bs);
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
@@ -3030,8 +3057,8 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
|
2020-03-17 10:55:24 +03:00
|
|
|
* and will allow to check whether the node still exist at mirror completion
|
|
|
|
*/
|
|
|
|
mirror_start(job_id, bs, target,
|
update submodule and patches to QEMU 8.0.0
Many changes were necessary this time around:
* QAPI was changed to avoid redundant has_* variables, see commit
44ea9d9be3 ("qapi: Start to elide redundant has_FOO in generated C")
for details. This affected many QMP commands added by Proxmox too.
* Pending querying for migration got split into two functions, one to
estimate, one for exact value, see commit c8df4a7aef ("migration:
Split save_live_pending() into state_pending_*") for details. Relevant
for savevm-async and PBS dirty bitmap.
* Some block (driver) functions got converted to coroutines, so the
Proxmox block drivers needed to be adapted.
* Alloc track auto-detaching during PBS live restore got broken by
AioContext-related changes resulting in a deadlock. The current, hacky
method was replaced by a simpler one. Stefan apparently ran into a
problem with that when he wrote the driver, but there were
improvements in the stream job code since then and I didn't manage to
reproduce the issue. It's a separate patch "alloc-track: fix deadlock
during drop" for now, you can find the details there.
* Async snapshot-related changes:
- The pending querying got adapted to the above-mentioned split and
a patch is added to optimize it/make it more similar to what
upstream code does.
- Added initialization of the compression counters (for
future-proofing).
- It's necessary the hold the BQL (big QEMU lock = iothread mutex)
during the setup phase, because block layer functions are used there
and not doing so leads to racy, hard-to-debug crashes or hangs. It's
necessary to change some upstream code too for this, a version of
the patch "migration: for snapshots, hold the BQL during setup
callbacks" is intended to be upstreamed.
- Need to take the bdrv graph read lock before flushing.
* hmp_info_balloon was moved to a different file.
* Needed to include a new headers from time to time to still get the
correct functions.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-05-15 16:39:53 +03:00
|
|
|
- replaces, job_flags,
|
2020-03-17 10:55:24 +03:00
|
|
|
- speed, granularity, buf_size, sync, backing_mode, zero_target,
|
update submodule and patches to QEMU 8.0.0
Many changes were necessary this time around:
* QAPI was changed to avoid redundant has_* variables, see commit
44ea9d9be3 ("qapi: Start to elide redundant has_FOO in generated C")
for details. This affected many QMP commands added by Proxmox too.
* Pending querying for migration got split into two functions, one to
estimate, one for exact value, see commit c8df4a7aef ("migration:
Split save_live_pending() into state_pending_*") for details. Relevant
for savevm-async and PBS dirty bitmap.
* Some block (driver) functions got converted to coroutines, so the
Proxmox block drivers needed to be adapted.
* Alloc track auto-detaching during PBS live restore got broken by
AioContext-related changes resulting in a deadlock. The current, hacky
method was replaced by a simpler one. Stefan apparently ran into a
problem with that when he wrote the driver, but there were
improvements in the stream job code since then and I didn't manage to
reproduce the issue. It's a separate patch "alloc-track: fix deadlock
during drop" for now, you can find the details there.
* Async snapshot-related changes:
- The pending querying got adapted to the above-mentioned split and
a patch is added to optimize it/make it more similar to what
upstream code does.
- Added initialization of the compression counters (for
future-proofing).
- It's necessary the hold the BQL (big QEMU lock = iothread mutex)
during the setup phase, because block layer functions are used there
and not doing so leads to racy, hard-to-debug crashes or hangs. It's
necessary to change some upstream code too for this, a version of
the patch "migration: for snapshots, hold the BQL during setup
callbacks" is intended to be upstreamed.
- Need to take the bdrv graph read lock before flushing.
* hmp_info_balloon was moved to a different file.
* Needed to include a new headers from time to time to still get the
correct functions.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-05-15 16:39:53 +03:00
|
|
|
+ replaces, job_flags, speed, granularity, buf_size, sync,
|
|
|
|
+ bitmap, bitmap_mode, backing_mode, zero_target,
|
2020-03-17 10:55:24 +03:00
|
|
|
on_source_error, on_target_error, unmap, filter_node_name,
|
|
|
|
copy_mode, errp);
|
|
|
|
}
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
@@ -3189,6 +3216,8 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
|
2020-03-17 10:55:24 +03:00
|
|
|
|
update submodule and patches to QEMU 8.0.0
Many changes were necessary this time around:
* QAPI was changed to avoid redundant has_* variables, see commit
44ea9d9be3 ("qapi: Start to elide redundant has_FOO in generated C")
for details. This affected many QMP commands added by Proxmox too.
* Pending querying for migration got split into two functions, one to
estimate, one for exact value, see commit c8df4a7aef ("migration:
Split save_live_pending() into state_pending_*") for details. Relevant
for savevm-async and PBS dirty bitmap.
* Some block (driver) functions got converted to coroutines, so the
Proxmox block drivers needed to be adapted.
* Alloc track auto-detaching during PBS live restore got broken by
AioContext-related changes resulting in a deadlock. The current, hacky
method was replaced by a simpler one. Stefan apparently ran into a
problem with that when he wrote the driver, but there were
improvements in the stream job code since then and I didn't manage to
reproduce the issue. It's a separate patch "alloc-track: fix deadlock
during drop" for now, you can find the details there.
* Async snapshot-related changes:
- The pending querying got adapted to the above-mentioned split and
a patch is added to optimize it/make it more similar to what
upstream code does.
- Added initialization of the compression counters (for
future-proofing).
- It's necessary the hold the BQL (big QEMU lock = iothread mutex)
during the setup phase, because block layer functions are used there
and not doing so leads to racy, hard-to-debug crashes or hangs. It's
necessary to change some upstream code too for this, a version of
the patch "migration: for snapshots, hold the BQL during setup
callbacks" is intended to be upstreamed.
- Need to take the bdrv graph read lock before flushing.
* hmp_info_balloon was moved to a different file.
* Needed to include a new headers from time to time to still get the
correct functions.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-05-15 16:39:53 +03:00
|
|
|
blockdev_mirror_common(arg->job_id, bs, target_bs,
|
|
|
|
arg->replaces, arg->sync,
|
|
|
|
+ arg->bitmap,
|
2020-03-17 10:55:24 +03:00
|
|
|
+ arg->has_bitmap_mode, arg->bitmap_mode,
|
|
|
|
backing_mode, zero_target,
|
|
|
|
arg->has_speed, arg->speed,
|
|
|
|
arg->has_granularity, arg->granularity,
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
@@ -3210,6 +3239,8 @@ void qmp_blockdev_mirror(const char *job_id,
|
2020-03-17 10:55:24 +03:00
|
|
|
const char *device, const char *target,
|
update submodule and patches to QEMU 8.0.0
Many changes were necessary this time around:
* QAPI was changed to avoid redundant has_* variables, see commit
44ea9d9be3 ("qapi: Start to elide redundant has_FOO in generated C")
for details. This affected many QMP commands added by Proxmox too.
* Pending querying for migration got split into two functions, one to
estimate, one for exact value, see commit c8df4a7aef ("migration:
Split save_live_pending() into state_pending_*") for details. Relevant
for savevm-async and PBS dirty bitmap.
* Some block (driver) functions got converted to coroutines, so the
Proxmox block drivers needed to be adapted.
* Alloc track auto-detaching during PBS live restore got broken by
AioContext-related changes resulting in a deadlock. The current, hacky
method was replaced by a simpler one. Stefan apparently ran into a
problem with that when he wrote the driver, but there were
improvements in the stream job code since then and I didn't manage to
reproduce the issue. It's a separate patch "alloc-track: fix deadlock
during drop" for now, you can find the details there.
* Async snapshot-related changes:
- The pending querying got adapted to the above-mentioned split and
a patch is added to optimize it/make it more similar to what
upstream code does.
- Added initialization of the compression counters (for
future-proofing).
- It's necessary the hold the BQL (big QEMU lock = iothread mutex)
during the setup phase, because block layer functions are used there
and not doing so leads to racy, hard-to-debug crashes or hangs. It's
necessary to change some upstream code too for this, a version of
the patch "migration: for snapshots, hold the BQL during setup
callbacks" is intended to be upstreamed.
- Need to take the bdrv graph read lock before flushing.
* hmp_info_balloon was moved to a different file.
* Needed to include a new headers from time to time to still get the
correct functions.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-05-15 16:39:53 +03:00
|
|
|
const char *replaces,
|
2020-03-17 10:55:24 +03:00
|
|
|
MirrorSyncMode sync,
|
update submodule and patches to QEMU 8.0.0
Many changes were necessary this time around:
* QAPI was changed to avoid redundant has_* variables, see commit
44ea9d9be3 ("qapi: Start to elide redundant has_FOO in generated C")
for details. This affected many QMP commands added by Proxmox too.
* Pending querying for migration got split into two functions, one to
estimate, one for exact value, see commit c8df4a7aef ("migration:
Split save_live_pending() into state_pending_*") for details. Relevant
for savevm-async and PBS dirty bitmap.
* Some block (driver) functions got converted to coroutines, so the
Proxmox block drivers needed to be adapted.
* Alloc track auto-detaching during PBS live restore got broken by
AioContext-related changes resulting in a deadlock. The current, hacky
method was replaced by a simpler one. Stefan apparently ran into a
problem with that when he wrote the driver, but there were
improvements in the stream job code since then and I didn't manage to
reproduce the issue. It's a separate patch "alloc-track: fix deadlock
during drop" for now, you can find the details there.
* Async snapshot-related changes:
- The pending querying got adapted to the above-mentioned split and
a patch is added to optimize it/make it more similar to what
upstream code does.
- Added initialization of the compression counters (for
future-proofing).
- It's necessary the hold the BQL (big QEMU lock = iothread mutex)
during the setup phase, because block layer functions are used there
and not doing so leads to racy, hard-to-debug crashes or hangs. It's
necessary to change some upstream code too for this, a version of
the patch "migration: for snapshots, hold the BQL during setup
callbacks" is intended to be upstreamed.
- Need to take the bdrv graph read lock before flushing.
* hmp_info_balloon was moved to a different file.
* Needed to include a new headers from time to time to still get the
correct functions.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-05-15 16:39:53 +03:00
|
|
|
+ const char *bitmap,
|
2020-03-17 10:55:24 +03:00
|
|
|
+ bool has_bitmap_mode, BitmapSyncMode bitmap_mode,
|
|
|
|
bool has_speed, int64_t speed,
|
|
|
|
bool has_granularity, uint32_t granularity,
|
|
|
|
bool has_buf_size, int64_t buf_size,
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
@@ -3258,7 +3289,8 @@ void qmp_blockdev_mirror(const char *job_id,
|
2020-03-17 10:55:24 +03:00
|
|
|
}
|
|
|
|
|
update submodule and patches to QEMU 8.0.0
Many changes were necessary this time around:
* QAPI was changed to avoid redundant has_* variables, see commit
44ea9d9be3 ("qapi: Start to elide redundant has_FOO in generated C")
for details. This affected many QMP commands added by Proxmox too.
* Pending querying for migration got split into two functions, one to
estimate, one for exact value, see commit c8df4a7aef ("migration:
Split save_live_pending() into state_pending_*") for details. Relevant
for savevm-async and PBS dirty bitmap.
* Some block (driver) functions got converted to coroutines, so the
Proxmox block drivers needed to be adapted.
* Alloc track auto-detaching during PBS live restore got broken by
AioContext-related changes resulting in a deadlock. The current, hacky
method was replaced by a simpler one. Stefan apparently ran into a
problem with that when he wrote the driver, but there were
improvements in the stream job code since then and I didn't manage to
reproduce the issue. It's a separate patch "alloc-track: fix deadlock
during drop" for now, you can find the details there.
* Async snapshot-related changes:
- The pending querying got adapted to the above-mentioned split and
a patch is added to optimize it/make it more similar to what
upstream code does.
- Added initialization of the compression counters (for
future-proofing).
- It's necessary the hold the BQL (big QEMU lock = iothread mutex)
during the setup phase, because block layer functions are used there
and not doing so leads to racy, hard-to-debug crashes or hangs. It's
necessary to change some upstream code too for this, a version of
the patch "migration: for snapshots, hold the BQL during setup
callbacks" is intended to be upstreamed.
- Need to take the bdrv graph read lock before flushing.
* hmp_info_balloon was moved to a different file.
* Needed to include a new headers from time to time to still get the
correct functions.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-05-15 16:39:53 +03:00
|
|
|
blockdev_mirror_common(job_id, bs, target_bs,
|
|
|
|
- replaces, sync, backing_mode,
|
|
|
|
+ replaces, sync,
|
2020-03-17 10:55:24 +03:00
|
|
|
+ bitmap, has_bitmap_mode, bitmap_mode, backing_mode,
|
|
|
|
zero_target, has_speed, speed,
|
|
|
|
has_granularity, granularity,
|
|
|
|
has_buf_size, buf_size,
|
2022-06-27 14:05:40 +03:00
|
|
|
diff --git a/include/block/block_int-global-state.h b/include/block/block_int-global-state.h
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
index ef31c58bb3..57265a617a 100644
|
2022-06-27 14:05:40 +03:00
|
|
|
--- a/include/block/block_int-global-state.h
|
|
|
|
+++ b/include/block/block_int-global-state.h
|
update submodule and patches to QEMU 8.0.0
Many changes were necessary this time around:
* QAPI was changed to avoid redundant has_* variables, see commit
44ea9d9be3 ("qapi: Start to elide redundant has_FOO in generated C")
for details. This affected many QMP commands added by Proxmox too.
* Pending querying for migration got split into two functions, one to
estimate, one for exact value, see commit c8df4a7aef ("migration:
Split save_live_pending() into state_pending_*") for details. Relevant
for savevm-async and PBS dirty bitmap.
* Some block (driver) functions got converted to coroutines, so the
Proxmox block drivers needed to be adapted.
* Alloc track auto-detaching during PBS live restore got broken by
AioContext-related changes resulting in a deadlock. The current, hacky
method was replaced by a simpler one. Stefan apparently ran into a
problem with that when he wrote the driver, but there were
improvements in the stream job code since then and I didn't manage to
reproduce the issue. It's a separate patch "alloc-track: fix deadlock
during drop" for now, you can find the details there.
* Async snapshot-related changes:
- The pending querying got adapted to the above-mentioned split and
a patch is added to optimize it/make it more similar to what
upstream code does.
- Added initialization of the compression counters (for
future-proofing).
- It's necessary the hold the BQL (big QEMU lock = iothread mutex)
during the setup phase, because block layer functions are used there
and not doing so leads to racy, hard-to-debug crashes or hangs. It's
necessary to change some upstream code too for this, a version of
the patch "migration: for snapshots, hold the BQL during setup
callbacks" is intended to be upstreamed.
- Need to take the bdrv graph read lock before flushing.
* hmp_info_balloon was moved to a different file.
* Needed to include a new headers from time to time to still get the
correct functions.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-05-15 16:39:53 +03:00
|
|
|
@@ -152,7 +152,9 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
|
2020-04-07 17:53:19 +03:00
|
|
|
BlockDriverState *target, const char *replaces,
|
|
|
|
int creation_flags, int64_t speed,
|
|
|
|
uint32_t granularity, int64_t buf_size,
|
|
|
|
- MirrorSyncMode mode, BlockMirrorBackingMode backing_mode,
|
|
|
|
+ MirrorSyncMode mode, BdrvDirtyBitmap *bitmap,
|
|
|
|
+ BitmapSyncMode bitmap_mode,
|
|
|
|
+ BlockMirrorBackingMode backing_mode,
|
|
|
|
bool zero_target,
|
|
|
|
BlockdevOnError on_source_error,
|
|
|
|
BlockdevOnError on_target_error,
|
2020-03-17 10:55:24 +03:00
|
|
|
diff --git a/qapi/block-core.json b/qapi/block-core.json
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
index ca390c5700..8db0986e9e 100644
|
2020-03-17 10:55:24 +03:00
|
|
|
--- a/qapi/block-core.json
|
|
|
|
+++ b/qapi/block-core.json
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
@@ -2163,6 +2163,15 @@
|
2023-10-17 15:10:09 +03:00
|
|
|
# destination (all the disk, only the sectors allocated in the
|
|
|
|
# topmost image, or only new I/O).
|
2020-03-17 10:55:24 +03:00
|
|
|
#
|
2023-10-17 15:10:09 +03:00
|
|
|
+# @bitmap: The name of a bitmap to use for sync=bitmap mode. This
|
|
|
|
+# argument must be present for bitmap mode and absent otherwise.
|
|
|
|
+# The bitmap's granularity is used instead of @granularity (Since
|
|
|
|
+# 4.1).
|
2020-03-17 10:55:24 +03:00
|
|
|
+#
|
2023-10-17 15:10:09 +03:00
|
|
|
+# @bitmap-mode: Specifies the type of data the bitmap should contain
|
|
|
|
+# after the operation concludes. Must be present if sync is
|
|
|
|
+# "bitmap". Must NOT be present otherwise. (Since 4.1)
|
2020-03-17 10:55:24 +03:00
|
|
|
+#
|
2023-10-17 15:10:09 +03:00
|
|
|
# @granularity: granularity of the dirty bitmap, default is 64K if the
|
|
|
|
# image format doesn't have clusters, 4K if the clusters are
|
|
|
|
# smaller than that, else the cluster size. Must be a power of 2
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
@@ -2205,7 +2214,9 @@
|
2020-03-17 10:55:24 +03:00
|
|
|
{ 'struct': 'DriveMirror',
|
|
|
|
'data': { '*job-id': 'str', 'device': 'str', 'target': 'str',
|
|
|
|
'*format': 'str', '*node-name': 'str', '*replaces': 'str',
|
|
|
|
- 'sync': 'MirrorSyncMode', '*mode': 'NewImageMode',
|
|
|
|
+ 'sync': 'MirrorSyncMode', '*bitmap': 'str',
|
|
|
|
+ '*bitmap-mode': 'BitmapSyncMode',
|
|
|
|
+ '*mode': 'NewImageMode',
|
|
|
|
'*speed': 'int', '*granularity': 'uint32',
|
|
|
|
'*buf-size': 'int', '*on-source-error': 'BlockdevOnError',
|
|
|
|
'*on-target-error': 'BlockdevOnError',
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
@@ -2489,6 +2500,15 @@
|
2023-10-17 15:10:09 +03:00
|
|
|
# destination (all the disk, only the sectors allocated in the
|
|
|
|
# topmost image, or only new I/O).
|
2020-03-17 10:55:24 +03:00
|
|
|
#
|
2023-10-17 15:10:09 +03:00
|
|
|
+# @bitmap: The name of a bitmap to use for sync=bitmap mode. This
|
|
|
|
+# argument must be present for bitmap mode and absent otherwise.
|
|
|
|
+# The bitmap's granularity is used instead of @granularity (since
|
|
|
|
+# 4.1).
|
2020-03-17 10:55:24 +03:00
|
|
|
+#
|
2023-10-17 15:10:09 +03:00
|
|
|
+# @bitmap-mode: Specifies the type of data the bitmap should contain
|
|
|
|
+# after the operation concludes. Must be present if sync is
|
|
|
|
+# "bitmap". Must NOT be present otherwise. (Since 4.1)
|
2020-03-17 10:55:24 +03:00
|
|
|
+#
|
2023-10-17 15:10:09 +03:00
|
|
|
# @granularity: granularity of the dirty bitmap, default is 64K if the
|
|
|
|
# image format doesn't have clusters, 4K if the clusters are
|
|
|
|
# smaller than that, else the cluster size. Must be a power of 2
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
@@ -2539,7 +2559,8 @@
|
2020-03-17 10:55:24 +03:00
|
|
|
{ 'command': 'blockdev-mirror',
|
|
|
|
'data': { '*job-id': 'str', 'device': 'str', 'target': 'str',
|
|
|
|
'*replaces': 'str',
|
|
|
|
- 'sync': 'MirrorSyncMode',
|
|
|
|
+ 'sync': 'MirrorSyncMode', '*bitmap': 'str',
|
|
|
|
+ '*bitmap-mode': 'BitmapSyncMode',
|
|
|
|
'*speed': 'int', '*granularity': 'uint32',
|
|
|
|
'*buf-size': 'int', '*on-source-error': 'BlockdevOnError',
|
|
|
|
'*on-target-error': 'BlockdevOnError',
|
2021-05-27 13:43:32 +03:00
|
|
|
diff --git a/tests/unit/test-block-iothread.c b/tests/unit/test-block-iothread.c
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
index 9b15d2768c..54acd47188 100644
|
2021-05-27 13:43:32 +03:00
|
|
|
--- a/tests/unit/test-block-iothread.c
|
|
|
|
+++ b/tests/unit/test-block-iothread.c
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
@@ -766,8 +766,8 @@ static void test_propagate_mirror(void)
|
2020-04-07 17:53:19 +03:00
|
|
|
/* Start a mirror job */
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
aio_context_acquire(main_ctx);
|
2020-04-07 17:53:19 +03:00
|
|
|
mirror_start("job0", src, target, NULL, JOB_DEFAULT, 0, 0, 0,
|
|
|
|
- MIRROR_SYNC_MODE_NONE, MIRROR_OPEN_BACKING_CHAIN, false,
|
|
|
|
- BLOCKDEV_ON_ERROR_REPORT, BLOCKDEV_ON_ERROR_REPORT,
|
|
|
|
+ MIRROR_SYNC_MODE_NONE, NULL, 0, MIRROR_OPEN_BACKING_CHAIN,
|
|
|
|
+ false, BLOCKDEV_ON_ERROR_REPORT, BLOCKDEV_ON_ERROR_REPORT,
|
|
|
|
false, "filter_node", MIRROR_COPY_MODE_BACKGROUND,
|
|
|
|
&error_abort);
|
update submodule and patches to QEMU 8.2.2
This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock like was done in QEMU 8.1, because there
are no changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.
During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.
The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.
Local variable shadowing is prohibited via a compiler flag now,
required slight adaptation in vma.c.
Major changes only affect alloc-track:
* It is not possible to call a generated co-wrapper like
bdrv_get_info() while holding the block graph lock exclusively [0],
which does happen during initialization of alloc-track when the
backing hd is set and the refresh_limits driver callback is invoked.
The bdrv_get_info() call to get the cluster size is moved to
directly after opening the file child in track_open().
The important thing is that at least the request alignment for the
write target is used, because then the RMW cycle in bdrv_pwritev
will gather enough data from the backing file. Partial cluster
allocations in the target are not a fundamental issue, because the
driver returns its allocation status based on the bitmap, so any
other data that maps to the same cluster will still be copied later
by a stream job (or during writes to that cluster).
* Replacing the node cannot be done in the
track_co_change_backing_file() callback, because it is a coroutine
and cannot hold the block graph lock exclusively. So it is moved to
the stream job itself with the auto-remove option not having an
effect anymore (qemu-server would always set it anyways).
In the future, there could either be a special option for the stream
job, or maybe the upcoming blockdev-replace QMP command can be used.
Replacing the backing child is actually already done in the stream
job, so no need to do it in the track_co_change_backing_file()
callback. It also cannot be called from a coroutine. Looking at the
implementation in the qcow2 driver, it doesn't seem to be intended
to change the backing child itself, just update driver-internal
state.
Other changes:
* alloc-track: Error out early when used without auto-remove. Since
replacing the node now happens in the stream job, where the option
cannot be read from (it's internal to the driver), it will always be
treated as 'on'. Makes sure to have users beside qemu-server notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
* alloc-track: Avoid seemingly superfluous child permission update.
Doesn't seem necessary nowadays (maybe after commit "alloc-track:
fix deadlock during drop" where the dropping is not rescheduled and
delayed anymore or some upstream change). Replacing the block node
will already update the permissions of the new node (which was the
file child before). Should there really be some issue, instead of
having a drop state, this could also be just based off the fact
whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
* PBS block driver: compile unconditionally. Proxmox VE always needs
it and something in the build process changed to make it not enabled
by default. Probably would need to move the build option to meson
otherwise.
* backup: job unreferencing during cleanup needs to happen outside of
coroutine, so it was moved to before invoking the clean
* mirror: Cherry-pick stable fix to avoid potential deadlock.
* savevm-async: migrate_init now can fail, so propagate potential
error.
* savevm-async: compression counters are not accessible outside
migration/ram-compress now, so drop code that prophylactically set
it to zero.
[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-25 18:21:28 +03:00
|
|
|
aio_context_release(main_ctx);
|