Fix rare cksum errors after rebuild

Currently, after rebuild (aka sequential resilver), checksum
errors can be seen sometimes on the spare vdev or draid spare.
On my laptop, it happens from 2 to 4 times of running
redundancy_draid_spare1 test in a loop for 100 times.

It looks like there's a race in vdev_rebuild_thread() when the
rebuild of space map ranges is finished and we re-enable
allocations from the metaslab too soon: a new allocations may
happen from that metaslab before txg with the rebuilt ranges is
sync-ed, causing undesirable interference.

Solution: wait for the txg to be sync-ed before enabling metaslab.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Akash B <akash-b@hpe.com>
Signed-off-by: Andriy Tkachuk <atkachuk@wasabi.com>
Closes #18307
Closes #18319
Closes #18473
This commit is contained in:
Andriy Tkachuk
2026-05-01 20:15:27 +01:00
committed by Tony Hutter
parent b0c1dcb531
commit 76fd64ac9f
3 changed files with 11 additions and 8 deletions
+1
View File
@@ -70,6 +70,7 @@ typedef struct vdev_rebuild {
zfs_range_tree_t *vr_scan_tree; zfs_range_tree_t *vr_scan_tree;
kmutex_t vr_io_lock; /* inflight IO lock */ kmutex_t vr_io_lock; /* inflight IO lock */
kcondvar_t vr_io_cv; /* inflight IO cv */ kcondvar_t vr_io_cv; /* inflight IO cv */
uint64_t vr_last_txg; /* last used txg */
/* In-core state and progress */ /* In-core state and progress */
uint64_t vr_scan_offset[TXG_SIZE]; uint64_t vr_scan_offset[TXG_SIZE];
+8 -1
View File
@@ -593,6 +593,7 @@ vdev_rebuild_range(vdev_rebuild_t *vr, uint64_t start, uint64_t size)
dmu_tx_t *tx = dmu_tx_create_dd(spa_get_dsl(spa)->dp_mos_dir); dmu_tx_t *tx = dmu_tx_create_dd(spa_get_dsl(spa)->dp_mos_dir);
VERIFY0(dmu_tx_assign(tx, DMU_TX_WAIT | DMU_TX_SUSPEND)); VERIFY0(dmu_tx_assign(tx, DMU_TX_WAIT | DMU_TX_SUSPEND));
uint64_t txg = dmu_tx_get_txg(tx); uint64_t txg = dmu_tx_get_txg(tx);
vr->vr_last_txg = txg;
spa_config_enter(spa, SCL_STATE_ALL, vd, RW_READER); spa_config_enter(spa, SCL_STATE_ALL, vd, RW_READER);
mutex_enter(&vd->vdev_rebuild_lock); mutex_enter(&vd->vdev_rebuild_lock);
@@ -908,8 +909,14 @@ vdev_rebuild_thread(void *arg)
error = vdev_rebuild_ranges(vr); error = vdev_rebuild_ranges(vr);
zfs_range_tree_vacate(vr->vr_scan_tree, NULL, NULL); zfs_range_tree_vacate(vr->vr_scan_tree, NULL, NULL);
spa_config_enter(spa, SCL_CONFIG, FTAG, RW_READER); /*
* Allow rebuilt ranges to be sync-ed before enabling metaslab
* to avoid any interfering allocations. Otherwise, we might
* see checksum errors after scrub.
*/
txg_wait_synced(dp, vr->vr_last_txg);
metaslab_enable(msp, B_FALSE, B_FALSE); metaslab_enable(msp, B_FALSE, B_FALSE);
spa_config_enter(spa, SCL_CONFIG, FTAG, RW_READER);
if (error != 0) if (error != 0)
break; break;
@@ -360,12 +360,7 @@ function recover_bad_missing_devs
# expected state after a healing resilver of a healthy pool. # expected state after a healing resilver of a healthy pool.
# #
# 2. sequential - The pool is fully intact. There should never be a # 2. sequential - The pool is fully intact. There should never be a
# checksum error, but the occasional checksum error does occur in # checksum error.
# practice. Until the root cause is identified and resolved, tolerate
# a checksum error when scrubbing after a sequential resilver.
#
# https://github.com/openzfs/zfs/issues/18307
# https://github.com/openzfs/zfs/issues/18319
# #
# 3. damaged - The pool was intentionally silently damaged. Checksum # 3. damaged - The pool was intentionally silently damaged. Checksum
# errors are expected to be reported as the damaged blocks are # errors are expected to be reported as the damaged blocks are
@@ -395,7 +390,7 @@ function verify_draid_pool
log_fail "Unexpected repair IO found for $pool ($cksum)" log_fail "Unexpected repair IO found for $pool ($cksum)"
fi fi
elif [[ "$replace_mode" = "sequential" ]]; then elif [[ "$replace_mode" = "sequential" ]]; then
if [[ $cksum -gt 3 ]]; then if [[ $cksum -gt 0 ]]; then
log_must zpool status -v $pool log_must zpool status -v $pool
log_fail "Unexpected CKSUM errors found for $pool ($cksum)" log_fail "Unexpected CKSUM errors found for $pool ($cksum)"
fi fi