Defer async destroys on pool import

We've observed a number of cases where pool import was stuck for many
minutes because a large async destroy was trying to load the DDT or
BRT from an HDD pool.  While properly throttling destroys is a
separate problem, let's at least give the import process a chance to
complete before the destroy starts.  This may not be enough if there
is a lot of ZIL to replay, but that is harder to cover, since replay
happens in separate syscalls.

Code investigation showed that we already have this mechanism for
scrub/resilver, so this patch converts SCAN_IMPORT_WAIT_TXGS into a
tunable (zfs_import_defer_txgs) and applies it to async destroys as
well.
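
The deferral condition can be illustrated in isolation. The following is a minimal sketch, not the actual OpenZFS code: `mock_spa_t` and `import_defer_active()` are hypothetical names introduced here for illustration, and only the comparison itself mirrors the check this patch adds at the top of dsl_scan_sync().

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the new tunable; the default of 5 matches the patch. */
static unsigned int zfs_import_defer_txgs = 5;

/*
 * Hypothetical, trimmed-down stand-in for the two spa_t fields that
 * the check reads. The real spa_t contains far more state.
 */
typedef struct mock_spa {
	uint64_t spa_first_txg;		/* first txg after the pool was opened */
	uint64_t spa_syncing_txg;	/* txg currently being synced */
} mock_spa_t;

/*
 * Returns true while background work (async destroys, scrub,
 * resilver) should still be held back after import.
 */
static bool
import_defer_active(const mock_spa_t *spa)
{
	return (spa->spa_syncing_txg <
	    spa->spa_first_txg + zfs_import_defer_txgs);
}
```

Under these assumptions, a pool whose first txg is 100 would hold back background work for txgs 100 through 104 and allow it from txg 105 onward. Setting the tunable to 0 makes the condition always false, so nothing is deferred.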

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #18033
Alexander Motin
2025-12-09 15:16:46 -05:00
committed by Tony Hutter
parent d5724f8f3f
commit 135103a648
2 changed files with 28 additions and 16 deletions
+18 -16
@@ -217,16 +217,14 @@ static int zfs_resilver_disable_defer = B_FALSE;
 static uint_t zfs_resilver_defer_percent = 10;

 /*
- * We wait a few txgs after importing a pool to begin scanning so that
- * the import / mounting code isn't held up by scrub / resilver IO.
- * Unfortunately, it is a bit difficult to determine exactly how long
- * this will take since userspace will trigger fs mounts asynchronously
- * and the kernel will create zvol minors asynchronously. As a result,
- * the value provided here is a bit arbitrary, but represents a
- * reasonable estimate of how many txgs it will take to finish fully
- * importing a pool
+ * Number of TXGs to wait after importing before starting background
+ * work (async destroys, scan/scrub/resilver operations). This allows
+ * the import command and filesystem mounts to complete quickly without
+ * being delayed by background activities. The value is somewhat arbitrary
+ * since userspace triggers filesystem mounts asynchronously, but 5 TXGs
+ * provides a reasonable window for import completion in most cases.
  */
-#define SCAN_IMPORT_WAIT_TXGS 5
+static uint_t zfs_import_defer_txgs = 5;

 #define DSL_SCAN_IS_SCRUB_RESILVER(scn) \
 	((scn)->scn_phys.scn_func == POOL_SCAN_SCRUB || \
@@ -4394,6 +4392,14 @@ dsl_scan_sync(dsl_pool_t *dp, dmu_tx_t *tx)
 	if (spa_shutting_down(spa))
 		return;

+	/*
+	 * Wait a few txgs after importing before doing background work
+	 * (async destroys and scanning). This should help the import
+	 * command to complete quickly.
+	 */
+	if (spa->spa_syncing_txg < spa->spa_first_txg + zfs_import_defer_txgs)
+		return;
+
 	/*
 	 * If the scan is inactive due to a stalled async destroy, try again.
 	 */
@@ -4430,13 +4436,6 @@ dsl_scan_sync(dsl_pool_t *dp, dmu_tx_t *tx)
 	if (!dsl_scan_is_running(scn) || dsl_scan_is_paused_scrub(scn))
 		return;

-	/*
-	 * Wait a few txgs after importing to begin scanning so that
-	 * we can get the pool imported quickly.
-	 */
-	if (spa->spa_syncing_txg < spa->spa_first_txg + SCAN_IMPORT_WAIT_TXGS)
-		return;
-
 	/*
 	 * zfs_scan_suspend_progress can be set to disable scan progress.
 	 * We don't want to spin the txg_sync thread, so we add a delay
@@ -5336,6 +5335,9 @@ ZFS_MODULE_PARAM(zfs, zfs_, scan_issue_strategy, UINT, ZMOD_RW,
 ZFS_MODULE_PARAM(zfs, zfs_, scan_legacy, INT, ZMOD_RW,
 	"Scrub using legacy non-sequential method");

+ZFS_MODULE_PARAM(zfs, zfs_, import_defer_txgs, UINT, ZMOD_RW,
+	"Number of TXGs to defer background work after pool import");
+
 ZFS_MODULE_PARAM(zfs, zfs_, scan_checkpoint_intval, UINT, ZMOD_RW,
 	"Scan progress on-disk checkpointing interval");