Improve async destroy processing timing

The previous code effectively enforced only that all async free ZIOs
were _issued_ within the TXG timeout.  They could still take
arbitrarily long to complete, especially if the required metadata
was not in ARC.

This patch introduces periodic waits every 2000 ZIOs, which should
give at least somewhat reasonable TXG timings even for single-HDD
pools with an empty ARC.  It also makes the frees complete within
half of the TXG timeout, since we might still need time to sync DDT
and BRT.

While there, change zfs_max_async_dedup_frees semantics to also
include clone and gang blocks, which are similarly expensive to
free.  Bump the default value, set long ago, to be more forgiving
to block cloning (which still has no log and benefits from large
TXGs), now that we have better working time limits.  The limit is
now the possible amount of dirty data produced by BRT updates.
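
For illustration only, the pacing described above can be sketched as a
small stand-alone simulation.  This is not the code from the patch:
simulate_txg(), free_stats_t and the millisecond cost model are
hypothetical, and only the two defaults (2000 and 250000) come from
the man page change.  The budget argument stands for half of the TXG
timeout, e.g. 2500 ms assuming a 5 s timeout.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the tunables; defaults from the man page. */
static uint64_t wait_interval = 2000;		/* zfs_async_free_zio_wait_interval */
static uint64_t max_expensive_frees = 250000;	/* zfs_max_async_dedup_frees */

typedef struct free_stats {
	uint64_t issued;	/* expensive frees issued in this TXG */
	uint64_t waits;		/* periodic waits performed */
} free_stats_t;

/*
 * Simulate async frees for one TXG (illustrative model, not ZFS code).
 * pending:      dedup/clone/gang blocks waiting to be freed
 * budget_ms:    time allowed for frees (half of the TXG timeout)
 * wait_cost_ms: assumed time each periodic wait takes
 */
static free_stats_t
simulate_txg(uint64_t pending, uint64_t budget_ms, uint64_t wait_cost_ms)
{
	free_stats_t st = { 0, 0 };
	uint64_t elapsed_ms = 0;

	while (st.issued < pending && st.issued < max_expensive_frees) {
		st.issued++;			/* issue one free ZIO */
		if (st.issued % wait_interval != 0)
			continue;
		/* Every wait_interval frees, wait for pending I/O to finish. */
		elapsed_ms += wait_cost_ms;
		st.waits++;
		if (elapsed_ms >= budget_ms)
			break;			/* leave the rest for the next TXG */
	}
	return (st);
}
```

In this model, 10000 pending frees with a 2500 ms budget and an assumed
1000 ms per wait stop after 6000 frees and 3 waits; the remainder would
be left for later TXGs.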

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #18043
Alexander Motin
2025-12-11 21:46:08 -05:00
committed by Tony Hutter
parent 135103a648
commit 2428043709
3 changed files with 47 additions and 15 deletions
+8 -3
@@ -1468,8 +1468,13 @@ Enable/disable the processing of the free_bpobj object.
 .It Sy zfs_async_block_max_blocks Ns = Ns Sy UINT64_MAX Po unlimited Pc Pq u64
 Maximum number of blocks freed in a single TXG.
 .
-.It Sy zfs_max_async_dedup_frees Ns = Ns Sy 100000 Po 10^5 Pc Pq u64
-Maximum number of dedup blocks freed in a single TXG.
+.It Sy zfs_max_async_dedup_frees Ns = Ns Sy 250000 Pq u64
+Maximum number of dedup, clone or gang blocks freed in a single TXG.
+These frees may require additional I/O, making them more expensive.
 .
+.It Sy zfs_async_free_zio_wait_interval Ns = Ns Sy 2000 Pq u64
+After freeing this many dedup, clone or gang blocks wait for all pending
+I/Os to complete before continuing.
+.
 .It Sy zfs_vdev_async_read_max_active Ns = Ns Sy 3 Pq uint
 Maximum asynchronous read I/O operations active to each device.
@@ -1739,7 +1744,7 @@ but we chose the more conservative approach of not setting it,
 so that there is no possibility of
 leaking space in the "partial temporary" failure case.
 .
-.It Sy zfs_free_min_time_ms Ns = Ns Sy 1000 Ns ms Po 1s Pc Pq uint
+.It Sy zfs_free_min_time_ms Ns = Ns Sy 500 Ns ms Po 1s Pc Pq uint
 During a
 .Nm zfs Cm destroy
 operation using the