Expand fragmentation table to reflect larger possibile allocation sizes

When you are using large recordsizes in conjunction with raidz, with
incompressible data, you can pretty reliably be making 21 MB
allocations. Unfortunately, the fragmentation metric in ZFS considers
any metaslabs with 16 MB free chunks completely unfragmented, so you can
have a metaslab report 0% fragmented and be unable to satisfy an
allocation. When using the segment-based metaslab weight, this is
inconvenient; when using the space-based one, it can seriously degrade
performance.

We expand the fragmentation table to extend up to 512MB, and redefine
the table size based on the actual table, rather than having a static
define. We also tweak the one variable that depends on fragmentation
directly.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Closes #16986
This commit is contained in:
Paul Dagnelie
2025-02-06 12:40:01 -08:00
committed by GitHub
parent 2cccbacefc
commit 40496514b8
2 changed files with 32 additions and 27 deletions
+1 -1
View File
@@ -1778,7 +1778,7 @@ Normally disabled because these datasets may be missing key data.
.It Sy zfs_min_metaslabs_to_flush Ns = Ns Sy 1 Pq u64
Minimum number of metaslabs to flush per dirty TXG.
.
.It Sy zfs_metaslab_fragmentation_threshold Ns = Ns Sy 70 Ns % Pq uint
.It Sy zfs_metaslab_fragmentation_threshold Ns = Ns Sy 77 Ns % Pq uint
Allow metaslabs to keep their active state as long as their fragmentation
percentage is no more than this value.
An active metaslab that exceeds this threshold