Mirror of https://git.proxmox.com/git/mirror_zfs.git
Synced 2026-05-22 10:37:35 +03:00
Cap metaslab memory usage
On systems with large amounts of storage and high fragmentation, a huge amount of space can be used by storing metaslab range trees. Since metaslabs are only unloaded during a txg sync, and only if they have been inactive for 8 txgs, it is possible to get into a state where all of the system's memory is consumed by range trees and metaslabs, and txgs cannot sync. While ZFS knows how to evict ARC data when needed, it has no such mechanism for range tree data. This can result in boot hangs for some system configurations.

First, we add the ability to unload metaslabs outside of syncing context. Second, we store a multilist of all loaded metaslabs, sorted by their selection txg, so we can quickly identify the oldest metaslabs. We use a multilist to reduce lock contention during heavy write workloads. Finally, we add logic that will unload a metaslab when we're loading a new metaslab, if we're using more than a certain fraction of the available memory on range trees.

Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: George Wilson <gwilson@delphix.com>
Reviewed-by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #9128
Committed by: Brian Behlendorf
Parent: 9323aad14d
Commit: f09fda5071
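
The eviction policy described in the commit message can be illustrated with a minimal sketch. This is not the code from the commit: `ms_sketch_t`, `loaded_bytes_sketch` and `evict_oldest_if_over_sketch` are hypothetical names, and a linear scan over a plain array stands in for the multilist of loaded metaslabs sorted by selection txg. It only shows the idea of unloading the metaslab with the oldest selection txg once range tree memory exceeds a limit.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a metaslab; not the real metaslab_t. */
typedef struct ms_sketch {
	uint64_t selected_txg;     /* txg in which it was last selected */
	size_t   range_tree_bytes; /* memory held by its range trees */
	int      loaded;           /* nonzero while range trees are in memory */
} ms_sketch_t;

/* Sum the range tree memory of all currently loaded metaslabs. */
size_t
loaded_bytes_sketch(const ms_sketch_t *ms, size_t n)
{
	size_t total = 0;

	for (size_t i = 0; i < n; i++) {
		if (ms[i].loaded)
			total += ms[i].range_tree_bytes;
	}
	return (total);
}

/*
 * If loaded range trees use more than limit_bytes, unload the loaded
 * metaslab with the smallest selected_txg (the oldest selection).
 * Returns the index of the unloaded metaslab, or -1 if nothing was
 * unloaded. The real change keeps a sorted multilist instead of
 * scanning, which is an assumption this sketch does not model.
 */
int
evict_oldest_if_over_sketch(ms_sketch_t *ms, size_t n, size_t limit_bytes)
{
	int victim = -1;

	if (loaded_bytes_sketch(ms, n) <= limit_bytes)
		return (-1);

	for (size_t i = 0; i < n; i++) {
		if (!ms[i].loaded)
			continue;
		if (victim < 0 ||
		    ms[i].selected_txg < ms[victim].selected_txg)
			victim = (int)i;
	}
	if (victim >= 0)
		ms[victim].loaded = 0;
	return (victim);
}
```

In this sketch a caller would invoke `evict_oldest_if_over_sketch` just before loading a new metaslab, mirroring the "unload when we're loading" step in the message. Keeping the loaded set ordered by selection txg, as the commit does, avoids the scan used here.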
@@ -386,6 +386,21 @@ considering only the histogram instead.
 Default value: \fB3600 seconds\fR (one hour)
 .RE

+.sp
+.ne 2
+.na
+\fBzfs_metaslab_mem_limit\fR (int)
+.ad
+.RS 12n
+When we are loading a new metaslab, we check the amount of memory being used
+to store metaslab range trees. If it is over a threshold, we attempt to unload
+the least recently used metaslab to prevent the system from clogging all of
+its memory with range trees. This tunable sets the percentage of total system
+memory that is the threshold.
+.sp
+Default value: \fB75 percent\fR
+.RE
+
 .sp
 .ne 2
 .na
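
As a rough illustration of how a percentage tunable such as `zfs_metaslab_mem_limit` could translate into a byte threshold, here is a small sketch. The helper names `mem_limit_bytes_sketch` and `over_limit_sketch` are hypothetical, and how the module actually measures total memory and range tree usage is not shown in this hunk, so both are plain parameters here.

```c
#include <stdint.h>

/*
 * Byte threshold for range tree memory: pct percent of total_mem_bytes.
 * Dividing before multiplying avoids overflow for very large memory
 * sizes, at the cost of rounding down to a multiple of pct.
 */
uint64_t
mem_limit_bytes_sketch(uint64_t total_mem_bytes, unsigned int pct)
{
	return (total_mem_bytes / 100 * pct);
}

/* Nonzero when in_use_bytes is strictly above the threshold. */
int
over_limit_sketch(uint64_t in_use_bytes, uint64_t total_mem_bytes,
    unsigned int pct)
{
	return (in_use_bytes > mem_limit_bytes_sketch(total_mem_bytes, pct));
}
```

With the documented default of 75 and a machine with 16,000,000,000 bytes of memory, this sketch yields a threshold of 12,000,000,000 bytes.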