mirror_zfs/module/zfs
Paul Dagnelie f09fda5071 Cap metaslab memory usage
On systems with large amounts of storage and high fragmentation, a huge 
amount of space can be used by storing metaslab range trees. Since 
metaslabs are only unloaded during a txg sync, and only if they have 
been inactive for 8 txgs, it is possible to get into a state where all 
of the system's memory is consumed by range trees and metaslabs, and 
txgs cannot sync. While ZFS knows how to evict ARC data when needed, 
it has no such mechanism for range tree data. This can result in boot 
hangs for some system configurations.

First, we add the ability to unload metaslabs outside of syncing 
context. Second, we store a multilist of all loaded metaslabs, sorted 
by their selection txg, so we can quickly identify the oldest 
metaslabs. We use a multilist to reduce lock contention during heavy 
write workloads. Finally, we add logic that will unload a metaslab 
when we're loading a new metaslab, if we're using more than a certain 
fraction of the available memory on range trees.
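
The eviction policy described above can be illustrated with a minimal,
standalone sketch. This is not the code from this commit: every name
here (ms_t, ms_cache_t, cache_init, cache_evict_oldest, cache_load) is
hypothetical, a flat array with a linear scan stands in for the
multilist, there is no locking, and the choice to keep evicting until
usage drops back under the limit is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_LOADED 64	/* arbitrary capacity for this sketch */

/* Hypothetical stand-in for a loaded metaslab. */
typedef struct {
	int		id;
	uint64_t	selected_txg;	/* txg in which it was last selected */
	size_t		tree_bytes;	/* memory held by its range trees */
} ms_t;

/* Hypothetical bookkeeping for all loaded metaslabs. */
typedef struct {
	ms_t	loaded[MAX_LOADED];
	size_t	count;
	size_t	used_bytes;	/* sum of tree_bytes over loaded[] */
	size_t	limit_bytes;	/* allowed fraction of total memory */
} ms_cache_t;

static void
cache_init(ms_cache_t *c, size_t total_mem, unsigned limit_pct)
{
	assert(c != NULL && limit_pct <= 100);
	c->count = 0;
	c->used_bytes = 0;
	c->limit_bytes = total_mem / 100 * limit_pct;
}

/*
 * Unload the metaslab with the smallest selection txg.
 * Returns its id, or -1 if nothing is loaded.
 */
static int
cache_evict_oldest(ms_cache_t *c)
{
	assert(c != NULL);
	if (c->count == 0)
		return (-1);

	size_t oldest = 0;
	for (size_t i = 1; i < c->count; i++) {
		if (c->loaded[i].selected_txg <
		    c->loaded[oldest].selected_txg)
			oldest = i;
	}

	int id = c->loaded[oldest].id;
	c->used_bytes -= c->loaded[oldest].tree_bytes;
	/* Fill the hole with the last entry; order is not preserved. */
	c->loaded[oldest] = c->loaded[--c->count];
	return (id);
}

/*
 * Load a metaslab. Before loading, evict the oldest metaslabs while
 * range tree memory is over the limit. Returns the number of
 * metaslabs evicted, or -1 if the table is full.
 */
static int
cache_load(ms_cache_t *c, int id, uint64_t txg, size_t bytes)
{
	assert(c != NULL);
	int evicted = 0;

	while (c->count > 0 && c->used_bytes > c->limit_bytes) {
		(void) cache_evict_oldest(c);
		evicted++;
	}
	if (c->count == MAX_LOADED)
		return (-1);

	c->loaded[c->count++] = (ms_t){ id, txg, bytes };
	c->used_bytes += bytes;
	return (evicted);
}
```

In this sketch the linear scan makes each eviction O(n) in the number
of loaded metaslabs; keeping the entries ordered by selection txg, as
the commit message describes, avoids that scan.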

Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: George Wilson <gwilson@delphix.com>
Reviewed-by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #9128
2019-08-16 09:08:21 -06:00
abd.c single-chunk scatter ABDs can be treated as linear 2019-06-11 09:02:31 -07:00
aggsum.c OpenZFS 9688 - aggsum_fini leaks memory 2018-10-19 12:08:03 -07:00
arc.c Cap metaslab memory usage 2019-08-16 09:08:21 -06:00
blkptr.c Undo c89 workarounds to match with upstream 2017-11-04 13:25:13 -07:00
bplist.c Fast Clone Deletion 2019-07-26 10:54:14 -07:00
bpobj.c Fast Clone Deletion 2019-07-26 10:54:14 -07:00
bptree.c Implement Redacted Send/Receive 2019-06-19 09:48:12 -07:00
bqueue.c Implement Redacted Send/Receive 2019-06-19 09:48:12 -07:00
cityhash.c OpenZFS 8484 - Implement aggregate sum and use for arc counters 2018-06-06 09:35:59 -07:00
dataset_kstats.c port async unlinked drain from illumos-nexenta 2019-02-12 10:41:15 -08:00
dbuf_stats.c Prefix all refcount functions with zfs_ 2018-10-01 10:42:05 -07:00
dbuf.c dmu_tx_wait() hang likely due to cv_signal() in dsl_pool_dirty_delta() 2019-08-15 17:53:53 -06:00
ddt_zap.c fat zap should prefetch when iterating 2019-06-12 13:13:09 -07:00
ddt.c Remove dedupditto functionality 2019-06-19 14:54:02 -07:00
dmu_diff.c Implement Redacted Send/Receive 2019-06-19 09:48:12 -07:00
dmu_object.c Fix send/recv lost spill block 2019-05-07 15:18:44 -07:00
dmu_objset.c dmu_tx_wait() hang likely due to cv_signal() in dsl_pool_dirty_delta() 2019-08-15 17:53:53 -06:00
dmu_recv.c Allow unencrypted children of encrypted datasets 2019-06-20 12:29:51 -07:00
dmu_redact.c Implement Redacted Send/Receive 2019-06-19 09:48:12 -07:00
dmu_send.c Implement Redacted Send/Receive 2019-06-19 09:48:12 -07:00
dmu_traverse.c Implement Redacted Send/Receive 2019-06-19 09:48:12 -07:00
dmu_tx.c Improve performance by using dmu_tx_hold_*_by_dnode() 2019-07-30 09:18:30 -07:00
dmu_zfetch.c Replace zf_rwlock with a mutex 2019-07-25 11:57:58 -07:00
dmu.c dmu_tx_wait() hang likely due to cv_signal() in dsl_pool_dirty_delta() 2019-08-15 17:53:53 -06:00
dnode_sync.c Decrease contention on dn_struct_rwlock 2019-07-08 13:18:50 -07:00
dnode.c Assert that a dnode's bonuslen never exceeds its recorded size 2019-08-15 08:44:57 -06:00
dsl_bookmark.c Implement Redacted Send/Receive 2019-06-19 09:48:12 -07:00
dsl_crypt.c Remove VERIFY from dsl_dataset_crypt_stats() 2019-07-05 16:53:14 -07:00
dsl_dataset.c Mark dsl_livelist_should_disable() static 2019-08-13 21:16:23 -06:00
dsl_deadlist.c Fast Clone Deletion 2019-07-26 10:54:14 -07:00
dsl_deleg.c Update build system and packaging 2018-05-29 16:00:33 -07:00
dsl_destroy.c Fast Clone Deletion 2019-07-26 10:54:14 -07:00
dsl_dir.c Fast Clone Deletion 2019-07-26 10:54:14 -07:00
dsl_pool.c dmu_tx_wait() hang likely due to cv_signal() in dsl_pool_dirty_delta() 2019-08-15 17:53:53 -06:00
dsl_prop.c Update build system and packaging 2018-05-29 16:00:33 -07:00
dsl_scan.c Fast Clone Deletion 2019-07-26 10:54:14 -07:00
dsl_synctask.c OpenZFS 9425 - channel programs can be interrupted 2019-06-22 16:51:46 -07:00
dsl_userhold.c zfs should optionally send holds 2019-02-15 12:41:38 -08:00
edonr_zfs.c DLPX-44812 integrate EP-220 large memory scalability 2016-11-29 14:34:27 -08:00
fm.c Don't wakeup unnecessarily in 'zpool events -f' 2019-08-05 11:35:47 -07:00
gzip.c Update build system and packaging 2018-05-29 16:00:33 -07:00
hkdf.c Encryption patch follow-up 2017-10-11 16:54:48 -04:00
lz4.c Reword comment in lz4_compress_zfs 2019-05-02 16:46:04 -07:00
lzjb.c Change KM_PUSHPAGE -> KM_SLEEP 2015-01-16 14:41:26 -08:00
Makefile.in Log Spacemap Project 2019-07-16 10:11:49 -07:00
metaslab.c Cap metaslab memory usage 2019-08-16 09:08:21 -06:00
mmp.c MMP interval and fail_intervals in uberblock 2019-03-21 12:47:57 -07:00
multilist.c Avoid extra taskq_dispatch() calls by DMU 2019-06-25 12:03:38 -07:00
objlist.c Implement Redacted Send/Receive 2019-06-19 09:48:12 -07:00
pathname.c Disable unused pathname::pn_path* (unneeded in Linux) 2019-07-15 13:57:56 -07:00
policy.c Implement secpolicy_vnode_setid_retain() 2019-07-26 13:52:30 -07:00
qat_compress.c Code improvement and bug fixes for QAT support 2019-04-16 12:38:36 -07:00
qat_crypt.c Code improvement and bug fixes for QAT support 2019-04-16 12:38:36 -07:00
qat.c Code improvement and bug fixes for QAT support 2019-04-16 12:38:36 -07:00
qat.h Code improvement and bug fixes for QAT support 2019-04-16 12:38:36 -07:00
range_tree.c Metaslab max_size should be persisted while unloaded 2019-08-05 14:34:27 -07:00
refcount.c Prevent race in blkptr_verify against device removal 2019-08-13 21:24:43 -06:00
rrwlock.c 8659 static dtrace probes unavailable on non-GPL modules 2019-07-08 11:20:53 -07:00
sa.c Improve performance by using dmu_tx_hold_*_by_dnode() 2019-07-30 09:18:30 -07:00
sha256.c SHA256 QAT acceleration 2018-03-15 10:53:58 -07:00
skein_zfs.c DLPX-44812 integrate EP-220 large memory scalability 2016-11-29 14:34:27 -08:00
spa_boot.c Add linux kernel module support 2010-08-31 13:41:58 -07:00
spa_checkpoint.c Get rid of space_map_update() for ms_synced_length 2019-02-12 10:38:11 -08:00
spa_config.c Remove vn_set_fs_pwd()/vn_set_pwd() (no need to be at / during insmod) 2019-05-29 16:18:14 -07:00
spa_errlog.c Update build system and packaging 2018-05-29 16:00:33 -07:00
spa_history.c Fast Clone Deletion 2019-07-26 10:54:14 -07:00
spa_log_spacemap.c Cap metaslab memory usage 2019-08-16 09:08:21 -06:00
spa_misc.c Prevent race in blkptr_verify against device removal 2019-08-13 21:24:43 -06:00
spa_stats.c Restrict kstats and print real pointers 2019-04-04 18:57:06 -07:00
spa.c Cap metaslab memory usage 2019-08-16 09:08:21 -06:00
space_map.c Log Spacemap Project 2019-07-16 10:11:49 -07:00
space_reftree.c OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
THIRDPARTYLICENSE.cityhash OpenZFS 8484 - Implement aggregate sum and use for arc counters 2018-06-06 09:35:59 -07:00
THIRDPARTYLICENSE.cityhash.descrip OpenZFS 8484 - Implement aggregate sum and use for arc counters 2018-06-06 09:35:59 -07:00
trace.c 8659 static dtrace probes unavailable on non-GPL modules 2019-07-08 11:20:53 -07:00
txg.c Log Spacemap Project 2019-07-16 10:11:49 -07:00
uberblock.c MMP interval and fail_intervals in uberblock 2019-03-21 12:47:57 -07:00
unique.c Performance optimization of AVL tree comparator functions 2016-08-31 14:35:34 -07:00
vdev_cache.c Update build system and packaging 2018-05-29 16:00:33 -07:00
vdev_disk.c Revert "Fail early on bio corruption confirmed on 5.2-rc1" 2019-07-05 20:38:56 -07:00
vdev_file.c Update vdev_ops_t from illumos 2019-06-20 18:29:02 -07:00
vdev_indirect_births.c Fixes: #8934 Large kmem_alloc 2019-07-10 15:54:49 -07:00
vdev_indirect_mapping.c Get rid of space_map_update() for ms_synced_length 2019-02-12 10:38:11 -08:00
vdev_indirect.c Log Spacemap Project 2019-07-16 10:11:49 -07:00
vdev_initialize.c Cap metaslab memory usage 2019-08-16 09:08:21 -06:00
vdev_label.c panic in removal_remap test on 4K devices 2019-06-13 13:12:39 -07:00
vdev_mirror.c Update vdev_ops_t from illumos 2019-06-20 18:29:02 -07:00
vdev_missing.c Update vdev_ops_t from illumos 2019-06-20 18:29:02 -07:00
vdev_queue.c Move write aggregation memory copy out of vq_lock 2019-06-13 13:08:24 -07:00
vdev_raidz_math_aarch64_neon_common.h Linux 5.0 compat: ASM_BUG macro 2019-05-08 10:18:40 -07:00
vdev_raidz_math_aarch64_neon.c Linux 5.0 compat: SIMD compatibility 2019-07-12 09:31:20 -07:00
vdev_raidz_math_aarch64_neonx2.c Linux 5.0 compat: SIMD compatibility 2019-07-12 09:31:20 -07:00
vdev_raidz_math_avx2.c Linux 5.0 compat: SIMD compatibility 2019-07-12 09:31:20 -07:00
vdev_raidz_math_avx512bw.c Linux 5.0 compat: SIMD compatibility 2019-07-12 09:31:20 -07:00
vdev_raidz_math_avx512f.c Linux 5.0 compat: SIMD compatibility 2019-07-12 09:31:20 -07:00
vdev_raidz_math_impl.h codebase style improvements for OpenZFS 6459 port 2017-01-22 13:25:40 -08:00
vdev_raidz_math_scalar.c ABD Vectorized raidz 2016-11-29 14:34:33 -08:00
vdev_raidz_math_sse2.c Linux 5.0 compat: SIMD compatibility 2019-07-12 09:31:20 -07:00
vdev_raidz_math_ssse3.c Linux 5.0 compat: SIMD compatibility 2019-07-12 09:31:20 -07:00
vdev_raidz_math.c Linux 5.0 compat: SIMD compatibility 2019-07-12 09:31:20 -07:00
vdev_raidz.c Update vdev_ops_t from illumos 2019-06-20 18:29:02 -07:00
vdev_removal.c Log Spacemap Project 2019-07-16 10:11:49 -07:00
vdev_root.c Update vdev_ops_t from illumos 2019-06-20 18:29:02 -07:00
vdev_trim.c Cap metaslab memory usage 2019-08-16 09:08:21 -06:00
vdev.c Cap metaslab memory usage 2019-08-16 09:08:21 -06:00
zap_leaf.c Off-by-one in zap_leaf_array_create() 2019-01-18 09:58:46 -08:00
zap_micro.c fat zap should prefetch when iterating 2019-06-12 13:13:09 -07:00
zap.c fat zap should prefetch when iterating 2019-06-12 13:13:09 -07:00
zcp_get.c Fix get_special_prop() build failure 2019-07-16 14:14:12 -07:00
zcp_global.c OpenZFS 8600 - ZFS channel programs - snapshot 2018-02-08 15:29:24 -08:00
zcp_iter.c Introduce getting holds and listing bookmarks through ZCP 2019-08-12 10:02:34 -07:00
zcp_synctask.c OpenZFS 9166 - zfs storage pool checkpoint 2018-06-26 10:07:42 -07:00
zcp.c OpenZFS 9425 - channel programs can be interrupted 2019-06-22 16:51:46 -07:00
zfeature.c Consistently captialize GUID for features 2019-04-16 10:01:51 -07:00
zfs_acl.c Update build system and packaging 2018-05-29 16:00:33 -07:00
zfs_byteswap.c Update build system and packaging 2018-05-29 16:00:33 -07:00
zfs_ctldir.c Change boolean-like uint8_t fields in znode_t to boolean_t 2019-08-13 07:58:02 -06:00
zfs_debug.c Restrict kstats and print real pointers 2019-04-04 18:57:06 -07:00
zfs_dir.c port async unlinked drain from illumos-nexenta 2019-02-12 10:41:15 -08:00
zfs_fm.c Add zpool status -s (slow I/Os) and -p (parseable) 2018-11-08 16:47:24 -08:00
zfs_fuid.c Update build system and packaging 2018-05-29 16:00:33 -07:00
zfs_ioctl.c Don't directly cast unsigned long to void* 2019-07-25 11:59:20 -07:00
zfs_log.c Improve write performance by using dmu_read_by_dnode() 2019-08-15 17:36:24 -06:00
zfs_onexit.c Update build system and packaging 2018-05-29 16:00:33 -07:00
zfs_ratelimit.c Change checksum & IO delay ratelimit values 2018-03-04 17:34:51 -08:00
zfs_replay.c Use SEEK_{SET,CUR,END} for file seek "whence" 2019-04-25 10:17:27 -07:00
zfs_rlock.c OpenZFS 9689 - zfs range lock code should not be zpl-specific 2018-10-11 10:19:33 -07:00
zfs_sa.c Project Quota on ZFS 2018-02-13 14:54:54 -08:00
zfs_sysfs.c Prevent pointer to an out-of-scope local variable 2019-06-20 18:31:52 -07:00
zfs_vfsops.c Make txg_wait_synced conditional in zfsvfs_teardown 2019-08-15 08:27:13 -06:00
zfs_vnops.c Fix out-of-order ZIL txtype lost on hardlinked files 2019-08-13 21:21:27 -06:00
zfs_znode.c Change boolean-like uint8_t fields in znode_t to boolean_t 2019-08-13 07:58:02 -06:00
zil.c Fix out-of-order ZIL txtype lost on hardlinked files 2019-08-13 21:21:27 -06:00
zio_checksum.c Undo c89 workarounds to match with upstream 2017-11-04 13:25:13 -07:00
zio_compress.c OpenZFS 9403 - assertion failed in arc_buf_destroy() 2018-08-29 11:33:33 -07:00
zio_crypt.c Always call rw_init in zio_crypt_key_unwrap 2019-04-10 15:39:40 -07:00
zio_inject.c Multiple DVA Scrubbing Fix 2019-03-15 14:14:31 -07:00
zio.c Prevent race in blkptr_verify against device removal 2019-08-13 21:24:43 -06:00
zle.c Fix zle_decompress out of bound access 2018-02-09 10:08:05 -08:00
zpl_ctldir.c RHEL 7.5 compat: FMODE_KABI_ITERATE 2018-05-02 15:01:24 -07:00
zpl_export.c Use cstyle -cpP in make cstyle check 2016-12-12 10:46:26 -08:00
zpl_file.c Fix errant EFAULT during writes (#8719) 2019-05-08 10:04:04 -07:00
zpl_inode.c Fix errant EFAULT during writes (#8719) 2019-05-08 10:04:04 -07:00
zpl_super.c Fix statfs(2) for 32-bit user space 2018-09-24 17:11:25 -07:00
zpl_xattr.c Drop redundant POSIX ACL check in zpl_init_acl() 2019-07-15 16:26:52 -07:00
zrlock.c Update build system and packaging 2018-05-29 16:00:33 -07:00
zthr.c Fast Clone Deletion 2019-07-26 10:54:14 -07:00
zvol.c Add SCSI_PASSTHROUGH to zvols to enable UNMAP support 2019-06-21 09:40:56 -07:00