mirror_zfs/module/zfs
Alexander Motin a3a4b8def7 Reduce latency effects of non-interactive I/O
Investigating influence of scrub (especially sequential) on random read
latency I've noticed that on some HDDs single 4KB read may take up to 4
seconds!  Deeper investigation shown that many HDDs heavily prioritize
sequential reads even when those are submitted with queue depth of 1.

This patch addresses the latency from two sides:
 - by using _min_active queue depths for non-interactive requests while
   the interactive request(s) are active and few requests after;
 - by throttling it further if no interactive requests has completed
   while configured amount of non-interactive did.

While there, I've also modified vdev_queue_class_to_issue() to give
more chances to schedule at least _min_active requests to the lowest
priorities.  It should reduce starvation if several non-interactive
processes are running same time with some interactive and I think should
make possible setting of zfs_vdev_max_active to as low as 1.

I've benchmarked this change with 4KB random reads from ZVOL with 16KB
block size on newly written non-fragmented pool.  On fragmented pool I
also saw improvements, but not so dramatic.  Below are log2 histograms
of the random read latency in milliseconds for different devices:

4 2x mirror vdevs of SATA HDD WDC WD20EFRX-68EUZN0 before:
0, 0, 2,  1,  12,  21,  19,  18, 10, 15, 17, 21
after:
0, 0, 0, 24, 101, 195, 419, 250, 47,  4,  0,  0
, that means maximum latency reduction from 2s to 500ms.

4 2x mirror vdevs of SATA HDD WDC WD80EFZX-68UW8N0 before:
0, 0,  2,  31,  38,  28,  18,  12, 17, 20, 24, 10, 3
after:
0, 0, 55, 247, 455, 470, 412, 181, 36,  0,  0,  0, 0
, i.e. from 4s to 250ms.

1 SAS HDD SEAGATE ST14000NM0048 before:
0,  0,  29,   70, 107,   45,  27, 1, 0, 0, 1, 4, 19
after:
1, 29, 681, 1261, 676, 1633,  67, 1, 0, 0, 0, 0,  0
, i.e. from 4s to 125ms.

1 SAS SSD SEAGATE XS3840TE70014 before (microseconds):
0, 0, 0, 0, 0, 0, 0, 0,  70, 18343, 82548, 618
after:
0, 0, 0, 0, 0, 0, 0, 0, 283, 92351, 34844,  90

I've also measured scrub time during the test and on idle pools.  On
idle fragmented pool I've measured scrub getting few percent faster
due to use of QD3 instead of QD2 before.  On idle non-fragmented pool
I've measured no difference.  On busy non-fragmented pool I've measured
scrub time increase about 1.5-1.7x, while IOPS increase reached 5-9x.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #11166
2020-11-25 08:45:38 -08:00
..
abd.c Add gang ABD child to parent gang ABD 2020-07-24 21:09:20 -07:00
aggsum.c Reduce number of atomic_add() calls in aggsum 2020-02-06 13:21:06 -08:00
arc.c Fix ASSERT logic in l2arc_evict() 2020-11-17 12:19:46 -08:00
blkptr.c Add zstd support to zfs 2020-08-20 10:30:06 -07:00
bplist.c Fast Clone Deletion 2019-07-26 10:54:14 -07:00
bpobj.c Rename refcount.h to zfs_refcount.h 2020-07-29 16:35:33 -07:00
bptree.c Rename refcount.h to zfs_refcount.h 2020-07-29 16:35:33 -07:00
bqueue.c Implement Redacted Send/Receive 2019-06-19 09:48:12 -07:00
btree.c Fix typo in btree.c 2020-08-17 15:25:37 -07:00
dataset_kstats.c Fix panic on DilOS with kstat per dataset statistics 2019-09-03 12:12:31 -07:00
dbuf_stats.c Eliminate gratuitous bzeroing in dbuf_stats_hash_table_data 2020-10-01 12:22:54 -07:00
dbuf.c Replace cv_{timed}wait_sig with cv_{timed}wait_idle where appropriate 2020-09-09 10:21:01 -07:00
ddt_zap.c Refactor dnode dirty context from dbuf_dirty 2020-02-26 16:09:17 -08:00
ddt.c Remove dead code 2020-06-18 12:21:18 -07:00
dmu_diff.c Mark write_record static 2019-12-03 09:51:44 -08:00
dmu_object.c Add include files for prototypes 2020-06-18 12:21:25 -07:00
dmu_objset.c Fix 'zfs userspace' for received datasets in encrypted root 2020-11-17 12:19:51 -08:00
dmu_recv.c Add zstd support to zfs 2020-08-20 10:30:06 -07:00
dmu_redact.c Fix dnode refcount tracking 2020-11-11 11:03:24 -08:00
dmu_send.c Drop references when skipping dmu_send due to EXDEV 2020-10-01 12:22:36 -07:00
dmu_traverse.c zil_parse: make callback parameters const 2020-10-16 13:01:53 -07:00
dmu_tx.c Refactor dnode dirty context from dbuf_dirty 2020-02-26 16:09:17 -08:00
dmu_zfetch.c Expose zfetch_max_idistance tunable 2020-10-16 13:02:39 -07:00
dmu.c Export dmu_offset_next() symbol 2020-08-25 08:34:41 -07:00
dnode_sync.c dnode_sync is careless with range tree 2020-08-27 16:07:05 -07:00
dnode.c Add DB_RF_NOPREFETCH to dbuf_read()s in dnode.c 2020-10-01 12:21:09 -07:00
dsl_bookmark.c Fix typos 2020-06-09 21:24:09 -07:00
dsl_crypt.c Prune dead branch reported by Coverity 2020-10-01 12:20:16 -07:00
dsl_dataset.c Add zstd support to zfs 2020-08-20 10:30:06 -07:00
dsl_deadlist.c Fix i/o error handling of livelists and zap iteration 2020-08-05 10:22:09 -07:00
dsl_deleg.c Reduce loaded range tree memory usage 2019-10-09 10:36:03 -07:00
dsl_destroy.c Fix i/o error handling of livelists and zap iteration 2020-08-05 10:22:09 -07:00
dsl_dir.c Add 'zfs rename -u' to rename without remounting 2020-09-03 16:16:15 -07:00
dsl_pool.c Mark functions as static 2020-06-18 12:20:38 -07:00
dsl_prop.c Replace sprintf()->snprintf() and strcpy()->strlcpy() 2020-06-07 11:42:12 -07:00
dsl_scan.c Update references to nonexistent man pages in code 2020-10-30 16:04:41 -07:00
dsl_synctask.c nowait synctask must succeed 2020-09-09 10:25:59 -07:00
dsl_userhold.c Replace sprintf()->snprintf() and strcpy()->strlcpy() 2020-06-07 11:42:12 -07:00
edonr_zfs.c Add include files for prototypes 2020-06-18 12:21:25 -07:00
fm.c Avoid posting duplicate zpool events 2020-09-09 10:26:00 -07:00
gzip.c Add include files for prototypes 2020-06-18 12:21:25 -07:00
hkdf.c Encryption patch follow-up 2017-10-11 16:54:48 -04:00
lz4.c Prefix zfs internal endian checks with _ZFS 2020-07-28 13:02:49 -07:00
lzjb.c Add include files for prototypes 2020-06-18 12:21:25 -07:00
Makefile.in Move spa_stats.c to common code 2020-08-30 14:19:08 -07:00
metaslab.c Update references to nonexistent man pages in code 2020-10-30 16:04:41 -07:00
mmp.c Initialize mmp_last_write when the mmp thread starts 2020-09-09 10:26:04 -07:00
multilist.c Make use of ZFS_DEBUG consistent within kmod sources 2020-07-25 20:07:44 -07:00
objlist.c Implement Redacted Send/Receive 2019-06-19 09:48:12 -07:00
pathname.c Replace ZFS on Linux references with OpenZFS 2020-10-16 13:01:24 -07:00
range_tree.c Fix incorrect deletion order in range_tree_add_impl gap case 2020-10-16 13:05:23 -07:00
refcount.c Rename refcount.h to zfs_refcount.h 2020-07-29 16:35:33 -07:00
rrwlock.c Rename refcount.h to zfs_refcount.h 2020-07-29 16:35:33 -07:00
sa.c Remove duplicate dnode.h include 2020-08-27 16:06:52 -07:00
sha256.c Add include files for prototypes 2020-06-18 12:21:25 -07:00
skein_zfs.c Add include files for prototypes 2020-06-18 12:21:25 -07:00
spa_boot.c Add include files for prototypes 2020-06-18 12:21:25 -07:00
spa_checkpoint.c Refactor dnode dirty context from dbuf_dirty 2020-02-26 16:09:17 -08:00
spa_config.c vdev_ashift should only be set once 2020-09-18 12:40:20 -07:00
spa_errlog.c Fix typos in module/zfs/ 2019-09-02 17:56:41 -07:00
spa_history.c Update references to nonexistent man pages in code 2020-10-30 16:04:41 -07:00
spa_log_spacemap.c Make module tunables cross platform 2019-09-05 14:49:49 -07:00
spa_misc.c Update references to nonexistent man pages in code 2020-10-30 16:04:41 -07:00
spa_stats.c FreeBSD: Add support for procfs_list 2020-10-01 12:18:56 -07:00
spa.c Replace ZFS on Linux references with OpenZFS 2020-10-16 13:01:24 -07:00
space_map.c Rename refcount.h to zfs_refcount.h 2020-07-29 16:35:33 -07:00
space_reftree.c Reduce loaded range tree memory usage 2019-10-09 10:36:03 -07:00
THIRDPARTYLICENSE.cityhash OpenZFS 8484 - Implement aggregate sum and use for arc counters 2018-06-06 09:35:59 -07:00
THIRDPARTYLICENSE.cityhash.descrip OpenZFS 8484 - Implement aggregate sum and use for arc counters 2018-06-06 09:35:59 -07:00
txg.c Replace cv_{timed}wait_sig with cv_{timed}wait_idle where appropriate 2020-09-09 10:21:01 -07:00
uberblock.c MMP interval and fail_intervals in uberblock 2019-03-21 12:47:57 -07:00
unique.c Reduce loaded range tree memory usage 2019-10-09 10:36:03 -07:00
vdev_cache.c Replace ASSERTV macro with compiler annotation 2019-12-05 12:37:00 -08:00
vdev_indirect_births.c Fixes: #8934 Large kmem_alloc 2019-07-10 15:54:49 -07:00
vdev_indirect_mapping.c Replace ASSERTV macro with compiler annotation 2019-12-05 12:37:00 -08:00
vdev_indirect.c Avoid posting duplicate zpool events 2020-09-09 10:26:00 -07:00
vdev_initialize.c nowait synctask must succeed 2020-09-09 10:25:59 -07:00
vdev_label.c Replace ZFS on Linux references with OpenZFS 2020-10-16 13:01:24 -07:00
vdev_mirror.c vdev_ashift should only be set once 2020-09-18 12:40:20 -07:00
vdev_missing.c Import vdev ashift optimization from FreeBSD 2020-08-21 12:53:17 -07:00
vdev_queue.c Reduce latency effects of non-interactive I/O 2020-11-25 08:45:38 -08:00
vdev_raidz_math_aarch64_neon_common.h FreeBSD: fix the build with Clang 11 2020-08-17 15:40:17 -07:00
vdev_raidz_math_aarch64_neon.c Linux 5.0 compat: SIMD compatibility 2019-07-12 09:31:20 -07:00
vdev_raidz_math_aarch64_neonx2.c Linux 5.0 compat: SIMD compatibility 2019-07-12 09:31:20 -07:00
vdev_raidz_math_avx2.c FreeBSD: fix the build with Clang 11 2020-08-17 15:40:17 -07:00
vdev_raidz_math_avx512bw.c Refactor ccompile.h to not include system headers 2020-07-25 20:09:50 -07:00
vdev_raidz_math_avx512f.c FreeBSD: fix the build with Clang 11 2020-08-17 15:40:17 -07:00
vdev_raidz_math_impl.h Fix const-correctness in raidz math 2020-02-03 10:52:41 -08:00
vdev_raidz_math_powerpc_altivec_common.h FreeBSD: fix the build with Clang 11 2020-08-17 15:40:17 -07:00
vdev_raidz_math_powerpc_altivec.c Prefix zfs internal endian checks with _ZFS 2020-07-28 13:02:49 -07:00
vdev_raidz_math_scalar.c Linux 5.3: Fix switch() fall though compiler errors 2019-08-21 09:29:23 -07:00
vdev_raidz_math_sse2.c FreeBSD: fix the build with Clang 11 2020-08-17 15:40:17 -07:00
vdev_raidz_math_ssse3.c Refactor ccompile.h to not include system headers 2020-07-25 20:09:50 -07:00
vdev_raidz_math.c FreeBSD: disable neon usage 2020-08-27 16:06:35 -07:00
vdev_raidz.c Avoid posting duplicate zpool events 2020-09-09 10:26:00 -07:00
vdev_rebuild.c nowait synctask must succeed 2020-09-09 10:25:59 -07:00
vdev_removal.c Ignore special vdev ashift for spa ashift min/max 2020-10-16 13:05:34 -07:00
vdev_root.c Import vdev ashift optimization from FreeBSD 2020-08-21 12:53:17 -07:00
vdev_trim.c nowait synctask must succeed 2020-09-09 10:25:59 -07:00
vdev.c Correct missing zil_claim() DTL updates 2020-11-22 10:01:43 -08:00
zap_leaf.c Refactor dnode dirty context from dbuf_dirty 2020-02-26 16:09:17 -08:00
zap_micro.c Rename refcount.h to zfs_refcount.h 2020-07-29 16:35:33 -07:00
zap.c Rename refcount.h to zfs_refcount.h 2020-07-29 16:35:33 -07:00
zcp_get.c Add include files for prototypes 2020-06-18 12:21:25 -07:00
zcp_global.c OpenZFS 8600 - ZFS channel programs - snapshot 2018-02-08 15:29:24 -08:00
zcp_iter.c Fix typos in module/zfs/ 2019-09-02 17:56:41 -07:00
zcp_set.c Support setting user properties in a channel program 2020-02-14 13:41:42 -08:00
zcp_synctask.c filesystem_limit/snapshot_limit is incorrectly enforced against root 2020-07-11 17:18:02 -07:00
zcp.c Channel program may spuriously fail with "memory limit exhausted" 2020-11-12 09:02:00 -08:00
zfeature.c Throw const on some strings 2020-10-16 12:55:56 -07:00
zfs_byteswap.c Mark functions as static 2020-06-18 12:20:38 -07:00
zfs_fm.c Avoid posting duplicate zpool events 2020-09-09 10:26:00 -07:00
zfs_fuid.c FreeBSD: Fix UNIX permissions checking 2020-08-18 09:57:07 -07:00
zfs_ioctl.c Fix 'zfs userspace' for received datasets in encrypted root 2020-11-17 12:19:51 -08:00
zfs_log.c Throw const on some strings 2020-10-16 12:55:56 -07:00
zfs_onexit.c Remove deduplicated send/receive code 2020-04-23 10:06:57 -07:00
zfs_quota.c File incorrectly zeroed when receiving incremental stream that toggles -L 2020-06-09 10:41:01 -07:00
zfs_ratelimit.c Change checksum & IO delay ratelimit values 2018-03-04 17:34:51 -08:00
zfs_replay.c Simplify FreeBSD's locking requirements in zfs_replay.c 2020-01-22 17:55:56 -08:00
zfs_rlock.c Add a "try" operation for range locks 2020-07-06 11:53:31 -07:00
zfs_sa.c Add convenience wrappers for common uio usage 2020-06-14 10:09:55 -07:00
zil.c zil_parse: make callback parameters const 2020-10-16 13:01:53 -07:00
zio_checksum.c Mark functions as static 2020-06-18 12:20:38 -07:00
zio_compress.c Avoid symbol collision with in-kernel zstdlib 2020-08-24 12:20:41 -07:00
zio_inject.c Replace ASSERTV macro with compiler annotation 2019-12-05 12:37:00 -08:00
zio.c G/C data_alloc_arena 2020-11-11 18:46:22 -08:00
zle.c Add include files for prototypes 2020-06-18 12:21:25 -07:00
zrlock.c Remove dead code 2020-06-18 12:21:18 -07:00
zthr.c Retain thread name when resuming a zthr 2020-09-09 10:21:16 -07:00
zvol.c Fix problems in zvol_set_volmode_impl 2020-11-17 12:20:09 -08:00