mirror_zfs/include/sys
Alexander Motin 2080c4f27e Reduce latency effects of non-interactive I/O
Investigating influence of scrub (especially sequential) on random read
latency I've noticed that on some HDDs single 4KB read may take up to 4
seconds!  Deeper investigation shown that many HDDs heavily prioritize
sequential reads even when those are submitted with queue depth of 1.

This patch addresses the latency from two sides:
 - by using _min_active queue depths for non-interactive requests while
   the interactive request(s) are active and few requests after;
 - by throttling it further if no interactive requests has completed
   while configured amount of non-interactive did.

While there, I've also modified vdev_queue_class_to_issue() to give
more chances to schedule at least _min_active requests to the lowest
priorities.  It should reduce starvation if several non-interactive
processes are running same time with some interactive and I think should
make possible setting of zfs_vdev_max_active to as low as 1.

I've benchmarked this change with 4KB random reads from ZVOL with 16KB
block size on newly written non-fragmented pool.  On fragmented pool I
also saw improvements, but not so dramatic.  Below are log2 histograms
of the random read latency in milliseconds for different devices:

4 2x mirror vdevs of SATA HDD WDC WD20EFRX-68EUZN0 before:
0, 0, 2,  1,  12,  21,  19,  18, 10, 15, 17, 21
after:
0, 0, 0, 24, 101, 195, 419, 250, 47,  4,  0,  0
, that means maximum latency reduction from 2s to 500ms.

4 2x mirror vdevs of SATA HDD WDC WD80EFZX-68UW8N0 before:
0, 0,  2,  31,  38,  28,  18,  12, 17, 20, 24, 10, 3
after:
0, 0, 55, 247, 455, 470, 412, 181, 36,  0,  0,  0, 0
, i.e. from 4s to 250ms.

1 SAS HDD SEAGATE ST14000NM0048 before:
0,  0,  29,   70, 107,   45,  27, 1, 0, 0, 1, 4, 19
after:
1, 29, 681, 1261, 676, 1633,  67, 1, 0, 0, 0, 0,  0
, i.e. from 4s to 125ms.

1 SAS SSD SEAGATE XS3840TE70014 before (microseconds):
0, 0, 0, 0, 0, 0, 0, 0,  70, 18343, 82548, 618
after:
0, 0, 0, 0, 0, 0, 0, 0, 283, 92351, 34844,  90

I've also measured scrub time during the test and on idle pools.  On
idle fragmented pool I've measured scrub getting few percent faster
due to use of QD3 instead of QD2 before.  On idle non-fragmented pool
I've measured no difference.  On busy non-fragmented pool I've measured
scrub time increase about 1.5-1.7x, while IOPS increase reached 5-9x.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #11166
2020-12-23 13:09:03 -08:00
..
crypto Avoid installing kernel headers on FreeBSD 2020-06-27 17:40:14 -07:00
fm Avoid posting duplicate zpool events 2020-09-09 10:26:00 -07:00
fs Assertion failure when logging large output of channel program 2020-11-14 10:51:21 -08:00
lua FreeBSD: Reduce stack usage of Lua 2020-10-01 12:16:28 -07:00
sysevent Avoid installing kernel headers on FreeBSD 2020-06-27 17:40:14 -07:00
zstd zstd: track allocator statistics 2020-10-30 16:06:15 -07:00
abd_impl.h Removing ZERO_PAGE abd_alloc_zero_scatter 2020-06-10 17:54:11 -07:00
abd.h Rename refcount.h to zfs_refcount.h 2020-07-29 16:35:33 -07:00
aggsum.h Reduce number of atomic_add() calls in aggsum 2020-02-06 13:21:06 -08:00
arc_impl.h FreeBSD: 11.x arc_stats compatibility 2020-08-20 10:55:02 -07:00
arc.h Add zstd support to zfs 2020-08-20 10:30:06 -07:00
avl_impl.h
avl.h Restore avl_update() calls and related functions 2020-06-03 09:49:32 -07:00
bitops.h Reduce loaded range tree memory usage 2019-10-09 10:36:03 -07:00
blkptr.h
bplist.h Fast Clone Deletion 2019-07-26 10:54:14 -07:00
bpobj.h Fast Clone Deletion 2019-07-26 10:54:14 -07:00
bptree.h
bqueue.h Implement Redacted Send/Receive 2019-06-19 09:48:12 -07:00
btree.h Fix typos 2020-06-09 21:24:09 -07:00
dataset_kstats.h port async unlinked drain from illumos-nexenta 2019-02-12 10:41:15 -08:00
dbuf.h Rename refcount.h to zfs_refcount.h 2020-07-29 16:35:33 -07:00
ddt.h Appease GCC sprintf warnings found on Fedora 32/GCC 10.0.1 2020-08-24 10:32:59 -07:00
dmu_impl.h Prevent race condition in dnode_dest (#10101) 2020-03-12 10:25:56 -07:00
dmu_objset.h Fix 'zfs userspace' for received datasets in encrypted root 2020-11-17 12:19:51 -08:00
dmu_recv.h filesystem_limit/snapshot_limit is incorrectly enforced against root 2020-07-11 17:18:02 -07:00
dmu_redact.h Implement Redacted Send/Receive 2019-06-19 09:48:12 -07:00
dmu_send.h Add 'zfs send --saved' flag 2020-01-10 10:16:58 -08:00
dmu_traverse.h Implement Redacted Send/Receive 2019-06-19 09:48:12 -07:00
dmu_tx.h Rename refcount.h to zfs_refcount.h 2020-07-29 16:35:33 -07:00
dmu_zfetch.h Replace zf_rwlock with a mutex 2019-07-25 11:57:58 -07:00
dmu.h dmu.h: remove stale declaration dmu_objset_snapshot_tmp 2020-10-16 13:03:05 -07:00
dnode.h Rename refcount.h to zfs_refcount.h 2020-07-29 16:35:33 -07:00
dsl_bookmark.h Rename refcount.h to zfs_refcount.h 2020-07-29 16:35:33 -07:00
dsl_crypt.h dmu_objset_from_ds must be called with dp_config_rwlock held 2020-03-12 10:55:02 -07:00
dsl_dataset.h Add zstd support to zfs 2020-08-20 10:30:06 -07:00
dsl_deadlist.h Add fast path for zfs_ioc_space_snaps() handling of empty_bpobj 2019-08-20 11:34:52 -07:00
dsl_deleg.h Remove code for zfs remap 2019-06-24 16:44:01 -07:00
dsl_destroy.h Fast Clone Deletion 2019-07-26 10:54:14 -07:00
dsl_dir.h Rename refcount.h to zfs_refcount.h 2020-07-29 16:35:33 -07:00
dsl_pool.h Eliminate Linux specific inode usage from common code 2019-12-11 11:53:57 -08:00
dsl_prop.h Support inheriting properties in channel programs 2020-01-22 17:03:17 -08:00
dsl_scan.h Add device rebuild feature 2020-07-03 11:05:50 -07:00
dsl_synctask.h nowait synctask must succeed 2020-09-09 10:25:59 -07:00
dsl_userhold.h
edonr.h
efi_partition.h Fix typos in include/ 2019-08-30 09:53:15 -07:00
frame.h Linux 5.10 compat: frame.h renamed objtool.h 2020-11-03 09:51:15 -08:00
hkdf.h
Makefile.am zfs label bootenv should store data as nvlist 2020-09-15 18:36:12 -07:00
metaslab_impl.h Use a struct to organize metaslab-group-allocator fields 2020-04-22 10:26:56 -07:00
metaslab.h Extend zdb to print inconsistencies in livelists and metaslabs 2020-07-14 17:51:05 -07:00
mmp.h Add zfs_multihost_interval tunable handler for FreeBSD 2020-06-23 13:32:42 -07:00
mntent.h Add FreeBSD required defines to mntent.h 2019-11-30 15:49:09 -08:00
mod.h Replace ZFS on Linux references with OpenZFS 2020-10-16 13:01:24 -07:00
multilist.h Avoid extra taskq_dispatch() calls by DMU 2019-06-25 12:03:38 -07:00
note.h
nvpair_impl.h OpenZFS 9580 - Add a hash-table on top of nvlist to speed-up operations 2018-07-30 11:30:03 -07:00
nvpair.h FreeBSD: make adjustments for the standalone environment 2020-10-16 13:04:41 -07:00
objlist.h Implement Redacted Send/Receive 2019-06-19 09:48:12 -07:00
pathname.h Replace ZFS on Linux references with OpenZFS 2020-10-16 13:01:24 -07:00
qat.h QAT related bug fixes 2019-09-12 13:33:44 -07:00
range_tree.h Improve compatibility with C++ consumers 2020-06-06 12:54:04 -07:00
rrwlock.h Rename refcount.h to zfs_refcount.h 2020-07-29 16:35:33 -07:00
sa_impl.h Rename refcount.h to zfs_refcount.h 2020-07-29 16:35:33 -07:00
sa.h Fix typos in include/ 2019-08-30 09:53:15 -07:00
skein.h
spa_boot.h
spa_checkpoint.h Serialize ZTHR operations to eliminate races 2019-01-13 10:09:46 -08:00
spa_checksum.h
spa_impl.h Rename refcount.h to zfs_refcount.h 2020-07-29 16:35:33 -07:00
spa_log_spacemap.h Log Spacemap Project 2019-07-16 10:11:49 -07:00
spa.h Throw const on some strings 2020-10-16 12:55:56 -07:00
space_map.h Extend zdb to print inconsistencies in livelists and metaslabs 2020-07-14 17:51:05 -07:00
space_reftree.h Reduce loaded range tree memory usage 2019-10-09 10:36:03 -07:00
sysevent.h
txg_impl.h Fix typos in include/ 2019-08-30 09:53:15 -07:00
txg.h OpenZFS 9425 - channel programs can be interrupted 2019-06-22 16:51:46 -07:00
u8_textprep_data.h
u8_textprep.h Throw const on some strings 2020-10-16 12:55:56 -07:00
uberblock_impl.h MMP interval and fail_intervals in uberblock 2019-03-21 12:47:57 -07:00
uberblock.h
uio_impl.h deadlock between mm_sem and tx assign in zfs_write() and page fault 2018-10-16 11:11:24 -07:00
unique.h
uuid.h
vdev_disk.h Make struct vdev_disk_t be platform private 2020-06-16 11:43:33 -07:00
vdev_file.h Add zfs_file_* interface, remove vnodes 2019-11-21 09:32:57 -08:00
vdev_impl.h Reduce latency effects of non-interactive I/O 2020-12-23 13:09:03 -08:00
vdev_indirect_births.h
vdev_indirect_mapping.h
vdev_initialize.h Add TRIM support 2019-03-29 09:13:20 -07:00
vdev_raidz_impl.h Add prototypes 2020-06-18 12:21:32 -07:00
vdev_raidz.h Linux 5.0 compat: SIMD compatibility 2019-07-12 09:31:20 -07:00
vdev_rebuild.h Add device rebuild feature 2020-07-03 11:05:50 -07:00
vdev_removal.h panic in removal_remap test on 4K devices 2019-06-13 13:12:39 -07:00
vdev_trim.h Trim L2ARC 2020-06-09 10:15:08 -07:00
vdev.h vdev_ashift should only be set once 2020-09-18 12:40:20 -07:00
xvattr.h Linux 4.18 compat: inode timespec -> timespec64 2018-06-19 21:51:18 -07:00
zap_impl.h
zap_leaf.h
zap.h fat zap should prefetch when iterating 2019-06-12 13:13:09 -07:00
zcp_global.h
zcp_iter.h
zcp_prop.h
zcp_set.h Support setting user properties in a channel program 2020-02-14 13:41:42 -08:00
zcp.h filesystem_limit/snapshot_limit is incorrectly enforced against root 2020-07-11 17:18:02 -07:00
zfeature.h
zfs_acl.h Return an error code from zfs_acl_chmod_setattr 2019-11-01 10:19:11 -07:00
zfs_bootenv.h zfs label bootenv should store data as nvlist 2020-09-15 18:36:12 -07:00
zfs_context.h FreeBSD: make adjustments for the standalone environment 2020-10-16 13:04:41 -07:00
zfs_debug.h Remove sdt.h 2019-10-25 13:38:37 -07:00
zfs_delay.h
zfs_file.h Re-share zfsdev_getminor and zfs_onexit_fd_hold 2020-02-28 14:50:32 -08:00
zfs_fuid.h Replace sprintf()->snprintf() and strcpy()->strlcpy() 2020-06-07 11:42:12 -07:00
zfs_ioctl_impl.h Make zc_nvlist_src_size limit tunable 2020-08-18 09:33:55 -07:00
zfs_ioctl.h Cross-platform acltype 2020-10-16 13:05:00 -07:00
zfs_onexit.h Remove deduplicated send/receive code 2020-04-23 10:06:57 -07:00
zfs_project.h Minor diff reduction with ZoF in include/sys 2019-11-27 11:11:03 -08:00
zfs_quota.h File incorrectly zeroed when receiving incremental stream that toggles -L 2020-06-09 10:41:01 -07:00
zfs_ratelimit.h
zfs_refcount.h Rename refcount.h to zfs_refcount.h 2020-07-29 16:35:33 -07:00
zfs_rlock.h Add a "try" operation for range locks 2020-07-06 11:53:31 -07:00
zfs_sa.h
zfs_stat.h
zfs_sysfs.h Fix in-kernel sysfs entries 2018-09-06 21:44:52 -07:00
zfs_vfsops.h Add 'zfs rename -u' to rename without remounting 2020-09-03 16:16:15 -07:00
zfs_znode.h G/C struct znode -> z_moved 2020-11-11 11:40:15 -08:00
zil_impl.h make zil max block size tunable 2019-06-10 11:48:42 -07:00
zil.h zil_parse: make callback parameters const 2020-10-16 13:01:53 -07:00
zio_checksum.h
zio_compress.h Add zstd support to zfs 2020-08-20 10:30:06 -07:00
zio_crypt.h Rename refcount.h to zfs_refcount.h 2020-07-29 16:35:33 -07:00
zio_impl.h Add zstd support to zfs 2020-08-20 10:30:06 -07:00
zio_priority.h Add device rebuild feature 2020-07-03 11:05:50 -07:00
zio.h Avoid posting duplicate zpool events 2020-09-09 10:26:00 -07:00
zrlock.h
zthr.h Introduce names for ZTHRs 2020-07-29 09:43:33 -07:00
zvol_impl.h Fix problems in zvol_set_volmode_impl 2020-11-17 12:20:09 -08:00
zvol.h async zvol minor node creation interferes with receive 2020-02-03 09:33:14 -08:00