mirror_zfs/include/sys
Olaf Faaland 3d31aad83e MMP writes rotate over leaves
Instead of choosing a leaf vdev quasi-randomly, by starting at the root
vdev and randomly choosing children, rotate over leaves to issue MMP
writes.  This fixes an issue in a pool whose top-level vdevs have
different numbers of leaves.

The issue is that the frequency at which individual leaves are chosen
for MMP writes is based not on the total number of leaves but based on
how many siblings the leaves have.

For example, in a pool like this:

       root-vdev
   +------+---------------+
vdev1                   vdev2
  |                       |
  |                +------+-----+-----+----+
disk1             disk2 disk3 disk4 disk5 disk6

vdev1 and vdev2 will each be chosen 50% of the time.  Every time vdev1
is chosen, disk1 will be chosen.  However, every time vdev2 is chosen,
disk2 is chosen 20% of the time.  As a result, disk1 will be sent 5x as
many MMP writes as disk2.

This may create wear issues in the case of SSDs.  It also reduces the
effectiveness of MMP as it depends on the writes being evenly
distributed for the case where some devices fail or are partitioned.

The new code maintains a list of leaf vdevs in the pool.  MMP records
the last leaf used for an MMP write in mmp->mmp_last_leaf.  To choose
the next leaf, MMP starts at mmp->mmp_last_leaf and traverses the list,
continuing from the head if the tail is reached.  It stops when a
suitable leaf is found or all leaves have been examined.

Added a test to verify MMP write distribution is even.

Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Kash Pande <kash@tripleback.net>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #7953
2019-03-12 10:37:06 -07:00
..
crypto Add support for selecting encryption backend 2018-08-02 11:59:24 -07:00
fm Add zpool status -s (slow I/Os) and -p (parseable) 2018-11-08 16:47:24 -08:00
fs Reorder ZFS ioctls to fix cross-version compatibility 2019-03-09 13:39:31 -08:00
lua Fix coverity defects: zfs channel programs 2018-02-20 11:19:42 -08:00
sysevent OpenZFS 8959 - Add notifications when a scrub is paused or resumed 2018-01-17 10:31:00 -08:00
abd.h Linux 4.19-rc3+ compat: Remove refcount_t compat 2018-09-26 10:29:26 -07:00
aggsum.h OpenZFS 8484 - Implement aggregate sum and use for arc counters 2018-06-06 09:35:59 -07:00
arc_impl.h Linux 4.19-rc3+ compat: Remove refcount_t compat 2018-09-26 10:29:26 -07:00
arc.h Linux 4.19-rc3+ compat: Remove refcount_t compat 2018-09-26 10:29:26 -07:00
avl_impl.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
avl.h Remove dead code from AVL tree 2017-10-05 19:28:00 -07:00
blkptr.h OpenZFS 8067 - zdb should be able to dump literal embedded block pointer 2017-07-07 11:28:01 -07:00
bplist.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
bpobj.h OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
bptree.h Illumos 4914 - zfs on-disk bookmark structure should be named *_phys_t 2014-08-06 14:48:41 -07:00
bqueue.h Illumos 5960, 5925 2016-01-08 15:08:19 -08:00
cityhash.h OpenZFS 8484 - Implement aggregate sum and use for arc counters 2018-06-06 09:35:59 -07:00
dataset_kstats.h port async unlinked drain from illumos-nexenta 2019-02-12 10:41:15 -08:00
dbuf.h Linux 4.19-rc3+ compat: Remove refcount_t compat 2018-09-26 10:29:26 -07:00
ddt.h Incorrect maximum DVA value in DDE_GET_NDVAS() 2018-02-26 14:20:12 -08:00
dmu_impl.h Fix race in dnode_check_slots_free() 2018-04-10 11:15:05 -07:00
dmu_objset.h Pool allocation classes 2018-09-05 18:33:36 -07:00
dmu_recv.h Refactor dmu_recv into its own file 2018-10-09 14:05:13 -07:00
dmu_send.h Refactor dmu_recv into its own file 2018-10-09 14:05:13 -07:00
dmu_traverse.h Native Encryption for ZFS on Linux 2017-08-14 10:36:48 -07:00
dmu_tx.h Linux 4.19-rc3+ compat: Remove refcount_t compat 2018-09-26 10:29:26 -07:00
dmu_zfetch.h OpenZFS 6322 - ZFS indirect block predictive prefetch 2016-08-30 14:26:55 -07:00
dmu.h Stack overflow in recursive bpobj_iterate_impl 2019-03-06 09:50:55 -08:00
dnode.h Linux 4.19-rc3+ compat: Remove refcount_t compat 2018-09-26 10:29:26 -07:00
dsl_bookmark.h Refactor dmu_recv into its own file 2018-10-09 14:05:13 -07:00
dsl_crypt.h Refcounted DSL Crypto Key Mappings 2018-10-03 09:47:11 -07:00
dsl_dataset.h Add types to featureflags in zfs 2018-10-16 11:15:04 -07:00
dsl_deadlist.h OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
dsl_deleg.h OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
dsl_destroy.h OpenZFS 7431 - ZFS Channel Programs 2018-02-08 15:28:18 -08:00
dsl_dir.h Remove duplicate macro in dsl_dir.h 2018-10-01 10:40:11 -07:00
dsl_pool.h port async unlinked drain from illumos-nexenta 2019-02-12 10:41:15 -08:00
dsl_prop.h Illumos 6171 - dsl_prop_unregister() slows down dataset eviction. 2016-01-12 10:53:12 -08:00
dsl_scan.h OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
dsl_synctask.h OpenZFS 9166 - zfs storage pool checkpoint 2018-06-26 10:07:42 -07:00
dsl_userhold.h Illumos #3740 2013-11-04 11:17:48 -08:00
edonr.h OpenZFS 4185 - add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R 2016-10-03 14:51:15 -07:00
efi_partition.h Fix spelling 2017-01-03 11:31:18 -06:00
frame.h Suppress incorrect objtool warnings 2017-12-07 10:28:50 -08:00
hkdf.h Encryption patch follow-up 2017-10-11 16:54:48 -04:00
Makefile.am OpenZFS 9102 - zfs should be able to initialize storage devices 2019-01-07 10:37:26 -08:00
metaslab_impl.h Introduce auxiliary metaslab histograms 2019-02-20 09:59:56 -08:00
metaslab.h Introduce auxiliary metaslab histograms 2019-02-20 09:59:56 -08:00
mmp.h MMP writes rotate over leaves 2019-03-12 10:37:06 -07:00
mntent.h Make zfs mount according to relatime config in dataset 2016-04-05 18:55:59 -07:00
multilist.h OpenZFS 7968 - multi-threaded spa_sync() 2017-03-20 18:36:00 -07:00
note.h Update build system and packaging 2018-05-29 16:00:33 -07:00
nvpair_impl.h OpenZFS 9580 - Add a hash-table on top of nvlist to speed-up operations 2018-07-30 11:30:03 -07:00
nvpair.h Add new fnvlist_lookup_* functions 2018-10-03 15:30:55 -07:00
pathname.h Add pn_alloc()/pn_free() functions 2016-04-21 09:49:25 -07:00
policy.h Add zfs allow and zfs unallow support 2016-06-07 09:16:52 -07:00
range_tree.h Rename range_tree_verify to range_tree_verify_not_present 2019-01-25 09:51:24 -08:00
refcount.h Add zfs_refcount_transfer_ownership_many() 2018-10-09 10:05:48 -07:00
rrwlock.h Linux 4.19-rc3+ compat: Remove refcount_t compat 2018-09-26 10:29:26 -07:00
sa_impl.h Linux 4.19-rc3+ compat: Remove refcount_t compat 2018-09-26 10:29:26 -07:00
sa.h Project Quota on ZFS 2018-02-13 14:54:54 -08:00
sdt.h Add line info and SET_ERROR() to ZFS debug log 2017-07-25 23:09:48 -07:00
sha2.h OpenZFS 4185 - add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R 2016-10-03 14:51:15 -07:00
skein.h OpenZFS 4185 - add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R 2016-10-03 14:51:15 -07:00
spa_boot.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
spa_checkpoint.h Serialize ZTHR operations to eliminate races 2019-01-13 10:09:46 -08:00
spa_checksum.h Implementation of AVX2 optimized Fletcher-4 2016-06-02 14:30:51 -07:00
spa_impl.h MMP writes rotate over leaves 2019-03-12 10:37:06 -07:00
spa.h zfs initialize performance enhancements 2019-01-07 11:03:08 -08:00
space_map.h Introduce auxiliary metaslab histograms 2019-02-20 09:59:56 -08:00
space_reftree.h Illumos #4101, #4102, #4103, #4105, #4106 2014-07-22 09:39:16 -07:00
sysevent.h OpenZFS 6939 - add sysevents to zfs core for commands 2017-07-12 21:28:13 -07:00
trace_acl.h Linux 4.16 compat: inode_set_iversion() 2018-02-08 21:25:19 -08:00
trace_arc.h Support re-prioritizing asynchronous prefetches 2017-12-21 09:13:06 -08:00
trace_common.h OpenZFS 6531 - Provide mechanism to artificially limit disk performance 2016-05-26 10:11:51 -07:00
trace_dbgmsg.h Add line info and SET_ERROR() to ZFS debug log 2017-07-25 23:09:48 -07:00
trace_dbuf.h Prefix all refcount functions with zfs_ 2018-10-01 10:42:05 -07:00
trace_dmu.h tx_waited -> tx_dirty_delayed in trace_dmu.h 2018-01-31 16:13:26 -08:00
trace_dnode.h Fix build-it compilation regression 2017-01-24 08:50:15 -08:00
trace_multilist.h Fix build-it compilation regression 2017-01-24 08:50:15 -08:00
trace_txg.h Fix build-it compilation regression 2017-01-24 08:50:15 -08:00
trace_vdev.h OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
trace_zil.h OpenZFS 8585 - improve batching done in zil_commit() 2017-12-05 09:39:16 -08:00
trace_zio.h Use cstyle -cpP in make cstyle check 2016-12-12 10:46:26 -08:00
trace_zrlock.h Fix race in trace point in zrl_add_impl 2018-03-12 11:27:02 -07:00
trace.h Remove duplicate typedefs from trace.h 2015-01-06 16:53:24 -08:00
txg_impl.h OpenZFS 9464 - txg_kick() fails to see that we are quiescing 2018-06-04 14:56:06 -07:00
txg.h Small rework of txg_list code 2018-08-27 10:16:01 -07:00
u8_textprep_data.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
u8_textprep.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
uberblock_impl.h OpenZFS 9166 - zfs storage pool checkpoint 2018-06-26 10:07:42 -07:00
uberblock.h Multi-modifier protection (MMP) 2017-07-13 13:54:00 -04:00
uio_impl.h deadlock between mm_sem and tx assign in zfs_write() and page fault 2018-10-16 11:11:24 -07:00
unique.h Illumos #3742 2013-11-04 10:55:25 -08:00
uuid.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
vdev_disk.h Add support for autoexpand property 2018-07-23 15:40:15 -07:00
vdev_file.h Use a dedicated taskq for vdev_file 2016-12-21 10:47:15 -08:00
vdev_impl.h MMP writes rotate over leaves 2019-03-12 10:37:06 -07:00
vdev_indirect_births.h OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
vdev_indirect_mapping.h OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
vdev_initialize.h zfs initialize performance enhancements 2019-01-07 11:03:08 -08:00
vdev_raidz_impl.h Revert raidz_map and _col structure types 2018-01-09 14:46:52 -08:00
vdev_raidz.h Use cstyle -cpP in make cstyle check 2016-12-12 10:46:26 -08:00
vdev_removal.h OpenZFS 9166 - zfs storage pool checkpoint 2018-06-26 10:07:42 -07:00
vdev.h Defer new resilvers until the current one ends 2018-10-18 21:06:18 -07:00
xvattr.h Linux 4.18 compat: inode timespec -> timespec64 2018-06-19 21:51:18 -07:00
zap_impl.h OpenZFS 7793 - ztest fails assertion in dmu_tx_willuse_space 2017-03-07 09:51:59 -08:00
zap_leaf.h Fix ENOSPC in "Handle zap_add() failures in ..." 2018-04-18 14:19:50 -07:00
zap.h Provide more flexible object allocation interface 2019-01-10 14:37:43 -08:00
zcp_global.h OpenZFS 7431 - ZFS Channel Programs 2018-02-08 15:28:18 -08:00
zcp_iter.h OpenZFS 7431 - ZFS Channel Programs 2018-02-08 15:28:18 -08:00
zcp_prop.h OpenZFS 7431 - ZFS Channel Programs 2018-02-08 15:28:18 -08:00
zcp.h Add tunables for channel programs 2018-06-15 15:10:42 -07:00
zfeature.h Revert "zhack: Add 'feature disable' command" 2016-05-17 11:52:07 -07:00
zfs_acl.h Project Quota on ZFS 2018-02-13 14:54:54 -08:00
zfs_context.h Fix dnode_hold_impl() soft lockup 2019-02-22 09:48:37 -08:00
zfs_ctldir.h Rename zfs_sb_t -> zfsvfs_t 2017-03-10 09:51:33 -08:00
zfs_debug.h Document guidelines for usage of zfs_dbgmsg 2019-01-18 10:16:56 -08:00
zfs_delay.h Update build system and packaging 2018-05-29 16:00:33 -07:00
zfs_dir.h port async unlinked drain from illumos-nexenta 2019-02-12 10:41:15 -08:00
zfs_fuid.h Update build system and packaging 2018-05-29 16:00:33 -07:00
zfs_ioctl.h zfs should optionally send holds 2019-02-15 12:41:38 -08:00
zfs_onexit.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
zfs_project.h Project Quota on ZFS 2018-02-13 14:54:54 -08:00
zfs_ratelimit.h Change checksum & IO delay ratelimit values 2018-03-04 17:34:51 -08:00
zfs_rlock.h OpenZFS 9689 - zfs range lock code should not be zpl-specific 2018-10-11 10:19:33 -07:00
zfs_sa.h Project Quota on ZFS 2018-02-13 14:54:54 -08:00
zfs_stat.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
zfs_sysfs.h Fix in-kernel sysfs entries 2018-09-06 21:44:52 -07:00
zfs_vfsops.h port async unlinked drain from illumos-nexenta 2019-02-12 10:41:15 -08:00
zfs_vnops.h RHEL 7.5 compat: FMODE_KABI_ITERATE 2018-05-02 15:01:24 -07:00
zfs_znode.h Linux does not HAVE_SMB_SHARE 2018-10-17 10:31:38 -07:00
zil_impl.h OpenZFS 9962 - zil_commit should omit cache thrash 2018-12-07 11:09:42 -08:00
zil.h OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
zio_checksum.h Remove dependency on linear ABD 2017-03-29 12:24:51 -07:00
zio_compress.h DLPX-44812 integrate EP-220 large memory scalability 2016-11-29 14:34:27 -08:00
zio_crypt.h Add support for decryption faults in zinject 2018-05-02 15:36:20 -07:00
zio_impl.h Native Encryption for ZFS on Linux 2017-08-14 10:36:48 -07:00
zio_priority.h OpenZFS 9102 - zfs should be able to initialize storage devices 2019-01-07 10:37:26 -08:00
zio.h Add zpool status -s (slow I/Os) and -p (parseable) 2018-11-08 16:47:24 -08:00
zpl.h Linux 4.18 compat: inode timespec -> timespec64 2018-06-19 21:51:18 -07:00
zrlock.h OpenZFS 6328 - Fix cstyle errors in zfs codebase 2017-01-12 09:42:11 -08:00
zthr.h Serialize ZTHR operations to eliminate races 2019-01-13 10:09:46 -08:00
zvol.h Add port of FreeBSD 'volmode' property 2017-07-12 13:05:37 -07:00