Detect a slow raidz child during reads

A single slow responding disk can affect the overall read
performance of a raidz group.  When a raidz child disk is
determined to be a persistent slow outlier, then have it
sit out during reads for a period of time. The raidz group
can use parity to reconstruct the data that was skipped.

Each time a slow disk is placed into a sit out period, its
`vdev_stat.vs_slow_ios count` is incremented and a zevent
class `ereport.fs.zfs.delay` is posted.

The length of the sit out period can be changed using the
`raid_read_sit_out_secs` module parameter.  Setting it to
zero disables slow outlier detection.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Contributions-by: Don Brady <don.brady@klarasystems.com>
Contributions-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17227
This commit is contained in:
Paul Dagnelie
2025-08-27 16:41:48 -07:00
committed by Brian Behlendorf
parent 0df85ec27c
commit df55ba7c49
28 changed files with 1399 additions and 13 deletions
+32
View File
@@ -1112,6 +1112,16 @@ function get_pool_prop # property pool
zpool get -Hpo value "$prop" "$pool" || log_fail "zpool get $prop $pool"
}
# Get the specified vdev property in parsable format or fail
function get_vdev_prop
{
typeset prop="$1"
typeset pool="$2"
typeset vdev="$3"
zpool get -Hpo value "$prop" "$pool" "$vdev" || log_fail "zpool get $prop $pool $vdev"
}
# Return 0 if a pool exists; $? otherwise
#
# $1 - pool name
@@ -1970,6 +1980,28 @@ function wait_vdev_state # pool disk state timeout
return 1
}
#
# Wait for vdev 'sit_out' property to be cleared.
#
# $1 pool name
# $2 vdev name
# $3 timeout
#
function wait_sit_out #pool vdev timeout
{
typeset pool=${1:-$TESTPOOL}
typeset vdev="$2"
typeset timeout=${3:-300}
for (( timer = 0; timer < $timeout; timer++ )); do
if [ "$(get_vdev_prop sit_out "$pool" "$vdev")" = "off" ]; then
return 0
fi
sleep 1;
done
return 1
}
#
# Check the output of 'zpool status -v <pool>',
# and to see if the content of <token> contain the <keyword> specified.
+3
View File
@@ -72,6 +72,9 @@ MULTIHOST_INTERVAL multihost.interval zfs_multihost_interval
OVERRIDE_ESTIMATE_RECORDSIZE send.override_estimate_recordsize zfs_override_estimate_recordsize
PREFETCH_DISABLE prefetch.disable zfs_prefetch_disable
RAIDZ_EXPAND_MAX_REFLOW_BYTES vdev.expand_max_reflow_bytes raidz_expand_max_reflow_bytes
READ_SIT_OUT_SECS vdev.read_sit_out_secs vdev_read_sit_out_secs
SIT_OUT_CHECK_INTERVAL vdev.raidz_outlier_check_interval_ms vdev_raidz_outlier_check_interval_ms
SIT_OUT_INSENSITIVITY vdev.raidz_outlier_insensitivity vdev_raidz_outlier_insensitivity
REBUILD_SCRUB_ENABLED rebuild_scrub_enabled zfs_rebuild_scrub_enabled
REMOVAL_SUSPEND_PROGRESS removal_suspend_progress zfs_removal_suspend_progress
REMOVE_MAX_SEGMENT remove_max_segment zfs_remove_max_segment