Update deadman operation to better align with upstream OpenZFS

The deadman in ZoL did not behave quite as it does in upstream
OpenZFS.  In addition to the two purposes for which OpenZFS uses the
zfs_deadman_synctime_ms parameter, ZoL also used it to determine how
frequently the deadman would fire once it had been triggered.

This patch adds the zfs_deadman_checktime_ms parameter to control how
frequently the subsequent checks are performed.
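
The resulting schedule can be sketched as follows.  This is only an
illustration of the timing described above, not code from the patch;
the helper name deadman_fire_time_ms is hypothetical, and taskq
scheduling latency is ignored.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of ZoL: returns the time in milliseconds,
 * measured from the start of a sync, at which the nth deadman check
 * (n = 0 is the first) would run.
 */
uint64_t
deadman_fire_time_ms(uint64_t synctime_ms, uint64_t checktime_ms, uint64_t n)
{
	/* Guard against overflow of the multiplication and the sum below. */
	assert(n == 0 || checktime_ms <= (UINT64_MAX - synctime_ms) / n);

	/*
	 * The first check runs once the sync has lasted synctime_ms; each
	 * later check is re-armed checktime_ms after the previous one.
	 */
	return (synctime_ms + n * checktime_ms);
}
```

With the default values (1000000 and 5000), the third check (n = 2)
lands at 1010000 ms after the sync started.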

The deadman is now disabled for suspended pools.

As before, and unlike upstream OpenZFS, ZoL will not panic when
a hung IO is detected.
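
The control flow of the handler after this change can be modelled
roughly as below.  This is a toy sketch under stated assumptions:
struct toy_pool and toy_deadman are hypothetical stand-ins for spa_t
and spa_deadman(), and the return value stands in for the decision to
re-dispatch the delayed task.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the pool state; not the real spa_t. */
struct toy_pool {
	bool suspended;
	unsigned calls;		/* number of times the handler has logged */
};

/*
 * Toy model of the handler.  Returns true if the handler wants to be
 * re-armed for another check, false if it stops.
 */
bool
toy_deadman(struct toy_pool *p)
{
	assert(p != NULL);

	/* A suspended pool is skipped and the check is not re-armed. */
	if (p->suspended)
		return (false);

	/* A slow sync is only logged; the handler never panics. */
	p->calls++;
	fprintf(stderr, "slow sync: calls %u\n", p->calls);
	return (true);
}
```

Returning early for a suspended pool means no further checks are
scheduled for that sync, which matches the placement of the new
spa_suspended() test ahead of the re-dispatch in spa_deadman().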

The module parameter documentation has been updated to include the new
parameter and to better describe the operation of the deadman.
    
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Tim Chase <tim@chase2k.com>
Closes #5695
Tim Chase 2017-01-31 16:19:08 -06:00 committed by Brian Behlendorf
parent e24548975c
commit b81a3ddc32
2 changed files with 41 additions and 9 deletions

View File

@@ -765,9 +765,28 @@ Default value: \fB0\fR.
 \fBzfs_deadman_enabled\fR (int)
 .ad
 .RS 12n
-Enable deadman timer. See description below.
+When a pool sync operation takes longer than \fBzfs_deadman_synctime_ms\fR
+milliseconds, a "slow spa_sync" message is logged to the debug log
+(see \fBzfs_dbgmsg_enable\fR). If \fBzfs_deadman_enabled\fR is set,
+all pending IO operations are also checked and if any haven't completed
+within \fBzfs_deadman_synctime_ms\fR milliseconds, a "SLOW IO" message
+is logged to the debug log and a "delay" system event with the details of
+the hung IO is posted.
 .sp
-Use \fB1\fR for yes (default) and \fB0\fR to disable.
+Use \fB1\fR (default) to enable the slow IO check and \fB0\fR to disable.
+.RE
+
+.sp
+.ne 2
+.na
+\fBzfs_deadman_checktime_ms\fR (int)
+.ad
+.RS 12n
+Once a pool sync operation has taken longer than
+\fBzfs_deadman_synctime_ms\fR milliseconds, continue to check for slow
+operations every \fBzfs_deadman_checktime_ms\fR milliseconds.
+.sp
+Default value: \fB5,000\fR.
 .RE
 
 .sp
@@ -776,12 +795,11 @@ Use \fB1\fR for yes (default) and \fB0\fR to disable.
 \fBzfs_deadman_synctime_ms\fR (ulong)
 .ad
 .RS 12n
-Expiration time in milliseconds. This value has two meanings. First it is
-used to determine when the spa_deadman() logic should fire. By default the
-spa_deadman() will fire if spa_sync() has not completed in 1000 seconds.
-Secondly, the value determines if an I/O is considered "hung". Any I/O that
-has not completed in zfs_deadman_synctime_ms is considered "hung" resulting
-in a zevent being logged.
+Interval in milliseconds after which the deadman is triggered and also
+the interval after which an IO operation is considered to be "hung"
+if \fBzfs_deadman_enabled\fR is set.
+
+See \fBzfs_deadman_enabled\fR.
 .sp
 Default value: \fB1,000,000\fR.
 .RE

View File

@@ -297,6 +297,12 @@ int zfs_free_leak_on_eio = B_FALSE;
  */
 unsigned long zfs_deadman_synctime_ms = 1000000ULL;
 
+/*
+ * Check time in milliseconds. This defines the frequency at which we check
+ * for hung I/O.
+ */
+unsigned long zfs_deadman_checktime_ms = 5000ULL;
+
 /*
  * By default the deadman is enabled.
  */
@@ -524,6 +530,10 @@ spa_deadman(void *arg)
 {
 	spa_t *spa = arg;
 
+	/* Disable the deadman if the pool is suspended. */
+	if (spa_suspended(spa))
+		return;
+
 	zfs_dbgmsg("slow spa_sync: started %llu seconds ago, calls %llu",
 	    (gethrtime() - spa->spa_sync_starttime) / NANOSEC,
 	    ++spa->spa_deadman_calls);
@@ -532,7 +542,7 @@ spa_deadman(void *arg)
 
 	spa->spa_deadman_tqid = taskq_dispatch_delay(system_delay_taskq,
 	    spa_deadman, spa, TQ_SLEEP, ddi_get_lbolt() +
-	    NSEC_TO_TICK(spa->spa_deadman_synctime));
+	    MSEC_TO_TICK(zfs_deadman_checktime_ms));
 }
 
 /*
@@ -2114,6 +2124,10 @@ MODULE_PARM_DESC(zfs_free_leak_on_eio,
 module_param(zfs_deadman_synctime_ms, ulong, 0644);
 MODULE_PARM_DESC(zfs_deadman_synctime_ms, "Expiration time in milliseconds");
 
+module_param(zfs_deadman_checktime_ms, ulong, 0644);
+MODULE_PARM_DESC(zfs_deadman_checktime_ms,
+	"Dead I/O check interval in milliseconds");
+
 module_param(zfs_deadman_enabled, int, 0644);
 MODULE_PARM_DESC(zfs_deadman_enabled, "Enable deadman timer");