mirror_zfs/module/zfs
Brian Behlendorf 8fb1ede146 Extend deadman logic
The intent of this patch is to extend the existing deadman code
such that it's flexible enough to be used by both ztest and
on production systems.  The proposed changes include:

* Added a new `zfs_deadman_failmode` module option which is
  used to dynamically control the behavior of the deadman.  It's
  loosely modeled after, but independent from, the pool failmode
  property.  It can be set to wait, continue, or panic.

    * wait     - Wait for the "hung" I/O (default)
    * continue - Attempt to recover from a "hung" I/O
    * panic    - Panic the system

* Added a new `zfs_deadman_ziotime_ms` module option which is
  analogous to `zfs_deadman_synctime_ms` except instead of
  applying to a pool TXG sync it applies to zio_wait().  A
  default value of 300s is used to define a "hung" zio.

* The ztest deadman thread has been re-enabled by default,
  aligned with the upstream OpenZFS code, and then extended
  to terminate the process when it takes significantly longer
  to complete than expected.

* The -G option was added to ztest to print the internal debug
  log when a fatal error is encountered.  This same option was
  previously added to zdb in commit fa603f82.  Update zloop.sh
  to unconditionally pass -G to obtain additional debugging.

* The FM_EREPORT_ZFS_DELAY event which was previously posted
  when the deadman detected a "hung" pool has been replaced by
  a new dedicated FM_EREPORT_ZFS_DEADMAN event.

* The proposed recovery logic attempts to restart a "hung"
  zio by calling zio_interrupt() on any outstanding leaf zios.
  We may want to further restrict this to zios in either the
  ZIO_STAGE_VDEV_IO_START or ZIO_STAGE_VDEV_IO_DONE stages.
  Calling zio_interrupt() is expected to only be useful for
  cases when an IO has been submitted to the physical device
  but for some reason the completion callback hasn't been
  called by the lower layers.  This shouldn't be possible but
  has been observed and may be caused by kernel/driver bugs.

* The `zfs_deadman_synctime_ms` default value was reduced from
  1000s to 600s.

* Depending on how ztest fails there may be no cache file to
  move.  This should not be considered fatal; collect the logs
  which are available and carry on.

* Add deadman test cases for spa_deadman() and zio_wait().

* Increase default zfs_deadman_checktime_ms to 60s.
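
The failmode and timeout behavior described above can be sketched in
C roughly as follows.  This is a minimal illustration, not the code
from the patch: every type, macro, and function name here is
hypothetical, the "strictly longer than the limit" test is an
assumption, and so is the fallback to "wait" for an unrecognized mode
string.  Only the three mode names and the default values (300s, 600s,
60s) come from the description.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Illustrative sketch only.  All names below are hypothetical and are
 * not the identifiers used by the patch.
 */
typedef enum {
	FAILMODE_WAIT,		/* wait for the "hung" I/O (default) */
	FAILMODE_CONTINUE,	/* attempt to recover from a "hung" I/O */
	FAILMODE_PANIC,		/* panic the system */
	FAILMODE_INVALID	/* string matched none of the three modes */
} failmode_t;

typedef enum {
	ACTION_NONE,		/* I/O is still within its time limit */
	ACTION_KEEP_WAITING,
	ACTION_RESTART_IO,
	ACTION_PANIC
} action_t;

/* Defaults quoted in the commit message, expressed in milliseconds. */
#define	DEFAULT_ZIOTIME_MS	(300ULL * 1000)
#define	DEFAULT_SYNCTIME_MS	(600ULL * 1000)
#define	DEFAULT_CHECKTIME_MS	(60ULL * 1000)

/* Map an option string to a mode. */
failmode_t
failmode_parse(const char *s)
{
	assert(s != NULL);
	if (strcmp(s, "wait") == 0)
		return (FAILMODE_WAIT);
	if (strcmp(s, "continue") == 0)
		return (FAILMODE_CONTINUE);
	if (strcmp(s, "panic") == 0)
		return (FAILMODE_PANIC);
	return (FAILMODE_INVALID);
}

/*
 * Assumption: an I/O is "hung" once it has been outstanding strictly
 * longer than the limit.  The first comparison guards against
 * unsigned wrap-around if the clock appears to run backwards.
 */
int
io_is_hung(uint64_t now_ms, uint64_t start_ms, uint64_t limit_ms)
{
	return (now_ms > start_ms && now_ms - start_ms > limit_ms);
}

/* Decide what a periodic deadman check would do for one I/O. */
action_t
deadman_action(failmode_t mode, uint64_t now_ms, uint64_t start_ms,
    uint64_t limit_ms)
{
	if (!io_is_hung(now_ms, start_ms, limit_ms))
		return (ACTION_NONE);

	switch (mode) {
	case FAILMODE_CONTINUE:
		return (ACTION_RESTART_IO);
	case FAILMODE_PANIC:
		return (ACTION_PANIC);
	default:
		/* Assumption: unknown modes fall back to the default. */
		return (ACTION_KEEP_WAITING);
	}
}
```

For example, deadman_action(failmode_parse("continue"), now, start,
DEFAULT_ZIOTIME_MS) yields ACTION_RESTART_IO only after the I/O has
been outstanding for more than 300s; before that it yields ACTION_NONE
regardless of the mode.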

Reviewed-by: Tim Chase <tim@chase2k.com>
Reviewed-by: Thomas Caputi <tcaputi@datto.com>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6999
2018-01-25 13:40:38 -08:00
abd.c Update for cppcheck v1.80 2017-11-18 14:08:00 -08:00
arc.c Fix ARC hit rate 2018-01-08 09:52:36 -08:00
blkptr.c Undo c89 workarounds to match with upstream 2017-11-04 13:25:13 -07:00
bplist.c Change KM_PUSHPAGE -> KM_SLEEP 2015-01-16 14:41:26 -08:00
bpobj.c Undo c89 workarounds to match with upstream 2017-11-04 13:25:13 -07:00
bptree.c Native Encryption for ZFS on Linux 2017-08-14 10:36:48 -07:00
bqueue.c Call cv_signal() with mutex held 2017-06-26 14:36:49 -07:00
dbuf_stats.c Improved dnode allocation and dmu_hold_impl() 2017-09-05 16:15:04 -07:00
dbuf.c Fix ARC hit rate 2018-01-08 09:52:36 -08:00
ddt_zap.c Change KM_PUSHPAGE -> KM_SLEEP 2015-01-16 14:41:26 -08:00
ddt.c Sequential scrub and resilvers 2017-11-15 17:27:01 -08:00
dmu_diff.c OpenZFS 6950 - ARC should cache compressed data 2016-09-13 09:58:33 -07:00
dmu_object.c Improved dnode allocation and dmu_hold_impl() 2017-09-05 16:15:04 -07:00
dmu_objset.c Long hold the dataset during upgrade 2017-11-10 13:37:10 -08:00
dmu_send.c Undo c89 workarounds to match with upstream 2017-11-04 13:25:13 -07:00
dmu_traverse.c Sequential scrub and resilvers 2017-11-15 17:27:01 -08:00
dmu_tx.c Call commit callbacks from the tail of the list 2017-12-22 10:19:51 -08:00
dmu_zfetch.c OpenZFS 8835 - Speculative prefetch in ZFS not working for misaligned reads 2018-01-19 09:31:29 -08:00
dmu.c OpenZFS 8585 - improve batching done in zil_commit() 2017-12-05 09:39:16 -08:00
dnode_sync.c Undo c89 workarounds to match with upstream 2017-11-04 13:25:13 -07:00
dnode.c Undo c89 workarounds to match with upstream 2017-11-04 13:25:13 -07:00
dsl_bookmark.c Undo c89 workarounds to match with upstream 2017-11-04 13:25:13 -07:00
dsl_crypt.c Fix encryption root hierarchy issue 2017-11-08 15:25:30 -08:00
dsl_dataset.c Undo c89 workarounds to match with upstream 2017-11-04 13:25:13 -07:00
dsl_deadlist.c OpenZFS 5428 - provide fts(), reallocarray(), and strtonum() 2017-07-08 20:35:35 -07:00
dsl_deleg.c Performance optimization of AVL tree comparator functions 2016-08-31 14:35:34 -07:00
dsl_destroy.c Undo c89 workarounds to match with upstream 2017-11-04 13:25:13 -07:00
dsl_dir.c Native Encryption for ZFS on Linux 2017-08-14 10:36:48 -07:00
dsl_pool.c Sequential scrub and resilvers 2017-11-15 17:27:01 -08:00
dsl_prop.c Native Encryption for ZFS on Linux 2017-08-14 10:36:48 -07:00
dsl_scan.c OpenZFS 8959 - Add notifications when a scrub is paused or resumed 2018-01-17 10:31:00 -08:00
dsl_synctask.c Illumos 4951 - ZFS administrative commands should use reserved space 2015-05-04 09:41:10 -07:00
dsl_userhold.c Undo c89 workarounds to match with upstream 2017-11-04 13:25:13 -07:00
edonr_zfs.c DLPX-44812 integrate EP-220 large memory scalability 2016-11-29 14:34:27 -08:00
fm.c Linux 4.14 compat: CONFIG_GCC_PLUGIN_RANDSTRUCT 2017-11-28 17:33:48 -06:00
gzip.c GZIP compression offloading with QAT accelerator 2017-03-22 17:58:47 -07:00
hkdf.c Encryption patch follow-up 2017-10-11 16:54:48 -04:00
lz4.c Fix LZ4_uncompress_unknownOutputSize caused panic 2017-05-19 13:45:46 -07:00
lzjb.c Change KM_PUSHPAGE -> KM_SLEEP 2015-01-16 14:41:26 -08:00
Makefile.in Support -fsanitize=address with --enable-asan 2018-01-10 10:49:27 -08:00
metaslab.c Sequential scrub and resilvers 2017-11-15 17:27:01 -08:00
mmp.c Emit an error message before MMP suspends pool 2018-01-17 12:24:42 -08:00
multilist.c Undo c89 workarounds to match with upstream 2017-11-04 13:25:13 -07:00
pathname.c Add pn_alloc()/pn_free() functions 2016-04-21 09:49:25 -07:00
policy.c codebase style improvements for OpenZFS 6459 port 2017-01-22 13:25:40 -08:00
qat_compress.c Bug fix in qat_compress.c when compressed size is < 4KB 2017-11-07 14:51:30 -08:00
qat_compress.h GZIP compression offloading with QAT accelerator 2017-03-22 17:58:47 -07:00
range_tree.c Sequential scrub and resilvers 2017-11-15 17:27:01 -08:00
refcount.c Linux 4.11 compat: avoid refcount_t name conflict 2017-02-28 16:10:18 -08:00
rrwlock.c Fix spelling 2017-01-03 11:31:18 -06:00
sa.c Fix coverity defects: CID 147474 2017-10-10 16:41:47 -07:00
sha256.c DLPX-44812 integrate EP-220 large memory scalability 2016-11-29 14:34:27 -08:00
skein_zfs.c DLPX-44812 integrate EP-220 large memory scalability 2016-11-29 14:34:27 -08:00
spa_boot.c Add linux kernel module support 2010-08-31 13:41:58 -07:00
spa_config.c Undo c89 workarounds to match with upstream 2017-11-04 13:25:13 -07:00
spa_errlog.c Native Encryption for ZFS on Linux 2017-08-14 10:36:48 -07:00
spa_history.c Emit history events for 'zpool create' 2017-10-23 09:45:59 -07:00
spa_misc.c Extend deadman logic 2018-01-25 13:40:38 -08:00
spa_stats.c Update the default for zfs_txg_history 2017-09-29 15:58:52 -07:00
spa.c OpenZFS 8652 - Tautological comparisons with ZPROP_INVAL 2018-01-19 09:22:37 -08:00
space_map.c Undo c89 workarounds to match with upstream 2017-11-04 13:25:13 -07:00
space_reftree.c OpenZFS 6328 - Fix cstyle errors in zfs codebase 2017-01-12 09:42:11 -08:00
trace.c OpenZFS 6531 - Provide mechanism to artificially limit disk performance 2016-05-26 10:11:51 -07:00
txg.c OpenZFS 8585 - improve batching done in zil_commit() 2017-12-05 09:39:16 -08:00
uberblock.c Multi-modifier protection (MMP) 2017-07-13 13:54:00 -04:00
unique.c Performance optimization of AVL tree comparator functions 2016-08-31 14:35:34 -07:00
vdev_cache.c Undo c89 workarounds to match with upstream 2017-11-04 13:25:13 -07:00
vdev_disk.c Fix printk() calls missing log level 2017-09-25 10:38:27 -07:00
vdev_file.c Skip spurious resilver IO on raidz vdev 2017-05-12 17:28:03 -07:00
vdev_label.c Undo c89 workarounds to match with upstream 2017-11-04 13:25:13 -07:00
vdev_mirror.c Linux 4.14 compat: CONFIG_GCC_PLUGIN_RANDSTRUCT 2017-11-28 17:33:48 -06:00
vdev_missing.c Skip spurious resilver IO on raidz vdev 2017-05-12 17:28:03 -07:00
vdev_queue.c Support re-prioritizing asynchronous prefetches 2017-12-21 09:13:06 -08:00
vdev_raidz_math_aarch64_neon_common.h ABD raidz NEON support 2016-11-29 14:34:33 -08:00
vdev_raidz_math_aarch64_neon.c codebase style improvements for OpenZFS 6459 port 2017-01-22 13:25:40 -08:00
vdev_raidz_math_aarch64_neonx2.c ABD raidz NEON support 2016-11-29 14:34:33 -08:00
vdev_raidz_math_avx2.c ABD raidz avx512f support 2016-11-29 14:34:33 -08:00
vdev_raidz_math_avx512bw.c ABD: Adapt avx512bw raidz assembly 2016-12-15 17:31:33 -08:00
vdev_raidz_math_avx512f.c Use cstyle -cpP in make cstyle check 2016-12-12 10:46:26 -08:00
vdev_raidz_math_impl.h codebase style improvements for OpenZFS 6459 port 2017-01-22 13:25:40 -08:00
vdev_raidz_math_scalar.c ABD Vectorized raidz 2016-11-29 14:34:33 -08:00
vdev_raidz_math_sse2.c ABD raidz avx512f support 2016-11-29 14:34:33 -08:00
vdev_raidz_math_ssse3.c codebase style improvements for OpenZFS 6459 port 2017-01-22 13:25:40 -08:00
vdev_raidz_math.c codebase style improvements for OpenZFS 6459 port 2017-01-22 13:25:40 -08:00
vdev_raidz.c Linux 4.14 compat: CONFIG_GCC_PLUGIN_RANDSTRUCT 2017-11-28 17:33:48 -06:00
vdev_root.c Undo c89 workarounds to match with upstream 2017-11-04 13:25:13 -07:00
vdev.c Extend deadman logic 2018-01-25 13:40:38 -08:00
zap_leaf.c Use SET_ERROR for constant non-zero return codes 2017-08-02 21:16:12 -07:00
zap_micro.c Undo c89 workarounds to match with upstream 2017-11-04 13:25:13 -07:00
zap.c Sequential scrub and resilvers 2017-11-15 17:27:01 -08:00
zfeature.c Undo c89 workarounds to match with upstream 2017-11-04 13:25:13 -07:00
zfs_acl.c Linux 4.14 compat: CONFIG_GCC_PLUGIN_RANDSTRUCT 2017-11-28 17:33:48 -06:00
zfs_byteswap.c Add linux kernel module support 2010-08-31 13:41:58 -07:00
zfs_ctldir.c Use SET_ERROR for constant non-zero return codes 2017-08-02 21:16:12 -07:00
zfs_debug.c Add line info and SET_ERROR() to ZFS debug log 2017-07-25 23:09:48 -07:00
zfs_dir.c Use zap_count instead of cached z_size for unlink 2018-01-09 16:16:07 -08:00
zfs_fm.c OpenZFS 8731 - ASSERT3U(nui64s, <=, UINT16_MAX) fails for large blocks 2018-01-25 10:02:11 -08:00
zfs_fuid.c Rename zfs_sb_t -> zfsvfs_t 2017-03-10 09:51:33 -08:00
zfs_ioctl.c Long hold the dataset during upgrade 2017-11-10 13:37:10 -08:00
zfs_log.c OpenZFS 7578 - Fix/improve some aspects of ZIL writing 2017-06-09 09:15:37 -07:00
zfs_onexit.c zfsdev_getminor() should check for invalid file handles 2015-06-22 17:02:13 -07:00
zfs_ratelimit.c Add libtpool (thread pools) 2017-08-09 15:31:08 -07:00
zfs_replay.c OpenZFS 8081 - Compiler warnings in zdb 2017-10-27 12:46:35 -07:00
zfs_rlock.c Fix spelling 2017-01-03 11:31:18 -06:00
zfs_sa.c Modifying XATTRs doesnt change the ctime 2017-09-13 12:20:07 -07:00
zfs_vfsops.c Fix 'zfs get {user|group}objused@' functionality 2017-11-29 11:59:22 -08:00
zfs_vnops.c OpenZFS 8585 - improve batching done in zil_commit() 2017-12-05 09:39:16 -08:00
zfs_znode.c OpenZFS 8930 - zfs_zinactive: do not remove the node if the filesystem is readonly 2018-01-11 13:50:08 -08:00
zil.c OpenZFS 8909 - 8585 can cause a use-after-free kernel panic 2017-12-28 10:18:04 -08:00
zio_checksum.c Undo c89 workarounds to match with upstream 2017-11-04 13:25:13 -07:00
zio_compress.c Undo c89 workarounds to match with upstream 2017-11-04 13:25:13 -07:00
zio_crypt.c Fix for #6714 2017-10-11 16:59:42 -04:00
zio_inject.c Undo c89 workarounds to match with upstream 2017-11-04 13:25:13 -07:00
zio.c Extend deadman logic 2018-01-25 13:40:38 -08:00
zle.c Update core ZFS code from build 121 to build 141. 2010-05-28 13:45:14 -07:00
zpl_ctldir.c Linux 4.12 compat: CURRENT_TIME removed 2017-05-10 09:30:48 -07:00
zpl_export.c Use cstyle -cpP in make cstyle check 2016-12-12 10:46:26 -08:00
zpl_file.c misc: fix meaningless values 2017-09-19 12:19:08 -07:00
zpl_inode.c Linux 4.12 compat: CURRENT_TIME removed 2017-05-10 09:30:48 -07:00
zpl_super.c Restructure mount option handling 2017-03-10 09:51:41 -08:00
zpl_xattr.c Update for cppcheck v1.80 2017-11-18 14:08:00 -08:00
zrlock.c Undo c89 workarounds to match with upstream 2017-11-04 13:25:13 -07:00
zvol.c OpenZFS 8585 - improve batching done in zil_commit() 2017-12-05 09:39:16 -08:00