mirror_zfs/module/zfs
Prakash Surya ef7a79488a OpenZFS 8997 - ztest assertion failure in zil_lwb_write_issue
PROBLEM
=======

When `dmu_tx_assign` is called from `zil_lwb_write_issue`, it's possible
for either `ERESTART` or `EIO` to be returned.

If `ERESTART` is returned, this will cause an assertion to fail directly
in `zil_lwb_write_issue`, where the code assumes the return value is
`EIO` if `dmu_tx_assign` returns a non-zero value. This can occur if the
SPA is suspended when `dmu_tx_assign` is called, and most often occurs
when running `zloop`.

If `EIO` is returned, this can cause assertions to fail elsewhere in the
ZIL code. For example, `zil_commit_waiter_timeout` contains the
following logic:

    lwb_t *nlwb = zil_lwb_write_issue(zilog, lwb);
    ASSERT3S(lwb->lwb_state, !=, LWB_STATE_OPENED);

In this case, if `dmu_tx_assign` returned `EIO` from within
`zil_lwb_write_issue`, the `lwb` variable passed in will not be issued
to disk. Thus, it's `lwb_state` field will remain `LWB_STATE_OPENED` and
this assertion will fail. `zil_commit_waiter_timeout` assumes that after
it calls `zil_lwb_write_issue`, the `lwb` will be issued to disk, and
doesn't handle the case where this is not true; i.e. it doesn't handle
the case where `dmu_tx_assign` returns `EIO`.

SOLUTION
========

This change modifies the `dmu_tx_assign` function such that `txg_how` is
a bitmask, rather than of the `txg_how_t` enum type. Now, the previous
`TXG_WAITED` semantics can be used via `TXG_NOTHROTTLE`, along with
specifying either `TXG_NOWAIT` or `TXG_WAIT` semantics.

Previously, when `TXG_WAITED` was specified, `TXG_NOWAIT` semantics was
automatically invoked. This was not ideal when using `TXG_WAITED` within
`zil_lwb_write_issued`, leading the problem described above. Rather, we
want to achieve the semantics of `TXG_WAIT`, while also preventing the
`tx` from being penalized via the dirty delay throttling.

With this change, `zil_lwb_write_issued` can acheive the semtantics that
it requires by passing in the value `TXG_WAIT | TXG_NOTHROTTLE` to
`dmu_tx_assign`.

Further, consumers of `dmu_tx_assign` wishing to achieve the old
`TXG_WAITED` semantics can pass in the value `TXG_NOWAIT | TXG_NOTHROTTLE`.

Authored by: Prakash Surya <prakash.surya@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Andriy Gapon <avg@FreeBSD.org>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>

Porting Notes:
- Additionally updated `zfs_tmpfile` to use `TXG_NOTHROTTLE`

OpenZFS-issue: https://www.illumos.org/issues/8997
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/19ea6cb0f9
Closes #7084
2018-07-06 02:46:51 -07:00
..
abd.c Update for cppcheck v1.80 2018-01-30 10:27:31 -06:00
arc.c Fix zfs_arc_max minimum tuning 2018-05-07 17:19:57 -07:00
blkptr.c OpenZFS 8067 - zdb should be able to dump literal embedded block pointer 2017-07-07 11:28:01 -07:00
bplist.c Change KM_PUSHPAGE -> KM_SLEEP 2015-01-16 14:41:26 -08:00
bpobj.c Don't dirty bpobj if it has no entries 2017-05-26 11:42:10 -07:00
bptree.c OpenZFS 7082 - bptree_iterate() passes wrong args to zfs_dbgmsg() 2017-01-17 14:49:24 -08:00
bqueue.c Call cv_signal() with mutex held 2017-06-26 14:36:49 -07:00
dbuf_stats.c Improved dnode allocation and dmu_hold_impl() (#6611) 2017-09-13 15:46:15 -07:00
dbuf.c Fix ARC hit rate 2018-01-30 10:27:31 -06:00
ddt_zap.c Change KM_PUSHPAGE -> KM_SLEEP 2015-01-16 14:41:26 -08:00
ddt.c Cache ddt_get_dedup_dspace() value if there was no ddt changes 2016-12-02 16:59:35 -07:00
dmu_diff.c OpenZFS 6950 - ARC should cache compressed data 2016-09-13 09:58:33 -07:00
dmu_object.c Improved dnode allocation and dmu_hold_impl() (#6611) 2017-09-13 15:46:15 -07:00
dmu_objset.c Linux 4.18 compat: inode timespec -> timespec64 2018-07-06 02:46:51 -07:00
dmu_send.c Fix 'zfs send/recv' hang with 16M blocks 2018-05-07 17:19:57 -07:00
dmu_traverse.c Fix hung z_zvol tasks during 'zfs receive' 2018-05-07 17:19:57 -07:00
dmu_tx.c OpenZFS 8997 - ztest assertion failure in zil_lwb_write_issue 2018-07-06 02:46:51 -07:00
dmu_zfetch.c OpenZFS 8835 - Speculative prefetch in ZFS not working for misaligned reads 2018-01-30 10:27:31 -06:00
dmu.c Fix dirty check in dmu_offset_next() 2017-11-21 13:11:29 -06:00
dnode_sync.c OpenZFS 7968 - multi-threaded spa_sync() 2017-03-20 18:36:00 -07:00
dnode.c Improved dnode allocation and dmu_hold_impl() (#6611) 2017-09-13 15:46:15 -07:00
dsl_bookmark.c OpenZFS 8377 - Panic in bookmark deletion 2017-06-30 11:11:01 -07:00
dsl_dataset.c OpenZFS 7600 - zfs rollback should pass target snapshot to kernel 2017-07-04 15:29:52 -07:00
dsl_deadlist.c OpenZFS 5428 - provide fts(), reallocarray(), and strtonum() 2017-07-08 20:35:35 -07:00
dsl_deleg.c Performance optimization of AVL tree comparator functions 2016-08-31 14:35:34 -07:00
dsl_destroy.c OpenZFS 7254 - ztest failed assertion in ztest_dataset_dirobj_verify: dirobjs + 1 == usedobjs 2017-01-27 11:43:42 -08:00
dsl_dir.c Linux 4.18 compat: inode timespec -> timespec64 2018-07-06 02:46:51 -07:00
dsl_pool.c Multi-modifier protection (MMP) 2017-07-13 13:54:00 -04:00
dsl_prop.c Fix dsl_props_set_sync_impl to work with nested nvlist 2016-12-20 18:46:59 -08:00
dsl_scan.c OpenZFS 6939 - add sysevents to zfs core for commands 2017-07-12 21:28:13 -07:00
dsl_synctask.c Illumos 4951 - ZFS administrative commands should use reserved space 2015-05-04 09:41:10 -07:00
dsl_userhold.c OpenZFS 5428 - provide fts(), reallocarray(), and strtonum() 2017-07-08 20:35:35 -07:00
edonr_zfs.c DLPX-44812 integrate EP-220 large memory scalability 2016-11-29 14:34:27 -08:00
fm.c Linux 4.18 compat: inode timespec -> timespec64 2018-07-06 02:46:51 -07:00
gzip.c GZIP compression offloading with QAT accelerator 2017-03-22 17:58:47 -07:00
lz4.c Fix LZ4_uncompress_unknownOutputSize caused panic 2017-05-19 13:45:46 -07:00
lzjb.c Change KM_PUSHPAGE -> KM_SLEEP 2015-01-16 14:41:26 -08:00
Makefile.in Multi-modifier protection (MMP) 2017-07-13 13:54:00 -04:00
metaslab.c OpenZFS 8023 - Panic destroying a metaslab deferred range tree 2017-04-09 16:12:35 -07:00
mmp.c Update mmp_delay on sync or skipped, failed write 2018-05-07 17:19:57 -07:00
multilist.c OpenZFS 7968 - multi-threaded spa_sync() 2017-03-20 18:36:00 -07:00
pathname.c Add pn_alloc()/pn_free() functions 2016-04-21 09:49:25 -07:00
policy.c Take user namespaces into account in policy checks 2018-03-14 16:10:38 -07:00
qat_compress.c Bug fix in qat_compress.c for vmalloc addr check 2018-03-14 16:10:36 -07:00
qat_compress.h GZIP compression offloading with QAT accelerator 2017-03-22 17:58:47 -07:00
range_tree.c Performance optimization of AVL tree comparator functions 2016-08-31 14:35:34 -07:00
refcount.c Linux 4.11 compat: avoid refcount_t name conflict 2017-02-28 16:10:18 -08:00
rrwlock.c Fix spelling 2017-01-03 11:31:18 -06:00
sa.c OpenZFS 8061 - sa_find_idx_tab can be declared more type-safely 2017-04-14 11:11:28 -07:00
sha256.c DLPX-44812 integrate EP-220 large memory scalability 2016-11-29 14:34:27 -08:00
skein_zfs.c DLPX-44812 integrate EP-220 large memory scalability 2016-11-29 14:34:27 -08:00
spa_boot.c
spa_config.c Remove vn_rename and vn_remove dependency 2018-03-14 16:10:36 -07:00
spa_errlog.c OpenZFS 5428 - provide fts(), reallocarray(), and strtonum() 2017-07-08 20:35:35 -07:00
spa_history.c Emit history events for 'zpool create' 2017-12-04 17:21:03 -08:00
spa_misc.c Report pool suspended due to MMP 2018-05-07 17:19:56 -07:00
spa_stats.c Record skipped MMP writes in multihost_history 2018-05-07 17:19:56 -07:00
spa.c Update mmp_delay on sync or skipped, failed write 2018-05-07 17:19:57 -07:00
space_map.c OpenZFS 8023 - Panic destroying a metaslab deferred range tree 2017-04-09 16:12:35 -07:00
space_reftree.c OpenZFS 6328 - Fix cstyle errors in zfs codebase 2017-01-12 09:42:11 -08:00
trace.c OpenZFS 6531 - Provide mechanism to artificially limit disk performance 2016-05-26 10:11:51 -07:00
txg.c OpenZFS 8063 - verify that we do not attempt to access inactive txg 2017-05-10 13:52:22 -04:00
uberblock.c Multi-modifier protection (MMP) 2017-07-13 13:54:00 -04:00
unique.c Performance optimization of AVL tree comparator functions 2016-08-31 14:35:34 -07:00
vdev_cache.c Fix wrong offset args in vdev_cache_write 2017-03-28 11:06:22 -07:00
vdev_disk.c Linux 4.14 compat: IO acct, global_page_state, etc (#6655) 2017-09-19 14:24:34 -07:00
vdev_file.c Skip spurious resilver IO on raidz vdev 2017-05-12 17:28:03 -07:00
vdev_label.c Update mmp_delay on sync or skipped, failed write 2018-05-07 17:19:57 -07:00
vdev_mirror.c vdev_mirror: load balancing fixes 2018-01-30 10:27:30 -06:00
vdev_missing.c Skip spurious resilver IO on raidz vdev 2017-05-12 17:28:03 -07:00
vdev_queue.c vdev_mirror: load balancing fixes 2018-01-30 10:27:30 -06:00
vdev_raidz_math_aarch64_neon_common.h ABD raidz NEON support 2016-11-29 14:34:33 -08:00
vdev_raidz_math_aarch64_neon.c codebase style improvements for OpenZFS 6459 port 2017-01-22 13:25:40 -08:00
vdev_raidz_math_aarch64_neonx2.c ABD raidz NEON support 2016-11-29 14:34:33 -08:00
vdev_raidz_math_avx2.c ABD raidz avx512f support 2016-11-29 14:34:33 -08:00
vdev_raidz_math_avx512bw.c ABD: Adapt avx512bw raidz assembly 2016-12-15 17:31:33 -08:00
vdev_raidz_math_avx512f.c Use cstyle -cpP in make cstyle check 2016-12-12 10:46:26 -08:00
vdev_raidz_math_impl.h codebase style improvements for OpenZFS 6459 port 2017-01-22 13:25:40 -08:00
vdev_raidz_math_scalar.c ABD Vectorized raidz 2016-11-29 14:34:33 -08:00
vdev_raidz_math_sse2.c ABD raidz avx512f support 2016-11-29 14:34:33 -08:00
vdev_raidz_math_ssse3.c codebase style improvements for OpenZFS 6459 port 2017-01-22 13:25:40 -08:00
vdev_raidz_math.c codebase style improvements for OpenZFS 6459 port 2017-01-22 13:25:40 -08:00
vdev_raidz.c Linux 4.14 compat: CONFIG_GCC_PLUGIN_RANDSTRUCT 2017-12-04 17:21:39 -08:00
vdev_root.c Skip spurious resilver IO on raidz vdev 2017-05-12 17:28:03 -07:00
vdev.c Add zfs_scan_ignore_errors tunable 2018-05-07 17:19:56 -07:00
zap_leaf.c Revert "Handle zap_add() failures in mixed ... " 2018-04-09 17:29:59 -04:00
zap_micro.c Revert "Handle zap_add() failures in mixed ... " 2018-04-09 17:29:59 -04:00
zap.c Revert "Handle zap_add() failures in mixed ... " 2018-04-09 17:29:59 -04:00
zfeature_common.c OpenZFS 2932 - support crash dumps to raidz, etc. pools 2017-04-10 10:24:17 -07:00
zfeature.c OpenZFS 6328 - Fix cstyle errors in zfs codebase 2017-01-12 09:42:11 -08:00
zfs_acl.c OpenZFS 8966 - Source file zfs_acl.c, function zfs_aclset_common contains a use after end of the lifetime of a local variable 2018-03-14 16:10:36 -07:00
zfs_byteswap.c
zfs_ctldir.c Linux 4.18 compat: inode timespec -> timespec64 2018-07-06 02:46:51 -07:00
zfs_debug.c Add line info and SET_ERROR() to ZFS debug log 2017-07-25 23:09:48 -07:00
zfs_dir.c Revert "Handle zap_add() failures in mixed ... " 2018-04-09 17:29:59 -04:00
zfs_fm.c OpenZFS 6939 - add sysevents to zfs core for commands 2017-07-12 21:28:13 -07:00
zfs_fuid.c Rename zfs_sb_t -> zfsvfs_t 2017-03-10 09:51:33 -08:00
zfs_ioctl.c Fix zfs_ioc_pool_sync should not use fnvlist 2018-01-30 10:27:30 -06:00
zfs_log.c OpenZFS 7578 - Fix/improve some aspects of ZIL writing 2017-06-09 09:15:37 -07:00
zfs_onexit.c zfsdev_getminor() should check for invalid file handles 2015-06-22 17:02:13 -07:00
zfs_replay.c Rename zfs_sb_t -> zfsvfs_t 2017-03-10 09:51:33 -08:00
zfs_rlock.c Fix spelling 2017-01-03 11:31:18 -06:00
zfs_sa.c Modifying XATTRs doesnt change the ctime 2017-09-13 16:05:18 -07:00
zfs_vfsops.c Revert "Long hold the dataset during upgrade" 2017-12-06 13:25:40 -06:00
zfs_vnops.c OpenZFS 8997 - ztest assertion failure in zil_lwb_write_issue 2018-07-06 02:46:51 -07:00
zfs_znode.c Linux 4.18 compat: inode timespec -> timespec64 2018-07-06 02:46:51 -07:00
zil.c OpenZFS 8997 - ztest assertion failure in zil_lwb_write_issue 2018-07-06 02:46:51 -07:00
zio_checksum.c Remove dependency on linear ABD 2017-03-29 12:24:51 -07:00
zio_compress.c DLPX-44812 integrate EP-220 large memory scalability 2016-11-29 14:34:27 -08:00
zio_inject.c Inject zinject(8) a percentage amount of dev errs 2017-06-16 17:21:11 -07:00
zio.c Report pool suspended due to MMP 2018-05-07 17:19:56 -07:00
zle.c Fix zle_decompress out of bound access 2018-03-14 16:10:36 -07:00
zpl_ctldir.c RHEL 7.5 compat: FMODE_KABI_ITERATE 2018-05-07 17:19:57 -07:00
zpl_export.c Use cstyle -cpP in make cstyle check 2016-12-12 10:46:26 -08:00
zpl_file.c RHEL 7.5 compat: FMODE_KABI_ITERATE 2018-05-07 17:19:57 -07:00
zpl_inode.c Linux 4.18 compat: inode timespec -> timespec64 2018-07-06 02:46:51 -07:00
zpl_super.c Allow mounting datasets more than once 2018-05-07 17:19:57 -07:00
zpl_xattr.c Update for cppcheck v1.80 2018-01-30 10:27:31 -06:00
zrlock.c OpenZFS 3746 - ZRLs are racy 2017-01-23 10:35:58 -08:00
zvol.c Linux compat 4.18: check_disk_size_change() 2018-07-06 02:46:51 -07:00