mirror_zfs/include/sys
Prakash Surya ef7a79488a OpenZFS 8997 - ztest assertion failure in zil_lwb_write_issue
PROBLEM
=======

When `dmu_tx_assign` is called from `zil_lwb_write_issue`, it's possible
for either `ERESTART` or `EIO` to be returned.

If `ERESTART` is returned, this will cause an assertion to fail directly
in `zil_lwb_write_issue`, where the code assumes the return value is
`EIO` if `dmu_tx_assign` returns a non-zero value. This can occur if the
SPA is suspended when `dmu_tx_assign` is called, and most often occurs
when running `zloop`.

If `EIO` is returned, this can cause assertions to fail elsewhere in the
ZIL code. For example, `zil_commit_waiter_timeout` contains the
following logic:

    lwb_t *nlwb = zil_lwb_write_issue(zilog, lwb);
    ASSERT3S(lwb->lwb_state, !=, LWB_STATE_OPENED);

In this case, if `dmu_tx_assign` returned `EIO` from within
`zil_lwb_write_issue`, the `lwb` variable passed in will not be issued
to disk. Thus, it's `lwb_state` field will remain `LWB_STATE_OPENED` and
this assertion will fail. `zil_commit_waiter_timeout` assumes that after
it calls `zil_lwb_write_issue`, the `lwb` will be issued to disk, and
doesn't handle the case where this is not true; i.e. it doesn't handle
the case where `dmu_tx_assign` returns `EIO`.

SOLUTION
========

This change modifies the `dmu_tx_assign` function such that `txg_how` is
a bitmask, rather than of the `txg_how_t` enum type. Now, the previous
`TXG_WAITED` semantics can be used via `TXG_NOTHROTTLE`, along with
specifying either `TXG_NOWAIT` or `TXG_WAIT` semantics.

Previously, when `TXG_WAITED` was specified, `TXG_NOWAIT` semantics was
automatically invoked. This was not ideal when using `TXG_WAITED` within
`zil_lwb_write_issued`, leading the problem described above. Rather, we
want to achieve the semantics of `TXG_WAIT`, while also preventing the
`tx` from being penalized via the dirty delay throttling.

With this change, `zil_lwb_write_issued` can acheive the semtantics that
it requires by passing in the value `TXG_WAIT | TXG_NOTHROTTLE` to
`dmu_tx_assign`.

Further, consumers of `dmu_tx_assign` wishing to achieve the old
`TXG_WAITED` semantics can pass in the value `TXG_NOWAIT | TXG_NOTHROTTLE`.

Authored by: Prakash Surya <prakash.surya@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Andriy Gapon <avg@FreeBSD.org>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>

Porting Notes:
- Additionally updated `zfs_tmpfile` to use `TXG_NOTHROTTLE`

OpenZFS-issue: https://www.illumos.org/issues/8997
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/19ea6cb0f9
Closes #7084
2018-07-06 02:46:51 -07:00
..
crypto OpenZFS 4185 - add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R 2016-10-03 14:51:15 -07:00
fm OpenZFS 6939 - add sysevents to zfs core for commands 2017-07-12 21:28:13 -07:00
fs Report pool suspended due to MMP 2018-05-07 17:19:56 -07:00
sysevent Emit history events for 'zpool create' 2017-12-04 17:21:03 -08:00
abd.h OpenZFS 8416 - abd.h is not C++ friendly 2017-06-30 11:11:01 -07:00
arc_impl.h OpenZFS 7968 - multi-threaded spa_sync() 2017-03-20 18:36:00 -07:00
arc.h Fix ARC hit rate 2018-01-30 10:27:31 -06:00
avl_impl.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
avl.h Performance optimization of AVL tree comparator functions 2016-08-31 14:35:34 -07:00
blkptr.h OpenZFS 8067 - zdb should be able to dump literal embedded block pointer 2017-07-07 11:28:01 -07:00
bplist.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
bpobj.h Illumos 5810 - zdb should print details of bpobj 2015-05-11 15:10:24 -07:00
bptree.h Illumos 4914 - zfs on-disk bookmark structure should be named *_phys_t 2014-08-06 14:48:41 -07:00
bqueue.h Illumos 5960, 5925 2016-01-08 15:08:19 -08:00
dbuf.h OpenZFS 7531 - Assign correct flags to prefetched buffers 2017-11-21 13:11:29 -06:00
ddt.h DLPX-44812 integrate EP-220 large memory scalability 2016-11-29 14:34:27 -08:00
dmu_impl.h OpenZFS 7793 - ztest fails assertion in dmu_tx_willuse_space 2017-03-07 09:51:59 -08:00
dmu_objset.h Linux 4.18 compat: inode timespec -> timespec64 2018-07-06 02:46:51 -07:00
dmu_send.h Free objects when receiving full stream as clone 2017-10-16 10:57:55 -07:00
dmu_traverse.h OpenZFS 2605, 6980, 6902 2016-06-28 13:47:02 -07:00
dmu_tx.h OpenZFS 8997 - ztest assertion failure in zil_lwb_write_issue 2018-07-06 02:46:51 -07:00
dmu_zfetch.h OpenZFS 6322 - ZFS indirect block predictive prefetch 2016-08-30 14:26:55 -07:00
dmu.h OpenZFS 8997 - ztest assertion failure in zil_lwb_write_issue 2018-07-06 02:46:51 -07:00
dnode.h Improved dnode allocation and dmu_hold_impl() (#6611) 2017-09-13 15:46:15 -07:00
dsl_bookmark.h Illumos 4368, 4369. 2014-07-29 10:55:29 -07:00
dsl_dataset.h Typo in dsl_dataset.h 2017-10-17 16:49:03 -07:00
dsl_deadlist.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
dsl_deleg.h Add support for user/group dnode accounting & quota 2016-10-07 09:45:13 -07:00
dsl_destroy.h Illumos #3888 2013-11-04 11:18:14 -08:00
dsl_dir.h Linux 4.18 compat: inode timespec -> timespec64 2018-07-06 02:46:51 -07:00
dsl_pool.h Multi-modifier protection (MMP) 2017-07-13 13:54:00 -04:00
dsl_prop.h Illumos 6171 - dsl_prop_unregister() slows down dataset eviction. 2016-01-12 10:53:12 -08:00
dsl_scan.h Implemented zpool scrub pause/resume 2017-07-06 22:16:13 -07:00
dsl_synctask.h Illumos 4951 - ZFS administrative commands should use reserved space 2015-05-04 09:41:10 -07:00
dsl_userhold.h Illumos #3740 2013-11-04 11:17:48 -08:00
edonr.h OpenZFS 4185 - add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R 2016-10-03 14:51:15 -07:00
efi_partition.h Fix spelling 2017-01-03 11:31:18 -06:00
Makefile.am Multi-modifier protection (MMP) 2017-07-13 13:54:00 -04:00
metaslab_impl.h OpenZFS 7613 - ms_freetree[4] is only used in syncing context 2017-01-26 15:27:19 -08:00
metaslab.h OpenZFS 7303 - dynamic metaslab selection 2017-01-12 11:52:56 -08:00
mmp.h Record skipped MMP writes in multihost_history 2018-05-07 17:19:56 -07:00
mntent.h Make zfs mount according to relatime config in dataset 2016-04-05 18:55:59 -07:00
multilist.h OpenZFS 7968 - multi-threaded spa_sync() 2017-03-20 18:36:00 -07:00
nvpair_impl.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
nvpair.h Replace __va_list with va_list 2014-08-13 10:35:00 -07:00
pathname.h Add pn_alloc()/pn_free() functions 2016-04-21 09:49:25 -07:00
policy.h Add zfs allow and zfs unallow support 2016-06-07 09:16:52 -07:00
range_tree.h Illumos #4374 2014-07-30 09:20:35 -07:00
refcount.h Linux 4.11 compat: avoid refcount_t name conflict 2017-02-28 16:10:18 -08:00
rrwlock.h Illumos 5008 - lock contention (rrw_exit) while running a read only load 2015-07-06 09:34:13 -07:00
sa_impl.h Implement large_dnode pool feature 2016-06-24 13:13:21 -07:00
sa.h Remove unused sa_update_from_cb() 2016-12-01 16:39:06 -07:00
sdt.h Add line info and SET_ERROR() to ZFS debug log 2017-07-25 23:09:48 -07:00
sha2.h OpenZFS 4185 - add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R 2016-10-03 14:51:15 -07:00
skein.h OpenZFS 4185 - add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R 2016-10-03 14:51:15 -07:00
spa_boot.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
spa_checksum.h Implementation of AVX2 optimized Fletcher-4 2016-06-02 14:30:51 -07:00
spa_impl.h Linux 4.18 compat: inode timespec -> timespec64 2018-07-06 02:46:51 -07:00
spa.h Record skipped MMP writes in multihost_history 2018-05-07 17:19:56 -07:00
space_map.h Illumos 5164-5165 - space map fixes 2014-10-23 15:30:32 -07:00
space_reftree.h Illumos #4101, #4102, #4103, #4105, #4106 2014-07-22 09:39:16 -07:00
sysevent.h OpenZFS 6939 - add sysevents to zfs core for commands 2017-07-12 21:28:13 -07:00
trace_acl.h Linux 4.16 compat: inode_set_iversion() 2018-03-14 16:10:36 -07:00
trace_arc.h Fix build-it compilation regression 2017-01-24 08:50:15 -08:00
trace_common.h OpenZFS 6531 - Provide mechanism to artificially limit disk performance 2016-05-26 10:11:51 -07:00
trace_dbgmsg.h Add line info and SET_ERROR() to ZFS debug log 2017-07-25 23:09:48 -07:00
trace_dbuf.h Crash in dbuf_evict_one with DTRACE_PROBE 2017-08-21 16:41:22 -07:00
trace_dmu.h OpenZFS 7793 - ztest fails assertion in dmu_tx_willuse_space 2017-03-07 09:51:59 -08:00
trace_dnode.h Fix build-it compilation regression 2017-01-24 08:50:15 -08:00
trace_multilist.h Fix build-it compilation regression 2017-01-24 08:50:15 -08:00
trace_txg.h Fix build-it compilation regression 2017-01-24 08:50:15 -08:00
trace_zil.h OpenZFS 7578 - Fix/improve some aspects of ZIL writing 2017-06-09 09:15:37 -07:00
trace_zio.h Use cstyle -cpP in make cstyle check 2016-12-12 10:46:26 -08:00
trace_zrlock.h Use cstyle -cpP in make cstyle check 2016-12-12 10:46:26 -08:00
trace.h Remove duplicate typedefs from trace.h 2015-01-06 16:53:24 -08:00
txg_impl.h Fix spelling 2017-01-03 11:31:18 -06:00
txg.h OpenZFS 8063 - verify that we do not attempt to access inactive txg 2017-05-10 13:52:22 -04:00
u8_textprep_data.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
u8_textprep.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
uberblock_impl.h OpenZFS 8491 - uberblock on-disk padding to reserve space for smoothly merging zpool checkpoint & MMP in ZFS 2017-07-24 13:47:51 -04:00
uberblock.h Multi-modifier protection (MMP) 2017-07-13 13:54:00 -04:00
uio_impl.h Add basic uio support 2011-02-10 09:21:43 -08:00
unique.h Illumos #3742 2013-11-04 10:55:25 -08:00
uuid.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
vdev_disk.h Remove custom root pool import code 2016-08-11 11:19:34 -07:00
vdev_file.h Use a dedicated taskq for vdev_file 2016-12-21 10:47:15 -08:00
vdev_impl.h Change checksum & IO delay ratelimit values 2018-03-14 16:10:38 -07:00
vdev_raidz_impl.h Revert raidz_map and _col structure types 2018-01-30 10:27:31 -06:00
vdev_raidz.h Use cstyle -cpP in make cstyle check 2016-12-12 10:46:26 -08:00
vdev.h vdev_mirror: load balancing fixes 2018-01-30 10:27:30 -06:00
xvattr.h Linux 4.18 compat: inode timespec -> timespec64 2018-07-06 02:46:51 -07:00
zap_impl.h OpenZFS 7793 - ztest fails assertion in dmu_tx_willuse_space 2017-03-07 09:51:59 -08:00
zap_leaf.h Revert "Handle zap_add() failures in mixed ... " 2018-04-09 17:29:59 -04:00
zap.h OpenZFS 1300 - filename normalization doesn't work for removes 2017-02-02 14:13:41 -08:00
zfeature.h Revert "zhack: Add 'feature disable' command" 2016-05-17 11:52:07 -07:00
zfs_acl.h Rename zfs_sb_t -> zfsvfs_t 2017-03-10 09:51:33 -08:00
zfs_context.h Linux 4.18 compat: inode timespec -> timespec64 2018-07-06 02:46:51 -07:00
zfs_ctldir.h Rename zfs_sb_t -> zfsvfs_t 2017-03-10 09:51:33 -08:00
zfs_debug.h Add line info and SET_ERROR() to ZFS debug log 2017-07-25 23:09:48 -07:00
zfs_delay.h cstyle: Resolve C style issues 2013-12-18 16:46:35 -08:00
zfs_dir.h Rename zfs_sb_t -> zfsvfs_t 2017-03-10 09:51:33 -08:00
zfs_fuid.h Rename zfs_sb_t -> zfsvfs_t 2017-03-10 09:51:33 -08:00
zfs_ioctl.h Inject zinject(8) a percentage amount of dev errs 2017-06-16 17:21:11 -07:00
zfs_onexit.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
zfs_ratelimit.h Change checksum & IO delay ratelimit values 2018-03-14 16:10:38 -07:00
zfs_rlock.h Rename zfs_sb_t -> zfsvfs_t 2017-03-10 09:51:33 -08:00
zfs_sa.h Illumos 5027 - zfs large block support 2015-05-11 12:23:16 -07:00
zfs_stat.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
zfs_vfsops.h Linux 4.12 compat: super_setup_bdi_name() 2017-05-02 09:46:18 -07:00
zfs_vnops.h RHEL 7.5 compat: FMODE_KABI_ITERATE 2018-05-07 17:19:57 -07:00
zfs_znode.h Linux 4.18 compat: inode timespec -> timespec64 2018-07-06 02:46:51 -07:00
zil_impl.h OpenZFS 7578 - Fix/improve some aspects of ZIL writing 2017-06-09 09:15:37 -07:00
zil.h Preserve itx alloc size for zio_data_buf_free() 2017-12-04 17:21:39 -08:00
zio_checksum.h Remove dependency on linear ABD 2017-03-29 12:24:51 -07:00
zio_compress.h DLPX-44812 integrate EP-220 large memory scalability 2016-11-29 14:34:27 -08:00
zio_impl.h Compile zio.h and zio_impl.h mutual include 2016-12-01 16:36:25 -07:00
zio_priority.h Add -lhHpw options to "zpool iostat" for avg latency, histograms, & queues 2016-05-12 12:36:32 -07:00
zio.h Report pool suspended due to MMP 2018-05-07 17:19:56 -07:00
zpl.h Linux 4.18 compat: inode timespec -> timespec64 2018-07-06 02:46:51 -07:00
zrlock.h OpenZFS 6328 - Fix cstyle errors in zfs codebase 2017-01-12 09:42:11 -08:00
zvol.h Add port of FreeBSD 'volmode' property 2017-07-12 13:05:37 -07:00