mirror_zfs/include/sys
Prakash Surya 0735ecb334 OpenZFS 8997 - ztest assertion failure in zil_lwb_write_issue
PROBLEM
=======

When `dmu_tx_assign` is called from `zil_lwb_write_issue`, it's possible
for either `ERESTART` or `EIO` to be returned.

If `ERESTART` is returned, this will cause an assertion to fail directly
in `zil_lwb_write_issue`, where the code assumes the return value is
`EIO` if `dmu_tx_assign` returns a non-zero value. This can occur if the
SPA is suspended when `dmu_tx_assign` is called, and most often occurs
when running `zloop`.

If `EIO` is returned, this can cause assertions to fail elsewhere in the
ZIL code. For example, `zil_commit_waiter_timeout` contains the
following logic:

    lwb_t *nlwb = zil_lwb_write_issue(zilog, lwb);
    ASSERT3S(lwb->lwb_state, !=, LWB_STATE_OPENED);

In this case, if `dmu_tx_assign` returned `EIO` from within
`zil_lwb_write_issue`, the `lwb` variable passed in will not be issued
to disk. Thus, its `lwb_state` field will remain `LWB_STATE_OPENED` and
this assertion will fail. `zil_commit_waiter_timeout` assumes that after
it calls `zil_lwb_write_issue`, the `lwb` will be issued to disk, and
doesn't handle the case where this is not true; i.e. it doesn't handle
the case where `dmu_tx_assign` returns `EIO`.
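
The failure mode described above can be reproduced in miniature. The
following is only a toy model, not the ZFS code: every `toy_*` name is a
hypothetical stand-in, and the early `NULL` return is an assumption made
to mirror the "not issued to disk" path. It shows that when the assign
step fails, the caller's post-condition on the lwb state no longer holds.

```c
#include <errno.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; these are NOT the real ZFS types. */
typedef enum { TOY_LWB_OPENED, TOY_LWB_ISSUED } toy_lwb_state_t;
typedef struct toy_lwb { toy_lwb_state_t state; } toy_lwb_t;

/* Hypothetical model of the assign step: fails with EIO if the pool is suspended. */
static int
toy_tx_assign(int pool_suspended)
{
	return (pool_suspended ? EIO : 0);
}

/* Model of the issue step: on assign failure the lwb is left untouched. */
static toy_lwb_t *
toy_lwb_write_issue(toy_lwb_t *lwb, int pool_suspended)
{
	if (toy_tx_assign(pool_suspended) != 0)
		return (NULL);		/* state stays TOY_LWB_OPENED */
	lwb->state = TOY_LWB_ISSUED;
	return (lwb);
}

/* Returns 1 if the caller's "lwb is no longer opened" assumption holds. */
static int
toy_caller_postcondition_holds(int pool_suspended)
{
	toy_lwb_t lwb = { TOY_LWB_OPENED };

	(void) toy_lwb_write_issue(&lwb, pool_suspended);
	return (lwb.state != TOY_LWB_OPENED);
}
```

In this model, `toy_caller_postcondition_holds(1)` evaluates to 0, which
corresponds to the `ASSERT3S` in the caller firing.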

SOLUTION
========

This change modifies the `dmu_tx_assign` function such that `txg_how` is
a bitmask, rather than of the `txg_how_t` enum type. Now, the previous
`TXG_WAITED` semantics can be used via `TXG_NOTHROTTLE`, along with
specifying either `TXG_NOWAIT` or `TXG_WAIT` semantics.

Previously, when `TXG_WAITED` was specified, `TXG_NOWAIT` semantics were
automatically invoked. This was not ideal when using `TXG_WAITED` within
`zil_lwb_write_issue`, leading to the problem described above. Rather, we
want to achieve the semantics of `TXG_WAIT`, while also preventing the
`tx` from being penalized via the dirty delay throttling.

With this change, `zil_lwb_write_issue` can achieve the semantics that
it requires by passing in the value `TXG_WAIT | TXG_NOTHROTTLE` to
`dmu_tx_assign`.

Further, consumers of `dmu_tx_assign` wishing to achieve the old
`TXG_WAITED` semantics can pass in the value `TXG_NOWAIT | TXG_NOTHROTTLE`.
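
As a rough sketch of how a bitmask `txg_how` can be decoded, the snippet
below defines the three flag names used in this message and two helper
predicates. The numeric values and the `sketch_*` helpers are assumptions
for illustration; consult `dmu.h` for the actual definitions.

```c
#include <stdint.h>

/* Flag values are assumed for this sketch; see dmu.h for the real ones. */
#define	TXG_NOWAIT	(0ULL)
#define	TXG_WAIT	(1ULL << 0)
#define	TXG_NOTHROTTLE	(1ULL << 1)

/* Hypothetical helper: does the caller want to block until assignment succeeds? */
static int
sketch_waits(uint64_t txg_how)
{
	return ((txg_how & TXG_WAIT) != 0);
}

/* Hypothetical helper: should the dirty delay throttle be skipped for this tx? */
static int
sketch_skips_throttle(uint64_t txg_how)
{
	return ((txg_how & TXG_NOTHROTTLE) != 0);
}
```

Under these assumptions, `TXG_WAIT | TXG_NOTHROTTLE` both waits and skips
the throttle, while `TXG_NOWAIT | TXG_NOTHROTTLE` skips the throttle
without waiting, matching the old `TXG_WAITED` behavior described above.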

Authored by: Prakash Surya <prakash.surya@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Andriy Gapon <avg@FreeBSD.org>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>

Porting Notes:
- Additionally updated `zfs_tmpfile` to use `TXG_NOTHROTTLE`

OpenZFS-issue: https://www.illumos.org/issues/8997
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/19ea6cb0f9
Closes #7084
2018-01-26 20:19:46 -08:00
crypto
fm
fs
sysevent
abd.h OpenZFS 8416 - abd.h is not C++ friendly 2017-06-30 11:11:01 -07:00
arc_impl.h
arc.h Fix ARC hit rate 2018-01-08 09:52:36 -08:00
avl_impl.h
avl.h Remove dead code from AVL tree 2017-10-05 19:28:00 -07:00
blkptr.h OpenZFS 8067 - zdb should be able to dump literal embedded block pointer 2017-07-07 11:28:01 -07:00
bplist.h
bpobj.h
bptree.h
bqueue.h
dbuf.h OpenZFS 7531 - Assign correct flags to prefetched buffers 2017-11-11 20:24:34 -08:00
ddt.h Native Encryption for ZFS on Linux 2017-08-14 10:36:48 -07:00
dmu_impl.h
dmu_objset.h
dmu_send.h
dmu_traverse.h
dmu_tx.h
dmu_zfetch.h
dmu.h
dnode.h
dsl_bookmark.h
dsl_crypt.h
dsl_dataset.h
dsl_deadlist.h
dsl_deleg.h
dsl_destroy.h
dsl_dir.h
dsl_pool.h
dsl_prop.h
dsl_scan.h
dsl_synctask.h
dsl_userhold.h
edonr.h
efi_partition.h
frame.h
hkdf.h Encryption patch follow-up 2017-10-11 16:54:48 -04:00
Makefile.am Suppress incorrect objtool warnings 2017-12-07 10:28:50 -08:00
metaslab_impl.h
metaslab.h OpenZFS 7303 - dynamic metaslab selection 2017-01-12 11:52:56 -08:00
mmp.h Add callback for zfs_multihost_interval 2017-07-25 13:22:20 -04:00
mntent.h
multilist.h OpenZFS 7968 - multi-threaded spa_sync() 2017-03-20 18:36:00 -07:00
nvpair_impl.h
nvpair.h
pathname.h Add pn_alloc()/pn_free() functions 2016-04-21 09:49:25 -07:00
policy.h Add zfs allow and zfs unallow support 2016-06-07 09:16:52 -07:00
range_tree.h
refcount.h OpenZFS 8081 - Compiler warnings in zdb 2017-10-27 12:46:35 -07:00
rrwlock.h Illumos 5008 - lock contention (rrw_exit) while running a read only load 2015-07-06 09:34:13 -07:00
sa_impl.h
sa.h
sdt.h Add line info and SET_ERROR() to ZFS debug log 2017-07-25 23:09:48 -07:00
sha2.h OpenZFS 4185 - add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R 2016-10-03 14:51:15 -07:00
skein.h OpenZFS 4185 - add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R 2016-10-03 14:51:15 -07:00
spa_boot.h
spa_checksum.h
spa_impl.h
spa.h Extend deadman logic 2018-01-25 13:40:38 -08:00
space_map.h
space_reftree.h
sysevent.h OpenZFS 6939 - add sysevents to zfs core for commands 2017-07-12 21:28:13 -07:00
trace_acl.h
trace_arc.h
trace_common.h
trace_dbgmsg.h
trace_dbuf.h
trace_dmu.h
trace_dnode.h
trace_multilist.h
trace_txg.h
trace_zil.h
trace_zio.h
trace_zrlock.h
trace.h
txg_impl.h
txg.h OpenZFS 8063 - verify that we do not attempt to access inactive txg 2017-05-10 13:52:22 -04:00
u8_textprep_data.h
u8_textprep.h
uberblock_impl.h
uberblock.h Multi-modifier protection (MMP) 2017-07-13 13:54:00 -04:00
uio_impl.h
unique.h
uuid.h Support custom build directories and move includes 2010-09-08 12:38:56 -07:00
vdev_disk.h
vdev_file.h
vdev_impl.h
vdev_raidz_impl.h
vdev_raidz.h
vdev.h Extend deadman logic 2018-01-25 13:40:38 -08:00
xvattr.h
zap_impl.h
zap_leaf.h
zap.h OpenZFS 1300 - filename normalization doesn't work for removes 2017-02-02 14:13:41 -08:00
zfeature.h Revert "zhack: Add 'feature disable' command" 2016-05-17 11:52:07 -07:00
zfs_acl.h
zfs_context.h
zfs_ctldir.h
zfs_debug.h
zfs_delay.h
zfs_dir.h
zfs_fuid.h
zfs_ioctl.h
zfs_onexit.h
zfs_ratelimit.h
zfs_rlock.h
zfs_sa.h
zfs_stat.h
zfs_vfsops.h
zfs_vnops.h
zfs_znode.h
zil_impl.h
zil.h OpenZFS 8909 - 8585 can cause a use-after-free kernel panic 2017-12-28 10:18:04 -08:00
zio_checksum.h
zio_compress.h
zio_crypt.h
zio_impl.h
zio_priority.h
zio.h Extend deadman logic 2018-01-25 13:40:38 -08:00
zpl.h Use cstyle -cpP in make cstyle check 2016-12-12 10:46:26 -08:00
zrlock.h
zvol.h Add port of FreeBSD 'volmode' property 2017-07-12 13:05:37 -07:00