mirror of
https://git.proxmox.com/git/mirror_zfs.git
synced 2025-11-18 19:12:55 +03:00
PROBLEM
=======
When `dmu_tx_assign` is called from `zil_lwb_write_issue`, it's possible
for either `ERESTART` or `EIO` to be returned.
If `ERESTART` is returned, this will cause an assertion to fail directly
in `zil_lwb_write_issue`, where the code assumes the return value is
`EIO` if `dmu_tx_assign` returns a non-zero value. This can occur if the
SPA is suspended when `dmu_tx_assign` is called, and most often occurs
when running `zloop`.
If `EIO` is returned, this can cause assertions to fail elsewhere in the
ZIL code. For example, `zil_commit_waiter_timeout` contains the
following logic:
    lwb_t *nlwb = zil_lwb_write_issue(zilog, lwb);
    ASSERT3S(lwb->lwb_state, !=, LWB_STATE_OPENED);
In this case, if `dmu_tx_assign` returned `EIO` from within
`zil_lwb_write_issue`, the `lwb` variable passed in will not be issued
to disk. Thus, its `lwb_state` field will remain `LWB_STATE_OPENED` and
this assertion will fail. `zil_commit_waiter_timeout` assumes that after
it calls `zil_lwb_write_issue`, the `lwb` will be issued to disk, and
doesn't handle the case where this is not true; i.e. it doesn't handle
the case where `dmu_tx_assign` returns `EIO`.
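The failure mode described above can be modeled with a small standalone sketch. It is not the ZIL code: the `toy_` types and functions are hypothetical stand-ins, and `assign_error` simply plays the role of the value returned by `dmu_tx_assign`. It only illustrates why a caller that assumes the `lwb` was issued trips its check when the assignment fails.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the ZIL types (not the real ones). */
typedef enum toy_lwb_state {
	TOY_LWB_STATE_OPENED,
	TOY_LWB_STATE_ISSUED
} toy_lwb_state_t;

typedef struct toy_lwb {
	toy_lwb_state_t lwb_state;
} toy_lwb_t;

/*
 * Toy version of the issue path. `assign_error` stands in for the return
 * value of dmu_tx_assign: zero means success, non-zero means failure.
 * On failure the lwb is left untouched and NULL is returned.
 */
toy_lwb_t *
toy_lwb_write_issue(toy_lwb_t *lwb, toy_lwb_t *next, int assign_error)
{
	assert(lwb != NULL && next != NULL);

	if (assign_error != 0)
		return (NULL);	/* lwb remains TOY_LWB_STATE_OPENED */

	lwb->lwb_state = TOY_LWB_STATE_ISSUED;
	next->lwb_state = TOY_LWB_STATE_OPENED;
	return (next);
}

/*
 * The caller's assumption, expressed as a predicate instead of an
 * assertion: returns 1 if the lwb is no longer opened, 0 otherwise.
 */
int
toy_caller_postcondition_holds(const toy_lwb_t *lwb)
{
	return (lwb->lwb_state != TOY_LWB_STATE_OPENED);
}
```

With a non-zero `assign_error`, `toy_caller_postcondition_holds` returns 0 for the original `lwb`, which corresponds to the assertion failure in `zil_commit_waiter_timeout`.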
SOLUTION
========
This change modifies the `dmu_tx_assign` function such that `txg_how` is
a bitmask, rather than a value of the `txg_how_t` enum type. Now, the previous
`TXG_WAITED` semantics can be used via `TXG_NOTHROTTLE`, along with
specifying either `TXG_NOWAIT` or `TXG_WAIT` semantics.
Previously, when `TXG_WAITED` was specified, `TXG_NOWAIT` semantics were
automatically invoked. This was not ideal when using `TXG_WAITED` within
`zil_lwb_write_issue`, leading to the problem described above. Rather, we
want to achieve the semantics of `TXG_WAIT`, while also preventing the
`tx` from being penalized via the dirty delay throttling.
With this change, `zil_lwb_write_issue` can achieve the semantics that
it requires by passing in the value `TXG_WAIT | TXG_NOTHROTTLE` to
`dmu_tx_assign`.
Further, consumers of `dmu_tx_assign` wishing to achieve the old
`TXG_WAITED` semantics can pass in the value `TXG_NOWAIT | TXG_NOTHROTTLE`.
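As a rough illustration of how a bitmask lets the two concerns be combined independently, the sketch below decodes a `txg_how` value into "wait or not" and "throttle or not". The numeric flag values are assumptions made for this example (the real definitions belong in `dmu.h`), and `toy_txg_how_t` and `toy_decode_txg_how` are hypothetical helpers, not part of the DMU API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed flag values, for illustration only; see dmu.h for the real ones. */
#define	TXG_NOWAIT	(0ULL)
#define	TXG_WAIT	(1ULL << 0)
#define	TXG_NOTHROTTLE	(1ULL << 1)

/* Hypothetical decoded view of a txg_how bitmask. */
typedef struct toy_txg_how {
	bool wait;	/* block until the tx can be assigned */
	bool throttle;	/* tx is subject to dirty delay throttling */
} toy_txg_how_t;

/* Hypothetical helper: split the bitmask into its two independent choices. */
toy_txg_how_t
toy_decode_txg_how(uint64_t txg_how)
{
	toy_txg_how_t how;

	/* Only the two known bits may be set. */
	assert((txg_how & ~(TXG_WAIT | TXG_NOTHROTTLE)) == 0);

	how.wait = (txg_how & TXG_WAIT) != 0;
	how.throttle = (txg_how & TXG_NOTHROTTLE) == 0;
	return (how);
}
```

Under these assumptions, `TXG_WAIT | TXG_NOTHROTTLE` decodes to "wait, no throttle" (what `zil_lwb_write_issue` needs), while `TXG_NOWAIT | TXG_NOTHROTTLE` decodes to "no wait, no throttle" (the old `TXG_WAITED` behavior).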
Authored by: Prakash Surya <prakash.surya@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Andriy Gapon <avg@FreeBSD.org>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
Porting Notes:
- Additionally updated `zfs_tmpfile` to use `TXG_NOTHROTTLE`
OpenZFS-issue: https://www.illumos.org/issues/8997
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/19ea6cb0f9
Closes #7084
| Name | | |
|---|---|---|
| crypto | ||
| fm | ||
| fs | ||
| sysevent | ||
| abd.h | ||
| arc_impl.h | ||
| arc.h | ||
| avl_impl.h | ||
| avl.h | ||
| blkptr.h | ||
| bplist.h | ||
| bpobj.h | ||
| bptree.h | ||
| bqueue.h | ||
| dbuf.h | ||
| ddt.h | ||
| dmu_impl.h | ||
| dmu_objset.h | ||
| dmu_send.h | ||
| dmu_traverse.h | ||
| dmu_tx.h | ||
| dmu_zfetch.h | ||
| dmu.h | ||
| dnode.h | ||
| dsl_bookmark.h | ||
| dsl_crypt.h | ||
| dsl_dataset.h | ||
| dsl_deadlist.h | ||
| dsl_deleg.h | ||
| dsl_destroy.h | ||
| dsl_dir.h | ||
| dsl_pool.h | ||
| dsl_prop.h | ||
| dsl_scan.h | ||
| dsl_synctask.h | ||
| dsl_userhold.h | ||
| edonr.h | ||
| efi_partition.h | ||
| frame.h | ||
| hkdf.h | ||
| Makefile.am | ||
| metaslab_impl.h | ||
| metaslab.h | ||
| mmp.h | ||
| mntent.h | ||
| multilist.h | ||
| nvpair_impl.h | ||
| nvpair.h | ||
| pathname.h | ||
| policy.h | ||
| range_tree.h | ||
| refcount.h | ||
| rrwlock.h | ||
| sa_impl.h | ||
| sa.h | ||
| sdt.h | ||
| sha2.h | ||
| skein.h | ||
| spa_boot.h | ||
| spa_checksum.h | ||
| spa_impl.h | ||
| spa.h | ||
| space_map.h | ||
| space_reftree.h | ||
| sysevent.h | ||
| trace_acl.h | ||
| trace_arc.h | ||
| trace_common.h | ||
| trace_dbgmsg.h | ||
| trace_dbuf.h | ||
| trace_dmu.h | ||
| trace_dnode.h | ||
| trace_multilist.h | ||
| trace_txg.h | ||
| trace_zil.h | ||
| trace_zio.h | ||
| trace_zrlock.h | ||
| trace.h | ||
| txg_impl.h | ||
| txg.h | ||
| u8_textprep_data.h | ||
| u8_textprep.h | ||
| uberblock_impl.h | ||
| uberblock.h | ||
| uio_impl.h | ||
| unique.h | ||
| uuid.h | ||
| vdev_disk.h | ||
| vdev_file.h | ||
| vdev_impl.h | ||
| vdev_raidz_impl.h | ||
| vdev_raidz.h | ||
| vdev.h | ||
| xvattr.h | ||
| zap_impl.h | ||
| zap_leaf.h | ||
| zap.h | ||
| zfeature.h | ||
| zfs_acl.h | ||
| zfs_context.h | ||
| zfs_ctldir.h | ||
| zfs_debug.h | ||
| zfs_delay.h | ||
| zfs_dir.h | ||
| zfs_fuid.h | ||
| zfs_ioctl.h | ||
| zfs_onexit.h | ||
| zfs_ratelimit.h | ||
| zfs_rlock.h | ||
| zfs_sa.h | ||
| zfs_stat.h | ||
| zfs_vfsops.h | ||
| zfs_vnops.h | ||
| zfs_znode.h | ||
| zil_impl.h | ||
| zil.h | ||
| zio_checksum.h | ||
| zio_compress.h | ||
| zio_crypt.h | ||
| zio_impl.h | ||
| zio_priority.h | ||
| zio.h | ||
| zpl.h | ||
| zrlock.h | ||
| zvol.h | ||