mirror_zfs/module/zfs
Rob Norris 72602f6ad9 ZIL: "crash" the ZIL if the pool suspends during fallback
If the ZIL runs into trouble, it calls txg_wait_synced(), which blocks
on suspend. We want it to not block on suspend, instead returning an
error. On the surface, this is simple: change all of those calls to
txg_wait_synced_flags(TXG_WAIT_SUSPEND), and then thread the error
return back to the zil_commit() caller.

Handling suspension means returning an error to all commit waiters. This
is relatively straightforward, as zil_commit_waiter_t already has
zcw_zio_error to hold the write IO error. A write error there signals a
fallback to txg_wait_synced_flags(TXG_WAIT_SUSPEND), which will itself
fail while the pool is suspended, so the waiter can now return an error
from zil_commit().
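
As a rough illustration of that fallback path, here is a minimal
userspace sketch. Every toy_* name is hypothetical and stands in for the
real ZIL structures; only the ESHUTDOWN error on suspend is taken from
this commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real spa_t or zil_commit_waiter_t. */
typedef struct toy_pool {
	bool suspended;
} toy_pool_t;

typedef struct toy_commit_waiter {
	int zio_error;		/* plays the role of zcw_zio_error */
} toy_commit_waiter_t;

/*
 * Models txg_wait_synced_flags(TXG_WAIT_SUSPEND): fail with ESHUTDOWN
 * instead of blocking when the pool is suspended. The success path
 * simply pretends the txg has synced.
 */
static int
toy_wait_synced_nosuspend(const toy_pool_t *pool)
{
	if (pool->suspended)
		return (ESHUTDOWN);
	return (0);
}

/*
 * Result a waiter would hand back from the commit: success if the log
 * write succeeded, otherwise whatever the fallback wait returns.
 */
static int
toy_commit_finish(const toy_pool_t *pool, const toy_commit_waiter_t *zcw)
{
	assert(pool != NULL && zcw != NULL);
	if (zcw->zio_error == 0)
		return (0);
	return (toy_wait_synced_nosuspend(pool));
}
```

In this model a failed log write on a healthy pool still ends in
success, because the fallback wait completes; only a failed write on a
suspended pool surfaces an error to the caller.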

However, commit waiters are normally signalled when their associated
write (LWB) completes. If the pool has suspended, those IOs may not
return for some time, or maybe not at all. We still want to signal those
waiters so they can return from zil_commit(). We have a list of those
in-flight LWBs on zl_lwb_list, so we can run through those, detach them
and signal them. The LWB itself is still in-flight, but no longer has
attached waiters, so when it returns there will be nothing to do.
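
The detach-and-signal walk described above might look roughly like the
following. This is a simplified sketch using plain singly linked lists;
the toy_* types and functions are assumptions made for illustration, and
a boolean flag stands in for the condition variable the real waiters
sleep on.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for zil_commit_waiter_t and lwb_t. */
typedef struct toy_waiter {
	struct toy_waiter *next;
	bool done;		/* set when the waiter has been signalled */
	int error;		/* error the waiter will return */
} toy_waiter_t;

typedef struct toy_lwb {
	struct toy_lwb *next;	/* next in-flight LWB on the list */
	toy_waiter_t *waiters;	/* waiters attached to this LWB */
} toy_lwb_t;

/*
 * Walk the in-flight LWBs, detach every attached waiter and complete
 * it with the given error. Returns the number of waiters signalled.
 */
static int
toy_eject_waiters(toy_lwb_t *lwb_list, int error)
{
	int signalled = 0;

	assert(error != 0);
	for (toy_lwb_t *lwb = lwb_list; lwb != NULL; lwb = lwb->next) {
		toy_waiter_t *w = lwb->waiters;

		/* The LWB stays in flight, but no longer has waiters. */
		lwb->waiters = NULL;
		while (w != NULL) {
			toy_waiter_t *next = w->next;

			w->next = NULL;
			w->error = error;
			w->done = true;	/* real code would signal here */
			signalled++;
			w = next;
		}
	}
	return (signalled);
}
```

Because each LWB's waiter list is emptied before the waiters are
completed, a late IO completion for that LWB finds nothing attached and
has no one left to wake.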

(As an aside, ITXs can also supply completion callbacks, which are
called when they are destroyed. These are directly connected to LWBs
though, so they are passed the error code and destroyed there too.)

At this point, all ZIL waiters have been ejected, so we only have to
consider the internal state. We potentially still have ITXs that have
not been committed, LWBs still open, and LWBs in-flight. The on-disk ZIL
is in an unknown state; some writes may have been written but not
returned to us. We really can't rely on any of it; the best thing to do
is abandon it entirely and start over when the pool returns to service.
But, since we may have IO out that won't return until the pool resumes,
we need something for it to return to.

The simplest solution I could find, implemented here, is to "crash" the
ZIL: accept no new ITXs, make no further updates, and let it empty out
on its normal schedule, that is, as txgs complete and zil_sync() and
zil_clean() are called. We set a "restart txg" to three txgs in the
future (syncing + TXG_CONCURRENT_STATES), at which point all the
internal state will have been cleared out, and the ZIL can resume
operation (handled at the top of zil_clean()).
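
The restart-txg arithmetic can be sketched as below. The toy_* names are
hypothetical, and the exact comparison used to decide when to resume is
an assumption; only the "syncing + TXG_CONCURRENT_STATES" calculation
comes from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* OpenZFS has three concurrent txg states: open, quiescing, syncing. */
#define	TXG_CONCURRENT_STATES	3

/* Hypothetical, heavily simplified stand-in for the ZIL state. */
typedef struct toy_zil {
	uint64_t restart_txg;	/* 0 means the ZIL is not crashed */
} toy_zil_t;

/* Crash the ZIL: it may resume once restart_txg has been reached. */
static void
toy_zil_crash(toy_zil_t *zil, uint64_t syncing_txg)
{
	assert(zil->restart_txg == 0);
	zil->restart_txg = syncing_txg + TXG_CONCURRENT_STATES;
}

/* While crashed, the ZIL accepts no new ITXs. */
static bool
toy_zil_is_crashed(const toy_zil_t *zil)
{
	return (zil->restart_txg != 0);
}

/*
 * Called as each txg is cleaned up. Once the restart txg arrives, all
 * older internal state has drained and the ZIL can resume.
 */
static void
toy_zil_clean(toy_zil_t *zil, uint64_t synced_txg)
{
	if (zil->restart_txg != 0 && synced_txg >= zil->restart_txg)
		zil->restart_txg = 0;
}
```

For example, a crash while txg 100 is syncing sets the restart txg to
103, so the toy ZIL stays crashed through txg 102 and resumes at 103.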

This commit adds zil_crash(), which handles all of the above:
 - sets the restart txg
 - captures and signals all waiters
 - zeroes the header

zil_crash() is called when txg_wait_synced_flags(TXG_WAIT_SUSPEND)
returns because the pool suspended (ESHUTDOWN).

The rest of the commit is just threading the errors through, and related
housekeeping.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17398
2025-08-08 16:43:26 -07:00
abd.c Prefer VERIFY0(n) over VERIFY3U(n, ==, 0) 2025-08-07 11:41:25 -07:00
aggsum.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
arc.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
blake3_zfs.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
blkptr.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
bplist.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
bpobj.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
bptree.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
bqueue.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
brt.c BRT: Fix ZAP entry endianness 2025-07-30 09:42:47 -07:00
btree.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
dataset_kstats.c ZIL: "crash" the ZIL if the pool suspends during fallback 2025-08-08 16:43:26 -07:00
dbuf_stats.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
dbuf.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
ddt_log.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
ddt_stats.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
ddt_zap.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
ddt.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
dmu_diff.c Allow physical rewrite without logical 2025-08-06 10:36:07 -07:00
dmu_direct.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
dmu_object.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
dmu_objset.c Prefer VERIFY0P(n) over VERIFY(n == NULL) 2025-08-07 11:41:37 -07:00
dmu_recv.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
dmu_redact.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
dmu_send.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
dmu_traverse.c Allow physical rewrite without logical 2025-08-06 10:36:07 -07:00
dmu_tx.c Prefer VERIFY0(n) over VERIFY(n == 0) 2025-08-07 11:40:59 -07:00
dmu_zfetch.c Wire O_DIRECT also to Uncached I/O (#17218) 2025-05-13 14:26:55 -07:00
dmu.c Prefer VERIFY0(n) over VERIFY(n == 0) 2025-08-07 11:40:59 -07:00
dnode_sync.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
dnode.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
dsl_bookmark.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
dsl_crypt.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
dsl_dataset.c Prefer VERIFY0P(n) over VERIFY(n == NULL) 2025-08-07 11:41:37 -07:00
dsl_deadlist.c Prefer VERIFY0P(n) over VERIFY(n == NULL) 2025-08-07 11:41:37 -07:00
dsl_deleg.c Prefer VERIFY0(n) over VERIFY(n == 0) 2025-08-07 11:40:59 -07:00
dsl_destroy.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
dsl_dir.c Prefer VERIFY0(n) over VERIFY(n == 0) 2025-08-07 11:40:59 -07:00
dsl_pool.c Prefer VERIFY0P(n) over VERIFY(n == NULL) 2025-08-07 11:41:37 -07:00
dsl_prop.c Prefer VERIFY0(n) over VERIFY(n == 0) 2025-08-07 11:40:59 -07:00
dsl_scan.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
dsl_synctask.c dmu_tx_assign: make all VERIFY0 calls use DMU_TX_SUSPEND 2025-05-28 10:28:59 -07:00
dsl_userhold.c Prefer VERIFY0(n) over VERIFY(n == 0) 2025-08-07 11:40:59 -07:00
edonr_zfs.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
fm.c Prefer VERIFY0(n) over VERIFY(n == 0) 2025-08-07 11:40:59 -07:00
gzip.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
hkdf.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
lz4_zfs.c SPDX: license tags: BSD-2-Clause 2025-03-13 17:56:46 -07:00
lz4.c SPDX: license tags: BSD-2-Clause 2025-03-13 17:56:46 -07:00
lzjb.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
metaslab.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
mmp.c Prefer VERIFY0P(n) over VERIFY(n == NULL) 2025-08-07 11:41:37 -07:00
multilist.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
objlist.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
pathname.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
range_tree.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
refcount.c Implement allocation size ranges and use for gang leaves (#17111) 2025-05-02 15:32:18 -07:00
rrwlock.c Prefer VERIFY0P(n) over VERIFY(n == NULL) 2025-08-07 11:41:37 -07:00
sa.c Prefer VERIFY0(n) over VERIFY3U(n, ==, 0) 2025-08-07 11:41:25 -07:00
sha2_zfs.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
skein_zfs.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
spa_checkpoint.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
spa_config.c tunables: ensure tunable and variable have same define gate 2025-05-28 16:50:22 -07:00
spa_errlog.c Allow physical rewrite without logical 2025-08-06 10:36:07 -07:00
spa_history.c dmu_tx: rename dmu_tx_assign() flags from TXG_* to DMU_TX_* (#17143) 2025-03-18 16:04:22 -07:00
spa_log_spacemap.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
spa_misc.c Prefer VERIFY0P(n) over VERIFY(n == NULL) 2025-08-07 11:41:37 -07:00
spa_stats.c Prefer VERIFY0(n) over VERIFY(n == 0) 2025-08-07 11:40:59 -07:00
spa.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
space_map.c Prefer VERIFY0P(n) over VERIFY(n == NULL) 2025-08-07 11:41:37 -07:00
space_reftree.c Prefer VERIFY0(n) over VERIFY(n == 0) 2025-08-07 11:40:59 -07:00
THIRDPARTYLICENSE.cityhash OpenZFS 8484 - Implement aggregate sum and use for arc counters 2018-06-06 09:35:59 -07:00
THIRDPARTYLICENSE.cityhash.descrip OpenZFS 8484 - Implement aggregate sum and use for arc counters 2018-06-06 09:35:59 -07:00
txg.c txg_wait_synced_flags: add TXG_WAIT_SUSPEND flag to not wait if pool suspended 2025-05-28 10:27:46 -07:00
uberblock.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
unique.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
vdev_draid_rand.c SPDX: license tags: LicenseRef-OpenZFS-ThirdParty-PublicDomain 2025-03-13 17:57:31 -07:00
vdev_draid.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
vdev_file.c Implement allocation size ranges and use for gang leaves (#17111) 2025-05-02 15:32:18 -07:00
vdev_indirect_births.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
vdev_indirect_mapping.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
vdev_indirect.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
vdev_initialize.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
vdev_label.c Prefer VERIFY0(n) over VERIFY(n == 0) 2025-08-07 11:40:59 -07:00
vdev_mirror.c Allow physical rewrite without logical 2025-08-06 10:36:07 -07:00
vdev_missing.c Implement allocation size ranges and use for gang leaves (#17111) 2025-05-02 15:32:18 -07:00
vdev_queue.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
vdev_raidz_math_aarch64_neon_common.h SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
vdev_raidz_math_aarch64_neon.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
vdev_raidz_math_aarch64_neonx2.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
vdev_raidz_math_avx2.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
vdev_raidz_math_avx512bw.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
vdev_raidz_math_avx512f.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
vdev_raidz_math_impl.h SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
vdev_raidz_math_powerpc_altivec_common.h SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
vdev_raidz_math_powerpc_altivec.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
vdev_raidz_math_scalar.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
vdev_raidz_math_sse2.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
vdev_raidz_math_ssse3.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
vdev_raidz_math.c tunables: don't assert initialisation in impl getters 2025-05-28 16:50:22 -07:00
vdev_raidz.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
vdev_rebuild.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
vdev_removal.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
vdev_root.c Implement allocation size ranges and use for gang leaves (#17111) 2025-05-02 15:32:18 -07:00
vdev_trim.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
vdev.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
zap_leaf.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
zap_micro.c Prefer VERIFY0(n) over VERIFY(n == 0) 2025-08-07 11:40:59 -07:00
zap.c Prefer VERIFY0(n) over VERIFY3U(n, ==, 0) 2025-08-07 11:41:25 -07:00
zcp_get.c zcp: get_prop: fix encryptionroot and encryption 2025-05-27 20:04:37 -04:00
zcp_global.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
zcp_iter.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
zcp_set.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
zcp_synctask.c zcp_synctask: add zfs.sync.clone() 2025-06-10 14:53:10 -07:00
zcp.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
zfeature.c Prefer VERIFY0(n) over VERIFY(n == 0) 2025-08-07 11:40:59 -07:00
zfs_byteswap.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
zfs_chksum.c Faster checksum benchmark on system boot 2025-07-29 17:09:48 -07:00
zfs_crrd.c Add TXG timestamp database 2025-08-06 10:31:21 -07:00
zfs_debug_common.c nvlist: Add nvlist_snprintf() and zfs_dbgmsg_nvlist() 2025-04-18 09:22:16 -04:00
zfs_fm.c events: include zio type in IO error reports 2025-05-30 10:29:29 -04:00
zfs_fuid.c Prefer VERIFY0(n) over VERIFY(n == 0) 2025-08-07 11:40:59 -07:00
zfs_impl.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
zfs_ioctl.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
zfs_log.c ZIL: pass commit errors back to ITX callbacks 2025-08-08 16:43:20 -07:00
zfs_onexit.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
zfs_quota.c Prefer VERIFY0(n) over VERIFY(n == 0) 2025-08-07 11:40:59 -07:00
zfs_ratelimit.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
zfs_replay.c dmu_tx: rename dmu_tx_assign() flags from TXG_* to DMU_TX_* (#17143) 2025-03-18 16:04:22 -07:00
zfs_rlock.c Prefer VERIFY0(n) over VERIFY3U(n, ==, 0) 2025-08-07 11:41:25 -07:00
zfs_sa.c ZIL: allow zil_commit() to fail with error 2025-08-08 16:43:09 -07:00
zfs_vnops.c ZIL: allow zil_commit() to fail with error 2025-08-08 16:43:09 -07:00
zfs_znode.c Add default user/group/project quota properties 2025-04-03 10:35:22 -07:00
zil.c ZIL: "crash" the ZIL if the pool suspends during fallback 2025-08-08 16:43:26 -07:00
zio_checksum.c Prefer VERIFY0(n) over VERIFY(n == 0) 2025-08-07 11:40:59 -07:00
zio_compress.c Removed unused zio_decompress_fail_fraction variable 2025-08-06 17:10:03 -07:00
zio_inject.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
zio.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
zle.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
zrlock.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
zthr.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
zvol.c ZIL: pass commit errors back to ITX callbacks 2025-08-08 16:43:20 -07:00