mirror_zfs/module/zfs
Alexander Motin f9d59b579e ZIL: Relax parallel write ZIOs processing
ZIL introduced dependencies between its write ZIOs to permit flush
deferral, where we flush vdev caches only once all the write ZIOs have
completed.  But it was recently spotted that this serializes not only
ZIO completion handling, but also the ready stage.  This means the ZIO
pipeline can't calculate checksums for the following ZIOs until all
the previous ones are checksummed, even though that is not required.
On systems where the memory throughput of a single CPU core is
limited, this creates a single-core CPU bottleneck, which is difficult
to see because the ZIO pipeline spreads work across many taskqueue
threads.

While it would be great to bypass the ready stage waits, it would
require changes to ZIO code, and I haven't found a clean way to do
it.  But I've noticed that we don't need any dependency between
the write ZIOs if the previous one has some waiters, since in that
case it won't defer any flushes and will work as a barrier for the
earlier ones.

Bypassing it won't help large single-thread writes, since in that
case all the write ZIOs except the last won't have waiters, and so
will remain dependent.  But there the ZIO processing might not be a
bottleneck anyway, since only one thread will be populating the
write buffers, and that thread will likely be the bottleneck.

But bypassing the ZIO dependency on multi-threaded write workloads
really allows them to scale beyond the checksumming throughput of
one CPU core.
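
The rule described above can be sketched with a toy model.  This is
not the code from zil.c: struct toy_write and both helper functions
are hypothetical names invented for illustration, and the model keeps
only the one property the text relies on, whether the previous write
has waiters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for one ZIL write; not the real lwb. */
struct toy_write {
	int nwaiters;	/* threads waiting for this write to become durable */
};

/*
 * Rule from the text: the next write needs a dependency on the previous
 * one only when the previous write has no waiters.  A previous write
 * with waiters will not defer its flush, so it already acts as a
 * barrier for the earlier writes.
 */
static bool
toy_needs_dependency(const struct toy_write *prev)
{
	if (prev == NULL)	/* first write in the queue */
		return (false);
	assert(prev->nwaiters >= 0);
	return (prev->nwaiters == 0);
}

/* Count how many writes in a queue end up chained behind a predecessor. */
static size_t
toy_count_dependencies(const struct toy_write *w, size_t n)
{
	size_t deps = 0;

	for (size_t i = 1; i < n; i++) {
		if (toy_needs_dependency(&w[i - 1]))
			deps++;
	}
	return (deps);
}
```

In this model a large single-thread write, where only the last write
has a waiter, keeps every dependency, while a queue in which every
write has a waiter keeps none, which matches the two cases discussed
above.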

My tests writing 12 files on the same dataset, on a pool with 4
striped NVMes as SLOGs, from 12 threads with 1MB blocks, on a system
with a Xeon Silver 4114 CPU, show total throughput increasing from
4.3GB/s to 8.5GB/s, and SLOG busy rising from ~30% to ~70%.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes #17458
2025-08-05 12:14:18 -04:00
..
abd.c Export correct symbols for Lustre Direct I/O 2025-05-28 16:00:28 -07:00
aggsum.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
arc.c ARC: parallel eviction 2025-06-17 10:50:26 -07:00
blake3_zfs.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
blkptr.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
bplist.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
bpobj.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
bptree.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
bqueue.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
brt.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
btree.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
dataset_kstats.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
dbuf_stats.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
dbuf.c Improve block cloning transactions accounting 2025-06-17 10:50:26 -07:00
ddt_log.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
ddt_stats.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
ddt_zap.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
ddt.c tunables: fix spelling 2025-06-17 10:50:26 -07:00
dmu_diff.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
dmu_direct.c Export correct symbols for Lustre Direct I/O 2025-05-28 16:00:28 -07:00
dmu_object.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
dmu_objset.c dmu_objset_hold_flags() should call dsl_dataset_rele_flags() on error 2025-05-28 16:00:28 -07:00
dmu_recv.c cred: properly pass and test creds on other threads (#17273) 2025-05-28 16:00:28 -07:00
dmu_redact.c dmu_tx: rename dmu_tx_assign() flags from TXG_* to DMU_TX_* (#17143) 2025-04-16 09:59:45 -07:00
dmu_send.c Fix 2 bugs in non-raw send with encryption 2025-05-28 16:00:28 -07:00
dmu_traverse.c dmu_traverse: remove 'ignore_hole_birth' tunable alias 2025-06-17 10:50:27 -07:00
dmu_tx.c Improve block cloning transactions accounting 2025-06-17 10:50:26 -07:00
dmu_zfetch.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
dmu.c Reduce zfs_dmu_offset_next_sync penalty 2025-06-17 10:50:26 -07:00
dnode_sync.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
dnode.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
dsl_bookmark.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
dsl_crypt.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
dsl_dataset.c Expose dataset encryption status via fast stat path 2025-06-17 10:49:40 -07:00
dsl_deadlist.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
dsl_deleg.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
dsl_destroy.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
dsl_dir.c cred: properly pass and test creds on other threads (#17273) 2025-05-28 16:00:28 -07:00
dsl_pool.c During pool export flush the ARC asynchronously 2025-06-17 10:50:26 -07:00
dsl_prop.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
dsl_scan.c Cause zpool scan resume commands to get logged in history 2025-05-28 16:00:28 -07:00
dsl_synctask.c txg: generalise txg_wait_synced_sig() to txg_wait_synced_flags() (#17284) 2025-05-28 16:00:28 -07:00
dsl_userhold.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
edonr_zfs.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
fm.c tunables: ensure tunable and variable have same define gate 2025-06-17 10:50:26 -07:00
gzip.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
hkdf.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
lz4_zfs.c SPDX: license tags: BSD-2-Clause 2025-04-16 09:59:44 -07:00
lz4.c SPDX: license tags: BSD-2-Clause 2025-04-16 09:59:44 -07:00
lzjb.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
metaslab.c Block remap for cloned blocks on device removal 2025-04-16 09:59:45 -07:00
mmp.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
multilist.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
objlist.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
pathname.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
range_tree.c Fix off-by-one bug in range tree code 2025-06-17 10:49:40 -07:00
refcount.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
rrwlock.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
sa.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
sha2_zfs.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
skein_zfs.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
spa_checkpoint.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
spa_config.c tunables: ensure tunable and variable have same define gate 2025-06-17 10:50:26 -07:00
spa_errlog.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
spa_history.c dmu_tx: rename dmu_tx_assign() flags from TXG_* to DMU_TX_* (#17143) 2025-04-16 09:59:45 -07:00
spa_log_spacemap.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
spa_misc.c During pool export flush the ARC asynchronously 2025-06-17 10:50:26 -07:00
spa_stats.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
spa.c Set spa_final_txg in spa_unload() 2025-06-17 10:50:26 -07:00
space_map.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
space_reftree.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
THIRDPARTYLICENSE.cityhash OpenZFS 8484 - Implement aggregate sum and use for arc counters 2018-06-06 09:35:59 -07:00
THIRDPARTYLICENSE.cityhash.descrip OpenZFS 8484 - Implement aggregate sum and use for arc counters 2018-06-06 09:35:59 -07:00
txg.c txg: generalise txg_wait_synced_sig() to txg_wait_synced_flags() (#17284) 2025-05-28 16:00:28 -07:00
uberblock.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
unique.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
vdev_draid_rand.c SPDX: license tags: LicenseRef-OpenZFS-ThirdParty-PublicDomain 2025-04-16 09:59:45 -07:00
vdev_draid.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
vdev_file.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
vdev_indirect_births.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
vdev_indirect_mapping.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
vdev_indirect.c dmu_tx: rename dmu_tx_assign() flags from TXG_* to DMU_TX_* (#17143) 2025-04-16 09:59:45 -07:00
vdev_initialize.c dmu_tx: rename dmu_tx_assign() flags from TXG_* to DMU_TX_* (#17143) 2025-04-16 09:59:45 -07:00
vdev_label.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
vdev_mirror.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
vdev_missing.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
vdev_queue.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
vdev_raidz_math_aarch64_neon_common.h SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
vdev_raidz_math_aarch64_neon.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
vdev_raidz_math_aarch64_neonx2.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
vdev_raidz_math_avx2.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
vdev_raidz_math_avx512bw.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
vdev_raidz_math_avx512f.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
vdev_raidz_math_impl.h SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
vdev_raidz_math_powerpc_altivec_common.h SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
vdev_raidz_math_powerpc_altivec.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
vdev_raidz_math_scalar.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
vdev_raidz_math_sse2.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
vdev_raidz_math_ssse3.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
vdev_raidz_math.c tunables: don't assert initialisation in impl getters 2025-06-17 10:50:26 -07:00
vdev_raidz.c dmu_tx: rename dmu_tx_assign() flags from TXG_* to DMU_TX_* (#17143) 2025-04-16 09:59:45 -07:00
vdev_rebuild.c dmu_tx: rename dmu_tx_assign() flags from TXG_* to DMU_TX_* (#17143) 2025-04-16 09:59:45 -07:00
vdev_removal.c Fix null dereference in spa_vdev_remove_cancel_sync() 2025-05-28 16:00:28 -07:00
vdev_root.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
vdev_trim.c dmu_tx: rename dmu_tx_assign() flags from TXG_* to DMU_TX_* (#17143) 2025-04-16 09:59:45 -07:00
vdev.c During pool export flush the ARC asynchronously 2025-06-17 10:50:26 -07:00
zap_leaf.c ZAP: Reduce leaf array and free chunks fragmentation 2025-05-28 16:00:28 -07:00
zap_micro.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
zap.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
zcp_get.c zcp: get_prop: fix encryptionroot and encryption 2025-06-17 10:49:40 -07:00
zcp_global.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
zcp_iter.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
zcp_set.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
zcp_synctask.c cred: properly pass and test creds on other threads (#17273) 2025-05-28 16:00:28 -07:00
zcp.c txg: generalise txg_wait_synced_sig() to txg_wait_synced_flags() (#17284) 2025-05-28 16:00:28 -07:00
zfeature.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
zfs_byteswap.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
zfs_chksum.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
zfs_fm.c zed: Ensure spare activation after kernel-initiated device removal 2025-04-16 09:59:45 -07:00
zfs_fuid.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
zfs_impl.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
zfs_ioctl.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
zfs_log.c zfs_log_write: only put the callback on the last itx 2025-06-17 10:50:26 -07:00
zfs_onexit.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
zfs_quota.c dmu_tx: rename dmu_tx_assign() flags from TXG_* to DMU_TX_* (#17143) 2025-04-16 09:59:45 -07:00
zfs_ratelimit.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
zfs_replay.c dmu_tx: rename dmu_tx_assign() flags from TXG_* to DMU_TX_* (#17143) 2025-04-16 09:59:45 -07:00
zfs_rlock.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
zfs_sa.c dmu_tx: rename dmu_tx_assign() flags from TXG_* to DMU_TX_* (#17143) 2025-04-16 09:59:45 -07:00
zfs_vnops.c Relax zfs_vnops_read_chunk_size limitations 2025-06-17 10:50:27 -07:00
zfs_znode.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
zil.c ZIL: Relax parallel write ZIOs processing 2025-08-05 12:14:18 -04:00
zio_checksum.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
zio_compress.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
zio_inject.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
zio.c Only interrupt active disk I/Os in failmode=continue 2025-06-17 10:49:40 -07:00
zle.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
zrlock.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
zthr.c SPDX: license tags: CDDL-1.0 2025-04-16 09:59:44 -07:00
zvol.c Improve block cloning transactions accounting 2025-06-17 10:50:26 -07:00