mirror of https://git.proxmox.com/git/mirror_zfs.git
synced 2026-05-22 18:40:43 +03:00
Introduce minimal ZIL block commit delay
Despite all optimizations, tests on actual hardware show that the FreeBSD kernel can't sleep for less than ~2us. Similar tests on Linux show a delay of at least ~50us from nanosleep() (not tested inside the kernel). This means that on a very fast log device the ZIL may not be able to satisfy the zfs_commit_timeout_pct block commit timeout, increasing log latency more than desired.

Handle that by introducing the zil_min_commit_timeout parameter, which specifies the minimal timeout value below which additional delays to aggregate writes may be skipped. Also skip delays if the LWB is more than 7/8 full, which often happens if I/O sizes are constant and match one of the LWB sizes. Both are applied only if there were no already outstanding log blocks, which may indicate a single-threaded workload that by definition cannot benefit from the commit delays.

While there, add a short-time moving average to zl_last_lwb_latency to make it more stable.

Tests of single-threaded 4KB writes to an NVDIMM SLOG on FreeBSD show an IOPS increase of 9% instead of the expected 5%. For zfs_commit_timeout_pct of 1 the IOPS increase is 5.5% instead of the expected 1%.

Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Prakash Surya <prakash.surya@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #14418
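
The decision described above can be sketched in C roughly as follows. This is an illustration only, not the code from this commit: the struct zil_sketch_t, both sketch_* functions, the 1/4 averaging weight, and the zfs_commit_timeout_pct value of 5 are assumptions made for the example. Only the two tunable names, the 5000 ns default, and the 7/8 threshold come from the text.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for ZIL state; not the real zilog_t. */
typedef struct {
	uint64_t last_lwb_latency_ns;	/* smoothed latency of recent LWB writes */
	unsigned outstanding_lwbs;	/* log blocks issued but not yet completed */
} zil_sketch_t;

/* Tunables named in the commit message; the pct value here is assumed. */
static uint64_t zil_min_commit_timeout = 5000;	/* ns, man page default */
static unsigned zfs_commit_timeout_pct = 5;	/* percent, assumed */

/*
 * Fold a new latency sample into a short moving average.
 * The 3:1 weighting is an assumption chosen for illustration.
 */
static void
sketch_update_latency(zil_sketch_t *z, uint64_t sample_ns)
{
	z->last_lwb_latency_ns = (z->last_lwb_latency_ns * 3 + sample_ns) / 4;
}

/*
 * Return true when the aggregation delay should be skipped and the
 * LWB issued immediately.
 */
static bool
sketch_skip_commit_delay(const zil_sketch_t *z, uint64_t lwb_size,
    uint64_t lwb_used)
{
	/* Outstanding log blocks: keep the normal delay. */
	if (z->outstanding_lwbs != 0)
		return (false);

	unsigned pct = zfs_commit_timeout_pct > 0 ? zfs_commit_timeout_pct : 1;
	uint64_t delay_ns = z->last_lwb_latency_ns * pct / 100;

	/* The kernel cannot reliably sleep for an interval this short. */
	if (delay_ns < zil_min_commit_timeout)
		return (true);

	/* Less than 1/8 of the LWB is free, i.e. it is more than 7/8 full. */
	return (lwb_size - lwb_used < lwb_size / 8);
}
```

With these assumed values, a smoothed latency of 90000 ns gives a computed delay of 4500 ns, which is below the 5000 ns minimum, so the delay is skipped. At 100000 ns the delay is exactly 5000 ns and is kept unless the LWB is nearly full.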
committed by Brian Behlendorf
parent c8d36e2192
commit fd0893cf1f
@@ -2126,6 +2126,13 @@ On very fragmented pools, lowering this
 .Pq typically to Sy 36kB
 can improve performance.
 .
+.It Sy zil_min_commit_timeout Ns = Ns Sy 5000 Pq u64
+This sets the minimum delay, in nanoseconds, that the ZIL will apply to a block commit
+while waiting for more records.
+If ZIL writes are too fast, the kernel may not be able to sleep for so short an interval,
+increasing log latency above that allowed by
+.Sy zfs_commit_timeout_pct .
+.
 .It Sy zil_nocacheflush Ns = Ns Sy 0 Ns | Ns 1 Pq int
 Disable the cache flush commands that are normally sent to disk by
 the ZIL after an LWB write has completed.