Add a delay to tearing down threads.

It's been observed that in certain workloads (zvol-related being a big one), ZFS will end up spending a large amount of time spinning up taskqs only to tear them down again almost immediately, then spin them up again...

I noticed this when I looked at what my mostly-idle system was doing and wondered how on earth taskq creation/destruction was taking up so much time...

So I added a configurable delay to avoid tearing them down the first time they are noticed idle. The total number of threads at steady state went up, but the amount of time burned tearing down and spinning up new ones almost vanished.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #14938
2026-05-22 02:27:36 +03:00 · 2023-06-26 16:57:12 -04:00
parent 8e8acabdca
commit 35a6247c5f
3 changed files with 49 additions and 1 deletions
@@ -193,4 +193,19 @@ The proc file will walk the lists with lock held,
 reading it could cause a lock-up if the list grow too large
 without limiting the output.
 "(truncated)" will be shown if the list is larger than the limit.
+.
+.It Sy spl_taskq_thread_timeout_ms Ns = Ns Sy 10000 Pq uint
+(Linux-only)
+How long a taskq has to have had no work before we tear it down.
+Previously, we would tear down a dynamic taskq worker as soon
+as we noticed it had no work, but it was observed that this led
+to a lot of churn in tearing down things we then immediately
+spawned anew.
+In practice, it seems any nonzero value will remove the vast
+majority of this churn, while the nontrivially larger value
+was chosen to help filter out the little remaining churn on
+a mostly idle system.
+Setting this value to
+.Sy 0
+will revert to the previous behavior.
 .El
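
The behavior described for spl_taskq_thread_timeout_ms can be illustrated with a small userspace sketch. This is not the SPL implementation: the struct, the helper names, and the use of a plain millisecond counter are all hypothetical, chosen only to show the decision the man page text describes (an idle worker is kept until it has been idle for the timeout, and a timeout of 0 restores immediate teardown). Whether the real comparison is inclusive at exactly the timeout is an assumption here.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the tunable. 10000 ms is the default
 * documented in the man page hunk above.
 */
static unsigned int thread_timeout_ms = 10000;

/* Hypothetical per-worker bookkeeping, not the real SPL structure. */
struct worker {
	uint64_t idle_since_ms;	/* when the worker last ran out of work */
	bool idle;		/* true while the worker has nothing to do */
};

/* Record that the worker picked up work again. */
static void
worker_mark_busy(struct worker *w)
{
	w->idle = false;
}

/* Record the start of an idle period; repeated calls keep the first stamp. */
static void
worker_mark_idle(struct worker *w, uint64_t now_ms)
{
	if (!w->idle) {
		w->idle = true;
		w->idle_since_ms = now_ms;
	}
}

/*
 * Decide whether an idle worker should be torn down.
 * With a timeout of 0 an idle worker exits immediately (the previous
 * behavior); otherwise it must have been idle for at least the timeout.
 * The inclusive comparison is an assumption of this sketch.
 */
static bool
worker_should_exit(const struct worker *w, uint64_t now_ms)
{
	if (!w->idle)
		return (false);
	if (thread_timeout_ms == 0)
		return (true);
	return (now_ms - w->idle_since_ms >= thread_timeout_ms);
}
```

With the default of 10000, a worker that went idle at t = 1000 ms would be kept at t = 10999 ms and become eligible for teardown at t = 11000 ms; if it picks up work in between, the idle period starts over.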