OpenZFS 8199 - multi-threaded dmu_object_alloc()

dmu_object_alloc() is single-threaded, so when multiple threads are
creating files in a single filesystem, they spend a lot of time waiting
for the os_obj_lock.  To improve performance of multi-threaded file
creation, we must make dmu_object_alloc() typically not grab any
filesystem-wide locks.

The solution is to have a "next object to allocate" for each CPU. Each
of these "next object"s is in a different block of the dnode object, so
that concurrent allocation holds dnodes in different dbufs.  When a
thread's "next object" reaches the end of a chunk of objects (by default
4 blocks worth -- 128 dnodes), it will be reset to the per-objset
os_obj_next, which will be increased by a chunk of objects (128).  Only
when manipulating the os_obj_next will we need to grab the os_obj_lock.
This decreases lock contention dramatically, because each thread only
needs to grab the os_obj_lock briefly, once per 128 allocations.

This results in a 70% performance improvement to multi-threaded object
creation (where each thread is creating objects in its own directory),
from 67,000/sec to 115,000/sec, with 8 CPUs.

Work sponsored by Intel Corp.

Authored by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Ned Bass <bass6@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>

OpenZFS-issue: https://www.illumos.org/issues/8199
OpenZFS-commit: https://github.com/openzfs/openzfs/pull/374
Closes #4703
Closes #6117
This commit is contained in:
Matthew Ahrens
2016-05-12 21:16:36 -07:00
committed by Brian Behlendorf
parent 1b7c1e5ce9
commit dbeb879699
4 changed files with 135 additions and 71 deletions
+5 -1
View File
@@ -120,7 +120,11 @@ struct objset {
/* Protected by os_obj_lock */
kmutex_t os_obj_lock;
uint64_t os_obj_next;
uint64_t os_obj_next_chunk;
/* Per-CPU next object to allocate, protected by atomic ops. */
uint64_t *os_obj_next_percpu;
int os_obj_next_percpu_len;
/* Protected by os_lock */
kmutex_t os_lock;