Track emergency object in rbtree

In the initial implementation emergency objects were tracked on a
per-cache list.  The assumption was that under normal operation we
would never allocate more than a handful of these objects.  So the
cost of walking the list during free was expected to be negligible.

However real world usage has shown that emergency objects tend to
be allocated in batches.  A deadlock will be detected and several
thousand emergency objects will be allocated before the original
blocked slab allocation can complete.

Therefore the original list has been replaced by a red black tree
which is sorted by the memory address of each allocated object.
This bounds the worst case insertion and removal time to O(log n)
which minimize contention on the assoicated spin lock.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
This commit is contained in:
Brian Behlendorf
2012-10-30 10:45:50 -07:00
parent 165f13c33a
commit ed3163484d
2 changed files with 75 additions and 28 deletions
+3 -2
View File
@@ -31,6 +31,7 @@
#include <linux/spinlock.h>
#include <linux/rwsem.h>
#include <linux/hash.h>
#include <linux/rbtree.h>
#include <linux/ctype.h>
#include <asm/atomic.h>
#include <sys/types.h>
@@ -435,8 +436,8 @@ typedef struct spl_kmem_alloc {
} spl_kmem_alloc_t;
typedef struct spl_kmem_emergency {
struct rb_node ske_node; /* Emergency tree linkage */
void *ske_obj; /* Buffer address */
struct list_head ske_list; /* Emergency list linkage */
} spl_kmem_emergency_t;
typedef struct spl_kmem_cache {
@@ -463,7 +464,7 @@ typedef struct spl_kmem_cache {
struct list_head skc_list; /* List of caches linkage */
struct list_head skc_complete_list;/* Completely alloc'ed */
struct list_head skc_partial_list; /* Partially alloc'ed */
struct list_head skc_emergency_list; /* Min sized objects */
struct rb_root skc_emergency_tree; /* Min sized objects */
spinlock_t skc_lock; /* Cache lock */
wait_queue_head_t skc_waitq; /* Allocation waiters */
uint64_t skc_slab_fail; /* Slab alloc failures */