The first locking issue was due to the semaphore I used. I was trying

to be overly clever and the context switch when the semaphore was busy
was destroying performance.  Converting to a simple spin lock bough me
a factor of 50 or so.  That said it's still not good enough.  Tests
show bad performance and we are still CPU bound.  The logical fix is
I need to implement per-cpu hot caches to minimize the SMP contention.
Linux and Solaris both have this, I was hoping to do without but it
looks like that's not to be.

   kmem_lock: time (sec)        slabs           objs            hash
   kmem_lock:                   tot/max/calc    tot/max/calc    size/depth
   kmem_lock:  0.022000000      7/6/64  224/177/2048    32768/1
   kmem_lock:  0.039000000      13/13/128       416/404/4096    32768/1
   kmem_lock:  0.079000000      23/21/256       736/672/8192    32768/1
   kmem_lock:  0.158000000      48/47/512       1536/1504/16384 32768/1
   kmem_lock:  0.345000000      105/105/1024    3360/3358/32768 32768/2
   kmem_lock:  0.760000000      202/200/2048    6464/6400/65536 32768/3



git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@135 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c
This commit is contained in:
behlendo
2008-06-24 17:18:15 +00:00
parent 44b8f1769f
commit d46630e0f3
3 changed files with 41 additions and 35 deletions
+8 -8
View File
@@ -364,7 +364,7 @@ extern int kmem_set_warning(int flag);
#define SKS_MAGIC 0x22222222
#define SKC_MAGIC 0x2c2c2c2c
#define SPL_KMEM_CACHE_HASH_BITS 12 /* 4k, sized for 1000's of objs */
#define SPL_KMEM_CACHE_HASH_BITS 12
#define SPL_KMEM_CACHE_HASH_ELTS (1 << SPL_KMEM_CACHE_HASH_BITS)
#define SPL_KMEM_CACHE_HASH_SIZE (sizeof(struct hlist_head) * \
SPL_KMEM_CACHE_HASH_ELTS)
@@ -417,16 +417,16 @@ typedef struct spl_kmem_cache {
struct list_head skc_list; /* List of caches linkage */
struct list_head skc_complete_list;/* Completely alloc'ed */
struct list_head skc_partial_list; /* Partially alloc'ed */
struct rw_semaphore skc_sem; /* Cache semaphore */
spinlock_t skc_lock; /* Cache lock */
uint64_t skc_slab_fail; /* Slab alloc failures */
uint64_t skc_slab_create;/* Slab creates */
uint64_t skc_slab_destroy;/* Slab destroys */
uint64_t skc_slab_total; /* Slab total */
uint64_t skc_slab_alloc; /* Slab alloc */
uint64_t skc_slab_max; /* Slab max */
uint64_t skc_obj_total; /* Obj total */
uint64_t skc_obj_alloc; /* Obj alloc */
uint64_t skc_obj_max; /* Obj max */
uint64_t skc_slab_total; /* Slab total current */
uint64_t skc_slab_alloc; /* Slab alloc current */
uint64_t skc_slab_max; /* Slab max historic */
uint64_t skc_obj_total; /* Obj total current */
uint64_t skc_obj_alloc; /* Obj alloc current */
uint64_t skc_obj_max; /* Obj max historic */
uint64_t skc_hash_depth; /* Hash depth */
uint64_t skc_hash_max; /* Hash depth max */
} spl_kmem_cache_t;