Fix race in spl_kmem_cache_reap_now()

The current code contains a race condition that triggers when three
conditions hold: bit 2 (KMC_EXPIRE_MEM) in spl.spl_kmem_cache_expire is
set, spl_kmem_cache_reap_now() is invoked, and another thread is
concurrently accessing its CPU's magazine.

spl_kmem_cache_reap_now() currently invokes spl_cache_flush() on each
magazine in the same thread when bit 2 in spl.spl_kmem_cache_expire is
set. This is unsafe because there is one magazine per CPU and the
magazines are lockless, so it is impossible to guarantee that another
CPU is not using its magazine when this function is called.

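The solution is to only touch the local CPU's magazine and leave other
CPUs' magazines to other CPUs. Interrupts are disabled around the flush
so that the caller stays on the CPU whose magazine it is flushing.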

Reported-by: DHE
Signed-off-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #274
Richard Yao 2013-08-04 19:35:08 -04:00 committed by Brian Behlendorf
parent ba06298072
commit 251e7a779b

@@ -2196,12 +2196,12 @@ spl_kmem_cache_reap_now(spl_kmem_cache_t *skc, int count)
 	/* Reclaim from the magazine then the slabs ignoring age and delay. */
 	if (spl_kmem_cache_expire & KMC_EXPIRE_MEM) {
 		spl_kmem_magazine_t *skm;
-		int i;
+		unsigned long irq_flags;
 
-		for_each_online_cpu(i) {
-			skm = skc->skc_mag[i];
-			spl_cache_flush(skc, skm, skm->skm_avail);
-		}
+		local_irq_save(irq_flags);
+		skm = skc->skc_mag[smp_processor_id()];
+		spl_cache_flush(skc, skm, skm->skm_avail);
+		local_irq_restore(irq_flags);
 	}
 
 	spl_slab_reclaim(skc, count, 1);
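
The ownership rule behind this change can be illustrated with a small
userspace model. This is only a sketch under stated assumptions, not SPL
code: struct magazine, struct cache, magazine_flush(), cache_reap_local(),
NCPUS and MAG_SIZE are all hypothetical names. The caller passes its CPU
index explicitly, standing in for smp_processor_id() called with
interrupts disabled.

```c
#include <assert.h>

#define NCPUS    4	/* hypothetical CPU count for this model */
#define MAG_SIZE 8	/* hypothetical magazine capacity */

/* Simplified stand-in for spl_kmem_magazine_t: a per-CPU object stack. */
struct magazine {
	int avail;		/* number of objects currently cached */
	void *objs[MAG_SIZE];
};

/* Simplified stand-in for spl_kmem_cache_t. */
struct cache {
	struct magazine mag[NCPUS];	/* one lockless magazine per CPU */
	int returned;			/* objects handed back to the slabs */
};

/* Hand up to 'count' objects from one magazine back to the slab layer. */
static void
magazine_flush(struct cache *c, struct magazine *m, int count)
{
	if (count > m->avail)
		count = m->avail;
	m->avail -= count;
	c->returned += count;
}

/*
 * Reap only the magazine owned by 'this_cpu'. The other magazines are
 * never read or written, so their owners need no synchronization with
 * this call. In this model the caller supplies its own CPU index.
 */
static void
cache_reap_local(struct cache *c, int this_cpu)
{
	struct magazine *m;

	assert(this_cpu >= 0 && this_cpu < NCPUS);
	m = &c->mag[this_cpu];
	magazine_flush(c, m, m->avail);
}
```

Calling cache_reap_local() for CPU 1 empties only mag[1]. The magazines
of the other CPUs keep their objects until those CPUs reap or expire
them themselves, which is the trade-off the fix accepts in exchange for
staying lockless.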