Fix race in spl_kmem_cache_reap_now()

The current code contains a race condition that triggers when bit 2 (KMC_EXPIRE_MEM) in spl.spl_kmem_cache_expire is set, spl_kmem_cache_reap_now() is invoked, and another thread is concurrently accessing its magazine.

When that bit is set, spl_kmem_cache_reap_now() currently invokes spl_cache_flush() on every CPU's magazine from the calling thread. This is unsafe because there is one magazine per CPU and the magazines are lockless, so it is impossible to guarantee that another CPU is not using its magazine when this function is called.

The solution is to touch only the local CPU's magazine and leave the other CPUs' magazines to those CPUs.

Reported-by: DHE
Signed-off-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #274
2025-12-01 09:32:08 +03:00 · 2013-08-04 19:35:08 -04:00
commit 251e7a779b
parent ba06298072
1 changed file with 5 additions and 5 deletions
--- a/module/spl/spl-kmem.c
+++ b/module/spl/spl-kmem.c
@@ -2196,12 +2196,12 @@ spl_kmem_cache_reap_now(spl_kmem_cache_t *skc, int count)
 	/* Reclaim from the magazine then the slabs ignoring age and delay. */
 	if (spl_kmem_cache_expire & KMC_EXPIRE_MEM) {
 		spl_kmem_magazine_t *skm;
-		int i;
+		unsigned long irq_flags;
-		for_each_online_cpu(i) {
+		local_irq_save(irq_flags);
-			skm = skc->skc_mag[i];
+		skm = skc->skc_mag[smp_processor_id()];
-			spl_cache_flush(skc, skm, skm->skm_avail);
+		spl_cache_flush(skc, skm, skm->skm_avail);
-		}
+		local_irq_restore(irq_flags);
 	}
 	spl_slab_reclaim(skc, count, 1);
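
The ownership rule behind this change can be illustrated outside the kernel. The following is a minimal userspace sketch, not the SPL implementation: the names demo_cache, demo_magazine, demo_cache_flush() and demo_reap_local() are hypothetical, the per-CPU magazines are modeled as a plain array, and the caller's CPU is passed in explicitly rather than obtained from smp_processor_id() with interrupts disabled as in the patch.

```c
/*
 * Userspace sketch only. All demo_* names are hypothetical and exist
 * purely to illustrate "flush only the magazine you own".
 */
#include <assert.h>
#include <stddef.h>

#define DEMO_NCPUS 4

struct demo_magazine {
	int avail;		/* objects currently cached by this CPU */
};

struct demo_cache {
	struct demo_magazine mag[DEMO_NCPUS];	/* one lockless magazine per CPU */
	int slab_free;		/* objects handed back to the shared slab layer */
};

/* Move n objects from one magazine back to the slab layer. */
static void
demo_cache_flush(struct demo_cache *c, struct demo_magazine *m, int n)
{
	assert(n >= 0 && n <= m->avail);
	m->avail -= n;
	c->slab_free += n;
}

/*
 * Reap only the magazine owned by this_cpu. The magazines of all other
 * CPUs are deliberately left untouched, because their owners may be
 * using them without any lock.
 */
static void
demo_reap_local(struct demo_cache *c, int this_cpu)
{
	struct demo_magazine *m;

	assert(this_cpu >= 0 && this_cpu < DEMO_NCPUS);
	m = &c->mag[this_cpu];
	demo_cache_flush(c, m, m->avail);
}
```

Calling demo_reap_local(&c, 1) on a cache whose magazines hold 3, 5, 7 and 2 objects empties only magazine 1 and adds its 5 objects to slab_free. In the kernel code the same effect presumably relies on local_irq_save() keeping the thread on its CPU and keeping interrupt handlers away from the magazine while it is flushed.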