8 Commits

Author SHA1 Message Date
Clemens Fruhwirth
8e99d66b05 Add support for rw semaphore under PREEMPT_RT_FULL
The main complication from the RT patch set is that the RW semaphore
locks change such that read locks on an rwsem can be taken only by
a single thread.  All other threads are locked out. This single
thread can take a read lock multiple times though. The underlying
implementation changes to a mutex with an additional read_depth
count.

The implementation can be best understood by inspecting the RT
patch.  rwsem_rt.h and rt.c give the best insight into how RT
rwsem works. My implementation for rwsem_tryupgrade is basically
an inversion of rt_downgrade_write found in rt.c. Please see the
comments in the code.
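
As a rough illustration of the model described above, here is a userspace
toy in C, not the SPL or RT kernel code. The struct, its fields, the
integer thread ids, and the convention that read_depth == 0 stands for
"held for writing" are all assumptions made for this sketch. It shows only
the idea that an upgrade is safe when the owning thread holds exactly one
read reference, and that a downgrade is the reverse transition.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of an RT-style rwsem: one owner plus a nesting counter. */
struct rt_toy_rwsem {
	int owner;	/* hypothetical thread id of the holder, 0 if free */
	int read_depth;	/* nested read locks; 0 is assumed to mean write-held */
};

/* Writer to reader: the holder keeps the lock with a read depth of 1. */
static void
rt_toy_downgrade(struct rt_toy_rwsem *sem, int self)
{
	assert(sem->owner == self && sem->read_depth == 0);
	sem->read_depth = 1;
}

/* Reader to writer: only allowed when the holder has not nested reads. */
static bool
rt_toy_tryupgrade(struct rt_toy_rwsem *sem, int self)
{
	assert(sem->owner == self);
	if (sem->read_depth == 1) {
		sem->read_depth = 0;
		return true;
	}
	return false;	/* nested read locks: leave the state untouched */
}
```

Because only one thread can hold the lock at all in this model, no atomic
operation is needed for the transition; the nesting count is the only state
that has to be checked.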

Unfortunately, I have to drop SPLAT rwlock test4 completely as this
test tries to take multiple locks from different threads, which RT
rwsems do not support.  Otherwise SPLAT, zconfig.sh, zpios-sanity.sh
and zfs-tests.sh pass on my Debian-testing VM with the kernel
linux-image-4.8.0-1-rt-amd64.

Tested-by: kernelOfTruth <kerneloftruth@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Clemens Fruhwirth <clemens@endorphin.org>
Closes zfsonlinux/zfs#5491
Closes #589
Closes #308
2016-12-19 12:45:24 -08:00
Brian Behlendorf
b7c7008ba2 Linux 4.8 compat: rw_semaphore atomic_long_t count
For non-rwsem-spinlocks the "count" member was changed from a
"long" to "atomic_long_t" type.  A configure check has been
added to detect this change along with new versions of the
_rwsem_tryupgrade() function and RWSEM_COUNT() macro.  See
https://github.com/torvalds/linux/commit/8ee62b18 for complete
details.

Signed-off-by: Tim Chase <tim@chase2k.com>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #563
2016-07-29 14:17:53 -07:00
Chunwei Chen
f58040c0fc Implement a proper rw_tryupgrade
Current rw_tryupgrade does rw_exit and then rw_tryenter(RW_WRITER), and then
does rw_enter(RW_READER) if it fails. This violates the assumption that
rw_tryupgrade should be atomic and could cause extra contention or even lock
inversion.

In this patch we implement a proper rw_tryupgrade. For rwsem-spinlock, we take
the spinlock to check rwsem->count and rwsem->wait_list. For normal rwsem, we
use cmpxchg on rwsem->count to change the value from single reader to single
writer.
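
The cmpxchg idea for the normal rwsem case can be sketched in userspace C11
as follows. This is an illustration only, not the code from this patch: the
struct, the function name, and the two DEMO_* count values are hypothetical
and do not match the kernel's real count encoding. The point is that a
single compare-and-swap either turns "exactly one reader" into "exactly one
writer" or changes nothing, so there is no window in which the lock is
dropped.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical count encoding, chosen for illustration only. */
#define DEMO_SINGLE_READER	1L
#define DEMO_SINGLE_WRITER	(-1L)

struct demo_rwsem {
	atomic_long count;
};

/*
 * Atomically swap "single reader" for "single writer".
 * Returns true on success; on failure the count is left unchanged.
 */
static bool
demo_tryupgrade(struct demo_rwsem *sem)
{
	long expected = DEMO_SINGLE_READER;

	return atomic_compare_exchange_strong(&sem->count, &expected,
	    DEMO_SINGLE_WRITER);
}
```

If a second reader or a waiter has changed the count, the comparison fails
and the caller simply keeps its read lock.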

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tim Chase <tim@chase2k.com>
Closes zfsonlinux/zfs#4692
Closes #554
2016-05-31 11:44:15 -07:00
Brian Behlendorf
86c16c59fe Retire rwsem_is_locked() compat
Stock Linux 2.6.32 and earlier kernels contained a broken version of
rwsem_is_locked() which could return an incorrect value.  Because of
this, compatibility code was added to detect the broken implementation
and replace it with our own if needed.

The fix for this issue was merged into the mainline Linux kernel as
of 2.6.33 and the major enterprise distributions based on 2.6.32 have
all backported the fix.  Therefore there is no longer a need to carry
this code and it can be removed.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #454
2015-06-10 16:35:48 -07:00
Ned Bass
3d6af2dd6d Refresh links to web site
Update links to refer to the official ZFS on Linux website instead of
@behlendorf's personal fork on github.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2013-03-04 19:09:34 -08:00
Brian Behlendorf
d503b971f4 Optimize spl_rwsem_is_locked()
The spl_rwsem_is_locked() compatibility function has been observed
to be a hot spot.  The root cause of this is that we must check the
rwsem activity under the rwsem->wait_lock to avoid a race.  When
the lock is busy significant contention can occur.

The upstream kernel fix for this race had the insight that by using
spin_trylock_irqsave() this contention could be avoided.  When the
lock is contended it's reasonable to return that it is locked.
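
A minimal userspace sketch of that trylock pattern, assuming a C11
atomic_flag as a stand-in for rwsem->wait_lock and a plain integer for the
rwsem activity. The struct and function names are hypothetical, and the
interrupt-saving part of spin_trylock_irqsave() has no equivalent here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct toy_rwsem {
	atomic_flag wait_lock;	/* stand-in for rwsem->wait_lock */
	int activity;		/* nonzero while the rwsem is held */
};

static bool
toy_rwsem_is_locked(struct toy_rwsem *sem)
{
	bool locked;

	/* Trylock: if wait_lock is contended, report "locked", never spin. */
	if (atomic_flag_test_and_set_explicit(&sem->wait_lock,
	    memory_order_acquire))
		return true;

	/* We own wait_lock, so activity can be read without racing. */
	locked = (sem->activity != 0);
	atomic_flag_clear_explicit(&sem->wait_lock, memory_order_release);
	return locked;
}
```

The trade-off is that a contended wait_lock produces a conservative answer
of "locked" even if the rwsem itself happens to be free at that instant.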

This change updates the SPL's implementation to be like the upstream
kernel.  Since the kernel code has been in use for years now this
is a low risk change.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2012-07-13 13:07:39 -07:00
Darik Horn
588d900433 Linux 3.2 compat: rw_semaphore.wait_lock is raw
The wait_lock member of the rw_semaphore struct became a raw_spinlock_t
in Linux 3.2 at torvalds/linux@ddb6c9b58a.

Wrap spin_lock_* function calls in a new spl_rwsem_* interface to
ensure type safety if raw_spinlock_t becomes architecture specific,
and to satisfy these compiler warnings:

  warning: passing argument 1 of ‘spinlock_check’
    from incompatible pointer type [enabled by default]
  note: expected ‘struct spinlock_t *’
    but argument is of type ‘struct raw_spinlock_t *’

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes: #76
Closes: zfsonlinux/zfs#463
2012-01-11 16:28:05 -08:00
Ned Bass
46aa7b3939 Correctly handle rwsem_is_locked() behavior
A race condition in rwsem_is_locked() was fixed in Linux 2.6.33 and the fix was
backported to RHEL5 as of kernel 2.6.18-190.el5.  Details can be found here:

https://bugzilla.redhat.com/show_bug.cgi?id=526092

The race condition was fixed in the kernel by acquiring the semaphore's
wait_lock inside rwsem_is_locked().  The SPL worked around the race condition
by acquiring the wait_lock before calling that function, but with the fix in
place it must not do that.

This commit implements an autoconf test to detect whether the fixed version of
rwsem_is_locked() is present.  The previous version of rwsem_is_locked() was an
inline static function while the new version is exported as a symbol which we
can check for in module.symvers.  Depending on the result we correctly
implement the needed compatibility macros for proper spinlock handling.

Finally, we do the right thing with spin locks in RW_*_HELD() by using the
new compatibility macros.  We only acquire the semaphore's wait_lock if
the rwsem_is_locked() being called does not itself try to acquire the lock.

Some new overhead and a small harmless race is introduced by this change.
This is because RW_READ_HELD() and RW_WRITE_HELD() now acquire and release
the wait_lock twice: once for the call to rwsem_is_locked() and once for
the call to rw_owner().  This can't be avoided if calling a rwsem_is_locked()
that takes the wait_lock, as it will in more recent kernels.

The other case which only occurs in legacy kernels could be optimized by
taking the lock only once, as was done prior to this commit.  However, I
decided that the performance gain probably wasn't significant enough to
justify the messy special cases required.

The function spl_rw_get_owner() was only used to enable the aforementioned
optimization.  Since it is no longer used, I removed it.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2010-08-10 16:43:00 -07:00