Commit Graph

16 Commits

Author SHA1 Message Date
Chunwei Chen
f58040c0fc Implement a proper rw_tryupgrade
Current rw_tryupgrade does rw_exit and then rw_tryenter(RW_RWITER), and then
does rw_enter(RW_READER) if it fails. This violate the assumption that
rw_tryupgrade should be atomic and could cause extra contention or even lock
inversion.

This patch we implement a proper rw_tryupgrade. For rwsem-spinlock, we take
the spinlock to check rwsem->count and rwsem->wait_list. For normal rwsem, we
use cmpxchg on rwsem->count to change the value from single reader to single
writer.

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tim Chase <tim@chase2k.com>
Closes zfsonlinux/zfs#4692
Closes #554
2016-05-31 11:44:15 -07:00
Brian Behlendorf
a6ae97caed Add rw_tryupgrade()
This implementation of rw_tryupgrade() behaves slightly differently
from its counterparts on other platforms.  It drops the RW_READER lock
and then acquires the RW_WRITER lock leaving a small window where no
lock is held.  On other platforms the lock is never released during
the upgrade process.  This is necessary under Linux because the kernel
does not provide an upgrade function.

There are currently no callers in the ZFS code where this change in
behavior is a problem.  In fact, in most cases the code is already
written such that if the upgrade fails the RW_READER lock is dropped
and the caller blocks waiting to acquire the lock as RW_WRITER.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tim Chase <tim@chase2k.com>
Signed-off-by: Matthew Thode <prometheanfire@gentoo.org>
Closes zfsonlinux/zfs#4388
Closes #534
2016-03-10 13:05:25 -08:00
Brian Behlendorf
62aa81a577 Add defclsyspri macro
Add a new defclsyspri macro which can be used to request the default
Linux scheduler priority.  Neither the minclsyspri or maxclsyspri map
to the default Linux kernel thread priority.  This makes it awkward to
create taskqs which run with the same priority as the rest of the kernel
threads on the system which can lead to performance issues.

All SPL callers which previously used minclsyspri or maxclsyspri have
been changed to use defclsyspri.  The vast majority of callers were
part of the test suite which won't have an external impact.  The few
places where it could impact performance the change was from maxclsyspri
to defclsyspri.  This makes it more likely the process will be scheduled
which may help performance.

To facilitate further performance analysis the spl_taskq_thread_priority
module option has been added.  When disabled (0) all newly created kernel
threads will use the default kernel thread priority.  When enabled (1)
the specified taskq priority will be used.  By default this value is
enabled (1).

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2015-07-23 13:25:49 -07:00
Ned Bass
52479ecf58 Remove compat includes from sys/types.h
Don't include the compatibility code in linux/*_compat.h in the public
header sys/types.h. This causes problems when an external code base
includes the ZFS headers and has its own conflicting compatibility code.
Lustre, in particular, defined SHRINK_STOP for compatibility with
pre-3.12 kernels in a way that conflicted with the SPL's definition.
Because Lustre ZFS OSD includes ZFS headers it fails to build due to a
'"SHRINK_STOP" redefined' compiler warning.  To avoid such conflicts
only include the compat headers from .c files or private headers.

Also, for consistency, include sys/*.h before linux/*.h then sort by
header name.

Signed-off-by: Ned Bass <bass6@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #411
2014-11-19 10:35:12 -08:00
Tim Chase
17a527cb0f Support post-3.13 kthread_create() semantics.
Provide spl_kthread_create() as a wrapper to the kernel's kthread_create()
to provide pre-3.13 semantics.  Re-try if the call is interrupted or if it
would have returned -ENOMEM.  Otherwise return NULL.

Signed-off-by: Chunwei Chen <tuxoko@gmail.com>
Signed-off-by: Tim Chase <tim@chase2k.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #339
2014-04-08 12:44:42 -07:00
Ned Bass
3d6af2dd6d Refresh links to web site
Update links to refer to the official ZFS on Linux website instead of
@behlendorf's personal fork on github.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2013-03-04 19:09:34 -08:00
Brian Behlendorf
b84412a6e8 Linux compat 3.7, kernel_thread()
The preferred kernel interface for creating threads has been
kthread_create() for a long time now.  However, several of the
SPLAT tests still use the legacy kernel_thread() function which
has finally been dropped (mostly).

Update the condvar and rwlock SPLAT tests to use the modern
interface.  Frankly this is something we should have done a
long time ago.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #194
2012-12-03 09:36:21 -08:00
Brian Behlendorf
df870a697f splat: Cleanup headers
Restructure the the SPLAT headers such that each test only
includes the minimal set of headers it requires.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2012-11-06 14:48:56 -08:00
Brian Behlendorf
1e18307b61 Fix incorrect krw_type_t type
Flagged by the default compile options on archlinux 2010.05, we should
be using the krw_t type not the krw_type_t type in the private data.

  module/splat/splat-rwlock.c: In function ‘splat_rwlock_test4_func’:
  module/splat/splat-rwlock.c:432:6: warning: case value ‘1’ not in
  enumerated type ‘krw_type_t’
2010-11-09 10:18:01 -08:00
Brian Behlendorf
32f5faff69 Simplify rwlock implementation.
Remove RW_COUNT() from the rwlock implementation.  The idea was that it
could be used as a generic wrapper for getting at the internal state
of a rwlock.  While a good idea it's proven problematic to keep it
correct for multiple archs and internal implementation changes.  In
short it hasn't been worth the trouble.

With that and simplicity in mind things have been updated to use the
rwsem_is_locked() function instead of RW_COUNT for the RW_*_HELD()
functions.  As for rw_upgrade() it remains only implemented for
the generic rwsem implemenation.  It remains to be determined if its
worth the effort of adding a custom implementation for each arch.
2010-05-20 14:20:34 -07:00
Brian Behlendorf
716154c592 Public Release Prep
Updated AUTHORS, COPYING, DISCLAIMER, and INSTALL files.  Added
standardized headers to all source file to clearly indicate the
copyright, license, and to give credit where credit is due.
2010-05-17 15:18:00 -07:00
Brian Behlendorf
6ff686c44d Type long expected explicitly cast for 32-bit systems. 2009-12-01 10:14:01 -08:00
Brian Behlendorf
e811949a57 Reimplement rwlocks for Linux lock profiling/analysis.
It turns out that the previous rwlock implementation worked well but
did not integrate properly with the upstream kernel lock profiling/
analysis tools.  This is a major problem since it would be awfully
nice to be able to use the automatic lock checker and profiler.

The problem is that the upstream lock tools use the pre-processor
to create a lock class for each uniquely named locked.  Since the
rwsem was embedded in a wrapper structure the name was always the
same.  The effect was that we only ended up with one lock class for
the entire SPL which caused the lock dependency checker to flag
nearly everything as a possible deadlock.

The solution was to directly map a krwlock to a Linux rwsem using
a typedef there by eliminating the wrapper structure.  This was not
done initially because the rwsem implementation is specific to the arch.
To fully implement the Solaris krwlock API using only the provided rwsem
API is not possible.  It can only be done by directly accessing some of
the internal data member of the rwsem structure.

For example, the Linux API provides a different function for dropping
a reader vs writer lock.  Whereas the Solaris API uses the same function
and the caller does not pass in what type of lock it is.  This means to
properly drop the lock we need to determine if the lock is currently a
reader or writer lock.  Then we need to call the proper Linux API function.
Unfortunately, there is no provided API for this so we must extracted this
information directly from arch specific lock implementation.  This is
all do able, and what I did, but it does complicate things considerably.

The good news is that in addition to the profiling benefits of this
change.  We may see performance improvements due to slightly reduced
overhead when creating rwlocks and manipulating them.

The only function I was forced to sacrafice was rw_owner() because this
information is simply not stored anywhere in the rwsem.  Luckily this
appears not to be a commonly used function on Solaris, and it is my
understanding it is mainly used for debugging anyway.

In addition to the core rwlock changes, extensive updates were made to
the rwlock regression tests.  Each class of test was extended to provide
more API coverage and to be more rigerous in checking for misbehavior.

This is a pretty significant change and with that in mind I have been
careful to validate it on several platforms before committing.  The full
SPLAT regression test suite was run numberous times on all of the following
platforms.  This includes various kernels ranging from 2.6.16 to 2.6.29.

- SLES10   (ppc64)
- SLES11   (x86_64)
- CHAOS4.2 (x86_64)
- RHEL5.3  (x86_64)
- RHEL6    (x86_64)
- FC11     (x86_64)
2009-09-18 16:09:47 -07:00
Brian Behlendorf
e554dffa60 SLES10 Fixes (part 9)
- Proper ioctl() 32/64-bit binary compatibility.  We need to ensure the
  ioctl data itself is always packed the same for 32/64-bit binaries.
  Additionally, the correct thing to do is encode this size in bytes
  as part of the command using _IOC_SIZE().
- Minor formatting changes to respect the 80 character limit.
- Move all SPLAT_SUBSYSTEM_* defines in to splat-ctl.h.
- Increase SPLAT_SUBSYSTEM_UNKNOWN because we were getting close
  to accidentally using it for a real registered subsystem.
2009-05-21 10:56:11 -07:00
Brian Behlendorf
3f4126739d Sleep uninteruptibly, waking up early may result in a crash 2009-01-22 09:58:48 -08:00
Brian Behlendorf
617d5a673c Rename modules to module and update references 2009-01-15 10:44:54 -08:00