Commit Graph

66 Commits

Author SHA1 Message Date
Brian Behlendorf
914b063133 Linux compat 2.6.37, invalidate_inodes()
In the 2.6.37 kernel the function invalidate_inodes() is no longer
exported for use by modules.  This memory management functionality
is needed to invalidate the inodes attached to a super block without
unmounting the filesystem.

Because this function still exists in the kernel and the prototype
is available is a common header all we strictly need is the symbol
address.  The address is obtained using spl_kallsyms_lookup_name()
and assigned to the variable invalidate_inodes_fn.  Then a #define
is used to replace all instances of invalidate_inodes() with a
call to the acquired address.  All the complexity is hidden behind
HAVE_INVALIDATE_INODES and invalidate_inodes() can be used as usual.

Long term we should try to get this, or another, interface made
available to modules again.
2011-02-23 12:44:32 -08:00
Brian Behlendorf
22ccfaa8b5 Prefer /lib/modules/$(uname -r)/ links
Preferentially use the /lib/modules/$(uname -r)/source and
/lib/modules/$(uname -r)/build links.  Only if neither of these
links exist fallback to alternate methods for deducing which
kernel to build with.  This resolves the need to manually
specify --with-linux= and --with-linux-obj= on Debian systems.
2011-02-10 14:47:08 -08:00
Brian Behlendorf
8beea9ac24 Refresh autogen.sh products
Refresh the autogen.sh products based on the versions which are
installed by default in the GA RHEL6.0 release.

autoconf (GNU Autoconf) 2.63
automake (GNU automake) 1.11.1
ltmain.sh (GNU libtool) 2.2.6b
2010-11-30 10:36:58 -08:00
Brian Behlendorf
9b2048c26b Linux 2.6.36 compat, fs_struct->lock type change
In the linux-2.6.36 kernel the fs_struct lock was changed from a
rwlock_t to a spinlock_t.  If the kernel would export the set_fs_pwd()
symbol by default this would not have caused us any issues, but they
don't.  So we're forced to add a new autoconf check which sets the
HAVE_FS_STRUCT_SPINLOCK define when a spinlock_t is used.  We can
then correctly use either spin_lock or write_lock in our custom
set_fs_pwd() implementation.
2010-11-09 13:29:47 -08:00
Brian Behlendorf
23aa63cbf5 Fix 2.6.35 shrinker callback API change
As of linux-2.6.35 the shrinker callback API now takes an additional
argument.  The shrinker struct is passed to the callback so that users
can embed the shrinker structure in private data and use container_of()
to access it.  This removes the need to always use global state for the
shrinker.

To handle this we add the SPL_AC_3ARGS_SHRINKER_CALLBACK autoconf
check to properly detect the API.  Then we simply setup a callback
function with the correct number of arguments.  For now we do not make
use of the new 3rd argument.
2010-10-22 14:51:26 -07:00
Brian Behlendorf
a7958f7eef Support custom build directories
One of the neat tricks an autoconf style project is capable of
is allow configurion/building in a directory other than the
source directory.  The major advantage to this is that you can
build the project various different ways while making changes
in a single source tree.

For example, this project is designed to work on various different
Linux distributions each of which work slightly differently.  This
means that changes need to verified on each of those supported
distributions perferably before the change is committed to the
public git repo.

Using nfs and custom build directories makes this much easier.
I now have a single source tree in nfs mounted on several different
systems each running a supported distribution.  When I make a
change to the source base I suspect may break things I can
concurrently build from the same source on all the systems each
in their own subdirectory.

wget -c http://github.com/downloads/behlendorf/spl/spl-x.y.z.tar.gz
tar -xzf spl-x.y.z.tar.gz
cd spl-x-y-z

------------------------- run concurrently ----------------------
<ubuntu system>  <fedora system>  <debian system>  <rhel6 system>
mkdir ubuntu     mkdir fedora     mkdir debian     mkdir rhel6
cd ubuntu        cd fedora        cd debian        cd rhel6
../configure     ../configure     ../configure     ../configure
make             make             make             make
make check       make check       make check       make check

This is something the project has almost supported for a long time
but finishing this support should save me lots of time.
2010-09-05 21:49:05 -07:00
Brian Behlendorf
73fc084e92 Move vendor check to spl-build.m4
This check was previously done with a hack in config.guess.
However, since a new config.guess is copied in to place when
forcing a full autoreconf this change was easily lost and
never a good idea.  This commit also updates all of the
autoconf style support scripts in config.
2010-09-02 16:12:02 -07:00
Ned Bass
46aa7b3939 Correctly handle rwsem_is_locked() behavior
A race condition in rwsem_is_locked() was fixed in Linux 2.6.33 and the fix was
backported to RHEL5 as of kernel 2.6.18-190.el5.  Details can be found here:

https://bugzilla.redhat.com/show_bug.cgi?id=526092

The race condition was fixed in the kernel by acquiring the semaphore's
wait_lock inside rwsem_is_locked().  The SPL worked around the race condition
by acquiring the wait_lock before calling that function, but with the fix in
place it must not do that.

This commit implements an autoconf test to detect whether the fixed version of
rwsem_is_locked() is present.  The previous version of rwsem_is_locked() was an
inline static function while the new version is exported as a symbol which we
can check for in module.symvers.  Depending on the result we correctly
implement the needed compatibility macros for proper spinlock handling.

Finally, we do the right thing with spin locks in RW_*_HELD() by using the
new compatibility macros.  We only only acquire the semaphore's wait_lock if
it is calling a rwsem_is_locked() that does not itself try to acquire the lock.

Some new overhead and a small harmless race is introduced by this change.
This is because RW_READ_HELD() and RW_WRITE_HELD() now acquire and release
the wait_lock twice: once for the call to rwsem_is_locked() and once for
the call to rw_owner().  This can't be avoided if calling a rwsem_is_locked()
that takes the wait_lock, as it will in more recent kernels.

The other case which only occurs in legacy kernels could be optimized by
taking the lock only once, as was done prior to this commit.  However, I
decided that the performance gain probably wasn't significant enough to
justify the messy special cases required.

The function spl_rw_get_owner() was only used to enable the afore-mentioned
optimization.  Since it is no longer used, I removed it.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2010-08-10 16:43:00 -07:00
Ned Bass
5ec44a37c3 Correctly detect atomic64_cmpxchg support
The RHEL5 2.6.18-194.7.1.el5 kernel added atomic64_cmpxchg to
asm-x86_64/atomic.h.  That macro is defined in terms of cmpxchg which
is provided by asm/system.h. However, asm/system.h is not #included by
atomic.h in this kernel nor by the autoconf test for atomic64_cmpxchg, so
the test failed with "implicit declaration of function 'cmpxchg'". This
leads the build system to erroneously conclude that the kernel does not
define atomic64_cmpxchg and enable the built-in definition.  This in
turn produces a '"atomic64_cmpxchg" redefined' build warning which is fatal
when building with --enable-debug.  This commit fixes this by including
asm/system.h in the autoconf test.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2010-08-08 13:48:03 -07:00
Brian Behlendorf
287b2fb117 Add Debian and Slackware style packaging via alien
The long term fix for Debian and Slackware style packaging is
to add native support for building these packages.  Unfortunately,
that is a large chunk of work I don't have time for right now.
That said it would be nice to have at least basic packages for
these distributions.

As a quick short/medium term solution I've settled on using alien
to convert the RPM packages to DEB or TGZ style packages.  The
build system has been updated with the following build targets
which will first build RPM packages and then convert them as
needed to the target package type:

  make rpm: Create .rpm packages
  make deb: Create .deb packages
  make tgz: Create .tgz packages
  make pkg: Create the right package type for your distribution

The solution comes with lot of caveats and your mileage may vary.
But basically the big limitations are that the resulting packages:

  1) Will not have the correct dependency information.
  2) Will not not include the kernel version in the release.
  3) Will not handle all differences between distributions.

But the resulting packages should be easy to install and remove
from your system and take care of running 'depmod -a' and such.
As I said at the top this is not the right long term solution.
If any of the upstream distribution maintainers want to jump in
and help do this right for their distribution I'd love the help.
2010-07-27 15:52:34 -07:00
Brian Behlendorf
f0ff89fc86 Linux 2.6.35 compat: filp_fsync() dropped 'stuct dentry *'
The prototype for filp_fsync() drop the unused argument 'stuct dentry *'.
I've fixed this by adding the needed autoconf check and moving all of
those filp related functions to file_compat.h.  This will simplify
handling any further API changes in the future.
2010-07-14 11:40:55 -07:00
Brian Behlendorf
a4bfd8ea1b Add __divdi3(), remove __udivdi3() kernel dependency
Up until now no SPL consumer attempted to perform signed 64-bit
division so there was no need to support this.  That has now
changed so I adding 64-bit division support for 32-bit platforms.
The signed implementation is based on the unsigned version.

Since the have been several bug reports in the past concerning
correct 64-bit division on 32-bit platforms I added some long
over due regression tests.  Much to my surprise the unsigned
64-bit division regression tests failed.

This was surprising because __udivdi3() was implemented by simply
calling div64_u64() which is provided by the kernel.  This meant
that the linux kernels 64-bit division algorithm on 32-bit platforms
was flawed.  After some investigation this turned out to be exactly
the case.

Because of this I was forced to abandon the kernel helper and
instead to fully implement 64-bit division in the spl.  There are
several published implementation out there on how to do this
properly and I settled on one proposed in the book Hacker's Delight.
Their proposed algoritm is freely available without restriction
and I have just modified it to be linux kernel friendly.

The update implementation now passed all the unsigned and signed
regression tests.  This should be functional, but not fast, which is
good enough for out purposes.  If you want fast too I'd strongly
suggest you upgrade to a 64-bit platform.  I have also reported the
kernel bug and we'll see if we can't get it fixed up stream.
2010-07-13 16:44:02 -07:00
Lars Johannsen
dbe561d8ab Allow config/build to work with autoconf-2.65
As of autoconf-2.65 the AC_LANG_SOURCE source macro no longer
includes the confdef.h results when expanded.  To handle this
simply explicitly include confdef.h in conftest.c.  This will
cause two copies to of confdef.h to be added to the test for
earlier autoconf versions but this is not harmful.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2010-07-02 14:00:28 -07:00
Brian Behlendorf
1814251453 Require gawk the usermode helper fails with awk
For some reason when awk invoked by the usermode helper the command
always fails.  Interestingly gawk does not suffer from this problem
which is why I never observed this failure since the distro I tested
with all had gawk installed instead of awk.  Anyway, the simplest
thing to do here is to just make gawk mandatory.  I've added a
configure check for gawk specifically and have updated the command
to call gawk not awk.
2010-07-01 16:38:08 -07:00
Brian Behlendorf
7119bf7044 Add configure check for user_path_dir()
I didn't notice at the time but user_path_dir() was not introduced
at the same time as set_fs_pwd() change.  I had lumped the two
together but in fact user_path_dir() was introduced in 2.6.27 and
set_fs_pwd() taking 2 args was introduced in 2.6.25.  This means
builds against 2.6.25-2.6.26 kernels were broken.

To fix this I've added a check for user_path_dir() and no longer
assume that if set_fs_pwd() takes 2 args then user_path_dir() is
also available.
2010-07-01 13:53:26 -07:00
Brian Behlendorf
e2d28a3743 Use $target_cpu instead of arch
We should not be using arch for a few reasons.  First off it might
not be installed on their system, and secondly they may be trying
to cross-compile.
2010-07-01 13:52:46 -07:00
Brian Behlendorf
8fd4e3af2e Check sourcelink is set before passing to readlink
When no source was found in any of the expected paths treat
this as fatal and provide the user with a hint as to what
they should do.
2010-07-01 13:52:04 -07:00
Brian Behlendorf
c2688979a4 Remove AC_DEFINE for DEBUG/NDEBUG
Whoops, I momentarilly forgot I had explicitly set these as CC
options so dependent packages which need to include spl_config.h
would not end up having these defined which can result in
accidentally hanging debug enabled at best, or a build failure
at worst.
2010-07-01 09:40:29 -07:00
Brian Behlendorf
c950d1480d Only make compiler warnings fatal with --enable-debug
While in theory I like the idea of compiler warnings always being
fatal.  In practice this causes problems when small harmless errors
cause build failures for end users.  To handle this I've updated
the build system such that -Werror is only used when --enable-debug
is passed to configure.  This is how I always build when developing
so I'll catch all build warnings and end users will not get stuck
by minor issues.
2010-06-30 17:05:36 -07:00
Brian Behlendorf
79a3bf130b Linux-2.6.33 compat, .ctl_name removed from struct ctl_table
As of linux-2.6.33 the ctl_name member of the ctl_table struct
has been entirely removed.  The upstream code has been updated
to depend entirely on the the procname member.  To handle this
all references to ctl_name are wrapped in a CTL_NAME macro which
simply expands to nothing for newer kernels.  Older kernels are
supported by having it expand to .ctl_name = X just as before.
2010-06-30 12:49:12 -07:00
Brian Behlendorf
fd921c2e0c Linux-2.6.33 compat, check <generated/utsrelease.h> for UTS_RELEASE
It seems the upstream community moved the definition of UTS_RELEASE
yet again as of linux-2.6.33.  Update the build system to check in
all three possible locations where your kernel version may be defined.

	$kernelbuild/include/linux/version.h
	$kernelbuild/include/linux/utsrelease.h
	$kernelbuild/include/generated/utsrelease.h
2010-06-30 12:48:18 -07:00
Brian Behlendorf
b868e22f05 Add kmem_asprintf(), strfree(), strdup(), and minor cleanup.
This patch adds three missing Solaris functions: kmem_asprintf(), strfree(),
and strdup().  They are all implemented as a thin layer which just calls
their Linux counterparts.  As part of this an autoconf check for kvasprintf
was added because it does not appear in older kernels.  If the kernel does
not provide it then spl-generic implements it.

Additionally the dead DEBUG_KMEM_UNIMPLEMENTED code was removed to clean
things up and make the kmem.h a little more readable.
2010-06-11 15:57:25 -07:00
Brian Behlendorf
8934764e60 Add support for 'make -s' silent builds
The cleanest way to do this is to set AM_LIBTOOLFLAGS = --silent.  However,
AM_LIBTOOLFLAGS is not honored by automake-1.9.6-2.1 which is what I have
been using.  To cleanly handle this I am updating to automake-1.11-3 which
is why it looks like there is a lot of churn in the Makefiles.
2010-03-26 15:41:17 -07:00
Brian Behlendorf
16b719f006 Allow spl_config.h to be included by dependant packages (updated)
We need dependent packages to be able to include spl_config.h to
build properly.  This was partially solved in commit 0cbaeb1 by using
AH_BOTTOM to #undef common #defines (PACKAGE, VERSION, etc) which
autoconf always adds and cannot be easily removed.  This solution
works as long as the spl_config.h is included before your projects
config.h.  That turns out to be easier said than done.  In particular,
this is a problem when your package includes its config.h using the
-include gcc option which ensures the first thing included is your
config.h.

To handle all cases cleanly I have removed the AH_BOTTOM hack and
replaced it with an AC_CONFIG_HEADERS command.  This command runs
immediately after spl_config.h is written and with a little awk-foo
it strips the offending #defines from the file.  This eliminates
the problem entirely and makes header safe for inclusion.

Also in this change I have removed the few places in the code where
spl_config.h is included.  It is now added to the gcc compile line
to ensure the config results are always available.

Finally, I have also disabled the verbose kernel builds.  If you
want them back you can always build with 'make V=1'.  Since things
are working now they don't need to be on by default.
2010-03-22 14:45:33 -07:00
Brian J. Murrell
534c4e38cb When no kernel source has been pointed to, first attempt to use
/lib/modules/$(uname -r)/source.  This will likely fail when building
under a mock (http://fedoraproject.org/wiki/Projects/Mock) chroot
environment since `uname -r` will report the running kernel which
likely is not the kernel in your chroot.  To cleanly handle this
we fallback to using the first kernel in your chroot.

The kernel-devel package which contains all the kernel headers and
a few build products such as Module.symver{s} is all the is required.
Full source is not needed.
2010-03-08 14:19:30 -08:00
Brian Behlendorf
3977f8370f Linux 2.6.32 compat, proc_handler() API change
As of linux-2.6.32 the 'struct file *filp' argument was dropped from
the proc_handle() prototype.  It was apparently unused _almost_
everywhere in the kernel and this was simply cleanup.

I've added a new SPL_AC_5ARGS_PROC_HANDLER autoconf check for this and
the proper compat macros to correctly define the prototypes and some
helper functions.  It's not pretty but API compat changes rarely are.
2010-03-04 12:14:56 -08:00
Brian Behlendorf
d04c8a563c Atomic64 compatibility for 32-bit systems without kernel support.
This patch is another step towards updating the code to handle the
32-bit kernels which I have not been regularly testing.  This changes
do not really impact the common case I'm expected which is the latest
kernel running on an x86_64 arch.

Until the linux-2.6.31 kernel the x86 arch did not have support for
64-bit atomic operations.  Additionally, the new atomic_compat.h support
for this case was wrong because it embedded a spinlock in the atomic
variable which must always and only be 64-bits total.  To handle these
32-bit issues we now simply fall back to the --enable-atomic-spinlock
implementation if the kernel does not provide the 64-bit atomic funcs.

The second issue this patch addresses is the DEBUG_KMEM assumption that
there will always be atomic64 funcs available.  On 32-bit archs this may
not be true, and actually that's just fine.  In that case the kernel will
will never be able to allocate more the 32-bits worth anyway.  So just
check if atomic64 funcs are available, if they are not it means this
is a 32-bit machine and we can safely use atomic_t's instead.
2009-12-04 15:54:12 -08:00
Brian Behlendorf
c1541dfef1 Add 'srpm' --with-config option for creation of spec files. 2009-11-24 14:21:45 -08:00
Brian Behlendorf
baf2979ed3 Linux 2.6.31 Compatibility Updates
SPL_AC_2ARGS_SET_FS_PWD macro updated to explicitly include
linux/fs_struct.h which was dropped from linux/sched.h.

min_wmark_pages, low_wmark_pages, high_wmark_pages macros
introduced in newer kernels.  For older kernels mm_compat.h
was introduced to define them as needed as direct mappings
to per zone min_pages, low_pages, max_pages.
2009-11-10 14:06:57 -08:00
Brian Behlendorf
055ffd98cf Autoconf --enable-debug-* cleanup
Cleanup the --enable-debug-* configure options, this has been pending
for quite some time and I am glad I finally got to it.  To summerize:

1) All SPL_AC_DEBUG_* macros were updated to be a more autoconf
friendly.  This mainly involved shift to the GNU approved usage of
AC_ARG_ENABLE and ensuring AS_IF is used rather than directly using
an if [ test ] construct.

2) --enable-debug-kmem=yes by default.  This simply enabled keeping
a running tally of total memory allocated and freed and reporting a
memory leak if there was one at module unload.  Additionally, it
ensure /proc/spl/kmem/slab will exist by default which is handy.
The overhead is low for this and it should not impact performance.

3) --enable-debug-kmem-tracking=no by default.  This option was added
to provide a configure option to enable to detailed memory allocation
tracking.  This support was always there but you had to know where to
turn it on.  By default this support is disabled because it is known
to badly hurt performence, however it is invaluable when chasing a
memory leak.

4) --enable-debug-kstat removed.  After further reflection I can't see
why you would ever really want to turn this support off.  It is now
always on which had the nice side effect of simplifying the proc handling
code in spl-proc.c.  We can now always assume the top level directory
will be there.

5) --enable-debug-callb removed.  This never really did anything, it was
put in provisionally because it might have been needed.  It turns out
it was not so I am just removing it to prevent confusion.
2009-10-30 13:58:51 -07:00
Brian Behlendorf
302b88e6ab Add autoconf checks for atomic64_cmpxchg + atomic64_xchg
These functions didn't exist for all archs prior to 2.6.24.  This
patch addes an autoconf test to detect this and add them when needed.
The autoconf check is needed instead of just an #ifndef because in
the most modern kernels atomic64_{cmp}xchg are implemented as in
inline function and not a #define.
2009-10-30 13:53:17 -07:00
Brian Behlendorf
5e9b5d832b Use Linux atomic primitives by default.
Previously Solaris style atomic primitives were implemented simply by
wrapping the desired operation in a global spinlock.  This was easy to
implement at the time when I wasn't 100% sure I could safely layer the
Solaris atomic primatives on the Linux counterparts.  It however was
likely not good for performance.

After more investigation however it does appear the Solaris primitives
can be layered on Linux's fairly safely.  The Linux atomic_t type really
just wraps a long so we can simply cast the Solaris unsigned value to
either a atomic_t or atomic64_t.  The only lingering problem for both
implementations is that Solaris provides no atomic read function.  This
means reading a 64-bit value on a 32-bit arch can (and will) result in
word breaking.  I was very concerned about this initially, but upon
further reflection it is a limitation of the Solaris API.  So really
we are just being bug-for-bug compatible here.

With this change the default implementation is layered on top of Linux
atomic types.  However, because we're assuming a lot about the internal
implementation of those types I've made it easy to fall-back to the
generic approach.  Simply build with --enable-atomic_spinlocks if
issues are encountered with the new implementation.
2009-10-30 10:55:25 -07:00
Brian Behlendorf
51a727e90f Set cwd to '/' for the process executing insmod.
Ricardo has pointed out that under Solaris the cwd is set to '/'
during module load, while under Linux it is set to the callers cwd.
To handle this cleanly I've reworked the module *_init()/_exit()
macros so they call a *_setup()/_cleanup() function when any SPL
dependent module is loaded or unloaded.  This gives us a chance to
perform any needed modification of the process, in this case changing
the cwd.  It also handily provides a way to avoid creating wrapper
init()/exit() functions because the Solaris and Linux prototypes
differ slightly.  All dependent modules should now call the spl
helper macros spl_module_{init,exit}() instead of the native linux
versions.

Unfortunately, it appears that under Linux there has been no consistent
API in the kernel to set the cwd in a module.  Because of this I have
had to add more autoconf magic than I'd like.  However, what I have
done is correct and has been tested on RHEL5, SLES11, FC11, and CHAOS
kernels.

In addition, I have change the rootdir type from a 'void *' to the
correct 'vnode_t *' type.  And I've set rootdir to a non-NULL value.
2009-10-01 16:06:15 -07:00
Brian Behlendorf
4d54fdee1d Reimplement mutexs for Linux lock profiling/analysis
For a generic explanation of why mutexs needed to be reimplemented
to work with the kernel lock profiling see commits:
  e811949a57 and
  d28db80fd0

The specific changes made to the mutex implemetation are as follows.
The Linux mutex structure is now directly embedded in the kmutex_t.
This allows a kmutex_t to be directly case to a mutex struct and
passed directly to the Linux primative.

Just like with the rwlocks it is critical that these functions be
implemented as '#defines to ensure the location information is
preserved.  The preprocessor can then do a direct replacement of
the Solaris primative with the linux primative.

Just as with the rwlocks we need to track the lock owner.  Here
things get a little more interesting because depending on your
kernel version, and how you've built your kernel Linux may already
do this for you.  If your running a 2.6.29 or newer kernel on a
SMP system the lock owner will be tracked.  This was added to Linux
to support adaptive mutexs, more on that shortly.  Alternately, your
kernel might track the lock owner if you've set CONFIG_DEBUG_MUTEXES
in the kernel build.  If neither of the above things is true for
your kernel the kmutex_t type will include and track the lock owner
to ensure correct behavior.  This is all handled by a new autoconf
check called SPL_AC_MUTEX_OWNER.

Concerning adaptive mutexs these are a very recent development and
they did not make it in to either the latest FC11 of SLES11 kernels.
Ideally, I'd love to see this kernel change appear in one of these
distros because it does help performance.  From Linux kernel commit:
  0d66bf6d3514b35eb6897629059443132992dbd7
  "Testing with Ingo's test-mutex application...
  gave a 345% boost for VFS scalability on my testbox"
However, if you don't want to backport this change yourself you
can still simply export the task_curr() symbol.  The kmutex_t
implementation will use this symbol when it's available to
provide it's own adaptive mutexs.

Finally, DEBUG_MUTEX support was removed including the proc handlers.
This was done because now that we are cleanly integrated with the
kernel profiling all this information and much much more is available
in debug kernel builds.  This code was now redundant.

Update mutexs validated on:
    - SLES10   (ppc64)
    - SLES11   (x86_64)
    - CHAOS4.2 (x86_64)
    - RHEL5.3  (x86_64)
    - RHEL6    (x86_64)
    - FC11     (x86_64)
2009-09-25 14:47:01 -07:00
Brian Behlendorf
e811949a57 Reimplement rwlocks for Linux lock profiling/analysis.
It turns out that the previous rwlock implementation worked well but
did not integrate properly with the upstream kernel lock profiling/
analysis tools.  This is a major problem since it would be awfully
nice to be able to use the automatic lock checker and profiler.

The problem is that the upstream lock tools use the pre-processor
to create a lock class for each uniquely named locked.  Since the
rwsem was embedded in a wrapper structure the name was always the
same.  The effect was that we only ended up with one lock class for
the entire SPL which caused the lock dependency checker to flag
nearly everything as a possible deadlock.

The solution was to directly map a krwlock to a Linux rwsem using
a typedef there by eliminating the wrapper structure.  This was not
done initially because the rwsem implementation is specific to the arch.
To fully implement the Solaris krwlock API using only the provided rwsem
API is not possible.  It can only be done by directly accessing some of
the internal data member of the rwsem structure.

For example, the Linux API provides a different function for dropping
a reader vs writer lock.  Whereas the Solaris API uses the same function
and the caller does not pass in what type of lock it is.  This means to
properly drop the lock we need to determine if the lock is currently a
reader or writer lock.  Then we need to call the proper Linux API function.
Unfortunately, there is no provided API for this so we must extracted this
information directly from arch specific lock implementation.  This is
all do able, and what I did, but it does complicate things considerably.

The good news is that in addition to the profiling benefits of this
change.  We may see performance improvements due to slightly reduced
overhead when creating rwlocks and manipulating them.

The only function I was forced to sacrafice was rw_owner() because this
information is simply not stored anywhere in the rwsem.  Luckily this
appears not to be a commonly used function on Solaris, and it is my
understanding it is mainly used for debugging anyway.

In addition to the core rwlock changes, extensive updates were made to
the rwlock regression tests.  Each class of test was extended to provide
more API coverage and to be more rigerous in checking for misbehavior.

This is a pretty significant change and with that in mind I have been
careful to validate it on several platforms before committing.  The full
SPLAT regression test suite was run numberous times on all of the following
platforms.  This includes various kernels ranging from 2.6.16 to 2.6.29.

- SLES10   (ppc64)
- SLES11   (x86_64)
- CHAOS4.2 (x86_64)
- RHEL5.3  (x86_64)
- RHEL6    (x86_64)
- FC11     (x86_64)
2009-09-18 16:09:47 -07:00
Brian Behlendorf
6ae7fef5b9 Update global_page_state() support for 2.6.29 kernels.
Basically everything we need to monitor the global memory state of
the system is now cleanly available via global_page_state().  The
problem is that this interface is still fairly recent, and there
has been one change in the page state enum which we need to handle.
These changes basically boil down to the following:
- If global_page_state() is available we should use it.  Several
  autoconf checks have been added to detect the correct enum names.
- If global_page_state() is not available check to see if
  get_zone_counts() symbol is available and use that.
- If the get_zone_counts() symbol is not exported we have no choice
  be to dynamically aquire it at load time.  This is an absolute
  last resort for old kernel which we don't want to patch to
  cleanly export the symbol.
2009-07-28 15:06:42 -07:00
Brian Behlendorf
ec7d53e99a Add basic credential support and splat tests.
The previous credential implementation simply provided the needed types and
a couple of dummy functions needed.  This update correctly ties the basic
Solaris credential API in to one of two Linux kernel APIs.

Prior to 2.6.29 the linux kernel embeded all credentials in the task
structure.  For these kernels, we pass around the entire task struct as if
it were the credential, then we use the helper functions to extract the
credential related bits.

As of 2.6.29 a new credential type was added which we can and do fairly
cleanly layer on top of.  Once again the helper functions nicely hide
the implementation details from all callers.

Three tests were added to the splat test framework to verify basic
correctness.  They should be extended as needed when need credential
functions are added.
2009-07-27 17:18:59 -07:00
Brian Behlendorf
3d0cb2d31d Remove LINUXINCLUDE from autoconf wrapper, breaks 2.6.28+ kernels.
Modern kernel build systems at least post 2.6.16 will set this properly
so we should not.  In fact post 2.6.28 the include headers have moved
under arch so the guess we make here is completely wrong.  Letting
the kernel build system set this ensure it will be correct.
2009-07-27 09:52:01 -07:00
Brian Behlendorf
749e5eb1ed Check arch/default/ path when detecting kernel objects on SLES
We still preferentially use arch/arch looking for a native version
but if that fails it is acceptable to use default.
2009-07-22 06:59:28 -07:00
Brian Behlendorf
bb339d0670 Cleanly handle --with-linux=NONE option when used to generate source
rpms.  These should not be fatal because we actually don't need them
until we build the source rpm.  When doing mock builds this is
important because these dependent rpms will only be installed if
they are specificed in the source rpms spec file.
2009-07-02 10:47:28 -07:00
Brian Behlendorf
86933a6e51 Simplify rpm build rules, added config/rpm.am.
Distro friendly changes such that the kernel modules are packaged seperately.
2009-07-01 14:37:44 -07:00
Brian Behlendorf
2e0e7e6976 Packaging improvements for RHEL and SLES (part 2)
- Allow checking for exported symbols in both Module.symvers
  and Module.symvers.  My stock SLES kernel ships an objects
  directory with Module.symvers, yet produces a Module.symvers
  in the local build directory.
2009-06-16 11:34:28 -07:00
Brian Behlendorf
39a3d2a421 Packaging improvements for RHEL and SLES
- Properly honor --prefix in build system and rpm spec file.
- Add '--define require_kdir' to spec file to support building
  rpms against kernel sources installed in non-default locations.
- Add '--define require_kobj' to spec file to support building
  rpms against kernel object installed in non-default locations.
- Stop suppressing errors in autogen.sh script.
- Improved logic to detect missing kernel objects when they are
  not located with the source.  This is the common case for SLES
  as well as in-tree chaos kernel builds and is done to simply
  support for multiple arches.
- Moved spl-devel build products to /usr/src/spl-<version>, a
  spl symlink is created to reference the last installed version.
2009-06-16 10:44:59 -07:00
Brian Behlendorf
5232d256b4 SLES10 Fixes (part 6)
- Prior to 2.6.17 there were no *_pgdat helper functions in mm/mmzone.c.
  Instead for_each_zone() operated directly on pgdat_list which may or
  may not have been exported depending on how your kernel was compiled.
  Now new configure checks determine if you have the helpers or not, and
  if the needed symbols are exported.  If they are not exported then they
  are dynamically aquired at runtime by kallsyms_lookup_name().
2009-05-20 14:23:13 -07:00
Brian Behlendorf
a093c6a499 SLES10 Fixes (part 4):
- Configure check for SLES specific API change to vfs_unlink()
  and vfs_rename() which added a 'struct vfsmount *' argument.
  This was for something called the linux-security-module, but
  it appears that it was never adopted upstream.
2009-05-20 11:31:55 -07:00
Brian Behlendorf
6c9433c150 SLES10 Fixes (part 3):
- Configure check for mutex_lock_nested().  This function was introduced
  as part of the mutex validator in 2.6.18, but if it's unavailable then
  it's safe to fallback to a plain mutex_lock().
2009-05-20 11:00:39 -07:00
Brian Behlendorf
96dded3844 SLES10 Fixes (part 2):
- Configure check, the div64_64() function was renamed to
  div64_u64() as of 2.6.26.
- Configure check, the global_page_state() fuction was introduced
  in 2.6.18 kernels.  The earlier 2.6.16 based SLES10 must not try
  and use it, thankfully get_zone_counts() is still available.
- To simplify debugging poison all symbols aquired dynamically
  using spl_kallsyms_lookup_name() with SYMBOL_POISON.
- Add console messages when the user mode helpers fail.
- spl_kmem_init_globals() use bit shifts instead of division.
- When the monotonic clock is unavailable __gethrtime() must perform
  the HZ division as an 'unsigned long long' because the SPL only
  implements __udivdi3(), and not __divdi3() for 'long long' division
  on 32-bit arches.
2009-05-20 10:08:37 -07:00
Brian Behlendorf
bf338d8d09 SLES10 Fixes (part 1):
- Exclude -obj when detecting installed kernel source.
- Detect -obj directory for out of tree kernel builds.
- Allow kernel build system to set CC to ensure -m64 is set properly.
  This is an issue on 64-bit SLES systems which by default always
  build 32-bit binaries (unlike RHEL/Fedora which default to 64-bit)
2009-05-19 11:42:39 -07:00
Brian Behlendorf
0cbaeb117a Allow spl_config.h to be included by dependant packages
We need dependent packages to be able to include spl_config.h so they
can leverage the configure checks the SPL has done.  This is important
because several of the spl headers need the results of these checks to
work properly.  Unfortunately, the autoheader build product is always
private to a particular build and defined certain common things.
(PACKAGE, VERSION, etc).  This prevents other packages which also use
autoheader from being include because the definitions conflict.  To
avoid this problem the SPL build system leverage AH_BOTTOM to include
a spl_unconfig.h at the botton of the autoheader build product.  This
custom include undefs all known shared symbols to prevent the confict.
This does however mean that those definition are also not availble
to the SPL package either.  The SPL package therefore uses the
equivilant SPL_META_* definitions.
2009-03-17 14:55:59 -07:00
Brian Behlendorf
e11d6c5f50 FC10/i686 Compatibility Update (2.6.27.19-170.2.35.fc10.i686)
In the interests of portability I have added a FC10/i686 box to
my list of development platforms.  The hope is this will allow me
to keep current with upstream kernel API changes, and at the same
time ensure I don't accidentally break x86 support.  This patch
resolves all remaining issues observed under that environment.

1) SPL_AC_ZONE_STAT_ITEM_FIA autoconf check added.  As of 2.6.21
the kernel added a clean API for modules to get the global count
for free, inactive, and active pages.  The SPL attempts to detect
if this API is available and directly map spl_global_page_state()
to global_page_state().  If the full API is not available then
spl_global_page_state() is implemented as a thin layer to get
these values via get_zone_counts() if that symbol is available.

2) New kmem:vmem_size regression test added to validate correct
vmem_size() functionality.  The test case acquires the current
global vmem state, allocates from the vmem region, then verifies
the allocation is correctly reflected in the vmem_size() stats.

3) Change splat_kmem_cache_thread_test() to always use KMC_KMEM
based memory.  On x86 systems with limited virtual address space
failures resulted due to exhaustig the address space.  The tests
really need to problem exhausting all memory on the system thus
we need to use the physical address space.

4) Change kmem:slab_lock to cap it's memory usage at availrmem
instead of using the native linux nr_free_pages().  This provides
additional test coverage of the SPL Linux VM integration.

5) Change kmem:slab_overcommit to perform allocation of 256K
instead of 1M.  On x86 based systems it is not possible to create
a kmem backed slab with entires of that size.  To compensate for
this the number of allocations performed in increased by 4x.

6) Additional autoconf documentation for proposed upstream API
changes to make additional symbols available to modules.

7) Console error messages added when spl_kallsyms_lookup_name()
fails to locate an expected symbol.  This causes the module to fail
to load and we need to know exactly which symbol was not available.
2009-03-17 12:16:31 -07:00