may not fail. To get this behavior I'd added a retry to the shim layer
even though it is abusive to the VM, at least it should prevent the crash.
Additionally I added a proc counter so I can easily check how often this
is happening. It should be fairly rare, but likely will get worse and
worse the longer the machine has been up.
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@104 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c
not to support a few flags (we assert if they are used), and I
did not add the libkstat interface and instead exported everything
to proc for easy access.
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@103 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c
- Ensure the mutex_stats_sem and mutex_stats_list are initialized
- Only spin if you have to in mutex_init
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@97 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c
- Detailed kmem memory allocation tracking. We can now get on
spl module unload a list of all memory allocations which were
not free'd and where the original alloc was. E.g.
SPL: 15554:632:(spl-kmem.c:442:kmem_fini()) kmem leaked 90/319332 bytes
SPL: 15554:648:(spl-kmem.c:451:kmem_fini()) address size data func:line
SPL: 15554:648:(spl-kmem.c:457:kmem_fini()) ffff8100734b68b8 32 0100000001005a5a __spl_mutex_init:70
SPL: 15554:648:(spl-kmem.c:457:kmem_fini()) ffff8100734b6148 13 &tl->tl_lock __spl_mutex_init:74
SPL: 15554:648:(spl-kmem.c:457:kmem_fini()) ffff81007ac43730 32 0100000001005a5a __spl_mutex_init:70
SPL: 15554:648:(spl-kmem.c:457:kmem_fini()) ffff81007ac437d8 13 &tl->tl_lock __spl_mutex_init:74
- Shift to using rwsems in kmem implmentation, to simply locking and
improve concurency.
- Shift to using rwsems in mutex implementation, additionally ensure we
never sleep in the init function if non-zero preempt_count or
interrupts are disabled as can happen in a slab cache ctor/dtor.
- Other minor formating fixes and such.
TODO:
- Finish the vmem memory allocation tracking
- Vet all other SPL primatives for potential sleeping during *_init. I
suspect the rwlock implemenation does this and should be fixes just
like the mutex implemenation.
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@95 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c
crashes but it's not clear to me yet if these are a problem with
the mutex implementation or ZFSs usage of it.
Minor taskq fixes to add new tasks to the end of the pending list.
Minor enhansements to the debug infrastructure.
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@94 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c
configurable number of threads like the Solaris version and almost
all of the options are supported. Unfortunately, it appears to have
made absolutely no difference to our performance numbers. I need
to keep looking for where we are bottle necking.
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@93 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c
before the debug subsystem is fully set up, or after the debug
subsystem has been torn down.
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@86 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c
process of destroying the stacks. Threshhold set fairly aggressively
top 80% of stack usage.
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@82 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c
- Replacing all BUG_ON()'s with proper ASSERT()'s
- Using ENTRY,EXIT,GOTO, and RETURN macro to instument call paths
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@78 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c
changes bring over everything lustre had for debugging with
two exceptions. I dropped by the debug daemon and upcalls
just because it made things a little easier. They can be
readded easily enough if we feel they are needed.
Everything compiles and seems to work on first inspection
but I suspect there are a handful of issues still lingering
which I'll be sorting out right away. I just wanted to get
all these changes commited and safe. I'm getting a little
paranoid about losing them.
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@75 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c
when necessary to avoid deadlocks. We were seeing the deadlock
when calling kmem_cache_generic_constructor() and then an interrupt
forced us to end up calling kmem_cache_generic_destructor()
which caused our deadlock.
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@74 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c
think this should fix anything but it's a good idea regardless.
- Drop the lock before calling the construct/destructor for the slab
otherwise we can't sleep in a constructor/destructor and for long running
functions we may NMI.
- Do something braindead, but safe for the console debug logs for now.
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@73 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c
at spl module load time can calls hostid. The resolved hostid
is then fed back in to a proc entry for latter use. It's
not a pretty thing, but it will work for now. The hw_serial
is required for things such as 'zpool status' to work.
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@71 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c
function just to be extra safety and paranoid.
- Rewrite the thread shim to take full advantage of the
new kernel kthread API. This greatly simplifies things.
- Add a new regression test for thread_exit() to ensure
it properly terminates a thread immediately without
allowing futher execution of the thread.
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@69 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c
your task is rescheduled to a different cpu after you've
taken the lock but before calling RW_LOCK_HELD is called.
We need the spinlock to ensure there is a wmb() there.
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@68 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c
- Ensure we have at least 1 write-only splat test
- Fix return codes for vn_* Solaris does not use negative return
codes in the kernel. So linux errno's must be inverted.
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@67 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c
We need to use kthread_create() here for a few reasons. First
off to old kernel_thread() API functioin will be going away.
Secondly, and more importantly if I use kthread_create() we can
then properly implement a thread_exit() function which terminates
the kernel thread at any point with do_exit(). This fixes our
cleanup bug which was caused by dropping a mutex twice after
thread_exit() didn't really exit.
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@66 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c
atomic operations will be rewritten anyway with the correct arch
specific assembly. But not today.
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@65 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c
- Added liunx block device headers to sunldi.h
- Made __taskq_dispatch safe for interrupt context where it turns out we
need to be useing it.
- Fixed NULL const/dest bug for kmem slab caches
- Places debug __dprintf debugging messages under a spin_lock_irqsave
so it's safe to use then in interrupt handlers. For debugging only!
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@64 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c
what I would call effecient but it does have the advantage
of being correct which is all I need right now. I added
a regression test as well.
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@57 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c
This should handle the absolute minimum I need for ZFS. It will
register the chdev with the right callbacks. Then the generic
registered linux callback will find the right registered solaris
callback for the function and munge the args just right before
passing it on. Should work, but untested (just compiled), so I
expect bugs.
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@52 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c
stuff which only inclused the getf()/releasef() in to the vnode area
where it will only really be used. These calls allow a user to
grab an open file struct given only the known open fd for a particular
user context. ZFS makes use of these, but they're a bit tricky to
test from within the kernel since you already need the file open
and know the fd. So we basically spook the system calls to setup
the environment we need for the splat test case and verify given
just the know fd we can get the file, create the needed vnode, and
then use the vnode interface as usual to read and write from it.
While I was hacking away I also noticed a NULL termination issue
in the second kobj test case so I fixed that too. In fact, I fixed
a few other things as well but all for the best!
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@51 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c
Adjust kmem slab interface to make a copy of the slab name before
passing it on to the linux slab (we free it latter too)
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@47 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c
- Map the LE/BE_* byteorder macros to the linux versions
- More minor vnodes fixes
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@41 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c
- Re-implmented kobj support based on the vnode support.
- Add TESTS option to check.sh, and removed delay after module load.
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@39 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c
Update check.sh script to take V=1 env var so you can run it verbosely as
follows if your chasing something: sudo make check V=1
Add new kobj api and needed regression tests to allow reading of files from
within the kernel. Normally thats not something I support but the spa layer
needs the support for its config file.
Add some more missing stub headers
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@38 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c
Add sloopy atomic declaration which will need to be fixed (eventually)
Fill out more of the Solaris VM hooks
Adjust the create_thread function
git-svn-id: https://outreach.scidac.gov/svn/spl/trunk@26 7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c