Stack usage for ddt_zap_lookup() reduced from 368 bytes to 120
bytes. This large stack allocation significantly contributed to
the likelyhood of a stack overflow when scrubbing/resilvering
dedup pools.
This abomination is no longer required because the zio's issued
during this recursive call path will now be handled asynchronously
by the taskq thread pool.
This reverts commit 6656bf5621.
The majority of the recursive operations performed by the dsl
are done either in the context of the tgx_sync_thread or during
pool import. It is these recursive operations which contribute
greatly to the stack depth. When this recursion is coupled with
a synchronous I/O in the same context overflow becomes possible.
Previously to handle this case I have focused on keeping the
individual stack frames as light as possible. This is a good
idea as long as it can be done in a way which doesn't overly
complicate the code. However, there is a better solution.
If we treat all zio's issued by the tgx_sync_thread as async then
we can use the tgx_sync_thread stack for the recursive parts, and
the zio_* threads for the I/O parts. This effectively doubles our
available stack space with the only drawback being a small delay
to schedule the I/O. However, in practice the scheduling time
is so much smaller than the actual I/O time this isn't an issue.
Another benefit of making the zio async is that the zio pipeline
is now parallel. That should mean for CPU intensive pipelines
such as compression or dedup performance may be improved.
With this change in place the worst case stack usage observed so
far is 6902 bytes. This is still higher than I'd like but
significantly improved. Additional changes to specific functions
should improve this further. This change allows us to revent
commit 6656bf5 which did some horrible things to the recursive
traverse_visitbp() callpath in the name of saving stack.
Yesterday I ran across a 3TB drive which exposed 4K sectors to
Linux. While I thought I had gotten this support correct it
turns out there were 2 subtle bugs which prevented it from
working.
sudo ./cmd/zpool/zpool create -f large-sector /dev/sda
cannot create 'large-sector': one or more devices is currently unavailable
1) The first issue was that it was possible that bdev_capacity()
would return the number of 512 byte sectors rather than the number
of 4096 sectors. Internally, certain Linux functions only operate
with 512 byte sectors so you need to be careful. To avoid any
confusion in the future I've updated bdev_capacity() to simply
return the device (or partition) capacity in bytes. The higher
levels of ZFS want the value in bytes anyway so this is cleaner.
2) When creating a bio the ->bi_sector count must always be
expressed in 512 byte sectors. The existing code would scale
the byte offset by the logical sector size. Until now this was
always 512 so it never caused problems. Trying a 4K sector drive
clearly exposed the issue. The problem has been fixed by
hard coding the 512 byte sector which is exactly what the bio
code does internally.
With these changes I'm now able to create ZFS pools using 4K
sector drives. No issues were observed during fairly extensive
testing. This is also a low risk change if your using 512b
sectors devices because none of the logic changes.
Closes#256
The default buffer size when requesting multiple quota entries
is 100 times the zfs_useracct_t size. In practice this works out
to exactly 27200 bytes. Since this will be a short lived buffer
in a non-performance critical path it is preferable to vmem_alloc()
the needed memory.
Initially when zfsdev_ioctl() was ported to Linux we didn't have
any credential support implemented. So at the time we simply
passed NULL which wasn't much of a problem since most of the
secpolicy code was disabled.
However, one exception is quota handling which does require the
credential. Now that proper credentials are supported we can
safely start passing the callers credential. This is also an
initial step towards fully implemented the zfs secpolicy.
Normally when the arc_shrinker_func() function is called the return
value should be:
>=0 - To indicate the number of freeable objects in the cache, or
-1 - To indicate this cache should be skipped
However, when the shrinker callback is called with 'nr_to_scan' equal
to zero. The caller simply wants the number of freeable objects in
the cache and we must never return -1. This patch reorders the
first two conditionals in arc_shrinker_func() to ensure this behavior.
This patch also now explictly casts arc_size and arc_c_min to signed
int64_t types so MAX(x, 0) works as expected. As unsigned types
we would never see an negative value which defeated the purpose of
the MAX() lower bound and broke the shrinker logic.
Finally, when nr_to_scan is non-zero we explictly prevent all reclaim
below arc_c_min. This is done to prevent the Linux page cache from
completely crowding out the ARC. This limit is tunable and some
experimentation is likely going to be required to set it exactly right.
For now we're sticking with the OpenSolaris defaults.
Closes#218Closes#243
The comment in zfs_close() pertaining to decrementing the synchronous
open count needs to be updated for Linux. The code was already
updated to be correct, but the comment was missed and is now misleading.
Under Linux the zfs_close() hook is only called once when the final
reference is dropped. This differs from Solaris where zfs_close()
is called for each close.
Closes#237
Update the handling of named pipes and sockets to be consistent with
other platforms with regard to the rdev attribute. While all ZFS
ipmlementations store the rdev for device files in a system attribute
(SA), this is not the case for FIFOs and sockets. Indeed, Linux always
passes rdev=0 to mknod() for FIFOs and sockets, so the value is not
needed. Add an ASSERT that rdev==0 for FIFOs and sockets to detect if
the expected behavior ever changes.
Closes#216
The direct reclaim path in the z_wr_* threads must be disabled
to ensure forward progress is always maintained for txg processing.
This ensures that a txg will never get stuck waiting on itself
because it entered the following memory reclaim callpath.
->prune_icache()->dispose_list()->zpl_clear_inode()->zfs_inactive()
->dmu_tx_assign()->dmu_tx_wait()->tgx_wait_open()
It would be preferable to target this exact code path but the
kernel offers no way to do this without custom patches. To avoid
this we are forced to disable all reclaim for these threads. It
should not be necessary to do this for other other z_* threads
because they will not hold a txg open.
Closes#232
It has become necessary to be able to optionally disable
direct memory reclaim for certain taskqs. To support
this the TASKQ_NORECLAIM flags has been added which sets
the PF_MEMALLOC bit for all threads in the taskq.
How nfsd handles .fsync() has been changed a couple of times in the
recent kernels. But basically there are three cases we need to
consider.
Linux 2.6.12 - 2.6.33
* The .fsync() hook takes 3 arguments
* The nfsd will call .fsync() with a NULL file struct pointer.
Linux 2.6.34
* The .fsync() hook takes 3 arguments
* The nfsd no longer calls .fsync() but instead used sync_inode()
Linux 2.6.35 - 2.6.x
* The .fsync() hook takes 2 arguments
* The nfsd no longer calls .fsync() but instead used sync_inode()
For once it looks like we've gotten lucky. The first two cases can
actually be collased in to one if we stop using the file struct
pointer entirely. Since the dentry is still passed in both cases
this is possible. The last case can then be safely handled by
unconditionally using the dentry in the file struct pointer now
that we know the nfsd caller has been removed.
Closes#230
The default buffer size when requesting history is 128k. This
is far to large for a kmem_alloc() so instead use the slower
vmem_alloc(). This path has no performance concerns and the
buffer is immediately free'd after its contents are copied to
the user space buffer.
This commit adds module options for all existing zfs tunables.
Ideally the average user should never need to modify any of these
values. However, in practice sometimes you do need to tweak these
values for one reason or another. In those cases it's nice not to
have to resort to rebuilding from source. All tunables are visable
to modinfo and the list is as follows:
$ modinfo module/zfs/zfs.ko
filename: module/zfs/zfs.ko
license: CDDL
author: Sun Microsystems/Oracle, Lawrence Livermore National Laboratory
description: ZFS
srcversion: 8EAB1D71DACE05B5AA61567
depends: spl,znvpair,zcommon,zunicode,zavl
vermagic: 2.6.32-131.0.5.el6.x86_64 SMP mod_unload modversions
parm: zvol_major:Major number for zvol device (uint)
parm: zvol_threads:Number of threads for zvol device (uint)
parm: zio_injection_enabled:Enable fault injection (int)
parm: zio_bulk_flags:Additional flags to pass to bulk buffers (int)
parm: zio_delay_max:Max zio millisec delay before posting event (int)
parm: zio_requeue_io_start_cut_in_line:Prioritize requeued I/O (bool)
parm: zil_replay_disable:Disable intent logging replay (int)
parm: zfs_nocacheflush:Disable cache flushes (bool)
parm: zfs_read_chunk_size:Bytes to read per chunk (long)
parm: zfs_vdev_max_pending:Max pending per-vdev I/Os (int)
parm: zfs_vdev_min_pending:Min pending per-vdev I/Os (int)
parm: zfs_vdev_aggregation_limit:Max vdev I/O aggregation size (int)
parm: zfs_vdev_time_shift:Deadline time shift for vdev I/O (int)
parm: zfs_vdev_ramp_rate:Exponential I/O issue ramp-up rate (int)
parm: zfs_vdev_read_gap_limit:Aggregate read I/O over gap (int)
parm: zfs_vdev_write_gap_limit:Aggregate write I/O over gap (int)
parm: zfs_vdev_scheduler:I/O scheduler (charp)
parm: zfs_vdev_cache_max:Inflate reads small than max (int)
parm: zfs_vdev_cache_size:Total size of the per-disk cache (int)
parm: zfs_vdev_cache_bshift:Shift size to inflate reads too (int)
parm: zfs_scrub_limit:Max scrub/resilver I/O per leaf vdev (int)
parm: zfs_recover:Set to attempt to recover from fatal errors (int)
parm: spa_config_path:SPA config file (/etc/zfs/zpool.cache) (charp)
parm: zfs_zevent_len_max:Max event queue length (int)
parm: zfs_zevent_cols:Max event column width (int)
parm: zfs_zevent_console:Log events to the console (int)
parm: zfs_top_maxinflight:Max I/Os per top-level (int)
parm: zfs_resilver_delay:Number of ticks to delay resilver (int)
parm: zfs_scrub_delay:Number of ticks to delay scrub (int)
parm: zfs_scan_idle:Idle window in clock ticks (int)
parm: zfs_scan_min_time_ms:Min millisecs to scrub per txg (int)
parm: zfs_free_min_time_ms:Min millisecs to free per txg (int)
parm: zfs_resilver_min_time_ms:Min millisecs to resilver per txg (int)
parm: zfs_no_scrub_io:Set to disable scrub I/O (bool)
parm: zfs_no_scrub_prefetch:Set to disable scrub prefetching (bool)
parm: zfs_txg_timeout:Max seconds worth of delta per txg (int)
parm: zfs_no_write_throttle:Disable write throttling (int)
parm: zfs_write_limit_shift:log2(fraction of memory) per txg (int)
parm: zfs_txg_synctime_ms:Target milliseconds between tgx sync (int)
parm: zfs_write_limit_min:Min tgx write limit (ulong)
parm: zfs_write_limit_max:Max tgx write limit (ulong)
parm: zfs_write_limit_inflated:Inflated tgx write limit (ulong)
parm: zfs_write_limit_override:Override tgx write limit (ulong)
parm: zfs_prefetch_disable:Disable all ZFS prefetching (int)
parm: zfetch_max_streams:Max number of streams per zfetch (uint)
parm: zfetch_min_sec_reap:Min time before stream reclaim (uint)
parm: zfetch_block_cap:Max number of blocks to fetch at a time (uint)
parm: zfetch_array_rd_sz:Number of bytes in a array_read (ulong)
parm: zfs_pd_blks_max:Max number of blocks to prefetch (int)
parm: zfs_dedup_prefetch:Enable prefetching dedup-ed blks (int)
parm: zfs_arc_min:Min arc size (ulong)
parm: zfs_arc_max:Max arc size (ulong)
parm: zfs_arc_meta_limit:Meta limit for arc size (ulong)
parm: zfs_arc_reduce_dnlc_percent:Meta reclaim percentage (int)
parm: zfs_arc_grow_retry:Seconds before growing arc size (int)
parm: zfs_arc_shrink_shift:log2(fraction of arc to reclaim) (int)
parm: zfs_arc_p_min_shift:arc_c shift to calc min/max arc_p (int)
When a new znode/inode pair is created both the znode and the inode
should be immediately updated to the correct values. This was done
for the znode and for most of the values in the inode, but not all
of them. This normally wasn't a problem because most subsequent
operations would cause the inode to be immediately updated. This
change ensures the inode is now fully updated before it is inserted
in to the inode hash.
Closes#116Closes#146Closes#164
This change fixes a kernel panic which would occur when resizing
a dataset which was not open. The objset_t stored in the
zvol_state_t will be set to NULL when the block device is closed.
To avoid this issue we pass the correct objset_t as the third arg.
The code has also been updated to correctly notify the kernel
when the block device capacity changes. For 2.6.28 and newer
kernels the capacity change will be immediately detected. For
earlier kernels the capacity change will be detected when the
device is next opened. This is a known limitation of older
kernels.
Online ext3 resize test case passes on 2.6.28+ kernels:
$ dd if=/dev/zero of=/tmp/zvol bs=1M count=1 seek=1023
$ zpool create tank /tmp/zvol
$ zfs create -V 500M tank/zd0
$ mkfs.ext3 /dev/zd0
$ mkdir /mnt/zd0
$ mount /dev/zd0 /mnt/zd0
$ df -h /mnt/zd0
$ zfs set volsize=800M tank/zd0
$ resize2fs /dev/zd0
$ df -h /mnt/zd0
Original-patch-by: Fajar A. Nugraha <github@fajar.net>
Closes#68Closes#84
The vdev_metaslab_init() function has been observed to allocate
larger than 8k chunks. However, they are not much larger than 8k
and it does this infrequently so it is allowed and the warning is
supressed.
The dsl_scan_visit() function is a little heavy weight taking 464
bytes on the stack. This can be easily reduced for little cost by
moving zap_cursor_t and zap_attribute_t off the stack and on to the
heap. After this change dsl_scan_visit() has been reduced in size
by 320 bytes.
This change was made to reduce stack usage in the dsl_scan_sync()
callpath which is recursive and has been observed to overflow the
stack.
Issue #174
This function is called recursively so everything possible must be
done to limit its stack consumption. The dprintf_bp() debugging
function adds 30 bytes of local variables to the function we cannot
afford. By commenting out this debugging we save 30 bytes per
recursion and depths of 13 are not uncommon. This yeilds a total
stack saving of 390 bytes on our 8k stack.
Issue #174
The recursive call chain dsl_scan_visitbp() -> dsl_scan_recurse() ->
dsl_scan_visitdnode() -> dsl_scan_visitbp has been observed to consume
considerable stack resulting in a stack overflow (>8k). The cleanest
way I see to fix this with minimal impact to the existing flow of
code, and with the fewest performance concerns, is to always inline
dsl_scan_recurse() and dsl_scan_visitdnode(). While this will increase
the function size of dsl_scan_visitbp(), by 4660 bytes, it also reduces
the stack requirements by removing the function call overhead.
Issue #174
It's possible for a zvol_write thread to enter direct memory reclaim
while holding open a transaction group. This results in the system
attempting to write out data to the disk to free memory. Unfortunately,
this can't succeed because the the thread doing reclaim is holding open
the txg which must be closed to be synced to disk. To prevent this
the offending allocation is marked KM_PUSHPAGE which will prevent it
from attempting writeback.
Closes#191
Occasionally we would see an -EFAULT returned when setting the
I/O scheduler on a vdev. This was caused an improperly formatted
user mode helper command.
This commit restructures the command to something simpler, allocates
space for it dynamically to save stack, and removes the retry logic
which is no longer needed.
Closes#169
This change ensures the ARC meta-data limits are enforced. Without
this enforcement meta-data can grow to consume all of the ARC cache
pushing out data and hurting performance. The cache is aggressively
reclaimed but this is a soft and not a hard limit. The cache may
exceed the set limit briefly before being brought under control.
By default 25% of the ARC capacity can be used for meta-data. This
limit can be tuned by setting the 'zfs_arc_meta_limit' module option.
Once this limit is exceeded meta-data reclaim will occur in 3 percent
chunks, or may be tuned using 'arc_reduce_dnlc_percent'.
Closes#193
Fixed a bug where zfs_zget could access a stale znode pointer when
the inode had already been removed from the inode cache via iput ->
iput_final -> ... -> zfs_zinactive but the corresponding SA handle
was still alive.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#180
Change the SPL kernel messages for module loading and module
unloading so that they are similar to the ZFS kernel messages.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
This reverts commit 1814251453.
Demote the gawk call back to awk and ensure that stderr is attached. GNU gawk
tolerates a missing stderr handle, but many utilities do not, which could be
why a regular awk call was unexplainably failing on some systems.
Use argv[0] instead of sh_path for consistency internally and with other Linux
drivers.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Provide a call_usermodehelper() alternative by letting the hostid be passed as
a module parameter like this:
$ modprobe spl spl_hostid=0x12345678
Internally change the spl_hostid variable to unsigned long because that is the
type that the coreutils /usr/bin/hostid returns.
Move the hostid command into GET_HOSTID_CMD for consistency with the similar
GET_KALLSYMS_ADDR_CMD invocation.
Use argv[0] instead of sh_path for consistency internally and with other Linux
drivers.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
The function zlib_deflate_workspacesize() now take 2 arguments.
This was done to avoid always having to allocate the maximum size
workspace (268K). The caller can now specific the windowBits and
memLevel compression parameters to get a smaller workspace.
For our purposes we introduce a spl_zlib_deflate_workspacesize()
wrapper which accepts both arguments. When the two argument
version of zlib_deflate_workspacesize() is available the arguments
are passed through. When it's not we assume the worst case and
a maximally sized workspace is used.
The path_lookup() function has been renamed to kern_path_parent()
and the flags argument has been removed. The only behavior now
offered is that of LOOKUP_PARENT. The spl already always passed
this flag so dropping the flag does not impact us.
This is a long over due compatibility change. Way, way, way back
in 2007 there was a push to remove all consumers of SPIN_LOCK_UNLOCKED.
Finally, in 2011 with 2.6.39 all the consumers have been updated
and SPIN_LOCK_UNLOCKED was removed. It's about time we use the
new API as well, this change does exactly that. DEFINE_SPINLOCK()
was available as far back as 2.6.12 so there doesn't need to be
any additional autoconf-foo for this change.
As part of zfs_ioc_recv() a zfs_cmd_t is allocated in the kernel
which is 17808 bytes in size. This sort of thing in general should
be avoided. However, since this should be an infrequent event for
now we allow it and simply suppress the warning with the KM_NODEBUG
flag. This can be revisited latter if/when it becomes an issue.
Closes#178
If the attribute's new value was shorter than the old one the old
code would leave parts of the old value in the xattr znode.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#203
Without this we may mistakenly believe we have a dentry and try to
d_instantiate() it. This will result in the following BUG. It's
important to note that while the xattr directory has an inode
assoicated with it we never create a dentry for it.
kernel BUG at fs/dcache.c:1418!
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#202
Flagged by the default -Wunused-but-set-variable gcc option when
running under Fedora 15. Since it's correct this variable is
entirely unused this commit removes it.
To resolve a potiential filesystem corruption issue a second
argument was added to invalidate_inodes(). This argument controls
whether dirty inodes are dropped or treated as busy when invalidating
a super block. When only the legacy API is available the second
argument will be dropped for compatibility.
When compiling ZFS in user space gcc-4.6.0 correctly identifies
the variable 'os' as being set but never used. This generates a
warning and a build failure when using --enable-debug. However,
the code is correct we only want to use 'os' for the kernel space
builds. To suppress the warning the call was wrapped with a
VERIFY() which has the nice side effect of ensuring the 'os'
actually never is NULL. This was observed under Fedora 15.
module/zfs/dsl_pool.c: In function ‘dsl_pool_create’:
module/zfs/dsl_pool.c:229:12: error: variable ‘os’ set but not used
[-Werror=unused-but-set-variable]
Update code to use the spl_invalidate_inodes() wrapper. This hides
some of the complexity of determining if invalidate_inodes() was
exported, and if so what is its prototype. The second argument
of spl_invalidate_inodes() determined the behavior of how dirty
inodes are handled. By passing a zero we are indicated that we
want those inodes to be treated as busy and skipped.
The .sync_fs fix as applied did not use the updated SPL credential
API. This broke builds on Debian Lenny, this change applies the
needed fix to use the portable API. The original credential changes
are part of commit 81e97e2187.
Disable the normal reclaim path for the txg_sync thread. This
ensures the thread will never enter dmu_tx_assign() which can
otherwise occur due to direct reclaim. If this is allowed to
happen the system can deadlock. Direct reclaim call path:
->shrink_icache_memory->prune_icache->dispose_list->
clear_inode->zpl_clear_inode->zfs_inactive->dmu_tx_assign
Under OpenSolaris all memory reclaim is done asyncronously. Under
Linux memory reclaim is done asynchronously _and_ synchronously.
When a process allocates memory with GFP_KERNEL it explicitly allows
the kernel to do reclaim on its behalf to satify the allocation.
If that GFP_KERNEL allocation fails the kernel may take more drastic
measures to reclaim the memory such as killing user space processes.
This was observed to happen with ZFS because the ARC could consume
a large fraction of the system memory but no synchronous reclaim
could be performed on it. The result was GFP_KERNEL allocations
could fail resulting in OOM events, and only moments latter the
arc_reclaim thread would free unused memory from the ARC.
This change leaves the arc_thread in place to manage the fundamental
ARC behavior. But it adds a synchronous (direct) reclaim path for
the ARC which can be called when memory is badly needed. It also
adds an asynchronous (indirect) reclaim path which is called
much more frequently to prune the ARC slab caches.
The following useful values were missing the arcstats. This change
adds them in to provide greater visibility in to the arcs behavior.
arc_no_grow 4 0
arc_tempreserve 4 0
arc_loaned_bytes 4 0
arc_meta_used 4 624774592
arc_meta_limit 4 400785408
arc_meta_max 4 625594176
Under Linux a dentry referencing an inode must be instantiated before
the inode is unlocked. To accomplish this without overly modifing
the core ZFS code the dentry it passed via the vattr_t. There are
cases such as replay when a dentry is not available. In which case
it is obviously not initialized at inode creation time, if a dentry
is needed it will be spliced as when required via d_lookup().
Provide the dnlc_reduce_cache() function which attempts to prune
cached entries from the dcache and icache. After the entries are
pruned any slabs which they may have been using are reaped.
Note the API takes a reclaim percentage but we don't have easy
access to the total number of cache entries to calculate the
reclaim count. However, in practice this doesn't need to be
exactly correct. We simply need to reclaim some useful fraction
(but not all) of the cache. The caller can determine if more
needs to be done.
One of the most common things you want to know when looking at
the slab is how much memory is being used. This information was
available in /proc/spl/kmem/slab but only on a per-slab basis.
This commit adds the following /proc/sys/kernel/spl/kmem/slab*
entries to make total slab usage easily available at a glance.
slab_kmem_total - Total kmem slab size
slab_kmem_avail - Alloc'd kmem slab size
slab_kmem_max - Max observed kmem slab size
slab_vmem_total - Total vmem slab size
slab_vmem_avail - Alloc'd vmem slab size
slab_vmem_max - Max observed vmem slab size
NOTE: The slab_*_max values are expected to over report because
they show maximum values since boot, not current values.
The 'slab_fail', 'slab_create', and 'slab_destroy' columns in the slab
output have been removed because they are virtually always zero and
not very useful.
The much more useful 'size' and 'alloc' columns have been added which
show the total slab size and how much of the total size has been
allocated to objects.
Finally, the formatting has been updated to be much more human
readable while still being friendly for tool like awk to parse.
The Linux shrinker has gone through three API changes since 2.6.22.
Rather than force every caller to understand all three APIs this
change consolidates the compatibility code in to the mm-compat.h
header. The caller then can then use a single spl provided
shrinker API which does the right thing for your kernel.
SPL_SHRINKER_CALLBACK_PROTO(shrinker_callback, cb, nr_to_scan, gfp_mask);
SPL_SHRINKER_DECLARE(shrinker_struct, shrinker_callback, seeks);
spl_register_shrinker(&shrinker_struct);
spl_unregister_shrinker(&&shrinker_struct);
spl_exec_shrinker(&shrinker_struct, nr_to_scan, gfp_mask);
Making distclean in module
make[1]: Entering directory `/zfs/module'
make -C SUBDIRS=`pwd` clean
make: Entering an unknown directory
make: *** SUBDIRS=/zfs/module: No such file or directory. Stop.
When using --with-config=user the 'distclean' target would fail
because it assumes the kernel configuration infrastrure is set up.
This is not the case, nor does it need to be, because the
'--with-config=user' option will prune the entire ./module subtree
from SUBDIRS. This prevents most build rules from operating in the
./module directory.
However, the 'dist*' rules will still traverse this directory
because it is listed in DIST_SUBDIRS. This is correct because we
need to ensure the dist rules package the directory contents
regardless of the configuration for the 'dist' rule. The correct
way to handle this is to only invoke the kernel build system as
part of the 'clean' rule when CONFIG_KERNEL_TRUE is set.
Initial fix provided by Darik Horn <dajhorn@vanadac.com>.
This commit is a slightly refined form of the original.