Restructure per-filesystem reclaim

Originally when the ARC prune callback was introduced the idea was
to register a single callback for the ZPL.  The ARC could invoke this
call back if it needed the ZPL to drop dentries, inodes, or other
cache objects which might be pinning buffers in the ARC.  The ZPL
would iterate over all ZFS super blocks and perform the reclaim.

For the most part this design has worked well but due to limitations
in 2.6.35 and earlier kernels there were some problems.  This patch
is designed to address those issues.

1) iterate_supers_type() is not provided by all kernels which makes
it impossible to safely iterate over all zpl_fs_type filesystems in
a single callback.  The most straight forward and portable way to
resolve this is to register a callback per-filesystem during mount.
The arc_*_prune_callback() functions have always supported multiple
callbacks so this is functionally a very small change.

2) Commit 050d22b removed the non-portable shrink_dcache_memory()
and shrink_icache_memory() functions and didn't replace them with
equivalent functionality.  This meant that for Linux 3.1 and older
kernels the ARC had no mechanism to drop dentries and inodes from
the caches if needed.  This patch adds that missing functionality
by calling shrink_dcache_parent() to release dentries which may be
pinning inodes.  This will result in all unused cache entries being
dropped which is a bit heavy handed but it's the only interface
available for old kernels.

3) A zpl_drop_inode() callback is registered for kernels older than
2.6.35 which do not support the .evict_inode callback.  This ensures
that when the last reference on an inode is dropped it is immediately
removed from the cache.  If this isn't done than inode can end up on
the global unused LRU with no mechanism available to ZFS to drop them.
Since the ARC buffers are not dropped the hottest inodes can still
be recreated without performing disk IO.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Pavel Snajdr <snajpa@snajpa.net>
Issue #3160
This commit is contained in:
Brian Behlendorf
2015-03-17 15:07:47 -07:00
parent 596a8935a1
commit 2cbb06b561
6 changed files with 59 additions and 43 deletions
+12 -3
View File
@@ -386,7 +386,11 @@ Use \fB1\fR for yes (default) and \fB0\fR to disable.
\fBzfs_arc_meta_limit\fR (ulong)
.ad
.RS 12n
Meta limit for arc size
The maximum allowed size in bytes that meta data buffers are allowed to
consume in the ARC. When this limit is reached meta data buffers will
be reclaimed even if the overall arc_c_max has not been reached. This
value defaults to 0 which indicates that 3/4 of the ARC may be used
for meta data.
.sp
Default value: \fB0\fR.
.RE
@@ -397,9 +401,14 @@ Default value: \fB0\fR.
\fBzfs_arc_meta_prune\fR (int)
.ad
.RS 12n
Bytes of meta data to prune
The number of dentries and inodes to be scanned looking for entries
which can be dropped. This may be required when the ARC reaches the
\fBzfs_arc_meta_limit\fR because dentries and inodes can pin buffers
in the ARC. Increasing this value will cause to dentry and inode caches
to be pruned more aggressively. Setting this value to 0 will disable
pruning the inode and dentry caches.
.sp
Default value: \fB1,048,576\fR.
Default value: \fB10,000\fR.
.RE
.sp