Multipath autoreplace, control enclosure LEDs, event rate limiting

1. Enable multipath autoreplace support for FMA.

This extends FMA autoreplace to work with multipath disks.  libdevmapper
must be installed at build time for this support to be compiled in.

2. Turn on/off fault LEDs when VDEVs become degraded/faulted/online.

Set ZED_USE_ENCLOSURE_LEDS=1 in zed.rc to have ZED turn on the enclosure
fault LED for a drive when it becomes FAULTED or DEGRADED, and clear the
LED when the fault is cleared.  Your enclosure must be supported by the
Linux SES driver for this to work.  The enclosure LED scripts work for
multipath devices as well.

3. Rate limit ZIO delay and checksum events so as not to flood ZED.

ZIO delay and checksum events are rate limited to 5/sec in the zfs module.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Don Brady <don.brady@intel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #2449 
Closes #3017 
Closes #5159
Tony Hutter
2016-10-19 12:55:59 -07:00
committed by Brian Behlendorf
parent 7c502b0b1d
commit 6078881aa1
24 changed files with 668 additions and 61 deletions
+64
@@ -40,6 +40,7 @@
#include <sys/int_limits.h>
#include <sys/nvpair.h>
#include "zfs_comutil.h"
#include <sys/zfs_ratelimit.h>
/*
* Are there allocatable vdevs?
@@ -206,10 +207,73 @@ const char *zfs_history_event_names[ZFS_NUM_LEGACY_HISTORY_EVENTS] = {
"pool split",
};

/*
 * Initialize rate limit struct
 *
 * rl:		zfs_ratelimit_t struct
 * burst:	Number to allow in an interval before rate limiting
 * interval:	Interval time in seconds
 */
void
zfs_ratelimit_init(zfs_ratelimit_t *rl, unsigned int burst,
    unsigned int interval)
{
	rl->count = 0;
	rl->start = 0;
	rl->interval = interval;
	rl->burst = burst;
	mutex_init(&rl->lock, NULL, MUTEX_DEFAULT, NULL);
}

/*
 * Re-implementation of the kernel's __ratelimit() function
 *
 * We had to write our own rate limiter because the kernel's __ratelimit()
 * function annoyingly prints out how many times it rate limited to the kernel
 * logs (and there's no way to turn it off):
 *
 * 	__ratelimit: 59 callbacks suppressed
 *
 * If the kernel ever allows us to disable these prints, we should go back to
 * using __ratelimit() instead.
 *
 * Return values are the same as __ratelimit():
 *
 * 0: If we're rate limiting
 * 1: If we're not rate limiting.
 */
int
zfs_ratelimit(zfs_ratelimit_t *rl)
{
	hrtime_t now;
	hrtime_t elapsed;
	int rc = 1;

	mutex_enter(&rl->lock);

	now = gethrtime();
	elapsed = now - rl->start;

	rl->count++;
	if (NSEC2SEC(elapsed) >= rl->interval) {
		rl->start = now;
		rl->count = 0;
	} else {
		if (rl->count >= rl->burst) {
			rc = 0; /* We're ratelimiting */
		}
	}
	mutex_exit(&rl->lock);

	return (rc);
}

#if defined(_KERNEL) && defined(HAVE_SPL)
EXPORT_SYMBOL(zfs_allocatable_devs);
EXPORT_SYMBOL(zpool_get_rewind_policy);
EXPORT_SYMBOL(zfs_zpl_version_map);
EXPORT_SYMBOL(zfs_spa_version_map);
EXPORT_SYMBOL(zfs_history_event_names);
EXPORT_SYMBOL(zfs_ratelimit_init);
EXPORT_SYMBOL(zfs_ratelimit);
#endif
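
To see how the burst/interval logic in zfs_ratelimit() behaves, the following is a minimal user-space sketch, not part of the commit.  It assumes a pthread mutex in place of the kernel mutex, and it takes the current time as an argument instead of calling gethrtime() so the behaviour is deterministic.  All demo_* names and DEMO_NSEC_PER_SEC are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define	DEMO_NSEC_PER_SEC	1000000000LL

/* Hypothetical user-space stand-in for zfs_ratelimit_t. */
typedef struct demo_ratelimit {
	int64_t start;		/* start of the current interval, in ns */
	unsigned int count;	/* events seen since the last reset */
	unsigned int burst;	/* events allowed per interval */
	unsigned int interval;	/* interval length, in seconds */
	pthread_mutex_t lock;
} demo_ratelimit_t;

static void
demo_ratelimit_init(demo_ratelimit_t *rl, unsigned int burst,
    unsigned int interval)
{
	assert(rl != NULL);
	rl->count = 0;
	rl->start = 0;
	rl->interval = interval;
	rl->burst = burst;
	pthread_mutex_init(&rl->lock, NULL);
}

/*
 * Same decision logic as zfs_ratelimit(), but with the timestamp
 * passed in by the caller (an assumption made for testability).
 * Returns 1 if the event is allowed, 0 if it is rate limited.
 */
static int
demo_ratelimit_at(demo_ratelimit_t *rl, int64_t now_ns)
{
	int rc = 1;

	pthread_mutex_lock(&rl->lock);
	int64_t elapsed = now_ns - rl->start;

	rl->count++;
	if (elapsed / DEMO_NSEC_PER_SEC >= (int64_t)rl->interval) {
		/* A full interval has passed: start a new one. */
		rl->start = now_ns;
		rl->count = 0;
	} else if (rl->count >= rl->burst) {
		rc = 0;
	}
	pthread_mutex_unlock(&rl->lock);

	return (rc);
}

/* Count how many of n events arriving at time now_ns are allowed. */
static unsigned int
demo_count_allowed(demo_ratelimit_t *rl, int64_t now_ns, unsigned int n)
{
	unsigned int allowed = 0;

	for (unsigned int i = 0; i < n; i++)
		allowed += demo_ratelimit_at(rl, now_ns);
	return (allowed);
}
```

With burst = 5 and interval = 1, calling demo_count_allowed() with ten events at the same timestamp lets the first five through and suppresses the rest.  Events arriving later within the same second are all suppressed, and the allowance is restored once a full second has elapsed since the interval started.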