Change checksum & IO delay ratelimit values

Change checksum & IO delay ratelimit thresholds from 5/sec to 20/sec.
This allows zed to actually trigger if a bunch of these events arrive in
a short period of time (zed has a threshold of 10 events in 10 sec).
Previously, if you had, say, 100 checksum errors in 1 sec, they would be
ratelimited to 5/sec, which is too few to trigger zed to fault the
drive.

Also, convert the checksum and IO delay thresholds to module params for
easy testing.

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #7252
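
The arithmetic behind the change can be sketched in a few lines of C.
This is only an illustration: events_passed() and
burst_reaches_threshold() are hypothetical helpers, not part of the
commit, and the sketch assumes that reaching 10 events is enough to
count as hitting zed's threshold.

```c
#include <assert.h>
#include <stdbool.h>

/* zed's threshold as stated in the commit message: 10 events in 10 sec. */
#define	ZED_EVENT_THRESHOLD	10

/*
 * Hypothetical helper: number of events that survive a per-second rate
 * limit when `arrived` events all land within the same one-second window.
 */
static unsigned int
events_passed(unsigned int arrived, unsigned int per_second_limit)
{
	return (arrived < per_second_limit ? arrived : per_second_limit);
}

/*
 * Hypothetical helper: does a single one-second burst deliver enough
 * events to reach the threshold? (Assumes ">=" counts as reaching it.)
 */
static bool
burst_reaches_threshold(unsigned int arrived, unsigned int per_second_limit)
{
	assert(per_second_limit > 0);
	return (events_passed(arrived, per_second_limit) >=
	    ZED_EVENT_THRESHOLD);
}
```

With a limit of 5, a burst of 100 errors delivers only 5 events, which is
below the threshold; with a limit of 20, the same burst delivers 20.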
Tony Hutter
2018-03-04 17:34:51 -08:00
committed by Brian Behlendorf
parent 5666a994f2
commit 80d52c3919
6 changed files with 56 additions and 15 deletions
-2
@@ -262,8 +262,6 @@ struct vdev {
 	 * We rate limit ZIO delay and ZIO checksum events, since they
 	 * can flood ZED with tons of events when a drive is acting up.
 	 */
-#define DELAYS_PER_SECOND 5
-#define CHECKSUMS_PER_SECOND 5
 	zfs_ratelimit_t vdev_delay_rl;
 	zfs_ratelimit_t vdev_checksum_rl;
 };
+9 -3
@@ -25,13 +25,19 @@
 typedef struct {
 	hrtime_t start;
 	unsigned int count;
-	unsigned int burst;	/* Number to allow per interval */
-	unsigned int interval;	/* Interval length in seconds */
+	/*
+	 * Pointer to number of events per interval. We do this to
+	 * allow the burst to be a (changeable) module parameter.
+	 */
+	unsigned int *burst;
+	unsigned int interval;	/* Interval length in seconds */
 	kmutex_t lock;
 } zfs_ratelimit_t;
 int zfs_ratelimit(zfs_ratelimit_t *rl);
-void zfs_ratelimit_init(zfs_ratelimit_t *rl, unsigned int burst,
+void zfs_ratelimit_init(zfs_ratelimit_t *rl, unsigned int *burst,
     unsigned int interval);
 void zfs_ratelimit_fini(zfs_ratelimit_t *rl);
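
Why the burst is stored as a pointer can be shown with a simplified,
userspace-only sketch. It is not the ZFS implementation: the sketch_*
names are hypothetical, the mutex is omitted, and the current time is
passed in as whole seconds instead of being read from a clock. The point
is that the limiter dereferences the pointer on every call, so a change
to the tunable takes effect without re-initializing the limiter.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for zfs_ratelimit_t: no lock, time in seconds. */
typedef struct {
	unsigned long start;	/* Start of the current interval */
	unsigned int count;	/* Events allowed so far in this interval */
	unsigned int *burst;	/* Points at a tunable; read on every call */
	unsigned int interval;	/* Interval length in seconds */
} sketch_ratelimit_t;

/* Hypothetical tunable standing in for a module parameter. */
static unsigned int sketch_events_per_second = 20;

static void
sketch_ratelimit_init(sketch_ratelimit_t *rl, unsigned int *burst,
    unsigned int interval)
{
	assert(rl != NULL && burst != NULL);
	rl->start = 0;
	rl->count = 0;
	rl->burst = burst;
	rl->interval = interval;
}

/* Returns 1 if the event may be posted at time `now`, 0 if it is dropped. */
static int
sketch_ratelimit(sketch_ratelimit_t *rl, unsigned long now)
{
	if (now - rl->start >= rl->interval) {
		/* A new interval has begun: reset the window. */
		rl->start = now;
		rl->count = 0;
	}
	/* The limit is read through the pointer each time. */
	if (rl->count >= *rl->burst)
		return (0);
	rl->count++;
	return (1);
}
```

In this sketch, initializing a limiter with &sketch_events_per_second
and then lowering that variable to 5 changes how many events pass in the
next interval, which is the behavior a plain by-value burst field could
not provide.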