Teach zpool scrub to scrub only blocks in error log

Add a '-e' flag to zpool scrub that scrubs only the blocks listed in the
error log. A user can pause, resume, and cancel an error scrub by passing
the additional command line arguments -p and -s, just like a regular
scrub. This involves adding the new flag, new libzfs interfaces, a new
ioctl, and the actual iteration and read-issuing logic. An error scrub is
spread across multiple txgs, with a bounded number of error blocks per
txg, to limit its impact on pool performance.
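
As an illustration of the command lines described above, here is a minimal
sketch. It assumes -e combines with -p and -s exactly as this message
states. The helper error_scrub_cmd and the pool name "tank" are
hypothetical and not part of this change. The helper only prints each
command, so nothing is run against a real pool.

```shell
#!/bin/sh
# Hypothetical helper (not part of ZFS): maps an action name to the
# zpool command line described in this commit message and prints it.
error_scrub_cmd() {
    action=$1
    pool=$2
    case "$action" in
        # Resuming is done by issuing the same command as starting.
        start|resume) printf 'zpool scrub -e %s\n' "$pool" ;;
        pause)        printf 'zpool scrub -e -p %s\n' "$pool" ;;
        cancel)       printf 'zpool scrub -e -s %s\n' "$pool" ;;
        *)            printf 'unknown action: %s\n' "$action" >&2
                      return 1 ;;
    esac
}

# "tank" is a placeholder pool name.
error_scrub_cmd start tank
error_scrub_cmd pause tank
error_scrub_cmd resume tank
error_scrub_cmd cancel tank
```

To run the commands for real, pipe the printed lines to sh on a system
where the pool has been scrubbed at least once with head_errlog enabled.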

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Co-authored-by: TulsiJain <tulsi.jain@delphix.com>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #8995
Closes #12355
George Amanakis
2021-12-17 21:35:28 +01:00
committed by Brian Behlendorf
parent e34e15ed6d
commit 482eeef804
29 changed files with 1602 additions and 71 deletions
+3
@@ -1764,6 +1764,9 @@ Scrubs are processed by the sync thread.
While scrubbing, it will spend at least this much time
working on a scrub between TXG flushes.
.
.It Sy zfs_scrub_error_blocks_per_txg Ns = Ns Sy 4096 Pq uint
Error blocks to be scrubbed in one txg.
.
.It Sy zfs_scan_checkpoint_intval Ns = Ns Sy 7200 Ns s Po 2 hour Pc Pq uint
To preserve progress across reboots, the sequential scan algorithm periodically
needs to stop metadata scanning and issue all the verification I/O to disk.
+19
@@ -38,6 +38,7 @@
.Cm scrub
.Op Fl s Ns | Ns Fl p
.Op Fl w
.Op Fl e
.Ar pool Ns …
.
.Sh DESCRIPTION
@@ -62,6 +63,13 @@ device
whereas scrubbing examines all data to discover silent errors due to hardware
faults or disk failure.
.Pp
When scrubbing a pool with encrypted filesystems the keys do not need to be
loaded.
However, if the keys are not loaded and an unrepairable checksum error is
detected the file name cannot be included in the
.Nm zpool Cm status Fl v
verbose error report.
.Pp
Because scrubbing and resilvering are I/O-intensive operations, ZFS only allows
one at a time.
.Pp
@@ -92,9 +100,20 @@ Once resumed the scrub will pick up from the place where it was last
checkpointed to disk.
To resume a paused scrub issue
.Nm zpool Cm scrub
or
.Nm zpool Cm scrub
.Fl e
again.
.It Fl w
Wait until scrub has completed before returning.
.It Fl e
Only scrub files with known data errors as reported by
.Nm zpool Cm status Fl v .
The pool must have been scrubbed at least once with the
.Sy head_errlog
feature enabled to use this option.
Error scrubbing cannot be run simultaneously with regular scrubbing or
resilvering, nor can it be run when a regular scrub is paused.
.El
.Sh EXAMPLES
.Ss Example 1