Add device rebuild feature

The device_rebuild feature enables sequential reconstruction when
resilvering.  Mirror vdevs can be rebuilt in LBA order, which may
restore redundancy more quickly depending on the pool's average block
size, overall fragmentation, and the performance characteristics
of the devices.  However, block checksums cannot be verified
as part of the rebuild, so a scrub is automatically started after
the sequential resilver completes.
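
As a rough illustration of what "LBA order" means here, the toy model
below splits allocated ranges into reads issued in ascending offset
order, capped at the zfs_rebuild_max_segment default of 1,048,576
bytes.  It is not OpenZFS code: the function name, the range
representation, and the assumption that the rebuild walks a list of
allocated ranges are all hypothetical.

```python
# Toy model only; names and data layout are hypothetical, not OpenZFS code.
MAX_SEGMENT = 1_048_576  # bytes, matches the zfs_rebuild_max_segment default


def rebuild_segments(allocated_ranges, max_segment=MAX_SEGMENT):
    """Yield (offset, length) reads in ascending LBA order.

    allocated_ranges is an iterable of (offset, length) tuples describing
    allocated space on a top-level vdev, given in any order.
    """
    for offset, length in sorted(allocated_ranges):
        end = offset + length
        while offset < end:
            # Never issue a read larger than max_segment.
            size = min(max_segment, end - offset)
            yield offset, size
            offset += size


# Two allocated ranges supplied out of order; the second exceeds one segment.
segments = list(rebuild_segments([(4_194_304, 1_572_864), (0, 524_288)]))
```

Note that nothing in this loop looks at block pointers or checksums,
which is why verification is left to the scrub that follows.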

A new `-s` option has been added to the `zpool attach` and
`zpool replace` commands to request sequential reconstruction
instead of healing reconstruction when resilvering.

    zpool attach -s <pool> <existing vdev> <new vdev>
    zpool replace -s <pool> <old vdev> <new vdev>

The `zpool status` output has been updated to report the progress
of sequential resilvering in the same way as healing resilvering.
The one notable difference is that multiple sequential resilvers
may be in progress at the same time, as long as they are operating
on different top-level vdevs.

The `zpool wait -t resilver` command was extended to wait on
sequential resilvers.  From this perspective they are no different
from healing resilvers.
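
A minimal sketch of scripting this, assuming a pool and device names
chosen purely for illustration.  The helper functions are hypothetical;
only the `zpool attach -s` and `zpool wait -t resilver` invocations come
from this change.

```python
import subprocess


def sequential_attach_argv(pool, existing_vdev, new_vdev):
    """Build the argv for a sequential (rebuild) attach."""
    return ["zpool", "attach", "-s", pool, existing_vdev, new_vdev]


def wait_resilver_argv(pool):
    """Build the argv that blocks until resilvering on the pool finishes."""
    return ["zpool", "wait", "-t", "resilver", pool]


def attach_and_wait(pool, existing_vdev, new_vdev, run=subprocess.run):
    """Start a sequential attach, then wait for the resilver to complete.

    `run` is injectable so the command sequence can be checked without
    touching a real pool.
    """
    for argv in (
        sequential_attach_argv(pool, existing_vdev, new_vdev),
        wait_resilver_argv(pool),
    ):
        run(argv, check=True)
```

For example, `attach_and_wait("tank", "sda", "sdb")` would run both
commands against a pool named tank.  This sketch waits only for the
resilver itself, not for the scrub that is started afterwards.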

Sequential resilvers cannot be supported for RAIDZ, but are
compatible with the dRAID feature being developed.

As part of this change the resilver_restart_* tests were moved
into the functional/replacement directory.  Additionally, the
replacement tests were renamed and extended to verify both
resilvering and rebuilding.

Original-patch-by: Isaac Huang <he.huang@intel.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: John Poduska <jpoduska@datto.com>
Co-authored-by: Mark Maybee <mmaybee@cray.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #10349
Brian Behlendorf
2020-07-03 11:05:50 -07:00
committed by GitHub
parent 7ddb753d17
commit 9a49d3f3d3
65 changed files with 3281 additions and 362 deletions
@@ -1862,6 +1862,30 @@ queue's min_active. See the section "ZFS I/O SCHEDULER".
Default value: \fB1,000\fR.
.RE
.sp
.ne 2
.na
\fBzfs_vdev_rebuild_max_active\fR (int)
.ad
.RS 12n
Maximum sequential resilver I/Os active to each device.
See the section "ZFS I/O SCHEDULER".
.sp
Default value: \fB3\fR.
.RE
.sp
.ne 2
.na
\fBzfs_vdev_rebuild_min_active\fR (int)
.ad
.RS 12n
Minimum sequential resilver I/Os active to each device.
See the section "ZFS I/O SCHEDULER".
.sp
Default value: \fB1\fR.
.RE
.sp
.ne 2
.na
@@ -2707,6 +2731,18 @@ Include cache hits in read history
Use \fB1\fR for yes and \fB0\fR for no (default).
.RE
.sp
.ne 2
.na
\fBzfs_rebuild_max_segment\fR (ulong)
.ad
.RS 12n
Maximum read segment size to issue when sequentially resilvering a
top-level vdev.
.sp
Default value: \fB1,048,576\fR.
.RE
.sp
.ne 2
.na
@@ -255,6 +255,35 @@ This feature becomes \fBactive\fR when a bookmark is created and will be
returned to the \fBenabled\fR state when all bookmarks with these fields are destroyed.
.RE
.sp
.ne 2
.na
\fBdevice_rebuild\fR
.ad
.RS 4n
.TS
l l .
GUID org.openzfs:device_rebuild
READ\-ONLY COMPATIBLE yes
DEPENDENCIES none
.TE
This feature allows the \fBzpool attach\fR and \fBzpool replace\fR
subcommands to perform sequential reconstruction (instead of
healing reconstruction) when resilvering.
Sequential reconstruction resilvers a device in LBA order without immediately
verifying the checksums. Once complete, a scrub is started, which then verifies
the checksums. This approach allows full redundancy to be restored to the pool
in the minimum amount of time. This two-phase approach will take longer than a
healing resilver when the time to verify the checksums is included. However,
unless there is additional pool damage, no checksum errors should be reported
by the scrub. This feature is incompatible with raidz configurations.
This feature becomes \fBactive\fR while a sequential resilver is in progress,
and returns to \fBenabled\fR when the resilver completes.
.RE
.sp
.ne 2
.na