Add subcommand to wait for background zfs activity to complete

Currently the best way to wait for the completion of a long-running
operation in a pool, like a scrub or device removal, is to poll 'zpool
status' and parse its output, which is neither efficient nor convenient.

This change adds a 'wait' subcommand to the zpool command. When invoked,
'zpool wait' will block until a specified type of background activity
completes. Currently, this subcommand can wait for any of the following:

 - Scrubs or resilvers to complete
 - Devices to be initialized
 - Devices to be replaced
 - Devices to be removed
 - Checkpoints to be discarded
 - Background freeing to complete

For example, to wait for an in-progress scrub to complete, run:

    zpool wait -t scrub <pool>
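
For scripts that drive this from another program, the invocation can be
assembled roughly as below. This is only a sketch: the helper names
(zpool_wait_argv, wait_for) are hypothetical and not part of this change;
the flags and activity names are taken from the zpool(8) text added here.

```python
import subprocess

# Activity names as listed in the zpool(8) text added by this change.
ACTIVITIES = ("discard", "free", "initialize", "replace",
              "remove", "resilver", "scrub")


def zpool_wait_argv(pool, activities=(), interval=None,
                    scripted=False, parsable=False):
    """Build an argv list for 'zpool wait' (hypothetical helper)."""
    unknown = [a for a in activities if a not in ACTIVITIES]
    if unknown:
        raise ValueError("unknown activity: " + ", ".join(unknown))
    argv = ["zpool", "wait"]
    if scripted:
        argv.append("-H")          # no headers, tab-separated fields
    if parsable:
        argv.append("-p")          # exact numbers
    if activities:
        argv += ["-t", ",".join(activities)]
    argv.append(pool)
    if interval is not None:
        argv.append(str(interval))  # non-integral intervals are accepted
    return argv


def wait_for(pool, *activities):
    """Block until the given activities cease; needs a zpool with 'wait'."""
    return subprocess.run(zpool_wait_argv(pool, activities), check=True)
```

Calling wait_for("tank", "scrub") would then block the caller in the same
way as the command line above, assuming a pool named "tank" exists.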

This also adds a -w flag to the attach, checkpoint, initialize, replace,
remove, and scrub subcommands. When used, this flag makes the operations
kicked off by these subcommands synchronous instead of asynchronous.
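
As a rough mental model, assuming the man page descriptions in this change,
passing -w behaves much like starting the operation and then running
'zpool wait' for the matching activity. The mapping and the helper below
are illustrative only and are not code from this change; the two-step form
is not guaranteed to behave identically to -w in every case.

```python
# Which 'zpool wait' activity each -w flag corresponds to, inferred from
# the option descriptions in zpool(8). Illustrative, not authoritative.
WAIT_ACTIVITY = {
    "attach": "resilver",      # waits for new_device to finish resilvering
    "checkpoint": "discard",   # only meaningful together with -d
    "initialize": "initialize",
    "replace": "replace",
    "remove": "remove",
    "scrub": "scrub",
}


def two_step(subcommand, pool, options=(), operands=()):
    """Return (start_argv, wait_argv) approximating '<subcommand> -w'.

    Hypothetical helper: 'options' precede the pool name and 'operands'
    (devices) follow it, matching the synopsis lines in zpool(8).
    """
    activity = WAIT_ACTIVITY[subcommand]
    start = ["zpool", subcommand, *options, pool, *operands]
    wait = ["zpool", "wait", "-t", activity, pool]
    return start, wait
```

For instance, two_step("checkpoint", "tank", options=["-d"]) yields the
argv lists for 'zpool checkpoint -d tank' followed by
'zpool wait -t discard tank'.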

This functionality is implemented using a new ioctl. The type of
activity to wait for is provided as input to the ioctl, and the ioctl
blocks until all activity of that type has completed. An ioctl was used
over other methods of kernel-userspace communication primarily for the
sake of portability.

Porting Notes:
This is ported from Delphix OS change DLPX-44432. The following changes
were made while porting:

 - Added ZoL-style ioctl input declaration.
 - Reorganized error handling in zpool_initialize in libzfs to integrate
   better with changes made for TRIM support.
 - Fixed check for whether a checkpoint discard is in progress.
   Previously it also waited if the pool had a checkpoint, instead of
   just if a checkpoint was being discarded.
 - Exposed zfs_initialize_chunk_size as a ZoL-style tunable.
 - Updated additional existing tests, which don't exist in Delphix OS,
   to make use of the new 'zpool wait' functionality.
 - Used existing ZoL tunable zfs_scan_suspend_progress, together with
   zinject, in place of a new tunable zfs_scan_max_blks_per_txg.
 - Added support for a non-integral interval argument to zpool wait.

Future work:
ZoL has support for trimming devices, which Delphix OS does not. In the
future, 'zpool wait' could be extended to add the ability to wait for
trim operations to complete.

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: John Gallagher <john.gallagher@delphix.com>
Closes #9162
John Gallagher
2019-09-13 18:09:06 -07:00
committed by Brian Behlendorf
parent 7238cbd4d3
commit e60e158eff
61 changed files with 2662 additions and 144 deletions
+12
@@ -1968,6 +1968,18 @@ Pattern written to vdev free space by \fBzpool initialize\fR.
Default value: \fB16,045,690,984,833,335,022\fR (0xdeadbeefdeadbeee).
.RE
.sp
.ne 2
.na
\fBzfs_initialize_chunk_size\fR (ulong)
.ad
.RS 12n
Size of writes used by \fBzpool initialize\fR.
This option is used by the test suite to facilitate testing.
.sp
Default value: \fB1,048,576\fR
.RE
.sp
.ne 2
.na
+92 -9
@@ -27,7 +27,7 @@
.\" Copyright 2017 Nexenta Systems, Inc.
.\" Copyright (c) 2017 Open-E, Inc. All Rights Reserved.
.\"
.Dd May 2, 2019
.Dd August 9, 2019
.Dt ZPOOL 8 SMM
.Os Linux
.Sh NAME
@@ -43,12 +43,12 @@
.Ar pool vdev Ns ...
.Nm
.Cm attach
.Op Fl f
.Op Fl fw
.Oo Fl o Ar property Ns = Ns Ar value Oc
.Ar pool device new_device
.Nm
.Cm checkpoint
.Op Fl d, -discard
.Op Fl d, -discard Oo Fl w, -wait Oc
.Ar pool
.Nm
.Cm clear
@@ -117,6 +117,7 @@
.Nm
.Cm initialize
.Op Fl c | Fl s
.Op Fl w
.Ar pool
.Op Ar device Ns ...
.Nm
@@ -155,7 +156,7 @@
.Ar pool
.Nm
.Cm remove
.Op Fl np
.Op Fl npw
.Ar pool Ar device Ns ...
.Nm
.Cm remove
@@ -163,7 +164,7 @@
.Ar pool
.Nm
.Cm replace
.Op Fl f
.Op Fl fw
.Oo Fl o Ar property Ns = Ns Ar value Oc
.Ar pool Ar device Op Ar new_device
.Nm
@@ -172,6 +173,7 @@
.Nm
.Cm scrub
.Op Fl s | Fl p
.Op Fl w
.Ar pool Ns ...
.Nm
.Cm trim
@@ -211,6 +213,13 @@
.Op Fl V Ar version
.Fl a Ns | Ns Ar pool Ns ...
.Nm
.Cm wait
.Op Fl Hp
.Op Fl T Sy u Ns | Ns Sy d
.Op Fl t Ar activity Ns Oo , Ns Ar activity Ns Oc Ns ...
.Ar pool
.Op Ar interval
.Nm
.Cm version
.Sh DESCRIPTION
The
@@ -988,7 +997,7 @@ supported at the moment is ashift.
.It Xo
.Nm
.Cm attach
.Op Fl f
.Op Fl fw
.Oo Fl o Ar property Ns = Ns Ar value Oc
.Ar pool device new_device
.Xc
@@ -1019,6 +1028,10 @@ Forces use of
.Ar new_device ,
even if it appears to be in use.
Not all devices can be overridden in this manner.
.It Fl w
Waits until
.Ar new_device
has finished resilvering before returning.
.It Fl o Ar property Ns = Ns Ar value
Sets the given pool properties. See the
.Sx Properties
@@ -1028,7 +1041,7 @@ supported at the moment is ashift.
.It Xo
.Nm
.Cm checkpoint
.Op Fl d, -discard
.Op Fl d, -discard Oo Fl w, -wait Oc
.Ar pool
.Xc
Checkpoints the current state of
@@ -1057,6 +1070,8 @@ command reports how much space the checkpoint takes from the pool.
.It Fl d, -discard
Discards an existing checkpoint from
.Ar pool .
.It Fl w, -wait
Waits until the checkpoint has finished being discarded before returning.
.El
.It Xo
.Nm
@@ -1687,6 +1702,7 @@ Will also set -o cachefile=none when not explicitly specified.
.Nm
.Cm initialize
.Op Fl c | Fl s
.Op Fl w
.Ar pool
.Op Ar device Ns ...
.Xc
@@ -1708,6 +1724,8 @@ initialized, the command will fail and no suspension will occur on any device.
Initializing can then be resumed by running
.Nm zpool Cm initialize
with no flags on the relevant target devices.
.It Fl w, -wait
Wait until the devices have finished initializing before returning.
.El
.It Xo
.Nm
@@ -2049,7 +2067,7 @@ result in partially resilvered devices unless a second scrub is performed.
.It Xo
.Nm
.Cm remove
.Op Fl np
.Op Fl npw
.Ar pool Ar device Ns ...
.Xc
Removes the specified device from the pool.
@@ -2091,6 +2109,8 @@ This is nonzero only for top-level vdevs.
Used in conjunction with the
.Fl n
flag, displays numbers as parsable (exact) values.
.It Fl w
Waits until the removal has completed before returning.
.El
.It Xo
.Nm
@@ -2102,7 +2122,7 @@ Stops and cancels an in-progress removal of a top-level vdev.
.It Xo
.Nm
.Cm replace
.Op Fl f
.Op Fl fw
.Op Fl o Ar property Ns = Ns Ar value
.Ar pool Ar device Op Ar new_device
.Xc
@@ -2144,11 +2164,14 @@ Sets the given pool properties. See the
section for a list of valid properties that can be set.
The only property supported at the moment is
.Sy ashift .
.It Fl w
Waits until the replacement has completed before returning.
.El
.It Xo
.Nm
.Cm scrub
.Op Fl s | Fl p
.Op Fl w
.Ar pool Ns ...
.Xc
Begins a scrub or resumes a paused scrub.
@@ -2198,6 +2221,8 @@ checkpointed to disk.
To resume a paused scrub issue
.Nm zpool Cm scrub
again.
.It Fl w
Wait until scrub has completed before returning.
.El
.It Xo
.Nm
@@ -2480,6 +2505,64 @@ supported legacy version number.
Displays the software version of the
.Nm
userland utility and the zfs kernel module.
.It Xo
.Nm
.Cm wait
.Op Fl Hp
.Op Fl T Sy u Ns | Ns Sy d
.Op Fl t Ar activity Ns Oo , Ns Ar activity Ns Oc Ns ...
.Ar pool
.Op Ar interval
.Xc
Waits until all background activity of the given types has ceased in the given
pool.
The activity could cease because it has completed, or because it has been
paused or canceled by a user, or because the pool has been exported or
destroyed.
If no activities are specified, the command waits until background activity of
every type listed below has ceased.
If there is no activity of the given types in progress, the command returns
immediately.
.Pp
These are the possible values for
.Ar activity ,
along with what each one waits for:
.Bd -literal
discard Checkpoint to be discarded
free 'freeing' property to become 0
initialize All initializations to cease
replace All device replacements to cease
remove Device removal to cease
resilver Resilver to cease
scrub Scrub to cease
.Ed
.Pp
If an
.Ar interval
is provided, the amount of work remaining, in bytes, for each activity is
printed every
.Ar interval
seconds.
.Bl -tag -width Ds
.It Fl H
Scripted mode.
Do not display headers, and separate fields by a single tab instead of arbitrary
space.
.It Fl p
Display numbers in parsable (exact) values.
.It Fl T Sy u Ns | Ns Sy d
Display a time stamp.
Specify
.Sy u
for a printed representation of the internal representation of time.
See
.Xr time 2 .
Specify
.Sy d
for standard date format.
See
.Xr date 1 .
.El
.El
.Sh EXIT STATUS
The following exit values are returned: