Remove races from scrub / resilver tests

Currently, several tests in the ZFS Test Suite that attempt to
test scrub and resilver behavior occasionally fail. A big reason
for this is that these tests use a combination of zinject and
zfs_scan_vdev_limit to attempt to slow these operations enough
to verify their test commands. This method works most of the time,
but provides no guarantees and leads to flaky behavior. This patch
adds a new tunable, zfs_scan_suspend_progress, that ensures that
scans make no progress, guaranteeing that tests can be run without
racing.

This patch also replaces zfs_remove_max_bytes_pause with
zfs_removal_suspend_progress, an int flag that mirrors the new
scan tunable. This keeps the two similar tunables consistent and
ensures that the tunable will not misbehave on 32-bit systems.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #8111
Tom Caputi
2018-11-28 13:12:08 -05:00
committed by Brian Behlendorf
parent 00369f3338
commit cef48f14da
14 changed files with 65 additions and 83 deletions
@@ -121,7 +121,7 @@ int vdev_removal_max_span = 32 * 1024;
* This is used by the test suite so that it can ensure that certain
* actions happen while in the middle of a removal.
*/
-unsigned long zfs_remove_max_bytes_pause = -1UL;
+int zfs_removal_suspend_progress = 0;
#define VDEV_REMOVAL_ZAP_OBJS "lzap"
@@ -1449,14 +1449,14 @@ spa_vdev_remove_thread(void *arg)
/*
* This delay will pause the removal around the point
- * specified by zfs_remove_max_bytes_pause. We do this
+ * specified by zfs_removal_suspend_progress. We do this
* solely from the test suite or during debugging.
*/
uint64_t bytes_copied =
spa->spa_removing_phys.sr_copied;
for (int i = 0; i < TXG_SIZE; i++)
bytes_copied += svr->svr_bytes_done[i];
-while (zfs_remove_max_bytes_pause <= bytes_copied &&
+while (zfs_removal_suspend_progress &&
!svr->svr_thread_exit)
delay(hz);
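
The new pause loop in this hunk can be sketched outside the kernel as follows. This is a minimal illustration, not the module code: `suspend_progress` and `thread_exit` are hypothetical stand-ins for `zfs_removal_suspend_progress` and `svr->svr_thread_exit`, and `demo_delay` is an invented hook that replaces `delay(hz)` and simulates a test clearing the tunable after a few ticks.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the zfs_removal_suspend_progress tunable. */
static volatile int suspend_progress = 0;
/* Hypothetical stand-in for svr->svr_thread_exit. */
static volatile bool thread_exit = false;

/* Number of delay calls after which the demo hook clears the flag. */
static unsigned demo_resume_after = 0;

/*
 * Invented replacement for delay(hz): instead of sleeping, it counts
 * down and clears suspend_progress, as a test would when it is done.
 */
static void
demo_delay(void)
{
	if (demo_resume_after > 0 && --demo_resume_after == 0)
		suspend_progress = 0;
}

/*
 * Same shape as the loop in the diff: wait for as long as the flag is
 * set and no exit has been requested. Returns how many times it waited.
 */
static unsigned
wait_while_suspended(void (*delay_once)(void))
{
	unsigned ticks = 0;

	while (suspend_progress && !thread_exit) {
		delay_once();
		ticks++;
	}
	return (ticks);
}
```

Because the condition no longer depends on `bytes_copied`, the wait begins as soon as the flag is set and ends only when the flag is cleared or the thread is asked to exit.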
@@ -2178,8 +2178,8 @@ MODULE_PARM_DESC(vdev_removal_max_span,
"Largest span of free chunks a remap segment can span");
/* BEGIN CSTYLED */
-module_param(zfs_remove_max_bytes_pause, ulong, 0644);
-MODULE_PARM_DESC(zfs_remove_max_bytes_pause,
+module_param(zfs_removal_suspend_progress, int, 0644);
+MODULE_PARM_DESC(zfs_removal_suspend_progress,
"Pause device removal after this many bytes are copied "
"(debug use only - causes removal to hang)");
/* END CSTYLED */
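
The commit message says the old tunable could misbehave on 32-bit systems but does not spell out how. One plausible reading, sketched below under that assumption, is that `unsigned long` is 32 bits wide on such builds, so the "disabled" default of `-1UL` is only about 4 GiB and the old comparison against the 64-bit `bytes_copied` becomes true once that much data has been copied. The helper names `old_pause_check_32` and `new_pause_check` are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Old-style check, with the threshold forced to 32 bits to mimic
 * `unsigned long` on a 32-bit build. The threshold is widened to
 * 64 bits for the comparison, so UINT32_MAX is not "never pause".
 */
static bool
old_pause_check_32(uint32_t max_bytes_pause, uint64_t bytes_copied)
{
	return (max_bytes_pause <= bytes_copied);
}

/*
 * New-style check: a plain int flag whose meaning does not depend on
 * the width of any integer type or on how much has been copied.
 */
static bool
new_pause_check(int suspend_progress)
{
	return (suspend_progress != 0);
}
```

With the threshold at `UINT32_MAX`, the old check stays false for small copies but turns true for a removal that has copied 5 GiB, whereas the flag-based check only follows the value the test suite sets.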