zed: mark disks as REMOVED when they are removed

ZED does not take any action for disk removal events if there is no
spare VDEV available. Added zpool_vdev_remove_wanted() in libzfs
and vdev_remove_wanted() in vdev.c so that ZED can remove the VDEV
when it receives a removal event. As a result, if ZED is running
and a disk is removed, the disk is properly marked as REMOVED.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #13797
Ameer Hamza
2022-09-28 21:48:46 +05:00
committed by GitHub
parent eb9bec0a5d
commit 55c12724d3
24 changed files with 395 additions and 51 deletions
+1 -1
@@ -1884,7 +1884,7 @@ function wait_hotspare_state # pool disk state timeout
#
# Return 0 is pool/disk matches expected state, 1 otherwise
#
-function check_vdev_state # pool disk state{online,offline,unavail}
+function check_vdev_state # pool disk state{online,offline,unavail,removed}
{
typeset pool=$1
typeset disk=${2#*$DEV_DSKDIR/}
@@ -24,29 +24,28 @@
#
# DESCRIPTION:
-# Testing Fault Management Agent ZED Logic - Physically removed device is
-# made unavail and onlined when reattached
+# Testing Fault Management Agent ZED Logic - Physically detached device is
+# made removed and onlined when reattached
#
# STRATEGY:
# 1. Create a pool
# 2. Simulate physical removal of one device
-# 3. Verify the device is unavailable
+# 3. Verify the device is removed when detached
# 4. Reattach the device
# 5. Verify the device is onlined
# 6. Repeat the same tests with a spare device:
# zed will use the spare to handle the removed data device
# 7. Repeat the same tests again with a faulted spare device:
-# the removed data device should be unavailable
+# the removed data device should be removed
#
# NOTE: the use of 'block_device_wait' throughout the test helps avoid race
# conditions caused by mixing creation/removal events from partitioning the
# disk (zpool create) and events from physically removing it (remove_disk).
#
-# NOTE: the test relies on 'zpool sync' to prompt the kmods to transition a
-# vdev to the unavailable state. The ZED does receive a removal notification
-# but only relies on it to activate a hot spare. Additional work is planned
-# to extend an existing ioctl interface to allow the ZED to transition the
-# vdev in to a removed state.
+# NOTE: the test relies on ZED to transit state to removed on device removed
+# event. The ZED does receive a removal notification but only relies on it to
+# activate a hot spare. Additional work is planned to extend an existing ioctl
+# interface to allow the ZED to transition the vdev in to a removed state.
#
verify_runnable "both"
@@ -103,8 +102,8 @@ do
log_must mkfile 1m $mntpnt/file
sync_pool $TESTPOOL
-# 3. Verify the device is unavailable.
-log_must wait_vdev_state $TESTPOOL $removedev "UNAVAIL"
+# 3. Verify the device is removed.
+log_must wait_vdev_state $TESTPOOL $removedev "REMOVED"
# 4. Reattach the device
insert_disk $removedev
@@ -136,7 +135,7 @@ do
# 3. Verify the device is handled by the spare.
log_must wait_hotspare_state $TESTPOOL $sparedev "INUSE"
-log_must wait_vdev_state $TESTPOOL $removedev "UNAVAIL"
+log_must wait_vdev_state $TESTPOOL $removedev "REMOVED"
# 4. Reattach the device
insert_disk $removedev
@@ -170,8 +169,8 @@ do
log_must mkfile 1m $mntpnt/file
sync_pool $TESTPOOL
-# 4. Verify the device is unavailable
-log_must wait_vdev_state $TESTPOOL $removedev "UNAVAIL"
+# 4. Verify the device is removed
+log_must wait_vdev_state $TESTPOOL $removedev "REMOVED"
# 5. Reattach the device
insert_disk $removedev
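
For readers unfamiliar with the test helpers, the check that the hunks above switch from "UNAVAIL" to "REMOVED" amounts to polling a vdev's state until it matches or a timeout expires. The following is a minimal sketch of that idea, not the test suite's implementation: vdev_state_from_status, wait_for_vdev_state, and sample_status are hypothetical names introduced here, and the canned text only approximates the layout of `zpool status` output so that the sketch runs without a pool.

```shell
#!/bin/sh
# Hypothetical helper (not part of the ZFS test suite): print the STATE
# column for one vdev, given `zpool status`-style text on stdin.
vdev_state_from_status() {
	awk -v dev="$1" '$1 == dev { print $2; exit }'
}

# Hypothetical poll loop: succeed once the vdev reports the wanted state,
# fail after roughly <timeout> seconds. Arguments after the third are the
# command that produces the status text.
wait_for_vdev_state() {
	dev=$1
	want=$2
	timeout=$3
	shift 3
	i=0
	while [ "$i" -lt "$timeout" ]; do
		state=$("$@" | vdev_state_from_status "$dev")
		[ "$state" = "$want" ] && return 0
		sleep 1
		i=$((i + 1))
	done
	return 1
}

# Canned sample standing in for real `zpool status` output (an assumption
# for illustration only; no pool is needed to run this).
sample_status() {
	cat <<'EOF'
  pool: testpool
 state: DEGRADED
config:

	NAME        STATE     READ WRITE CKSUM
	testpool    DEGRADED     0     0     0
	  mirror-0  DEGRADED     0     0     0
	    sda     ONLINE       0     0     0
	    sdb     REMOVED      0     0     0
EOF
}

wait_for_vdev_state sdb REMOVED 5 sample_status && echo "sdb is REMOVED"
```

On a system with a real pool, the last argument could be replaced with something like `zpool status testpool`, assuming the vdev name given matches the first column of that output.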