zvol: ensure device minors are properly cleaned up

Currently, if a minor is in use when we try to remove it, we'll skip it
and never come back to it again. Since the zvol state is hung off the
minor in the kernel, this can get us into weird situations if something
tries to use it after the removal fails. It's even worse at pool export,
as there's now a vestigial zvol state with no pool under it. It's
weirder again if the pool is subsequently reimported, as the zvol code
(reasonably) assumes the zvol state has been properly setup, when it's
actually left over from the previous import of the pool.

This commit attempts to tackle that by setting a flag on the zvol if its
minor can't be removed, and then checking that flag when a request is
made and rejecting it, thus stopping new work coming in.

The flag also causes a condvar to be signaled when the last client
finishes. For the case where a single minor is being removed (eg
changing volmode), it will wait for this signal before proceeding.
Meanwhile, when removing all minors, a background task is created for
each minor that couldn't be removed on the spot, and those tasks then
wake and clean up.

Since any new tasks are queued on to the pool's spa_zvol_taskq,
spa_export_common() will continue to wait at export until all minors are
removed.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #14872
Closes #16364
This commit is contained in:
Rob Norris
2024-07-18 13:24:05 +10:00
committed by Tony Hutter
parent f3a9746f21
commit cf80a803d5
4 changed files with 122 additions and 26 deletions
+5
View File
@@ -18,6 +18,9 @@
*
* CDDL HEADER END
*/
/*
* Copyright (c) 2024, Klara, Inc.
*/
#ifndef _SYS_ZVOL_IMPL_H
#define _SYS_ZVOL_IMPL_H
@@ -27,6 +30,7 @@
#define ZVOL_RDONLY (1<<0) /* zvol is readonly (writes rejected) */
#define ZVOL_WRITTEN_TO (1<<1) /* zvol has been written to (needs flush) */
#define ZVOL_EXCL (1<<2) /* zvol has O_EXCL client right now */
#define ZVOL_REMOVING (1<<3) /* zvol waiting to remove minor */
/*
* The in-core state of each volume.
@@ -50,6 +54,7 @@ typedef struct zvol_state {
kmutex_t zv_state_lock; /* protects zvol_state_t */
atomic_t zv_suspend_ref; /* refcount for suspend */
krwlock_t zv_suspend_lock; /* suspend lock */
kcondvar_t zv_removing_cv; /* ready to remove minor */
struct zvol_state_os *zv_zso; /* private platform state */
} zvol_state_t;