Prevent race condition in dnode_dest (#10101)

dnode_special_close() waits for the dn_holds refcount to drop to zero
without holding dn_mtx. dnode_rele_and_unlock() performs the final
remove from dn_holds while holding dn_mtx:

	refs = zfs_refcount_remove(&dn->dn_holds, tag);
	mutex_exit(&dn->dn_mtx);

So there is a race window between the remove and the point where
dn_mtx is dropped. During that window, dnode_destroy() can be called,
which ends up in dnode_dest() calling mutex_destroy() on a lock that is
still held, causing a panic.

This change adds a condvar, dn_nodnholds, used to wait for the final
dnode_rele_and_unlock() to release dn_mtx before dnode_destroy() is
called.
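
The wait-under-the-lock pattern described above can be sketched in
userspace with POSIX threads. This is an illustrative analogue only, not
the code from this change: struct toy_node and every toy_* function are
hypothetical names, and pthread mutexes and condvars stand in for the
kernel's kmutex_t and kcondvar_t.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for a dnode: only the fields the race involves. */
struct toy_node {
	pthread_mutex_t	mtx;		/* plays the role of dn_mtx */
	pthread_cond_t	noholds;	/* plays the role of dn_nodnholds */
	int		holds;		/* plays the role of dn_holds */
};

static void
toy_init(struct toy_node *n, int holds)
{
	pthread_mutex_init(&n->mtx, NULL);
	pthread_cond_init(&n->noholds, NULL);
	n->holds = holds;
}

/* Drop one hold and unlock. The caller must already hold n->mtx. */
static void
toy_rele_and_unlock(struct toy_node *n)
{
	int refs = --n->holds;

	/* Wake the closer while the mutex is still held. */
	if (refs == 0)
		pthread_cond_broadcast(&n->noholds);
	pthread_mutex_unlock(&n->mtx);
}

/*
 * Wait for the last hold to go away *under the mutex*. Getting past the
 * wait implies the final releaser has already dropped the mutex, so
 * destroying it afterwards is safe.
 */
static void
toy_close(struct toy_node *n)
{
	pthread_mutex_lock(&n->mtx);
	while (n->holds > 0)
		pthread_cond_wait(&n->noholds, &n->mtx);
	pthread_mutex_unlock(&n->mtx);

	assert(n->holds == 0);
	pthread_cond_destroy(&n->noholds);
	pthread_mutex_destroy(&n->mtx);
}

/* Thread body: take the mutex, then release the only hold. */
static void *
toy_releaser(void *arg)
{
	struct toy_node *n = arg;

	pthread_mutex_lock(&n->mtx);
	toy_rele_and_unlock(n);
	return (NULL);
}

/* Runs one releaser against one closer; returns 0 on success. */
static int
toy_demo(void)
{
	struct toy_node n;
	pthread_t t;

	toy_init(&n, 1);
	if (pthread_create(&t, NULL, toy_releaser, &n) != 0)
		return (-1);
	toy_close(&n);
	pthread_join(t, NULL);
	return (0);
}
```

Calling toy_demo() from a main() and building with -pthread exercises
both paths. The key design point in the sketch is that the closer checks
the hold count only while holding the mutex, so it cannot observe zero
holds while the releaser still owns the lock.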

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: John Poduska <jpoduska@datto.com>
Closes #7814
Closes #10101
John Poduska
2020-03-12 13:25:56 -04:00
committed by GitHub
parent 1e9231ada8
commit e6b28efccc
3 changed files with 15 additions and 6 deletions
+1
@@ -164,6 +164,7 @@ extern "C" {
  * dn_dirty_txg
  * dd_assigned_tx
  * dn_notxholds
+ * dn_nodnholds
  * dn_dirtyctx
  * dn_dirtyctx_firstset
  * (dn_phys copy fields?)
+1
@@ -332,6 +332,7 @@ struct dnode {
 	uint64_t dn_assigned_txg;
 	uint64_t dn_dirty_txg;		/* txg dnode was last dirtied */
 	kcondvar_t dn_notxholds;
+	kcondvar_t dn_nodnholds;
 	enum dnode_dirtycontext dn_dirtyctx;
 	void *dn_dirtyctx_firstset;	/* dbg: contents meaningless */