Fix dnode allocation race

When performing concurrent object allocations using the new
multi-threaded allocator and large dnodes it's possible to
allocate overlapping large dnodes.

This case should have been handled by detecting an error
returned by dnode_hold_impl().  But that logic only checked
the returned dnp was not-NULL, and the dnp variable was not
reset to NULL when retrying.  Resolve this issue by properly
checking the return value of dnode_hold_impl().

Additionally, it was possible that dnode_hold_impl() would
misreport a dnode as free when it was in fact in use.  This
could occurs for two reasons:

* The per-slot zrl_lock must be held over the entire critical
  section which includes the alloc/free until the new dnode
  is assigned to children_dnodes.  Additionally, all of the
  zrl_lock's in the range must be held to protect moving
  dnodes.

* The dn->dn_ot_type cannot be solely relied upon to check
  the type.  When allocating a new dnode its type will be
  DMU_OT_NONE after dnode_create().  Only latter when
  dnode_allocate() is called will it transition to the new
  type.  This means there's a window when allocating where
  it can mistaken for a free dnode.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Ned Bass <bass6@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6414
Closes #6439
This commit is contained in:
Brian Behlendorf
2017-08-08 08:38:53 -07:00
committed by Tony Hutter
parent ef605a5517
commit 751941e248
8 changed files with 136 additions and 68 deletions
+1 -1
View File
@@ -364,7 +364,7 @@ tests = ['async_destroy_001_pos']
[tests/functional/features/large_dnode]
tests = ['large_dnode_001_pos', 'large_dnode_002_pos', 'large_dnode_003_pos',
'large_dnode_004_neg', 'large_dnode_005_pos', 'large_dnode_006_pos',
'large_dnode_007_neg']
'large_dnode_007_neg', 'large_dnode_008_pos']
[tests/functional/grow_pool]
tests = ['grow_pool_001_pos']
@@ -8,4 +8,5 @@ dist_pkgdata_SCRIPTS = \
large_dnode_004_neg.ksh \
large_dnode_005_pos.ksh \
large_dnode_006_pos.ksh \
large_dnode_007_neg.ksh
large_dnode_007_neg.ksh \
large_dnode_008_pos.ksh
@@ -0,0 +1,60 @@
#!/bin/ksh -p
#
# CDDL HEADER START
#
# The contents of this file are subject to the terms of the
# Common Development and Distribution License (the "License").
# You may not use this file except in compliance with the License.
#
# You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
# or http://www.opensolaris.org/os/licensing.
# See the License for the specific language governing permissions
# and limitations under the License.
#
# When distributing Covered Code, include this CDDL HEADER in each
# file and include the License file at usr/src/OPENSOLARIS.LICENSE.
# If applicable, add the following below this CDDL HEADER, with the
# fields enclosed by brackets "[]" replaced with your own identifying
# information: Portions Copyright [yyyy] [name of copyright owner]
#
# CDDL HEADER END
#
#
# Copyright (c) 2017 by Lawrence Livermore National Security, LLC.
# Use is subject to license terms.
#
. $STF_SUITE/include/libtest.shlib
#
# DESCRIPTION:
# Run many xattrtests on a dataset with large dnodes and xattr=sa to
# stress concurrent allocation of large dnodes.
#
TEST_FS=$TESTPOOL/large_dnode
verify_runnable "both"
function cleanup
{
datasetexists $TEST_FS && log_must zfs destroy $TEST_FS
}
log_onexit cleanup
log_assert "xattrtest runs concurrently on dataset with large dnodes"
log_must zfs create $TEST_FS
log_must zfs set dnsize=auto $TEST_FS
log_must zfs set xattr=sa $TEST_FS
for ((i=0; i < 100; i++)); do
dir="/$TEST_FS/dir.$i"
log_must mkdir "$dir"
log_must eval "xattrtest -R -r -y -x 1 -f 1024 -k -p $dir &"
done
log_must wait
log_pass