Don't assert mg_initialized due to device addition race

During device removal stress tests, we noticed that we were tripping 
the assertion that mg_initialized was true. After investigation, it was 
determined that the mg in question was the embedded log metaslab 
group for a newly added vdev; the normal mg had been initialized (by 
metaslab_sync_reassess, via vdev_sync_done). However, because the spa 
config alloc lock is not held as writer across both calls to 
metaslab_sync_reassess, it is possible for an allocation to happen 
between the two metaslab_groups being initialized. Because the metaslab 
code doesn't check the group in question, just the vdev's main mg, it 
is possible to get past the initial check in vdev_allocatable and 
later fail due to the assertion.
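
For reference, a condensed sketch of the relevant part of 
vdev_sync_done() (trimmed for illustration, not the verbatim code):

    void
    vdev_sync_done(vdev_t *vd, uint64_t txg)
    {
            metaslab_t *msp;
            boolean_t reassess = !txg_list_empty(&vd->vdev_ms_list,
                TXG_CLEAN(txg));

            while ((msp = txg_list_remove(&vd->vdev_ms_list,
                TXG_CLEAN(txg))) != NULL)
                    metaslab_sync_done(msp, txg);

            if (reassess) {
                    metaslab_sync_reassess(vd->vdev_mg);
                    /*
                     * SCL_ALLOC is not held as writer across these
                     * two calls, so a concurrent allocation can run
                     * right here and observe an initialized vdev_mg
                     * but a still-uninitialized vdev_log_mg.
                     */
                    if (vd->vdev_log_mg != NULL)
                            metaslab_sync_reassess(vd->vdev_log_mg);
            }
    }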

We simply remove the assertions. We could also consider locking the 
ALLOC lock around the reassess calls in vdev_sync_done, but that risks 
deadlocks. We could check the actual target mg in vdev_allocatable, 
but that risks racing with a passivation that comes in after that 
check but before the assertion. We still won't be able to actually 
allocate from the metaslab group if no metaslabs are ready, so this 
change shouldn't break anything.
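
To see why this is safe, consider a simplified sketch of the rotor 
loop in metaslab_alloc_dva() (condensed; gang and error handling 
omitted):

    uint64_t offset = metaslab_group_alloc(mg, zal, asize, txg,
        !try_hard, dva, d, allocator, try_hard);
    if (offset != -1ULL) {
            /* Success: encode the allocation into the DVA. */
            DVA_SET_VDEV(&dva[d], vd->vdev_id);
            DVA_SET_OFFSET(&dva[d], offset);
            DVA_SET_ASIZE(&dva[d], asize);
            return (0);
    }
    /*
     * -1ULL covers every case in which the group produced no usable
     * metaslab -- including a group whose mg_initialized is still
     * B_FALSE -- so the rotor simply advances to the next group
     * rather than tripping an assertion.
     */
    mg = mg->mg_next;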

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Wilson <george.wilson@delphix.com>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #15818
commit ab653603f8 (parent acc7cd8e99)
Author:     Paul Dagnelie <pcd@delphix.com>
AuthorDate: 2024-01-29 10:36:42 -08:00
Commit:     Brian Behlendorf <behlendorf1@llnl.gov>

@@ -5061,7 +5061,6 @@ metaslab_group_alloc(metaslab_group_t *mg, zio_alloc_list_t *zal,
     int allocator, boolean_t try_hard)
 {
 	uint64_t offset;
-	ASSERT(mg->mg_initialized);
 
 	offset = metaslab_group_alloc_normal(mg, zal, asize, txg, want_unique,
 	    dva, d, allocator, try_hard);
@@ -5212,8 +5211,6 @@ top:
 			goto next;
 		}
 
-		ASSERT(mg->mg_initialized);
-
 		/*
 		 * Avoid writing single-copy data to an unhealthy,
 		 * non-redundant vdev, unless we've already tried all