mirror_zfs/module
Serapheim Dimitropoulos 793c958f6f
Initialize metaslab range trees in metaslab_init
= Motivation

We've noticed several zloop crashes within Delphix generated
due to the following sequence of events:

- A device gets expanded and new metaslabas are allocated for
  it. These metaslabs go through `metaslab_init()` but haven't
  gone through `metaslab_sync_done()` yet. This meas that the
  only range tree that's actually set is the `ms_allocatable`.
  All the others are NULL.

- A vdev_initialization is issues and `vdev_initialize_thread`
  starts processing one of these new metaslabs of the expanded
  vdev.

- As part of `vdev_initialize_calculate_progress()` we call
  into `metaslab_load()` and `metaslab_load_impl()` which
  in turn tries to dereference the metaslabs trees that
  are still NULL and therefore we crash.

The same failure can come up from the `vdev_trim` code paths.

= This Patch

We considered the following solutions to deal with this issue:

[A] Add logic to `vdev_initialize/trim` to skip those new
    metaslabs. We decided against this as it would be good
    to avoid exposing this lower-level detail to higer-level
    operations.

[B] Have `metaslab_load_impl()` return early for new metaslabs
    and thus never touch those range_trees that are NULL at
    that time. This seemed more of a work-around for the bug
    and not a clear-cut solution.

[C] Refactor our logic so all metaslabs have their range_trees
    created at the time of their creatin in `metaslab_init()`.

In this patch we decided to go with [C] because:

(1) It doesn't expose more metaslab details to higher level
    operations such as vdev initialize and trim.

(2) The current behavior of creating the range trees lazily
    in `metaslab_sync_done()` is unnecessarily complicated.

(3) Always initializing the metaslab range_trees makes other
    parts of the codebase cleaner. For example, we used to
    use `ms_freed` as the reference value for knowing whether
    all the range_trees have been initialized. Now we no
    longer need to do that check in most places (and in the
    few that we do we use the `ms_new` boolean field now
    which is more readable).

= Side Changes

Probably due to a mismerge we set `ms_loaded` to `B_TRUE` twice
in `metasloab_load_impl()`. In this patch we remove the extraneous
assignment.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes #11737
2021-03-19 22:36:02 -07:00
..
avl cppcheck: remove redundant ASSERTs 2021-01-26 16:12:10 -08:00
icp Extending FreeBSD UIO Struct 2021-01-20 21:27:30 -08:00
lua cppcheck: integrete cppcheck 2021-01-26 16:12:26 -08:00
nvpair Links in Source Files 2020-09-02 09:42:12 -07:00
os Linux 5.12 update: bio_max_segs() replaces BIO_MAX_PAGES 2021-03-19 22:33:42 -07:00
spl Cleanup linux module kbuild files 2020-06-10 09:24:15 -07:00
unicode Throw const on some strings 2020-10-02 17:44:10 -07:00
zcommon Add "zstd-fast" to help options for "compression" property 2021-03-03 08:14:19 -08:00
zfs Initialize metaslab range trees in metaslab_init 2021-03-19 22:36:02 -07:00
zstd Optimize locking checks in mempool allocator 2020-11-02 12:10:07 -08:00
.gitignore Cleanup linux module kbuild files 2020-06-10 09:24:15 -07:00
Kbuild.in Add zstd support to zfs 2020-08-20 10:30:06 -07:00
Makefile.bsd Restore FreeBSD resource usage accounting 2021-02-19 22:34:33 -08:00
Makefile.in FreeBSD module --enable-debug --enable-invariants 2021-03-05 12:16:41 -08:00