Files
mirror_zfs/module/zfs
Serapheim Dimitropoulos 793c958f6f Initialize metaslab range trees in metaslab_init
= Motivation

We've noticed several zloop crashes within Delphix generated
due to the following sequence of events:

- A device gets expanded and new metaslabas are allocated for
  it. These metaslabs go through `metaslab_init()` but haven't
  gone through `metaslab_sync_done()` yet. This meas that the
  only range tree that's actually set is the `ms_allocatable`.
  All the others are NULL.

- A vdev_initialization is issues and `vdev_initialize_thread`
  starts processing one of these new metaslabs of the expanded
  vdev.

- As part of `vdev_initialize_calculate_progress()` we call
  into `metaslab_load()` and `metaslab_load_impl()` which
  in turn tries to dereference the metaslabs trees that
  are still NULL and therefore we crash.

The same failure can come up from the `vdev_trim` code paths.

= This Patch

We considered the following solutions to deal with this issue:

[A] Add logic to `vdev_initialize/trim` to skip those new
    metaslabs. We decided against this as it would be good
    to avoid exposing this lower-level detail to higer-level
    operations.

[B] Have `metaslab_load_impl()` return early for new metaslabs
    and thus never touch those range_trees that are NULL at
    that time. This seemed more of a work-around for the bug
    and not a clear-cut solution.

[C] Refactor our logic so all metaslabs have their range_trees
    created at the time of their creatin in `metaslab_init()`.

In this patch we decided to go with [C] because:

(1) It doesn't expose more metaslab details to higher level
    operations such as vdev initialize and trim.

(2) The current behavior of creating the range trees lazily
    in `metaslab_sync_done()` is unnecessarily complicated.

(3) Always initializing the metaslab range_trees makes other
    parts of the codebase cleaner. For example, we used to
    use `ms_freed` as the reference value for knowing whether
    all the range_trees have been initialized. Now we no
    longer need to do that check in most places (and in the
    few that we do we use the `ms_new` boolean field now
    which is more readable).

= Side Changes

Probably due to a mismerge we set `ms_loaded` to `B_TRUE` twice
in `metasloab_load_impl()`. In this patch we remove the extraneous
assignment.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes #11737
2021-03-19 22:36:02 -07:00
..
2020-12-10 14:09:23 -08:00
2020-08-20 10:30:06 -07:00
2019-07-26 10:54:14 -07:00
2020-07-29 16:35:33 -07:00
2020-07-29 16:35:33 -07:00
2019-06-19 09:48:12 -07:00
2020-08-17 15:25:37 -07:00
2020-06-18 12:21:18 -07:00
2019-12-03 09:51:44 -08:00
2020-11-02 11:51:12 -08:00
2020-11-10 10:37:10 -08:00
2021-02-19 22:33:15 -08:00
2020-09-04 10:29:39 -07:00
2020-06-18 12:21:25 -07:00
2020-09-04 10:34:28 -07:00
2020-06-18 12:21:25 -07:00
2017-10-11 16:54:48 -04:00
2020-06-18 12:21:25 -07:00
2020-11-13 13:51:51 -08:00
2020-11-13 13:51:51 -08:00
2020-12-10 14:09:23 -08:00
2019-06-19 09:48:12 -07:00
2020-07-29 16:35:33 -07:00
2021-01-20 21:27:30 -08:00
2020-06-18 12:21:25 -07:00
2020-06-18 12:21:25 -07:00
2020-06-18 12:21:25 -07:00
2021-02-20 20:16:50 -08:00
2019-09-02 17:56:41 -07:00
2019-10-09 10:36:03 -07:00
2021-03-19 16:22:10 -07:00
2021-01-26 19:36:51 -08:00
2021-03-19 16:22:10 -07:00
2020-11-13 13:51:51 -08:00
2020-07-29 16:35:33 -07:00
2020-06-18 12:21:25 -07:00
2019-09-02 17:56:41 -07:00
2020-10-02 17:44:10 -07:00
2020-06-18 12:20:38 -07:00
2021-03-19 16:22:10 -07:00
2021-03-12 15:51:39 -08:00
2020-10-02 17:44:10 -07:00
2021-01-20 21:27:30 -08:00
2020-06-18 12:20:38 -07:00
2020-11-13 13:51:51 -08:00
2021-03-19 16:22:10 -07:00
2020-06-18 12:21:25 -07:00
2020-06-18 12:21:18 -07:00
2020-09-03 20:09:52 -07:00