Go to file
George Wilson ffd2e15d65
zio can deadlock during device removal
When doing a device removal on a pool with gang blocks, the zio pipeline
can deadlock when trying to free blocks from a device which is being
removed with a stack similar to this:

 0xffff8ab9a13a1740 UNINTERRUPTIBLE       4
                   __schedule+0x2e5
                   __schedule+0x2e5
                   schedule+0x33
                   schedule_preempt_disabled+0xe
                   __mutex_lock.isra.12+0x2a7
                   __mutex_lock.isra.12+0x2a7
                   __mutex_lock_slowpath+0x13
                   mutex_lock+0x2c
                   free_from_removing_vdev+0x61
                   metaslab_free_impl+0xd6
                   metaslab_free_dva+0x5e
                   metaslab_free+0x196
                   zio_free_sync+0xe4
                   zio_free_gang+0x38
                   zio_gang_tree_issue+0x42
                   zio_gang_tree_issue+0xa2
                   zio_gang_issue+0x6d
                   zio_execute+0x94
                   zio_execute+0x94
                   taskq_thread+0x23b
                   kthread+0x120
                   ret_from_fork+0x1f

Since there are gang blocks we have to read the gang members as part of
the free. This can be seen with a zio dependency tree that looks like
this:

sdb> echo 0xffff900c24f8a700 | zio -rc | zio
ADDRESS                       TYPE  STAGE            WAITER
0xffff900c24f8a700            NULL  CHECKSUM_VERIFY  0xffff900ddfd31740
0xffff900c24f8c920            FREE  GANG_ASSEMBLE    -
0xffff900d93d435a0            READ  DONE

In the illustration above we are processing frees but because of gang
block we have to read the constituents blocks. Once we finish the READ
in the zio pipeline we will execute the parent. In this case the parent
is a FREE but the zio taskq is a READ and we continue to process the
pipeline leading to the stack above. In the stack above, we are blocked
waiting for the svr_lock so as a result a READ interrupt taskq thread
is now consumed. Eventually, all of the READ taskq threads end up
blocked and we're unable to complete any read requests.

In zio_notify_parent there is an optimization to continue to use
the taskq thread to exectue the parent's pipeline. To resolve the
deadlock above, we only allow this optimization if the parent's
zio type matches the child which just completed.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: George Wilson <gwilson@delphix.com>
External-issue: DLPX-80130
Closes #14236
2022-12-02 17:46:29 -08:00
.github Retire Ubuntu 18.04 CI builder 2022-11-30 10:25:57 -08:00
cmd Fix GCC 12 compilation errors 2022-11-30 13:45:53 -08:00
config autoconf: add support for openEuler 2022-12-02 17:39:48 -08:00
contrib Coverity Model Update 2022-11-29 10:00:56 -08:00
etc Correct multipathd.target to .service 2022-11-18 11:36:19 -08:00
include Fix the last two CFI callback prototype mismatches 2022-11-29 09:56:16 -08:00
lib Fix GCC 12 compilation errors 2022-11-30 13:45:53 -08:00
man zfs.4: Correct the default for zfs_vdev_async_write_max_active 2022-11-28 11:41:14 -08:00
module zio can deadlock during device removal 2022-12-02 17:46:29 -08:00
rpm rpm: add support for openEuler 2022-11-29 09:27:22 -08:00
scripts Ubuntu 22.04 integration: ShellCheck 2022-11-18 11:24:48 -08:00
tests nopwrites on dmu_sync-ed blocks can result in a panic 2022-12-02 17:45:33 -08:00
udev Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
.editorconfig Add an .editorconfig; document git whitespace settings 2020-01-27 13:32:52 -08:00
.gitignore autoconf: use include directives instead of recursing down cmd 2022-05-10 10:18:38 -07:00
.gitmodules .gitmodules: link to openzfs github repository 2021-04-12 09:37:23 -07:00
AUTHORS zfs_rename: support RENAME_* flags 2022-10-28 09:49:20 -07:00
autogen.sh Ubuntu 22.04 integration: ShellCheck 2022-11-18 11:24:48 -08:00
CODE_OF_CONDUCT.md Replace ZFS on Linux references with OpenZFS 2020-10-08 20:10:13 -07:00
configure.ac Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
copy-builtin copy-builtin: add hooks with sed/>> 2022-05-10 10:17:43 -07:00
COPYRIGHT Fix typos 2020-06-09 21:24:09 -07:00
LICENSE Update build system and packaging 2018-05-29 16:00:33 -07:00
Makefile.am Process script directory for all configs 2022-10-27 16:45:14 -07:00
META Linux 6.0 compat: META 2022-10-26 14:55:12 -07:00
NEWS Fix NEWS file 2020-08-26 21:44:41 -07:00
NOTICE Update build system and packaging 2018-05-29 16:00:33 -07:00
README.md README: Update OpenZFS website url 2022-01-06 16:25:01 -08:00
RELEASES.md Add RELEASES.md file 2021-04-02 16:33:40 -07:00
TEST Remove CI builder customization from TEST 2020-03-16 10:46:03 -07:00
zfs.release.in Move zfs.release generation to configure step 2012-07-12 12:22:51 -07:00

img

OpenZFS is an advanced file system and volume manager which was originally developed for Solaris and is now maintained by the OpenZFS community. This repository contains the code for running OpenZFS on Linux and FreeBSD.

codecov coverity

Official Resources

Installation

Full documentation for installing OpenZFS on your favorite operating system can be found at the Getting Started Page.

Contribute & Develop

We have a separate document with contribution guidelines.

We have a Code of Conduct.

Release

OpenZFS is released under a CDDL license. For more details see the NOTICE, LICENSE and COPYRIGHT files; UCRL-CODE-235197

Supported Kernels

  • The META file contains the officially recognized supported Linux kernel versions.
  • Supported FreeBSD versions are any supported branches and releases starting from 12.2-RELEASE.