mirror_zfs/module/zfs
Serapheim Dimitropoulos 9d5b524597 OpenZFS 9079 - race condition in starting and ending condensing thread for indirect vdevs
The timeline of the race condition is the following:

[1] Thread A is about to finish condensing the first vdev in
    spa_condense_indirect_thread(), so it calls the
    spa_condense_indirect_complete_sync() sync task which sets
    the spa_condensing_indirect field to NULL. Waiting for the
    sync task to finish, thread A sleeps until the txg is done.
    When this happens, thread A will acquire spa_async_lock and
    set spa_condense_thread to NULL.

[2] While thread A waits for the txg to finish, thread B which is
    running spa_sync() checks whether it should condense the
    second vdev in vdev_indirect_should_condense() by checking the
    spa_condensing_indirect field which was set to NULL by
    spa_condense_indirect_thread() from thread A. So it goes on
    and tries to spawn a new condensing thread in
    spa_condense_indirect_start_sync() and the aforementioned
    assertion fails because thread A has not yet set spa_condense_thread
    to NULL (which is basically the last thing it does before returning).
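
The timeline above can be replayed deterministically in a small
single-threaded model. This is only a sketch: the struct and every
model_* function are hypothetical stand-ins for the fields and functions
named in this message, not the actual ZFS code, and the interleaving is
forced by call order instead of real threads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two spa_t fields named in the message. */
struct model_spa {
    void *condensing_indirect;  /* models spa_condensing_indirect */
    void *condense_thread;      /* models spa_condense_thread */
};

/* Step [1], first half: the sync task clears the condensing state. */
static void
model_complete_sync(struct model_spa *spa)
{
    spa->condensing_indirect = NULL;
}

/* Step [1], second half: thread A clears its handle once the txg is done. */
static void
model_thread_exit(struct model_spa *spa)
{
    spa->condense_thread = NULL;
}

/* Step [2]: the should-condense check consults only the first field. */
static bool
model_should_condense(const struct model_spa *spa)
{
    return (spa->condensing_indirect == NULL);
}

/* Step [2]: the condition the start path asserts before spawning a thread. */
static bool
model_start_precondition_holds(const struct model_spa *spa)
{
    return (spa->condense_thread == NULL);
}

/* Replays steps [1] and [2]; returns true if the bad window was observed. */
static bool
model_replay_race(void)
{
    int state, thread;  /* dummy objects; only their addresses are used */
    struct model_spa spa = { &state, &thread };
    bool window_seen;

    assert(!model_should_condense(&spa));   /* thread A is condensing */
    model_complete_sync(&spa);              /* [1] sync task has run */

    /* [2] spa_sync() gets here before thread A wakes up. */
    window_seen = model_should_condense(&spa) &&
        !model_start_precondition_holds(&spa);

    model_thread_exit(&spa);                /* thread A clears its handle */
    return (window_seen);
}
```

In the model, the window is open exactly between model_complete_sync()
and model_thread_exit(): the first field says "not condensing" while the
second still says "a thread exists".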

The main issue here is that we rely on both spa_condensing_indirect
and spa_condense_thread to signify whether a condensing thread is
running. Ideally we would only use one throughout the codebase. In
addition, for managing spa_condense_thread we currently use
spa_async_lock which basically ties condensing to scrubbing when
it comes to pausing and resuming those actions during spa export.

This commit introduces the ZTHR infrastructure, which consists of
threads that are created during spa_load()/spa_create() and exist until
we export or destroy the pool. ZTHRs sleep the majority of the time,
until they are notified to wake up and do some predefined type of work.

In the context of the current bug, a zthr now does the condensing of
indirect mappings, replacing the older code that used bare kthreads.
When a pool is created, the condensing zthr is spawned but sleeps
right away, until it is awakened by a signal from spa_sync(). If an
existing pool is loaded, the condensing zthr checks whether there is
anything to condense before going to sleep, in case we were condensing
mappings in the pool before it got exported.
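
The sleep-until-notified behaviour described above can be sketched in
userland with POSIX threads. This is an illustration of the pattern only,
not the contents of zthr.c: every sketch_* name is hypothetical, the
check/work callback split is an assumption about how such a thread could
be structured, and the sketch_job callbacks exist only to make the
example complete.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

typedef bool (sketch_check_fn)(void *arg);  /* is there work to do? */
typedef void (sketch_work_fn)(void *arg);   /* do one unit of work */

struct sketch_zthr {
    pthread_t thread;
    pthread_mutex_t lock;
    pthread_cond_t cv;
    bool exiting;
    sketch_check_fn *check;
    sketch_work_fn *work;
    void *arg;
};

static void *
sketch_zthr_loop(void *varg)
{
    struct sketch_zthr *t = varg;

    pthread_mutex_lock(&t->lock);
    while (!t->exiting) {
        if (t->check(t->arg)) {
            /* Drop the lock while the actual work runs. */
            pthread_mutex_unlock(&t->lock);
            t->work(t->arg);
            pthread_mutex_lock(&t->lock);
        } else {
            /* Nothing to do: sleep until woken, then re-check. */
            pthread_cond_wait(&t->cv, &t->lock);
        }
    }
    pthread_mutex_unlock(&t->lock);
    return (NULL);
}

/* The first check runs right after creation, covering the "loaded pool" case. */
static struct sketch_zthr *
sketch_zthr_create(sketch_check_fn *check, sketch_work_fn *work, void *arg)
{
    struct sketch_zthr *t = calloc(1, sizeof (*t));

    if (t == NULL)
        return (NULL);
    pthread_mutex_init(&t->lock, NULL);
    pthread_cond_init(&t->cv, NULL);
    t->check = check;
    t->work = work;
    t->arg = arg;
    if (pthread_create(&t->thread, NULL, sketch_zthr_loop, t) != 0) {
        pthread_cond_destroy(&t->cv);
        pthread_mutex_destroy(&t->lock);
        free(t);
        return (NULL);
    }
    return (t);
}

/* Callers publish new work first, then call this to wake the thread. */
static void
sketch_zthr_wakeup(struct sketch_zthr *t)
{
    assert(t != NULL);
    pthread_mutex_lock(&t->lock);
    pthread_cond_broadcast(&t->cv);
    pthread_mutex_unlock(&t->lock);
}

/* Asks the thread to exit, waits for it, and frees its resources. */
static void
sketch_zthr_destroy(struct sketch_zthr *t)
{
    assert(t != NULL);
    pthread_mutex_lock(&t->lock);
    t->exiting = true;
    pthread_cond_broadcast(&t->cv);
    pthread_mutex_unlock(&t->lock);
    pthread_join(t->thread, NULL);
    pthread_cond_destroy(&t->cv);
    pthread_mutex_destroy(&t->lock);
    free(t);
}

/* Example callbacks: a counter of pending items and of completed items. */
struct sketch_job {
    atomic_int pending;
    atomic_int done;
};

static bool
sketch_job_check(void *arg)
{
    struct sketch_job *j = arg;
    return (atomic_load(&j->pending) > 0);
}

static void
sketch_job_work(void *arg)
{
    struct sketch_job *j = arg;
    atomic_fetch_sub(&j->pending, 1);
    atomic_fetch_add(&j->done, 1);
}
```

Because the check runs with the lock held and the waker takes the same
lock before signalling, a wakeup that arrives between the check and the
sleep is not lost. Building this sketch requires linking against the
pthread library (for example with -pthread).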

The benefits of this solution are the following:
- The current bug is fixed
- spa_condensing_indirect is the sole indicator of whether we are
  currently condensing or not
- condensing is more decoupled from the spa_async_thread related
  functionality.

As a final note, this commit also paves the way for upstreaming
other features that use the ZTHR code, such as zpool checkpoint and
fast clone deletion.

Authored by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Approved by: Hans Rosenfeld <rosenfeld@grumpf.hope-2000.org>
Ported-by: Tim Chase <tim@chase2k.com>

OpenZFS-issue: https://illumos.org/issues/9079
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/3dc606ee
Closes #6900
2018-04-14 12:23:53 -07:00
abd.c Update for cppcheck v1.80 2017-11-18 14:08:00 -08:00
arc.c OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
blkptr.c Undo c89 workarounds to match with upstream 2017-11-04 13:25:13 -07:00
bplist.c Change KM_PUSHPAGE -> KM_SLEEP 2015-01-16 14:41:26 -08:00
bpobj.c OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
bptree.c Native Encryption for ZFS on Linux 2017-08-14 10:36:48 -07:00
bqueue.c Call cv_signal() with mutex held 2017-06-26 14:36:49 -07:00
dbuf_stats.c Add dbuf hash and dbuf cache kstats 2018-01-29 10:24:52 -08:00
dbuf.c OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
ddt_zap.c Change KM_PUSHPAGE -> KM_SLEEP 2015-01-16 14:41:26 -08:00
ddt.c OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
dmu_diff.c OpenZFS 6950 - ARC should cache compressed data 2016-09-13 09:58:33 -07:00
dmu_object.c Raw sends must be able to decrease nlevels 2018-02-02 11:43:11 -08:00
dmu_objset.c OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
dmu_send.c Fix 'zfs send/recv' hang with 16M blocks 2018-04-08 19:41:15 -07:00
dmu_traverse.c Fix hung z_zvol tasks during 'zfs receive' 2018-03-30 12:10:01 -07:00
dmu_tx.c OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
dmu_zfetch.c OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
dmu.c OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
dnode_sync.c Raw sends must be able to decrease nlevels 2018-02-02 11:43:11 -08:00
dnode.c OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
dsl_bookmark.c Undo c89 workarounds to match with upstream 2017-11-04 13:25:13 -07:00
dsl_crypt.c Raw receive should change key atomically 2018-02-21 12:31:03 -08:00
dsl_dataset.c OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
dsl_deadlist.c OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
dsl_deleg.c Performance optimization of AVL tree comparator functions 2016-08-31 14:35:34 -07:00
dsl_destroy.c OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
dsl_dir.c OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
dsl_pool.c OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
dsl_prop.c Native Encryption for ZFS on Linux 2017-08-14 10:36:48 -07:00
dsl_scan.c OpenZFS 9290 - device removal reduces redundancy of mirrors 2018-04-14 12:21:39 -07:00
dsl_synctask.c Illumos 4951 - ZFS administrative commands should use reserved space 2015-05-04 09:41:10 -07:00
dsl_userhold.c Undo c89 workarounds to match with upstream 2017-11-04 13:25:13 -07:00
edonr_zfs.c DLPX-44812 integrate EP-220 large memory scalability 2016-11-29 14:34:27 -08:00
fm.c Linux 4.14 compat: CONFIG_GCC_PLUGIN_RANDSTRUCT 2017-11-28 17:33:48 -06:00
gzip.c Resolve QAT issues with incompressible data 2018-03-29 17:40:34 -07:00
hkdf.c Encryption patch follow-up 2017-10-11 16:54:48 -04:00
lz4.c Fix LZ4_uncompress_unknownOutputSize caused panic 2017-05-19 13:45:46 -07:00
lzjb.c Change KM_PUSHPAGE -> KM_SLEEP 2015-01-16 14:41:26 -08:00
Makefile.in OpenZFS 9079 - race condition in starting and ending condensing thread for indirect vdevs 2018-04-14 12:23:53 -07:00
metaslab.c OpenZFS 9290 - device removal reduces redundancy of mirrors 2018-04-14 12:21:39 -07:00
mmp.c OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
multilist.c Undo c89 workarounds to match with upstream 2017-11-04 13:25:13 -07:00
pathname.c Add pn_alloc()/pn_free() functions 2016-04-21 09:49:25 -07:00
policy.c Take user namespaces into account in policy checks 2018-03-07 15:40:42 -08:00
qat_compress.c Resolve QAT issues with incompressible data 2018-03-29 17:40:34 -07:00
qat_crypt.c Fix spelling errors in comments 2018-03-21 08:42:13 -07:00
qat.c SHA256 QAT acceleration 2018-03-15 10:53:58 -07:00
qat.h Resolve QAT issues with incompressible data 2018-03-29 17:40:34 -07:00
range_tree.c OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
refcount.c Linux 4.11 compat: avoid refcount_t name conflict 2017-02-28 16:10:18 -08:00
rrwlock.c Fix spelling 2017-01-03 11:31:18 -06:00
sa.c Project Quota on ZFS 2018-02-13 14:54:54 -08:00
sha256.c SHA256 QAT acceleration 2018-03-15 10:53:58 -07:00
skein_zfs.c DLPX-44812 integrate EP-220 large memory scalability 2016-11-29 14:34:27 -08:00
spa_boot.c Add linux kernel module support 2010-08-31 13:41:58 -07:00
spa_config.c OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
spa_errlog.c Native Encryption for ZFS on Linux 2017-08-14 10:36:48 -07:00
spa_history.c Emit history events for 'zpool create' 2017-10-23 09:45:59 -07:00
spa_misc.c OpenZFS 9290 - device removal reduces redundancy of mirrors 2018-04-14 12:21:39 -07:00
spa_stats.c Record skipped MMP writes in multihost_history 2018-03-06 15:15:15 -08:00
spa.c OpenZFS 9079 - race condition in starting and ending condensing thread for indirect vdevs 2018-04-14 12:23:53 -07:00
space_map.c OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
space_reftree.c OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
trace.c OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
txg.c OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
uberblock.c Multi-modifier protection (MMP) 2017-07-13 13:54:00 -04:00
unique.c Performance optimization of AVL tree comparator functions 2016-08-31 14:35:34 -07:00
vdev_cache.c Undo c89 workarounds to match with upstream 2017-11-04 13:25:13 -07:00
vdev_disk.c OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
vdev_file.c OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
vdev_indirect_births.c OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
vdev_indirect_mapping.c OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
vdev_indirect.c OpenZFS 9079 - race condition in starting and ending condensing thread for indirect vdevs 2018-04-14 12:23:53 -07:00
vdev_label.c OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
vdev_mirror.c OpenZFS 9290 - device removal reduces redundancy of mirrors 2018-04-14 12:21:39 -07:00
vdev_missing.c OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
vdev_queue.c OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
vdev_raidz_math_aarch64_neon_common.h ABD raidz NEON support 2016-11-29 14:34:33 -08:00
vdev_raidz_math_aarch64_neon.c codebase style improvements for OpenZFS 6459 port 2017-01-22 13:25:40 -08:00
vdev_raidz_math_aarch64_neonx2.c ABD raidz NEON support 2016-11-29 14:34:33 -08:00
vdev_raidz_math_avx2.c ABD raidz avx512f support 2016-11-29 14:34:33 -08:00
vdev_raidz_math_avx512bw.c ABD: Adapt avx512bw raidz assembly 2016-12-15 17:31:33 -08:00
vdev_raidz_math_avx512f.c Use cstyle -cpP in make cstyle check 2016-12-12 10:46:26 -08:00
vdev_raidz_math_impl.h codebase style improvements for OpenZFS 6459 port 2017-01-22 13:25:40 -08:00
vdev_raidz_math_scalar.c ABD Vectorized raidz 2016-11-29 14:34:33 -08:00
vdev_raidz_math_sse2.c ABD raidz avx512f support 2016-11-29 14:34:33 -08:00
vdev_raidz_math_ssse3.c codebase style improvements for OpenZFS 6459 port 2017-01-22 13:25:40 -08:00
vdev_raidz_math.c OpenZFS 7431 - ZFS Channel Programs 2018-02-08 15:28:18 -08:00
vdev_raidz.c OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
vdev_removal.c OpenZFS 9290 - device removal reduces redundancy of mirrors 2018-04-14 12:21:39 -07:00
vdev_root.c OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
vdev.c OpenZFS 9290 - device removal reduces redundancy of mirrors 2018-04-14 12:21:39 -07:00
zap_leaf.c Revert "Handle zap_add() failures in mixed ... " 2018-04-09 14:24:46 -07:00
zap_micro.c Revert "Handle zap_add() failures in mixed ... " 2018-04-09 14:24:46 -07:00
zap.c Revert "Handle zap_add() failures in mixed ... " 2018-04-09 14:24:46 -07:00
zcp_get.c Fix coverity defects: zfs channel programs 2018-02-20 11:19:42 -08:00
zcp_global.c OpenZFS 8600 - ZFS channel programs - snapshot 2018-02-08 15:29:24 -08:00
zcp_iter.c OpenZFS 7431 - ZFS Channel Programs 2018-02-08 15:28:18 -08:00
zcp_synctask.c Fix coverity defects: zfs channel programs 2018-02-20 11:19:42 -08:00
zcp.c OpenZFS 8677 - Open-Context Channel Programs 2018-02-08 16:05:57 -08:00
zfeature.c Undo c89 workarounds to match with upstream 2017-11-04 13:25:13 -07:00
zfs_acl.c Project Quota on ZFS 2018-02-13 14:54:54 -08:00
zfs_byteswap.c Add linux kernel module support 2010-08-31 13:41:58 -07:00
zfs_ctldir.c Use SET_ERROR for constant non-zero return codes 2017-08-02 21:16:12 -07:00
zfs_debug.c enable zfs_dbgmsg() by default, without dprintf() 2018-03-21 15:37:32 -07:00
zfs_dir.c Revert "Handle zap_add() failures in mixed ... " 2018-04-09 14:24:46 -07:00
zfs_fm.c Decryption error handling improvements 2018-03-31 11:12:51 -07:00
zfs_fuid.c Rename zfs_sb_t -> zfsvfs_t 2017-03-10 09:51:33 -08:00
zfs_ioctl.c OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
zfs_log.c Project Quota on ZFS 2018-02-13 14:54:54 -08:00
zfs_onexit.c zfsdev_getminor() should check for invalid file handles 2015-06-22 17:02:13 -07:00
zfs_ratelimit.c Change checksum & IO delay ratelimit values 2018-03-04 17:34:51 -08:00
zfs_replay.c Project Quota on ZFS 2018-02-13 14:54:54 -08:00
zfs_rlock.c Fix spelling 2017-01-03 11:31:18 -06:00
zfs_sa.c Project Quota on ZFS 2018-02-13 14:54:54 -08:00
zfs_vfsops.c ZIL claiming should not start user accounting 2018-02-20 16:27:31 -08:00
zfs_vnops.c OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
zfs_znode.c Project Quota on ZFS 2018-02-13 14:54:54 -08:00
zil.c OpenZFS 7614, 9064 - zfs device evacuation/removal 2018-04-14 12:16:17 -07:00
zio_checksum.c Undo c89 workarounds to match with upstream 2017-11-04 13:25:13 -07:00
zio_compress.c Undo c89 workarounds to match with upstream 2017-11-04 13:25:13 -07:00
zio_crypt.c Fix spelling errors in comments 2018-03-21 08:42:13 -07:00
zio_inject.c Undo c89 workarounds to match with upstream 2017-11-04 13:25:13 -07:00
zio.c OpenZFS 9290 - device removal reduces redundancy of mirrors 2018-04-14 12:21:39 -07:00
zle.c Fix zle_decompress out of bound access 2018-02-09 10:08:05 -08:00
zpl_ctldir.c Linux 4.12 compat: CURRENT_TIME removed 2017-05-10 09:30:48 -07:00
zpl_export.c Use cstyle -cpP in make cstyle check 2016-12-12 10:46:26 -08:00
zpl_file.c Project Quota on ZFS 2018-02-13 14:54:54 -08:00
zpl_inode.c Linux 4.12 compat: CURRENT_TIME removed 2017-05-10 09:30:48 -07:00
zpl_super.c Allow mounting datasets more than once 2018-04-13 10:44:05 -07:00
zpl_xattr.c Update for cppcheck v1.80 2017-11-18 14:08:00 -08:00
zrlock.c Fix race in trace point in zrl_add_impl 2018-03-12 11:27:02 -07:00
zthr.c OpenZFS 9079 - race condition in starting and ending condensing thread for indirect vdevs 2018-04-14 12:23:53 -07:00
zvol.c Linux compat 4.16: blk_queue_flag_{set,clear} 2018-04-10 10:32:14 -07:00