mirror_zfs/module/zfs
Andriy Tkachuk b403040c4c
draid: fix data corruption after disk clear
Currently, when there are several faulted disks with attached
dRAID spares, and one of those disks is cleared of errors (zpool
clear) and its spare is then detached, the data on all the
remaining spares that were attached while the cleared disk was in
the FAULTED state might get corrupted (which can be seen by running
a scrub). In some cases, when too many disks get cleared at a time,
this can result in data corruption/loss.

A dRAID spare is a virtual device whose blocks are distributed among
other disks. Those disks can also be in the FAULTED state with attached
spares of their own. When a disk gets sequentially resilvered (rebuilt),
the writes made by that resilvering are not captured in the DTL
(Dirty Time Log) of the other FAULTED disks with attached spares to
which the data is written during the resilvering (as would normally
happen for changes made by the user when a new file is written or
an existing one is deleted). This is because sequential resilvering
works at the block level, without touching or looking into metadata,
so it knows nothing about the old BPs or transaction groups
that it is resilvering. So later on, when such a disk gets cleared
of errors and the healing resilver tries to sync all the data
from its spare onto it, all the changes made on its spare during the
resilvering of other disks are missed because they were never
captured in its DTL. That's why other dRAID spares may get corrupted.

Here's another way to explain it that might be helpful. Imagine a
scenario:

1. d1 fails and gets resilvered to some spare s1 - OK.
2. d2 fails and gets sequentially resilvered onto dRAID spare s2. Now,
   in some slices, s2 maps to d1, which is failed. But d1 has the s1
   spare attached, so the data from that resilvering goes to s1 but
   is not recorded in d1's DTL.
3. Now, d1 gets cleared and its s1 gets detached. All the changes
   made by the user (writes or deletions) have their txgs captured
   in d1's DTL, so they will be resilvered by the healing resilver
   from its spare (s1) - that part works fine. But the data which
   was written to s1 during the resilvering of d2 is missing from
   d1's DTL and won't get resilvered to d1. So here we are:
4. s2 under d2 is corrupted in the slices which map to d1, because
   d1 doesn't have that data resilvered from s1.

Now, if there are more failed disks with dRAID spares attached which
were sequentially resilvered while d1 was failed (d3+s3, d4+s4 and
so on), all their spares will be corrupted, because in some slices
each of them maps to d1, which will be missing their data.
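
The scenario above can be sketched as a toy model. This is not OpenZFS
code: the Disk class, the slice names and the txg numbers are all
hypothetical, and the DTL is reduced to a plain set of txgs. The sketch
assumes the block rebuilt for d2 was born (txg 50) before d1 faulted,
so its txg never appears in d1's DTL.

```python
class Disk:
    """Toy stand-in for a leaf disk (d1) with an attached spare (s1)."""

    def __init__(self):
        self.blocks = {}      # slice id -> birth txg of data on the disk
        self.spare = {}       # slice id -> birth txg of data on its spare
        self.dtl = set()      # txgs the disk is known to have missed
        self.faulted = False

    def write(self, slice_id, birth_txg, record_dtl):
        # While faulted, data lands on the spare. User writes log their
        # txg in the DTL; sequential-rebuild writes do not.
        if self.faulted:
            self.spare[slice_id] = birth_txg
            if record_dtl:
                self.dtl.add(birth_txg)
        else:
            self.blocks[slice_id] = birth_txg

    def clear_and_heal(self):
        # Clear + spare detach: the healing resilver copies back only
        # the blocks whose birth txg is listed in the DTL.
        self.faulted = False
        for slice_id, txg in self.spare.items():
            if txg in self.dtl:
                self.blocks[slice_id] = txg
        self.spare, self.dtl = {}, set()


def run_scenario():
    d1 = Disk()
    d1.faulted = True                                      # step 1
    d1.write("user", birth_txg=100, record_dtl=True)       # user write
    d1.write("s2-slice", birth_txg=50, record_dtl=False)   # step 2
    d1.clear_and_heal()                                    # step 3
    return d1.blocks                                       # step 4
```

In this model run_scenario() returns only the "user" slice: the data
that the rebuild of d2 placed on s1 never reaches d1, which is the
corruption seen on s2.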

Solution: add all known txgs starting from TXG_INITIAL to the DTLs of
non-writable devices during sequential resilvering, so that when the
healing resilver starts on disk clear, it is able to check and heal
blocks from all txgs.
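
A minimal sketch of the idea behind the fix, again as a toy model and
not the actual patch. The DTL is modelled as a list of half-open txg
ranges; dtl_add_all_known() and heal() are hypothetical helpers, and
the value 4 for TXG_INITIAL is an assumption made for illustration.

```python
TXG_INITIAL = 4  # assumed value, for illustration only


def dtl_contains(dtl, txg):
    """True if txg falls in any [start, end) range of the DTL."""
    return any(start <= txg < end for start, end in dtl)


def dtl_add_all_known(dtl, current_txg):
    """Hypothetical helper: mark every txg up to current_txg as dirty."""
    dtl.append((TXG_INITIAL, current_txg + 1))


def heal(spare_blocks, dtl):
    """Return the blocks a healing resilver would copy from the spare."""
    return {s: t for s, t in spare_blocks.items() if dtl_contains(dtl, t)}


# Data held on d1's spare: one user write and one rebuild write.
spare_blocks = {"user": 100, "s2-slice": 50}

dtl = [(100, 101)]                      # only the user write was logged
before = heal(spare_blocks, dtl)        # the rebuild write is skipped

dtl_add_all_known(dtl, current_txg=120)
after = heal(spare_blocks, dtl)         # both blocks are healed
```

The trade-off in this model is that the healing resilver has to examine
blocks from every txg rather than only the ones the disk missed.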

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Akash B <akash-b@hpe.com>
Signed-off-by: Andriy Tkachuk <andriy.tkachuk@seagate.com>
Closes #18286
Closes #18294
2026-03-11 14:54:20 -07:00
abd.c Preserve LIFO ordering of kmap ops in abd_raidz_gen_iterate() 2025-12-09 09:12:16 -08:00
aggsum.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
arc.c L2ARC: Fix prev_hdr use-after-free in l2arc_write_sublist 2026-03-10 11:00:23 -07:00
blake3_zfs.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
blkptr.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
bplist.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
bpobj.c Pass flags to more DMU write/hold functions 2025-10-29 11:17:51 -07:00
bptree.c Pass flags to more DMU write/hold functions 2025-10-29 11:17:51 -07:00
bqueue.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
brt.c Add BRT support to zpool prefetch command 2025-11-10 16:16:22 -08:00
btree.c Move range_tree, btree, highbit64 to common code 2026-02-22 11:43:51 -08:00
dataset_kstats.c CI: Test & fix Linux ZFS built-in build 2026-02-19 10:15:41 -08:00
dbuf_stats.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
dbuf.c Remove parent ZIO from dbuf_prefetch() 2026-02-18 18:12:13 -08:00
ddt_log.c CI: Test & fix Linux ZFS built-in build 2026-02-19 10:15:41 -08:00
ddt_stats.c Introduce dedupused/dedupsaved pool properties 2026-02-25 09:41:38 -05:00
ddt_zap.c DDT: Fix compressed entry buffer size 2025-12-15 14:52:44 -08:00
ddt.c CI: Test & fix Linux ZFS built-in build 2026-02-19 10:15:41 -08:00
dmu_diff.c Allow physical rewrite without logical 2025-08-06 10:36:07 -07:00
dmu_direct.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
dmu_object.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
dmu_objset.c Allow rewrite skip cloned and snapshotted blocks 2026-02-09 10:17:56 -08:00
dmu_recv.c Fix activating large_microzap on receive 2026-02-05 15:48:03 -08:00
dmu_redact.c More consistent use of TREE_* macros in AVL comparators 2026-03-03 09:08:23 -08:00
dmu_send.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
dmu_traverse.c Allow physical rewrite without logical 2025-08-06 10:36:07 -07:00
dmu_tx.c Improve caching for dbuf prefetches 2026-02-04 10:12:32 -08:00
dmu_zfetch.c tunables: remove legacy FreeBSD aliases 2025-09-08 10:03:01 -07:00
dmu.c Cleanup allocation class selection 2026-02-16 10:33:21 -05:00
dnode_sync.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
dnode.c Update dnode_next_offset_level to accept blkid instead of offset 2025-11-04 13:12:17 -08:00
dsl_bookmark.c More consistent use of TREE_* macros in AVL comparators 2026-03-03 09:08:23 -08:00
dsl_crypt.c More consistent use of TREE_* macros in AVL comparators 2026-03-03 09:08:23 -08:00
dsl_dataset.c Add snapshots_changed_nsecs dataset property 2026-01-06 09:36:20 -08:00
dsl_deadlist.c More consistent use of TREE_* macros in AVL comparators 2026-03-03 09:08:23 -08:00
dsl_deleg.c More consistent use of TREE_* macros in AVL comparators 2026-03-03 09:08:23 -08:00
dsl_destroy.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
dsl_dir.c Prefer VERIFY0(n) over VERIFY(n == 0) 2025-08-07 11:40:59 -07:00
dsl_pool.c Prefer VERIFY0P(n) over VERIFY(n == NULL) 2025-08-07 11:41:37 -07:00
dsl_prop.c Prefer VERIFY0(n) over VERIFY(n == 0) 2025-08-07 11:40:59 -07:00
dsl_scan.c More consistent use of TREE_* macros in AVL comparators 2026-03-03 09:08:23 -08:00
dsl_synctask.c dmu_tx_assign: make all VERIFY0 calls use DMU_TX_SUSPEND 2025-05-28 10:28:59 -07:00
dsl_userhold.c Prefer VERIFY0(n) over VERIFY(n == 0) 2025-08-07 11:40:59 -07:00
edonr_zfs.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
fm.c Prefer VERIFY0(n) over VERIFY(n == 0) 2025-08-07 11:40:59 -07:00
gzip.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
hkdf.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
lz4_zfs.c SPDX: license tags: BSD-2-Clause 2025-03-13 17:56:46 -07:00
lz4.c SPDX: license tags: BSD-2-Clause 2025-03-13 17:56:46 -07:00
lzjb.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
metaslab.c Add zpool properties for allocation class space 2026-03-02 15:50:23 -08:00
mmp.c mmp: claim sequence id before final import 2026-02-09 09:36:01 -08:00
multilist.c Allow vmem_alloc backed multilists 2025-08-12 13:36:03 -07:00
objlist.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
pathname.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
range_tree.c range_tree: use zfs_panic_recover() for partial-overlap remove 2026-02-25 11:26:10 -08:00
refcount.c More consistent use of TREE_* macros in AVL comparators 2026-03-03 09:08:23 -08:00
rrwlock.c Prefer VERIFY0P(n) over VERIFY(n == NULL) 2025-08-07 11:41:37 -07:00
sa.c Lock db_mtx around arc_release() in couple places 2026-01-26 21:32:16 -05:00
sha2_zfs.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
skein_zfs.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
spa_checkpoint.c Pass flags to more DMU write/hold functions 2025-10-29 11:17:51 -07:00
spa_config.c spa_misc: add an API for spa_namespace_lock 2025-11-10 14:23:39 -08:00
spa_errlog.c Allow physical rewrite without logical 2025-08-06 10:36:07 -07:00
spa_history.c Pass flags to more DMU write/hold functions 2025-10-29 11:17:51 -07:00
spa_log_spacemap.c Fix available space accounting for special/dedup (#18222) 2026-02-19 10:36:35 -08:00
spa_misc.c draid: fix data corruption after disk clear 2026-03-11 14:54:20 -07:00
spa_stats.c Prefer VERIFY0(n) over VERIFY(n == 0) 2025-08-07 11:40:59 -07:00
spa.c Add zpool properties for allocation class space 2026-03-02 15:50:23 -08:00
space_map.c Pass flags to more DMU write/hold functions 2025-10-29 11:17:51 -07:00
space_reftree.c Prefer VERIFY0(n) over VERIFY(n == 0) 2025-08-07 11:40:59 -07:00
THIRDPARTYLICENSE.cityhash OpenZFS 8484 - Implement aggregate sum and use for arc counters 2018-06-06 09:35:59 -07:00
THIRDPARTYLICENSE.cityhash.descrip OpenZFS 8484 - Implement aggregate sum and use for arc counters 2018-06-06 09:35:59 -07:00
txg.c txg_wait_synced_flags: add TXG_WAIT_SUSPEND flag to not wait if pool suspended 2025-05-28 10:27:46 -07:00
u8_textprep.c u8_textprep: move into module/zfs 2025-12-22 14:58:36 -08:00
uberblock.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
unique.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
vdev_draid_rand.c SPDX: license tags: LicenseRef-OpenZFS-ThirdParty-PublicDomain 2025-03-13 17:57:31 -07:00
vdev_draid.c draid: fix data corruption after disk clear 2026-03-11 14:54:20 -07:00
vdev_file.c Add vdev property to disable vdev scheduler 2026-02-23 09:34:33 -08:00
vdev_indirect_births.c Pass flags to more DMU write/hold functions 2025-10-29 11:17:51 -07:00
vdev_indirect_mapping.c Pass flags to more DMU write/hold functions 2025-10-29 11:17:51 -07:00
vdev_indirect.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
vdev_initialize.c spa_misc: add an API for spa_namespace_lock 2025-11-10 14:23:39 -08:00
vdev_label.c mmp: claim sequence id before final import 2026-02-09 09:36:01 -08:00
vdev_mirror.c Allow physical rewrite without logical 2025-08-06 10:36:07 -07:00
vdev_missing.c Implement allocation size ranges and use for gang leaves (#17111) 2025-05-02 15:32:18 -07:00
vdev_queue.c More consistent use of TREE_* macros in AVL comparators 2026-03-03 09:08:23 -08:00
vdev_raidz_math_aarch64_neon_common.h SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
vdev_raidz_math_aarch64_neon.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
vdev_raidz_math_aarch64_neonx2.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
vdev_raidz_math_avx2.c Convert all HAVE_<name> SIMD gates to HAVE_SIMD(<name>) 2026-03-05 15:01:37 -08:00
vdev_raidz_math_avx512bw.c Convert all HAVE_<name> SIMD gates to HAVE_SIMD(<name>) 2026-03-05 15:01:37 -08:00
vdev_raidz_math_avx512f.c Convert all HAVE_<name> SIMD gates to HAVE_SIMD(<name>) 2026-03-05 15:01:37 -08:00
vdev_raidz_math_impl.h SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
vdev_raidz_math_powerpc_altivec_common.h SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
vdev_raidz_math_powerpc_altivec.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
vdev_raidz_math_scalar.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
vdev_raidz_math_sse2.c Convert all HAVE_<name> SIMD gates to HAVE_SIMD(<name>) 2026-03-05 15:01:37 -08:00
vdev_raidz_math_ssse3.c Convert all HAVE_<name> SIMD gates to HAVE_SIMD(<name>) 2026-03-05 15:01:37 -08:00
vdev_raidz_math.c Convert all HAVE_<name> SIMD gates to HAVE_SIMD(<name>) 2026-03-05 15:01:37 -08:00
vdev_raidz.c RAIDZ: Remove some excessive logging 2025-12-17 14:00:01 -08:00
vdev_rebuild.c draid: fix data corruption after disk clear 2026-03-11 14:54:20 -07:00
vdev_removal.c Fix log vdev removal issues 2026-03-04 09:12:14 -05:00
vdev_root.c Implement allocation size ranges and use for gang leaves (#17111) 2025-05-02 15:32:18 -07:00
vdev_trim.c spa_misc: add an API for spa_namespace_lock 2025-11-10 14:23:39 -08:00
vdev.c draid: fix data corruption after disk clear 2026-03-11 14:54:20 -07:00
zap_leaf.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
zap_micro.c DDT: Add/use zap_lookup_length_uint64_by_dnode() 2025-12-15 14:38:34 -08:00
zap.c DDT: Add/use zap_lookup_length_uint64_by_dnode() 2025-12-15 14:38:34 -08:00
zcp_get.c Add snapshots_changed_nsecs dataset property 2026-01-06 09:36:20 -08:00
zcp_global.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
zcp_iter.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
zcp_set.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
zcp_synctask.c zcp_synctask: add zfs.sync.clone() 2025-06-10 14:53:10 -07:00
zcp.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
zfeature.c Synchronize the update of feature refcount 2025-08-22 16:35:58 -07:00
zfs_byteswap.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
zfs_chksum.c chksum: run 256K benchmark on demand, preserve chksum_stat_data 2025-12-01 10:14:52 -08:00
zfs_crrd.c Fix time database update calculations 2025-09-12 16:33:36 -07:00
zfs_debug_common.c nvlist: Add nvlist_snprintf() and zfs_dbgmsg_nvlist() 2025-04-18 09:22:16 -04:00
zfs_fm.c Fix snapshot automount expiry cancellation deadlock 2025-12-01 14:43:42 -08:00
zfs_fuid.c More consistent use of TREE_* macros in AVL comparators 2026-03-03 09:08:23 -08:00
zfs_impl.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
zfs_ioctl.c Fix send:raw permission for send -w -I 2026-02-11 10:30:26 -08:00
zfs_log.c ZIL: pass commit errors back to ITX callbacks 2025-08-08 16:43:20 -07:00
zfs_onexit.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
zfs_quota.c Bypass snprintf() in quota checks if no quotas set 2025-12-17 21:59:47 -05:00
zfs_ratelimit.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
zfs_replay.c dmu_tx: rename dmu_tx_assign() flags from TXG_* to DMU_TX_* (#17143) 2025-03-18 16:04:22 -07:00
zfs_rlock.c Prefer VERIFY0(n) over VERIFY3U(n, ==, 0) 2025-08-07 11:41:25 -07:00
zfs_sa.c ZIL: allow zil_commit() to fail with error 2025-08-08 16:43:09 -07:00
zfs_vnops.c Restrict cloning with different properties 2026-02-10 09:53:24 -08:00
zfs_znode.c Add default user/group/project quota properties 2025-04-03 10:35:22 -07:00
zil.c Fix log vdev removal issues 2026-03-04 09:12:14 -05:00
zio_checksum.c Prefer VERIFY0(n) over VERIFY(n == 0) 2025-08-07 11:40:59 -07:00
zio_compress.c CI: Test & fix Linux ZFS built-in build 2026-02-19 10:15:41 -08:00
zio_inject.c spa_misc: add an API for spa_namespace_lock 2025-11-10 14:23:39 -08:00
zio.c More consistent use of TREE_* macros in AVL comparators 2026-03-03 09:08:23 -08:00
zle.c SPDX: license tags: CDDL-1.0 2025-03-13 17:56:27 -07:00
zrlock.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
zthr.c Prefer VERIFY0P(n) over VERIFY3P(n, ==, NULL) 2025-08-07 11:41:42 -07:00
zvol.c spa_misc: add an API for spa_namespace_lock 2025-11-10 14:23:39 -08:00