mirror_zfs/module/zfs
Matthew Ahrens 62d4287f27
RAIDZ2/3 fails to heal silently corrupted parity w/2+ bad disks
When scrubbing, (non-sequential) resilvering, or correcting a checksum
error using RAIDZ parity, ZFS should heal any incorrect RAIDZ parity by
overwriting it.  For example, if P disks are silently corrupted (P being
the number of failures tolerated; e.g. RAIDZ2 has P=2), `zpool scrub`
should detect and heal all the bad state on these disks, including
parity.  This way if there is a subsequent failure we are fully
protected.

With RAIDZ2 or RAIDZ3, a block can have silent damage to a parity
sector, and also damage (silent or known) to a data sector.  In this
case the parity should be healed but it is not.

The problem can be noticed by scrubbing the pool twice.  Assuming there
was no damage concurrent with the scrubs, the first scrub should fix all
silent damage, and the second scrub should be "clean" (`zpool status`
should not report checksum errors on any disks).  If the bug is
encountered, then the second scrub will repair the silently-damaged
parity that the first scrub failed to repair, and these checksum errors
will be reported after the second scrub.  Since the first scrub repaired
all the damaged data, the bug can not be encountered during the second
scrub, so subsequent scrubs (more than two) are not necessary.

The root cause of the problem is some code that was inadvertently added
to `raidz_parity_verify()` by the DRAID changes.  The incorrect code
causes the parity healing to be aborted if there is damaged data
(`rc_error != 0`) or the data disk is not present (`!rc_tried`).  These
checks are not necessary, because we only call `raidz_parity_verify()`
if we have the correct data (which may have been reconstructed using
parity, and which was verified by the checksum).

This commit fixes the problem by removing the incorrect checks in
`raidz_parity_verify()`.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #11489 
Closes #11510
2021-01-26 16:05:05 -08:00
..
abd.c Fix two minor lint errors (cppcheck) 2021-01-23 15:49:32 -08:00
aggsum.c Implement memory and CPU hotplug 2020-12-10 14:09:23 -08:00
arc.c Fix two minor lint errors (cppcheck) 2021-01-23 15:49:32 -08:00
blkptr.c Add zstd support to zfs 2020-08-20 10:30:06 -07:00
bplist.c Fast Clone Deletion 2019-07-26 10:54:14 -07:00
bpobj.c Rename refcount.h to zfs_refcount.h 2020-07-29 16:35:33 -07:00
bptree.c Rename refcount.h to zfs_refcount.h 2020-07-29 16:35:33 -07:00
bqueue.c Implement Redacted Send/Receive 2019-06-19 09:48:12 -07:00
btree.c Fix typo in btree.c 2020-08-17 15:25:37 -07:00
dataset_kstats.c Fix panic on DilOS with kstat per dataset statistics 2019-09-03 12:12:31 -07:00
dbuf_stats.c Eliminate gratuitous bzeroing in dbuf_stats_hash_table_data 2020-09-30 13:24:38 -07:00
dbuf.c allow callers to allocate and provide the abd_t struct 2021-01-20 11:24:37 -08:00
ddt_zap.c Refactor dnode dirty context from dbuf_dirty 2020-02-26 16:09:17 -08:00
ddt.c Remove dead code 2020-06-18 12:21:18 -07:00
dmu_diff.c Mark write_record static 2019-12-03 09:51:44 -08:00
dmu_object.c Introduce CPU_SEQID_UNSTABLE 2020-11-02 11:51:12 -08:00
dmu_objset.c Relax special_small_blocks assertion. 2021-01-23 15:45:27 -08:00
dmu_recv.c implicit conversion from 'boolean_t' to 'ds_hold_flags_t' 2020-12-27 16:31:02 -08:00
dmu_redact.c Fix dnode refcount tracking 2020-11-10 10:37:10 -08:00
dmu_send.c implicit conversion from 'boolean_t' to 'ds_hold_flags_t' 2020-12-27 16:31:02 -08:00
dmu_traverse.c zil_parse: make callback parameters const 2020-10-09 09:34:54 -07:00
dmu_tx.c Remove unused check from dmu_tx_count_write() 2020-12-21 20:17:13 -08:00
dmu_zfetch.c dmu_zfetch: don't leak unreferenced stream when zfetch is freed 2020-10-13 21:03:36 -07:00
dmu.c Extending FreeBSD UIO Struct 2021-01-20 21:27:30 -08:00
dnode_sync.c Improve zfs receive performance with lightweight write 2020-12-11 10:26:02 -08:00
dnode.c Improve zfs receive performance with lightweight write 2020-12-11 10:26:02 -08:00
dsl_bookmark.c Fix kernel panic induced by redacted send 2020-12-11 10:22:29 -08:00
dsl_crypt.c Fix raw sends on encrypted datasets when copying back snapshots 2020-12-04 14:34:29 -08:00
dsl_dataset.c Improve zfs receive performance with lightweight write 2020-12-11 10:26:02 -08:00
dsl_deadlist.c Fix i/o error handling of livelists and zap iteration 2020-08-05 10:22:09 -07:00
dsl_deleg.c Reduce loaded range tree memory usage 2019-10-09 10:36:03 -07:00
dsl_destroy.c Fix i/o error handling of livelists and zap iteration 2020-08-05 10:22:09 -07:00
dsl_dir.c Add 'zfs rename -u' to rename without remounting 2020-09-01 16:14:16 -07:00
dsl_pool.c dsl_pool: extend comment on DSL Pool Configuration Lock 2020-12-19 18:04:05 -08:00
dsl_prop.c Replace sprintf()->snprintf() and strcpy()->strlcpy() 2020-06-07 11:42:12 -07:00
dsl_scan.c Distributed Spare (dRAID) Feature 2020-11-13 13:51:51 -08:00
dsl_synctask.c nowait synctask must succeed 2020-09-04 10:29:39 -07:00
dsl_userhold.c Replace sprintf()->snprintf() and strcpy()->strlcpy() 2020-06-07 11:42:12 -07:00
edonr_zfs.c Add include files for prototypes 2020-06-18 12:21:25 -07:00
fm.c Avoid posting duplicate zpool events 2020-09-04 10:34:28 -07:00
gzip.c Add include files for prototypes 2020-06-18 12:21:25 -07:00
hkdf.c Encryption patch follow-up 2017-10-11 16:54:48 -04:00
lz4.c Prefix zfs internal endian checks with _ZFS 2020-07-28 13:02:49 -07:00
lzjb.c Add include files for prototypes 2020-06-18 12:21:25 -07:00
Makefile.in Distributed Spare (dRAID) Feature 2020-11-13 13:51:51 -08:00
metaslab.c Set aside a metaslab for ZIL blocks 2021-01-21 15:12:54 -08:00
mmp.c Distributed Spare (dRAID) Feature 2020-11-13 13:51:51 -08:00
multilist.c Implement memory and CPU hotplug 2020-12-10 14:09:23 -08:00
objlist.c Implement Redacted Send/Receive 2019-06-19 09:48:12 -07:00
pathname.c Replace ZFS on Linux references with OpenZFS 2020-10-08 20:10:13 -07:00
range_tree.c Fix incorrect deletion order in range_tree_add_impl gap case 2020-10-14 08:59:54 -07:00
refcount.c Rename refcount.h to zfs_refcount.h 2020-07-29 16:35:33 -07:00
rrwlock.c Rename refcount.h to zfs_refcount.h 2020-07-29 16:35:33 -07:00
sa.c Extending FreeBSD UIO Struct 2021-01-20 21:27:30 -08:00
sha256.c Add include files for prototypes 2020-06-18 12:21:25 -07:00
skein_zfs.c Add include files for prototypes 2020-06-18 12:21:25 -07:00
spa_boot.c Add include files for prototypes 2020-06-18 12:21:25 -07:00
spa_checkpoint.c Refactor dnode dirty context from dbuf_dirty 2020-02-26 16:09:17 -08:00
spa_config.c vdev_ashift should only be set once 2020-09-18 12:13:47 -07:00
spa_errlog.c Fix typos in module/zfs/ 2019-09-02 17:56:41 -07:00
spa_history.c record ioctl elapsed time in zpool history 2021-01-11 09:29:25 -08:00
spa_log_spacemap.c Make module tunables cross platform 2019-09-05 14:49:49 -07:00
spa_misc.c Set aside a metaslab for ZIL blocks 2021-01-21 15:12:54 -08:00
spa_stats.c FreeBSD: Add support for procfs_list 2020-09-23 16:43:51 -07:00
spa.c spa_export_common: refactor common exit points 2021-01-25 15:04:11 -08:00
space_map.c Rename refcount.h to zfs_refcount.h 2020-07-29 16:35:33 -07:00
space_reftree.c Reduce loaded range tree memory usage 2019-10-09 10:36:03 -07:00
THIRDPARTYLICENSE.cityhash OpenZFS 8484 - Implement aggregate sum and use for arc counters 2018-06-06 09:35:59 -07:00
THIRDPARTYLICENSE.cityhash.descrip OpenZFS 8484 - Implement aggregate sum and use for arc counters 2018-06-06 09:35:59 -07:00
txg.c Implement memory and CPU hotplug 2020-12-10 14:09:23 -08:00
uberblock.c MMP interval and fail_intervals in uberblock 2019-03-21 12:47:57 -07:00
unique.c Reduce loaded range tree memory usage 2019-10-09 10:36:03 -07:00
vdev_cache.c Replace ASSERTV macro with compiler annotation 2019-12-05 12:37:00 -08:00
vdev_draid_rand.c Distributed Spare (dRAID) Feature 2020-11-13 13:51:51 -08:00
vdev_draid.c allow callers to allocate and provide the abd_t struct 2021-01-20 11:24:37 -08:00
vdev_indirect_births.c Fixes: #8934 Large kmem_alloc 2019-07-10 15:54:49 -07:00
vdev_indirect_mapping.c Replace ASSERTV macro with compiler annotation 2019-12-05 12:37:00 -08:00
vdev_indirect.c allow callers to allocate and provide the abd_t struct 2021-01-20 11:24:37 -08:00
vdev_initialize.c Distributed Spare (dRAID) Feature 2020-11-13 13:51:51 -08:00
vdev_label.c Distributed Spare (dRAID) Feature 2020-11-13 13:51:51 -08:00
vdev_mirror.c Distributed Spare (dRAID) Feature 2020-11-13 13:51:51 -08:00
vdev_missing.c Distributed Spare (dRAID) Feature 2020-11-13 13:51:51 -08:00
vdev_queue.c allow callers to allocate and provide the abd_t struct 2021-01-20 11:24:37 -08:00
vdev_raidz_math_aarch64_neon_common.h FreeBSD: fix the build with Clang 11 2020-08-17 15:40:17 -07:00
vdev_raidz_math_aarch64_neon.c Linux 5.0 compat: SIMD compatibility 2019-07-12 09:31:20 -07:00
vdev_raidz_math_aarch64_neonx2.c Linux 5.0 compat: SIMD compatibility 2019-07-12 09:31:20 -07:00
vdev_raidz_math_avx2.c FreeBSD: fix the build with Clang 11 2020-08-17 15:40:17 -07:00
vdev_raidz_math_avx512bw.c Refactor ccompile.h to not include system headers 2020-07-25 20:09:50 -07:00
vdev_raidz_math_avx512f.c FreeBSD: fix the build with Clang 11 2020-08-17 15:40:17 -07:00
vdev_raidz_math_impl.h Distributed Spare (dRAID) Feature 2020-11-13 13:51:51 -08:00
vdev_raidz_math_powerpc_altivec_common.h FreeBSD: fix the build with Clang 11 2020-08-17 15:40:17 -07:00
vdev_raidz_math_powerpc_altivec.c Prefix zfs internal endian checks with _ZFS 2020-07-28 13:02:49 -07:00
vdev_raidz_math_scalar.c Linux 5.3: Fix switch() fall though compiler errors 2019-08-21 09:29:23 -07:00
vdev_raidz_math_sse2.c FreeBSD: fix the build with Clang 11 2020-08-17 15:40:17 -07:00
vdev_raidz_math_ssse3.c Refactor ccompile.h to not include system headers 2020-07-25 20:09:50 -07:00
vdev_raidz_math.c Reduce fletcher4 and raidz benchmark times 2020-12-06 09:57:20 -08:00
vdev_raidz.c RAIDZ2/3 fails to heal silently corrupted parity w/2+ bad disks 2021-01-26 16:05:05 -08:00
vdev_rebuild.c Distributed Spare (dRAID) Feature 2020-11-13 13:51:51 -08:00
vdev_removal.c Set aside a metaslab for ZIL blocks 2021-01-21 15:12:54 -08:00
vdev_root.c Distributed Spare (dRAID) Feature 2020-11-13 13:51:51 -08:00
vdev_trim.c Distributed Spare (dRAID) Feature 2020-11-13 13:51:51 -08:00
vdev.c Set aside a metaslab for ZIL blocks 2021-01-21 15:12:54 -08:00
zap_leaf.c Refactor dnode dirty context from dbuf_dirty 2020-02-26 16:09:17 -08:00
zap_micro.c Rename refcount.h to zfs_refcount.h 2020-07-29 16:35:33 -07:00
zap.c Rename refcount.h to zfs_refcount.h 2020-07-29 16:35:33 -07:00
zcp_get.c Add include files for prototypes 2020-06-18 12:21:25 -07:00
zcp_global.c OpenZFS 8600 - ZFS channel programs - snapshot 2018-02-08 15:29:24 -08:00
zcp_iter.c Fix typos in module/zfs/ 2019-09-02 17:56:41 -07:00
zcp_set.c Support setting user properties in a channel program 2020-02-14 13:41:42 -08:00
zcp_synctask.c filesystem_limit/snapshot_limit is incorrectly enforced against root 2020-07-11 17:18:02 -07:00
zcp.c Channel program may spuriously fail with "memory limit exhausted" 2020-11-11 17:16:15 -08:00
zfeature.c Throw const on some strings 2020-10-02 17:44:10 -07:00
zfs_byteswap.c Mark functions as static 2020-06-18 12:20:38 -07:00
zfs_fm.c Distributed Spare (dRAID) Feature 2020-11-13 13:51:51 -08:00
zfs_fuid.c FreeBSD: Fix UNIX permissions checking 2020-08-18 09:57:07 -07:00
zfs_ioctl.c record ioctl elapsed time in zpool history 2021-01-11 09:29:25 -08:00
zfs_log.c Throw const on some strings 2020-10-02 17:44:10 -07:00
zfs_onexit.c Remove deduplicated send/receive code 2020-04-23 10:06:57 -07:00
zfs_quota.c File incorrectly zeroed when receiving incremental stream that toggles -L 2020-06-09 10:41:01 -07:00
zfs_ratelimit.c Change checksum & IO delay ratelimit values 2018-03-04 17:34:51 -08:00
zfs_replay.c Simplify FreeBSD's locking requirements in zfs_replay.c 2020-01-22 17:55:56 -08:00
zfs_rlock.c Add a "try" operation for range locks 2020-07-06 11:53:31 -07:00
zfs_sa.c Extending FreeBSD UIO Struct 2021-01-20 21:27:30 -08:00
zfs_vnops.c Extending FreeBSD UIO Struct 2021-01-20 21:27:30 -08:00
zil.c allow callers to allocate and provide the abd_t struct 2021-01-20 11:24:37 -08:00
zio_checksum.c Mark functions as static 2020-06-18 12:20:38 -07:00
zio_compress.c Avoid symbol collision with in-kernel zstdlib 2020-08-24 12:20:41 -07:00
zio_inject.c Distributed Spare (dRAID) Feature 2020-11-13 13:51:51 -08:00
zio.c Set aside a metaslab for ZIL blocks 2021-01-21 15:12:54 -08:00
zle.c Add include files for prototypes 2020-06-18 12:21:25 -07:00
zrlock.c Remove dead code 2020-06-18 12:21:18 -07:00
zthr.c Retain thread name when resuming a zthr 2020-09-03 20:09:52 -07:00
zvol.c Fix problems in zvol_set_volmode_impl 2020-11-17 09:50:52 -08:00