mirror_zfs/module/zfs
Matthew Ahrens 14872aaa4f
EIO caused by encryption + recursive gang
Encrypted blocks can not have 3 DVAs, because they use the space of the
3rd DVA for the IV+salt.  zio_write_gang_block() takes this into
account, setting `gbh_copies` to no more than 2 in this case.  Gang
members BP's do not have the X (encrypted) bit set (nor do they have the
DMU level and type fields set), because encryption is not handled at
this level.  The gang block is reassembled, and then encryption (and
compression) are handled.

To check if this gang block is encrypted, the code in
zio_write_gang_block() checks `pio->io_bp`.  This is normally fine,
because the block that's being ganged is typically the encrypted BP.

The problem is that if there is "recursive ganging", where a gang member
is itself a gang block, then when zio_write_gang_block() is called to
create a gang block for a gang member, `pio->io_bp` is the gang member's
BP, which doesn't have the X bit set, so the number of DVA's is not
restricted to 2.  It should instead be looking at the the "gang leader",
i.e. the top-level gang block, to determine how many DVA's can be used,
to avoid a "NDVA's inversion" (where a child has more DVA's than its
parent).

gang leader BP: X (encrypted) bit set, 2 DVA's, IV+salt in 3rd DVA's
space:
```
DVA[0]=<1:...:100400> DVA[1]=<0:...:100400> salt=... iv=...
[L0 ZFS plain file] fletcher4 uncompressed encrypted LE
gang unique double size=100000L/100000P birth=... fill=1 cksum=...
```

leader's GBH contains a BP with gang bit set and 3 DVA's:
```
DVA[0]=<1:...:55600> DVA[1]=<0:...:55600>
[L0 unallocated] fletcher4 uncompressed unencrypted LE
contiguous unique double size=55600L/55600P birth=... fill=0 cksum=...

DVA[0]=<1:...:55600> DVA[1]=<0:...:55600>
[L0 unallocated] fletcher4 uncompressed unencrypted LE
contiguous unique double size=55600L/55600P birth=... fill=0 cksum=...

DVA[0]=<1:...:55600> DVA[1]=<0:...:55600> DVA[2]=<1:...:200>
[L0 unallocated] fletcher4 uncompressed unencrypted LE
gang unique double size=55400L/55400P birth=... fill=0 cksum=...
```

On nondebug bits, having the 3rd DVA in the gang block works for the
most part, because it's true that all 3 DVA's are available in the gang
member BP (in the GBH).  However, for accounting purposes, gang block
DVA's ASIZE include all the space allocated below them, i.e. the
512-byte gang block header (GBH) as well as the gang members below that.
We see that above where the gang leader BP is 1MB logical (and after
compression: 0x`100000P`), but the ASIZE of each DVA is 2 sectors (1KB)
more than 1MB (0x`100400`).

Since thre are 3 copies of a block below it, we increment the ATIME of
the 3rd DVA of the gang leader by the space used by the 3rd DVA of the
child (1 sector, in this case).  But there isn't really a 3rd DVA of the
parent; the salt is stored in place of the 3rd DVA's ASIZE.

So when zio_write_gang_member_ready() increments the parent's BP's
`DVA[2]`'s ASIZE, it's actually incrementing the parent's salt.  When we
later try to read the encrypted recursively-ganged block, the salt
doesn't match what we used to write it, so MAC verification fails and we
get an EIO.

```
zio_encrypt():  encrypted 515/2/0/403 salt: 25 25 bb 9d ad d6 cd 89
zio_decrypt(): decrypting 515/2/0/403 salt: 26 25 bb 9d ad d6 cd 89
```

This commit addresses the problem by not increasing the number of copies
of the GBH beyond 2 (even for non-encrypted blocks).  This simplifies
the logic while maintaining the ability to traverse all metadata
(including gang blocks) even if one copy is lost.  (Note that 3 copies
of the GBH will still be created if requested, e.g. for `copies=3` or
MOS blocks.)  Additionally, the code that increments the parent's DVA's
ASIZE is made to check the parent DVA's NDVAS even on nondebug bits.  So
if there's a similar bug in the future, it will cause a panic when
trying to write, rather than corrupting the parent BP and causing an
error when reading.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Caused-by: #14356
Closes #14440
Closes #14413
2023-02-06 09:37:06 -08:00
..
abd.c abd_return_buf() should call zfs_refcount_remove_many() early 2022-10-19 17:11:01 -07:00
aggsum.c Remove bcopy(), bzero(), bcmp() 2022-03-15 15:13:42 -07:00
arc.c Cleanup: Use NULL when doing NULL pointer comparisons 2023-01-12 16:00:37 -08:00
blake3_zfs.c Cleanup: Use NULL when doing NULL pointer comparisons 2023-01-12 16:00:37 -08:00
blkptr.c Remove bcopy(), bzero(), bcmp() 2022-03-15 15:13:42 -07:00
bplist.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
bpobj.c Prefetch on deadlists merge 2023-01-25 11:30:24 -08:00
bptree.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
bqueue.c Batch enqueue/dequeue for bqueue 2023-01-10 13:39:22 -08:00
btree.c Optimize microzaps 2022-10-20 11:57:15 -07:00
dataset_kstats.c Introduce kmem_scnprintf() 2022-10-29 13:05:11 -07:00
dbuf_stats.c Revert "Reduce dbuf_find() lock contention" 2022-09-22 12:59:41 -07:00
dbuf.c Implement uncached prefetch 2023-01-04 17:29:54 -07:00
ddt_zap.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
ddt.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
dmu_diff.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
dmu_object.c Cleanup: Specify unsignedness on things that should not be signed 2022-09-27 16:42:41 -07:00
dmu_objset.c Activate filesystem features only in syncing context 2023-01-11 18:00:39 -08:00
dmu_recv.c Wait for txg sync if the last DRR_FREEOBJECTS might result in a hole 2023-01-23 13:19:43 -08:00
dmu_redact.c Fix incorrect size given to bqueue_enqueue() call in dmu_redact.c 2022-09-15 16:21:21 -07:00
dmu_send.c Cleanup: Use MIN() macro 2023-01-12 16:00:23 -08:00
dmu_traverse.c arc_read()/arc_access() refactoring and cleanup 2022-12-22 12:10:24 -08:00
dmu_tx.c Add tunable to allow changing micro ZAP's max size 2023-01-10 13:41:54 -08:00
dmu_zfetch.c arc_read()/arc_access() refactoring and cleanup 2022-12-22 12:10:24 -08:00
dmu.c Implement uncached prefetch 2023-01-04 17:29:54 -07:00
dnode_sync.c free_blocks(): Fix reports from 2016 PVS Studio FreeBSD report 2023-01-23 13:12:37 -08:00
dnode.c Turn default_bs and default_ibs into ZFS_MODULE_PARAMs 2023-01-11 09:38:20 -08:00
dsl_bookmark.c Cleanup: Address Clang's static analyzer's unused code complaints 2022-10-14 13:37:54 -07:00
dsl_crypt.c Handle and detect #13709's unlock regression (#14161) 2022-11-15 14:44:12 -08:00
dsl_dataset.c Activate filesystem features only in syncing context 2023-01-11 18:00:39 -08:00
dsl_deadlist.c Prefetch on deadlists merge 2023-01-25 11:30:24 -08:00
dsl_deleg.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
dsl_destroy.c Prevent zevent list from consuming all of kernel memory 2022-08-22 12:36:22 -07:00
dsl_dir.c Cleanup ->dd_space_towrite should be unsigned 2023-01-20 11:10:15 -08:00
dsl_pool.c Cleanup: Address Clang's static analyzer's unused code complaints 2022-10-14 13:37:54 -07:00
dsl_prop.c Micro-optimize dsl_prop_get_dd() 2023-01-20 11:01:41 -08:00
dsl_scan.c Increase default zfs_scan_vdev_limit to 16MB 2023-01-27 10:01:13 -08:00
dsl_synctask.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
dsl_userhold.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
edonr_zfs.c Remove bcopy(), bzero(), bcmp() 2022-03-15 15:13:42 -07:00
fm.c Cleanup of dead code suggested by Clang Static Analyzer (#14380) 2023-01-17 09:57:12 -08:00
gzip.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
hkdf.c Remove bcopy(), bzero(), bcmp() 2022-03-15 15:13:42 -07:00
lz4_zfs.c Updated the lz4 decompressor 2022-01-07 10:36:49 -08:00
lz4.c lz4: Cherrypick fix for CVE-2021-3520 2022-01-12 16:14:36 -08:00
lzjb.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
metaslab.c Bypass metaslab throttle for removal allocations 2022-12-09 10:48:33 -08:00
mmp.c Cleanup: Address Clang's static analyzer's unused code complaints 2022-10-14 13:37:54 -07:00
multilist.c Cleanup: Specify unsignedness on things that should not be signed 2022-09-27 16:42:41 -07:00
objlist.c Implement Redacted Send/Receive 2019-06-19 09:48:12 -07:00
pathname.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
range_tree.c Add defensive assertions 2022-10-12 11:25:18 -07:00
refcount.c Cleanup: Specify unsignedness on things that should not be signed 2022-09-27 16:42:41 -07:00
rrwlock.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
sa.c Fix double const qualifier declarations 2022-09-30 15:34:39 -07:00
sha256.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
skein_zfs.c Remove bcopy(), bzero(), bcmp() 2022-03-15 15:13:42 -07:00
spa_checkpoint.c Cleanup: 64-bit kernel module parameters should use fixed width types 2022-10-13 10:03:29 -07:00
spa_config.c zed: add hotplug support for spare vdevs 2023-01-09 12:43:03 -08:00
spa_errlog.c deadlock between spa_errlog_lock and dp_config_rwlock 2022-12-22 11:48:49 -08:00
spa_history.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
spa_log_spacemap.c Address warnings about possible division by zero from clangsa 2022-11-03 09:58:14 -07:00
spa_misc.c Improve resilver ETAs 2023-01-25 11:28:54 -08:00
spa_stats.c Cleanup: Specify unsignedness on things that should not be signed 2022-09-27 16:42:41 -07:00
spa.c deadlock between spa_errlog_lock and dp_config_rwlock 2022-12-22 11:48:49 -08:00
space_map.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
space_reftree.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
THIRDPARTYLICENSE.cityhash OpenZFS 8484 - Implement aggregate sum and use for arc counters 2018-06-06 09:35:59 -07:00
THIRDPARTYLICENSE.cityhash.descrip OpenZFS 8484 - Implement aggregate sum and use for arc counters 2018-06-06 09:35:59 -07:00
txg.c Fix the last two CFI callback prototype mismatches 2022-11-29 09:56:16 -08:00
uberblock.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
unique.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
vdev_cache.c Cleanup: Specify unsignedness on things that should not be signed 2022-09-27 16:42:41 -07:00
vdev_draid_rand.c Distributed Spare (dRAID) Feature 2020-11-13 13:51:51 -08:00
vdev_draid.c vdev_draid_lookup_map() should not iterate outside draid_maps 2022-09-12 12:51:17 -07:00
vdev_indirect_births.c Remove bcopy(), bzero(), bcmp() 2022-03-15 15:13:42 -07:00
vdev_indirect_mapping.c Remove bcopy(), bzero(), bcmp() 2022-03-15 15:13:42 -07:00
vdev_indirect.c Cleanup: Replace oldstyle struct hack with C99 flexible array members 2023-01-12 16:00:03 -08:00
vdev_initialize.c Cleanup: 64-bit kernel module parameters should use fixed width types 2022-10-13 10:03:29 -07:00
vdev_label.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
vdev_mirror.c Improve too large physical ashift handling 2022-09-08 10:30:53 -07:00
vdev_missing.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
vdev_queue.c Convert enum zio_flag to uint64_t 2022-10-27 09:54:54 -07:00
vdev_raidz_math_aarch64_neon_common.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
vdev_raidz_math_aarch64_neon.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
vdev_raidz_math_aarch64_neonx2.c Fix Clang 15 compilation errors 2022-11-30 13:46:26 -08:00
vdev_raidz_math_avx2.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
vdev_raidz_math_avx512bw.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
vdev_raidz_math_avx512f.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
vdev_raidz_math_impl.h Cleanup Raid-Z Typo fixes 2022-09-06 09:43:21 -07:00
vdev_raidz_math_powerpc_altivec_common.h Linux ppc64le ieee128 compat: Do not redefine __asm on external headers 2023-01-13 10:58:58 -08:00
vdev_raidz_math_powerpc_altivec.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
vdev_raidz_math_scalar.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
vdev_raidz_math_sse2.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
vdev_raidz_math_ssse3.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
vdev_raidz_math.c Convert some sprintf() calls to kmem_scnprintf() 2022-11-28 13:49:58 -08:00
vdev_raidz.c Bump checksum error counter before reporting to ZED 2022-12-02 17:42:22 -08:00
vdev_rebuild.c Increase default zfs_rebuild_vdev_limit to 64MB 2023-01-27 10:02:24 -08:00
vdev_removal.c Bypass metaslab throttle for removal allocations 2022-12-09 10:48:33 -08:00
vdev_root.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
vdev_trim.c Propagate extent_bytes change to autotrim thread 2022-10-28 10:16:31 -07:00
vdev.c Configure zed's diagnosis engine with vdev properties 2023-01-23 13:14:25 -08:00
zap_leaf.c Optimize microzaps 2022-10-20 11:57:15 -07:00
zap_micro.c Add tunable to allow changing micro ZAP's max size 2023-01-10 13:41:54 -08:00
zap.c Cleanup: Use NULL when doing NULL pointer comparisons 2023-01-12 16:00:37 -08:00
zcp_get.c Cleanup: Address Clang's static analyzer's unused code complaints 2022-10-14 13:37:54 -07:00
zcp_global.c OpenZFS 8600 - ZFS channel programs - snapshot 2018-02-08 15:29:24 -08:00
zcp_iter.c module/*.ko: prune .data, global .rodata 2022-01-14 15:37:55 -08:00
zcp_set.c Support setting user properties in a channel program 2020-02-14 13:41:42 -08:00
zcp_synctask.c Add zfs.sync.snapshot_rename 2022-09-02 13:31:19 -07:00
zcp.c ztest: update ztest_dmu_snapshot_create_destroy() 2023-01-10 13:27:48 -08:00
zfeature.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_byteswap.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_chksum.c Cleanup: Remove unnecessary explicit casts of pointers from allocators 2023-01-12 15:59:12 -08:00
zfs_fm.c Configure zed's diagnosis engine with vdev properties 2023-01-23 13:14:25 -08:00
zfs_fuid.c Cleanup: Remove unneeded semicolons 2023-01-12 16:00:30 -08:00
zfs_ioctl.c Cleanup of dead code suggested by Clang Static Analyzer (#14380) 2023-01-17 09:57:12 -08:00
zfs_log.c zfs_rename: support RENAME_* flags 2022-10-28 09:49:20 -07:00
zfs_onexit.c zfs_onexit_add_cb: make action_handle point to a uintptr_t 2022-11-03 09:52:12 -07:00
zfs_quota.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_ratelimit.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_replay.c Support idmapped mount in user namespace 2022-11-08 10:28:56 -08:00
zfs_rlock.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_sa.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_vnops.c Cleanup: Remove unnecessary explicit casts of pointers from allocators 2023-01-12 15:59:12 -08:00
zil.c Introduce minimal ZIL block commit delay 2023-01-24 09:20:32 -08:00
zio_checksum.c Fix double const qualifier declarations 2022-09-30 15:34:39 -07:00
zio_compress.c Fix declarations of non-global variables 2022-10-18 11:05:32 -07:00
zio_inject.c Cleanup: Switch to strlcpy from strncpy 2022-09-27 16:35:29 -07:00
zio.c EIO caused by encryption + recursive gang 2023-02-06 09:37:06 -08:00
zle.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zrlock.c Micro-optimize zrl_remove() 2022-11-29 09:26:03 -08:00
zthr.c Switch from _Noreturn to __attribute__((noreturn)) 2022-03-23 08:51:00 -07:00
zvol.c Cleanup of dead code suggested by Clang Static Analyzer (#14380) 2023-01-17 09:57:12 -08:00