mirror_zfs/module/zfs
Alex Reece 463a8cfe2b Illumos 6844 - dnode_next_offset can detect fictional holes
6844 dnode_next_offset can detect fictional holes
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>

dnode_next_offset is used in a variety of places to iterate over the
holes or allocated blocks in a dnode. It operates under the premise that
it can iterate over the blockpointers of a dnode in open context while
holding only the dn_struct_rwlock as reader. Unfortunately, this premise
does not hold.

When we create the zio for a dbuf, we pass in the actual block pointer
in the indirect block above that dbuf. When we later zero the bp in
zio_write_compress, we are directly modifying the bp. The state of the
bp is now inconsistent from the perspective of dnode_next_offset: the bp
will appear to be a hole until zio_dva_allocate finally finishes filling
it in. In the meantime, dnode_next_offset can detect a hole in the dnode
when none exists.
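
The window described above can be modelled with a small, deterministic
sketch. This is not the ZFS code: toy_bp_t, toy_stage_zero and
toy_stage_allocate are hypothetical stand-ins for blkptr_t,
zio_write_compress and zio_dva_allocate, and the "a zero address means a
hole" rule is a simplifying assumption. The reader is run by hand between
the two stages instead of on a second thread, so the result does not
depend on timing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for blkptr_t (assumption: addr == 0 means "hole"). */
typedef struct toy_bp {
	uint64_t addr;	/* hypothetical: models the DVA */
	uint64_t birth;	/* hypothetical: models the birth txg */
} toy_bp_t;

static bool
toy_bp_is_hole(const toy_bp_t *bp)
{
	return (bp->addr == 0);
}

/* Models the stage that zeroes the bp before it is refilled. */
static void
toy_stage_zero(toy_bp_t *bp)
{
	memset(bp, 0, sizeof (*bp));
}

/* Models the stage that finally fills the bp in. */
static void
toy_stage_allocate(toy_bp_t *bp, uint64_t addr, uint64_t txg)
{
	bp->addr = addr;
	bp->birth = txg;
}

/*
 * The pipeline works directly on the slot in the indirect block.
 * Returns what a reader scanning that slot sees between the stages.
 */
bool
toy_reader_sees_hole_in_place(void)
{
	toy_bp_t slot = { .addr = 0x1000, .birth = 5 };
	bool seen;

	assert(!toy_bp_is_hole(&slot));	/* allocated before the write */
	toy_stage_zero(&slot);
	seen = toy_bp_is_hole(&slot);	/* reader runs in the window */
	toy_stage_allocate(&slot, 0x2000, 6);
	return (seen);
}
```

In this model the block is allocated both before and after the write, yet
the reader reports a hole, which is the "fictional hole" of the title.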

I was able to experimentally demonstrate this behavior with the
following setup:
1. Create a file with 1 million dbufs.
2. Create a thread that randomly dirties L2 blocks by writing to the
first L0 block under them.
3. Observe dnode_next_offset, waiting for it to skip over a hole in the
middle of a file.
4. Call dnode_next_offset in a loop until it skips over such a
non-existent hole.

The fix is to ensure that it is valid to iterate over the indirect
blocks in a dnode while holding the dn_struct_rwlock by passing the zio
a copy of the BP and updating the actual BP in dbuf_write_ready while
holding the lock.

References:
  https://www.illumos.org/issues/6844
  https://github.com/openzfs/openzfs/pull/82
  DLPX-35372

Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4548
2016-04-27 16:24:15 -07:00
arc.c Add l2arc_max_block_size tunable 2016-02-25 09:44:00 -08:00
blkptr.c
bplist.c
bpobj.c
bptree.c Illumos 5960, 5925 2016-01-08 15:08:19 -08:00
bqueue.c Allow 16M send/recv blocks 2016-01-08 20:23:23 -05:00
dbuf_stats.c
dbuf.c Illumos 6844 - dnode_next_offset can detect fictional holes 2016-04-27 16:24:15 -07:00
ddt_zap.c
ddt.c
dmu_diff.c Illumos 5960, 5925 2016-01-08 15:08:19 -08:00
dmu_object.c Illumos 6370 - ZFS send fails to transmit some holes 2016-03-10 14:25:22 -08:00
dmu_objset.c Add support for asynchronous zvol minor operations 2016-03-10 09:49:22 -08:00
dmu_send.c Add support for asynchronous zvol minor operations 2016-03-10 09:49:22 -08:00
dmu_traverse.c Illumos 6370 - ZFS send fails to transmit some holes 2016-03-10 14:25:22 -08:00
dmu_tx.c Illumos 4950 - files sometimes can't be removed from a full filesystem 2016-01-21 16:59:30 -08:00
dmu_zfetch.c Illumos 6281 - prefetching should apply to 1MB reads 2016-01-12 13:51:27 -08:00
dmu.c Illumos 4950 - files sometimes can't be removed from a full filesystem 2016-01-21 16:59:30 -08:00
dnode_sync.c Illumos 5960, 5925 2016-01-08 15:08:19 -08:00
dnode.c Illumos 5987 - zfs prefetch code needs work 2016-01-12 09:02:33 -08:00
dsl_bookmark.c
dsl_dataset.c Add support for asynchronous zvol minor operations 2016-03-10 09:49:22 -08:00
dsl_deadlist.c
dsl_deleg.c
dsl_destroy.c Add support for asynchronous zvol minor operations 2016-03-10 09:49:22 -08:00
dsl_dir.c Add support for asynchronous zvol minor operations 2016-03-10 09:49:22 -08:00
dsl_pool.c
dsl_prop.c Illumos 6681 - zfs list burning lots of time in dodefault() via dsl_prop_* 2016-03-15 18:46:44 -07:00
dsl_scan.c Illumos 6537 - Panic on zpool scrub with DEBUG kernel 2016-02-05 11:29:32 -08:00
dsl_synctask.c
dsl_userhold.c
fm.c Illumos 5045 - use atomic_{inc,dec}_* instead of atomic_add_* 2016-01-15 15:38:36 -08:00
gzip.c
lz4.c
lzjb.c
Makefile.in Add pn_alloc()/pn_free() functions 2016-04-21 09:49:25 -07:00
metaslab.c gcc build error: -Wbool-compare in metaslab.c 2016-03-30 09:36:51 -07:00
multilist.c Identify locks flagged by lockdep 2015-12-22 10:21:33 -08:00
pathname.c Add pn_alloc()/pn_free() functions 2016-04-21 09:49:25 -07:00
range_tree.c
refcount.c
rrwlock.c
sa.c Prevent SA length overflow 2015-12-30 13:20:12 -08:00
sha256.c
spa_boot.c
spa_config.c Illumos 6659 - nvlist_free(NULL) is a no-op 2016-04-27 15:58:23 -07:00
spa_errlog.c
spa_history.c
spa_misc.c Change KM_SLEEP to TQ_SLEEP in spa_deadman() 2016-03-09 10:41:31 -08:00
spa_stats.c
spa.c Illumos 6659 - nvlist_free(NULL) is a no-op 2016-04-27 15:58:23 -07:00
space_map.c Illumos 5960, 5925 2016-01-08 15:08:19 -08:00
space_reftree.c
trace.c
txg.c Increase default user space stack size 2016-01-13 13:55:12 -08:00
uberblock.c
unique.c
vdev_cache.c Illumos 5045 - use atomic_{inc,dec}_* instead of atomic_add_* 2016-01-15 15:38:36 -08:00
vdev_disk.c Use udev for partition detection 2016-04-25 11:13:20 -07:00
vdev_file.c
vdev_label.c Illumos 6414 - vdev_config_sync could be simpler 2016-01-28 12:44:39 -05:00
vdev_mirror.c FreeBSD r256956: Improve ZFS N-way mirror read performance by using load and locality information. 2016-02-26 11:24:35 -08:00
vdev_missing.c
vdev_queue.c FreeBSD r256956: Improve ZFS N-way mirror read performance by using load and locality information. 2016-02-26 11:24:35 -08:00
vdev_raidz.c
vdev_root.c
vdev.c Identify locks flagged by lockdep 2015-12-22 10:21:33 -08:00
zap_leaf.c
zap_micro.c
zap.c Illumos 5960, 5925 2016-01-08 15:08:19 -08:00
zfeature_common.c
zfeature.c
zfs_acl.c Fix atime handling and relatime 2016-04-05 18:54:55 -07:00
zfs_byteswap.c
zfs_ctldir.c Fix atime handling and relatime 2016-04-05 18:54:55 -07:00
zfs_debug.c
zfs_dir.c Fix atime handling and relatime 2016-04-05 18:54:55 -07:00
zfs_fm.c Remove wrong ASSERT in annotate_ecksum 2016-02-17 10:43:02 -08:00
zfs_fuid.c
zfs_ioctl.c Illumos 6659 - nvlist_free(NULL) is a no-op 2016-04-27 15:58:23 -07:00
zfs_log.c
zfs_onexit.c
zfs_replay.c Add pn_alloc()/pn_free() functions 2016-04-21 09:49:25 -07:00
zfs_rlock.c
zfs_sa.c Fix atime handling and relatime 2016-04-05 18:54:55 -07:00
zfs_vfsops.c Fix zsb->z_hold_mtx deadlock 2016-01-15 15:33:45 -08:00
zfs_vnops.c Add pn_alloc()/pn_free() functions 2016-04-21 09:49:25 -07:00
zfs_znode.c Enable lazytime semantic for atime 2016-04-05 18:55:51 -07:00
zil.c
zio_checksum.c
zio_compress.c
zio_inject.c Fix zpool_scrub_* test cases 2016-03-30 09:30:34 -07:00
zio.c Illumos 5438 - zfs_blkptr_verify should continue after zfs_panic_recover 2016-01-12 13:54:05 -08:00
zle.c
zpl_ctldir.c
zpl_export.c
zpl_file.c Fix atime handling and relatime 2016-04-05 18:54:55 -07:00
zpl_inode.c Add pn_alloc()/pn_free() functions 2016-04-21 09:49:25 -07:00
zpl_super.c
zpl_xattr.c Linux 4.5 compat: Use xattr_handler->name for acl 2016-04-25 08:42:08 -07:00
zrlock.c
zvol.c Fix lock order inversion with zvol_open() 2016-03-10 09:53:36 -08:00