Matthew Macy 01c4f2bf29
Use vn_io_fault_uiomove on FreeBSD to avoid potential deadlock
Added to prevent a possible deadlock. The following comments from
FreeBSD explain the issue. The comment describing vn_io_fault_uiomove:

/*
 * Helper function to perform the requested uiomove operation using
 * the held pages for io->uio_iov[0].iov_base buffer instead of
 * copyin/copyout.  Access to the pages with uiomove_fromphys()
 * instead of iov_base prevents page faults that could occur due to
 * pmap_collect() invalidating the mapping created by
 * vm_fault_quick_hold_pages(), or pageout daemon, page laundry or
 * object cleanup revoking the write access from page mappings.
 *
 * Filesystems specified MNTK_NO_IOPF shall use vn_io_fault_uiomove()
 * instead of plain uiomove().
 */

This is used for vn_io_fault, which has the following motivation:

/*
 * The vn_io_fault() is a wrapper around vn_read() and vn_write() to
 * prevent the following deadlock:
 *
 * Assume that the thread A reads from the vnode vp1 into userspace
 * buffer buf1 backed by the pages of vnode vp2.  If a page in buf1 is
 * currently not resident, then system ends up with the call chain
 *   vn_read() -> VOP_READ(vp1) -> uiomove() -> [Page Fault] ->
 *     vm_fault(buf1) -> vnode_pager_getpages(vp2) -> VOP_GETPAGES(vp2)
 * which establishes lock order vp1->vn_lock, then vp2->vn_lock.
 * If, at the same time, thread B reads from vnode vp2 into buffer buf2
 * backed by the pages of vnode vp1, and some page in buf2 is not
 * resident, we get a reversed order vp2->vn_lock, then vp1->vn_lock.
 *
 * To prevent the lock order reversal and deadlock, vn_io_fault() does
 * not allow page faults to happen during VOP_READ() or VOP_WRITE().
 * Instead, it first tries to do the whole range i/o with pagefaults
 * disabled. If all pages in the i/o buffer are resident and mapped,
 * VOP will succeed (ignoring the genuine filesystem errors).
 * Otherwise, we get back EFAULT, and vn_io_fault() falls back to do
 * i/o in chunks, with all pages in the chunk prefaulted and held
 * using vm_fault_quick_hold_pages().
 *
 * Filesystems using this deadlock avoidance scheme should use the
 * array of the held pages from uio, saved in the curthread->td_ma,
 * instead of doing uiomove().  A helper function
 * vn_io_fault_uiomove() converts uiomove request into
 * uiomove_fromphys() over td_ma array.
 *
 * Since vnode locks do not cover the whole i/o anymore, rangelocks
 * make the current i/o request atomic with respect to other i/os and
 * truncations.
 */
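
The two-stage scheme described in the comment above can be sketched as a small userspace simulation. This is an illustration only, not FreeBSD or OpenZFS code: struct sim_buf, the sim_* functions, and the SIM_* constants are hypothetical names, and the fast path is simplified to all-or-nothing (it copies nothing when any page is non-resident).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define	SIM_NPAGES	8	/* hypothetical: pages in the user buffer */
#define	SIM_CHUNK	2	/* hypothetical: pages held per fallback chunk */

/* Simulated user buffer: per-page residency and a per-page copy counter. */
struct sim_buf {
	bool	resident[SIM_NPAGES];
	int	copied[SIM_NPAGES];
};

/*
 * Stand-in for the whole-range attempt with page faults disabled.
 * Simplification: nothing is copied unless every page is resident.
 */
static int
sim_io_nofault(struct sim_buf *b)
{
	size_t i;

	for (i = 0; i < SIM_NPAGES; i++)
		if (!b->resident[i])
			return (EFAULT);
	for (i = 0; i < SIM_NPAGES; i++)
		b->copied[i]++;
	return (0);
}

/*
 * Stand-in for the fallback: per chunk, fault the pages in and hold them
 * (the role of vm_fault_quick_hold_pages()), then copy through the held
 * pages (the role of vn_io_fault_uiomove()), so the copy cannot fault.
 * Returns the number of chunks processed.
 */
static int
sim_io_chunked(struct sim_buf *b)
{
	size_t first, end, i;
	int nchunks = 0;

	for (first = 0; first < SIM_NPAGES; first += SIM_CHUNK) {
		end = first + SIM_CHUNK;
		if (end > SIM_NPAGES)
			end = SIM_NPAGES;
		for (i = first; i < end; i++)
			b->resident[i] = true;	/* prefault and hold */
		for (i = first; i < end; i++)
			b->copied[i]++;		/* copy via held pages */
		nchunks++;
	}
	return (nchunks);
}

/* Try the fast path first; on EFAULT fall back to chunked i/o. */
static int
sim_vn_io_fault(struct sim_buf *b)
{
	assert(b != NULL);
	if (sim_io_nofault(b) == 0)
		return (0);	/* fast path: no fallback chunks needed */
	return (sim_io_chunked(b));
}
```

Calling sim_vn_io_fault() on a buffer whose pages are all resident takes the fast path and returns 0. If any page is non-resident, the call falls back and returns the number of chunks processed. In both cases each page is copied exactly once.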

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #10177
2020-04-08 10:30:27 -07:00
.github Add an .editorconfig; document git whitespace settings 2020-01-27 13:32:52 -08:00
cmd Add 'zfs wait' command 2020-04-01 10:02:06 -07:00
config Fix CONFIG_MODULES=no Linux kernel config 2020-02-28 09:23:48 -08:00
contrib Fix cstyle warnings 2020-03-17 15:42:27 -07:00
etc Fix zfs-functions packaging bug 2020-03-10 09:53:20 -07:00
include Finish refactoring for ZFS_MODULE_PARAM_CALL 2020-04-07 10:06:22 -07:00
lib libzfs_pool: Remove unused check for ENOTBLK 2020-04-07 10:04:40 -07:00
man Add 'zfs wait' command 2020-04-01 10:02:06 -07:00
module Use vn_io_fault_uiomove on FreeBSD to avoid potential deadlock 2020-04-08 10:30:27 -07:00
rpm Change http://zfsonlinux.org links to https://zfsonlinux.org 2020-01-13 16:43:59 -08:00
scripts zloop.sh should call ZDB with pool name 2020-03-11 10:02:23 -07:00
tests ZTS: Fix non-portable date format 2020-04-06 16:07:35 -07:00
udev Create symbolic links in /dev/disk/by-vdev for nvme disk devices 2019-12-17 17:50:20 -08:00
.editorconfig Add an .editorconfig; document git whitespace settings 2020-01-27 13:32:52 -08:00
.gitignore Adapt gitignore for modules 2019-12-02 13:23:47 -08:00
.gitmodules Add zimport.sh compatibility test script 2014-02-21 12:10:31 -08:00
.travis.yml Add .travis.yml 2017-11-13 09:18:18 -08:00
AUTHORS Update build system and packaging 2018-05-29 16:00:33 -07:00
autogen.sh Cause autogen.sh to fail if autoreconf fails 2018-07-06 09:27:37 -07:00
CODE_OF_CONDUCT.md Add CODE_OF_CONDUCT.md 2019-04-30 10:58:45 -07:00
configure.ac Add 'zfs wait' command 2020-04-01 10:02:06 -07:00
copy-builtin bash scripts: use /usr/bin/env for bash shebangs 2020-02-10 13:13:46 -08:00
COPYRIGHT ICP: Improve AES-GCM performance 2020-02-10 12:59:50 -08:00
LICENSE Update build system and packaging 2018-05-29 16:00:33 -07:00
Makefile.am Perform KABI checks in parallel 2019-10-01 12:50:34 -07:00
META Update maximum kernel version to 5.4 2019-12-23 14:24:36 -08:00
NEWS Add NEWS file 2018-09-18 12:03:47 -07:00
NOTICE Update build system and packaging 2018-05-29 16:00:33 -07:00
README.md Update README for OpenZFS 2020-02-25 11:43:20 -08:00
TEST Remove CI builder customization from TEST 2020-03-16 10:46:03 -07:00
zfs.release.in Move zfs.release generation to configure step 2012-07-12 12:22:51 -07:00

OpenZFS is an advanced file system and volume manager which was originally developed for Solaris and is now maintained by the OpenZFS community. This repository contains the code for running OpenZFS on Linux and FreeBSD.

Official Resources

  • Wiki - for using and developing this repo
  • ZoL Site - Linux release info & links
  • Mailing lists
  • OpenZFS site - for conference videos and info on other platforms (illumos, OSX, Windows, etc)

Installation

Full documentation for installing OpenZFS on your favorite Linux distribution can be found at the ZoL Site.

FreeBSD support is a work in progress. See the PR.

Contribute & Develop

We have a separate document with contribution guidelines.

We have a Code of Conduct.

Release

OpenZFS is released under a CDDL license. For more details see the NOTICE, LICENSE and COPYRIGHT files; UCRL-CODE-235197

Supported Kernels

  • The META file contains the officially recognized supported Linux kernel versions.