Go to file
Matthew Ahrens 1dc32a67e9
Improve zfs send performance by bypassing the ARC
When doing a zfs send on a dataset with small recordsize (e.g. 8K),
performance is dominated by the per-block overheads.  This is especially
true with `zfs send --compressed`, which further reduces the amount of
data sent, for the same number of blocks.  Several threads are involved,
but the limiting factor is the `send_prefetch` thread, which is 100% on
CPU.

The main job of the `send_prefetch` thread is to issue zio's for the
data that will be needed by the main thread.  It does this by calling
`arc_read(ARC_FLAG_PREFETCH)`.  This has an immediate cost of creating
an arc_hdr, which takes around 14% of one CPU.  It also induces later
costs by other threads:

 * Since the data was only prefetched, dmu_send()->dmu_dump_write() will
   need to call arc_read() again to get the data.  This will have to
   look up the arc_hdr in the hash table and copy the data from the
   scatter ABD in the arc_hdr to a linear ABD in arc_buf.  This takes
   27% of one CPU.

 * dmu_dump_write() needs to arc_buf_destroy()  This takes 11% of one
   CPU.

 * arc_adjust() will need to evict this arc_hdr, taking about 50% of one
   CPU.

All of these costs can be avoided by bypassing the ARC if the data is
not already cached.  This commit changes `zfs send` to check for the
data in the ARC, and if it is not found then we directly call
`zio_read()`, reading the data into a linear ABD which is used by
dmu_dump_write() directly.

The performance improvement is best expressed in terms of how many
blocks can be processed by `zfs send` in one second.  This change
increases the metric by 50%, from ~100,000 to ~150,000.  When the amount
of data per block is small (e.g. 2KB), there is a corresponding
reduction in the elapsed time of `zfs send >/dev/null` (from 86 minutes
to 58 minutes in this test case).

In addition to improving the performance of `zfs send`, this change
makes `zfs send` not pollute the ARC cache.  In most cases the data will
not be reused, so this allows us to keep caching useful data in the MRU
(hit-once) part of the ARC.

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #10067
2020-03-10 10:51:04 -07:00
.github Add an .editorconfig; document git whitespace settings 2020-01-27 13:32:52 -08:00
cmd Add trim support to zpool wait 2020-03-04 15:07:11 -08:00
config Fix CONFIG_MODULES=no Linux kernel config 2020-02-28 09:23:48 -08:00
contrib Fix zfs-functions packaging bug 2020-03-10 09:53:20 -07:00
etc Fix zfs-functions packaging bug 2020-03-10 09:53:20 -07:00
include Improve zfs send performance by bypassing the ARC 2020-03-10 10:51:04 -07:00
lib Change default to overlay=on 2020-03-06 09:28:19 -08:00
man Change default to overlay=on 2020-03-06 09:28:19 -08:00
module Improve zfs send performance by bypassing the ARC 2020-03-10 10:51:04 -07:00
rpm Change http://zfsonlinux.org links to https://zfsonlinux.org 2020-01-13 16:43:59 -08:00
scripts bash scripts: use /usr/bin/env for bash shebangs 2020-02-10 13:13:46 -08:00
tests ZTS: Simplify some libtest functions 2020-03-10 10:44:14 -07:00
udev Create symbolic links in /dev/disk/by-vdev for nvme disk devices 2019-12-17 17:50:20 -08:00
.editorconfig Add an .editorconfig; document git whitespace settings 2020-01-27 13:32:52 -08:00
.gitignore Adapt gitignore for modules 2019-12-02 13:23:47 -08:00
.gitmodules Add zimport.sh compatibility test script 2014-02-21 12:10:31 -08:00
.travis.yml Add .travis.yml 2017-11-13 09:18:18 -08:00
AUTHORS Update build system and packaging 2018-05-29 16:00:33 -07:00
autogen.sh Cause autogen.sh to fail if autoreconf fails 2018-07-06 09:27:37 -07:00
CODE_OF_CONDUCT.md Add CODE_OF_CONDUCT.md 2019-04-30 10:58:45 -07:00
configure.ac Fix zfs-functions packaging bug 2020-03-10 09:53:20 -07:00
copy-builtin bash scripts: use /usr/bin/env for bash shebangs 2020-02-10 13:13:46 -08:00
COPYRIGHT ICP: Improve AES-GCM performance 2020-02-10 12:59:50 -08:00
LICENSE Update build system and packaging 2018-05-29 16:00:33 -07:00
Makefile.am Perform KABI checks in parallel 2019-10-01 12:50:34 -07:00
META Update maximum kernel version to 5.4 2019-12-23 14:24:36 -08:00
NEWS Add NEWS file 2018-09-18 12:03:47 -07:00
NOTICE Update build system and packaging 2018-05-29 16:00:33 -07:00
README.md Update README for OpenZFS 2020-02-25 11:43:20 -08:00
TEST Update build system and packaging 2018-05-29 16:00:33 -07:00
zfs.release.in Move zfs.release generation to configure step 2012-07-12 12:22:51 -07:00

img

OpenZFS is an advanced file system and volume manager which was originally developed for Solaris and is now maintained by the OpenZFS community. This repository contains the code for running OpenZFS on Linux and FreeBSD.

codecov coverity

Official Resources

  • Wiki - for using and developing this repo
  • ZoL Site - Linux release info & links
  • Mailing lists
  • OpenZFS site - for conference videos and info on other platforms (illumos, OSX, Windows, etc)

Installation

Full documentation for installing OpenZFS on your favorite Linux distribution can be found at the ZoL Site.

FreeBSD support is a work in progress. See the PR.

Contribute & Develop

We have a separate document with contribution guidelines.

We have a Code of Conduct.

Release

OpenZFS is released under a CDDL license. For more details see the NOTICE, LICENSE and COPYRIGHT files; UCRL-CODE-235197

Supported Kernels

  • The META file contains the officially recognized supported Linux kernel versions.