Go to file
Matthew Ahrens c618f87cd2
Add zstream redup command to convert deduplicated send streams
Deduplicated send and receive is deprecated.  To ease migration to the
new dedup-send-less world, the commit adds a `zstream redup` utility to
convert deduplicated send streams to normal streams, so that they can
continue to be received indefinitely.

The new `zstream` command also replaces the functionality of
`zstreamdump`, by way of the `zstream dump` subcommand.  The
`zstreamdump` command is replaced by a shell script which invokes
`zstream dump`.

The way that `zstream redup` works under the hood is that as we read the
send stream, we build up a hash table which maps from `<GUID, object,
offset> -> <file_offset>`.

Whenever we see a WRITE record, we add a new entry to the hash table,
which indicates where in the stream file to find the WRITE record for
this block. (The key is `drr_toguid, drr_object, drr_offset`.)

For entries other than WRITE_BYREF, we pass them through unchanged
(except for the running checksum, which is recalculated).

For WRITE_BYREF records, we change them to WRITE records.  We find the
referenced WRITE record by looking in the hash table (for the record
with key `drr_refguid, drr_refobject, drr_refoffset`), and then reading
the record header and payload from the specified offset in the stream
file.  This is why the stream can not be a pipe.  The found WRITE record
replaces the WRITE_BYREF record, with its `drr_toguid`, `drr_object`,
and `drr_offset` fields changed to be the same as the WRITE_BYREF's
(i.e. we are writing the same logical block, but with the data supplied
by the previous WRITE record).

This algorithm requires memory proportional to the number of WRITE
records (same as `zfs send -D`), but the size per WRITE record is
relatively low (40 bytes, vs. 72 for `zfs send -D`).  A 1TB send stream
with 8KB blocks (`recordsize=8k`) would use around 5GB of RAM to
"redup".

Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #10124 
Closes #10156
2020-04-10 10:39:55 -07:00
.github Add an .editorconfig; document git whitespace settings 2020-01-27 13:32:52 -08:00
cmd Add zstream redup command to convert deduplicated send streams 2020-04-10 10:39:55 -07:00
config Linux 5.7 compat: blk_alloc_queue() 2020-04-09 09:16:46 -07:00
contrib Fix cstyle warnings 2020-03-17 15:42:27 -07:00
etc Fix zfs-functions packaging bug 2020-03-10 09:53:20 -07:00
include Persistent L2ARC 2020-04-10 10:33:35 -07:00
lib Add zstream redup command to convert deduplicated send streams 2020-04-10 10:39:55 -07:00
man Add zstream redup command to convert deduplicated send streams 2020-04-10 10:39:55 -07:00
module Persistent L2ARC 2020-04-10 10:33:35 -07:00
rpm Change http://zfsonlinux.org links to https://zfsonlinux.org 2020-01-13 16:43:59 -08:00
scripts zloop.sh should call ZDB with pool name 2020-03-11 10:02:23 -07:00
tests Add zstream redup command to convert deduplicated send streams 2020-04-10 10:39:55 -07:00
udev Create symbolic links in /dev/disk/by-vdev for nvme disk devices 2019-12-17 17:50:20 -08:00
.editorconfig Add an .editorconfig; document git whitespace settings 2020-01-27 13:32:52 -08:00
.gitignore Adapt gitignore for modules 2019-12-02 13:23:47 -08:00
.gitmodules Add zimport.sh compatibility test script 2014-02-21 12:10:31 -08:00
.travis.yml Add .travis.yml 2017-11-13 09:18:18 -08:00
AUTHORS Update build system and packaging 2018-05-29 16:00:33 -07:00
autogen.sh Cause autogen.sh to fail if autoreconf fails 2018-07-06 09:27:37 -07:00
CODE_OF_CONDUCT.md Add CODE_OF_CONDUCT.md 2019-04-30 10:58:45 -07:00
configure.ac Add zstream redup command to convert deduplicated send streams 2020-04-10 10:39:55 -07:00
copy-builtin bash scripts: use /usr/bin/env for bash shebangs 2020-02-10 13:13:46 -08:00
COPYRIGHT ICP: Improve AES-GCM performance 2020-02-10 12:59:50 -08:00
LICENSE Update build system and packaging 2018-05-29 16:00:33 -07:00
Makefile.am Perform KABI checks in parallel 2019-10-01 12:50:34 -07:00
META Linux 5.7 compat: blk_alloc_queue() 2020-04-09 09:16:46 -07:00
NEWS Add NEWS file 2018-09-18 12:03:47 -07:00
NOTICE Update build system and packaging 2018-05-29 16:00:33 -07:00
README.md Update README for OpenZFS 2020-02-25 11:43:20 -08:00
TEST Remove CI builder customization from TEST 2020-03-16 10:46:03 -07:00
zfs.release.in Move zfs.release generation to configure step 2012-07-12 12:22:51 -07:00

img

OpenZFS is an advanced file system and volume manager which was originally developed for Solaris and is now maintained by the OpenZFS community. This repository contains the code for running OpenZFS on Linux and FreeBSD.

codecov coverity

Official Resources

  • Wiki - for using and developing this repo
  • ZoL Site - Linux release info & links
  • Mailing lists
  • OpenZFS site - for conference videos and info on other platforms (illumos, OSX, Windows, etc)

Installation

Full documentation for installing OpenZFS on your favorite Linux distribution can be found at the ZoL Site.

FreeBSD support is a work in progress. See the PR.

Contribute & Develop

We have a separate document with contribution guidelines.

We have a Code of Conduct.

Release

OpenZFS is released under a CDDL license. For more details see the NOTICE, LICENSE and COPYRIGHT files; UCRL-CODE-235197

Supported Kernels

  • The META file contains the officially recognized supported Linux kernel versions.