Matthew Ahrens c618f87cd2 Add zstream redup command to convert deduplicated send streams
Deduplicated send and receive is deprecated.  To ease migration to the
new dedup-send-less world, the commit adds a `zstream redup` utility to
convert deduplicated send streams to normal streams, so that they can
continue to be received indefinitely.

The new `zstream` command also replaces the functionality of
`zstreamdump`, by way of the `zstream dump` subcommand.  The
`zstreamdump` command is replaced by a shell script which invokes
`zstream dump`.

The way that `zstream redup` works under the hood is that as we read the
send stream, we build up a hash table which maps from `<GUID, object,
offset> -> <file_offset>`.

Whenever we see a WRITE record, we add a new entry to the hash table,
which indicates where in the stream file to find the WRITE record for
this block. (The key is `drr_toguid, drr_object, drr_offset`.)

For entries other than WRITE_BYREF, we pass them through unchanged
(except for the running checksum, which is recalculated).

For WRITE_BYREF records, we change them to WRITE records.  We find the
referenced WRITE record by looking in the hash table (for the record
with key `drr_refguid, drr_refobject, drr_refoffset`), and then reading
the record header and payload from the specified offset in the stream
file.  This is why the stream can not be a pipe.  The found WRITE record
replaces the WRITE_BYREF record, with its `drr_toguid`, `drr_object`,
and `drr_offset` fields changed to be the same as the WRITE_BYREF's
(i.e. we are writing the same logical block, but with the data supplied
by the previous WRITE record).

This algorithm requires memory proportional to the number of WRITE
records (same as `zfs send -D`), but the size per WRITE record is
relatively low (40 bytes, vs. 72 for `zfs send -D`).  A 1TB send stream
with 8KB blocks (`recordsize=8k`) would use around 5GB of RAM to
"redup".

Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #10124 
Closes #10156
2020-04-10 10:39:55 -07:00
2020-03-17 15:42:27 -07:00
2020-03-10 09:53:20 -07:00
2020-04-10 10:33:35 -07:00
2020-04-10 10:33:35 -07:00
2019-12-02 13:23:47 -08:00
2017-11-13 09:18:18 -08:00
2018-05-29 16:00:33 -07:00
2019-04-30 10:58:45 -07:00
2020-02-10 12:59:50 -08:00
2018-05-29 16:00:33 -07:00
2019-10-01 12:50:34 -07:00
2020-04-09 09:16:46 -07:00
2018-09-18 12:03:47 -07:00
2018-05-29 16:00:33 -07:00
2020-02-25 11:43:20 -08:00
2020-03-16 10:46:03 -07:00

img

OpenZFS is an advanced file system and volume manager which was originally developed for Solaris and is now maintained by the OpenZFS community. This repository contains the code for running OpenZFS on Linux and FreeBSD.

codecov coverity

Official Resources

  • Wiki - for using and developing this repo
  • ZoL Site - Linux release info & links
  • Mailing lists
  • OpenZFS site - for conference videos and info on other platforms (illumos, OSX, Windows, etc)

Installation

Full documentation for installing OpenZFS on your favorite Linux distribution can be found at the ZoL Site.

FreeBSD support is a work in progress. See the PR.

Contribute & Develop

We have a separate document with contribution guidelines.

We have a Code of Conduct.

Release

OpenZFS is released under a CDDL license. For more details see the NOTICE, LICENSE and COPYRIGHT files; UCRL-CODE-235197

Supported Kernels

  • The META file contains the officially recognized supported Linux kernel versions.
S
Description
No description provided
Readme 122 MiB
Languages
C 70.2%
Shell 19.9%
Assembly 5.1%
M4 1.9%
Python 1.6%
Other 1.3%