Native Encryption for ZFS on Linux

This change incorporates three major pieces:

The first change is a keystore that manages wrapping
and encryption keys for encrypted datasets. These
commands mostly involve manipulating the new
DSL Crypto Key ZAP Objects that live in the MOS. Each
encrypted dataset has its own DSL Crypto Key that is
protected with a user's key. This level of indirection
allows users to change their keys without re-encrypting
their entire datasets. The change implements the new
subcommands "zfs load-key", "zfs unload-key" and
"zfs change-key" which allow the user to manage their
encryption keys and settings. In addition, several new
flags and properties have been added to allow dataset
creation and to make mounting and unmounting more
convenient.

The second piece of this patch provides the ability to
encrypt, decyrpt, and authenticate protected datasets.
Each object set maintains a Merkel tree of Message
Authentication Codes that protect the lower layers,
similarly to how checksums are maintained. This part
impacts the zio layer, which handles the actual
encryption and generation of MACs, as well as the ARC
and DMU, which need to be able to handle encrypted
buffers and protected data.

The last addition is the ability to do raw, encrypted
sends and receives. The idea here is to send raw
encrypted and compressed data and receive it exactly
as is on a backup system. This means that the dataset
on the receiving system is protected using the same
user key that is in use on the sending side. By doing
so, datasets can be efficiently backed up to an
untrusted system without fear of data being
compromised.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #494 
Closes #5769
This commit is contained in:
Tom Caputi
2017-08-14 13:36:48 -04:00
committed by Brian Behlendorf
parent 376994828f
commit b525630342
163 changed files with 16091 additions and 1204 deletions
+21
View File
@@ -619,5 +619,26 @@ files.
.RE
.sp
.ne 2
.na
\fB\fBencryption\fR\fR
.ad
.RS 4n
.TS
l l .
GUID com.datto:encryption
READ\-ONLY COMPATIBLE no
DEPENDENCIES extensible_dataset
.TE
This feature enables the creation and management of natively encrypted datasets.
This feature becomes \fBactive\fR when an encrypted dataset is created and will
be returned to the \fBenabled\fR state when all datasets that use this feature
are destroyed.
.RE
.SH "SEE ALSO"
\fBzpool\fR(8)
+370 -5
View File
@@ -148,7 +148,7 @@
.Cm mount
.Nm
.Cm mount
.Op Fl Ov
.Op Fl Olv
.Op Fl o Ar options
.Fl a | Ar filesystem
.Nm
@@ -166,12 +166,12 @@
.Ar snapshot bookmark
.Nm
.Cm send
.Op Fl DLPRcenpv
.Op Fl DLPRcenpvw
.Op Oo Fl I Ns | Ns Fl i Oc Ar snapshot
.Ar snapshot
.Nm
.Cm send
.Op Fl Lce
.Op Fl Lcew
.Op Fl i Ar snapshot Ns | Ns Ar bookmark
.Ar filesystem Ns | Ns Ar volume Ns | Ns Ar snapshot
.Nm
@@ -270,6 +270,27 @@
.Cm diff
.Op Fl FHt
.Ar snapshot Ar snapshot Ns | Ns Ar filesystem
.Nm
.Cm load-key
.Op Fl nr
.Op Fl L Ar keylocation
.Fl a | Ar filesystem
.Nm
.Cm unload-key
.Op Fl r
.Fl a | Ar filesystem
.Nm
.Cm change-key
.Op Fl l
.Op Fl o Ar keylocation Ns = Ns Ar value
.Op Fl o Ar keyformat Ns = Ns Ar value
.Op Fl o Ar pbkdf2iters Ns = Ns Ar value
.Ar filesystem
.Nm
.Cm change-key
.Fl i
.Op Fl l
.Ar filesystem
.Sh DESCRIPTION
The
.Nm
@@ -572,12 +593,36 @@ if the snapshot has been marked for deferred destroy by using the
command.
Otherwise, the property is
.Sy off .
.It Sy encryptionroot
For encrypted datasets, indicates where the dataset is currently inheriting its
encryption key from. Loading or unloading a key for the
.Sy encryptionroot
will implicitly load / unload the key for any inheriting datasets (see
.Nm zfs Cm load-key
and
.Nm zfs Cm unload-key
for details).
Clones will always share an
encryption key with their origin. See the
.Sx Encryption
section for details.
.It Sy filesystem_count
The total number of filesystems and volumes that exist under this location in
the dataset tree.
This value is only available when a
.Sy filesystem_limit
has been set somewhere in the tree under which the dataset resides.
.It Sy keystatus
Indicates if an encryption key is currently loaded into ZFS. The possible
values are
.Sy none ,
.Sy available ,
and
.Sy unavailable .
See
.Nm zfs Cm load-key
and
.Nm zfs Cm unload-key .
.It Sy guid
The 64 bit GUID of this dataset or bookmark which does not change over its
entire lifetime. When a snapshot is sent to another pool, the received
@@ -1218,6 +1263,93 @@ that doesn't support the large_dnode feature.
.Pp
This property can also be referred to by its shortened column name,
.Sy dnsize .
.It Xo
.Sy encryption Ns = Ns Sy off Ns | Ns Sy on Ns | Ns Sy aes-128-ccm Ns | Ns
.Sy aes-192-ccm Ns | Ns Sy aes-256-ccm Ns | Ns Sy aes-128-gcm Ns | Ns
.Sy aes-192-gcm Ns | Ns Sy aes-256-gcm
.Xc
Controls the encryption cipher suite (block cipher, key length, and mode) used
for this dataset. Requires the
.Sy encryption
feature to be enabled on the pool.
Requires a
.Sy keyformat
to be set at dataset creation time.
.Pp
Selecting
.Sy encryption Ns = Ns Sy on
when creating a dataset indicates that the default encryption suite will be
selected, which is currently
.Sy aes-256-ccm .
In order to provide consistent data protection, encryption must be specified at
dataset creation time and it cannot be changed afterwards.
.Pp
For more details and caveats about encryption see the
.Sy Encryption
section.
.It Sy keyformat Ns = Ns Sy raw Ns | Ns Sy hex Ns | Ns Sy passphrase
Controls what format the user's encryption key will be provided as. This
property is only set when the dataset is encrypted.
.Pp
Raw keys and hex keys must be 32 bytes long (regardless of the chosen
encryption suite) and must be randomly generated. A raw key can be generated
with the following command:
.Bd -literal
# dd if=/dev/urandom of=/path/to/output/key bs=32 count=1
.Ed
.Pp
Passphrases must be between 8 and 512 bytes long and will be processed through
PBKDF2 before being used (see the
.Sy pbkdf2iters
property). Even though the
encryption suite cannot be changed after dataset creation, the keyformat can be
with
.Nm zfs Cm change-key .
.It Xo
.Sy keylocation Ns = Ns Sy prompt Ns | Ns Sy file:// Ns Em </absolute/file/path>
.Xc
Controls where the user's encryption key will be loaded from by default for
commands such as
.Nm zfs Cm load-key
and
.Nm zfs Cm mount Cm -l . This property is
only set for encrypted datasets which are encryption roots. If unspecified, the
default is
.Sy prompt.
.Pp
Even though the encryption suite cannot be changed after dataset creation, the
keylocation can be with either
.Nm zfs Cm set
or
.Nm zfs Cm change-key .
If
.Sy prompt
is selected ZFS will ask for the key at the command prompt when it is required
to access the encrypted data (see
.Nm zfs Cm load-key
for details). This setting will also allow the key to be passed in via STDIN,
but users should be careful not to place keys which should be kept secret on
the command line. If a file URI is selected, the key will be loaded from the
specified absolute file path.
.It Sy pbkdf2iters Ns = Ns Ar iterations
Controls the number of PBKDF2 iterations that a
.Sy passphrase
encryption key should be run through when processing it into an encryption key.
This property is only defined when encryption is enabled and a keyformat of
.Sy passphrase
is selected. The goal of PBKDF2 is to significantly increase the
computational difficulty needed to brute force a user's passphrase. This is
accomplished by forcing the attacker to run each passphrase through a
computationally expensive hashing function many times before they arrive at the
resulting key. A user who actually knows the passphrase will only have to pay
this cost once. As CPUs become better at processing, this number should be
raised to ensure that a brute force attack is still not possible. The current
default is
.Sy 350000
and the minimum is
.Sy 100000 .
This property may be changed with
.Nm zfs Cm change-key .
.It Sy exec Ns = Ns Sy on Ns | Ns Sy off
Controls whether processes can be executed from within this file system.
The default value is
@@ -2020,6 +2152,69 @@ and
.Xr swapon 8
commands. Do not swap to a file on a ZFS file system. A ZFS swap file
configuration is not supported.
.Ss Encryption
Enabling the
.Sy encryption
feature allows for the creation of encrypted filesystems and volumes.
.Nm
will encrypt all user data including file and zvol data, file attributes,
ACLs, permission bits, directory listings, FUID mappings, and userused /
groupused data.
.Nm
will not encrypt metadata related to the pool structure, including dataset
names, dataset hierarchy, file size, file holes, and dedup tables. Key rotation
is managed internally by the kernel module and changing the user's key does not
require re-encrypting the entire dataset. Datasets can be scrubbed, resilvered,
renamed, and deleted without the encryption keys being loaded (see the
.Nm zfs Cm load-key
subcommand for more info on key loading).
.Pp
Creating an encrypted dataset requires specifying the
.Sy encryption
and
.Sy keyformat
properties at creation time, along with an optional
.Sy
keylocation
and
.Sy pbkdf2iters .
After entering an encryption key, the
created dataset will become an encryption root. Any descendant datasets will
inherit their encryption key from the encryption root, meaning that loading,
unloading, or changing the key for the encryption root will implicitly do the
same for all inheriting datasets. If this inheritence is not desired, simply
supply a new
.Sy encryption
and
.Sy keyformat
when creating the child dataset or use
.Nm zfs Cm change-key
to break the relationship. The one exception is that clones will always use
their origin's encryption key. Encryption root inheritence can be tracked via
the read-only
.Sy encryptionroot
property.
.Pp
Encryption changes the behavior of a few
.Nm
operations. Encryption is applied after compression so compression ratios are
preserved. Normally checksums in ZFS are 256 bits long, but for encrypted data
the checksum is 128 bits of the user-chosen checksum and 128 bits of MAC from
the encryption suite, which provides additional protection against maliciously
altered data. Deduplication is still possible with encryption enabled but for
security, datasets will only dedup against themselves, their snapshots, and
their clones.
.Pp
There are a few limitations on encrypted datasets. Encrypted data cannot be
embedded via the
.Sy embedded_data
feature. Encrypted datasets may not have
.Sy copies Ns = Ns Em 3
since the implementation stores some encryption metadata where the third copy
would normally be. Since compression is applied before encryption datasets may
be vulnerable to a CRIME-like attack if applications accessing the data allow
for it. Deduplication with encryption will leak information about which blocks
are equivalent in a dataset and will incur an extra CPU cost per block written.
.Sh SUBCOMMANDS
All subcommands that modify state are logged persistently to the pool in their
original form.
@@ -2776,7 +2971,7 @@ Displays all ZFS file systems currently mounted.
.It Xo
.Nm
.Cm mount
.Op Fl Ov
.Op Fl Olv
.Op Fl o Ar options
.Fl a | Ar filesystem
.Xc
@@ -2798,6 +2993,15 @@ duration of the mount.
See the
.Sx Temporary Mount Point Properties
section for details.
.It Fl l
Load keys for encrypted filesystems as they are being mounted. This is
equivalent to executing
.Nm zfs Cm load-key
on each encryption root before mounting it. Note that if a filesystem has a
.Sy keylocation
of
.Sy prompt
this will cause the terminal to interactively block after asking for the key.
.It Fl v
Report mount progress.
.El
@@ -2875,7 +3079,7 @@ feature.
.It Xo
.Nm
.Cm send
.Op Fl DLPRcenpv
.Op Fl DLPRcenpvw
.Op Oo Fl I Ns | Ns Fl i Oc Ar snapshot
.Ar snapshot
.Xc
@@ -2987,6 +3191,23 @@ option is not supplied in conjunction with
.Fl c ,
then the data will be decompressed before sending so it can be split into
smaller block sizes.
.It Fl w, -raw
For encrypted datasets, send data exactly as it exists on disk. This allows
backups to be taken even if encryption keys are not currently loaded. The
backup may then be received on an untrusted machine since that machine will
not have the encryption keys to read the protected data or alter it without
being detected. Upon being received, the dataset will have the same encryption
keys as it did on the send side, although the
.Sy keylocation
property will be defaulted to
.Sy prompt
if not otherwise provided. For unencrypted datasets, this flag will be
equivalent to
.Fl Lec .
Note that if you do not use this flag for sending encrypted datasets, data will
be sent unencrypted and may be re-encrypted with a different encryption key on
the receiving system, which will disable the ability to do a raw send to that
system for incrementals.
.It Fl i Ar snapshot
Generate an incremental stream from the first
.Ar snapshot
@@ -3085,6 +3306,23 @@ option is not supplied in conjunction with
.Fl c ,
then the data will be decompressed before sending so it can be split into
smaller block sizes.
.It Fl w, -raw
For encrypted datasets, send data exactly as it exists on disk. This allows
backups to be taken even if encryption keys are not currently loaded. The
backup may then be received on an untrusted machine since that machine will
not have the encryption keys to read the protected data or alter it without
being detected. Upon being received, the dataset will have the same encryption
keys as it did on the send side, although the
.Sy keylocation
property will be defaulted to
.Sy prompt
if not otherwise provided. For unencrypted datasets, this flag will be
equivalent to
.Fl Lec .
Note that if you do not use this flag for sending encrypted datasets, data will
be sent unencrypted and may be re-encrypted with a different encryption key on
the receiving system, which will disable the ability to do a raw send to that
system for incrementals.
.It Fl e, -embed
Generate a more compact stream by using
.Sy WRITE_EMBEDDED
@@ -3478,6 +3716,10 @@ diff subcommand Allows lookup of paths within a dataset
given an object number, and the ability
to create snapshots necessary to
'zfs diff'.
load-key subcommand Allows loading and unloading of encryption key
(see 'zfs load-key' and 'zfs unload-key').
change-key subcommand Allows changing an encryption key via
'zfs change-key'.
mount subcommand Allows mount/umount of ZFS datasets
promote subcommand Must also have the 'mount' and 'promote'
ability in the origin file system
@@ -3726,6 +3968,129 @@ arrows.
.It Fl t
Display the path's inode change time as the first column of output.
.El
.It Xo
.Nm
.Cm load-key
.Op Fl nr
.Op Fl L Ar keylocation
.Fl a | Ar filesystem
.Xc
Load the key for
.Ar filesystem ,
allowing it and all children that inherit the
.Sy keylocation
property to be accessed. The key will be expected in the format specified by the
.Sy keyformat
and location specified by the
.Sy keylocation
property. Note that if the
.Sy keylocation
is set to
.Sy prompt
the terminal will interactively wait for the key to be entered. Loading a key
will not automatically mount the dataset. If that functionality is desired,
.Nm zfs Cm mount Sy -l
will ask for the key and mount the dataset. Once the key is loaded the
.Sy keystatus
property will become
.Sy available .
.Bl -tag -width "-r"
.It Fl r
Recursively loads the keys for the specified filesystem and all descendent
encryption roots.
.It Fl a
Loads the keys for all encryption roots in all imported pools.
.It Fl n
Do a dry-run
.Pq Qq No-op
load-key. This will cause zfs to simply check that the
provided key is correct. This command may be run even if the key is already
loaded.
.It Fl L Ar keylocation
Use
.Ar keylocation
instead of the
.Sy keylocation
property. This will not change the value of the property on the dataset. Note
that if used with either
.Fl r
or
.Fl a ,
.Ar keylocation
may only be given as
.Sy prompt .
.El
.It Xo
.Nm
.Cm unload-key
.Op Fl r
.Fl a | Ar filesystem
.Xc
Unloads a key from ZFS, removing the ability to access the dataset and all of
its children that inherit the
.Sy keylocation
property. This requires that the dataset is not currently open or mounted. Once
the key is unloaded the
.Sy keystatus
property will become
.Sy unavailable .
.Bl -tag -width "-r"
.It Fl r
Recursively unloads the keys for the specified filesystem and all descendent
encryption roots.
.It Fl a
Unloads the keys for all encryption roots in all imported pools.
.El
.It Xo
.Nm
.Cm change-key
.Op Fl l
.Op Fl o Ar keylocation Ns = Ns Ar value
.Op Fl o Ar keyformat Ns = Ns Ar value
.Op Fl o Ar pbkdf2iters Ns = Ns Ar value
.Ar filesystem
.Xc
.It Xo
.Nm
.Cm change-key
.Fl i
.Op Fl l
.Ar filesystem
.Xc
Allows a user to change the encryption key used to access a dataset. This
command requires that the existing key for the dataset is already loaded into
ZFS. This command may also be used to change the
.Sy keylocation ,
.Sy keyformat ,
and
.Sy pbkdf2iters
properties as needed. If the dataset was not previously an encryption root it
will become one. Alternatively, the
.Fl i
flag may be provided to cause an encryption root to inherit the parent's key
instead.
.Bl -tag -width "-r"
.It Fl l
Ensures the key is loaded before attempting to change the key. This is
effectively equivalent to
.Qq Nm zfs Cm load-key Ar filesystem ; Nm zfs Cm change-key Ar filesystem
.It Fl o Ar property Ns = Ns Ar value
Allows the user to set encryption key properties (
.Sy keyformat ,
.Sy keylocation ,
and
.Sy pbkdf2iters
) while changing the key. This is the only way to alter
.Sy keyformat
and
.Sy pbkdf2iters
after the dataset has been created.
.It Fl i
Indicates that zfs should make
.Ar filesystem
inherit the key of its parent. Note that this command can only be run on an
encryption root that has an encrypted parent.
.El
.El
.Sh EXIT STATUS
The
+33 -6
View File
@@ -92,7 +92,7 @@
.Nm
.Cm import
.Fl a
.Op Fl DfmN
.Op Fl DflmN
.Op Fl F Oo Fl n Oc Oo Fl T Oc Oo Fl X Oc
.Op Fl c Ar cachefile Ns | Ns Fl d Ar dir
.Op Fl o Ar mntopts
@@ -100,7 +100,7 @@
.Op Fl R Ar root
.Nm
.Cm import
.Op Fl Dfm
.Op Fl Dflm
.Op Fl F Oo Fl n Oc Oo Fl T Oc Oo Fl X Oc
.Op Fl c Ar cachefile Ns | Ns Fl d Ar dir
.Op Fl o Ar mntopts
@@ -160,7 +160,7 @@
.Ar pool
.Nm
.Cm split
.Op Fl gLnP
.Op Fl gLlnP
.Oo Fl o Ar property Ns = Ns Ar value Oc Ns ...
.Op Fl R Ar root
.Ar pool newpool
@@ -1186,7 +1186,7 @@ Lists destroyed pools only.
.Nm
.Cm import
.Fl a
.Op Fl DfmN
.Op Fl DflmN
.Op Fl F Oo Fl n Oc Oo Fl T Oc Oo Fl X Oc
.Op Fl c Ar cachefile Ns | Ns Fl d Ar dir
.Op Fl o Ar mntopts
@@ -1237,6 +1237,15 @@ transactions.
Not all damaged pools can be recovered by using this option.
If successful, the data from the discarded transactions is irretrievably lost.
This option is ignored if the pool is importable or already imported.
.It Fl l
Indicates that this command will request encryption keys for all encrypted
datasets it attempts to mount as it is bringing the pool online. Note that if
any datasets have a
.Sy keylocation
of
.Sy prompt
this command will block waiting for the keys to be entered. Without this flag
encrypted datasets will be left unavailable until the keys are loaded.
.It Fl m
Allows a pool to import when there is a missing log device.
Recent transactions can be lost because the log device will be discarded.
@@ -1298,7 +1307,7 @@ health of your pool and should only be used as a last resort.
.It Xo
.Nm
.Cm import
.Op Fl Dfm
.Op Fl Dflm
.Op Fl F Oo Fl n Oc Oo Fl t Oc Oo Fl T Oc Oo Fl X Oc
.Op Fl c Ar cachefile Ns | Ns Fl d Ar dir
.Op Fl o Ar mntopts
@@ -1357,6 +1366,15 @@ transactions.
Not all damaged pools can be recovered by using this option.
If successful, the data from the discarded transactions is irretrievably lost.
This option is ignored if the pool is importable or already imported.
.It Fl l
Indicates that this command will request encryption keys for all encrypted
datasets it attempts to mount as it is bringing the pool online. Note that if
any datasets have a
.Sy keylocation
of
.Sy prompt
this command will block waiting for the keys to be entered. Without this flag
encrypted datasets will be left unavailable until the keys are loaded.
.It Fl m
Allows a pool to import when there is a missing log device.
Recent transactions can be lost because the log device will be discarded.
@@ -1849,7 +1867,7 @@ values.
.It Xo
.Nm
.Cm split
.Op Fl gLnP
.Op Fl gLlnP
.Oo Fl o Ar property Ns = Ns Ar value Oc Ns ...
.Op Fl R Ar root
.Ar pool newpool
@@ -1887,6 +1905,15 @@ Display real paths for vdevs resolving all symbolic links. This can
be used to look up the current block device name regardless of the
.Pa /dev/disk/
path used to open it.
.It Fl l
Indicates that this command will request encryption keys for all encrypted
datasets it attempts to mount as it is bringing the new pool online. Note that
if any datasets have a
.Sy keylocation
of
.Sy prompt
this command will block waiting for the keys to be entered. Without this flag
encrypted datasets will be left unavailable until the keys are loaded.
.It Fl n
Do dry run, do not actually perform the split.
Print out the expected configuration of