Matthew Ahrens aa755b3549
Set aside a metaslab for ZIL blocks
Mixing ZIL and normal allocations has several problems:

1. ZIL blocks are allocated, written to disk, and then freed a few
seconds later.  This leaves behind holes (free segments) where the
ZIL blocks used to be, which increases fragmentation and in turn hurts
performance.

2. Under moderate load, ZIL allocations are 128KB each.  If the pool
is fairly fragmented, there may not be many free chunks of that size.
This causes ZFS to load more metaslabs to locate free segments of 128KB
or more.  The loading happens synchronously (from zil_commit()), and can
take around a second even if the metaslab's spacemap is cached in the
ARC.  All concurrent synchronous operations on this filesystem must wait
while the metaslab is loading.  This can cause a significant performance
impact.

3. If the pool is very fragmented, there may be zero free chunks of
128KB or more.  In this case, the ZIL falls back to txg_wait_synced(),
which has an enormous performance impact.

These problems can be eliminated by using a dedicated log device
("slog"), even one with the same performance characteristics as the
normal devices.

This change sets aside one metaslab from each top-level vdev that is
preferentially used for ZIL allocations (vdev_log_mg,
spa_embedded_log_class).  From an allocation perspective, this is
similar to having a dedicated log device, and it eliminates the
above-mentioned performance problems.

Log (ZIL) blocks can be allocated from the following locations.  Each
one is tried in order until the allocation succeeds:
1. dedicated log vdevs, aka "slog" (spa_log_class)
2. embedded slog metaslabs (spa_embedded_log_class)
3. other metaslabs in normal vdevs (spa_normal_class)
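
The fallback order above can be sketched as a small model.  This is an
illustrative toy, not the OpenZFS implementation: only the three class
names come from this commit message, while alloc_log_block(),
make_class(), and the first-fit free-segment lists are hypothetical
stand-ins invented for the example.

```python
from typing import Callable, List, Optional, Sequence, Tuple

KB = 1024

# Hypothetical allocator type: returns a block offset, or None on failure.
Allocator = Callable[[int], Optional[int]]


def make_class(free_segments: List[Tuple[int, int]]) -> Allocator:
    """Build a toy first-fit allocator over (offset, length) free segments."""
    segs = list(free_segments)

    def try_alloc(size: int) -> Optional[int]:
        for i, (off, length) in enumerate(segs):
            if length >= size:
                # Carve the allocation off the front of the segment.
                segs[i] = (off + size, length - size)
                return off
        return None

    return try_alloc


def alloc_log_block(
    size: int, classes: Sequence[Tuple[str, Allocator]]
) -> Optional[Tuple[str, int]]:
    """Try each class in order and return (class name, offset) on success."""
    for name, try_alloc in classes:
        off = try_alloc(size)
        if off is not None:
            return name, off
    return None  # every class failed


# Assumed pool layout for the example: no dedicated slog, a small embedded
# slog metaslab, and a fragmented normal class.
classes = [
    ("spa_log_class", make_class([])),
    ("spa_embedded_log_class", make_class([(0, 256 * KB)])),
    ("spa_normal_class", make_class([(1 << 30, 64 * KB), (2 << 30, 512 * KB)])),
]

first = alloc_log_block(128 * KB, classes)
second = alloc_log_block(128 * KB, classes)
third = alloc_log_block(128 * KB, classes)
print(first, second, third)
```

In this model the first two 128KB requests are served from the embedded
slog metaslab, and the third falls through to the normal class once the
embedded metaslab is exhausted.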

The space required for the embedded slog metaslabs is usually between
0.5% and 1.0% of the pool, and comes out of the existing 3.2% of "slop"
space that is not available for user data.
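
As a rough worked example of the space accounting, the sketch below
applies the percentages quoted above to a pool.  The 4 TiB pool size and
the helper name are assumptions for illustration, and the percentages
are treated as exact fractions, which is a simplification of how the
sizes are actually derived.

```python
TIB = 1 << 40


def fraction_of(pool_bytes: int, tenths_of_percent: int) -> int:
    """Return pool_bytes * (tenths_of_percent / 1000), rounded down."""
    return pool_bytes * tenths_of_percent // 1000


pool_bytes = 4 * TIB  # assumed pool size for the example

slop = fraction_of(pool_bytes, 32)          # 3.2% "slop" space
embedded_min = fraction_of(pool_bytes, 5)   # 0.5% for embedded slog metaslabs
embedded_max = fraction_of(pool_bytes, 10)  # 1.0% for embedded slog metaslabs

# Slop left over for its original purpose after the embedded slog is carved out.
remaining_min = slop - embedded_max
remaining_max = slop - embedded_min

print("slop bytes:          ", slop)
print("embedded slog bytes: ", embedded_min, "to", embedded_max)
print("remaining slop bytes:", remaining_min, "to", remaining_max)
```

Under these assumptions the embedded slog metaslabs always fit inside
the slop space, so the space available for user data is unchanged.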

On an all-SSD system with 4TB storage, 87% fragmentation, 60% capacity,
and recordsize=8k, testing shows a ~50% performance increase on random
8k sync writes.  On even more fragmented systems (which hit problem #3
above and call txg_wait_synced()), the performance improvement can be
arbitrarily large (>100x).

Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: George Wilson <gwilson@delphix.com>
Reviewed-by: Don Brady <don.brady@delphix.com>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #11389
2021-01-21 15:12:54 -08:00
.github Forward questions to github discussions 2020-12-21 20:09:02 -08:00
cmd Set aside a metaslab for ZIL blocks 2021-01-21 15:12:54 -08:00
config Autoconf 2.70 compatibility 2021-01-02 16:55:55 -08:00
contrib dracut: Support /usr/bin as 'systemctl' path 2021-01-21 12:59:24 -08:00
etc Verify zfs module loaded before starting services 2020-11-28 11:11:18 -08:00
include Set aside a metaslab for ZIL blocks 2021-01-21 15:12:54 -08:00
lib zpool: speed up importing large pools (#11469) 2021-01-21 12:55:54 -08:00
man Set aside a metaslab for ZIL blocks 2021-01-21 15:12:54 -08:00
module Set aside a metaslab for ZIL blocks 2021-01-21 15:12:54 -08:00
rpm Install zgenhostid to sbindir 2021-01-21 12:58:24 -08:00
scripts DKMS: Disable weak modules 2020-12-15 09:22:30 -08:00
tests Extending FreeBSD UIO Struct 2021-01-20 21:27:30 -08:00
udev Centralize variable substitution 2020-07-14 17:33:44 -07:00
.editorconfig Add an .editorconfig; document git whitespace settings 2020-01-27 13:32:52 -08:00
.gitignore Add FreeBSD support to OpenZFS 2020-04-14 11:36:28 -07:00
.gitmodules Add zimport.sh compatibility test script 2014-02-21 12:10:31 -08:00
AUTHORS Add zstd support to zfs 2020-08-20 10:30:06 -07:00
autogen.sh Cause autogen.sh to fail if autoreconf fails 2018-07-06 09:27:37 -07:00
CODE_OF_CONDUCT.md Replace ZFS on Linux references with OpenZFS 2020-10-08 20:10:13 -07:00
configure.ac Autoconf 2.70 compatibility 2021-01-02 16:55:55 -08:00
copy-builtin Replace ZFS on Linux references with OpenZFS 2020-10-08 20:10:13 -07:00
COPYRIGHT Fix typos 2020-06-09 21:24:09 -07:00
cppcheck-suppressions.txt Import ZStandard v1.4.5 2020-08-20 10:30:06 -07:00
LICENSE Update build system and packaging 2018-05-29 16:00:33 -07:00
Makefile.am dracut: use /bin/sh instead of bash as the intepreter 2020-11-28 11:02:08 -08:00
META Linux 5.10 compat: META 2020-12-23 08:55:02 -08:00
NEWS Fix NEWS file 2020-08-26 21:44:41 -07:00
NOTICE Update build system and packaging 2018-05-29 16:00:33 -07:00
README.md docs: update README's installation link 2020-10-08 09:33:53 -07:00
TEST Remove CI builder customization from TEST 2020-03-16 10:46:03 -07:00
zfs.release.in Move zfs.release generation to configure step 2012-07-12 12:22:51 -07:00


OpenZFS is an advanced file system and volume manager which was originally developed for Solaris and is now maintained by the OpenZFS community. This repository contains the code for running OpenZFS on Linux and FreeBSD.


Official Resources

Installation

Full documentation for installing OpenZFS on your favorite operating system can be found at the Getting Started Page.

Contribute & Develop

We have a separate document with contribution guidelines.

We have a Code of Conduct.

Release

OpenZFS is released under a CDDL license. For more details see the NOTICE, LICENSE and COPYRIGHT files; UCRL-CODE-235197

Supported Kernels

  • The META file contains the officially recognized supported Linux kernel versions.
  • Supported FreeBSD versions are 12-STABLE and 13-CURRENT.