Compare commits

..

391 Commits

Author SHA1 Message Date
Tony Hutter d2632f0cc1 Tag zfs-0.8.5
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2020-10-06 15:56:16 -07:00
Brian Behlendorf e23ba78bd8 Fix buggy procfs_list_seq_next warning
The kernel seq_read() helper function expects ->next() to update
the passed position even there are no more entries.  Failure to
do so results in the following warning being logged.

    seq_file: buggy .next function procfs_list_seq_next [spl]
    did not update position index

Functionally there is no issue with the way procfs_list_seq_next()
is implemented and the warning is harmless.  However, we want to
silence this some what scary incorrect warning.  This commit
updates the Linux procfs code to advance the position even for
the last entry.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #10984 
Closes #10996
2020-09-30 16:31:37 -07:00
Brian Behlendorf bf75263ace Fix CONFIG_DEBUG_LOCK_ALLOC configure check
This check was accidentally broken when the kABI checks were updated
to run in parallel, commit 608f874.  The check must be for the
config_debug_lock_alloc_license name to determine if the symbol
is license compatible.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #10991
2020-09-29 09:42:31 -07:00
Brian Behlendorf 5e245ab21b Fix objtool configure check
The m4 objtool configure check can incorrectly fail because of a
missing header in the test.  This appears to be the result of a
recent kernel change and was observed on the Fedora 5.8.11-200
kernel.

  In file included from /home/fedora/zfs/build/objtool/objtool.c:75:
  ./arch/x86/include/asm/frame.h:100:57: error: 'struct pt_regs'
      declared inside parameter list will not be visible outside
      of this definition or declaration [-Werror]

The consequence of this is that the "stack_frame_non_standard"
check is never run and HAVE_STACK_FRAME_NON_STANDARD is set
incorrectly which results in a build failure.  This change adds
the appropriate header to the "objtool" check so it now behaves
as intended.

Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #10990
2020-09-29 09:42:31 -07:00
Brian Behlendorf 3df7f11322 Fix PREEMPTION=y and BLK_CGROUP=y config on arm64
With PREEMPTION=y and BLK_CGROUP=y preempt_schedule_notrace() is being
used on arm64 which is a GPL-only function and hence the build of the
DKMS kernel module fails.

Fix that by redefining preempt_schedule_notrace() to preempt_schedule()
which should be safe as long as tracing is not used.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Juerg Haefliger <juergh@canonical.com>
Closes #8545 
Closes #9948 
Closes #10416 
Closes #10973
2020-09-28 09:53:32 -07:00
Brian Behlendorf ae64398819 Update zts-report.py with additional tests
The following test cases may still occasionally fail and are being
added to the "maybe" list for Linux until they can be updated to be
entirely reliable.

  cli_root/zfs_rename/zfs_rename_002_pos.ksh
  cli_root/zpool_reopen/zpool_reopen_003_pos.ksh
  refreserv/refreserv_raidz

These 6 tests consistently fail only on Fedora 31+, the failures
are related to the kernel rescanning the partition table on loopback
devices which is no longer reliable unless partprobe is used.  In
order to enable the Fedora bot by default they are also being added
to the list until the tests can be updated.  Any significant regression
in functionality covered by these tests will still be detected by the
FreeBSD builders.

  alloc_class/alloc_class_009_pos
  alloc_class/alloc_class_010_pos
  cli_root/zpool_expand/zpool_expand_001_pos
  cli_root/zpool_expand/zpool_expand_005_pos
  rsend/rsend_007_pos
  rsend/rsend_010_pos
  rsend/rsend_011_pos
  snapshot/rollback_003_pos

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2020-09-24 16:05:35 -07:00
Richard Laager 3275147c71 Fix another dependency loop
zfs-load-key-DATASET.service was gaining an
After=systemd-journald.socket due to its stdout/stderr going to the
journal (which is the default).  systemd-journald.socket has an After
(via RequiresMountsFor=/run/systemd/journal) on -.mount.  If the root
filesystem is encrypted, -.mount gets an After
zfs-load-key-DATASET.service.

By setting stdout and stderr to null on the key load services, we avoid
this loop.

Reviewed-by: Antonio Russo <antonio.e.russo@gmail.com>
Reviewed-by: InsanePrawn <insane.prawny@gmail.com>
Signed-off-by: Richard Laager <rlaager@wiktel.com>
Closes #10356
Closes #10388
2020-09-22 12:22:26 -07:00
Richard Laager 230970d368 Fix a dependency loop
When generating units with zfs-mount-generator, if the pool is already
imported, zfs-import.target is not needed.  This avoids a dependency
loop on root-on-ZFS systems:
  systemd-random-seed.service After (via RequiresMountsFor)
  var-lib.mount After
  zfs-import.target After
  zfs-import-{cache,scan}.service After
  cryptsetup.service After
  systemd-random-seed.service

Reviewed-by: Antonio Russo <antonio.e.russo@gmail.com>
Reviewed-by: InsanePrawn <insane.prawny@gmail.com>
Signed-off-by: Richard Laager <rlaager@wiktel.com>
Closes #10388
2020-09-22 12:18:33 -07:00
Jonathon 79291065fc Verify zfs module loaded before starting services
This is a minor change to the systemd service templates that verifies
the zfs kernel module is loaded by the kernel prior to attempting to
import any zpool.

The services check for the presence of /sys/module/zfs which indicates
the zfs is module is loaded. This uses the systemd built-in check
ConditionPathIsDirectory.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Thode <prometheanfire@gentoo.org>
Signed-off-by: Jonathon Fernyhough <jonathon.fernyhough@york.ac.uk>
Closes #10663
2020-09-22 12:18:26 -07:00
Jonathon 10c7a12f3b Verify zfs module loaded before starting services
This is a minor change to the systemd service templates that verifies the zfs
kernel module is loaded by the kernel prior to attempting to import any zpool.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Jonathon Fernyhough <jonathon.fernyhough@york.ac.uk>
Closes #10627
2020-09-22 12:18:19 -07:00
sterlingjensen 2acca5ec70 Mark lua setjmp/longjmp for powerpc weak
Linux already defines setjmp/longjmp for powerpc, which leads to
duplicate symbols in a statically linked build.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Sterlng Jensen <sterlingjensen@users.noreply.github.com>
Closes #10795
2020-09-22 11:43:07 -07:00
Jean-Baptiste Lallement e8baad51e0 Make unloading the key more robust
The unit was failing instead of stopping if someone manually unloaded
the key before stopping the unit (zfs unload-key is failing on an
unavailable key).
Follow a similar logic than for loading the key, checking for the key
status before unloading it.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Co-authored-by: Didier Roche <didrocks@ubuntu.com>
Signed-off-by: Didier Roche <didrocks@ubuntu.com>
Closes #10477
2020-09-22 11:43:01 -07:00
Jean-Baptiste Lallement df03f21b54 BindsTo dataset keyload unit to mount associate unit
We need a stronger dependency between the mount unit and its keyload unit
when we know that the dataset is encrypted.
If the keyload unit fails, Wants= will still try to mount the dataset,
which will then fail.
It’s better to show that the failure is due to a dependency failing, the
keyload unit, by tighting up the dependency. We can do this as we know
that we generate both units in the generator and so, it’s not an
optional dependency.
BindsTo enable as well that if the keyload unit fails at any point, the
associated mountpoint will be then unmounted.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Didier Roche <didrocks@ubuntu.com>
Signed-off-by: Didier Roche <didrocks@ubuntu.com>
Closes #10477
2020-09-22 11:42:56 -07:00
Jean-Baptiste Lallement e8e68905c9 Ensure mount unit pilots when its ZFS key is loaded
Drop Before=zfs.mount dependency explicity on generated key-load .service
unit.
Indeed, the associated mount unit is After=<dataset-key-load>.service.
This is thus the mount point which controls at what point it wants to be
mounted (Before=zfs-mount.service in stock generator), but this can be
an automount point, or triggered by another service.
This additional dependency from the key load service is not needed thus.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Didier Roche <didrocks@ubuntu.com>
Signed-off-by: Didier Roche <didrocks@ubuntu.com>
Closes #10477
2020-09-22 11:42:45 -07:00
John Poduska c161360dce Resilver restarts unnecessarily when it encounters errors
When a resilver finishes, vdev_dtl_reassess is called to hopefully
excise DTL_MISSING (amongst other things). If there are errors during
the resilver, they are tracked in DTL_SCRUB, as spelled out in the
block comment in vdev.c. DTL_SCRUB is in-core only, so it can only
be used if the pool was online for the whole resilver. This state is
tracked with the spa_scrub_started flag, which only gets set when
the scan is initialized. Unfortunately, this flag gets cleared right
before vdev_dtl_reassess gets called, so if there are any errors
during the scan, DTL_MISSING will never get excised and the resilver
will just continually restart. This fix simply moves clearing that
flag until after the call to vdev_dtl_reasses.

In addition, if a pool is imported and already has scn_errors > 0,
this change will restart the resilver immediately instead of doing
the rest of the scan and then restarting it from the beginning. On
the other hand, if scn_errors == 0 at import, then no errors have
been encountered so far, so the spa_scrub_started flag can be safely
set.

A test has been added to verify that resilver does not restart when
relevant DTL's are available.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Paul Zuchowski <pzuchowski@datto.com>
Signed-off-by: John Poduska <jpoduska@datto.com>
Closes #10291
2020-09-22 11:33:35 -07:00
Brian Behlendorf a72ae9bf3d ZED: Do not offline a missing device if no spare is available
Due to commit d48091d a removed device is now explicitly offlined by
the ZED if no spare is available, rather than the letting ZFS detect
it as UNAVAIL. This broke auto-replacing of whole-disk devices, as
described in issue #10577.  In short, when a new device is reinserted
in the same slot, the ZED will try to ONLINE it without letting ZFS
recreate the necessary partition table.

This change simply avoids setting the device OFFLINE when removed if
no spare is available (or if spare_on_remove is false).  This change
has been left minimal to allow it to be backported to 0.8.x release.
The auto_offline_001_pos ZTS test has been updated accordingly.

Some follow up work is planned to update the ZED so it transitions
the vdev to a REMOVED state.  This is a state which has always
existed but there is no current interface the ZED can use to
accomplish this.  Therefore it's being left to a follow up PR.

Reviewed-by: Gionatan Danti <g.danti@assyoma.it>
Co-authored-by: Gionatan Danti <g.danti@assyoma.it>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #10577
Closes #10730
2020-09-16 14:48:22 -07:00
Brian Behlendorf 3e268f4934 ZTS: Improve zts-auto_offline_001_pos
The zts-auto_offline_001_pos test could exceed the 10 minute test
limit and be KILLED by the test infrastructure.  To prevent this
speed up the test case by:

* Removing redundant pool configurations.  Each of the following
  vdev types is tested once: mirror, raidz, cache, and special.

* The block_device_wait function need only wait on the block
  device which has been removed as part of the test.

Reviewed-by: Paul Zuchowski <pzuchowski@datto.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9827
2020-09-16 14:48:05 -07:00
youzhongyang df01daab13 Fix inability to destroy snapshot used over NFS
The cache of struct svc_export and struct svc_expkey by nfsd and
rpc.mountd for the snapshot holds references to the mount point.
We need to flush them out before unmounting, otherwise umount
would fail with EBUSY.

Reviewed-by: Don Brady <don.brady@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Youzhong Yang <yyang@mathworks.com>
Closes #6000 
Closes #10783
2020-09-16 14:30:51 -07:00
Brian Behlendorf 62278325a5 Export dmu_offset_next() symbol
Export the dmu_offset_next() symbol for use by Lustre.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #10796
2020-09-16 14:17:51 -07:00
Matthew Ahrens 43186f94f4 Fix lua stack overflow on recursive call to gsub()
The `zfs program` subcommand invokes a LUA interpreter to run ZFS
"channel programs".  This interpreter runs in a constrained environment,
with defined memory limits.  The LUA stack (used for LUA functions that
call each other) is allocated in the kernel's heap, and is limited by
the `-m MEMORY-LIMIT` flag and the `zfs_lua_max_memlimit` module
parameter.  The C stack is used by certain LUA features that are
implemented in C.  The C stack is limited by `LUAI_MAXCCALLS=20`, which
limits call depth.

Some LUA C calls use more stack space than others, and `gsub()` uses an
unusually large amount.  With a programming trick, it can be invoked
recursively using the C stack (rather than the LUA stack).  This
overflows the 16KB Linux kernel stack after about 11 iterations, less
than the limit of 20.

One solution would be to decrease `LUAI_MAXCCALLS`.  This could be made
to work, but it has a few drawbacks:

1. The existing test suite does not pass with `LUAI_MAXCCALLS=10`.

2. There may be other LUA functions that use a lot of stack space, and
the stack space may change depending on compiler version and options.

This commit addresses the problem by adding a new limit on the amount of
free space (in bytes) remaining on the C stack while running the LUA
interpreter: `LUAI_MINCSTACK=4096`.  If there is less than this amount
of stack space remaining, a LUA runtime error is generated.

Reviewed-by: George Wilson <gwilson@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Allan Jude <allanjude@freebsd.org>
Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #10611
Closes #10613
2020-09-16 14:17:32 -07:00
DeHackEd 76a1232ee7 Use boot_ncpus in place of max_ncpus in taskq_create
Due to hotplug support or BIOS bugs sometimes max_ncpus can be
an absurdly high value. I have a system with 32 cores/threads
but reports max_ncpus == 440. This many threads potentially
cripples the system during arc_prune floods for example.

boot_ncpus is the number of working CPUs when called so use
that instead.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: DHE <git@dehacked.net>
Closes #10282
2020-09-16 13:35:42 -07:00
Olaf Faaland 0801e4e5c9 Initialize mmp_last_write when the mmp thread starts (#10912)
A great deal of time may go by between when mmp_init() is called and
the MMP thread starts, particularly if there are bad devices, because
there is I/O checking configs etc.  If this time is too long,

    (gethrtime() - mmp_last_write) > mmp_fail_ns

at the time the MMP thread starts.  If MMP is configured to suspend
the pool, the pool will be suspended immediately.

This can be seen in issue #10838

The value of mmp_last_write doesn't matter before the mmp thread
starts.  To give the MMP thread time to issue and land MMP writes,
initialize mmp_last_write when the MMP thread starts.

Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Closes #10873
2020-09-16 00:16:47 +00:00
Chris McDonough fa612dd1fd Appease GCC sprintf warnings found on Fedora 32/GCC 10.0.1
Increase the size of DDT_NAMELEN and MNT_LINE_MAX to appease GCC
snprintf truncation warnings.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chris McDonough <chrism@plope.com>
Closes #10712
Closes #10766
2020-09-15 23:43:12 +00:00
Arvind Sankar f78b6dbd5b Switch off -Wmissing-prototypes for libgcc math functions
spl-generic.c defines some of the libgcc integer library functions on
32-bit. Don't bother checking -Wmissing-prototypes since nothing should
directly call these functions from C code.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Closes #10470
2020-09-15 21:21:05 +00:00
George Amanakis 88533ec59a Fix gcc10.1 truncation error
gcc10.1 complains with:

../../include/sys/dmu.h:373:24: error: ‘%s’ directive output may be
truncated writing up to 95 bytes into a region of size 75
[-Werror=format-truncation=]
  373 | #define DMU_POOL_DDT   "DDT-%s-%s-%s"
      |                        ^~~~~~~~~~~~~~
../../module/zfs/ddt.c:256:37: note: in expansion of macro
‘DMU_POOL_DDT’
  256 |  (void) snprintf(name, DDT_NAMELEN, DMU_POOL_DDT,
      |                                     ^~~~~~~~~~~~
../../include/sys/dmu.h:373:32: note: format string is defined here
  373 | #define DMU_POOL_DDT   "DDT-%s-%s-%s"
      |                                ^~
../../module/zfs/ddt.c:256:9: note: ‘snprintf’ output 7 or more bytes
(assuming 102) into a destination of size 80
  256 |  (void) snprintf(name, DDT_NAMELEN, DMU_POOL_DDT,
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  257 |      zio_checksum_table[ddt->ddt_checksum].ci_name,
      |      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  258 |      ddt_ops[type]->ddt_op_name, ddt_class_name[class]);
      |      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Increasing DTT_NAMELEN fixes it.

Reviewed-By: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #10433
2020-09-15 21:20:43 +00:00
Brian Behlendorf d2acd3696f Linux 5.9 compat: NR_SLAB_RECLAIMABLE (#10865)
Commit dcdc12e added compatibility code to treat NR_SLAB_RECLAIMABLE_B
as if it were the same as NR_SLAB_RECLAIMABLE.  However, the new value
is in bytes while the old value was in pages which means they are not
interchangeable.

The only place the reclaimable slab size is used is as a component of
the calculation done by arc_free_memory().  This function returns the
amount of memory the ARC considers to be free or reclaimable at little
cost.  Rather than switch to a new interface to get this value it has
been removed it from the calculation.  It is normally a minor component
compared to the number of inactive or free pages, and removing it
aligns the behavior with the FreeBSD version of arc_free_memory().


Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Coleman Kane <ckane@colemankane.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Backported-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Closes #10834
2020-09-15 21:16:58 +00:00
Pavel Snajdr f6e54ea4f5 Linux 5.7 compat: Include linux/sched.h in spl/sys/mutex.h
struct task_struct is needed for lockdep_off() in mutex.h

This has popped up after e616cb8daadf (in linux-5.7-rc7).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Pavel Snajdr <snajpa@snajpa.net>
Closes #10741
(cherry picked from commit 772c69d230)
Signed-off-by: Eli Schwartz <eschwartz@archlinux.org>
2020-09-15 21:16:58 +00:00
Coleman Kane c33b623535 Linux 5.9 compat: make_request_fn replaced with submit_bio interface
The make_request_fn and associated API was replaced recently in a
Linux 5.9 merge, to replace its functionality with a new submit_bio
member in struct block_device_operations.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #10696
(cherry picked from commit d817c17100)
Signed-off-by: Eli Schwartz <eschwartz@archlinux.org>
2020-09-15 21:16:58 +00:00
Coleman Kane 42e9450831 Linux 5.9 compat: Update NR_SLAB_RECLAIMABLE to NR_SLAB_RECLAIMABLE_B
This change appears to primarily be a name change for the enum. Had
to update the test logic so that it works so long as either one of
these is present (favoring the newer one). Additionally, as this is
newer, it only shows up in node_page_item, so this commit doesn't
test zone_page_item for the same enum.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #10696
(cherry picked from commit dcdc12e8ba)
Signed-off-by: Eli Schwartz <eschwartz@archlinux.org>
2020-09-15 21:16:58 +00:00
Coleman Kane 70719549f0 Linux 5.9 compat: add linux/blkdev.h include
Many of the block device operations (often functions with bdev in
the name) were moved into linux/blkdev.h from linux/fs.h. Seems
that this header is already included where needed in the code, but
in the autoconf tests it was missing causing false negatives. This
commit has those tests include linux/fs.h (old location) and now
also linux/blkdev.h (new locations).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #10696
(cherry picked from commit 1823c8fe6a)
Signed-off-by: Eli Schwartz <eschwartz@archlinux.org>
2020-09-15 21:16:58 +00:00
Neal Gompa (ニール・ゴンパ) 2d2ce04b99 lib/libzfs, rpm: Install pkgconfig files in the correct directory (#10628)
libzfs is an architecture-specific library, thus its pkgconfig files
should also be installed into the architecture-specific path for
pkgconfig files so that systems that support multilib or multiarch
installation will be able to work properly.

Signed-off-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
2020-09-15 21:16:58 +00:00
Eli Schwartz 638edf1d42 Linux 5.8 compat: __vmalloc()
The `pgprot` argument has been removed from `__vmalloc` in Linux 5.8,
being `PAGE_KERNEL` always now [1].

Detect this during configure and define a wrapper for older kernels.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/mm/vmalloc.c?h=next-20200605&id=88dca4ca5a93d2c09e5bbc6a62fbfc3af83c4fca

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Sebastian Gottschall <s.gottschall@dd-wrt.com>
Co-authored-by: Michael Niewöhner <foss@mniewoehner.de>
Signed-off-by: Sebastian Gottschall <s.gottschall@dd-wrt.com>
Signed-off-by: Michael Niewöhner <foss@mniewoehner.de>
Closes #10422

(cherry picked from commit 080102a1b6)
- apply to 0.8.4 before certain files were moved around
- config/kernel-kmem.m4 exists in git but not release tarballs because
  it is unused; introduce it in a new file to prevent conflicts
- linux/mm.h is included in git master via sys/kmem.h; do not remove it
  here or the build will error due to undefined is_vmalloc_addr()

Original-patch-by: Michael Niewöhner <c0d3z3r0@users.noreply.github.com>
Signed-off-by: Eli Schwartz <eschwartz@archlinux.org>
2020-09-15 21:16:58 +00:00
Jorgen Lundman 04837c8dcb Replace sprintf()->snprintf() and strcpy()->strlcpy()
The strcpy() and sprintf() functions are deprecated on some platforms.
Care is needed to ensure correct size is used.  If some platforms
miss snprintf, we can add a #define to sprintf, likewise strlcpy().

The biggest change is adding a size parameter to zfs_id_to_fuidstr().

The various *_impl_get() functions are only used on linux and have
not yet been updated.
2020-09-15 21:16:58 +00:00
George Amanakis 7b98b55282 Fix gcc 10.1 stringop-truncation error
As we do not expect the destination of these strncpy calls to be NULL
terminated, substitute them with memcpy.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #10346
2020-09-15 21:16:58 +00:00
ColMelvin 88451845af RPM: Remove old versions of DKMS on upgrade
Due to a mismatch between the text and a regex looking for that text,
the `%preuninstall` script would never run the `dkms remove` command
necessary to avoid corrupting the DKMS data configuration.  Increase
regex specificity to avoid this issue.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chris Lindee <chris.lindee+github@gmail.com>
Closes: #9891
Closes #10327
2020-09-15 21:16:58 +00:00
Brian Behlendorf 4f5dcc9dc1 Add longjmp support for Thumb-2
When a Thumb-2 kernel is being used, then longjmp must be implemented
using the Thumb-2 instruction set in module/lua/setjmp/setjmp_arm.S.

Original-patch-by: @jsrlabs
Reviewed-by: @awehrfritz
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7408 
Closes #9957 
Closes #9967
2020-09-15 17:07:40 +00:00
Tony Hutter 6b18d7df37 Tag zfs-0.8.4
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>

TEST_ZIMPORT_SKIP="yes"
2020-05-12 10:53:32 -07:00
George Amanakis 4fe5f9016f Add missing zfs_refcount_destroy() in key_mapping_rele()
Otherwise when running with reference_tracking_enable=TRUE mounting
and unmounting an encrypted dataset panics with:

Call Trace:
 dump_stack+0x66/0x90
 slab_err+0xcd/0xf2
 ? __kmalloc+0x174/0x260
 ? __kmem_cache_shutdown+0x158/0x240
 __kmem_cache_shutdown.cold+0x1d/0x115
 shutdown_cache+0x11/0x140
 kmem_cache_destroy+0x210/0x230
 spl_kmem_cache_destroy+0x122/0x3e0 [spl]
 zfs_refcount_fini+0x11/0x20 [zfs]
 spa_fini+0x4b/0x120 [zfs]
 zfs_kmod_fini+0x6b/0xa0 [zfs]
 _fini+0xa/0x68c [zfs]
 __x64_sys_delete_module+0x19c/0x2b0
 do_syscall_64+0x5b/0x1a0
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Reviewed-By: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-By: Tom Caputi <tcaputi@datto.com>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #10246
2020-05-12 10:53:32 -07:00
Brian Behlendorf e6142ac0f2 Linux 5.7 compat: blk_alloc_queue()
Commit https://github.com/torvalds/linux/commit/3d745ea5 simplified
the blk_alloc_queue() interface by updating it to take the request
queue as an argument.  Add a wrapper function which accepts the new
arguments and internally uses the available interfaces.

Other minor changes include increasing the Linux-Maximum to 5.6 now
that 5.6 has been released.  It was not bumped to 5.7 because this
release has not yet been finalized and is still subject to change.

Added local 'struct zvol_state_os *zso' variable to zvol_alloc.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #10181
Closes #10187
2020-05-12 10:53:32 -07:00
Matthew Macy ea15efd4c9 Prefix struct rangelock
A struct rangelock already exists on FreeBSD.  Add a zfs_ prefix as
per our convention to prevent any conflict with existing symbols.
This change is a follow up to 2cc479d0.

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #9534
2020-05-12 10:53:32 -07:00
Arvind Sankar 5c474614ff Fix icp include directories for in-tree build
When zfs is built in-tree using --enable-linux-builtin, the compile
commands are executed from the kernel build directory. If the build
directory is different from the kernel source directory, passing
-Ifs/zfs/icp will not find the headers as they are not present in the
build directory.

Fix this by adding @abs_top_srcdir@ to pull the headers from the zfs
source tree instead.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Closes #10021
2020-05-12 10:53:32 -07:00
Attila Fülöp 3d09d3809b ICP: gcm-avx: Support architectures lacking the MOVBE instruction
There are a couple of x86_64 architectures which support all needed
features to make the accelerated GCM implementation work but the
MOVBE instruction. Those are mainly Intel Sandy- and Ivy-Bridge
and AMD Bulldozer, Piledriver, and Steamroller.

By using MOVBE only if available and replacing it with a MOV
followed by a BSWAP if not, those architectures now benefit from
the new GCM routines and performance is considerably better
compared to the original implementation.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Adam D. Moss <c@yotes.com>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Followup #9749 
Closes #10029
2020-05-12 10:53:32 -07:00
Attila Fülöp 76354f945e ICP: Improve AES-GCM performance
Currently SIMD accelerated AES-GCM performance is limited by two
factors:

a. The need to disable preemption and interrupts and save the FPU
state before using it and to do the reverse when done. Due to the
way the code is organized (see (b) below) we have to pay this price
twice for each 16 byte GCM block processed.

b. Most processing is done in C, operating on single GCM blocks.
The use of SIMD instructions is limited to the AES encryption of the
counter block (AES-NI) and the Galois multiplication (PCLMULQDQ).
This leads to the FPU not being fully utilized for crypto
operations.

To solve (a) we do crypto processing in larger chunks while owning
the FPU. An `icp_gcm_avx_chunk_size` module parameter was introduced
to make this chunk size tweakable. It defaults to 32 KiB. This step
alone roughly doubles performance. (b) is tackled by porting and
using the highly optimized openssl AES-GCM assembler routines, which
do all the processing (CTR, AES, GMULT) in a single routine. Both
steps together result in up to 32x reduction of the time spend in
the en/decryption routines, leading up to approximately 12x
throughput increase for large (128 KiB) blocks.

Lastly, this commit changes the default encryption algorithm from
AES-CCM to AES-GCM when setting the `encryption=on` property.

Reviewed-By: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-By: Jason King <jason.king@joyent.com>
Reviewed-By: Tom Caputi <tcaputi@datto.com>
Reviewed-By: Richard Laager <rlaager@wiktel.com>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes #9749
2020-05-12 10:53:32 -07:00
Fabio Scaccabarozzi 590ababea2 Bugfix/fix uio partial copies
In zfs_write(), the loop continues to the next iteration without
accounting for partial copies occurring in uiomove_iov when 
copy_from_user/__copy_from_user_inatomic return a non-zero status.
This results in "zfs: accessing past end of object..." in the 
kernel log, and the write failing.

Account for partial copies and update uio struct before returning
EFAULT, leave a comment explaining the reason why this is done.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: ilbsmart <wgqimut@gmail.com>
Signed-off-by: Fabio Scaccabarozzi <fsvm88@gmail.com>
Closes #8673 
Closes #10148
2020-05-12 10:53:32 -07:00
Mark Roper 009ff83548 Prevent deadlock in arc_read in Linux memory reclaim callback
Using zfs with Lustre, an arc_read can trigger kernel memory allocation
that in turn leads to a memory reclaim callback and a deadlock within a
single zfs process. This change uses spl_fstrans_mark and
spl_trans_unmark to prevent the reclaim attempt and the deadlock
(https://zfsonlinux.topicbox.com/groups/zfs-devel/T4db2c705ec1804ba).
The stack trace observed is:

    __schedule at ffffffff81610f2e
    schedule at ffffffff81611558
    schedule_preempt_disabled at ffffffff8161184a
    __mutex_lock at ffffffff816131e8
    arc_buf_destroy at ffffffffa0bf37d7 [zfs]
    dbuf_destroy at ffffffffa0bfa6fe [zfs]
    dbuf_evict_one at ffffffffa0bfaa96 [zfs]
    dbuf_rele_and_unlock at ffffffffa0bfa561 [zfs]
    dbuf_rele_and_unlock at ffffffffa0bfa32b [zfs]
    osd_object_delete at ffffffffa0b64ecc [osd_zfs]
    lu_object_free at ffffffffa06d6a74 [obdclass]
    lu_site_purge_objects at ffffffffa06d7fc1 [obdclass]
    lu_cache_shrink_scan at ffffffffa06d81b8 [obdclass]
    shrink_slab at ffffffff811ca9d8
    shrink_node at ffffffff811cfd94
    do_try_to_free_pages at ffffffff811cfe63
    try_to_free_pages at ffffffff811d01c4
    __alloc_pages_slowpath at ffffffff811be7f2
    __alloc_pages_nodemask at ffffffff811bf3ed
    new_slab at ffffffff81226304
    ___slab_alloc at ffffffff812272ab
    __slab_alloc at ffffffff8122740c
    kmem_cache_alloc at ffffffff81227578
    spl_kmem_cache_alloc at ffffffffa048a1fd [spl]
    arc_buf_alloc_impl at ffffffffa0befba2 [zfs]
    arc_read at ffffffffa0bf0924 [zfs]
    dbuf_read at ffffffffa0bf9083 [zfs]
    dmu_buf_hold_by_dnode at ffffffffa0c04869 [zfs]

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Mark Roper <markroper@gmail.com>
Closes #9987
2020-05-12 10:53:32 -07:00
Alexander Motin 4e55349857 Fix infinite scan on a pool with only special allocations
Attempt to run scrub or resilver on a new pool containing only special
allocations (special vdev added on creation) caused infinite loop
because of dsl_scan_should_clear() limiting memory usage to 5% of pool
size, which it calculated accounting only normal allocation class.

Addition of special and just in case dedup classes fixes the issue.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #10106 
Closes #8694
2020-05-12 10:53:32 -07:00
Brian Behlendorf 0cbed7f026 Static symbols exported by ICP
The crypto_cipher_init_prov and crypto_cipher_init are declared static
and should not be exported by the ICP.  This resolves the following
warnings observed when building with the 5.4 kernel.

WARNING: "crypto_cipher_init" [.../icp] is a static EXPORT_SYMBOL
WARNING: "crypto_cipher_init_prov" [.../icp] is a static EXPORT_SYMBOL

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9791
2020-05-12 10:53:32 -07:00
Brian Behlendorf 6e8c4dc460 Linux 5.6 compat: struct proc_ops
The proc_ops structure was introduced to replace the use of of the
file_operations structure when registering proc handlers.  This
change creates a new kstat_proc_op_t typedef for compatibility
which can be used to pass around the correct structure.

This change additionally adds the 'const' keyword to all of the
existing proc operations structures.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9961
2020-05-12 10:53:32 -07:00
Brian Behlendorf 06b473a8ae Linux 5.6 compat: timestamp_truncate()
The timestamp_truncate() function was added, it replaces the existing
timespec64_trunc() function.  This change renames our wrapper function
to be consistent with the upstream name and updates the compatibility
code for older kernels accordingly.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9956
Closes #9961
2020-05-12 10:53:32 -07:00
Brian Behlendorf ba2cf53545 Linux 5.6 compat: ktime_get_raw_ts64()
The getrawmonotonic() and getrawmonotonic64() interfaces have been
fully retired.  Update gethrtime() to use the replacement interface
ktime_get_raw_ts64() which was introduced in the 4.18 kernel.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #10052
Closes #10064
2020-05-12 10:53:32 -07:00
Brian Behlendorf cf2a3464e9 Linux 5.6 compat: time_t
As part of the Linux kernel's y2038 changes the time_t type has been
fully retired.  Callers are now required to use the time64_t type.

Rather than move to the new type, I've removed the few remaining
places where a time_t is used in the kernel code.  They've been
replaced with a uint64_t which is already how ZFS internally
handled these values.

Going forward we should work towards updating the remaining user
space time_t consumers to the 64-bit interfaces.

Reviewed-by: Matthew Macy <mmacy@freebsd.org>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #10052
Closes #10064
2020-05-12 10:53:32 -07:00
Romain Dolbeau 43135b3746 Fix static data to link with -fno-common
-fno-common is the new default in GCC 10, replacing -fcommon in
GCC <= 9, so static data must only be allocated once.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Romain Dolbeau <romain.dolbeau@european-processor-initiative.eu>
Closes #9943
2020-05-12 10:53:32 -07:00
alex 6f42372635 zfs_get: change time format string from %k to %H
Issue #10090 reported that snapshots created between midnight and 1 AM
are missing a padded zero in the creation property

This change fixes the bug reported in issue #10090 where snapshots
created between midnight and 1 AM were missing a padded zero in the
creation timestamp output.

The leading zero was missing because the time format string used `%k`
which formats the hour as a decimal number from 0 to 23 where single
digits are preceded by blanks[0] and is fixed by changing it to `%H`
which formats the hour as 00-23.

The difference in output is as below

```
-Thu Mar 26  0:39 2020
+Thu Mar 26 00:39 2020
```

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Alex John <alex@stty.io>
Closes #10090 
Closes #10153
2020-05-12 10:53:32 -07:00
Matthew Ahrens fc6700425a Deprecate deduplicated send streams
Dedup send can only deduplicate over the set of blocks in the send
command being invoked, and it does not take advantage of the dedup table
to do so. This is a very common misconception among not only users, but
developers, and makes the feature seem more useful than it is. As a
result, many users are using the feature but not getting any benefit
from it.

Dedup send requires a nontrivial expenditure of memory and CPU to
operate, especially if the dataset(s) being sent is (are) not already
using a dedup-strength checksum.

Dedup send adds developer burden. It expands the test matrix when
developing new features, causing bugs in released code, and delaying
development efforts by forcing more testing to be done.

As a result, we are deprecating the use of `zfs send -D` and receiving
of such streams.  This change adds a warning to the man page, and also
prints the warning whenever dedup send or receive are used.

In a future release, we plan to:
1. remove the kernel code for generating deduplicated streams
2. make `zfs send -D` generate regular, non-deduplicated streams
3. remove the kernel code for receiving deduplicated streams
4. make `zfs receive` of deduplicated streams process them in userland
   to "re-duplicate" them, so that they can still be received.

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #7887
Closes #10117
2020-05-12 10:53:32 -07:00
Richard Laager e1b0704568 Fix zfs-functions packaging bug
This fixes a bug where the generated zfs-functions was being included
along with original zfs-functions.in in the make dist tarball.  This
caused an unfortunate series of events during build/packaging that
resulted in the RPM-installed /etc/zfs/zfs-functions listing the
paths as:

ZFS="/usr/local/sbin/zfs"
ZED="/usr/local/sbin/zed"
ZPOOL="/usr/local/sbin/zpool"

When they should have been:

ZFS="/sbin/zfs"
ZED="/sbin/zed"
ZPOOL="/sbin/zpool"

This affects init.d (non-systemd) distros like CentOS 6.

/etc/default/zfs and /etc/zfs/zfs-functions are also used by the
initramfs, so they need to be built even when init.d support is not.
They have been moved to the (new) etc/default and (existing) etc/zfs
source directories, respectively.

Fixes: #9443

Co-authored-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Richard Laager <rlaager@wiktel.com>
2020-05-12 10:53:32 -07:00
Richard Laager d7c076c793 initramfs: Eliminate substitutions
These are now handled in zfs-functions, so this is all duplicative and
unnecessary.

Signed-off-by: Richard Laager <rlaager@wiktel.com>
2020-05-12 10:53:32 -07:00
Richard Laager e4185a03de Delete built init scripts in make clean
Previously, they were being deleted in make distclean.  This brings it
in line with the example:
https://www.gnu.org/software/automake/manual/html_node/Scripts.html

Signed-off-by: Richard Laager <rlaager@wiktel.com>
2020-05-12 10:53:32 -07:00
Ryan Moeller b06256a997 Restore :: in Makefile.am
The double-colon looked like a typo, but it's actually an obscure
feature. Rules with :: may appear multiple times and are run
independently of one another in the order they appear. The use of ::
for distclean-local was conventional, not accidental.

Add comments to indicate the intentional use of double-colon rules.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@ixsystems.com>
Closes #9210
2020-05-12 10:53:32 -07:00
Richard Laager f427973159 Make init scripts depend on Makefile
This brings it in line with the example:
https://www.gnu.org/software/automake/manual/html_node/Scripts.html

This way, if the substitution code is changed, they should update.

Signed-off-by: Richard Laager <rlaager@wiktel.com>
2020-05-12 10:53:32 -07:00
InsanePrawn 4bc401b30f Systemd mount generator: don't fail keyload from file if already loaded
Previously the generated keyload units for encryption roots with
keylocation=file://* didn't contain the code to detect if the key
was already loaded and would be marked failed in such situations.

Move the code to check whether the key is already loaded
from keylocation=prompt handling to general key loading code.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: InsanePrawn <insane.prawny@gmail.com>
Closes #10103
2020-05-12 10:53:32 -07:00
InsanePrawn b19477898c Systemd mount generator: Generate noauto units; add control properties
This commit refactors the systemd mount generators and makes the
following major changes:

- The generator now generates units for datasets marked canmount=noauto,
  too. These units are NOT WantedBy local-fs.target.
  If there are multiple noauto datasets for a path, no noauto unit will
  be created. Datasets with canmount=on are prioritized.

- Introduces handling of new user properties which are now included in
  the zfs-list.cache files:
    - org.openzfs.systemd:requires:
      List of units to require for this mount unit
    - org.openzfs.systemd:requires-mounts-for:
      List of mounts to require by this mount unit
    - org.openzfs.systemd:before:
      List of units to order after this mount unit
    - org.openzfs.systemd:after:
      List of units to order before this mount unit
    - org.openzfs.systemd:wanted-by:
      List of units to add a Wants dependency on this mount unit to
    - org.openzfs.systemd:required-by:
      List of units to add a Requires dependency on this mount unit to
    - org.openzfs.systemd:nofail:
      Toggles between a wants and a requires dependency.
    - org.openzfs.systemd:ignore:
      Do not generate a mount unit for this dataset.

  Consult the updated man page for detailed documentation.

- Restructures and extends the zfs-mount-generator(8) man page with the
  above properties, information on unit ordering and a license header.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Antonio Russo <antonio.e.russo@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: InsanePrawn <insane.prawny@gmail.com>
Closes #9649
2020-05-12 10:53:32 -07:00
InsanePrawn d1436d58a7 Systemd mount generator: Silence shellcheck warnings
Silences a warning about an intentionally unquoted variable.
Fixes a warning caused by strings split across lines by slightly
refactoring keyloadcmd.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Antonio Russo <antonio.e.russo@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: InsanePrawn <insane.prawny@gmail.com>
Closes #9649
2020-05-12 10:53:32 -07:00
Brian Behlendorf 13bfad0c96 Fix CONFIG_MODULES=no Linux kernel config
When configuring as builtin (--enable-linux-builtin) for kernels
without loadable module support (CONFIG_MODULES=n) only the object
file is created.  Never a loadable kmod.

Update ZFS_LINUX_TRY_COMPILE to handle this in a manor similar to
the ZFS_LINUX_TEST_COMPILE_ALL macro.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9887
Closes #10063
2020-05-12 10:53:32 -07:00
Brian Behlendorf 49f065d5a4 Linux 5.5 compat: blkg_tryget()
Commit https://github.com/torvalds/linux/commit/9e8d42a0f accidentally
converted the static inline function blkg_tryget() to GPL-only for
kernels built with CONFIG_PREEMPT_RCU=y and CONFIG_BLK_CGROUP=y.

Resolve the build issue by providing our own equivalent functionality
when needed which uses rcu_read_lock_sched() internally as before.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9745
Closes #10072
2020-05-12 10:53:32 -07:00
Richard Laager ebc8e360d5 zfs-mount-generator: Fix escaping for /
The correct name for the mount unit for / is "-.mount", not ".mount".

Reviewed-by: InsanePrawn <insane.prawny@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Antonio Russo <antonio.e.russo@gmail.com>
Signed-off-by: Richard Laager <rlaager@wiktel.com>
Closes #9970
2020-05-12 10:53:32 -07:00
Matthew Ahrens d4e04cc145 Missed wakeup when growing kmem cache
When growing the size of a (VMEM or KVMEM) kmem cache, spl_cache_grow()
always does taskq_dispatch(spl_cache_grow_work), and then waits for the
KMC_BIT_GROWING to be cleared by the taskq thread.

The taskq thread (spl_cache_grow_work()) does:
1. allocate new slab and add to list
2. wake_up_all(skc_waitq)
3. clear_bit(KMC_BIT_GROWING)

Therefore, the waiting thread can wake up before GROWING has been
cleared.  It will see that the growing has not yet completed, and go
back to sleep until it hits the 100ms timeout.

This can have an extreme performance impact on workloads that alloc/free
more than fits in the (statically-sized) magazines.  These workloads
allocate and free slabs with high frequency.

The problem can be observed with `funclatency spl_cache_grow`, which on
some workloads shows that 99.5% of the time it takes <64us to allocate
slabs, but we spend ~70% of our time in outliers, waiting for the 100ms
timeout.

The fix is to do `clear_bit(KMC_BIT_GROWING)` before
`wake_up_all(skc_waitq)`.

A future investigation should evaluate if we still actually need to
taskq_dispatch() at all, and if so on which kernel versions.

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Wilson <gwilson@delphix.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #9989
2020-05-12 10:53:32 -07:00
Richard Laager f3bf67d04d Order zfs-import-*.service after multipathd
If someone is using both multipathd and ZFS, they are probably using
them together.  Ordering the zpool imports after multipathd is ready
fixes import issues for multipath configurations.

Tested-by: Mike Pastore <mike@oobak.org>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Laager <rlaager@wiktel.com>
Closes #9863
2020-05-12 10:53:32 -07:00
lorenz 3916ac5a56 Avoid here-documents in systemd mount generator
On some systems - openSUSE, for example - there is not yet a writeable
temporary file system available, so bash bails out with an error,

  'cannot create temp file for here-document: Read-only file system',

on the here documents in zfs-mount-generator. The simple fix is to
change these into a multi-line echo statement.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-By: Richard Laager <rlaager@wiktel.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Lorenz Hüdepohl <dev@stellardeath.org>
Closes #9802
2020-05-12 10:53:32 -07:00
Tony Hutter 9bb3d57b03 Tag zfs-0.8.3
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2020-01-22 13:51:17 -08:00
Tony Hutter 9e36832d31 Fix zfs-0.8.3 "qat.h"
This applies the patch from:

https://github.com/zfsonlinux/zfs/issues/9476#issuecomment-543854498

...which was originally from:

9fa8b5b  QAT related bug fixes

This allows QAT to build.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2020-01-22 13:49:07 -08:00
John Poduska 504aae708e ZTS: Fixes for spurious failures of resilver_restart_001 test
The resilver restart test was reported as failing about 2% of the
time. Two issues were found:

- The event log wasn't large enough, so resilver events were missing
- One 'zpool sync' wasn't enough for resilver to start after zinject

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Signed-off-by: John Poduska <jpoduska@datto.com>
Issue #9588
Closes #9677
Closes #9703
2020-01-22 13:49:07 -08:00
jwpoduska 1be3cba381 Prevent unnecessary resilver restarts
If a device is participating in an active resilver, then it will have a
non-empty DTL. Operations like vdev_{open,reopen,probe}() can cause the
resilver to be restarted (or deferred to be restarted later), which is
unnecessary if the DTL is still covered by the current scan range. This
is similar to the logic in vdev_dtl_should_excise() where the DTL can
only be excised if it's max txg is in the resilvered range.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Gallagher <john.gallagher@delphix.com>
Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Signed-off-by: John Poduska <jpoduska@datto.com>
Issue #840
Closes #9155
Closes #9378
Closes #9551
Closes #9588
2020-01-22 13:49:07 -08:00
Brian Behlendorf 0fd9a28de8 Fix QAT allocation failure return value
When qat_compress() fails to allocate the required contiguous memory
it mistakenly returns success.  This prevents the fallback software
compression from taking over and (un)compressing the block.

Resolve the issue by correctly setting the local 'status' variable
on all exit paths.  Furthermore, initialize it to CPA_STATUS_FAIL
to ensure qat_compress() always fails safe to guard against any
similar bugs in the future.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9784
Closes #9788
2020-01-22 13:49:07 -08:00
Tony Hutter 7eaaa6f32e Fix zfs-0.8.3 zfs_receive_raw test case
Fix the zfs_receive_raw test case for zfs-0.8.3 by including the
one-liner fix from loli10k described here:
https://github.com/zfsonlinux/zfs/pull/9776#issuecomment-570252679

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2020-01-22 13:49:07 -08:00
Tony Hutter 543b0c644a Fix zfs-0.8.3 'make lint' warnings
Fix these lint warnings on zfs-0.8.3:

$ make lint
[module/spl/spl-vnode.c:494]: (error) Uninitialized variable: fp
[module/spl/spl-vnode.c:706]: (error) Uninitialized variable: fp
[module/spl/spl-vnode.c:706]: (error) Uninitialized variable: next_fp
^CMakefile:1632: recipe for target 'cppcheck' failed
make: *** [cppcheck] Interrupt

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2020-01-22 13:49:07 -08:00
Brian Behlendorf 82be309780 ZTS: Cleanup partition tables
The cleanup_devices function should remove any partitions created
on the device and force the partition table to be reread.  This
is needed to ensure that blkid has an up to date version of what
devices and partitions are used by zfs.

The cleanup_devices call was removed from inuse_008_pos.ksh since
it operated on partitions instead of devices and was not needed.

Lastly ddidecode may be called by parted and was therefore added
to the constrained path.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9806
2020-01-22 13:49:07 -08:00
Ned Bass 0a37abc206 zdb: print block checksums with 6 d's of verbosity
Include checksums in the output of 'zdb -dddddd' along
with other indirect block information already displayed.

Example output follows (with long lines trimmed):

$ zdb -dddddd tank/fish 128
Dataset tank/fish [ZPL], ID 259, cr_txg 10, 16.2M, 93 objects, rootbp DV

    Object  lvl   iblk   dblk  dsize  dnsize  lsize   %full  type
       128    2   128K   128K   634K     512     1M  100.00  ZFS plain f
                                               168   bonus  System attri
    dnode flags: USED_BYTES USERUSED_ACCOUNTED USEROBJUSED_ACCOUNTED
    dnode maxblkid: 7
    path    /c
    uid     0
    gid     0
    atime    Sat Dec 21 10:49:26 2019
    mtime    Sat Dec 21 10:49:26 2019
    ctime    Sat Dec 21 10:49:26 2019
    crtime    Sat Dec 21 10:49:26 2019
    gen    41
    mode    100755
    size    964592
    parent    34
    links    1
    pflags    40800000104
Indirect blocks:
               0 L1  0:2c0000:400 0:c021e00:400 20000L/400P F=8 B=41/41
               0  L0 0:227800:13800 20000L/13800P F=1 B=41/41 cksum=167a
           20000  L0 0:25ec00:17c00 20000L/17c00P F=1 B=41/41 cksum=2312
           40000  L0 0:276800:18400 20000L/18400P F=1 B=41/41 cksum=24e0
           60000  L0 0:2a7800:18800 20000L/18800P F=1 B=41/41 cksum=25be
           80000  L0 0:28ec00:18c00 20000L/18c00P F=1 B=41/41 cksum=2579
           a0000  L0 0:24d000:11c00 20000L/11c00P F=1 B=41/41 cksum=140a
           c0000  L0 0:23b000:12000 20000L/12000P F=1 B=41/41 cksum=164e
           e0000  L0 0:221e00:5a00 20000L/5a00P F=1 B=41/41 cksum=9de790

        segment [0000000000000000, 0000000000100000) size    1M

Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ned Bass <bass6@llnl.gov>
2020-01-22 13:49:07 -08:00
Ben Cordero c7dc6f3ab3 zfs-load-key.sh: ${ZFS} is not the zfs binary
A change[1] was merged yesterday that should refer
to the zfs binary in the initramfs, but is actually
an unset shell variable.

This commit changes this line to call `zfs` directly
like the surrounding code.

[1]: cb5b875b273235a4a3ed28e16f416d5bb8865166

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Garrett Fields <ghfields@gmail.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Signed-off-by: Ben Cordero <bencord0@condi.me>
Closes #9780
2020-01-22 13:49:07 -08:00
Brian Behlendorf 756c58cf71 ZTS: Fix pool_state cleanup
The externally faulted vdev should be brought back online and have
its errors cleared before the pool is destroyed.  Failure to do so
will leave a vdev with a valid active label.  This vdev may then
not be used to create a new pool without the -f flag potentially
leading to subsequent test failures.

Additionally remove an unreachable log_pass from setup.ksh.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9777
2020-01-22 13:49:07 -08:00
Brian Behlendorf 70d2e938b5 ZTS: Replace /var/tmp with $TEST_BASE_DIR
Remove a few hardcoded instances of /var/tmp.  This should use
the $TEST_BASE_DIR in order to allow the ZTS to be optionally
run in an alternate directory using `zfs-tests.sh -d <path>`.

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9775
2020-01-22 13:49:07 -08:00
Brian Behlendorf 9aec34703e ZTS: zfs_program_json
As of Python 3.5 the default behavior of json.tool was changed to
preserve the input order rather than lexical order.  The test case
expects the output to be sorted so apply the --sort-keys option
to the json.tool command when using Python 3.5 and the option is
supported.

    https://docs.python.org/3/library/json.html#module-json.tool

Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9774
2020-01-22 13:49:06 -08:00
Brian Behlendorf 16777b7dee ZTS: devices_001_pos and devices_002_neg
Update the devices_001_pos and devices_002_neg test cases such that the
special block device file created is backed by a ZFS volume.  Specifying
a specific device allows the major and minor numbers to be easily
determined.  Furthermore, this avoids the potentially dangerous behavior
of opening the first block device we happen to find under /dev/.

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9773
2020-01-22 13:49:06 -08:00
Steve Mokris da6a7f0239 Avoid some crashes when importing a pool with corrupt metadata
- Skip invalid DVAs when importing pools in readonly mode
  (in addition to when the config is untrusted).

- Upon encountering a DVA with a null VDEV, fail gracefully
  instead of panicking with a NULL pointer dereference.

Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Steve Mokris <smokris@softpixel.com>
Closes #9022
2020-01-22 13:49:06 -08:00
sam-lunt 5b8f560713 In initramfs, do not prompt if keylocation is "file://"
If the encryption key is stored in a file, the initramfs should not
prompt for the password. For example, this could be the case if the boot
partition is stored on removable media that is only present at boot time

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Garrett Fields <ghfields@gmail.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Signed-off-by: Sam Lunt <samuel.j.lunt@gmail.com>
Closes #9764
2020-01-22 13:49:06 -08:00
Nick Black 0d55a0957f libspl: declare aok extern in header
Rather than defining a new instance of 'aok' in every compilation
unit which includes this header, there is a single instance
defined in zone.c, and the header now only declares an extern.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Paul Zuchowski <pzuchowski@datto.com>
Signed-off-by: Nick Black <dankamongmen@gmail.com>
Closes #9752
2020-01-22 13:49:06 -08:00
Brian Behlendorf bb04f9c195 Cancel initialize and TRIM before vdev_metaslab_fini()
Any running 'zpool initialize' or TRIM must be cancelled prior
to the vdev_metaslab_fini() call in spa_vdev_remove_log() which
will unload the metaslabs and set ms->ms_group == NULL.

Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #8602
Closes #9751
2020-01-22 13:49:06 -08:00
Brian Behlendorf 421f8a2be0 ZTS: Test case failures
* large_dnode_008_pos - Force a pool sync before invoking zdb to
  ensure the updated dnode blocks have been persisted to disk.

* refreserv_raidz - Wait for the /dev/zvol links to be both created
  and removed, this is important because the same device volume
  names are being used repeatedly.

* btree_test - Add missing .gitignore file for btree_test binary.

Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9769
2020-01-22 13:49:06 -08:00
Brian Behlendorf f28e58b479 Update maximum kernel version to 5.4
Increase the maximum supported kernel version to 5.4.  This was
verified using the Fedora 5.4.2-300.fc31.x86_64 kernel.

Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9754
Closes #9759
2020-01-22 13:49:06 -08:00
Brian Behlendorf b051968de3 ZTS: Various test case fixes
* devices_001_pos and devices_002_neg - Failing after FreeBSD ZTS
  merged due to missing 'function' keyword for create_dev_file_linux.

* pool_state - Occasionally fails due to an insufficient delay
  before checking 'zpool status'.  Increasing the delay from 1 to 3
  seconds resolved the issue in local testing.

* procfs_list_basic - Fails when run in-tree because the logged
  command is actually 'lt-zfs'.  Updated the regex accordingly.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9748
2020-01-22 13:49:06 -08:00
loli10K e05c965d5b Fix for ARC sysctls ignored at runtime
This change leverage module_param_call() to run arc_tuning_update()
immediately after the ARC tunable has been updated as suggested in
cffa837 code review.
A simple test case is added to the ZFS Test Suite to prevent future
regressions in functionality.

This is a backport of #9489 provided from:
https://github.com/zfsonlinux/zfs/pull/9776#issuecomment-569418370

Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
2020-01-22 13:49:06 -08:00
Brian Behlendorf 9791683901 cppcheck: (warning) Possible null pointer dereference: nvh
Move the 'nvh = (void *)buf' assignment after the 'buf == NULL'
check to resolve the warning.  Interestingly, cppcheck 1.88
correctly determines that the existing code is safe, while
cppcheck 1.86 reports the warning.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9732
2020-01-22 13:49:06 -08:00
Brian Behlendorf 180c41e0b7 cppcheck: (error) Address of local auto-variable assigned
Suppress autoVariables warnings in the lua interpreter.  The usage
here while unconventional in intentional and the same as upstream.

[module/lua/ldebug.c:327]: (error) Address of local auto-variable
    assigned to a function parameter.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9732
2020-01-22 13:49:06 -08:00
Brian Behlendorf 1c27877ab2 cppcheck: (error) Null pointer dereference: who_perm
As indicated by the VERIFY the local who_perm variable can never
be NULL in parse_fs_perm().  Due to the existence of the is_set
conditional, which is always true, cppcheck 1.88 was reporting
a possible NULL reference.  Resolve the issue by removing the
extraneous is_set variable.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9732
2020-01-22 13:49:06 -08:00
Brian Behlendorf d01290f44d cppcheck: (warning) Possible null pointer dereference: dnp
The dnp argument can only be set to NULL when the DNODE_DRY_RUN flag
is set.  In which case, an early return path will be executed and a
NULL pointer dereference at the given location is impossible.  Add
an additional ASSERT to silence the cppcheck warning and document
that dbp must never be NULL at the point in the function.

[module/zfs/dnode.c:1566]: (warning) Possible null pointer deref: dnp

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9732
2020-01-22 13:49:06 -08:00
Brian Behlendorf 1074834f77 cppcheck: (error) Memory leak: vtoc
Resolve the reported memory leak by using a dedicated local vptr
variable to store the pointer reported by calloc().  Only assign the
passed **vtoc function argument on success, in all other cases vptr
is freed.

[lib/libefi/rdwr_efi.c:403]: (error) Memory leak: vtoc
[lib/libefi/rdwr_efi.c:422]: (error) Memory leak: vtoc
[lib/libefi/rdwr_efi.c:440]: (error) Memory leak: vtoc
[lib/libefi/rdwr_efi.c:454]: (error) Memory leak: vtoc
[lib/libefi/rdwr_efi.c:470]: (error) Memory leak: vtoc

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9732
2020-01-22 13:49:06 -08:00
Ubuntu 603ae6a8c0 cppcheck: (error) Shifting signed 64-bit value by 63 bits
As of cppcheck 1.82 surpress the warning regarding shifting too many
bits for __divdi3() implemention.  The algorithm used here is correct.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9732
2020-01-22 13:49:06 -08:00
Ubuntu 363d7332f2 cppcheck: (error) Uninitialized variable
As of cppcheck 1.82 warnings are issued when using the list_for_each_*
functions with an uninitialized variable.  Functionally, this is fine
but to resolve the warning initialize these variables.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9732
2020-01-22 13:49:06 -08:00
Ubuntu bf01567e4e cppcheck: (error) Uninitialized variable
Resolve the following uninitialized variable warnings.  In practice
these were unreachable due to the goto.  Replacing the goto with a
return resolves the warning and yields more readable code.

[module/icp/algs/modes/ccm.c:892]: (error) Uninitialized variable: ccm_param
[module/icp/algs/modes/ccm.c:893]: (error) Uninitialized variable: ccm_param
[module/icp/algs/modes/gcm.c:564]: (error) Uninitialized variable: gcm_param
[module/icp/algs/modes/gcm.c:565]: (error) Uninitialized variable: gcm_param
[module/icp/algs/modes/gcm.c:599]: (error) Uninitialized variable: gmac_param
[module/icp/algs/modes/gcm.c:600]: (error) Uninitialized variable: gmac_param

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9732
2020-01-22 13:49:06 -08:00
Garrett Fields 78072b7936 Exchanged two "${ZFS} get -H -o value" commands
Initramfs uses "get_fs_value()" elsewhere.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Kjeld Schouten-Lebbing <kjeld@schouten-lebbing.nl>
Signed-off-by: Garrett Fields <ghfields@gmail.com>
Closes #9736
2020-01-22 13:49:05 -08:00
Thomas Geppert 7f7c15c678 Create symbolic links in /dev/disk/by-vdev for nvme disk devices
The existing rules miss nvme disk devices because of the trailing
digits in the KERNEL device name, e.g. nvme0n1. Partitions of nvme
disk devices are already properly handled by the existing rule for
ENV{DEVTYPE}=="partition".

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Signed-off-by: Thomas Geppert <geppi@digitx.de>
Closes #9730
2020-01-22 13:49:05 -08:00
Garrett Fields fb244566c2 Force systems with kernel option "quiet" to display prompt for password
On systems that utilize TTY for password entry, if the kernel
option "quiet" is set, the system would appear to freeze on a
blank screen, when in fact it is waiting for password entry
from the user.

Since TTY is the fallback method, this has no effect on systemd
or plymouth password prompting.

By temporarily setting "printk" to "7", running the command,
then resuming with the original "printk" state, the user can
see the password prompt.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Garrett Fields <ghfields@gmail.com>
Closes #9731
2020-01-22 13:49:05 -08:00
Richard Laager 0f256176d9 initramfs: setup keymapping and video for prompts
From Steve Langasek <steve.langasek@canonical.com>:
> The poorly-named 'FRAMEBUFFER' option in initramfs-tools controls
> whether the console_setup and plymouth scripts are included and used
> in the initramfs. These are required for any initramfs which will be
> prompting for user input: console_setup because without it the user's
> configured keymap will not be set up, and plymouth because you are
> not guaranteed to have working video output in the initramfs without
> it (e.g. some nvidia+UEFI configurations with the default GRUB
> behavior).

> The zfs initramfs script may need to prompt the user for passphrases
> for encrypted zfs datasets, and we don't know definitively whether
> this is the case or not at the time the initramfs is constructed (and
> it's difficult to dynamically populate initramfs config variables
> anyway), therefore the zfs-initramfs package should just set
> FRAMEBUFFER=yes in a conf snippet the same way that the
> cryptsetup-initramfs package does
> (/usr/share/initramfs-tools/conf-hooks.d/cryptsetup).

https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1856408

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Signed-off-by: Steve Langasek <steve.langasek@canonical.com>
Signed-off-by: Richard Laager <rlaager@wiktel.com>
Closes #9723
2020-01-22 13:49:05 -08:00
Tomohiro Kusumi 6455859ee7 Don't fail to apply umask for O_TMPFILE files
Apply umask to `mode` which will eventually be applied to inode.
This is needed since VFS doesn't apply umask for O_TMPFILE files.

(Note that zpl_init_acl() applies `ip->i_mode &= ~current_umask();`
only when POSIX ACL is used.)

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Closes #8997
Closes #8998
2020-01-22 13:49:05 -08:00
Tom Caputi 7ad0ae91d5 Allow empty ds_props_obj to be destroyed
Currently, 'zfs list' and 'zfs get' commands can be slow when
working with snapshots that have a ds_props_obj. This is
because the code that discovers all of the properties for these
snapshots needs to read this object for each snapshot, which
almost always ends up causing an extra random synchronous read
for each snapshot. This performance penalty exists even if the
properties on that snapshot have been unset because the object
is normally only freed when the snapshot is freed, even though
it is only created when it is needed.

This patch allows the user to regain 'zfs list' performance on
these snapshots by destroying the ds_props_obj when it no longer
has any entries left. In practice on a production machine, this
optimization seems to make 'zfs list' about 55% faster.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Paul Zuchowski <pzuchowski@datto.com>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #9704
2020-01-22 13:49:05 -08:00
Matthew Ahrens 856d185dc2 Fix use-after-free of vd_path in spa_vdev_remove()
After spa_vdev_remove_aux() is called, the config nvlist is no longer
valid, as it's been replaced by the new one (with the specified device
removed).  Therefore any pointers into the nvlist are no longer valid.
So we can't save the result of
`fnvlist_lookup_string(nv, ZPOOL_CONFIG_PATH)` (in vd_path) across the
call to spa_vdev_remove_aux().

Instead, use spa_strdup() to save a copy of the string before calling
spa_vdev_remove_aux.

Found by AddressSanitizer:

ERROR: AddressSanitizer: heap-use-after-free on address ...
READ of size 34 at 0x608000a1fcd0 thread T686
    #0 0x7fe88b0c166d  (/usr/lib/x86_64-linux-gnu/libasan.so.4+0x5166d)
    #1 0x7fe88a5acd6e in spa_strdup spa_misc.c:1447
    #2 0x7fe88a688034 in spa_vdev_remove vdev_removal.c:2259
    #3 0x55ffbc7748f8 in ztest_vdev_aux_add_remove ztest.c:3229
    #4 0x55ffbc769fba in ztest_execute ztest.c:6714
    #5 0x55ffbc779a90 in ztest_thread ztest.c:6761
    #6 0x7fe889cbc6da in start_thread
    #7 0x7fe8899e588e in __clone

0x608000a1fcd0 is located 48 bytes inside of 88-byte region
freed by thread T686 here:
    #0 0x7fe88b14e7b8 in __interceptor_free
    #1 0x7fe88ae541c5 in nvlist_free nvpair.c:874
    #2 0x7fe88ae543ba in nvpair_free nvpair.c:844
    #3 0x7fe88ae57400 in nvlist_remove_nvpair nvpair.c:978
    #4 0x7fe88a683c81 in spa_vdev_remove_aux vdev_removal.c:185
    #5 0x7fe88a68857c in spa_vdev_remove vdev_removal.c:2221
    #6 0x55ffbc7748f8 in ztest_vdev_aux_add_remove ztest.c:3229
    #7 0x55ffbc769fba in ztest_execute ztest.c:6714
    #8 0x55ffbc779a90 in ztest_thread ztest.c:6761
    #9 0x7fe889cbc6da in start_thread

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #9706
2020-01-22 13:49:05 -08:00
Paul Zuchowski 4d658bda32 zio_decompress_data always ASSERTs successful decompression
This interferes with zdb_read_block trying all the decompression
algorithms when the 'd' flag is specified, as some are
expected to fail.  Also control the output when guessing
algorithms, try the more common compression types first, allow
specifying lsize/psize, and fix an uninitialized variable.

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Zuchowski <pzuchowski@datto.com>
Closes #9612
Closes #9630
2020-01-22 13:49:05 -08:00
Matthew Macy d2233a08fa Exclude data from cores unconditionally and metadata conditionally
This change allows us to align the code dump logic across platforms.

Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Don Brady <don.brady@delphix.com>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #9691
2020-01-22 13:49:05 -08:00
Brian Behlendorf 2525b71c68 ZTS: Fix zpool_reopen_001_pos
Update the vdev_disk_open() retry logic to use a specified number
of milliseconds to be more robust.  Additionally, on failure log
both the time waited and requested timeout to the internal log.

The default maximum allowed open retry time has been increased
from 500ms to 1000ms.

Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9680
Conflicts:
2020-01-22 13:49:05 -08:00
Kjeld Schouten 85ff6a23f4 Set send_realloc_files.ksh to use properties.shlib
This sets send_realloc_files.ksh to use properties.shlib
(like the other compression related tests)

It was missing from #9645

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Kjeld Schouten-Lebbing <kjeld@schouten-lebbing.nl>
Issue #9645
Closes #9679
2020-01-22 13:49:05 -08:00
George Amanakis ba8a5a882d Fix reporting of L2ARC hits/misses in arc_summary3
arc_summary3 reports L2ARC hits and misses as Bytes, whereas they
should be reported as events. arc_summary2 reports these correctly.

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #9669
2020-01-22 13:49:05 -08:00
Paul Zuchowski 73b5231187 Fix zdb_read_block using zio after it is destroyed
The checksum display code of zdb_read_block uses a zio
to read in the block and then calls zio_checksum_compute.
Use a new zio in the call to zio_checksum_compute not the zio
from the read which has been destroyed by zio_wait.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Signed-off-by: Paul Zuchowski <pzuchowski@datto.com>
Closes #9644
Closes #9657
2020-01-22 13:49:05 -08:00
Alexander Motin 388ef045b2 Fix use-after-free in case of L2ARC prefetch failure
In case L2ARC read failed, l2arc_read_done() creates _different_ ZIO
to read data from the original storage device.  Unfortunately pointer
to the failed ZIO remains in hdr->b_l1hdr.b_acb->acb_zio_head, and if
some other read try to bump the ZIO priority, it will crash.

The problem is reproducible by corrupting L2ARC content and reading
some data with prefetch if l2arc_noprefetch tunable is changed to 0.
With the default setting the issue is probably not reproducible now.

Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #9648
2020-01-22 13:49:05 -08:00
Brian Behlendorf 9cf46ddedc Increase allowed 'special_small_blocks' maximum value
There may be circumstances where it's desirable that all blocks
in a specified dataset be stored on the special device.  Relax
the artificial 128K limit and allow the special_small_blocks
property to be set up to 1M.  When blocks >1MB have been enabled
via the zfs_max_recordsize module option, this limit is increased
accordingly.

Reviewed-by: Don Brady <don.brady@delphix.com>
Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9131
Closes #9355
2020-01-22 13:49:05 -08:00
Michael Niewöhner 85204e30dd Adapt gitignore for modules
Remove the specific gitignore rules for module left-overs and add a
generic one in modules/.

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Signed-off-by: Michael Niewöhner <foss@mniewoehner.de>
Closes #9656
2020-01-22 13:49:05 -08:00
InsanePrawn e74055920e Fix encryption logic in systemd mount generator
Previously the generator would skip a dataset if it wasn't mountable by
'zfs mount -a' (legacy/none mountpoint, canmount off/noauto). This also
skipped the generation of key-load units for such datasets, breaking
the dependency handling for mountable child datasets.

Reviewed-by: Antonio Russo <antonio.e.russo@gmail.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Signed-off-by: InsanePrawn <insane.prawny@gmail.com>
Closes #9611
2020-01-22 13:49:05 -08:00
InsanePrawn 19ea83c594 Fix non-absolute path in systemd mount generator
Systemd will ignore units that try to execute programs from non-absolute
paths. Use hardcoded /bin/sh instead.

Reviewed-by: Antonio Russo <antonio.e.russo@gmail.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Signed-off-by: InsanePrawn <insane.prawny@gmail.com>
Closes #9611
2020-01-22 13:49:05 -08:00
InsanePrawn 922244cc23 Fix small typo in systemd mount generator
Reviewed-by: Antonio Russo <antonio.e.russo@gmail.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Signed-off-by: InsanePrawn <insane.prawny@gmail.com>
Closes #9611
2020-01-22 13:49:04 -08:00
Paul Zuchowski 48be45cd2d Implement -A (ignore ASSERTs) for zdb
The command line switch -A (ignore ASSERTs) has always been available
in zdb but was never connected up to the correct global variable.

There are times when you need zdb to ignore asserts and keep dumping
out whatever information it can get despite the ASSERT(s) failing.
It was always intended to be part of zdb but was incomplete.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Zuchowski <pzuchowski@datto.com>
Closes #9610
2020-01-22 13:49:04 -08:00
Brian Behlendorf 36fe63042c Remove zfs_vdev_elevator module option
As described in commit f81d5ef6 the zfs_vdev_elevator module
option is being removed.  Users who require this functionality
should update their systems to set the disk scheduler using a
udev rule.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #8664
Closes #9417
Closes #9609
2020-01-22 13:49:04 -08:00
Paul Zuchowski c9ac5ec178 Add display of checksums to zdb -R
The function zdb_read_block (zdb -R) was always intended to have a :c
flag which would read the DVA and length supplied by the user, and
display the checksum. Since we don't know which checksum goes with
the data, we should calculate and display them all.

For each checksum in the table, read in the data at the supplied
DVA:length, calculate the checksum, and display it. Update the man
page and create a zfs test for the new feature.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Signed-off-by: Paul Zuchowski <pzuchowski@datto.com>
Closes #9607
2020-01-22 13:49:04 -08:00
Mauricio Faria de Oliveira bc21c56c2d Check for unlinked znodes after igrab()
The changes in commit 41e1aa2a / PR #9583 introduced a regression on
tmpfile_001_pos: fsetxattr() on a O_TMPFILE file descriptor started
to fail with errno ENODATA:

    openat(AT_FDCWD, "/test", O_RDWR|O_TMPFILE, 0666) = 3
    <...>
    fsetxattr(3, "user.test", <...>, 64, 0) = -1 ENODATA

The originally proposed change on PR #9583 is not susceptible to it,
so just move the code/if-checks around back in that way, to fix it.

Reviewed-by: Pavel Snajdr <snajpa@snajpa.net>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Original-patch-by: Heitor Alves de Siqueira <halves@canonical.com>
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
Closes #9602
2020-01-22 13:49:04 -08:00
Brian Behlendorf 6fed191975 ZTS: tst.terminate_by_signal increase test threshold
The tst.terminate_by_signal test case may occasionally fail when
running in a less consistent virtual environment.  For all observed
failures the process was terminated correctly but it took longer than
expected resulting in too many snapshot being created.

To minimize the likelyhood of this occuring increase the threshold
from 50 to 90 snapshots.  The larger limit will still verifiy that
the channel program was correctly terminated early.

Reviewed-by: Don Brady <don.brady@delphix.com>
Reviewed-by: Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9601
2020-01-22 13:49:04 -08:00
George Melikov e688774ea6 ZTS: Casenorm fix unicode interpretation
Use `printf` to properly interpret unicode characters.

Illumos uses a utility called `zlook` to allow additional flags to be
provided to readdir and lookup for testing.  This functionality could
be ported to Linux, but even without it several of the tests can be
enabled by instead using the standard `test` command.

Additional, work is required to enable the remaining test cases.

Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Issue #7633
Closes #8812
2020-01-22 13:49:04 -08:00
InsanePrawn 7191f049d5 Remove requirement for -d 1 for zfs list and zfs get with bookmarks
df58307 removed the need to specify -d 1 when zfs list and zfs get are
called with -t snapshot on a datset. This commit extends the same
behaviour to -t bookmark.

This commit also introduces the 'snap' shorthand for snapshots from
zfs list to zfs get.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Signed-off-by: InsanePrawn <insane.prawny@gmail.com>
Closes #9589
2020-01-22 13:49:04 -08:00
Heitor Alves de Siqueira 20e124dd71 Break out of zfs_zget early if unlinked znode
If zp->z_unlinked is set, we're working with a znode that has been
marked for deletion. If that's the case, we can skip the "goto again"
loop and return ENOENT, as the znode should not be discovered.

Reviewed-by: Richard Yao <ryao@gentoo.org>
Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Heitor Alves de Siqueira <halves@canonical.com>
Closes #9583
2020-01-22 13:49:04 -08:00
InsanePrawn ef0b539581 Remove inappropiate error message suggesting to use '-r'
Removes an incorrect error message from libzfs that suggests applying
'-r' when a zfs subcommand is called with a filesystem path while
expecting either a snapshot or bookmark path.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Signed-off-by: InsanePrawn <insane.prawny@gmail.com>
Closes #9574
2020-01-22 13:49:04 -08:00
Kjeld Schouten 6657800745 Change zed.service to zfs-zed.service in man page
zed.service does not exist
replaced with correct service name in man.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Kjeld Schouten-Lebbing <kjeld@schouten-lebbing.nl>
Closes #9581
2020-01-22 13:49:04 -08:00
loli10K 880a37aa35 Prevent NULL pointer dereference in blkg_tryget() on EL8 kernels
blkg_tryget() as shipped in EL8 kernels does not seem to handle NULL
@blkg as input; this is different from its mainline counterpart where
NULL is accepted.  To prevent dereferencing a NULL pointer when dealing
with block devices which do not set a root_blkg on the request queue
perform the NULL check in vdev_bio_associate_blkg().

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #9546
Closes #9577
2020-01-22 13:49:04 -08:00
Michael Niewöhner 1545f7c59d Add missing documentation for some KMC flags
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matt Ahrens <matt@delphix.com>
Signed-off-by: Sebastian Gottschall <s.gottschall@dd-wrt.com>
Signed-off-by: Michael Niewöhner <foss@mniewoehner.de>
Closes #9034
2020-01-22 13:49:04 -08:00
Brian Behlendorf 2c7549fb6f Fix zpool create -o <property> error message
When `zpool create -o <property>` is run without root permissions
and the pool property requested is not specifically enumerated in
zpool_valid_proplist().  Then an incorrect error message referring
to an invalid property is printed rather than the expected permission
denied error.

Specifying a pool property at create time should be handled the same
way as filesystem properties in zfs_valid_proplist().  There should
not be default zfs_error_aux() set for properties which are not
listed.

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9550
Closes #9568
2020-01-22 13:49:04 -08:00
Alexander Motin edaec84225 Improve logging of 128KB writes
Before my ZIL space optimization few years ago 128KB writes were logged
as two 64KB+ records in two 128KB log blocks.  After that change it
became ~127KB+/1KB+ in two 128KB log blocks to free space in the second
block for another record.  Unfortunately in case of 128KB only writes,
when space in the second block remained unused, that change increased
write latency by unbalancing checksum computation and write times
between parallel threads.  It also didn't help with SLOG space
efficiency in that case.

This change introduces new 68KB log block size, used for both writes
below 67KB and 128KB-sharp writes.  Writes of 68-127KB are still using
one 128KB block to not increase processing overhead.  Writes above
131KB are still using full 128KB blocks, since possible saving there
is small.  Mixed loads will likely also fall back to previous 128KB,
since code uses maximum of the last 16 requested block sizes.

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by:  Alexander Motin <mav@FreeBSD.org>
Closes #9409
2020-01-22 13:49:04 -08:00
Witaut Bajaryn 618206c0b9 Skip loading already loaded key
Don't ask for the password / try to load the key if the key for the
encryptionroot is already loaded.  The user might have loaded the key
manually or by other means before the scripts get called.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Signed-off-by: Witaut Bajaryn <vitaut.bayaryn@gmail.com>
Closes #9495
Closes #9529
2020-01-22 13:49:03 -08:00
M. Zhou 1253fcc70a Add a notice in /etc/defaults/zfs for systemd users
Some systemd users may want to change configurations in
/etc/defaults/zfs, but these settings won't affect systemd
services.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Mo Zhou <cdluminate@gmail.com>
Closes #9544
2020-01-22 13:49:03 -08:00
Matthew Macy ca0f9b7473 Include prototypes for vdev_initialize
Address two prototype related warnings emitted by clang.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #9535
2020-01-22 13:49:03 -08:00
alaviss 5187a14f54 dracut/zfs-load-key.sh: properly remove prefixes
Removes the 'ZFS=' prefix from $BOOTFS instead of $root. This makes sure
that the 'zfs:' prefix remains stripped so that users with
'root=zfs:dataset' cmdline can have key loaded on boot again.

Reviewed-by: Garrett Fields <ghfields@gmail.com>
Reviewed-by: Dacian Reece-Stremtan <dacianstremtan@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Hiếu Lê <leorize+oss@disroot.org>
Closes #9520
2020-01-22 13:49:03 -08:00
Brian Behlendorf 123aa2fc14 Fix contrib/zcp/Makefile.am
Remove the stray leading + from the Makefile.  This was
preventing the autosnap.lua channel program from being
properly included by `make dist`.

Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9527
2020-01-22 13:49:03 -08:00
Tom Caputi 7e1b772edd Fix 'zfs change-key' with unencrypted child
Currently, when you call 'zfs change-key' on an encrypted dataset
that has an unencrypted child, the code will trigger a VERIFY.
This VERIFY is leftover from before we allowed unencrypted
datasets to exist underneath encrypted ones. This patch fixes the
issue by simply replacing the VERIFY with an early return when
recursing through datasets.

Reviewed by: Jason King <jason.brian.king@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #9524
2020-01-22 13:49:03 -08:00
Chunwei Chen c6eaa8b7f9 Fix zpool history unbounded memory usage
In original implementation, zpool history will read the whole history
before printing anything, causing memory usage goes unbounded. We fix
this by breaking it into read-print iterations.

Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #9516
2020-01-22 13:49:03 -08:00
Tom Caputi 635603a1c2 Fix incremental recursive encrypted receive
Currently, incremental recursive encrypted receives fail to work
for any snapshot after the first. The reason for this is because
the check in zfs_setup_cmdline_props() did not properly realize
that when the user attempts to use '-x encryption' in this
situation, they are not really overriding the existing encryption
property and instead are attempting to prevent it from changing.
This resulted in an error message stating: "encryption property
'encryption' cannot be set or excluded for raw or incremental
streams".

This problem is fixed by updating the logic to expect this use
case.

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #9494
2020-01-22 13:49:03 -08:00
Ryan Moeller 635bf1c37c ZTS: Consistency pass for .ksh extensions
* Use .ksh extension for ksh scripts, not .sh
* Remove .ksh extension from tests in common.run

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #9502
2020-01-22 13:49:03 -08:00
Matthew Macy 09015c212f Use correct format string when printing int8
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #9486
2020-01-22 13:49:03 -08:00
John Wren Kennedy 601dd2a504 ZTS: Written props test fails with 4k disks
With 4k disks, this test will fail in the last section because the
expected human readable value of 20.0M is reported as 20.1M. Rather than
use the human readable property, switch to the parsable property and
verify that the values are reasonably close.

Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: John Kennedy <john.kennedy@delphix.com>
Closes #9477
2020-01-22 13:49:03 -08:00
Serapheim Dimitropoulos c19a6512fd Name anonymous enum of KMC_BIT constants
Giving a name to this enum makes it discoverable from
debugging tools like DRGN and SDB. For example, with
the name proposed on this patch we can iterate over
these values in DRGN:
```
>>> prog.type('enum kmc_bit').enumerators
(('KMC_BIT_NOTOUCH', 0), ('KMC_BIT_NODEBUG', 1),
('KMC_BIT_NOMAGAZINE', 2), ('KMC_BIT_NOHASH', 3),
('KMC_BIT_QCACHE', 4), ('KMC_BIT_KMEM', 5),
('KMC_BIT_VMEM', 6), ('KMC_BIT_SLAB', 7),
...
```
This enables SDB to easily pretty-print the flags of
the spl_kmem_caches in the system like this:
```
> spl_kmem_caches -o "name,flags,total_memory"
name                                       flags total_memory
------------------------ ----------------------- ------------
abd_t                    KMC_NOMAGAZINE|KMC_SLAB        4.5MB
arc_buf_hdr_t_full       KMC_NOMAGAZINE|KMC_SLAB       12.3MB
... <cropped> ...
ddt_cache                               KMC_VMEM      583.7KB
ddt_entry_cache          KMC_NOMAGAZINE|KMC_SLAB         0.0B
... <cropped> ...
zio_buf_1048576             KMC_NODEBUG|KMC_VMEM         0.0B
... <cropped> ...
```

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes #9478
2020-01-22 13:49:03 -08:00
Serapheim Dimitropoulos 8c8f84472b Update skc_obj_alloc for spl kmem caches that are backed by Linux
Currently, for certain sizes and classes of allocations we use
SPL caches that are backed by caches in the Linux Slab allocator
to reduce fragmentation and increase utilization of memory. The
way things are implemented for these caches as of now though is
that we don't keep any statistics of the allocations that we
make from these caches.

This patch enables the tracking of allocated objects in those
SPL caches by making the trade-off of grabbing the cache lock
at every object allocation and free to update the respective
counter.

Additionally, this patch makes those caches visible in the
/proc/spl/kmem/slab special file.

As a side note, enabling the specific counter for those caches
enables SDB to create a more user-friendly interface than
/proc/spl/kmem/slab that can also cross-reference data from
slabinfo. Here is for example the output of one of those
caches in SDB that outputs the name of the underlying Linux
cache, the memory of SPL objects allocated in that cache,
and the percentage of those objects compared to all the
objects in it:
```
> spl_kmem_caches | filter obj.skc_name == "zio_buf_512" | pp
name        ...            source total_memory util
----------- ... ----------------- ------------ ----
zio_buf_512 ... kmalloc-512[SLUB]       16.9MB    8
```

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes #9474
2020-01-22 13:49:03 -08:00
Brian Behlendorf e08b98e983 Modify sharenfs=on default behavior
While it may sometimes be convenient to export an NFS filesystem with
no_root_squash it should not be the default behavior.  Align the
default behavior with the Linux NFS server defaults.  To restore
the previous behavior use 'zfs set sharenfs="no_root_squash,..."'.

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9397
Closes #9425
2020-01-22 13:49:03 -08:00
Brian Behlendorf c54ee4c0d3 ZTS: Fix zpool_status_-s
After commit 5e74ac51 which split and reordered the run files the
`zpool_status_-s` test began failing.  The new ordering placed
the test after a previous test which used `zpool replace` to replace
a disk but did not clear its label.  This resulted in the next test,
`zpool_status_-s`, failing because of the potentially active
pool being detected on the replaced vdev.

    /dev/loop0 is part of potentially active pool 'testpool'

Use the default_mirror_setup_noexit() and default_cleanup_noexit()
functions to create the pool in `zpool_status_-s`.  They use the -f
flag by default.

In the `scrub_after_resilver` test wipe the label during cleanup
to prevent future failures if the tests are again reordered.

Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9451
2020-01-22 13:49:03 -08:00
Richard Yao e416b165ff Implement ZPOOL_IMPORT_UDEV_TIMEOUT_MS
Since 0.7.0, zpool import would unconditionally block on udev for 30
seconds. This introduced a regression in initramfs environments that
lack udev (particularly mdev based environments), yet use a zfs userland
tools intended for the system that had been built against udev. Gentoo's
genkernel is the main example, although custom user initramfs
environments would be similarly impacted unless special builds of the
ZFS userland utilities were done for them.  Such environments already
have their own mechanisms for blocking until device nodes are ready
(such as genkernel's scandelay parameter), so it is unnecessary for
zpool import to block on a non-existent udev until a timeout is reached
inside of them.

Rather than trying to intelligently determine whether udev is available
on the system to avoid unnecessarily blocking in such environments, it
seems best to just allow the environment to override the timeout.  I
propose that we add an environment variable called
ZPOOL_IMPORT_UDEV_TIMEOUT_MS. Setting it to 0 would restore the 0.6.x
behavior that was more desirable in mdev based initramfs environments.
This allows the system user land utilities to be reused when building
mdev-based initramfs archives.

Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Georgy Yakovlev <gyakovlev@gentoo.org>
Signed-off-by: Richard Yao <ryao@gentoo.org>
Closes #9436
2020-01-22 13:49:03 -08:00
Ryan Moeller c99b304f01 Clarify loop variable name in zfs copies test
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #9445
2020-01-22 13:49:03 -08:00
Ryan Moeller 33cd5f2997 Fix some style nits in tests
Mostly whitespace changes, no functional changes intended.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #9447
2020-01-22 13:49:03 -08:00
loli10K f1ba5478a3 Fix pool creation with feature@allocation_classes disabled
When "feature@allocation_classes" is not enabled on the pool no vdev
with "special" or "dedup" allocation type should be allowed to exist in
the vdev tree.

Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #9427
Closes #9429
2020-01-22 13:49:02 -08:00
Brian Behlendorf 90bc5ca5e1 Update zfs program command usage
Update the zfs(8) man page to clearly describe that arguments for
channel programs are to be listed after the -- sentinel which
terminates argument processing.  This behavior is supported by
getopt on Linux, FreeBSD, and Illumos according to each platforms
respective man pages.

    zfs program [-jn] [-t instruction-limit] [-m memory-limit]
        pool script [--] arg1 ...

Reviewed-by: Clint Armstrong <clint@clintarmstrong.net>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9056
Closes #9428
2020-01-22 13:49:02 -08:00
Igor K 8af362c3e9 ZTS: Fix mmp_hostid test
Correctly use the `mntpnt_fs` variable, and include additional
logic to ensure the /etc/hostid is correct set up and cleaned up.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Igor Kozhukhov <igor@dilos.org>
Closes #9349
2020-01-22 13:49:02 -08:00
George Melikov 8139355dce module/Makefile.in: don't run xargs if empty
If stdin if empty - don't run xargs command,
otherwise we can get `cp: missing file operand`
error.

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #9418
2020-01-22 13:49:02 -08:00
Brian Behlendorf 5e78137f28 ZTS: Fix trim/trim_config and trim/autotrim_config
There have been occasional CI failures which occur when the trimmed
vdev size exactly matches the target size.  Resolve this by slightly
relaxing the conditional and checking for -ge rather than -gt.  In
all of the cases observer, the values match exactly.  For example:

    Failure /mnt/trim-vdev1 is 768 MB which is not -gt than 768 MB

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9399
2020-01-22 13:49:02 -08:00
Brian Behlendorf 5a1bf9e8b1 Fix automount for root filesystems
Commit 093bb64 resolved an automount failures for chroot'd processes
but inadvertently broke automounting for root filesystems where the
vfs_mntpoint is NULL.  Resolve the issue by checking for NULL in order
to generate the correct path.

Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9381
Closes #9384
2020-01-22 13:49:02 -08:00
Matthew Macy b43893de86 Rename rangelock_ functions to zfs_rangelock_
A rangelock KPI already exists on FreeBSD.  Add a zfs_ prefix as
per our convention to prevent any conflict with existing symbols.

Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes #9402
2020-01-22 13:49:02 -08:00
Brian Behlendorf 05e2a4cfc9 ZTS: Fix upgrade_readonly_pool
Update cleanup_upgrade to use destroy_dataset and destroy_pool
when performing cleanup.  These wrappers retry if the pool is busy
preventing occasional failures like those observed when running
tests upgrade_readonly_pool.  For example:

    SUCCESS: test enabled == enabled
    User accounting upgrade is not executed on readonly pool
    NOTE: Performing local cleanup via log_onexit (cleanup_upgrade)
    cannot destroy 'testpool': pool is busy
    ERROR: zpool destroy testpool exited 1

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9400
2020-01-22 13:49:02 -08:00
Didier Roche 5f67022bf7 Workaround to avoid a race when /var/lib is a persistent dataset
If /var/lib is a dataset not under <pool>/ROOT/<root_dataset>, as
proposed in the ubuntu root on zfs upstream guide
(https://github.com/zfsonlinux/zfs/wiki/Ubuntu-18.04-Root-on-ZFS),
we end up with a race where some services, like systemd-random-seed
are writing under /var/lib, while zfs-mount is called. zfs mount will
then potentially fail because of /var/lib isn't empty and so, can't be
mounted.
Order those 2 units for now (more may be needed) as we can't declare
virtually a provide mount point to match
"RequiresMountsFor=/var/lib/systemd/random-seed" from
systemd-random-seed.service.
The optional generator for zfs 0.8 fixes it, but it's not enabled
by default nor necessarily required.

Example:
- rpool/ROOT/ubuntu (mountpoint = /)
- rpool/var/ (mountpoint = /var)
- rpool/var/lib  (mountpoint = /var/lib)

Both zfs-mount.service and systemd-random-seed.service are starting
After=systemd-remount-fs.service. zfs-mount.service should be done
before local-fs.target while systemd-random-seed.service should finish
before sysinit.target (which is a later target).
Ideally, we would have a way for zfs mount -a unit to declare all paths
or move systemd-random-seed after local-fs.target.

Reviewed-by: Antonio Russo <antonio.e.russo@gmail.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Didier Roche <didrocks@ubuntu.com>
Closes #9360
2020-01-22 13:49:02 -08:00
dacianstremtan 0be40959fe Fix for zfs-dracut regression
Line 31 and 32 overwrote the ${root} variable which broke mount-zfs.sh
We have create a new variable for the dataset instead of overwriting the
${root} variable in zfs-load-key.sh${root} variable in zfs-load-key.sh

Reviewed-by: Kash Pande <kash@tripleback.net>
Reviewed-by: Garrett Fields <ghfields@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Dacian Reece-Stremtan <dacianstremtan@gmail.com>
Closes #8913
Closes #9379
2020-01-22 13:49:01 -08:00
Brian Behlendorf ff3e2e3c70 Perform KABI checks in parallel
Reduce the time required for ./configure to perform the needed
KABI checks by allowing kbuild to compile multiple test cases in
parallel.  This was accomplished by splitting each test's source
code from the logic handling whether that code could be compiled
or not.

By introducing this split it's possible to minimize the number of
times kbuild needs to be invoked.  As importantly, it means all of
the tests can be built in parallel.  This does require a little extra
care since we expect some tests to fail, so the --keep-going (-k)
option must be provided otherwise some tests may not get compiled.
Furthermore, since a failure during the kbuild modpost phase will
result in an early exit; the final linking phase is limited to tests
which passed the initial compilation and produced an object file.

Once everything has been built the configure script proceeds as
previously.  The only significant difference is that it now merely
needs to test for the existence of a .ko file to determine the
result of a given test.  This vastly speeds up the entire process.

New test cases should use ZFS_LINUX_TEST_SRC to declare their test
source code and ZFS_LINUX_TEST_RESULT to check the result.  All of
the existing kernel-*.m4 files have been updated accordingly, see
config/kernel-current-time.m4 for a basic example.  The legacy
ZFS_LINUX_TRY_COMPILE macro has been kept to handle special cases
but it's use is not encouraged.

                  master (secs)   patched (secs)
                  -------------   ----------------
autogen.sh        61              68
configure         137             24  (~17% of current run time)
make -j $(nproc)  44              44
make rpms         287             150

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #8547
Closes #9132
Closes #9341
Conflicts:
	Makefile.am
	config/kernel-fpu.m4
2020-01-22 13:49:01 -08:00
Fabian-Gruenbichler 35155c0132 SIMD: Use alloc_pages_node to force alignment
fxsave and xsave require the target address to be 16-/64-byte aligned.

kmalloc(_node) does not (yet) offer such fine-grained control over
alignment[0,1], even though it does "the right thing" most of the time
for power-of-2 sizes. unfortunately, alignment is completely off when
using certain debugging or hardening features/configs, such as KASAN,
slub_debug=Z or the not-yet-upstream SLAB_CANARY.

Use alloc_pages_node() instead which allows us to allocate page-aligned
memory. Since fpregs_state is padded to a full page anyway, and this
code is only relevant for x86 which has 4k pages, this approach should
not allocate any unnecessary memory but still guarantee the needed
alignment.

0: https://lwn.net/Articles/787740/
1: https://lore.kernel.org/linux-block/20190826111627.7505-1-vbabka@suse.cz/

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9608
Closes #9674
2020-01-22 13:49:01 -08:00
Brian Behlendorf 62c034f6d4 Linux 5.0 compat: SIMD compatibility
Restore the SIMD optimization for 4.19.38 LTS, 4.14.120 LTS,
and 5.0 and newer kernels.

This commit squashes the following commits from master in to
a single commit which can be applied to 0.8.2.

10fa2545 - Linux 4.14, 4.19, 5.0+ compat: SIMD save/restore
b88ca2ac - Enable SIMD for encryption
095b5412 - Fix CONFIG_X86_DEBUG_FPU build failure
e5db3134 - Linux 5.0 compat: SIMD compatibility

Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
TEST_ZIMPORT_SKIP="yes"
2020-01-22 13:49:01 -08:00
Brian Behlendorf 988b040476 ZTS: harden xattr/cleanup.ksh
When the xattr/cleanup.ksh script is unable to remove the test group
due to an active process then it will not call default_cleanup.  This
will result in a zvol_ENOSPC/setup failure when attempting to create
the /mnt/testdir directory which will already exist.

Resolve the issue by performing the default_cleanup before removing
the test user and group to ensure this step always happens.  Also
allow one more retry to further minimize the likelihood of the
cleanup failing.

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9358
2020-01-22 13:49:01 -08:00
Brian Behlendorf 055238d2eb Add warning for zfs_vdev_elevator option removal
Originally the zfs_vdev_elevator module option was added as a
convenience so the requested elevator would be automatically set
on the underlying block devices.  At the time this was simple
because the kernel provided an API function which did exactly this.

This API was then removed in the Linux 4.12 kernel which prompted
us to add compatibly code to set the elevator via a usermodehelper.
While well intentioned this introduced a bug which could cause a
system hang, that issue was subsequently fixed by commit 2a0d4188.

In order to avoid future bugs in this area, and to simplify the code,
this functionality is being deprecated.  A console warning has been
added to notify any existing consumers and the documentation updated
accordingly.  This option will remain for the lifetime of the 0.8.x
series for compatibility but if planned to be phased out of master.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #8664
Closes #9317
2020-01-22 13:49:01 -08:00
loli10K ec5d76e853 diff_cb() does not handle large dnodes
Trying to 'zfs diff' a snapshot with large dnodes will incorrectly try
to access its interior slots when dnodesize > sizeof(dnode_phys_t).
This is normally not an issue because the interior slots are
zero-filled, which report_dnode() handles calling
report_free_dnode_range(). However this is not the case for encrypted
large dnodes or filesystem using many SA based xattrs where the extra
data past the legacy dnode size boundary is interpreted as a
dnode_phys_t.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7678
Closes #8931
Closes #9343
2020-01-22 13:49:01 -08:00
Ryan Moeller 8498a2f3f8 Use signed types to prevent subtraction overflow
The difference between the sizes could be positive or negative. Leaving
the types as unsigned means the result overflows when the difference is
negative and removing the labs() means we'll have introduced a bug. The
subtraction results in the correct value when the unsigned integer is
interpreted as a signed integer by labs().

Clang doesn't see that we're doing a subtraction and abusing the types.
It sees the result of the subtraction, an unsigned value, being passed
to an absolute value function and emits a warning which we treat as an
error.

Reviewed by: Youzhong Yang <youzhong@gmail.com>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@ixsystems.com>
Closes #9355
2020-01-22 13:49:01 -08:00
Ryan Moeller 3ec97ba6f1 Refactor libzfs_error_init newlines
Move the trailing newlines from the error message strings to the format
strings to more closely match the other error messages.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Ryan Moeller <ryan@ixsystems.com>
Closes #9330
2020-01-22 13:49:01 -08:00
loli10K 444df1051c Device removal of indirect vdev panics the kernel
This commit fixes a NULL pointer dereference triggered in
spa_vdev_remove_top_check() by trying to "zpool remove" an indirect
vdev.

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #9327
2020-01-22 13:49:01 -08:00
loli10K b8bd3ec2af ZTS: Fix /usr/bin/env: 'python2': No such file or directory
Since 4f342e45 env(1) must be able to find a "python2" executable in
the "constrained path" on systems configured with --with-python=2.x
otherwise the ZFS Test Suite won't be able to use Python scripts.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #9325
2020-01-22 13:49:00 -08:00
Tom Caputi 5986c5c687 Fix clone handling with encryption roots
Currently, spa_keystore_change_key_sync_impl() does not recurse
into clones when updating encryption roots for either a call to
'zfs promote' or 'zfs change-key'. This can cause children of
these clones to end up in a state where they point to the wrong
dataset as the encryption root. It can also trigger ASSERTs in
some cases where the code checks reference counts on wrapping
keys. This patch fixes this issue by ensuring that this function
properly recurses into clones during processing.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alek Pinchuk <apinchuk@datto.com>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #9267
Closes #9294
2020-01-22 13:49:00 -08:00
Ryan Moeller 088f97b921 Canonicalize Python shebangs
/usr/bin/env python3 is the suggested[1] shebang for Python in general
(likewise for python2) and is conventional across platforms. This eases
development on systems where python is not installed in /usr/bin
(FreeBSD for example) and makes it possible to develop in virtual
environments (venv) for isolating dependencies.

Many packaging guidelines discourage the use of /usr/bin/env, but since
this is the canonical way of writing shebangs in the Python community,
many packaging scripts are already equipped to handle substituting the
appropriate absolute path to python automatically.

Some RPM package builders lacking brp-mangle-shebangs need a small
fallback mechanism in the package spec to stamp the appropriate shebang
on installed Python scripts.

[1]: https://docs.python.org/3/using/unix.html?#miscellaneous

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Ryan Moeller <ryan@ixsystems.com>
Closes #9314
2020-01-22 13:49:00 -08:00
Tom Caputi 8747ee4513 Fix stalled txg with repeated noop scans
Currently, the DSL scan code figures out when it should suspend
processing and allow a txg to continue by calling the function
dsl_scan_check_suspend(). Unfortunately, this function only
allows the scan to suspend at a level 0 block. In the event that
the system is scanning a bunch of empty snapshots or a resilver
is running with a high enough scn_cur_min_txg, the scan will
stop processing each dataset at the root level, deciding it
has nothing left to do. This means that the check_suspend
function is never called and the txg remains stuck until a
dataset is found that has data to scan.

This patch fixes the problem by allowing scans to suspend at
the root level of the objset. For backwards compatibility, we
use the bookmark <objsetid, 0, 0, 0> when we suspend here so
that older versions of the code will work as intended.

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #9300
2020-01-22 13:49:00 -08:00
John Wren Kennedy 43258fb78c ZTS: Introduce targeted corruption in file blocks
filetest_001_pos verifies that various checksum algorithms detect
corruption by overwriting the underlying vdev on which a file resides.
It is possible for the overwrite to miss the blocks of a file, causing a
spurious failure. This change introduces a function to corrupt the
individual blocks of a file as determined by zdb.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: John Kennedy <john.kennedy@delphix.com>
Closes #9288
2020-01-22 13:49:00 -08:00
Ryan Moeller 4818563f85 Clean up do_vol_test in zfs_copies tests
Get rid of the `get_used_prop` function. `get_prop used` works fine.

Fix the comment describing the function parameters. The type does not
have a default, and mntp is also used for ext2.

Rename the variable for the number of copies from `copy` to `copies`.

Use a `case` statement to match the type parameter, order the cases
alphabetically, and add a little sanity checking for good measure.

Use eval to make sure the output of commands is silenced rather than
the log messages when redirecting output to /dev/null.

Simplify cases where zfs requires special behavior.

Don't allow the test to loop forever in the event space usage does not
change. Bail out of the loop and fail after an arbitrary number of
iterations.

Add more information to the log message when the test fails, to help
debugging.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@ixsystems.com>
Closes #9286
2020-01-22 13:49:00 -08:00
Tom Caputi ffe29e7e3d Fix noop receive of raw send stream
Currently, the noop receive code fails to work with raw send streams
and resuming send streams. This happens because zfs_receive_impl()
reads the DRR_BEGIN payload without reading the payload itself.
Normally, the kernel expects to read this itself, but in this case
the recv_skip() code runs instead and it is not prepared to handle
the stream being left at any place other than the beginning of a
record.

This patch resolves this issue by manually reading the DRR_BEGIN
payload in the dry-run case. This patch also includes a number of
small fixups in this code path.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #9221
Closes #9173
2020-01-22 13:49:00 -08:00
Ryan Moeller cea50025fd Clean up zfs_clone_010_pos
Remove a lot of unnecessary setting and incrementing of `i`.

Remove unused variable `j`.

Instead of calling out to Python in a loop to generate the same string
repeatedly, generate the string once using shell constructs before
entering the loop.

Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Signed-off-by: Ryan Moeller <ryan@ixsystems.com>
Closes #9284
2020-01-22 13:49:00 -08:00
Ryan Moeller 27dda98b88 Refactor checksum operations in tests
md5sum in particular but also sha256sum to a lesser extent is used
in several areas of the test suite for computing checksums. The vast
majority of invocations are followed by `| awk '{ print $1 }'`.

Introduce functions to wrap up `md5sum $file | awk '{ print $1 }'` and
likewise for sha256sum. These also serve as a convenient interface for
alternative implementations on other platforms.

Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@ixsystems.com>
Closes #9280
2020-01-22 13:49:00 -08:00
Ryan Moeller 8c9c049502 Use the right booleans
TRUE and FALSE happen to be defined, but we should use B_TRUE and
B_FALSE for the sake of consistency.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Ryan Moeller <ryan@ixsystems.com>
Closes #9264
2020-01-22 13:49:00 -08:00
Igor K 5cb46afcf1 Fix panic on DilOS with kstat per dataset statistics
Account for ZFS_MAX_DATASET_NAME_LEN in kstat data size.  This value
is ignored in the Linux kstat code but resolves the issue for other
platforms.

Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Igor Kozhukhov <igor@dilos.org>
Closes #9254
Closes #9151
2020-01-22 13:49:00 -08:00
Igor K 068c5495f0 ZTS: Fix removal_cancel.ksh
Create a larger file to extend the time required to perform the
removal.  Occasional failures were observed due to the removal
completing before the cancel could be requested.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Igor Kozhukhov <igor@dilos.org>
Closes #9259
2020-01-22 13:49:00 -08:00
George Wilson 7e93917309 maxinflight can overflow in spa_load_verify_cb()
When running on larger memory systems, we can overflow the value of
maxinflight. This can result in maxinflight having a value of 0 causing
the system to hang.

Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Wilson <george.wilson@delphix.com>
Closes #9272
2020-01-22 13:49:00 -08:00
Andrea Gelmini 4ff90260c0 Fix typos
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Closes #9251
2020-01-22 13:49:00 -08:00
Andrea Gelmini 18d335d830 Fix typos in tests/
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Closes #9250
2020-01-22 13:49:00 -08:00
Andrea Gelmini 2af76a25ab Fix typos in tests/
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Closes #9249
2020-01-22 13:49:00 -08:00
Andrea Gelmini 36be89b8e5 Fix typos in tests/
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Closes #9247
2020-01-22 13:48:59 -08:00
Andrea Gelmini 7a7da11671 Fix typos in tests/
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Closes #9246
2020-01-22 13:48:59 -08:00
Andrea Gelmini f6a70187d2 Fix typos in tests/
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Closes #9244
2020-01-22 13:48:59 -08:00
Andrea Gelmini d632608210 Fix typos in tests/
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Closes #9243
2020-01-22 13:48:59 -08:00
Andrea Gelmini 500977eed2 Fix typos in tests/
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Closes #9242
2020-01-22 13:48:59 -08:00
Andrea Gelmini 5097eb6ac9 Fix typos in module/zfs/
Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Closes #9240
2020-01-22 13:48:59 -08:00
Andrea Gelmini 6673ef3f6f Fix typos in lib/
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Closes #9237
2020-01-22 13:48:59 -08:00
Andrea Gelmini 7572926bc5 Fix typos in tests/
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Closes #9248
2020-01-22 13:48:59 -08:00
Andrea Gelmini 8e1f209fa1 Fix typos in tests/
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
2020-01-22 13:48:59 -08:00
Andrea Gelmini 48d8b249c9 Fix typos in module/
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Closes #9241
2020-01-22 13:48:59 -08:00
Andrea Gelmini 8c01eb1c4a Fix typos in modules/icp/
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Closes #9239
2020-01-22 13:48:59 -08:00
Andrea Gelmini bcfa65802c Fix typos in include/
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Closes #9238
2020-01-22 13:48:59 -08:00
Andrea Gelmini 10e8abf1af Fix typos in etc/
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Closes #9236
2020-01-22 13:48:59 -08:00
Andrea Gelmini 44ae857ca4 Fix typos in contrib/
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Closes #9235
2020-01-22 13:48:58 -08:00
Andrea Gelmini eaf8e3b779 Fix typos in cmd/
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Closes #9234
2020-01-22 13:48:58 -08:00
Andrea Gelmini cac5f924ce Fix typos in man/
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Closes #9233
2020-01-22 13:48:58 -08:00
Andrea Gelmini 35c8730d1a Fix typos in config/
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Closes #9232
2020-01-22 13:48:58 -08:00
Igor K 619fda527a Fix refquota_007_neg.ksh
Must use 'zfs' instead of '$ZFS' which is undefined.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Igor Kozhukhov <igor@dilos.org>
Closes #9257
2020-01-22 13:48:58 -08:00
Paul Dagnelie ebdb770554 Prevent metaslab_sync panic due to spa_final_dirty_txg
If a pool enables the SPACEMAP_HISTOGRAM feature shortly before being
exported, we can enter a situation that causes a kernel panic. Any metaslabs
that are loaded during the final dirty txg and haven't already been condensed
will cause metaslab_sync to proceed after the final dirty txg so that the
condense can be performed, which there are assertions to prevent. Because of
the nature of this issue, there are a number of ways we can enter this
state. Rather than try to prevent each of them one by one, potentially missing
some edge cases, we instead cut it off at the point of intersection; by
preventing metaslab_sync from proceeding if it would only do so to perform a
condense and we're past the final dirty txg, we preserve the utility of the
existing asserts while preventing this particular issue.

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #9185
Closes #9186
Closes #9231
Closes #9253
2020-01-22 13:48:58 -08:00
Ryan Moeller 0302546b8d Simplify deleting partitions in libtest
Eliminate unnecessary code duplication. We can use a for-loop instead
of a while-loop. There is no need to echo $DISKSARRAY in a subshell or
return 0. Declare all variables with typeset.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Ryan Moeller <ryan@ixsystems.com>
Closes #9224
2020-01-22 13:48:58 -08:00
Ryan Moeller 2f1f18a6b4 Use compatible arg order in tests
BSD getopt() and getopt_long() want options before arguments.
Reorder arguments to zfs/zpool in tests to put all the options first.

Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@ixsystems.com>
Closes #9228
2020-01-22 13:48:58 -08:00
Tony Nguyen 16f42e1b6d Use smaller default slack/delta value for schedule_hrtimeout_range()
For interrupt coalescing, cv_timedwait_hires() uses a 100us slack/delta
for calls to schedule_hrtimeout_range(). This 100us slack can be costly
for small writes.

This change improves small write performance by passing resolution `res`
parameter to schedule_hrtimeout_range() to be used as delta/slack. A new
tunable `spl_schedule_hrtimeout_slack_us` is added to preserve old
behavior when desired.

Performance observations on 8K recordsize filesystem:
- 8K random writes at 1-64 threads, up to 60% improvement for one thread
  and smaller gains as thread count increases. At >64 threads, 2-5%
  decrease in performance was observed.
- 8K sequential writes, similar 60% improvement for one thread and
  leveling out around 64 threads. At >64 threads, 5-10% decrease in
  performance was observed.
- 128K sequential write sees 1-5 for the 128K. No observed regression at
  high thread count.

Testing done on Ubuntu 18.04 with 4.15 kernel, 8vCPUs and SSD storage on
VMware ESX.

Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Signed-off-by: Tony Nguyen <tony.nguyen@delphix.com>
Closes #9217
2020-01-22 13:48:58 -08:00
Ryan Moeller f785ce65c1 Prefer for (;;) to while (TRUE)
Defining a special constant to make an infinite loop is excessive,
especially when the name clashes with symbols commonly defined on
some platforms (ie FreeBSD).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: John Kennedy <john.kennedy@delphix.com
Signed-off-by: Ryan Moeller <ryan@ixsystems.com>
Closes #9219
2020-01-22 13:48:58 -08:00
Paul Dagnelie 0c6ccc99b2 Add regression test for "zpool list -p"
Other than this test, zpool list -p is not well tested by any of the
automated tests.  Add a test for zpool list -p.

Reviewed-by: Prakash Surya <prakash.surya@delphix.com>
Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #9134
2020-01-22 13:48:58 -08:00
Ryan Moeller be6ae01435 Split argument list, satisfy shellcheck SC2086
Split the arguments for ${TEST_RUNNER} across multiple lines for
clarity. Also added quotes in the message to match the invoked command.

Unquoted variables in argument lists are subject to splitting. In this
particular case we can't quote the variable because it is an optional
argument. Use the method suggested in the description linked below,
instead.

The technique is to use an unquoted variable with an alternate value.

https://github.com/koalaman/shellcheck/wiki/SC2086

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Ryan Moeller <ryan@ixsystems.com>
Closes #9212
2020-01-22 13:48:58 -08:00
Brian Behlendorf 9ad6f69a03 ZTS: Fix in-tree dbufstats test case
Commit a887d653 updated the dbufstats such that escalated privileges
are required.  Since all tests under cli_user are run with normal
privileges move this test case to a location where it will be run
required privileges.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Michael Niewöhner <foss@mniewoehner.de>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9118
Closes #9196
2020-01-22 13:48:58 -08:00
Paul Dagnelie 3b47c941eb Fix install error introduced by #9089
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
2020-01-22 13:48:58 -08:00
Mauricio Faria de Oliveira 7ed41d292b Document ZFS_DKMS_ENABLE_DEBUGINFO in userland configuration
Document the ZFS_DKMS_ENABLE_DEBUGINFO option in the userland
configuration file, as done with the other ZFS_DKMS_* options.

It has been introduced with commit e45c1734a6 ("dkms: Enable
debuginfo option to be set with zfs sysconfig file") but isn't
mentioned anywhere other than the 'dkms.conf' file (generated).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
Closes #9191
2020-01-22 13:48:58 -08:00
Ryan Moeller 0189eb4762 Dedup IOC enum values in libzfs_input_check
Reuse enum value ZFS_IOC_BASE for `('Z' << 8)`.
This is helpful on FreeBSD where ZFS_IOC_BASE has a different value and
`('Z' << 8)` is wrong.

Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@ixsystems.com>
Closes #9188
2020-01-22 13:48:58 -08:00
Ryan Moeller f64ef7317a Enhance ioctl number checks
When checking ZFS_IOC_* numbers, print which numbers are wrong rather
than silently failing.

Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@ixsystems.com>
Closes #9187
2020-01-22 13:48:58 -08:00
Brian Behlendorf b72548575e ZTS: Fix vdev_zaps_005_pos on CentOS 6
The ancient version of blkid (v2.17.2) used in CentOS 6 will not
detect the newly created pool unless it has been written to.
Force a pool sync so `zpool import` will detect the newly created
pool.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9199
2020-01-22 13:48:58 -08:00
Ryan Moeller b7c9207fbd Minor cleanup in Makefile.am
Split long lines where adding license info to dist archive.

Remove extra colon from target line.

Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@ixsystems.com>
Closes #9189
2020-01-22 13:48:58 -08:00
Alexey Smirnoff dcaa460d6d zfs-functions.in: in_mtab() always returns 1
$fs used with the wrong sed command where should be $mntpnt instead
to match a variable exported by read_mtab()

The fix is mostly to reuse the sed command found in read_mtab()

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Michael Niewöhner <foss@mniewoehner.de>
Signed-off-by: Alexey Smirnoff <fling@member.fsf.org>
Closes #9168
2020-01-22 13:48:58 -08:00
jdike db58aa717b Fix lockdep circular locking false positive involving sa_lock
There are two different deadlock scenarios, but they share a common
link, which is
thread 1 holding sa_lock and trying to get zap->zap_rwlock:
    zap_lockdir_impl+0x858/0x16c0 [zfs]
    zap_lockdir+0xd2/0x100 [zfs]
    zap_lookup_norm+0x7f/0x100 [zfs]
    zap_lookup+0x12/0x20 [zfs]
    sa_setup+0x902/0x1380 [zfs]
    zfsvfs_init+0x3d6/0xb20 [zfs]
    zfsvfs_create+0x5dd/0x900 [zfs]
    zfs_domount+0xa3/0xe20 [zfs]

and thread 2 trying to get sa_lock, either in sa_setup:
   sa_setup+0x742/0x1380 [zfs]
   zfsvfs_init+0x3d6/0xb20 [zfs]
   zfsvfs_create+0x5dd/0x900 [zfs]
   zfs_domount+0xa3/0xe20 [zfs]
or in sa_build_index:
   sa_build_index+0x13d/0x790 [zfs]
   sa_handle_get_from_db+0x368/0x500 [zfs]
   zfs_znode_sa_init.isra.0+0x24b/0x330 [zfs]
   zfs_znode_alloc+0x3da/0x1a40 [zfs]
   zfs_zget+0x39a/0x6e0 [zfs]
   zfs_root+0x101/0x160 [zfs]
   zfs_domount+0x91f/0xea0 [zfs]

From there, there are different locking paths back to something
holding zap->zap_rwlock.

The deadlock scenarios involve multiple different ZFS filesystems
being mounted.  sa_lock is common to these scenarios, and the sa
struct involved is private to a mount.  Therefore, these must be
referring to different sa_lock instances and these deadlocks can't
occur in practice.

The fix, from Brian Behlendorf, is to remove sa_lock from lockdep
coverage by initializing it with MUTEX_NOLOCKDEP.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Jeff Dike <jdike@akamai.com>
Closes #9110
2020-01-22 13:48:57 -08:00
colmbuckley ed235deffd Set "none" scheduler if available (initramfs)
Existing zfs initramfs script logic will attempt to set the 'noop'
scheduler if it's available on the vdev block devices. Newer kernels
have the similar 'none' scheduler on multiqueue devices; this change
alters the initramfs script logic to also attempt to set this scheduler
if it's available.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Garrett Fields <ghfields@gmail.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Signed-off-by: Colm Buckley <colm@tuatha.org>
Closes #9042
2020-01-22 13:48:57 -08:00
Paul Dagnelie 72dbc01e7f Add more refquota tests
It used to be possible for zfs receive (and other operations related
to clone swap) to bypass refquotas. This can cause a number of issues,
and there should be an automated test for it.

Added tests for rollback and receive not overriding refquota.

Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #9139
2020-01-22 13:48:57 -08:00
Michael Niewöhner c75d3968bd initramfs: fixes for (debian) initramfs
* contrib/initramfs: include /etc/default/zfs and /etc/zfs/zfs-functions
At least debian needs /etc/default/zfs and /etc/zfs/zfs-functions for
its initramfs. Include both in build when initramfs is configured.

* contrib/initramfs: include 60-zvol.rules and zvol_id
Include 60-zvol.rules and zvol_id and set udev as predependency instead
of debians zdev. This makes debians additional zdev hook unneeded.

* Correct initconfdir substitution for some distros
Not every Linux distro is using @sysconfdir@/default but @initconfdir@
which is already determined by configure. Let's use it.

* systemd: prevent possible conflict between systemd and sysvinit
Systemd will not load a sysvinit service if a unit exists with the same
name. This prevents conflicts between sysvinit and systemd.
In ZFS there is one sysvinit service that does not have a systemd
service but a target counterpart, zfs-import.target.
Usually it does not make any sense to install both but it is possisble.
Let's prevent any conflict by masking zfs-import.service by default.
This does not harm even if init.d/zfs-import does not exist.

Reviewed-by: Chris Wedgwood <cw@f00f.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Tested-by: Alex Ingram <reimu@reimuhakurei.net>
Tested-by: Dreamcat4 <dreamcat4@gmail.com>
Signed-off-by: Michael Niewöhner <foss@mniewoehner.de>
Closes #7904
Closes #9089
2020-01-22 13:48:57 -08:00
Serapheim Dimitropoulos dfa4d3d986 dmu_tx_wait() hang likely due to cv_signal() in dsl_pool_dirty_delta()
Even though the bug's writeup (Github issue #9136) is very detailed,
we still don't know exactly how we got to that state, thus I wasn't
able to reproduce the bug. That said, we can make an educated guess
combining the information on filled issue with the code.

From the fact that `dp_dirty_total` was 0 (which is less than
`zfs_dirty_data_max`) we know that there was one thread that set it to
0 and then signaled one of the waiters of `dp_spaceavail_cv` [see
`dsl_pool_dirty_delta()` which is also the only place that
`dp_dirty_total` is changed].  Thus, the only logical explaination
then for the bug being hit is that the waiter that just got awaken
didn't go through `dsl_pool_dirty_data()`. Given that this function
is only called by `dsl_pool_dirty_space()` or `dsl_pool_undirty_space()`
I can only think of two possible ways of the above scenario happening:

[1] The waiter didn't call into any of the two functions - which I
    find highly unlikely (i.e. why wait on `dp_spaceavail_cv` to begin
    with?).
[2] The waiter did call in one of the above function but it passed 0 as
    the space/delta to be dirtied (or undirtied) and then the callee
    returned immediately (e.g both `dsl_pool_dirty_space()` and
    `dsl_pool_undirty_space()` return immediately when space is 0).

In any case and no matter how we got there, the easy fix would be to
just broadcast to all waiters whenever `dp_dirty_total` hits 0. That
said and given that we've never hit this before, it would make sense
to think more on why the above situation occured.

Attempting to mimic what Prakash was doing in the issue filed, I
created a dataset with `sync=always` and started doing contiguous
writes in a file within that dataset. I observed with DTrace that even
though we update the pool's dirty data accounting when we would dirty
stuff, the accounting wouldn't be decremented incrementally as we were
done with the ZIOs of those writes (the reason being that
`dbuf_write_physdone()` isn't be called as we go through the override
code paths, and thus `dsl_pool_undirty_space()` is never called). As a
result we'd have to wait until we get to `dsl_pool_sync()` where we
zero out all dirty data accounting for the pool and the current TXG's
metadata.

In addition, as Matt noted and I later verified, the same issue would
arise when using dedup.

In both cases (sync & dedup) we shouldn't have to wait until
`dsl_pool_sync()` zeros out the accounting data. According to the
comment in that part of the code, the reasons why we do the zeroing,
have nothing to do with what we observe:
````
/*
 * We have written all of the accounted dirty data, so our
 * dp_space_towrite should now be zero.  However, some seldom-used
 * code paths do not adhere to this (e.g. dbuf_undirty(), also
 * rounding error in dbuf_write_physdone).
 * Shore up the accounting of any dirtied space now.
 */
dsl_pool_undirty_space(dp, dp->dp_dirty_pertxg[txg & TXG_MASK], txg);
````

Ideally what we want to do is to undirty in the accounting exactly what
we dirty (I use the word ideally as we can still have rounding errors).
This would make the behavior of the system more clear and predictable.

Another interesting issue that I observed with DTrace was that we
wouldn't update any of the pool's dirty data accounting whenever we
would dirty and/or undirty MOS data. In addition, every time we would
change the size of a dbuf through `dbuf_new_size()` we wouldn't update
the accounted space dirtied in the appropriate dirty record, so when
ZIOs are done we would undirty less that we dirtied from the pool's
accounting point of view.

For the first two issues observed (sync & dedup) this patch ensures
that we still update the pool's accounting when we undirty data,
regardless of the write being physical or not.

For changes in the MOS, we first ensure to zero out the pool's dirty
data accounting in `dsl_pool_sync()` after we synced the MOS. Then we
can go ahead and enable the update of the pool's dirty data accounting
wheneve we change MOS data.

Another fix is that we now update the accounting explicitly for
counting errors in `dbuf_write_done()`.

Finally, `dbuf_new_size()` updates the accounted space of the
appropriate dirty record correctly now.

The problem is that we still don't know how the bug came up in the
issue filled. That said the issues fixed seem to be very relevant, so
instead of going with the broadcasting solution right away,
I decided to leave this patch as is.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Prakash Surya <prakash.surya@delphix.com>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
External-issue: DLPX-47285
Closes #9137
2020-01-22 13:48:57 -08:00
Tony Nguyen b78d32cc25 Improve write performance by using dmu_read_by_dnode()
In zfs_log_write(), we can use dmu_read_by_dnode() rather than
dmu_read() thus avoiding unnecessary dnode_hold() calls.

We get a 2-5% performance gain for large sequential_writes tests, >=128K
writes to files with recordsize=8K.

Testing done on Ubuntu 18.04 with 4.15 kernel, 8vCPUs and SSD storage on
VMware ESX.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Nguyen <tony.nguyen@delphix.com>
Closes #9156
2020-01-22 13:48:57 -08:00
Serapheim Dimitropoulos dd6d0bdbb3 Assert that a dnode's bonuslen never exceeds its recorded size
This patch introduces an assertion that can catch pitfalls in
development where there is a mismatch between the size of
reads and writes between a *_phys structure and its respective
in-core structure when bonus buffers are used.

This debugging-aid should be complementary to the verification
done by ztest in ztest_verify_dnode_bt().

A side to this patch is that we now clear out any extra bytes
past a bonus buffer's new size when the buffer is shrinking.

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes #8348
2020-01-22 13:48:57 -08:00
Paul Zuchowski 9d4ca81b6f Make txg_wait_synced conditional in zfsvfs_teardown
The call to txg_wait_synced in zfsvfs_teardown should
be made conditional on the objset having dirty data.
This can prevent unnecessary txg_wait_synced during
some unmount operations.

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Zuchowski <pzuchowski@datto.com>
Closes #9115
2020-01-22 13:48:57 -08:00
Paul Dagnelie 93fd9101c9 Prevent race in blkptr_verify against device removal
When we check the vdev of the blkptr in zfs_blkptr_verify, we can run
into a race condition where that vdev is temporarily unavailable. This
happens when a device removal operation and the old vdev_t has been
removed from the array, but the new indirect vdev has not yet been
inserted.

We hold the spa_config_lock while doing our sensitive verification.
To ensure that we don't deadlock, we only grab the lock if we don't
have config_writer held. In addition, I had to const the tags of the
refcounts and the spa_config_lock arguments.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #9112
2020-01-22 13:48:57 -08:00
Prakash Surya 5549a537dd Fix device expansion when VM is powered off
When running on an ESXi based VM, I've found that "zpool online -e" will
not expand the zpool, if the disk was expanded in ESXi while the VM was
powered off.

For example, take the following scenario:

 1. VM running on top of VMware ESXi
 2. ZFS pool created with a given device "sda" of size 8GB
 3. VM powered off
 4. Device "sda" size expanded to 16GB
 5. VM powered on
 6. "zpool online -e" used on device "sda"

In this situation, after (2) the zpool will be roughly 8GB in size.
After (6), the expectation is the zpool's size will expand to roughly
16GB in size; i.e. expand to the new size of the "sda" device.
Unfortunately, I've seen that after (6), the zpool size does not change.

What's happening is after (5), the EFI label of the "sda" device will be
such that fields "efi_last_u_lba", "efi_last_lba", and "efi_altern_lba"
all reflect the new size of the disk; i.e. "33554398", "33554431", and
"33554431" respectively.

Thus, the check that we perform in "efi_use_whole_disk":

    if ((efi_label->efi_altern_lba == 1) || (efi_label->efi_altern_lba
        >= efi_label->efi_last_lba)) {

This will return true, and then we return from the function without
having expanded the size of the zpool/device.

In contrast, if we remove steps (3) and (5) in the sequence above, i.e.
the device is expanded while the VM is powered on, things change. In
that case, the fields "efi_last_u_lba" and "efi_altern_lba" do not
change (i.e. they still reflect the old 8GB device size), but the
"efi_last_lba" field does change (i.e. it now reflects the new 16GB
device size). Thus, when we evaluate the same conditional in
"efi_use_whole_disk", it'll return false, so the zpool is expanded.

Taking all of this into account, this PR updates "efi_use_whole_disk" to
properly expand the zpool when the underlying disk is expanded while the
VM is powered off.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Wilson <gwilson@delphix.com>
Reviewed-by: Don Brady <don.brady@delphix.com>
Signed-off-by: Prakash Surya <prakash.surya@delphix.com>
Closes #9111
2020-01-22 13:48:57 -08:00
George Wilson 628fd31d26 spa_load_verify() may consume too much memory
When a pool is imported it will scan the pool to verify the integrity
of the data and metadata. The amount it scans will depend on the
import flags provided. On systems with small amounts of memory or
when importing a pool from the crash kernel, it's possible for
spa_load_verify to issue too many I/Os that it consumes all the memory
of the system resulting in an OOM message or a hang.

To prevent this, we limit the amount of memory that the initial pool
scan can consume. This change will, by default, use 1/16th of the ARC
for scan I/Os to prevent running the system out of memory during import.

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Signed-off-by: George Wilson george.wilson@delphix.com
External-issue: DLPX-65237
External-issue: DLPX-65238
Closes #9146
2020-01-22 13:48:57 -08:00
Tomohiro Kusumi d38e4ee142 Change boolean-like uint8_t fields in znode_t to boolean_t
Given znode_t is an in-core structure, it's more readable to have
them as boolean. Also co-locate existing boolean fields with them
for space efficiency (expecting 8 booleans to be packed/aligned).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Closes #9092
Conflicts:
	include/sys/zfs_znode.h
	module/zfs/zfs_znode.c
2020-01-22 13:48:57 -08:00
Richard Yao 0b96952eef Drop KMC_NOEMERGENCY
This is not implemented. If it were implemented, using it would risk
deadlocks on pre-3.18 kernels. Lets just drop it.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Michael Niewöhner <foss@mniewoehner.de>
Signed-off-by: Richard Yao <ryao@gentoo.org>
Closes #9119
2020-01-22 13:48:57 -08:00
DeHackEd 376ca4649b Don't wakeup unnecessarily in 'zpool events -f'
ZED can prevent CPU's from properly sleeping.

Rather than periodically waking up in the zevents code, just go to sleep and wait for a wakeup.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: DHE <git@dehacked.net>
Closes #9091
2020-01-22 13:48:57 -08:00
Serapheim Dimitropoulos 66398a4da3 Test cancelling a removal in ZTS
This patch adds a new test that sanity checks cancelling a removal.

Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes #9101
Conflicts:
	tests/zfs-tests/tests/functional/removal/Makefile.am
2020-01-22 13:48:57 -08:00
jdike 77d59a6d63 lockdep false positive - move txg_kick() outside of ->dp_lock
This fixes a lockdep warning by breaking a link between ->tx_sync_lock
and ->dp_lock.

The deadlock envisioned by lockdep is this:
    thread 1 holds db->db_mtx and tries to get dp->dp_lock:
	dsl_pool_dirty_space+0x70/0x2d0 [zfs]
	dbuf_dirty+0x778/0x31d0 [zfs]

    thread 2 holds bpo->bpo_lock and tries to get db->db_mtx:
        dmu_buf_will_dirty_impl
        dmu_buf_will_dirty+0x6b/0x6c0 [zfs]
        bpobj_iterate_impl+0xbe6/0x1410 [zfs]

    thread 3 holds tx->tx_sync_lock and tries to get bpo->bpo_lock:
        bpobj_space+0x63/0x470 [zfs]
        dsl_scan_active+0x340/0x3d0 [zfs]
        txg_sync_thread+0x3f2/0x1370 [zfs]

    thread 4 holds dp->dp_lock and tries to get tx->tx_sync_lock
       txg_kick+0x61/0x420 [zfs]
       dsl_pool_need_dirty_delay+0x1c7/0x3f0 [zfs]

This patch is orginally from Brian Behlendorf and slightly simplified
by me.

It breaks this cycle in thread 4 by moving the call from
dsl_pool_need_dirty_delay to txg_kick outside the section controlled
by dp->dp_lock.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Signed-off-by: Jeff Dike <jdike@akamai.com>
Closes #9094
2020-01-22 13:48:57 -08:00
Clint Armstrong 1d4faef7a5 Add channel program for property based snapshots
Channel programs that many users find useful should be included with zfs
in the /contrib directory. This is the first of these contributions. A
channel program to recursively take snapshots of datasets with the
property com.sun:auto-snapshot=true.

Reviewed-by: Kash Pande <kash@tripleback.net>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Clint Armstrong <clint@clintarmstrong.net>
Closes #8443
Closes #9050
2020-01-22 13:48:57 -08:00
Michael Niewöhner cc8df1f117 install path fixes
* rpm: correct pkgconfig path

pkconfig files get installed to $datarootdir/pkgconfig but rpm expects
them to be at $datadir. This works when $datarootdir==$datadir which is
the case most of the time but will fail when they differ.

* install: make initramfs-tools path static

Since initramfs-tools' path is nothing we can control as it is an
external package it does not make any sense to install zfs additions
anywhere else. Simply use /usr/share/initramfs-tools as path.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Signed-off-by: Michael Niewöhner <foss@mniewoehner.de>
Closes #9087
2020-01-22 13:48:57 -08:00
Paul Dagnelie 66c8b2f65a Don't activate metaslabs with weight 0
We return ENOSPC in metaslab_activate if the metaslab has weight 0,
to avoid activating a metaslab with no space available.  For sanity
checking, we also assert that there is no free space in the range
tree in that case.

Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #8968
2020-01-22 13:48:57 -08:00
Mike Gerdts f3f46b0e45 OpenZFS 9318 - vol_volsize_to_reservation does not account for raidz skip blocks
When a volume is created in a pool with raidz vdevs and
volblocksize != 128k, the volume can reference more space than is
reserved with the automatically calculated refreservation.  There
are two deficiencies in vol_volsize_to_reservation that contribute
to this:

  1) Skip blocks may be added to keep each allocation a multiple
     of parity + 1. This is the dominating factor when volblocksize
     is close to 2^ashift.

  2) raidz deflation for 128 KB blocks is different for most other
     block sizes.

See "The theory of raidz space accounting" comment in
libzfs_dataset.c for a full explanation.

Authored by: Mike Gerdts <mike.gerdts@joyent.com>
Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Jerry Jelinek <jerry.jelinek@joyent.com>
Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: Kody Kantor <kody.kantor@joyent.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Approved by: Dan McDonald <danmcd@joyent.com>
Ported-by: Mike Gerdts <mike.gerdts@joyent.com>

Porting Notes:
* ZTS: wait for zvols to exist before writing
* ZTS: use log_must_busy with {zpool|zfs} destroy

OpenZFS-issue: https://www.illumos.org/issues/9318
OpenZFS-commit: https://github.com/illumos/illumos-gate/commit/b73ccab0
Closes #8973
2020-01-22 13:48:56 -08:00
Paul Dagnelie 350646563f Concurrent small allocation defeats large allocation
With the new parallel allocators scheme, there is a possibility for
a problem where two threads, allocating from the same allocator at
the same time, conflict with each other. There are two primary cases
to worry about. First, another thread working on another allocator
activates the same metaslab that the first thread was trying to
activate. This results in the first thread needing to go back and
reselect a new metaslab, even though it may have waited a long time
for this metaslab to load. Second, another thread working on the same
allocator may have activated a different metaslab while the first
thread was waiting for its metaslab to load. Both of these cases
can cause the first thread to be significantly delayed in issuing
its IOs. The second case can also cause metaslab load/unload churn;
because the metaslab is loaded but not fully activated, we never set
the selected_txg, which results in the metaslab being immediately
unloaded again. This process can repeat many times, wasting disk and
cpu resources. This is more likely to happen when the IO of the first
thread is a larger one (like a ZIL write) and the other thread is
doing a smaller write, because it is more likely to find an
acceptable metaslab quickly.

There are two primary changes. The first is to always proceed with
the allocation when returning from metaslab_activate if we were
preempted in either of the ways described in the previous section.
The second change is to set the selected_txg before we do the call
to activate so that even if the metaslab is not used for an
allocation, we won't immediately attempt to unload it.

Reviewed by: Jerry Jelinek <jerry.jelinek@joyent.com>
Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
External-issue: DLPX-61314
Closes #8843
2020-01-22 13:48:56 -08:00
loli10K d47ee5ad1c Fix bp_embedded_type enum definition
With the addition of BP_EMBEDDED_TYPE_REDACTED in 30af21b0 a couple of
codepaths make wrong assumptions and could potentially result in errors.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #8951
Conflicts:
	include/sys/spa.h
2020-01-22 13:48:56 -08:00
Don Brady e625030c11 OpenZFS 9425 - channel programs can be interrupted
Problem Statement
=================
ZFS Channel program scripts currently require a timeout, so that hung or
long-running scripts return a timeout error instead of causing ZFS to get
wedged. This limit can currently be set up to 100 million Lua instructions.
Even with a limit in place, it would be desirable to have a sys admin
(support engineer) be able to cancel a script that is taking a long time.

Proposed Solution
=================
Make it possible to abort a channel program by sending an interrupt signal.In
the underlying txg_wait_sync function, switch the cv_wait to a cv_wait_sig to
catch the signal. Once a signal is encountered, the dsl_sync_task function can
install a Lua hook that will get called before the Lua interpreter executes a
new line of code. The dsl_sync_task can resume with a standard txg_wait_sync
call and wait for the txg to complete.  Meanwhile, the hook will abort the
script and indicate that the channel program was canceled. The kernel returns
a EINTR to indicate that the channel program run was canceled.

Porting notes: Added missing return value from cv_wait_sig()

Authored by: Don Brady <don.brady@delphix.com>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed by: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com>
Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: Sara Hartse <sara.hartse@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Approved by: Robert Mustacchi <rm@joyent.com>
Ported-by: Don Brady <don.brady@delphix.com>
Signed-off-by: Don Brady <don.brady@delphix.com>

OpenZFS-issue: https://www.illumos.org/issues/9425
OpenZFS-commit: https://github.com/illumos/illumos-gate/commit/d0cb1fb926
Closes #8904
2020-01-22 13:48:56 -08:00
Matthew Ahrens cbb9154958 looping in metaslab_block_picker impacts performance on fragmented pools
On fragmented pools with high-performance storage, the looping in
metaslab_block_picker() can become the performance-limiting bottleneck.
When looking for a larger block (e.g. a 128K block for the ZIL), we may
search through many free segments (up to hundreds of thousands) to find
one that is large enough to satisfy the allocation. This can take a long
time (up to dozens of ms), and is done while holding the ms_lock, which
other threads may spin waiting for.

When this performance problem is encountered, profiling will show
high CPU time in metaslab_block_picker, as well as in mutex_enter from
various callers.

The problem is very evident on a test system with a sync write workload
with 8K writes to a recordsize=8k filesystem, with 4TB of SSD storage,
84% full and 88% fragmented. It has also been observed on production
systems with 90TB of storage, 76% full and 87% fragmented.

The fix is to change metaslab_df_alloc() to search only up to 16MB from
the previous allocation (of this alignment). After that, we will pick a
segment that is of the exact size requested (or larger). This reduces
the number of iterations to a few hundred on fragmented pools (a ~100x
improvement).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: George Wilson <george.wilson@delphix.com>
Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-62324
Closes #8877
2020-01-22 13:48:56 -08:00
Matthew Ahrens 8805abb8fc single-chunk scatter ABDs can be treated as linear
Scatter ABD's are allocated from a number of pages.  In contrast to
linear ABD's, these pages are disjoint in the kernel's virtual address
space, so they can't be accessed as a contiguous buffer.  Therefore
routines that need a linear buffer (e.g. abd_borrow_buf() and friends)
must allocate a separate linear buffer (with zio_buf_alloc()), and copy
the contents of the pages to/from the linear buffer.  This can have a
measurable performance overhead on some workloads.

https://github.com/zfsonlinux/zfs/commit/87c25d567fb7969b44c7d8af63990e
("abd_alloc should use scatter for >1K allocations") increased the use
of scatter ABD's, specifically switching 1.5K through 4K (inclusive)
buffers from linear to scatter.  For workloads that access blocks whose
compressed sizes are in this range, that commit introduced an additional
copy into the read code path.  For example, the
sequential_reads_arc_cached tests in the test suite were reduced by
around 5% (this is doing reads of 8K-logical blocks, compressed to 3K,
which are cached in the ARC).

This commit treats single-chunk scattered buffers as linear buffers,
because they are contiguous in the kernel's virtual address space.

All single-page (4K) ABD's can be represented this way.  Some multi-page
ABD's can also be represented this way, if we were able to allocate a
single "chunk" (higher-order "page" which represents a power-of-2 series
of physically-contiguous pages).  This is often the case for 2-page (8K)
ABD's.

Representing a single-entry scatter ABD as a linear ABD has the
performance advantage of avoiding the copy (and allocation) in
abd_borrow_buf_copy / abd_return_buf_copy.  A performance increase of
around 5% has been observed for ARC-cached reads (of small blocks which
can take advantage of this), fixing the regression introduced by
87c25d567.

Note that this optimization is only possible because all physical memory
is always mapped into the kernel's address space.  This is not the case
for HIGHMEM pages, so the optimization can not be made on 32-bit
systems.

Reviewed-by: Chunwei Chen <tuxoko@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #8580
2020-01-22 13:48:56 -08:00
Matthew Ahrens bee5738f77 make zil max block size tunable
We've observed that on some highly fragmented pools, most metaslab
allocations are small (~2-8KB), but there are some large, 128K
allocations.  The large allocations are for ZIL blocks.  If there is a
lot of fragmentation, the large allocations can be hard to satisfy.

The most common impact of this is that we need to check (and thus load)
lots of metaslabs from the ZIL allocation code path, causing sync writes
to wait for metaslabs to load, which can take a second or more.  In the
worst case, we may not be able to satisfy the allocation, in which case
the ZIL will resort to txg_wait_synced() to ensure the change is on
disk.

To provide a workaround for this, this change adds a tunable that can
reduce the size of ZIL blocks.

External-issue: DLPX-61719
Reviewed-by: George Wilson <george.wilson@delphix.com>
Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #8865
2020-01-22 13:48:56 -08:00
Tony Hutter 1222e921c9 Tag zfs-0.8.2
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2019-09-25 11:27:51 -07:00
Kody A Kantor c37fa0d5a8 Disabled resilver_defer feature leads to looping resilvers
When a disk is replaced with another on a pool with the resilver_defer
feature present, but not enabled the resilver activity restarts during
each spa_sync. This patch checks to make sure that the resilver_defer
feature is first enabled before requesting a deferred resilver.

This was originally fixed in illumos-joyent as OS-7982.

Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed by: Jerry Jelinek <jerry.jelinek@joyent.com>
Signed-off-by: Kody A Kantor <kody@kkantor.com>
External-issue: illumos-joyent OS-7982
Closes #9299
Closes #9338
2019-09-25 11:27:51 -07:00
Andriy Gapon 12a78fbb4f Fix dsl_scan_ds_clone_swapped logic
The was incorrect with respect to swapping dataset IDs both in the
on-disk ZAP object and the in-memory queue.

In both cases, if ds1 was already present, then it would be first
replaced with ds2 and then ds would be replaced back with ds1.
Also, both cases did not properly handle a situation where both ds1 and
ds2 are already queued.  A duplicate insertion would be attempted and
its failure would result in a panic.

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Andriy Gapon <avg@FreeBSD.org>
Closes #9140
Closes #9163
2019-09-25 11:27:51 -07:00
loli10K 63d8f57fe7 Scrubbing root pools may deadlock on kernels without elevator_change() (#9321)
Originally the zfs_vdev_elevator module option was added as a
convenience so the requested elevator would be automatically set
on the underlying block devices. At the time this was simple
because the kernel provided an API function which did exactly this.

This API was then removed in the Linux 4.12 kernel which prompted
us to add compatibly code to set the elevator via a usermodehelper.

Unfortunately changing the evelator via usermodehelper requires reading
some userland binaries, most notably modprobe(8) or sh(1), from a zfs
dataset on systems with root-on-zfs. This can deadlock the system if
used during the following call path because it may need, if the data
is not already cached in the ARC, reading directly from disk while
holding the spa config lock as a writer:

  zfs_ioc_pool_scan()
    -> spa_scan()
      -> spa_scan()
        -> vdev_reopen()
          -> vdev_elevator_switch()
            -> call_usermodehelper()

While the usermodehelper waits sh(1), modprobe(8) is blocked in the
ZIO pipeline trying to read from disk:

  INFO: task modprobe:2650 blocked for more than 10 seconds.
       Tainted: P           OE     5.2.14
  modprobe        D    0  2650    206 0x00000000
  Call Trace:
   ? __schedule+0x244/0x5f0
   schedule+0x2f/0xa0
   cv_wait_common+0x156/0x290 [spl]
   ? do_wait_intr_irq+0xb0/0xb0
   spa_config_enter+0x13b/0x1e0 [zfs]
   zio_vdev_io_start+0x51d/0x590 [zfs]
   ? tsd_get_by_thread+0x3b/0x80 [spl]
   zio_nowait+0x142/0x2f0 [zfs]
   arc_read+0xb2d/0x19d0 [zfs]
   ...
   zpl_iter_read+0xfa/0x170 [zfs]
   new_sync_read+0x124/0x1b0
   vfs_read+0x91/0x140
   ksys_read+0x59/0xd0
   do_syscall_64+0x4f/0x130
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

This commit changes how we use the usermodehelper functionality from
synchronous (UMH_WAIT_PROC) to asynchronous (UMH_NO_WAIT) which prevents
scrubs, and other vdev_elevator_switch() consumers, from triggering the
aforementioned issue.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Issue #8664
Closes #9321
2019-09-25 11:27:51 -07:00
Chengfei ZHu 9fa8b5b55b QAT related bug fixes
1. Fix issue:  Kernel BUG with QAT during decompression  #9276.
   Now it is uninterruptible for a specific given QAT request,
   but Ctrl-C interrupt still works in user-space process.

2. Copy the digest result to the buffer only when doing encryption,
   and vise-versa for decryption.

Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chengfei Zhu <chengfeix.zhu@intel.com>
Closes #9276
Closes #9303
2019-09-25 11:27:51 -07:00
Brian Behlendorf e17445d1f7 kmodtool: depmod path
Determine the location of depmod on the system, either /sbin/depmod or
/usr/sbin/depmod.  Then use that path when generating the specfile.

Additionally, update the Requires lines to reference the package which
provides depmod rather than the binary itself.  For CentOS/RHEL 7+8
and all supported Fedora releases this is the kmod package, and for
CentOS/RHEL 6 it is the module-init-tools package.

Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #8724
Closes #9310
2019-09-25 11:27:51 -07:00
Brian Behlendorf 97d4986214 Fix /etc/hostid on root pool deadlock
Accidentally introduced by dc04a8c which now takes the SCL_VDEV lock
as a reader in zfs_blkptr_verify().  A deadlock can occur if the
/etc/hostid file resides on a dataset in the same pool.  This is
because reading the /etc/hostid file may occur while the caller is
holding the SCL_VDEV lock as a writer.  For example, to perform a
`zpool attach` as shown in the abbreviated stack below.

To resolve the issue we cache the system's hostid when initializing
the spa_t, or when modifying the multihost property.  The cached
value is then relied upon for subsequent accesses.

Call Trace:
    spa_config_enter+0x1e8/0x350 [zfs]
    zfs_blkptr_verify+0x33c/0x4f0 [zfs] <--- trying read lock
    zio_read+0x6c/0x140 [zfs]
    ...
    vfs_read+0xfc/0x1e0
    kernel_read+0x50/0x90
    ...
    spa_get_hostid+0x1c/0x38 [zfs]
    spa_config_generate+0x1a0/0x610 [zfs]
    vdev_label_init+0xa0/0xc80 [zfs]
    vdev_create+0x98/0xe0 [zfs]
    spa_vdev_attach+0x14c/0xb40 [zfs] <--- grabbed write lock

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9256
Closes #9285
2019-09-25 11:27:51 -07:00
Olaf Faaland 0ae5f0c8d2 BuildRequires libtirpc-devel needed for RHEL 8
Building against RHEL 8 requires libtirpc-devel, as with fedora 28.
Add rhel8 and centos8 options to the test, to account for that.

BuildRequires Originally added for fedora 28 via commit
1a62a305be

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #9289
2019-09-25 11:27:51 -07:00
loli10K 146d7d8846 Fix zpool subcommands error message with some unsupported options
Both 'detach' and 'online' zpool subcommands, when provided with an
unsupported option, forget to print it in the error message:

   # zpool online -t rpool vda3
   invalid option ''
   usage:
      online [-e] <pool> <device> ...

This changes fixes the error message in order to include the actual
option that is not supported.

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #9270
2019-09-25 11:27:51 -07:00
loli10K 9f261b1be6 Fix zfs-dkms .deb package warning in prerm script
Debian zfs-dkms package generated by alien doesn't call the prerm script
(rpm's %preun) with an integer as first parameter, which results in the
following warning when the package is uninstalled:

   "zfs-dkms.prerm: line 3: [: remove: integer expression expected"

Modify the if-condition to avoid the warning.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #9271
2019-09-25 11:27:51 -07:00
Pavel Zakharov 5acba22ec0 zvol_wait script should ignore partially received zvols
Partially received zvols won't have links in /dev/zvol.

Reviewed-by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Closes #9260
2019-09-25 11:27:51 -07:00
Pavel Zakharov 38528476bf New service that waits on zvol links to be created
The zfs-volume-wait.service scans existing zvols and waits for their
links under /dev to be created. Any service that depends on zvol
links to be there should add a dependency on zfs-volumes.target.
By default, this target is not enabled.

Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Reviewed-by: Antonio Russo <antonio.e.russo@gmail.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: John Gallagher <john.gallagher@delphix.com>
Reviewed-by: George Wilson <gwilson@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Pavel Zakharov <pzakharov@delphix.com>
Closes #8975
2019-09-25 11:27:51 -07:00
Andriy Gapon beb21db3c6 Always refuse receving non-resume stream when resume state exists
This fixes a hole in the situation where the resume state is left from
receiving a new dataset and, so, the state is set on the dataset itself
(as opposed to %recv child).

Additionally, distinguish incremental and resume streams in error
messages.

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andriy Gapon <avg@FreeBSD.org>
Closes #9252
2019-09-25 11:27:51 -07:00
loli10K 13e5e396a3 Fix Intel QAT / ZFS compatibility on v4.7.1+ kernels
This change use the compat code introduced in 9cc1844a.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #9268
Closes #9269
2019-09-25 11:27:51 -07:00
Georgy Yakovlev 3cf4ecb03f etc/init.d/zfs-functions.in: remove arch warning
Remove the x86_64 warning, it's no longer the case that this is the
only supported architecture.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Georgy Yakovlev <gyakovlev@gentoo.org>
Closes: #9177
2019-09-25 11:27:51 -07:00
Pavel Zakharov 0e765c4eb8 zfs_handle used after being closed/freed in change_one callback
This is a typical case of use after free. We would call zfs_close(zhp)
which would free the handle, and then call zfs_iter_children() on that
handle later.  This change ensures that the zfs_handle is only closed
when we are ready to return.

Running `zfs inherit -r sharenfs pool` was failing with an error
code without any error messages. After some debugging I've pinpointed
the issue to be memory corruption, which would cause zfs to try to
issue an ioctl to the wrong device and receive ENOTTY.

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: George Wilson <gwilson@delphix.com>
Reviewed-by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alek Pinchuk <apinchuk@datto.com>
Signed-off-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Issue #7967
Closes #9165
2019-09-25 11:27:51 -07:00
Chunwei Chen c7a4255f12 Fix zil replay panic when TX_REMOVE followed by TX_CREATE
If TX_REMOVE is followed by TX_CREATE on the same object id, we need to
make sure the object removal is completely finished before creation. The
current implementation relies on dnode_hold_impl with
DNODE_MUST_BE_ALLOCATED returning ENOENT. While this check seems to work
fine before, in current version it does not guarantee the object removal
is completed.

We fix this by checking if DNODE_MUST_BE_FREE returns successful
instead. Also add test and remove dead code in dnode_hold_impl.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #7151
Closes #8910
Closes #9123
Closes #9145
2019-09-25 11:27:51 -07:00
Andriy Gapon 931bef81c8 zfs_ioc_snapshot: check user-prop permissions on snapshotted datasets
Previously, the permissions were checked on the pool which was obviously
incorrect.

After this change, zfs_check_userprops() only validates the properties
without any permission checks.  The permissions are checked individually
for each snapshotted dataset.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Signed-off-by: Andriy Gapon <avg@FreeBSD.org>
Closes #9179
Closes #9180
2019-09-25 11:27:50 -07:00
Richard Allen ea34735203 Fix Plymouth passphrase prompt in initramfs script
Entering the ZFS encryption passphrase under Plymouth wasn't working
because in the ZFS initrd script, Plymouth was calling zfs via
"--command", which wasn't passing through the filesystem argument to
zfs load-key properly (it was passing through the single quotes around
the filesystem name intended to handle spaces literally,
which zfs load-key couldn't understand).

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Garrett Fields <ghfields@gmail.com>
Signed-off-by: Richard Allen <belperite@gmail.com>
Issue #9193
Closes #9202
2019-09-25 11:27:50 -07:00
Tom Caputi 95319fc569 Fix deadlock in 'zfs rollback'
Currently, the 'zfs rollback' code can end up deadlocked due to
the way the kernel handles unreferenced inodes on a suspended fs.
Essentially, the zfs_resume_fs() code path may cause zfs to spawn
new threads as it reinstantiates the suspended fs's zil. When a
new thread is spawned, the kernel may attempt to free memory for
that thread by freeing some unreferenced inodes. If it happens to
select inodes that are a a part of the suspended fs a deadlock
will occur because freeing inodes requires holding the fs's
z_teardown_inactive_lock which is still held from the suspend.

This patch corrects this issue by adding an additional reference
to all inodes that are still present when a suspend is initiated.
This prevents them from being freed by the kernel for any reason.

Reviewed-by: Alek Pinchuk <apinchuk@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #9203
2019-09-25 11:27:50 -07:00
Ryan Moeller 33374f21f0 Make slog test setup more robust
The slog tests fail when attempting to create pools using file vdevs
that already exist from previous test runs. Remove these files in the
setup for the test.

Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Ryan Moeller <ryan@ixsystems.com>
Closes #9194
2019-09-25 11:27:50 -07:00
yshui 512a50f38d zfs-mount-genrator: dependencies should be space-separated
Reviewed-by: Antonio Russo <antonio.e.russo@gmail.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com>
Closes #9174
2019-09-25 11:27:50 -07:00
Tony Hutter 023ab67a64 Linux 5.3: Fix switch() fall though compiler errors
Fix some switch() fall-though compiler errors:

    abd.c:1504:9: error: this statement may fall through

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #9170
2019-09-25 11:27:50 -07:00
Dominic Pearson 65469f6e30 Linux 5.3 compat: Makefile subdir-m no longer supported
Uses obj-m instead, due to kernel changes.

See LKML: Masahiro Yamada, Tue, 6 Aug 2019 19:03:23 +0900

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Dominic Pearson <dsp@technoanimal.net>
Closes #9169
2019-09-25 11:27:50 -07:00
Chunwei Chen 569f5d5d05 Fix out-of-order ZIL txtype lost on hardlinked files
We should only call zil_remove_async when an object is removed. However,
in current implementation, it is called whenever TX_REMOVE is called. In
the case of hardlinked file, every unlink will generate TX_REMOVE and
causing operations to be dropped even when the object is not removed.

We fix this by only calling zil_remove_async when the file is fully
unlinked.

Reviewed-by: George Wilson <gwilson@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Prakash Surya <prakash.surya@delphix.com>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #8769
Closes #9061
2019-09-25 11:27:50 -07:00
Michael Niewöhner 6d1599c1e1 Increase default zcmd allocation to 256K
When creating hundreds of clones (for example using containers with
LXD) cloning slows down as the number of clones increases over time.
The reason for this is that the fetching of the clone information
using a small zcmd buffer requires two ioctl calls, one to determine
the size and a second to return the data. However, this requires
gathering the data twice, once to determine the size and again to
populate the zcmd buffer to return it to userspace.
These are expensive ioctl() calls, so instead, make the default buffer
size much larger: 256K.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Michael Niewöhner <foss@mniewoehner.de>
Closes #9084
2019-09-25 11:27:50 -07:00
Matthew Ahrens 6c9882d5db Improve performance by using dmu_tx_hold_*_by_dnode()
In zfs_write() and dmu_tx_hold_sa(), we can use dmu_tx_hold_*_by_dnode()
instead of dmu_tx_hold_*(), since we already have a dbuf from the target
dnode in hand.  This eliminates some calls to dnode_hold(), which can be
expensive.  This is especially impactful if several threads are
accessing objects that are in the same block of dnodes, because they
will contend for that dbuf's lock.

We are seeing 10-20% performance wins for the sequential_writes tests in
the performance test suite, when doing >=128K writes to files with
recordsize=8K.

This also removes some unnecessary casts that are in the area.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #9081
2019-09-25 11:27:50 -07:00
Brian Behlendorf 8c00159411 Fix channel programs on s390x
When adapting the original sources for s390x the JMP_BUF_CNT was
mistakenly halved due to an incorrect assumption of the size of
a unsigned long.  They are 8 bytes for the s390x architecture.
Increase JMP_BUF_CNT accordingly.

Authored-by: Don Brady <don.brady@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reported-by: Colin Ian King <canonical.com>
Tested-by: Colin Ian King <canonical.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #8992
Closes #9080
2019-09-25 11:27:50 -07:00
George Wilson a8c5bcb5de Race between zfs-share and zfs-mount services
When a system boots the zfs-mount.service and the
zfs-share.service can start simultaneously. What may be
unclear is that sharing a filesystem will first mount
the filesystem if it's not already mounted. This means
that both service can race to mount the same fileystem.
This race can result in a SEGFAULT or EBUSY conditions.

This change explicitly defines the start ordering between the
two services such that the zfs-mount.service is solely
responsible for mounting filesystems eliminating the race
between "zfs mount -a" and "zfs share -a" commands.

Reviewed-by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Wilson <george.wilson@delphix.com>
Closes #9083
2019-09-25 11:27:50 -07:00
Tomohiro Kusumi 6c68594675 Implement secpolicy_vnode_setid_retain()
Don't unconditionally return 0 (i.e. retain SUID/SGID).
Test CAP_FSETID capability.

https://github.com/pjd/pjdfstest/blob/master/tests/chmod/12.t
which expects SUID/SGID to be dropped on write(2) by non-owner fails
without this. Most filesystems make this decision within VFS by using
a generic file write for fops.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Closes #9035
Closes #9043
2019-09-25 11:27:50 -07:00
Matthew Ahrens 1f5979d23f zed crashes when devid not present
zed core dumps due to a NULL pointer in zfs_agent_iter_vdev(). The
gs_devid is NULL, but the nvl has a "devid" entry.

zfs_agent_post_event() checks that ZFS_EV_VDEV_GUID or DEV_IDENTIFIER is
present in nvl, but then later it and zfs_agent_iter_vdev() assume that
DEV_IDENTIFIER is present and thus gs_devid is set.

Typically this is not a problem because usually either all vdevs have
devid's, or none of them do. Since zfs_agent_iter_vdev() first checks if
the vdev has devid before dereferencing gs_devid, the problem isn't
typically encountered. However, if some vdevs have devid's and some do
not, then the problem is easily reproduced.  This can happen if the pool
has been moved from a system that has devid's to one that does not.

The fix is for zfs_agent_iter_vdev() to only try to match the devid's if
both nvl and gsp have devid's present.

Reviewed-by: Prashanth Sreenivasa <pks@delphix.com>
Reviewed-by: Don Brady <don.brady@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-65090
Closes #9054
Closes #9060
2019-09-25 11:27:50 -07:00
Tomohiro Kusumi 4f951b183c Don't directly cast unsigned long to void*
Cast to uintptr_t first for portability on integer to/from pointer
conversion.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Closes #9065
2019-09-25 11:27:50 -07:00
Tomohiro Kusumi 65a0b28b42 Fix module_param() type for zfs_read_chunk_size
zfs_read_chunk_size is unsigned long.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Closes #9051
2019-09-25 11:27:50 -07:00
Tony Hutter be068aeea8 Move some tests to cli_user/zpool_status
The tests in tests/functional/cli_root/zpool_status should all require
root. However, linux.run has "user =" specified for those tests, which
means they run as a normal user.  When I removed that line to run them
as root, the following tests did not pass:

zpool_status_003_pos
zpool_status_-c_disable
zpool_status_-c_homedir
zpool_status_-c_searchpath

These tests need to be run as a normal user.  To fix this, move these
tests to a new tests/functional/cli_user/zpool_status directory.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #9057
2019-09-25 11:27:50 -07:00
Serapheim Dimitropoulos 1c4b0fc745 Race condition between spa async threads and export
In the past we've seen multiple race conditions that have
to do with open-context threads async threads and concurrent
calls to spa_export()/spa_destroy() (including the one
referenced in issue #9015).

This patch ensures that only one thread can execute the
main body of spa_export_common() at a time, with subsequent
threads returning with a new error code created just for
this situation, eliminating this way any race condition
bugs introduced by concurrent calls to this function.

Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes #9015
Closes #9044
2019-09-25 11:27:50 -07:00
Serapheim Dimitropoulos bbbe4b0a98 hdr_recl calls zthr_wakeup() on destroyed zthr
There exists a race condition were hdr_recl() calls
zthr_wakeup() on a destroyed zthr. The timeline is the
following:

[1] hdr_recl() runs first and goes intro zthr_wakeup()
    because arc_initialized is set.
[2] arc_fini() is called by another thread, zeroes
    that flag, destroying the zthr, and goes into
    buf_init().
[3] hdr_recl() tries to enter the destroyed mutex
    and we blow up.

This patch ensures that the ARC's zthrs are not offloaded
any new work once arc_initialized is set and then destroys
them after all of the ARC state has been deleted.

Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes #9047
2019-09-25 11:27:50 -07:00
Tomohiro Kusumi 3c144b9267 Fix wrong comment on zcr_blksz_{min,max}
These aren't tunable; illumos has this comment fixed in
"3742 zfs comments need cleaner, more consistent style",
so sync with that.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Closes #9052
2019-09-25 11:27:50 -07:00
Brian Behlendorf 428a63cc62 Retire unused spl_{mutex,rwlock}_{init_fini}
These functions are unused and can be removed along
with the spl-mutex.c and spl-rwlock.c source files.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9029
2019-09-25 11:27:49 -07:00
Brian Behlendorf 3982d959c5 Linux 5.3 compat: retire rw_tryupgrade()
The Linux kernel's rwsem's have never provided an interface to
allow a reader to be upgraded to a writer.  Historically, this
functionality has been implemented by a SPL wrapper function.
However, this approach depends on internal knowledge of the
rw_semaphore and is therefore rather brittle.

Since the ZFS code must always be able to fallback to rw_exit()
and rw_enter() when an rw_tryupgrade() fails; this functionality
isn't critical.  Furthermore, the only potentially performance
sensitive consumer is dmu_zfetch() and no decrease in performance
was observed with this change applied.  See the PR comments for
additional testing details.

Therefore, it is being retired to make the build more robust and
to simplify the rwlock implementation.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9029
2019-09-25 11:27:49 -07:00
Brian Behlendorf 54561073e7 Linux 5.3 compat: rw_semaphore owner
Commit https://github.com/torvalds/linux/commit/94a9717b updated the
rwsem's owner field to contain additional flags describing the rwsem's
state.  Rather then update the wrappers to mask out these bits, the
code no longer relies on the owner stored by the kernel.  This does
increase the size of a krwlock_t but it makes the implementation
less sensitive to future kernel changes.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9029
2019-09-25 11:27:49 -07:00
jdike 4c98586daf Fix lockdep recursive locking false positive in dbuf_destroy
lockdep reports a possible recursive lock in dbuf_destroy.

It is true that dbuf_destroy is acquiring the dn_dbufs_mtx
on one dnode while holding it on another dnode.  However,
it is impossible for these to be the same dnode because,
among other things,dbuf_destroy checks MUTEX_HELD before
acquiring the mutex.

This fix defines a class NESTED_SINGLE == 1 and changes
that lock to call mutex_enter_nested with a subclass of
NESTED_SINGLE.

In order to make the userspace code compile,
include/sys/zfs_context.h now defines mutex_enter_nested and
NESTED_SINGLE.

This is the lockdep report:

[  122.950921] ============================================
[  122.950921] WARNING: possible recursive locking detected
[  122.950921] 4.19.29-4.19.0-debug-d69edad5368c1166 #1 Tainted: G           O
[  122.950921] --------------------------------------------
[  122.950921] dbu_evict/1457 is trying to acquire lock:
[  122.950921] 0000000083e9cbcf (&dn->dn_dbufs_mtx){+.+.}, at: dbuf_destroy+0x3c0/0xdb0 [zfs]
[  122.950921]
               but task is already holding lock:
[  122.950921] 0000000055523987 (&dn->dn_dbufs_mtx){+.+.}, at: dnode_evict_dbufs+0x90/0x740 [zfs]
[  122.950921]
               other info that might help us debug this:
[  122.950921]  Possible unsafe locking scenario:

[  122.950921]        CPU0
[  122.950921]        ----
[  122.950921]   lock(&dn->dn_dbufs_mtx);
[  122.950921]   lock(&dn->dn_dbufs_mtx);
[  122.950921]
                *** DEADLOCK ***

[  122.950921]  May be due to missing lock nesting notation

[  122.950921] 1 lock held by dbu_evict/1457:
[  122.950921]  #0: 0000000055523987 (&dn->dn_dbufs_mtx){+.+.}, at: dnode_evict_dbufs+0x90/0x740 [zfs]
[  122.950921]
               stack backtrace:
[  122.950921] CPU: 0 PID: 1457 Comm: dbu_evict Tainted: G           O      4.19.29-4.19.0-debug-d69edad5368c1166 #1
[  122.950921] Hardware name: Supermicro H8SSL-I2/H8SSL-I2, BIOS 080011  03/13/2009
[  122.950921] Call Trace:
[  122.950921]  dump_stack+0x91/0xeb
[  122.950921]  __lock_acquire+0x2ca7/0x4f10
[  122.950921]  lock_acquire+0x153/0x330
[  122.950921]  dbuf_destroy+0x3c0/0xdb0 [zfs]
[  122.950921]  dbuf_evict_one+0x1cc/0x3d0 [zfs]
[  122.950921]  dbuf_rele_and_unlock+0xb84/0xd60 [zfs]
[  122.950921]  dnode_evict_dbufs+0x3a6/0x740 [zfs]
[  122.950921]  dmu_objset_evict+0x7a/0x500 [zfs]
[  122.950921]  dsl_dataset_evict_async+0x70/0x480 [zfs]
[  122.950921]  taskq_thread+0x979/0x1480 [spl]
[  122.950921]  kthread+0x2e7/0x3e0
[  122.950921]  ret_from_fork+0x27/0x50

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Jeff Dike <jdike@akamai.com>
Closes #8984
2019-09-25 11:27:49 -07:00
Michael Niewöhner ceb516ac2f Add missing __GFP_HIGHMEM flag to vmalloc
Make use of __GFP_HIGHMEM flag in vmem_alloc, which is required for
some 32-bit systems to make use of full available memory.
While kernel versions >=4.12-rc1 add this flag implicitly, older
kernels do not.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Sebastian Gottschall <s.gottschall@dd-wrt.com>
Signed-off-by: Michael Niewöhner <foss@mniewoehner.de>
Closes #9031
2019-09-25 11:27:49 -07:00
Tomohiro Kusumi 2b9f73e5e6 Use zfsctl_snapshot_hold() wrapper
zfs_refcount_*() are to be wrapped by zfsctl_snapshot_*() in this file.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Closes #9039
2019-09-25 11:27:49 -07:00
Brian Behlendorf 984bfb373f Minor style cleanup
Resolve an assortment of style inconsistencies including
use of white space, typos, capitalization, and line wrapping.
There is no functional change.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9030
2019-09-25 11:27:49 -07:00
Brian Behlendorf 446d08fba4 Fix get_special_prop() build failure
The cast of the size_t returned by strlcpy() to a uint64_t by the
VERIFY3U can result in a build failure when CONFIG_FORTIFY_SOURCE
is set.  This is due to the additional hardening.  Since the token
is expected to always fit in strval the VERIFY3U has been removed.
If somehow it doesn't, it will still be safely truncated.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Don Brady <don.brady@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #8999
Closes #9020
2019-09-25 11:27:49 -07:00
Antonio Russo af7a5672c3 systemd encryption key support
Modify zfs-mount-generator to produce a dependency on new
zfs-import-key-*.service units, dynamically created at boot to call
zfs load-key for the encryption root, before attempting to mount any
encrypted datasets.

These units are created by zfs-mount-generator, and RequiresMountsFor on
the keyfile, if present, or call systemd-ask-password if a passphrase is
requested.

This patch includes suggestions from @Fabian-Gruenbichler, @ryanjaeb and
@rlaager, as well an adaptation of @rlaager's script to retry on
incorrect password entry.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Antonio Russo <antonio.e.russo@gmail.com>
Closes #8750
Closes #8848
2019-09-25 11:27:49 -07:00
Tomohiro Kusumi 73e50a7d5d Drop redundant POSIX ACL check in zpl_init_acl()
ZFS_ACLTYPE_POSIXACL has already been tested in zpl_init_acl(),
so no need to test again on POSIX ACL access.

Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Closes #9009
2019-09-25 11:27:49 -07:00
Brian Behlendorf d751b12a9d Export dnode symbols
External consumers such as Lustre require access to the dnode
interfaces in order to correctly manipulate dnodes.

Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #8994
Closes #9027
2019-09-25 11:27:49 -07:00
Tom Caputi 78831d4290 Ensure dsl_destroy_head() decrypts objsets
This patch corrects a small issue where the dsl_destroy_head()
code that runs when the async_destroy feature is disabled would
not properly decrypt the dataset before beginning processing.
If the dataset is not able to be decrypted, the optimization
code now simply does not run and the dataset is completely
destroyed in the DSL sync task.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #9021
2019-09-25 11:27:49 -07:00
Tomohiro Kusumi 0a223246e1 Disable unused pathname::pn_path* (unneeded in Linux)
struct pathname is originally from Solaris VFS, and it has been used
in ZoL to merely call VOP from Linux VFS interface without API change,
therefore pathname::pn_path* are unused and unneeded. Technically,
struct pathname is a wrapper for C string in ZoL.

Saves stack a bit on lookup and unlink.

(#if0'd members instead of removing since comments refer to them.)

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Closes #9025
2019-09-25 11:27:49 -07:00
Nick Mattis cf966cb19a Fixes: #8934 Large kmem_alloc
Large allocation over the spl_kmem_alloc_warn value was being performed.
Switched to vmem_alloc interface as specified for large allocations.
Changed the subsequent frees to match.

Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: nmattis <nickm970@gmail.com>
Closes #8934
Closes #9011
2019-09-25 11:27:49 -07:00
Attila Fülöp 6e19cc77cf Fix ZTS killed processes detection
log_neg_expect was using the wrong exit status to detect if a process
got killed by SIGSEGV or SIGBUS, resulting in false positives.

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes #9003
2019-09-25 11:27:49 -07:00
Shaun Tancheff c3a3c5a30f pkg-utils python sitelib for SLES15
Use python -Esc to set __python_sitelib.

Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Closes #8969
2019-09-25 11:27:49 -07:00
Tomohiro Kusumi ccd8125e45 Fix race in parallel mount's thread dispatching algorithm
Strategy of parallel mount is as follows.

1) Initial thread dispatching is to select sets of mount points that
 don't have dependencies on other sets, hence threads can/should run
 lock-less and shouldn't race with other threads for other sets. Each
 thread dispatched corresponds to top level directory which may or may
 not have datasets to be mounted on sub directories.

2) Subsequent recursive thread dispatching for each thread from 1)
 is to mount datasets for each set of mount points. The mount points
 within each set have dependencies (i.e. child directories), so child
 directories are processed only after parent directory completes.

The problem is that the initial thread dispatching in
zfs_foreach_mountpoint() can be multi-threaded when it needs to be
single-threaded, and this puts threads under race condition. This race
appeared as mount/unmount issues on ZoL for ZoL having different
timing regarding mount(2) execution due to fork(2)/exec(2) of mount(8).
`zfs unmount -a` which expects proper mount order can't unmount if the
mounts were reordered by the race condition.

There are currently two known patterns of input list `handles` in
`zfs_foreach_mountpoint(..,handles,..)` which cause the race condition.

1) #8833 case where input is `/a /a /a/b` after sorting.
 The problem is that libzfs_path_contains() can't correctly handle an
 input list with two same top level directories.
 There is a race between two POSIX threads A and B,
  * ThreadA for "/a" for test1 and "/a/b"
  * ThreadB for "/a" for test0/a
 and in case of #8833, ThreadA won the race. Two threads were created
 because "/a" wasn't considered as `"/a" contains "/a"`.

2) #8450 case where input is `/ /var/data /var/data/test` after sorting.
 The problem is that libzfs_path_contains() can't correctly handle an
 input list containing "/".
 There is a race between two POSIX threads A and B,
  * ThreadA for "/" and "/var/data/test"
  * ThreadB for "/var/data"
 and in case of #8450, ThreadA won the race. Two threads were created
 because "/var/data" wasn't considered as `"/" contains "/var/data"`.
 In other words, if there is (at least one) "/" in the input list,
 the initial thread dispatching must be single-threaded since every
 directory is a child of "/", meaning they all directly or indirectly
 depend on "/".

In both cases, the first non_descendant_idx() call fails to correctly
determine "path1-contains-path2", and as a result the initial thread
dispatching creates another thread when it needs to be single-threaded.
Fix a conditional in libzfs_path_contains() to consider above two.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Closes #8450
Closes #8833
Closes #8878
2019-09-25 11:27:49 -07:00
loli10K 2ac233c633 Fix dracut Debian/Ubuntu packaging
This commit ensures make(1) targets that build .deb packages fail if
alien(1) can't convert all .rpm files; additionally it also updates
the zfs-dracut package name which was changed to "noarch" in ca4e5a7.

Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #8990
Closes #8991
2019-09-25 11:27:49 -07:00
Tom Caputi 1f72a18f59 Remove VERIFY from dsl_dataset_crypt_stats()
This patch fixes an issue where dsl_dataset_crypt_stats() would
VERIFY that it was able to hold the encryption root. This function
should instead silently continue without populating the related
field in the nvlist, as is the convention for this code.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #8976
2019-09-25 11:27:49 -07:00
Paul Zuchowski 14a11bf2f6 Improve "Unable to automount" error message.
Having the mountpoint and dataset name both in the message made it
confusing to read.  Additionally, convert this to a zfs_dbgmsg rather than
sending it to the console.

Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Zuchowski <pzuchowski@datto.com>
Closes #8959
2019-09-25 11:27:49 -07:00
Brian Behlendorf 7a03d7c73c Check b_freeze_cksum under ZFS_DEBUG_MODIFY conditional
The b_freeze_cksum field can only have data when ZFS_DEBUG_MODIFY
is set.  Therefore, the EQUIV check must be wrapped accordingly.
For the same reason the ASSERT in arc_buf_fill() in unsafe.
However, since it's largely redundant it has simply been removed.

Reviewed-by: George Wilson <gwilson@delphix.com>
Reviewed-by: Allan Jude <allanjude@freebsd.org>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #8979
2019-09-25 11:27:49 -07:00
Tom Caputi 9e09826b33 Fix error text for EINVAL in zfs_receive_one()
This small patch fixes the EINVAL case for zfs_receive_one(). A
missing 'else' has been added to the two possible cases, which
will ensure the intended error message is printed.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #8977
2019-09-25 11:27:48 -07:00
Tomohiro Kusumi 093bb64461 Don't use d_path() for automount mount point for chroot'd process
Chroot'd process fails to automount snapshots due to realpath(3)
failure in mount.zfs(8).

Construct a mount point path from sb of the ctldir inode and dirent
name, instead of from d_path(), so that chroot'd process doesn't get
affected by its view of fs.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Closes #8903
Closes #8966
2019-09-25 11:27:48 -07:00
George Wilson 7d2489cfad nopwrites on dmu_sync-ed blocks can result in a panic
After device removal, performing nopwrites on a dmu_sync-ed block
will result in a panic. This panic can show up in two ways:
1. an attempt to issue an IOCTL in vdev_indirect_io_start()
2. a failed comparison of zio->io_bp and zio->io_bp_orig in
   zio_done()
To resolve both of these panics, nopwrites of blocks on indirect
vdevs should be ignored and new allocations should be performed on
concrete vdevs.

Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Don Brady <don.brady@delphix.com>
Signed-off-by: George Wilson <gwilson@delphix.com>
Closes #8957
2019-09-25 11:27:48 -07:00
Alexander Motin 04d4df89f4 Avoid extra taskq_dispatch() calls by DMU
DMU sync code calls taskq_dispatch() for each sublist of os_dirty_dnodes
and os_synced_dnodes.  Since the number of sublists by default is equal
to number of CPUs, it will dispatch equal, potentially large, number of
tasks, waking up many CPUs to handle them, even if only one or few of
sublists actually have any work to do.

This change adds check for empty sublists to avoid this.

Reviewed by: Sean Eric Fagan <sef@ixsystems.com>
Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by:  Alexander Motin <mav@FreeBSD.org>
Closes #8909
2019-09-25 11:27:48 -07:00
Igor K 05006f125c -Y option for zdb is valid
The -Y option was added for ztest to test split block reconstruction.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Signed-off-by: Igor Kozhukhov <igor@dilos.org>
Closes #8926
2019-09-25 11:27:48 -07:00
Tom Caputi bfe5f029cf Fix error message on promoting encrypted dataset
This patch corrects the error message reported when attempting
to promote a dataset outside of its encryption root.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #8905
Closes #8935
2019-09-25 11:27:48 -07:00
Brian Behlendorf cc7fe8a599 Fix out-of-tree build failures
Resolve the incorrect use of srcdir and builddir references for
various files in the build system.  These have crept in over time
and went unnoticed because when building in the top level directory
srcdir and builddir are identical.

With this change it's again possible to build in a subdirectory.

    $ mkdir obj
    $ cd obj
    $ ../configure
    $ make

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Don Brady <don.brady@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #8921
Closes #8943
2019-09-25 11:27:48 -07:00
Matthew Ahrens 7d64595c25 dn_struct_rwlock can not be held in dmu_tx_try_assign()
The thread calling dmu_tx_try_assign() can't hold the dn_struct_rwlock
while assigning the tx, because this can lead to deadlock. Specifically,
if this dnode is already assigned to an earlier txg, this thread may
need to wait for that txg to sync (the ERESTART case below).  The other
thread that has assigned this dnode to an earlier txg prevents this txg
from syncing until its tx can complete (calling dmu_tx_commit()), but it
may need to acquire the dn_struct_rwlock to do so (e.g. via
dmu_buf_hold*()).

This commit adds an assertion to dmu_tx_try_assign() to ensure that this
deadlock is not inadvertently introduced.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #8929
2019-09-25 11:27:48 -07:00
gordan-bobic be4a282a8d Remove arch and relax version dependency
Remove arch and relax version dependency for zfs-dracut
package.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gordan Bobic <gordan@redsleeve.org>
Issue #8913
Closes #8914
2019-09-25 11:27:48 -07:00
Harry Mallon 2d88230d97 Add libnvpair to libzfs pkg-config
Functions such as `fnvlist_lookup_nvlist` need libnvpair to be linked.
Default pkg-config file did not contain it.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Harry Mallon <hjmallon@gmail.com>
Closes #8919
2019-09-25 11:27:48 -07:00
Don Brady 95fcb04215 Let zfs mount all tolerate in-progress mounts
The zfs-mount service can unexpectedly fail to start when zfs
encounters a mount that is in progress. This service uses
zfs mount -a, which has a window between the time it checks if
the dataset was mounted and when the actual mount (via mount.zfs
binary) occurs.

The reason for the racing mounts is that both zfs-mount.target
and zfs-share.target are allowed to execute concurrently after
the import.  This is more of an issue with the relatively recent
addition of parallel mounting, and we should consider serializing
the mount and share targets.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Allan Jude <allanjude@freebsd.org>
Signed-off-by: Don Brady <don.brady@delphix.com>
Closes #8881
2019-09-25 11:27:48 -07:00
Allan Jude d053481523 zstreamdump: add per-record-type counters and an overhead counter
Count the bytes of payload for each replication record type

Count the bytes of overhead (replication records themselves)

Include these counters in the output summary at the end of the run.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Signed-off-by: Allan Jude <allanjude@freebsd.org>
Sponsored-By: Klara Systems and Catalogic
Closes #8432
2019-09-25 11:27:48 -07:00
Paul Dagnelie 7a5f4656ce Fix comments on zfs_bookmark_phys
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #8945
2019-09-25 11:27:48 -07:00
Paul Dagnelie 1fd28bd8d4 Add SCSI_PASSTHROUGH to zvols to enable UNMAP support
When exporting ZVOLs as SCSI LUNs, by default Windows will not
issue them UNMAP commands. This reduces storage efficiency in
many cases.

We add the SCSI_PASSTHROUGH flag to the zvol's device queue,
which lets the SCSI target logic know that it can handle SCSI
commands.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: John Gallagher <john.gallagher@delphix.com>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #8933
2019-09-25 11:27:48 -07:00
Tomohiro Kusumi ab24c9cd4c Prevent pointer to an out-of-scope local variable
`show_str` could be a pointer to a local variable in stack
which is out-of-scope by the time
`return (snprintf(buf, buflen, "%s\n", show_str));`
is called.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Closes #8924
Closes #8940
2019-09-25 11:27:48 -07:00
Matthew Ahrens 3c2a42fd25 dedup=verify doesn't clear the blkptr's dedup flag
The logic to handle strong checksum collisions where the data doesn't
match is incorrect. It is not clearing the dedup bit of the blkptr,
which can cause a panic later in zio_ddt_free() due to the dedup table
not matching what is in the blkptr.

Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-48097
Closes #8936
2019-09-25 11:27:48 -07:00
Igor K 9af524b0ee Update vdev_ops_t from illumos
Align vdev_ops_t from illumos for better compatibility.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Igor Kozhukhov <igor@dilos.org>
Closes #8925
2019-09-25 11:27:48 -07:00
Tom Caputi b96ceeead2 Allow unencrypted children of encrypted datasets
When encryption was first added to ZFS, we made a decision to
prevent users from creating unencrypted children of encrypted
datasets. The idea was to prevent users from inadvertently
leaving some of their data unencrypted. However, since the
release of 0.8.0, some legitimate reasons have been brought up
for this behavior to be allowed. This patch simply removes this
limitation from all code paths that had checks for it and updates
the tests accordingly.

Reviewed-by: Jason King <jason.king@joyent.com>
Reviewed-by: Sean Eric Fagan <sef@ixsystems.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #8737
Closes #8870
2019-09-25 11:27:48 -07:00
dacianstremtan 01cc94f68d Replace whereis with type in zfs-lib.sh
The whereis command should not be used since it may not exist
in the initramfs.  The dracut plymouth module also uses the type
command instead of whereis.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Garrett Fields <ghfields@gmail.com>
Signed-off-by: Dacian Reece-Stremtan <dacianstremtan@gmail.com>
Closes #8920
Closes #8938
2019-09-25 11:27:48 -07:00
Tomohiro Kusumi fb6f6b47d6 Use ZFS_DEV macro instead of literals
The rest of the code/comments use ZFS_DEV, so sync with that.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Closes #8912
2019-09-25 11:27:48 -07:00
Michael Niewöhner 2087b6cf49 Fix memory leak in check_disk()
Reviewed-by: Allan Jude <allanjude@freebsd.org>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Signed-off-by: Michael Niewöhner <foss@mniewoehner.de>
Closes #8897
Closes #8911
2019-09-25 11:27:47 -07:00
Olaf Faaland 5b0327bc57 kmod-zfs-devel rpm should provide kmod-spl-devel
When configure is run with --with-spec=redhat, and rpms are built, the
kmod-zfs-devel package is missing

Provides: kmod-spl-devel = %{version}

which is required by software such as Lustre which builds against zfs
kmods.  Adding it makes it easier for such software to build against
both zfs-0.7 (where SPL is separate and may be missing) and zfs-0.8.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #8930
2019-09-25 11:27:47 -07:00
Brian Behlendorf b5e8d14a4b ZTS: Fix mmp_interval failure
The mmp_interval test case was failing on Fedora 30 due to the built-in
'echo' command terminating the script when it was unable to write to
the sysfs module parameter.  This change in behavior was observed with
ksh-2020.0.0-alpha1.  Resolve the issue by using the external cat
command which fails gracefully as expected.

Additionally, remove some incorrect quotes around the $? return values.

Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #8906
2019-09-25 11:27:47 -07:00
Alexander Motin ed7b0d357a Minimize aggsum_compare(&arc_size, arc_c) calls.
For busy ARC situation when arc_size close to arc_c is desired.  But
then it is quite likely that aggsum_compare(&arc_size, arc_c) will need
to flush per-CPU buckets to find exact comparison result.  Doing that
often in a hot path penalizes whole idea of aggsum usage there, since it
replaces few simple atomic additions with dozens of lock acquisitions.

Replacing aggsum_compare() with aggsum_upper_bound() in code increasing
arc_p when ARC is growing (arc_size < arc_c) according to PMC profiles
allows to save ~5% of CPU time in aggsum code during sequential write
to 12 ZVOLs with 16KB block size on large dual-socket system.

I suppose there some minor arc_p behavior change due to lower precision
of the new code, but I don't think it is a big deal, since it should
affect only very small window in time (aggsum buckets are flushed every
second) and in ARC size (buckets are limited to 10 average ARC blocks
per CPU).

Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Allan Jude <allanjude@freebsd.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by:  Alexander Motin <mav@FreeBSD.org>
Closes #8901
2019-09-25 11:27:47 -07:00
Ryan Moeller 9e54b9d930 Python config cleanup
Don't require Python at configure/build unless building pyzfs.
Move ZFS_AC_PYTHON_MODULE to always-pyzfs.m4 where it is used.
Make test syntax more consistent.

Sponsored by: iXsystems, Inc.
Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@ixsystems.com>
Closes #8895
2019-09-25 11:27:47 -07:00
Matthew Ahrens b033353b25 lz4_decompress_abd declared but not defined
`lz4_decompress_abd` is declared in zio_compress.h but it is not defined
anywhere. The declaration should be removed.

Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed-by: Allan Jude <allanjude@freebsd.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-47477
Closes #8894
2019-09-25 11:27:47 -07:00
Matthew Ahrens 6083f40387 panic in removal_remap test on 4K devices
If the zfs_remove_max_segment tunable is changed to be not a multiple of
the sector size, then the device removal code will malfunction and try
to create mappings that are smaller than one sector, leading to a panic.

On debug bits this assertion will fail in spa_vdev_copy_segment():
    ASSERT3U(DVA_GET_ASIZE(&dst), ==, size);

On nondebug, the system panics with a stack like:
    metaslab_free_concrete()
    metaslab_free_impl()
    metaslab_free_impl_cb()
    vdev_indirect_remap()
    free_from_removing_vdev()
    metaslab_free_impl()
    metaslab_free_dva()
    metaslab_free()

Fortunately, the default for zfs_remove_max_segment is 1MB, so this
can't occur by default.  We hit it during this test because
removal_remap.ksh changes zfs_remove_max_segment to 1KB. When testing on
4KB-sector disks, we hit the bug.

This change makes the zfs_remove_max_segment tunable more robust,
automatically rounding it up to a multiple of the sector size. We also
turn some key assertions into VERIFY's so that similar bugs would be
caught before they are encoded on disk (and thus avoid a
panic-reboot-loop).

Reviewed-by: Sean Eric Fagan <sef@ixsystems.com>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-61342
Closes #8893
2019-09-25 11:27:47 -07:00
Matthew Ahrens 592ee2e6dd compress metadata in later sync passes
Starting in sync pass 5 (zfs_sync_pass_dont_compress), we disable
compression (including of metadata).  Ostensibly this helps the sync
passes to converge (i.e. for a sync pass to not need to allocate
anything because it is 100% overwrites).

However, in practice it increases the average number of sync passes,
because when we turn compression off, a lot of block's size will change
and thus we have to re-allocate (not overwrite) them.  It also increases
the number of 128KB allocations (e.g. for indirect blocks and spacemaps)
because these will not be compressed.  The 128K allocations are
especially detrimental to performance on highly fragmented systems,
which may have very few free segments of this size, and may need to load
new metaslabs to satisfy 128K allocations.

We should increase zfs_sync_pass_dont_compress.  In practice on a highly
fragmented system we see a few 5-pass txg's, a tiny number of 6-pass
txg's, and no txg's with more than 6 passes.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: George Wilson <george.wilson@delphix.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-63431
Closes #8892
2019-09-25 11:27:47 -07:00
Alexander Motin cab7d856ea Move write aggregation memory copy out of vq_lock
Memory copy is too heavy operation to do under the congested lock.
Moving it out reduces congestion by many times to almost invisible.
Since the original zio removed from the queue, and the child zio is
not executed yet, I don't see why would the copy need protection.
My guess it just remained like this from the time when lock was not
dropped here, which was added later to fix lock ordering issue.

Multi-threaded sequential write tests with both HDD and SSD pools
with ZVOL block sizes of 4KB, 16KB, 64KB and 128KB all show major
reduction of lock congestion, saving from 15% to 35% of CPU time
and increasing throughput from 10% to 40%.

Reviewed-by: Richard Yao <ryao@gentoo.org>
Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by:  Alexander Motin <mav@FreeBSD.org>
Closes #8890
2019-09-25 11:27:47 -07:00
Tulsi Jain 19cebf0518 Restrict filesystem creation if name referred either '.' or '..'
This change restricts filesystem creation if the given name
contains either '.' or '..'

Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Signed-off-by: TulsiJain <tulsi.jain@delphix.com>
Closes #8842
Closes #8564
2019-09-25 11:27:47 -07:00
Matthew Ahrens 77e64c6fff ztest: dmu_tx_assign() gets ENOSPC in spa_vdev_remove_thread()
When running zloop, we occasionally see the following crash:

    dmu_tx_assign(tx, TXG_WAIT) == 0 (0x1c == 0)
    ASSERT at ../../module/zfs/vdev_removal.c:1507:spa_vdev_remove_thread()/sbin/ztest(+0x89c3)[0x55faf567b9c3]

The error value 0x1c is ENOSPC.

The transaction used by spa_vdev_remove_thread() should not be able to
fail due to being out of space. i.e. we should not call
dmu_tx_hold_space().  This will allow the removal thread to schedule its
work even when the pool is low on space.  The "slop space" will provide
enough free space to sync out the txg.

Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-37853
Closes #8889
2019-09-25 11:27:47 -07:00
Tomohiro Kusumi 4f809bddc6 Fix lockdep warning on insmod
sysfs_attr_init() is required to make lockdep happy for dynamically
allocated sysfs attributes. This fixed #8868 on Fedora 29 running
kernel-debug.

This requirement was introduced in 2.6.34.
See include/linux/sysfs.h for what it actually does.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Closes #8868
Closes #8884
2019-09-25 11:27:47 -07:00
Matthew Ahrens 516a08ebb4 fat zap should prefetch when iterating
When iterating over a ZAP object, we're almost always certain to iterate
over the entire object. If there are multiple leaf blocks, we can
realize a performance win by issuing reads for all the leaf blocks in
parallel when the iteration begins.

For example, if we have 10,000 snapshots, "zfs destroy -nv
pool/fs@1%9999" can take 30 minutes when the cache is cold. This change
provides a >3x performance improvement, by issuing the reads for all ~64
blocks of each ZAP object in parallel.

Reviewed-by: Andreas Dilger <andreas.dilger@whamcloud.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-58347
Closes #8862
2019-09-25 11:27:47 -07:00
Matthew Ahrens 812c36fc71 Target ARC size can get reduced to arc_c_min
Sometimes the target ARC size is reduced to arc_c_min, which impacts
performance.  We've seen this happen as part of the random_reads
performance regression test, where the ARC size is reduced before the
reads test starts which impacts how long it takes for system to reach
good IOPS performance.

We call arc_reduce_target_size when arc_reap_cb_check() returns TRUE,
and arc_available_memory() is less than arc_c>>arc_shrink_shift.

However, arc_available_memory() could easily be low, even when arc_c is
low, because we can have tons of unused bufs in the abd kmem cache. This
would be especially true just after the DMU requests a bunch of stuff be
evicted from the ARC (e.g. due to "zpool export").

To fix this, the ARC should reduce arc_c by the requested amount, not
all the way down to arc_size (or arc_c_min), which can be very small.

Reviewed-by: Tim Chase <tim@chase2k.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-59431
Closes #8864
2019-09-25 11:27:47 -07:00
bnjf fe11968bbf Fix typo in vdev_raidz_math.c
Fix typo in vdev_raidz_math.c

Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brad Forschinger <github@bnjf.id.au>
Closes #8875
Closes #8880
2019-09-25 11:27:47 -07:00
Richard Elling 4be4dedb9f Improve ZTS block_device_wait debugging
The udevadm settle timeout can be 120 or 180 seconds by default
for some distributions. If a long delay is experienced, it could
be due to some strangeness in a malfunctioning device that isn't
related to the devices under test. To help debug this condition,
a notice is given if settle takes too long.

Arguments can now be passed to block_device_wait. The expected
arguments are block device pathnames.

Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Elling <Richard.Elling@RichardElling.com>
Closes #8839
2019-09-25 11:27:47 -07:00
Richard Elling fb52bf9b1d Block_device_wait does not return an error code
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Elling <Richard.Elling@RichardElling.com>
Closes #8839
2019-09-25 11:27:47 -07:00
Richard Elling a22b00f924 Remove redundant redundant remove
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Elling <Richard.Elling@RichardElling.com>
Closes #8839
2019-09-25 11:27:47 -07:00
Richard Elling c350e62309 Fix logic error in setpartition function
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Elling <Richard.Elling@RichardElling.com>
Closes #8839
2019-09-25 11:27:47 -07:00
Paul Dagnelie 6f7bc75825 Allow metaslab to be unloaded even when not freed from
On large systems, the memory used by loaded metaslabs can become
a concern. While range trees are a fairly efficient data structure,
on heavily fragmented pools they can still consume a significant
amount of memory. This problem is amplified when we fail to unload
metaslabs that we aren't using. Currently, we only unload a metaslab
during metaslab_sync_done; in order for that function to be called
on a given metaslab in a given txg, we have to have dirtied that
metaslab in that txg. If the dirtying was the result of an allocation,
we wouldn't be unloading it (since it wouldn't be 8 txgs since it
was selected), so in effect we only unload a metaslab during txgs
where it's being freed from.

We move the unload logic from sync_done to a new function, and
call that function on all metaslabs in a given vdev during
vdev_sync_done().

Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #8837
2019-09-25 11:27:47 -07:00
Jorgen Lundman 06900c409b Avoid updating zfs_gitrev.h when rev is unchanged
Build process would always re-compile spa_history.c due to touching
zfs_gitrev.h - avoid if no change in gitrev.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Reviewed-by: Allan Jude <allanjude@freebsd.org>
Signed-off-by: Jorgen Lundman <lundman@lundman.net>
Closes #8860
2019-09-25 11:27:47 -07:00
Allan Jude 60cbc18136 l2arc_apply_transforms: Fix typo in comment
Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Signed-off-by: Allan Jude <allanjude@freebsd.org>
Closes #8822
2019-09-25 11:27:46 -07:00
Serapheim Dimitropoulos b63ed49c29 Reduced IOPS when all vdevs are in the zfs_mg_fragmentation_threshold
Historically while doing performance testing we've noticed that IOPS
can be significantly reduced when all vdevs in the pool are hitting
the zfs_mg_fragmentation_threshold percentage. Specifically in a
hypothetical pool with two vdevs, what can happen is the following:
Vdev A would go above that threshold and only vdev B would be used.
Then vdev B would pass that threshold but vdev A would go below it
(we've been freeing from A to allocate to B). The allocations would
go back and forth utilizing one vdev at a time with IOPS taking a hit.

Empirically, we've seen that our vdev selection for allocations is
good enough that fragmentation increases uniformly across all vdevs
the majority of the time. Thus we set the threshold percentage high
enough to avoid hitting the speed bump on pools that are being pushed
to the edge. We effectively disable its effect in the majority of the
cases but we don't remove (at least for now) just in case we hit any
weird behavior in the future.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes #8859
2019-09-25 11:27:46 -07:00
Tomohiro Kusumi fafe72712a Drop objid argument in zfs_znode_alloc() (sync with OpenZFS)
Since zfs_znode_alloc() already takes dmu_buf_t*, taking another
uint64_t argument for objid is redundant. inode's ->i_ino does and
needs to match znode's ->z_id.

zfs_znode_alloc() in FreeBSD and illumos doesn't have this argument
since vnode doesn't have vnode# in VFS (hence ->z_id exists).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@osnexus.com>
Closes #8841
2019-09-25 11:27:46 -07:00
Tomohiro Kusumi 328c95e391 Remove vn_set_fs_pwd()/vn_set_pwd() (no need to be at / during insmod)
Per suggestion from @behlendorf in #8777, remove vn_set_fs_pwd() and
vn_set_pwd() which are only used in zfs_ioctl.c:_init() while loading
zfs.ko.

The rest of initialization functions being called here after cwd set
to / don't depend on cwd of the process except for spa_config_load().
spa_config_load() uses a relative path ".//etc/zfs/zpool.cache" when
`rootdir` is non-NULL, which is "/etc/zfs/zpool.cache" given cwd is /,
so just unconditionally use the absolute path without "./", so that
`vn_set_pwd("/")` as well as the entire functions can be removed.
This is also what FreeBSD does.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@osnexus.com>
Closes #8826
2019-09-25 11:27:46 -07:00
Josh Soref 6ce10fdabb grammar: it is / plural agreement
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
Closes #8818
2019-09-25 11:27:46 -07:00
Tomohiro Kusumi e4a11acfac Refactor parent dataset handling in libzfs zfs_rename()
For recursive renaming, simplify the code by moving `zhrp` and
`parentname` to inner scope. `zhrp` is only used to test existence
of a parent dataset for recursive dataset dir scan since ba6a24026c.

Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@osnexus.com>
Closes #8815
2019-09-25 11:27:46 -07:00
Ryan Moeller 90d8067a77 Update comments to match code
s/get_vdev_spec/make_root_vdev

The former doesn't exist anymore.

Sponsored by: iXsystems, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Ryan Moeller <ryan@freqlabs.com>
Closes #8759
2019-09-25 11:27:46 -07:00
Tomohiro Kusumi e5a877c5d0 Update descriptions for vnops
These descriptions are not uptodate with the code.

Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Closes #8767
2019-09-25 11:27:46 -07:00
Tomohiro Kusumi 4933b0a25b Drop local definition of MOUNT_BUSY
It's accessible via <sys/mntent.h>.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@osnexus.com>
Closes #8765
2019-09-25 11:27:46 -07:00
Rafael Kitover e0cd6c28a3 kernel timer API rework
In `config/kernel-timer.m4` refactor slightly to check more generally
for the new `timer_setup()` APIs, but also check the callback signature
because some kernels (notably 4.14) have the new `timer_setup()` API but
use the old callback signature. Also add a check for a `flags` member in
`struct timer_list`, which was added in 4.1-rc8.

Add compatibility shims to `include/spl/sys/timer.h` to allow using the
new timer APIs with the only two caveats being that the callback
argument type must be declared as `spl_timer_list_t` and an explicit
assignment is required to get the timer variable for the `timer_of()`
macro. So the callback would look like this:

```c
__cv_wakeup(spl_timer_list_t t)
{
        struct timer_list *tmr = (struct timer_list *)t;
	struct thing *parent = from_timer(parent, tmr,
		parent_timer_field);
	... /* do stuff with parent */
```

Make some minor changes to `spl-condvar.c` and `spl-taskq.c` to use the
new timer APIs instead of conditional code.

Reviewed-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rafael Kitover <rkitover@gmail.com>
Closes #8647
2019-09-25 11:27:46 -07:00
Tony Hutter 63b88f7e22 Tag zfs-0.8.1
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2019-06-14 09:43:18 -07:00
Alexander Motin 72888812b0 Fix comparison signedness in arc_is_overflowing()
When ARC size is very small, aggsum_lower_bound(&arc_size) may return
negative values, that due to unsigned comparison caused delays, waiting
for arc_adjust() to "fix" it by calling aggsum_value(&arc_size).  Use
of signed comparison there fixes the problem.

Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Closes #8873
2019-06-11 10:45:23 -07:00
Tom Caputi 581c77e725 Fix incorrect error message for raw receive
This patch fixes an incorrect error message that comes up when
doing a non-forcing, raw, incremental receive into a dataset
that has a newer snapshot than the "from" snapshot. In this
case, the current code prints a confusing message about an IVset
guid mismatch.

This functionality is supported by non-raw receives as an
undocumented feature, but was never supported by the raw receive
code. If this is desired in the future, we can probably figure
out a way to make it work.

Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Issue #8758
Closes #8863
2019-06-11 10:45:17 -07:00
Eli Schwartz ba505f90d8 arc_summary: prefer python3 version and install when there is no python
This matches the behavior of other python scripts, such as arcstat and
dbufstat, which are always installed but whose install-exec-hook actions
will simply touch up the shebang if a python interpreter was configured
*and* that interpreter is a python2 interpreter.

Fixes installation in a minimal build chroot without python available.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@freqlabs.com>
Signed-off-by: Eli Schwartz <eschwartz@archlinux.org>
Closes #8851
2019-06-11 10:45:02 -07:00
Samuel VERSCHELDE eaa21b2349 Fix %post and %postun generation in kmodtool
During zfs-kmod RPM build, $(uname -r) gets unintentionally evaluated on
the build host, once and for all. It should be evaluated during the
execution of the scriptlets on the installation host. Escaping the $
character avoids evaluating it during build.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Signed-off-by: Samuel Verschelde <stormi-xcp@ylix.fr>
Closes #8866
2019-06-11 10:44:54 -07:00
Tom Caputi 8dc8bbde6e Reinstate raw receive check when truncating
This patch re-adds a check that was removed in 369aa50. The check
confirms that a raw receive is not occuring before truncating an
object's dn_maxblkid. At the time, it was believed that all cases
that would hit this code path would be handled in other places,
but that was not the case.

Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #8852 
Closes #8857
2019-06-07 12:45:40 -07:00
Garrett Fields 9fd95a2f1b If $ZFS_BOOTFS contains guid, replace the guid portion with $pool
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Signed-off-by: Garrett Fields <ghfields@gmail.com>
Closes #8356
2019-06-07 12:45:40 -07:00
Tom Caputi 35050ef39e Fix integer overflow of ZTOI(zp)->i_generation
The ZFS on-disk format stores each inode's generation ID as a 64
bit number on disk and in-core. However, the Linux kernel's inode
is only a 32 bit number. In most places, the code handles this
correctly, but the cast is missing in zfs_rezget(). For many pools,
this isn't an issue since the generation ID is computed as the
current txg when the inode is created and many pools don't have
more than 2^32 txgs.

For the pools that have more txgs, this issue causes any inode with
a high enough generation number to report IO errors after a call to
"zfs rollback" while holding the file or directory open. This patch
simply adds the missing cast.

Reviewed-by: Alek Pinchuk <apinchuk@datto.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #8858
2019-06-07 12:45:40 -07:00
Don Brady 5108d27aec hkdf_test binary should only have one icp instance
The build for test binary hkdf_test was linking both against libicp 
and libzpool. This results in two instances of libicp inside the 
binary but the call to icp_init() only initializes one of them!

Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Don Brady <don.brady@delphix.com>
Closes #8850
2019-06-07 12:45:40 -07:00
Peter Wirdemo 02010e9c2c Fixed a small typo in man/man1/raidz_test.1
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Signed-off-by: Peter Wirdemo <peter.wirdemo@gmail.com>
Closes #8855
2019-06-07 12:45:40 -07:00
Torsten Wörtwein a0bf24952d Allow TRIM_UNUSED_KSYM when build as a builtin-module
If ZFS is built with enable_linux_builtin, it seems to be possible
to compile the kernel with TRIM_UNUSED_KSYM.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Torsten Wörtwein <twoertwein@gmail.com>
Closes #8820
2019-06-07 12:45:40 -07:00
Ryan Moeller d6920fb996 Make Python detection optional and more portable
Previously, --without-python would cause ./configure to fail. Now it is
able to proceed, and the Python scripts will not be built.

Use portable parameter expansion matching instead of nonstandard
substring matching to detect the Python version.  This test is
duplicated in several places, so define a function for it.

Don't assume the full path to binaries, since different platforms do
install things in different places.  Use AC_CHECK_PROGS instead.

When building without Python, also build without pyzfs.

Sponsored by: iXsystems, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Eli Schwartz <eschwartz93@gmail.com>
Signed-off-by: Ryan Moeller <ryan@freqlabs.com>
Closes #8809 
Closes #8731
2019-06-07 12:45:40 -07:00
DeHackEd 58b2de6420 Wait in 'S' state when send/recv pipe is blocking
Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: DHE <git@dehacked.net>
Closes #8733 
Closes #8752
2019-06-07 12:45:40 -07:00
TulsiJain 11ad06d1d8 Make zfs_async_block_max_blocks handle zero correctly
Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: TulsiJain <tulsi.jain@delphix.com>
Closes #8829
Closes #8289
2019-06-07 12:45:40 -07:00
Brian Behlendorf 4f8eef29e0 Revert "Report holes when there are only metadata changes"
This reverts commit ec4f9b8f30 which introduced a narrow race which
can lead to lseek(, SEEK_DATA) incorrectly returning ENXIO.  Resolve
the issue by revering this change to restore the previous behavior
which depends solely on checking the dirty list.

Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #8816 
Closes #8834
2019-06-07 12:45:40 -07:00
Tomohiro Kusumi 94866d8309 Add link count test for root inode
Add tests for
97aa3ba44("Fix link count of root inode when snapdir is visible")
as suggested in #8727.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@osnexus.com>
Closes #8732
2019-06-07 12:45:40 -07:00
Brian Behlendorf a1eaf0dde0 Exclude log device ashift from normal class
When opening a log device during import its allocation bias will
not yet have been set by vdev_load().  This results in the log
device's ashift being incorrectly applied to the maximum ashift
of the vdevs in the normal class.  Which in turn prevents the
removal of any top-level devices due to the ashift check in the
spa_vdev_remove_top_check() function.

This issue is resolved by including vdev_islog in the check since
it will be set correctly during vdev_open().

Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #8735
2019-06-07 12:45:40 -07:00
madz 580256045b Fix integer overflow in get_next_chunk()
dn->dn_datablksz type is uint32_t and need to be casted to uint64_t
to avoid an overflow when the record size is greater than 4 MiB.

Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olivier Mazouffre <olivier.mazouffre@ims-bordeaux.fr>
Closes #8778 
Closes #8797
2019-06-07 12:45:40 -07:00
loli10K aaf3b30dcf Double-free of encryption wrapping key due to invalid pool properties
This commits fixes a double-free in zfs_ioc_pool_create() triggered by
specifying an unsupported combination of properties when creating a pool
with encryption enabled.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #8791
2019-06-07 12:45:40 -07:00
Stoiko Ivanov 27b446f799 tests: fix cosmetic permission issues during make install
files in dist_*_SCRIPTS get installed with 0755, those in dist_*_DATA
with 0644. This commit moves all .kshlib, .shlib and .cfg files in the
testsuite to dist_pkgdata_DATA, and removes the shebang from
zpool_import.kshlib.

This ensures that the files are installed with appropriate permissions
and silences some warnings from lintian

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
Closes #8803
2019-06-07 12:45:40 -07:00
Stoiko Ivanov 0c6206e7f1 test-runner.py: change shebang to python3
In commit 6e72a5b9b6 python scripts which
work with python2 and python3 changed the shebang from /usr/bin/python
to /usr/bin/python3. This gets adapted by the build-system on systems
which don't provide python3.
This commit changes test-runner.py to also use /usr/bin/python3,
enabling the change during buildtime and fixing a minor lintian issue
for those Debian packages, which depend on a specific python version
(python3/python2).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
Closes #8803
2019-06-07 12:45:40 -07:00
loli10K 51de7ccb42 Endless loop in zpool_do_remove() on platforms with unsigned char
On systems where "char" is an unsigned type the value returned by
getopt() will never be negative (-1), leading to an endless loop:
this issue prevents both 'zpool remove' and 'zstreamdump' for
working on some systems.

Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #8789
2019-06-07 12:45:40 -07:00
Tom Caputi 69ae34076f Fix embedded bp accounting in count_block()
Currently, count_block() does not correctly account for the
possibility that the bp that is passed to it could be embedded.
These blocks shouldn't be counted since the work of scanning
these blocks in already handled when the containing block is
scanned. This patch simply resolves this issue by returning
early in this case.

Reviewed by: Allan Jude <allanjude@freebsd.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Authored-by: Bill Sommerfeld <sommerfeld@alum.mit.edu>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #8800 
Closes #8766
2019-06-07 12:39:13 -07:00
Tom Caputi b746c397e3 Disable parallel processing for 'zfs mount -l'
Currently, 'zfs mount -a' will always attempt to parallelize
work related to mounting as best it can. Unfortunately, when
the user passes the '-l' option to load keys, this causes
all threads to prompt the user for their keys at once,
resulting in a confusing and racy user experience. This patch
simply disables parallel mounting when using the '-l' flag.

Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #8762 
Closes #8811
2019-06-07 12:39:13 -07:00
Tomohiro Kusumi 2fb37bcadd Linux 5.2 compat: Directly call wait_on_page_bit()
wait_on_page_writeback() was made GPL only in torvalds/linux@19343b5bdd.

Directly call wait_on_page_bit() without using wait_on_page_writeback()
interface, given zfs_putpage() is the only caller for now.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@osnexus.com>
Closes #8794
2019-06-07 12:39:13 -07:00
Tomohiro Kusumi a727f69e52 Linux 5.2 compat: Fix config/kernel-shrink.m4 test failure
"whether ->count_objects callback exists" test failed with
"error: error" message for using an incomplete function shrinker_cb().

This is caused by torvalds/linux@83da1bed86. It's configurable,
but we would want to be able to compile with default kbuild setting.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@osnexus.com>
Closes #8776
2019-06-07 12:39:13 -07:00
Tomohiro Kusumi 8ec352be1f Linux 5.2 compat: Remove config/kernel-set-fs-pwd.m4
This failed on 5.2-rc1 with "error: unknown" message, for set_fs_pwd()
not being visible in both const and non-const tests.

This is caused by torvalds/linux@83da1bed86. It's configurable,
but we would want to be able to compile with default kbuild setting.

set_fs_pwd() has never been exported with exception of some distro
kernels, and set_fs_pwd() wasn't used in ZoL to begin with. The test
result was used for a spl function vn_set_fs_pwd().

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@osnexus.com>
Closes #8777
2019-06-07 12:39:13 -07:00
loli10K df717bb835 zpool: status -t is not documented in help message
This commit adds the undocumented "-t" option to zpool(8) help message.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #8782
2019-06-07 12:39:13 -07:00
loli10K e0b3689ed5 zfs-tests: fix warnings when packaging some .shlib files
This change prevents the following warning when packaging some zfs-tests
files:

   *** WARNING: ./usr/src/zfs-0.8.0/tests/zfs-tests/include/zpool_script.shlib
   is executable but has empty or no shebang, removing executable bit

Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #8787
2019-06-07 12:39:13 -07:00
loli10K 438275c9a0 VERIFY3P() message is missing a space character
This commit just reintroduces a [space] character inadvertently removed
in a887d653.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #8786
2019-06-07 12:39:13 -07:00
loli10K 8cfa6d4a1c zfs-tests: verify zfs(8) and zpool(8) help message is under 80 columns
This commit updates the ZFS Test Suite to detect incorrect wrapping of
both zfs(8) and zpool(8) help message

Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #8785
2019-06-07 12:39:13 -07:00
loli10K ad0157ec91 zfs: don't pretty-print objsetid property
The objsetid property, while being stored as a number, is a dataset
identifier and should not be pretty-printed.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #8784
2019-06-07 12:39:13 -07:00
loli10K cd75d5f710 zfs: missing newline character in zfs_do_channel_program() error message
This commit simply adds a missing newline ("\n") character to the error
message printed by the zfs command when the provided pool parameter
can't be found.

Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #8783
2019-06-07 12:39:13 -07:00
siv0 c6bbacebc8 Fix ksh-path for random_readwrite_fixed.ksh
The test in zfs-tests/tests/perf/regression/random_readwrite_fixed.ksh
is the only file to use /usr/bin/ksh in the shebang.
Change it to /bin/ksh for consistency.

Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
Closes #8779
2019-06-07 12:39:13 -07:00
Tomohiro Kusumi 4d7cb872e8 Linux 2.6.39 compat: Test if kstrtoul() exists
kstrtoul() exists only after torvalds/linux@33ee3b2e2e in 2.6.39.
Use strict_strtoul() if kstrtoul() doesn't exist.
Note that strict_strtoul() has existed as an alias for kstrtoul()
for a while, but removed in torvalds/linux@3db2e9cdc0.

It looks like RHEL6 (2.6.32 based) has backported kstrtoul(),
and this caused build CI to pass compilation test.
It should fail on vanilla < 2.6.39 kernels or distro kernels without
backport as reported in #8760.

--
 # grep "kstrtoul(" /lib/modules/2.6.32-754.12.1.el6.x86_64/build/ \
 include/linux/kernel.h >/dev/null
 # echo $?
 0

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Closes #8760 
Closes #8761
2019-06-07 12:39:13 -07:00
loli10K f91e7e6284 Device removal panics on 32-bit systems
The issue is caused by an incorrect usage of the sizeof() operator
in vdev_obsolete_sm_object(): on 64-bit systems this is not an issue
since both "uint64_t" and "uint64_t*" are 8 bytes in size. However on
32-bit systems pointers are 4 bytes long which is not supported by
zap_lookup_impl(). Trying to remove a top-level vdev on a 32-bit system
will cause the following failure:

VERIFY3(0 == vdev_obsolete_sm_object(vd, &obsolete_sm_object)) failed (0 == 22)
PANIC at vdev_indirect.c:833:vdev_indirect_sync_obsolete()
Showing stack for process 1315
CPU: 6 PID: 1315 Comm: txg_sync Tainted: P           OE   4.4.69+ #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014
 c1abc6e7 0ae10898 00000286 d4ac3bc0 c14397bc da4cd7d8 d4ac3bf0 d4ac3bd0
 d790e7ce d7911cc1 00000523 d4ac3d00 d790e7d7 d7911ce4 da4cd7d8 00000341
 da4ce664 da4cd8c0 da33fa6e 49524556 28335946 3d3d2030 65647620 626f5f76
Call Trace:
 [<>] dump_stack+0x58/0x7c
 [<>] spl_dumpstack+0x23/0x27 [spl]
 [<>] spl_panic.cold.0+0x5/0x41 [spl]
 [<>] ? dbuf_rele+0x3e/0x90 [zfs]
 [<>] ? zap_lookup_norm+0xbe/0xe0 [zfs]
 [<>] ? zap_lookup+0x57/0x70 [zfs]
 [<>] ? vdev_obsolete_sm_object+0x102/0x12b [zfs]
 [<>] vdev_indirect_sync_obsolete+0x3e1/0x64d [zfs]
 [<>] ? txg_verify+0x1d/0x160 [zfs]
 [<>] ? dmu_tx_create_dd+0x80/0xc0 [zfs]
 [<>] vdev_sync+0xbf/0x550 [zfs]
 [<>] ? mutex_lock+0x10/0x30
 [<>] ? txg_list_remove+0x9f/0x1a0 [zfs]
 [<>] ? zap_contains+0x4d/0x70 [zfs]
 [<>] spa_sync+0x9f1/0x1b10 [zfs]
 ...
 [<>] ? kthread_stop+0x110/0x110

This commit simply corrects the "integer_size" parameter used to lookup
the vdev's ZAP object.

Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #8790
2019-06-07 12:39:13 -07:00
loli10K abe267f677 zpool: trim -p is not a valid option
This commit removes the documented but not handled "-p" option from
zpool(8) help message.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #8781
2019-06-07 12:39:13 -07:00
loli10K cc434dcf45 Fix coverity defects: CID 186143
CID 186143: Memory - illegal accesses (USE_AFTER_FREE)

This patch fixes an use-after-free in spa_import_progress_destroy()
moving the kmem_free() call at the end of the function.

Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #8788
2019-06-07 12:39:13 -07:00
Igor K e2e7b0a2cd Rename reservation tests from *.sh to *.ksh
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Igor Kozhukhov <igor@dilos.org>
Closes #8729
2019-06-07 12:39:13 -07:00
4396 changed files with 146054 additions and 457841 deletions
-10
View File
@@ -1,10 +0,0 @@
root = true
[*]
end_of_line = lf
insert_final_newline = true
trim_trailing_whitespace = true
[*.{c,h}]
tab_width = 8
indent_style = tab
+99 -84
View File
@@ -1,12 +1,10 @@
# Contributing to OpenZFS
<p align="center">
<img alt="OpenZFS Logo"
src="https://openzfs.github.io/openzfs-docs/_static/img/logo/480px-Open-ZFS-Secondary-Logo-Colour-halfsize.png"/>
</p>
# Contributing to ZFS on Linux
<p align="center"><img src="http://zfsonlinux.org/images/zfs-linux.png"/></p>
*First of all, thank you for taking the time to contribute!*
By using the following guidelines, you can help us make OpenZFS even better.
By using the following guidelines, you can help us make ZFS on Linux even
better.
## Table Of Contents
[What should I know before I get
@@ -34,17 +32,17 @@ started?](#what-should-i-know-before-i-get-started)
Helpful resources
* [OpenZFS Documentation](https://openzfs.github.io/openzfs-docs/)
* [OpenZFS Developer Resources](http://open-zfs.org/wiki/Developer_resources)
* [Git and GitHub for beginners](https://openzfs.github.io/openzfs-docs/Developer%20Resources/Git%20and%20GitHub%20for%20beginners.html)
* [ZFS on Linux wiki](https://github.com/zfsonlinux/zfs/wiki)
* [OpenZFS Documentation](http://open-zfs.org/wiki/Developer_resources)
* [Git and GitHub for beginners](https://github.com/zfsonlinux/zfs/wiki/Git-and-GitHub-for-beginners)
## What should I know before I get started?
### Get ZFS
You can build zfs packages by following [these
instructions](https://openzfs.github.io/openzfs-docs/Developer%20Resources/Building%20ZFS.html),
instructions](https://github.com/zfsonlinux/zfs/wiki/Building-ZFS),
or install stable packages from [your distribution's
repository](https://openzfs.github.io/openzfs-docs/Getting%20Started/index.html).
repository](https://github.com/zfsonlinux/zfs/wiki/Getting-Started).
### Debug ZFS
A variety of methods and tools are available to aid ZFS developers.
@@ -54,29 +52,28 @@ checks and all the ASSERTs to help quickly catch potential issues.
In addition, there are numerous utilities and debugging files which
provide visibility into the inner workings of ZFS. The most useful
of these tools are discussed in detail on the [Troubleshooting
page](https://openzfs.github.io/openzfs-docs/Basic%20Concepts/Troubleshooting.html).
of these tools are discussed in detail on the [debugging ZFS wiki
page](https://github.com/zfsonlinux/zfs/wiki/Debugging).
### Where can I ask for help?
The [zfs-discuss mailing
list](https://openzfs.github.io/openzfs-docs/Project%20and%20Community/Mailing%20Lists.html)
or IRC are the best places to ask for help. Please do not file
support requests on the GitHub issue tracker.
[The zfs-discuss mailing list or IRC](http://list.zfsonlinux.org)
are the best places to ask for help. Please do not file support requests
on the GitHub issue tracker.
## How Can I Contribute?
### Reporting Bugs
*Please* contact us via the [zfs-discuss mailing
list](https://openzfs.github.io/openzfs-docs/Project%20and%20Community/Mailing%20Lists.html)
or IRC if you aren't certain that you are experiencing a bug.
list or IRC](http://list.zfsonlinux.org) if you aren't
certain that you are experiencing a bug.
If you run into an issue, please search our [issue
tracker](https://github.com/openzfs/zfs/issues) *first* to ensure the
tracker](https://github.com/zfsonlinux/zfs/issues) *first* to ensure the
issue hasn't been reported before. Open a new issue only if you haven't
found anything similar to your issue.
You can open a new issue and search existing issues using the public [issue
tracker](https://github.com/openzfs/zfs/issues).
tracker](https://github.com/zfsonlinux/zfs/issues).
#### When opening a new issue, please include the following information at the top of the issue:
* What distribution (with version) you are using.
@@ -108,13 +105,13 @@ information like:
* Stack traces which may be logged to `dmesg`.
### Suggesting Enhancements
OpenZFS is a widely deployed production filesystem which is under active
development. The team's primary focus is on fixing known issues, improving
performance, and adding compelling new features.
ZFS on Linux is a widely deployed production filesystem which is under
active development. The team's primary focus is on fixing known issues,
improving performance, and adding compelling new features.
You can view the list of proposed features
by filtering the issue tracker by the ["Type: Feature"
label](https://github.com/openzfs/zfs/issues?q=is%3Aopen+is%3Aissue+label%3A%22Type%3A+Feature%22).
by filtering the issue tracker by the ["Feature"
label](https://github.com/zfsonlinux/zfs/issues?q=is%3Aopen+is%3Aissue+label%3AFeature).
If you have an idea for a feature first check this list. If your idea already
appears then add a +1 to the top most comment, this helps us gauge interest
in that feature.
@@ -123,11 +120,8 @@ Otherwise, open a new issue and describe your proposed feature. Why is this
feature needed? What problem does it solve?
### Pull Requests
#### General
* All pull requests, except backports and releases, must be based on the current master branch
and should apply without conflicts.
* All pull requests must be based on the current master branch and apply
without conflicts.
* Please attempt to limit pull requests to a single commit which resolves
one specific issue.
* Make sure your commit messages are in the correct format. See the
@@ -139,24 +133,16 @@ logically independent patches which build on each other. This makes large
changes easier to review and approve which speeds up the merging process.
* Try to keep pull requests simple. Simple code with comments is much easier
to review and approve.
* All proposed changes must be approved by an OpenZFS organization member.
* If you have an idea you'd like to discuss or which requires additional testing, consider opening it as a draft pull request.
Once everything is in good shape and the details have been worked out you can remove its draft status.
Any required reviews can then be finalized and the pull request merged.
#### Tests and Benchmarks
* Every pull request is tested using a GitHub Actions workflow on multiple platforms by running the [zfs-tests.sh and zloop.sh](
https://openzfs.github.io/openzfs-docs/Developer%20Resources/Building%20ZFS.html#running-zloop-sh-and-zfs-tests-sh) test suites.
`.github/workflows/scripts/generate-ci-type.py` is used to determine whether the pull request is nonbehavior, i.e., not introducing behavior changes of any code, configuration or tests. If so, the CI will run on fewer platforms and only essential sanity tests will run. You can always override this by adding `ZFS-CI-Type` line to your commit message:
* If your last commit (or `HEAD` in git terms) contains a line `ZFS-CI-Type: quick`, quick mode is forced regardless of what files are changed.
* Otherwise, if any commit in a PR contains a line `ZFS-CI-Type: full`, full mode is forced.
* To verify your changes conform to the [style guidelines](
https://github.com/openzfs/zfs/blob/master/.github/CONTRIBUTING.md#style-guides
), please run `make checkstyle` and resolve any warnings.
* Code analysis is performed by [CodeQL](https://codeql.github.com/) for each pull request.
* Test cases should be provided when appropriate. This includes making sure new features have adequate code coverage.
* Test cases should be provided when appropriate.
* If your pull request improves performance, please include some benchmarks.
* The pull request must pass all CI checks before being accepted.
* The pull request must pass all required [ZFS
Buildbot](http://build.zfsonlinux.org/) builders before
being accepted. If you are experiencing intermittent TEST
builder failures, you may be experiencing a [test suite
issue](https://github.com/zfsonlinux/zfs/issues?q=is%3Aissue+is%3Aopen+label%3A%22Test+Suite%22).
There are also various [buildbot options](https://github.com/zfsonlinux/zfs/wiki/Buildbot-Options)
to control how changes are tested.
* All proposed changes must be approved by a ZFS on Linux organization member.
### Testing
All help is appreciated! If you're in a position to run the latest code
@@ -166,41 +152,16 @@ range of realistic workloads, configurations and architectures we're better
able quickly identify and resolve potential issues.
Users can also run the [ZFS Test
Suite](https://github.com/openzfs/zfs/tree/master/tests) on their systems
Suite](https://github.com/zfsonlinux/zfs/tree/master/tests) on their systems
to verify ZFS is behaving as intended.
## Style Guides
### Repository Structure
OpenZFS uses a standardised branching structure.
- The "development and main branch", is the branch all development should be based on.
- "Release branches" contain the latest released code for said version.
- "Staging branches" contain selected commits prior to being released.
**Branch Names:**
- Development and Main branch: `master`
- Release branches: `zfs-$VERSION-release`
- Staging branches: `zfs-$VERSION-staging`
`$VERSION` should be replaced with the `major.minor` version number.
_(This is the version number without the `.patch` version at the end)_
### Coding Conventions
We currently use [C Style and Coding Standards for
SunOS](http://www.cis.upenn.edu/%7Elee/06cse480/data/cstyle.ms.pdf) as our
coding convention.
This repository has an `.editorconfig` file. If your editor [supports
editorconfig](https://editorconfig.org/#download), it will
automatically respect most of this project's whitespace preferences.
Additionally, Git can help warn on whitespace problems as well:
```
git config --local core.whitespace trailing-space,space-before-tab,indent-with-non-tab,-tab-in-indent
```
### Commit Message Formats
#### New Changes
Commit messages for new changes must meet the following guidelines:
@@ -226,6 +187,70 @@ attempting to solve.
Signed-off-by: Contributor <contributor@email.com>
```
#### OpenZFS Patch Ports
If you are porting OpenZFS patches, the commit message must meet
the following guidelines:
* The first line must be the summary line from the most important OpenZFS commit being ported.
It must begin with `OpenZFS dddd, dddd - ` where `dddd` are OpenZFS issue numbers.
* Provides a `Authored by:` line to attribute each patch for each original author.
* Provides the `Reviewed by:` and `Approved by:` lines from each original
OpenZFS commit.
* Provides a `Ported-by:` line with the developer's name followed by
their email for each OpenZFS commit.
* Provides a `OpenZFS-issue:` line with link for each original illumos
issue.
* Provides a `OpenZFS-commit:` line with link for each original OpenZFS commit.
* If necessary, provide some porting notes to describe any deviations from
the original OpenZFS commits.
An example OpenZFS patch port commit message for a single patch is provided
below.
```
OpenZFS 1234 - Summary from the original OpenZFS commit
Authored by: Original Author <original@email.com>
Reviewed by: Reviewer One <reviewer1@email.com>
Reviewed by: Reviewer Two <reviewer2@email.com>
Approved by: Approver One <approver1@email.com>
Ported-by: ZFS Contributor <contributor@email.com>
Provide some porting notes here if necessary.
OpenZFS-issue: https://www.illumos.org/issues/1234
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/abcd1234
```
If necessary, multiple OpenZFS patches can be combined in a single port.
This is useful when you are porting a new patch and its subsequent bug
fixes. An example commit message is provided below.
```
OpenZFS 1234, 5678 - Summary of most important OpenZFS commit
1234 Summary from original OpenZFS commit for 1234
Authored by: Original Author <original@email.com>
Reviewed by: Reviewer Two <reviewer2@email.com>
Approved by: Approver One <approver1@email.com>
Ported-by: ZFS Contributor <contributor@email.com>
Provide some porting notes here for 1234 if necessary.
OpenZFS-issue: https://www.illumos.org/issues/1234
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/abcd1234
5678 Summary from original OpenZFS commit for 5678
Authored by: Original Author2 <original2@email.com>
Reviewed by: Reviewer One <reviewer1@email.com>
Approved by: Approver Two <approver2@email.com>
Ported-by: ZFS Contributor <contributor@email.com>
Provide some porting notes here for 5678 if necessary.
OpenZFS-issue: https://www.illumos.org/issues/5678
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/efgh5678
```
#### Coverity Defect Fixes
If you are submitting a fix to a
[Coverity defect](https://scan.coverity.com/projects/zfsonlinux-zfs),
@@ -265,13 +290,3 @@ Git can append the `Signed-off-by` line to your commit messages. Simply
provide the `-s` or `--signoff` option when performing a `git commit`.
For more information about writing commit messages, visit [How to Write
a Git Commit Message](https://chris.beams.io/posts/git-commit/).
#### Co-authored By
If someone else had part in your pull request, please add the following to the commit:
`Co-authored-by: Name <gitregistered@email.address>`
This is useful if their authorship was lost during squashing, rebasing, etc.,
but may be used in any situation where there are co-authors.
The email address used here should be the same as on the GitHub profile of said user.
If said user does not have their email address public, please use the following instead:
`Co-authored-by: Name <[username]@users.noreply.github.com>`
+48
View File
@@ -0,0 +1,48 @@
<!-- Please fill out the following template, which will help other contributors address your issue. -->
<!--
Thank you for reporting an issue.
*IMPORTANT* - Please search our issue tracker *before* making a new issue.
If you cannot find a similar issue, then create a new issue.
https://github.com/zfsonlinux/zfs/issues
*IMPORTANT* - This issue tracker is for *bugs* and *issues* only.
Please search the wiki and the mailing list archives before asking
questions on the mailing list.
https://github.com/zfsonlinux/zfs/wiki/Mailing-Lists
Please fill in as much of the template as possible.
-->
### System information
<!-- add version after "|" character -->
Type | Version/Name
--- | ---
Distribution Name |
Distribution Version |
Linux Kernel |
Architecture |
ZFS Version |
SPL Version |
<!--
Commands to find ZFS/SPL versions:
modinfo zfs | grep -iw version
modinfo spl | grep -iw version
-->
### Describe the problem you're observing
### Describe how to reproduce the problem
### Include any warning/errors/backtraces from the system logs
<!--
*IMPORTANT* - Please mark logs and text output from terminal commands
or else Github will not display them correctly.
An example is provided below.
Example:
```
this is an example how log text should be marked (wrap it with ```)
```
-->
-55
View File
@@ -1,55 +0,0 @@
---
name: Bug report
about: Create a report to help us improve OpenZFS
title: ''
labels: 'Type: Defect'
assignees: ''
---
<!-- Please fill out the following template, which will help other contributors address your issue. -->
<!--
Thank you for reporting an issue.
*IMPORTANT* - Please check our issue tracker before opening a new issue.
Additional valuable information can be found in the OpenZFS documentation
and mailing list archives.
Please fill in as much of the template as possible.
-->
### System information
<!-- add version after "|" character -->
Type | Version/Name
--- | ---
Distribution Name |
Distribution Version |
Kernel Version |
Architecture |
OpenZFS Version |
<!--
Command to find OpenZFS version:
zfs version
Commands to find kernel version:
uname -r # Linux
freebsd-version -r # FreeBSD
-->
### Describe the problem you're observing
### Describe how to reproduce the problem
### Include any warning/errors/backtraces from the system logs
<!--
*IMPORTANT* - Please mark logs and text output from terminal commands
or else Github will not display them correctly.
An example is provided below.
Example:
```
this is an example how log text should be marked (wrap it with ```)
```
-->
-14
View File
@@ -1,14 +0,0 @@
blank_issues_enabled: false
contact_links:
- name: OpenZFS Questions
url: https://github.com/openzfs/zfs/discussions/new
about: Ask the community for help
- name: OpenZFS Community Support Mailing list (Linux)
url: https://zfsonlinux.topicbox.com/groups/zfs-discuss
about: Get community support for OpenZFS on Linux
- name: FreeBSD Community Support Mailing list
url: https://lists.freebsd.org/mailman/listinfo/freebsd-fs
about: Get community support for OpenZFS on FreeBSD
- name: OpenZFS on IRC
url: https://web.libera.chat/#openzfs
about: Use IRC to get community support for OpenZFS
-33
View File
@@ -1,33 +0,0 @@
---
name: Feature request
about: Suggest a feature for OpenZFS
title: ''
labels: 'Type: Feature'
assignees: ''
---
<!--
Thank you for suggesting a feature.
Please check our issue tracker before opening a new feature request.
Filling out the following template will help other contributors better understand your proposed feature.
-->
### Describe the feature you would like to see added to OpenZFS
<!--
Provide a clear and concise description of the feature.
-->
### How will this feature improve OpenZFS?
<!--
What problem does this feature solve?
-->
### Additional context
<!--
Any additional information you can add about the proposal?
-->
+10 -8
View File
@@ -2,6 +2,11 @@
<!--- Provide a general summary of your changes in the Title above -->
<!---
Documentation on ZFS Buildbot options can be found at
https://github.com/zfsonlinux/zfs/wiki/Buildbot-Options
-->
### Motivation and Context
<!--- Why is this change required? What problem does it solve? -->
<!--- If it fixes an open issue, please link to the issue here. -->
@@ -14,7 +19,6 @@
<!--- Include details of your testing environment, and the tests you ran to -->
<!--- see how your change affects other areas of the code, etc. -->
<!--- If your change is a performance enhancement, please provide benchmarks here. -->
<!--- Please think about using the draft PR feature if appropriate -->
### Types of changes
<!--- What types of changes does your code introduce? Put an `x` in all the boxes that apply: -->
@@ -22,17 +26,15 @@
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Performance enhancement (non-breaking change which improves efficiency)
- [ ] Code cleanup (non-breaking change which makes code smaller or more readable)
- [ ] Quality assurance (non-breaking change which makes the code more robust against bugs)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
- [ ] Library ABI change (libzfs, libzfs\_core, libnvpair, libuutil and libzfsbootenv)
- [ ] Documentation (a change to man pages or other documentation)
### Checklist:
<!--- Go over all the following points, and put an `x` in all the boxes that apply. -->
<!--- If you're unsure about any of these, don't hesitate to ask. We're here to help! -->
- [ ] My code follows the OpenZFS [code style requirements](https://github.com/openzfs/zfs/blob/master/.github/CONTRIBUTING.md#coding-conventions).
- [ ] My code follows the ZFS on Linux [code style requirements](https://github.com/zfsonlinux/zfs/blob/master/.github/CONTRIBUTING.md#coding-conventions).
- [ ] I have updated the documentation accordingly.
- [ ] I have read the [**contributing** document](https://github.com/openzfs/zfs/blob/master/.github/CONTRIBUTING.md).
- [ ] I have added [tests](https://github.com/openzfs/zfs/tree/master/tests) to cover my changes.
- [ ] I have run the ZFS Test Suite with this change applied.
- [ ] All commit messages are properly formatted and contain [`Signed-off-by`](https://github.com/openzfs/zfs/blob/master/.github/CONTRIBUTING.md#signed-off-by).
- [ ] I have read the [**contributing** document](https://github.com/zfsonlinux/zfs/blob/master/.github/CONTRIBUTING.md).
- [ ] I have added [tests](https://github.com/zfsonlinux/zfs/tree/master/tests) to cover my changes.
- [ ] All new and existing tests passed.
- [ ] All commit messages are properly formatted and contain [`Signed-off-by`](https://github.com/zfsonlinux/zfs/blob/master/.github/CONTRIBUTING.md#signed-off-by).
+1 -4
View File
@@ -4,8 +4,7 @@ codecov:
after_n_builds: 2 # user and kernel
coverage:
precision: 0 # 0 decimals of precision
round: nearest # Round to nearest precision point
precision: 2 # 2 digits of precision
range: "50...90" # red -> yellow -> green
status:
@@ -21,5 +20,3 @@ comment:
layout: "reach, diff, flags, footer"
behavior: once # update if exists; post new; skip if deleted
require_changes: yes # only post when coverage changes
# ignore: Please place any ignores in config/ax_code_coverage.m4 instead
-5
View File
@@ -1,5 +0,0 @@
name: "Custom CodeQL Analysis"
queries:
- uses: ./.github/codeql/custom-queries/cpp/deprecatedFunctionUsage.ql
- uses: ./.github/codeql/custom-queries/cpp/dslDatasetHoldReleMismatch.ql
-4
View File
@@ -1,4 +0,0 @@
name: "Custom CodeQL Analysis"
paths-ignore:
- tests
@@ -1,59 +0,0 @@
/**
* @name Deprecated function usage detection
* @description Detects functions whose usage is banned from the OpenZFS
* codebase due to QA concerns.
* @kind problem
* @severity error
* @id cpp/deprecated-function-usage
*/
import cpp
predicate isDeprecatedFunction(Function f) {
f.getName() = "strtok" or
f.getName() = "__xpg_basename" or
f.getName() = "basename" or
f.getName() = "dirname" or
f.getName() = "bcopy" or
f.getName() = "bcmp" or
f.getName() = "bzero" or
f.getName() = "asctime" or
f.getName() = "asctime_r" or
f.getName() = "gmtime" or
f.getName() = "localtime" or
f.getName() = "strncpy"
}
string getReplacementMessage(Function f) {
if f.getName() = "strtok" then
result = "Use strtok_r(3) instead!"
else if f.getName() = "__xpg_basename" then
result = "basename(3) is underspecified. Use zfs_basename() instead!"
else if f.getName() = "basename" then
result = "basename(3) is underspecified. Use zfs_basename() instead!"
else if f.getName() = "dirname" then
result = "dirname(3) is underspecified. Use zfs_dirnamelen() instead!"
else if f.getName() = "bcopy" then
result = "bcopy(3) is deprecated. Use memcpy(3)/memmove(3) instead!"
else if f.getName() = "bcmp" then
result = "bcmp(3) is deprecated. Use memcmp(3) instead!"
else if f.getName() = "bzero" then
result = "bzero(3) is deprecated. Use memset(3) instead!"
else if f.getName() = "asctime" then
result = "Use strftime(3) instead!"
else if f.getName() = "asctime_r" then
result = "Use strftime(3) instead!"
else if f.getName() = "gmtime" then
result = "gmtime(3) isn't thread-safe. Use gmtime_r(3) instead!"
else if f.getName() = "localtime" then
result = "localtime(3) isn't thread-safe. Use localtime_r(3) instead!"
else
result = "strncpy(3) is deprecated. Use strlcpy(3) instead!"
}
from FunctionCall fc, Function f
where
fc.getTarget() = f and
isDeprecatedFunction(f)
select fc, getReplacementMessage(f)
@@ -1,34 +0,0 @@
/**
* @name Detect mismatched dsl_dataset_hold/_rele pairs
* @description Flags instances of issue #12014 where
* - a dataset held with dsl_dataset_hold_obj() ends up in dsl_dataset_rele_flags(), or
* - a dataset held with dsl_dataset_hold_obj_flags() ends up in dsl_dataset_rele().
* @kind problem
* @severity error
* @tags correctness
* @id cpp/dslDatasetHoldReleMismatch
*/
import cpp
from Variable ds, Call holdCall, Call releCall, string message
where
ds.getType().toString() = "dsl_dataset_t *" and
holdCall.getASuccessor*() = releCall and
(
(holdCall.getTarget().getName() = "dsl_dataset_hold_obj_flags" and
holdCall.getArgument(4).(AddressOfExpr).getOperand().(VariableAccess).getTarget() = ds and
releCall.getTarget().getName() = "dsl_dataset_rele" and
releCall.getArgument(0).(VariableAccess).getTarget() = ds and
message = "Held with dsl_dataset_hold_obj_flags but released with dsl_dataset_rele")
or
(holdCall.getTarget().getName() = "dsl_dataset_hold_obj" and
holdCall.getArgument(3).(AddressOfExpr).getOperand().(VariableAccess).getTarget() = ds and
releCall.getTarget().getName() = "dsl_dataset_rele_flags" and
releCall.getArgument(0).(VariableAccess).getTarget() = ds and
message = "Held with dsl_dataset_hold_obj but released with dsl_dataset_rele_flags")
)
select releCall,
"Mismatched release: held with $@ but released with " + releCall.getTarget().getName() + " for dataset $@",
holdCall, holdCall.getTarget().getName(),
ds, ds.toString()
@@ -1,4 +0,0 @@
name: openzfs-cpp-queries
version: 0.0.0
libraryPathDependencies: codeql-cpp
suites: openzfs-cpp-suite
-13
View File
@@ -1,13 +0,0 @@
# Configuration for probot-no-response - https://github.com/probot/no-response
# Number of days of inactivity before an Issue is closed for lack of response
daysUntilClose: 31
# Label requiring a response
responseRequiredLabel: "Status: Feedback requested"
# Comment to post when closing an Issue for lack of response. Set to `false` to disable
closeComment: >
This issue has been automatically closed because there has been no response
to our request for more information from the original author. With only the
information that is currently in the issue, we don't have enough information
to take action. Please reach out if you have or find the answers we need so
that we can investigate further.
-26
View File
@@ -1,26 +0,0 @@
# Number of days of inactivity before an issue becomes stale
daysUntilStale: 365
# Number of days of inactivity before a stale issue is closed
daysUntilClose: 90
# Limit to only `issues` or `pulls`
only: issues
# Issues with these labels will never be considered stale
exemptLabels:
- "Type: Feature"
- "Bot: Not Stale"
- "Status: Work in Progress"
# Set to true to ignore issues in a project (defaults to false)
exemptProjects: true
# Set to true to ignore issues in a milestone (defaults to false)
exemptMilestones: true
# Set to true to ignore issues with an assignee (defaults to false)
exemptAssignees: true
# Label to use when marking an issue as stale
staleLabel: "Status: Stale"
# Comment to post when marking an issue as stale. Set to `false` to disable
markComment: >
This issue has been automatically marked as "stale" because it has not had
any activity for a while. It will be closed in 90 days if no further activity occurs.
Thank you for your contributions.
# Limit the number of actions per hour, from 1-30. Default is 30
limitPerRun: 6
+3
View File
@@ -0,0 +1,3 @@
preprocessorErrorDirective:./module/zfs/vdev_raidz_math_avx512f.c:243
preprocessorErrorDirective:./module/zfs/vdev_raidz_math_sse2.c:266
-61
View File
@@ -1,61 +0,0 @@
## The testings are done this way
```mermaid
flowchart TB
subgraph CleanUp and Summary
CleanUp+Summary
end
subgraph Functional Testings
sanity-checks-20.04
zloop-checks-20.04
functional-testing-20.04-->Part1-20.04
functional-testing-20.04-->Part2-20.04
functional-testing-20.04-->Part3-20.04
functional-testing-20.04-->Part4-20.04
functional-testing-22.04-->Part1-22.04
functional-testing-22.04-->Part2-22.04
functional-testing-22.04-->Part3-22.04
functional-testing-22.04-->Part4-22.04
sanity-checks-22.04
zloop-checks-22.04
end
subgraph Code Checking + Building
Build-Ubuntu-20.04
codeql.yml
checkstyle.yml
Build-Ubuntu-22.04
end
Build-Ubuntu-20.04-->sanity-checks-20.04
Build-Ubuntu-20.04-->zloop-checks-20.04
Build-Ubuntu-20.04-->functional-testing-20.04
Build-Ubuntu-22.04-->sanity-checks-22.04
Build-Ubuntu-22.04-->zloop-checks-22.04
Build-Ubuntu-22.04-->functional-testing-22.04
sanity-checks-20.04-->CleanUp+Summary
Part1-20.04-->CleanUp+Summary
Part2-20.04-->CleanUp+Summary
Part3-20.04-->CleanUp+Summary
Part4-20.04-->CleanUp+Summary
Part1-22.04-->CleanUp+Summary
Part2-22.04-->CleanUp+Summary
Part3-22.04-->CleanUp+Summary
Part4-22.04-->CleanUp+Summary
sanity-checks-22.04-->CleanUp+Summary
```
1) build zfs modules for Ubuntu 20.04 and 22.04 (~15m)
2) 2x zloop test (~10m) + 2x sanity test (~25m)
3) 4x functional testings in parts 1..4 (each ~1h)
4) cleanup and create summary
- content of summary depends on the results of the steps
When everything runs fine, the full run should be done in
about 2 hours.
The codeql.yml and checkstyle.yml are not part in this circle.
-64
View File
@@ -1,64 +0,0 @@
name: checkstyle
on:
push:
pull_request:
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
jobs:
checkstyle:
runs-on: ubuntu-22.04
steps:
- uses: actions/checkout@v4
with:
ref: ${{ github.event.pull_request.head.sha }}
- name: Install dependencies
run: |
# for x in lxd core20 snapd; do sudo snap remove $x; done
sudo apt-get purge -y snapd google-chrome-stable firefox
ONLY_DEPS=1 .github/workflows/scripts/qemu-3-deps-vm.sh ubuntu22
sudo apt-get install -y cppcheck devscripts mandoc pax-utils shellcheck
sudo python -m pipx install --quiet flake8
# confirm that the tools are installed
# the build system doesn't fail when they are not
checkbashisms --version
cppcheck --version
flake8 --version
scanelf --version
shellcheck --version
- name: Prepare
run: |
sed -i '/DEBUG_CFLAGS="-Werror"/s/^/#/' config/zfs-build.m4
./autogen.sh
- name: Configure
run: |
./configure
- name: Make
run: |
make -j$(nproc) --no-print-directory --silent
- name: Checkstyle
run: |
make -j$(nproc) --no-print-directory --silent checkstyle
- name: Lint
run: |
make -j$(nproc) --no-print-directory --silent lint
- name: CheckABI
id: CheckABI
run: |
docker run -v $PWD:/source ghcr.io/openzfs/libabigail make -j$(nproc) --no-print-directory --silent checkabi
- name: StoreABI
if: failure() && steps.CheckABI.outcome == 'failure'
run: |
docker run -v $PWD:/source ghcr.io/openzfs/libabigail make -j$(nproc) --no-print-directory --silent storeabi
- name: Prepare artifacts
if: failure() && steps.CheckABI.outcome == 'failure'
run: |
find -name *.abi | tar -cf abi_files.tar -T -
- uses: actions/upload-artifact@v4
if: failure() && steps.CheckABI.outcome == 'failure'
with:
name: New ABI files (use only if you're sure about interface changes)
path: abi_files.tar
-45
View File
@@ -1,45 +0,0 @@
name: "CodeQL"
on:
push:
pull_request:
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
jobs:
analyze:
name: Analyze
runs-on: ubuntu-22.04
permissions:
actions: read
contents: read
security-events: write
strategy:
fail-fast: false
matrix:
language: [ 'cpp', 'python' ]
steps:
- name: Set make jobs
run: |
echo "MAKEFLAGS=-j$(nproc)" >> $GITHUB_ENV
- name: Checkout repository
uses: actions/checkout@v4
- name: Initialize CodeQL
uses: github/codeql-action/init@v3
with:
config-file: .github/codeql-${{ matrix.language }}.yml
languages: ${{ matrix.language }}
- name: Autobuild
uses: github/codeql-action/autobuild@v3
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v3
with:
category: "/language:${{matrix.language}}"
-49
View File
@@ -1,49 +0,0 @@
name: labels
on:
pull_request_target:
types: [ opened, synchronize, reopened, converted_to_draft, ready_for_review ]
permissions:
pull-requests: write
jobs:
open:
runs-on: ubuntu-latest
if: ${{ github.event.action == 'opened' && github.event.pull_request.draft }}
steps:
- env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
ISSUE: ${{ github.event.pull_request.html_url }}
run: |
gh pr edit $ISSUE --add-label "Status: Work in Progress"
push:
runs-on: ubuntu-latest
if: ${{ github.event.action == 'synchronize' || github.event.action == 'reopened' }}
steps:
- env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
ISSUE: ${{ github.event.pull_request.html_url }}
run: |
gh pr edit $ISSUE --remove-label "Status: Accepted,Status: Inactive,Status: Revision Needed,Status: Stale"
draft:
runs-on: ubuntu-latest
if: ${{ github.event.action == 'converted_to_draft' }}
steps:
- env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
ISSUE: ${{ github.event.pull_request.html_url }}
run: |
gh pr edit $ISSUE --remove-label "Status: Accepted,Status: Code Review Needed,Status: Inactive,Status: Revision Needed,Status: Stale" --add-label "Status: Work in Progress"
rfr:
runs-on: ubuntu-latest
if: ${{ github.event.action == 'ready_for_review' }}
steps:
- env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
ISSUE: ${{ github.event.pull_request.html_url }}
run: |
gh pr edit $ISSUE --remove-label "Status: Accepted,Status: Inactive,Status: Revision Needed,Status: Stale,Status: Work in Progress" --add-label "Status: Code Review Needed"
-14
View File
@@ -1,14 +0,0 @@
Workflow for each operating system:
- install qemu on the github runner
- download current cloud image of operating system
- start and init that image via cloud-init
- install dependencies and poweroff system
- start system and build openzfs and then poweroff again
- clone build system and start 2 instances of it
- run functional testings and complete in around 3h
- when tests are done, do some logfile preparing
- show detailed results for each system
- in the end, generate the job summary
/TR 14.09.2024
@@ -1,112 +0,0 @@
#!/usr/bin/env python3
"""
Determine the CI type based on the change list and commit message.
Prints "quick" if (explicity required by user):
- the *last* commit message contains 'ZFS-CI-Type: quick'
or if (heuristics):
- the files changed are not in the list of specified directories, and
- all commit messages do not contain 'ZFS-CI-Type: (full|linux|freebsd)'
Otherwise prints "full".
"""
import sys
import subprocess
import re
"""
Patterns of files that are not considered to trigger full CI.
Note: not using pathlib.Path.match() because it does not support '**'
"""
FULL_RUN_IGNORE_REGEX = list(map(re.compile, [
r'.*\.md',
r'.*\.gitignore'
]))
"""
Patterns of files that are considered to trigger full CI.
"""
FULL_RUN_REGEX = list(map(re.compile, [
r'\.github/workflows/scripts/.*',
r'cmd.*',
r'configs/.*',
r'META',
r'.*\.am',
r'.*\.m4',
r'autogen\.sh',
r'configure\.ac',
r'copy-builtin',
r'contrib',
r'etc',
r'include',
r'lib/.*',
r'module/.*',
r'scripts/.*',
r'tests/.*',
r'udev/.*'
]))
if __name__ == '__main__':
prog = sys.argv[0]
if len(sys.argv) != 3:
print(f'Usage: {prog} <head_ref> <base_ref>')
sys.exit(1)
head, base = sys.argv[1:3]
def output_type(type, reason):
print(f'{prog}: will run {type} CI: {reason}', file=sys.stderr)
print(type)
sys.exit(0)
# check last (HEAD) commit message
last_commit_message_raw = subprocess.run([
'git', 'show', '-s', '--format=%B', head
], check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
for line in last_commit_message_raw.stdout.decode().splitlines():
if line.strip().lower() == 'zfs-ci-type: quick':
output_type('quick', f'requested by HEAD commit {head}')
# check all commit messages
all_commit_message_raw = subprocess.run([
'git', 'show', '-s',
'--format=ZFS-CI-Commit: %H%n%B', f'{head}...{base}'
], check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
all_commit_message = all_commit_message_raw.stdout.decode().splitlines()
commit_ref = head
for line in all_commit_message:
if line.startswith('ZFS-CI-Commit:'):
commit_ref = line.lstrip('ZFS-CI-Commit:').rstrip()
if line.strip().lower() == 'zfs-ci-type: freebsd':
output_type('freebsd', f'requested by commit {commit_ref}')
if line.strip().lower() == 'zfs-ci-type: linux':
output_type('linux', f'requested by commit {commit_ref}')
if line.strip().lower() == 'zfs-ci-type: full':
output_type('full', f'requested by commit {commit_ref}')
# check changed files
changed_files_raw = subprocess.run([
'git', 'diff', '--name-only', head, base
], check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
changed_files = changed_files_raw.stdout.decode().splitlines()
for f in changed_files:
for r in FULL_RUN_IGNORE_REGEX:
if r.match(f):
break
else:
for r in FULL_RUN_REGEX:
if r.match(f):
output_type(
'full',
f'changed file "{f}" matches pattern "{r.pattern}"'
)
# catch-all
output_type('quick', 'no changed file matches full CI patterns')
-109
View File
@@ -1,109 +0,0 @@
#!/bin/awk -f
#
# Merge multiple ZTS tests results summaries into a single summary. This is
# needed when you're running different parts of ZTS on different tests
# runners or VMs.
#
# Usage:
#
# ./merge_summary.awk summary1.txt [summary2.txt] [summary3.txt] ...
#
# or:
#
# cat summary*.txt | ./merge_summary.awk
#
BEGIN {
i=-1
pass=0
fail=0
skip=0
state=""
cl=0
el=0
upl=0
ul=0
# Total seconds of tests runtime
total=0;
}
# Skip empty lines
/^\s*$/{next}
# Skip Configuration and Test lines
/^Test:/{state=""; next}
/Configuration/{state="";next}
# When we see "test-runner.py" stop saving config lines, and
# save test runner lines
/test-runner.py/{state="testrunner"; runner=runner$0"\n"; next}
# We need to differentiate the PASS counts from test result lines that start
# with PASS, like:
#
# PASS mv_files/setup
#
# Use state="pass_count" to differentiate
#
/Results Summary/{state="pass_count"; next}
/PASS/{ if (state=="pass_count") {pass += $2}}
/FAIL/{ if (state=="pass_count") {fail += $2}}
/SKIP/{ if (state=="pass_count") {skip += $2}}
/Running Time/{
state="";
running[i]=$3;
split($3, arr, ":")
total += arr[1] * 60 * 60;
total += arr[2] * 60;
total += arr[3]
next;
}
/Tests with results other than PASS that are expected/{state="expected_lines"; next}
/Tests with result of PASS that are unexpected/{state="unexpected_pass_lines"; next}
/Tests with results other than PASS that are unexpected/{state="unexpected_lines"; next}
{
if (state == "expected_lines") {
expected_lines[el] = $0
el++
}
if (state == "unexpected_pass_lines") {
unexpected_pass_lines[upl] = $0
upl++
}
if (state == "unexpected_lines") {
unexpected_lines[ul] = $0
ul++
}
}
# Reproduce summary
END {
print runner;
print "\nResults Summary"
print "PASS\t"pass
print "FAIL\t"fail
print "SKIP\t"skip
print ""
print "Running Time:\t"strftime("%T", total, 1)
if (pass+fail+skip > 0) {
percent_passed=(pass/(pass+fail+skip) * 100)
}
printf "Percent passed:\t%3.2f%", percent_passed
print "\n\nTests with results other than PASS that are expected:"
asort(expected_lines, sorted)
for (j in sorted)
print sorted[j]
print "\n\nTests with result of PASS that are unexpected:"
asort(unexpected_pass_lines, sorted)
for (j in sorted)
print sorted[j]
print "\n\nTests with results other than PASS that are unexpected:"
asort(unexpected_lines, sorted)
for (j in sorted)
print sorted[j]
}
-135
View File
@@ -1,135 +0,0 @@
#!/usr/bin/env bash
######################################################################
# 1) setup qemu instance on action runner
######################################################################
set -eu
# The default 'azure.archive.ubuntu.com' mirrors can be really slow.
# Prioritize the official Ubuntu mirrors.
#
# The normal apt-mirrors.txt will look like:
#
# http://azure.archive.ubuntu.com/ubuntu/ priority:1
# https://archive.ubuntu.com/ubuntu/ priority:2
# https://security.ubuntu.com/ubuntu/ priority:3
#
# Just delete the 'azure.archive.ubuntu.com' line.
sudo sed -i '/azure.archive.ubuntu.com/d' /etc/apt/apt-mirrors.txt
echo "Using mirrors:"
cat /etc/apt/apt-mirrors.txt
# install needed packages
export DEBIAN_FRONTEND="noninteractive"
sudo apt-get -y update
sudo apt-get install -y axel cloud-image-utils daemonize guestfs-tools \
virt-manager linux-modules-extra-$(uname -r) zfsutils-linux
# generate ssh keys
rm -f ~/.ssh/id_ed25519
ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519 -q -N ""
# not needed
sudo systemctl stop docker.socket
sudo systemctl stop multipathd.socket
sudo swapoff -a
# Special case:
#
# For reasons unknown, the runner can boot-up with two different block device
# configurations. On one config you get two 75GB block devices, and on the
# other you get a single 150GB block device. Here's what both look like:
#
# --- Two 75GB block devices ---
# NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
# sda 8:0 0 150G 0 disk
# ├─sda1 8:1 0 149G 0 part /
# ├─sda14 8:14 0 4M 0 part
# ├─sda15 8:15 0 106M 0 part /boot/efi
# └─sda16 259:0 0 913M 0 part /boot
#
# lrwxrwxrwx 1 root root 9 Jan 29 18:07 azure_root -> ../../sda
# lrwxrwxrwx 1 root root 10 Jan 29 18:07 azure_root-part1 -> ../../sda1
# lrwxrwxrwx 1 root root 11 Jan 29 18:07 azure_root-part14 -> ../../sda14
# lrwxrwxrwx 1 root root 11 Jan 29 18:07 azure_root-part15 -> ../../sda15
# lrwxrwxrwx 1 root root 11 Jan 29 18:07 azure_root-part16 -> ../../sda16
#
# --- One 150GB block device ---
# NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
# sda 8:0 0 75G 0 disk
# ├─sda1 8:1 0 74G 0 part /
# ├─sda14 8:14 0 4M 0 part
# ├─sda15 8:15 0 106M 0 part /boot/efi
# └─sda16 259:0 0 913M 0 part /boot
# sdb 8:16 0 75G 0 disk
# └─sdb1 8:17 0 75G 0 part
#
# lrwxrwxrwx 1 root root 9 Jan 29 18:07 azure_resource -> ../../sdb
# lrwxrwxrwx 1 root root 10 Jan 29 18:07 azure_resource-part1 -> ../../sdb1
# lrwxrwxrwx 1 root root 9 Jan 29 18:07 azure_root -> ../../sda
# lrwxrwxrwx 1 root root 10 Jan 29 18:07 azure_root-part1 -> ../../sda1
# lrwxrwxrwx 1 root root 11 Jan 29 18:07 azure_root-part14 -> ../../sda14
# lrwxrwxrwx 1 root root 11 Jan 29 18:07 azure_root-part15 -> ../../sda15
#
# If we have the azure_resource-part1 partition, umount it, partition it, and
# use it as our ZFS disk and swap partition. If not, just create a file VDEV
# and swap file and use that instead.
# remove default swapfile and /mnt
if [ -e /dev/disk/cloud/azure_resource-part1 ] ; then
sudo umount -l /mnt
DISK="/dev/disk/cloud/azure_resource-part1"
sudo sed -e "s|^$DISK.*||g" -i /etc/fstab
sudo wipefs -aq $DISK
sudo systemctl daemon-reload
fi
sudo modprobe loop
sudo modprobe zfs
if [ -e /dev/disk/cloud/azure_resource-part1 ] ; then
echo "We have two 75GB block devices"
# partition the disk as needed
DISK="/dev/disk/cloud/azure_resource"
sudo sgdisk --zap-all $DISK
sudo sgdisk -p \
-n 1:0:+16G -c 1:"swap" \
-n 2:0:0 -c 2:"tests" \
$DISK
sync
sleep 1
sudo fallocate -l 12G /test.ssd2
DISKS="$DISK-part2 /test.ssd2"
SWAP=$DISK-part1
else
echo "We have a single 150GB block device"
sudo fallocate -l 72G /test.ssd2
SWAP=/swapfile.ssd
sudo fallocate -l 16G $SWAP
sudo chmod 600 $SWAP
DISKS="/test.ssd2"
fi
# swap with same size as RAM (16GiB)
sudo mkswap $SWAP
sudo swapon $SWAP
# adjust zfs module parameter and create pool
exec 1>/dev/null
ARC_MIN=$((1024*1024*256))
ARC_MAX=$((1024*1024*512))
echo $ARC_MIN | sudo tee /sys/module/zfs/parameters/zfs_arc_min
echo $ARC_MAX | sudo tee /sys/module/zfs/parameters/zfs_arc_max
echo 1 | sudo tee /sys/module/zfs/parameters/zvol_use_blk_mq
sudo zpool create -f -o ashift=12 zpool $DISKS -O relatime=off \
-O atime=off -O xattr=sa -O compression=lz4 -O sync=disabled \
-O redundant_metadata=none -O mountpoint=/mnt/tests
# no need for some scheduler
for i in /sys/block/s*/queue/scheduler; do
echo "none" | sudo tee $i
done
-341
View File
@@ -1,341 +0,0 @@
#!/usr/bin/env bash
######################################################################
# 2) start qemu with some operating system, init via cloud-init
######################################################################
set -eu
# short name used in zfs-qemu.yml
OS="$1"
# OS variant (virt-install --os-variant list)
OSv=$OS
# FreeBSD urls's
FREEBSD_REL="https://download.freebsd.org/releases/CI-IMAGES"
FREEBSD_SNAP="https://download.freebsd.org/snapshots/CI-IMAGES"
URLxz=""
# Ubuntu mirrors
UBMIRROR="https://cloud-images.ubuntu.com"
#UBMIRROR="https://mirrors.cloud.tencent.com/ubuntu-cloud-images"
#UBMIRROR="https://mirror.citrahost.com/ubuntu-cloud-images"
# default nic model for vm's
NIC="virtio"
# additional options for virt-install
OPTS[0]=""
OPTS[1]=""
case "$OS" in
almalinux8)
OSNAME="AlmaLinux 8"
URL="https://repo.almalinux.org/almalinux/8/cloud/x86_64/images/AlmaLinux-8-GenericCloud-latest.x86_64.qcow2"
;;
almalinux9)
OSNAME="AlmaLinux 9"
URL="https://repo.almalinux.org/almalinux/9/cloud/x86_64/images/AlmaLinux-9-GenericCloud-latest.x86_64.qcow2"
;;
almalinux10)
OSNAME="AlmaLinux 10"
OSv="almalinux9"
URL="https://repo.almalinux.org/almalinux/10/cloud/x86_64/images/AlmaLinux-10-GenericCloud-latest.x86_64.qcow2"
;;
alpine3-23)
OSNAME="Alpine Linux 3.23.2"
# Alpine Linux v3.22 and v3.23 are unknown to osinfo as of 2025-12-26.
OSv="alpinelinux3.21"
URL="https://dl-cdn.alpinelinux.org/alpine/v3.23/releases/cloud/generic_alpine-3.23.2-x86_64-bios-cloudinit-r0.qcow2"
;;
archlinux)
OSNAME="Archlinux"
URL="https://geo.mirror.pkgbuild.com/images/latest/Arch-Linux-x86_64-cloudimg.qcow2"
;;
centos-stream9)
OSNAME="CentOS Stream 9"
URL="https://cloud.centos.org/centos/9-stream/x86_64/images/CentOS-Stream-GenericCloud-9-latest.x86_64.qcow2"
;;
centos-stream10)
OSNAME="CentOS Stream 10"
OSv="centos-stream9"
URL="https://cloud.centos.org/centos/10-stream/x86_64/images/CentOS-Stream-GenericCloud-10-latest.x86_64.qcow2"
;;
debian11)
OSNAME="Debian 11"
URL="https://cloud.debian.org/images/cloud/bullseye/latest/debian-11-generic-amd64.qcow2"
;;
debian12)
OSNAME="Debian 12"
URL="https://cloud.debian.org/images/cloud/bookworm/latest/debian-12-generic-amd64.qcow2"
;;
debian13)
OSNAME="Debian 13"
# TODO: Overwrite OSv to debian13 for virt-install until it's added to osinfo
OSv="debian12"
URL="https://cloud.debian.org/images/cloud/trixie/latest/debian-13-generic-amd64.qcow2"
OPTS[0]="--boot"
OPTS[1]="uefi=on"
;;
fedora41)
OSNAME="Fedora 41"
OSv="fedora-unknown"
URL="https://download.fedoraproject.org/pub/fedora/linux/releases/41/Cloud/x86_64/images/Fedora-Cloud-Base-Generic-41-1.4.x86_64.qcow2"
;;
fedora42)
OSNAME="Fedora 42"
OSv="fedora-unknown"
URL="https://download.fedoraproject.org/pub/fedora/linux/releases/42/Cloud/x86_64/images/Fedora-Cloud-Base-Generic-42-1.1.x86_64.qcow2"
;;
fedora43)
OSNAME="Fedora 43"
OSv="fedora-unknown"
URL="https://download.fedoraproject.org/pub/fedora/linux/releases/43/Cloud/x86_64/images/Fedora-Cloud-Base-Generic-43-1.6.x86_64.qcow2"
;;
freebsd13-5r)
FreeBSD="13.5-RELEASE"
OSNAME="FreeBSD $FreeBSD"
OSv="freebsd13.0"
URLxz="$FREEBSD_REL/$FreeBSD/amd64/Latest/FreeBSD-$FreeBSD-amd64-BASIC-CI.raw.xz"
KSRC="$FREEBSD_REL/../amd64/$FreeBSD/src.txz"
NIC="rtl8139"
;;
freebsd14-3r)
FreeBSD="14.3-RELEASE"
OSNAME="FreeBSD $FreeBSD"
OSv="freebsd14.0"
URLxz="$FREEBSD_REL/$FreeBSD/amd64/Latest/FreeBSD-$FreeBSD-amd64-BASIC-CI.raw.xz"
KSRC="$FREEBSD_REL/../amd64/$FreeBSD/src.txz"
;;
freebsd13-5s)
FreeBSD="13.5-STABLE"
OSNAME="FreeBSD $FreeBSD"
OSv="freebsd13.0"
URLxz="$FREEBSD_SNAP/$FreeBSD/amd64/Latest/FreeBSD-$FreeBSD-amd64-BASIC-CI.raw.xz"
KSRC="$FREEBSD_SNAP/../amd64/$FreeBSD/src.txz"
NIC="rtl8139"
;;
freebsd14-3s)
FreeBSD="14.3-STABLE"
OSNAME="FreeBSD $FreeBSD"
OSv="freebsd14.0"
URLxz="$FREEBSD_SNAP/$FreeBSD/amd64/Latest/FreeBSD-$FreeBSD-amd64-BASIC-CI-ufs.raw.xz"
KSRC="$FREEBSD_SNAP/../amd64/$FreeBSD/src.txz"
;;
freebsd15-0s)
FreeBSD="15.0-STABLE"
OSNAME="FreeBSD $FreeBSD"
OSv="freebsd14.0"
URLxz="$FREEBSD_SNAP/$FreeBSD/amd64/Latest/FreeBSD-$FreeBSD-amd64-BASIC-CI-ufs.raw.xz"
KSRC="$FREEBSD_SNAP/../amd64/$FreeBSD/src.txz"
;;
freebsd16-0c)
FreeBSD="16.0-CURRENT"
OSNAME="FreeBSD $FreeBSD"
OSv="freebsd14.0"
URLxz="$FREEBSD_SNAP/$FreeBSD/amd64/Latest/FreeBSD-$FreeBSD-amd64-BASIC-CI-ufs.raw.xz"
KSRC="$FREEBSD_SNAP/../amd64/$FreeBSD/src.txz"
;;
tumbleweed)
OSNAME="openSUSE Tumbleweed"
OSv="opensusetumbleweed"
MIRROR="http://opensuse-mirror-gce-us.susecloud.net"
URL="$MIRROR/tumbleweed/appliances/openSUSE-MicroOS.x86_64-OpenStack-Cloud.qcow2"
;;
ubuntu22)
OSNAME="Ubuntu 22.04"
OSv="ubuntu22.04"
URL="$UBMIRROR/jammy/current/jammy-server-cloudimg-amd64.img"
;;
ubuntu24)
OSNAME="Ubuntu 24.04"
OSv="ubuntu24.04"
URL="$UBMIRROR/noble/current/noble-server-cloudimg-amd64.img"
;;
*)
echo "Wrong value for OS variable!"
exit 111
;;
esac
# environment file
ENV="/var/tmp/env.txt"
echo "ENV=$ENV" >> $ENV
# result path
echo 'RESPATH="/var/tmp/test_results"' >> $ENV
# FreeBSD 13 has problems with: e1000 and virtio
echo "NIC=$NIC" >> $ENV
# freebsd15 -> used in zfs-qemu.yml
echo "OS=$OS" >> $ENV
# freebsd14.0 -> used for virt-install
echo "OSv=\"$OSv\"" >> $ENV
# FreeBSD 15 (Current) -> used for summary
echo "OSNAME=\"$OSNAME\"" >> $ENV
# default vm count for testings
VMs=2
echo "VMs=\"$VMs\"" >> $ENV
# default cpu count for testing vm's
CPU=2
echo "CPU=\"$CPU\"" >> $ENV
sudo mkdir -p "/mnt/tests"
sudo chown -R $(whoami) /mnt/tests
DISK="/dev/zvol/zpool/openzfs"
sudo zfs create -ps -b 64k -V 80g zpool/openzfs
while true; do test -b $DISK && break; sleep 1; done
# we are downloading via axel, curl and wget are mostly slower and
# require more return value checking
IMG="/mnt/tests/cloud-image"
if [ ! -z "$URLxz" ]; then
echo "Loading $URLxz ..."
time axel -q -o "$IMG" "$URLxz"
echo "Loading $KSRC ..."
time axel -q -o ~/src.txz $KSRC
else
echo "Loading $URL ..."
time axel -q -o "$IMG" "$URL"
fi
echo "Importing VM image to zvol..."
if [ ! -z "$URLxz" ]; then
xzcat -T0 $IMG | sudo dd of=$DISK bs=4M
else
sudo qemu-img dd -f qcow2 -O raw if=$IMG of=$DISK bs=4M
fi
rm -f $IMG
PUBKEY=$(cat ~/.ssh/id_ed25519.pub)
if [ ${OS:0:7} != "freebsd" ]; then
cat <<EOF > /tmp/user-data
#cloud-config
hostname: $OS
users:
- name: root
shell: /bin/bash
sudo: ['ALL=(ALL) NOPASSWD:ALL']
- name: zfs
shell: /bin/bash
sudo: ['ALL=(ALL) NOPASSWD:ALL']
ssh_authorized_keys:
- $PUBKEY
# Workaround for Alpine Linux.
lock_passwd: false
passwd: '*'
packages:
- sudo
- bash
growpart:
mode: auto
devices: ['/']
ignore_growroot_disabled: false
EOF
else
cat <<EOF > /tmp/user-data
#cloud-config
hostname: $OS
# minimized config without sudo for nuageinit of FreeBSD
growpart:
mode: auto
devices: ['/']
ignore_growroot_disabled: false
EOF
fi
sudo virsh net-update default add ip-dhcp-host \
"<host mac='52:54:00:83:79:00' ip='192.168.122.10'/>" --live --config
sudo virt-install \
--os-variant $OSv \
--name "openzfs" \
--cpu host-passthrough \
--virt-type=kvm --hvm \
--vcpus=4,sockets=1 \
--memory $((1024*12)) \
--memballoon model=virtio \
--graphics none \
--network bridge=virbr0,model=$NIC,mac='52:54:00:83:79:00' \
--cloud-init user-data=/tmp/user-data \
--disk $DISK,bus=virtio,cache=none,format=raw,driver.discard=unmap \
--import --noautoconsole ${OPTS[0]} ${OPTS[1]} >/dev/null
# Give the VMs hostnames so we don't have to refer to them with
# hardcoded IP addresses.
#
# vm0: Initial VM we install dependencies and build ZFS on.
# vm1..2 Testing VMs
for ((i=0; i<=VMs; i++)); do
echo "192.168.122.1$i vm$i" | sudo tee -a /etc/hosts
done
# in case the directory isn't there already
mkdir -p $HOME/.ssh
cat <<EOF >> $HOME/.ssh/config
# no questions please
StrictHostKeyChecking no
# small timeout, used in while loops later
ConnectTimeout 1
EOF
if [ ${OS:0:7} != "freebsd" ]; then
# enable KSM on Linux
sudo virsh dommemstat --domain "openzfs" --period 5
sudo virsh node-memory-tune 100 50 1
echo 1 | sudo tee /sys/kernel/mm/ksm/run > /dev/null
else
# on FreeBSD we need some more init stuff, because of nuageinit
BASH="/usr/local/bin/bash"
while pidof /usr/bin/qemu-system-x86_64 >/dev/null; do
ssh 2>/dev/null root@vm0 "uname -a" && break
done
ssh root@vm0 "env IGNORE_OSVERSION=yes pkg install -y bash ca_root_nss git qemu-guest-agent python3 py311-cloud-init"
ssh root@vm0 "chsh -s $BASH root"
ssh root@vm0 'sysrc qemu_guest_agent_enable="YES"'
ssh root@vm0 'sysrc cloudinit_enable="YES"'
ssh root@vm0 "pw add user zfs -w no -s $BASH"
ssh root@vm0 'mkdir -p ~zfs/.ssh'
ssh root@vm0 'echo "zfs ALL=(ALL:ALL) NOPASSWD: ALL" >> /usr/local/etc/sudoers'
ssh root@vm0 'echo "PubkeyAuthentication yes" >> /etc/ssh/sshd_config'
scp ~/.ssh/id_ed25519.pub "root@vm0:~zfs/.ssh/authorized_keys"
ssh root@vm0 'chown -R zfs ~zfs'
ssh root@vm0 'service sshd restart'
scp ~/src.txz "root@vm0:/tmp/src.txz"
ssh root@vm0 'tar -C / -zxf /tmp/src.txz'
fi
#
# Config for Alpine Linux similar to FreeBSD.
#
if [ ${OS:0:6} == "alpine" ]; then
while pidof /usr/bin/qemu-system-x86_64 >/dev/null; do
ssh 2>/dev/null zfs@vm0 "uname -a" && break
done
# Enable community and testing repositories.
ssh zfs@vm0 "sudo rm -rf /etc/apk/repositories"
ssh zfs@vm0 "sudo setup-apkrepos -c1"
ssh zfs@vm0 "echo '@testing http://dl-cdn.alpinelinux.org/alpine/edge/testing' | sudo tee -a /etc/apk/repositories"
# Upgrade to edge or latest-stable.
#ssh zfs@vm0 "sudo sed -i 's#/v[0-9]\+\.[0-9]\+/#/edge/#g' /etc/apk/repositories"
#ssh zfs@vm0 "sudo sed -i 's#/v[0-9]\+\.[0-9]\+/#/latest-stable/#g' /etc/apk/repositories"
# Update and upgrade after repository setup.
ssh zfs@vm0 "sudo apk update"
ssh zfs@vm0 "sudo apk add --upgrade apk-tools"
ssh zfs@vm0 "sudo apk upgrade --available"
fi
-310
View File
@@ -1,310 +0,0 @@
#!/usr/bin/env bash
######################################################################
# 3) install dependencies for compiling and loading
#
# $1: OS name (like 'fedora41')
# $2: (optional) Experimental Fedora kernel version, like "6.14" to
# install instead of Fedora defaults.
######################################################################
set -eu
function alpine() {
echo "##[group]Install Development Tools"
sudo apk add \
acl alpine-sdk attr autoconf automake bash build-base clang21 coreutils \
cpio cryptsetup curl curl-dev dhcpcd eudev eudev-dev eudev-libs findutils \
fio gawk gdb gettext-dev git grep jq libaio libaio-dev libcurl \
libtirpc-dev libtool libunwind libunwind-dev linux-headers linux-tools \
linux-virt linux-virt-dev lsscsi m4 make nfs-utils openssl-dev parted \
pax procps py3-cffi py3-distlib py3-packaging py3-setuptools python3 \
python3-dev qemu-guest-agent rng-tools rsync samba samba-server sed \
strace sysstat util-linux util-linux-dev wget words xfsprogs xxhash \
zlib-dev pamtester@testing
echo "##[endgroup]"
echo "##[group]Switch to eudev"
sudo setup-devd udev
echo "##[endgroup]"
echo "##[group]Install ksh93 from Source"
git clone --depth 1 https://github.com/ksh93/ksh.git /tmp/ksh
cd /tmp/ksh
./bin/package make
sudo ./bin/package install /
echo "##[endgroup]"
}
function archlinux() {
echo "##[group]Running pacman -Syu"
sudo btrfs filesystem resize max /
sudo pacman -Syu --noconfirm
echo "##[endgroup]"
echo "##[group]Install Development Tools"
sudo pacman -Sy --noconfirm base-devel bc cpio cryptsetup dhclient dkms \
fakeroot fio gdb inetutils jq less linux linux-headers lsscsi nfs-utils \
parted pax perf python-packaging python-setuptools qemu-guest-agent ksh \
samba strace sysstat rng-tools rsync wget xxhash
echo "##[endgroup]"
}
function debian() {
export DEBIAN_FRONTEND="noninteractive"
echo "##[group]Wait for cloud-init to finish"
cloud-init status --wait
echo "##[endgroup]"
echo "##[group]Running apt-get update+upgrade"
sudo sed -i '/[[:alpha:]]-backports/d' /etc/apt/sources.list
sudo apt-get update -y
sudo apt-get upgrade -y
echo "##[endgroup]"
echo "##[group]Install Development Tools"
sudo apt-get install -y \
acl alien attr autoconf bc cpio cryptsetup curl dbench dh-python dkms \
fakeroot fio gdb gdebi git ksh lcov isc-dhcp-client jq libacl1-dev \
libaio-dev libattr1-dev libblkid-dev libcurl4-openssl-dev libdevmapper-dev \
libelf-dev libffi-dev libmount-dev libpam0g-dev libselinux-dev libssl-dev \
libtool libtool-bin libudev-dev libunwind-dev linux-headers-$(uname -r) \
lsscsi nfs-kernel-server pamtester parted python3 python3-all-dev \
python3-cffi python3-dev python3-distlib python3-packaging libtirpc-dev \
python3-setuptools python3-sphinx qemu-guest-agent rng-tools rpm2cpio \
rsync samba strace sysstat uuid-dev watchdog wget xfslibs-dev xxhash \
zlib1g-dev
echo "##[endgroup]"
}
function freebsd() {
export ASSUME_ALWAYS_YES="YES"
echo "##[group]Install Development Tools"
sudo pkg install -y autoconf automake autotools base64 checkbashisms fio \
gdb gettext gettext-runtime git gmake gsed jq ksh lcov libtool lscpu \
pkgconf python python3 pamtester pamtester qemu-guest-agent rsync xxhash
sudo pkg install -xy \
'^samba4[[:digit:]]+$' \
'^py3[[:digit:]]+-cffi$' \
'^py3[[:digit:]]+-sysctl$' \
'^py3[[:digit:]]+-setuptools$' \
'^py3[[:digit:]]+-packaging$'
echo "##[endgroup]"
}
# common packages for: almalinux, centos, redhat
function rhel() {
echo "##[group]Running dnf update"
echo "max_parallel_downloads=10" | sudo -E tee -a /etc/dnf/dnf.conf
sudo dnf clean all
sudo dnf update -y --setopt=fastestmirror=1 --refresh
echo "##[endgroup]"
echo "##[group]Install Development Tools"
# Alma wants "Development Tools", Fedora 41 wants "development-tools"
if ! sudo dnf group install -y "Development Tools" ; then
echo "Trying 'development-tools' instead of 'Development Tools'"
sudo dnf group install -y development-tools
fi
sudo dnf install -y \
acl attr bc bzip2 cryptsetup curl dbench dkms elfutils-libelf-devel fio \
gdb git jq kernel-rpm-macros ksh libacl-devel libaio-devel \
libargon2-devel libattr-devel libblkid-devel libcurl-devel libffi-devel \
ncompress libselinux-devel libtirpc-devel libtool libudev-devel \
libuuid-devel lsscsi mdadm nfs-utils openssl-devel pam-devel pamtester \
parted perf python3 python3-cffi python3-devel python3-packaging \
kernel-devel python3-setuptools qemu-guest-agent rng-tools rpcgen \
rpm-build rsync samba strace sysstat systemd watchdog wget xfsprogs-devel \
xxhash zlib-devel
# These are needed for building Lustre. We only install these on EL VMs since
# we don't plan to test build Lustre on other platforms.
sudo dnf install -y libnl3-devel libyaml-devel libmount-devel
echo "##[endgroup]"
}
function tumbleweed() {
echo "##[group]Running zypper is TODO!"
sleep 23456
echo "##[endgroup]"
}
# $1: Kernel version to install (like '6.14rc7')
function install_fedora_experimental_kernel {
our_version="$1"
sudo dnf -y copr enable @kernel-vanilla/stable
sudo dnf -y copr enable @kernel-vanilla/mainline
all="$(sudo dnf list --showduplicates kernel-* python3-perf* perf* bpftool*)"
echo "Available versions:"
echo "$all"
# You can have a bunch of minor variants of the version we want '6.14'.
# Pick the newest variant (sorted by version number).
specific_version=$(echo "$all" | grep $our_version | awk '{print $2}' | sort -V | tail -n 1)
list="$(echo "$all" | grep $specific_version | grep -Ev 'kernel-rt|kernel-selftests|kernel-debuginfo' | sed 's/.x86_64//g' | awk '{print $1"-"$2}')"
sudo dnf install -y $list
sudo dnf -y copr disable @kernel-vanilla/stable
sudo dnf -y copr disable @kernel-vanilla/mainline
}
# Install dependencies
case "$1" in
almalinux8)
echo "##[group]Enable epel and powertools repositories"
sudo dnf config-manager -y --set-enabled powertools
sudo dnf install -y epel-release
echo "##[endgroup]"
rhel
echo "##[group]Install kernel-abi-whitelists"
sudo dnf install -y kernel-abi-whitelists
echo "##[endgroup]"
;;
almalinux9|almalinux10|centos-stream9|centos-stream10)
echo "##[group]Enable epel and crb repositories"
sudo dnf config-manager -y --set-enabled crb
sudo dnf install -y epel-release
echo "##[endgroup]"
rhel
echo "##[group]Install kernel-abi-stablelists"
sudo dnf install -y kernel-abi-stablelists
echo "##[endgroup]"
;;
alpine*)
alpine
;;
archlinux)
archlinux
;;
debian*)
echo 'debconf debconf/frontend select Noninteractive' | sudo debconf-set-selections
debian
echo "##[group]Install Debian specific"
sudo apt-get install -yq linux-perf dh-sequence-dkms
echo "##[endgroup]"
;;
fedora*)
rhel
sudo dnf install -y libunwind-devel
# Fedora 42+ moves /usr/bin/script from 'util-linux' to 'util-linux-script'
sudo dnf install -y util-linux-script || true
# Optional: Install an experimental kernel ($2 = kernel version)
if [ -n "${2:-}" ] ; then
install_fedora_experimental_kernel "$2"
fi
;;
freebsd*)
freebsd
;;
tumbleweed)
tumbleweed
;;
ubuntu*)
debian
echo "##[group]Install Ubuntu specific"
sudo apt-get install -yq linux-tools-common libtirpc-dev \
linux-modules-extra-$(uname -r)
sudo apt-get install -yq dh-sequence-dkms
echo "##[endgroup]"
echo "##[group]Delete Ubuntu OpenZFS modules"
for i in $(find /lib/modules -name zfs -type d); do sudo rm -rvf $i; done
echo "##[endgroup]"
;;
esac
# This script is used for checkstyle + zloop deps also.
# Install only the needed packages and exit - when used this way.
test -z "${ONLY_DEPS:-}" || exit 0
# Start services
echo "##[group]Enable services"
case "$1" in
alpine*)
sudo -E rc-update add qemu-guest-agent
sudo -E rc-update add nfs
sudo -E rc-update add samba
sudo -E rc-update add dhcpcd
# Remove services related to cloud-init.
sudo -E rc-update del cloud-init default
sudo -E rc-update del cloud-final default
sudo -E rc-update del cloud-config default
;;
freebsd*)
# add virtio things
echo 'virtio_load="YES"' | sudo -E tee -a /boot/loader.conf
for i in balloon blk console random scsi; do
echo "virtio_${i}_load=\"YES\"" | sudo -E tee -a /boot/loader.conf
done
echo "fdescfs /dev/fd fdescfs rw 0 0" | sudo -E tee -a /etc/fstab
sudo -E mount /dev/fd
sudo -E touch /etc/zfs/exports
sudo -E sysrc mountd_flags="/etc/zfs/exports"
echo '[global]' | sudo -E tee /usr/local/etc/smb4.conf >/dev/null
sudo -E service nfsd enable
sudo -E service qemu-guest-agent enable
sudo -E service samba_server enable
;;
debian*|ubuntu*)
sudo -E systemctl enable nfs-kernel-server
sudo -E systemctl enable qemu-guest-agent
sudo -E systemctl enable smbd
;;
*)
# All other linux distros
sudo -E systemctl enable nfs-server
sudo -E systemctl enable qemu-guest-agent
sudo -E systemctl enable smb
;;
esac
echo "##[endgroup]"
# Setup Kernel cmdline
CMDLINE="console=tty0 console=ttyS0,115200n8"
CMDLINE="$CMDLINE selinux=0"
CMDLINE="$CMDLINE random.trust_cpu=on"
CMDLINE="$CMDLINE no_timer_check"
case "$1" in
almalinux*|centos*|fedora*)
GRUB_CFG="/boot/grub2/grub.cfg"
GRUB_MKCONFIG="grub2-mkconfig"
CMDLINE="$CMDLINE biosdevname=0 net.ifnames=0"
echo 'GRUB_SERIAL_COMMAND="serial --speed=115200"' \
| sudo tee -a /etc/default/grub >/dev/null
;;
ubuntu24)
GRUB_CFG="/boot/grub/grub.cfg"
GRUB_MKCONFIG="grub-mkconfig"
echo 'GRUB_DISABLE_OS_PROBER="false"' \
| sudo tee -a /etc/default/grub >/dev/null
;;
*)
GRUB_CFG="/boot/grub/grub.cfg"
GRUB_MKCONFIG="grub-mkconfig"
;;
esac
case "$1" in
alpine*|archlinux|freebsd*)
true
;;
*)
echo "##[group]Edit kernel cmdline"
sudo sed -i -e '/^GRUB_CMDLINE_LINUX/d' /etc/default/grub || true
echo "GRUB_CMDLINE_LINUX=\"$CMDLINE\"" \
| sudo tee -a /etc/default/grub >/dev/null
sudo $GRUB_MKCONFIG -o $GRUB_CFG
echo "##[endgroup]"
;;
esac
# reset cloud-init configuration and poweroff
sudo cloud-init clean --logs
sleep 2 && sudo poweroff &
exit 0
-28
View File
@@ -1,28 +0,0 @@
######################################################################
# 3) Wait for VM to boot from previous step and launch dependencies
# script on it.
#
# $1: OS name (like 'fedora41')
# $2: (optional) Experimental kernel version to install on fedora,
# like "6.14".
######################################################################
.github/workflows/scripts/qemu-wait-for-vm.sh vm0
# SPECIAL CASE:
#
# If the user passed in an experimental kernel version to test on Fedora,
# we need to update the kernel version in zfs's META file to allow the
# build to happen. We update our local copy of META here, since we know
# it will be rsync'd up in the next step.
if [ -n "${2:-}" ] ; then
sed -i -E 's/Linux-Maximum: .+/Linux-Maximum: 99.99/g' META
fi
scp .github/workflows/scripts/qemu-3-deps-vm.sh zfs@vm0:qemu-3-deps-vm.sh
PID=`pidof /usr/bin/qemu-system-x86_64`
ssh zfs@vm0 '$HOME/qemu-3-deps-vm.sh' "$@"
# wait for poweroff to succeed
tail --pid=$PID -f /dev/null
sleep 5 # avoid this: "error: Domain is already active"
rm -f $HOME/.ssh/known_hosts
@@ -1,396 +0,0 @@
#!/usr/bin/env bash
######################################################################
# 4) configure and build openzfs modules. This is run on the VMs.
#
# Usage:
#
# qemu-4-build-vm.sh OS [--enable-debug][--dkms][--patch-level NUM]
# [--poweroff][--release][--repo][--tarball]
#
# OS: OS name like 'fedora41'
# --enable-debug: Build RPMs with '--enable-debug' (for testing)
# --dkms: Build DKMS RPMs as well
# --patch-level NUM: Use a custom patch level number for packages.
# --poweroff: Power-off the VM after building
# --release Build zfs-release*.rpm as well
# --repo After building everything, copy RPMs into /tmp/repo
# in the ZFS RPM repository file structure. Also
# copy tarballs if they were built.
# --tarball: Also build a tarball of ZFS source
######################################################################
ENABLE_DEBUG=""
DKMS=""
PATCH_LEVEL=""
POWEROFF=""
RELEASE=""
REPO=""
TARBALL=""
while [[ $# -gt 0 ]]; do
case $1 in
--enable-debug)
ENABLE_DEBUG=1
shift
;;
--dkms)
DKMS=1
shift
;;
--patch-level)
PATCH_LEVEL=$2
shift
shift
;;
--poweroff)
POWEROFF=1
shift
;;
--release)
RELEASE=1
shift
;;
--repo)
REPO=1
shift
;;
--tarball)
TARBALL=1
shift
;;
*)
OS=$1
shift
;;
esac
done
set -eu
function run() {
LOG="/var/tmp/build-stderr.txt"
echo "****************************************************"
echo "$(date) ($*)"
echo "****************************************************"
($@ || echo $? > /tmp/rv) 3>&1 1>&2 2>&3 | stdbuf -eL -oL tee -a $LOG
if [ -f /tmp/rv ]; then
RV=$(cat /tmp/rv)
echo "****************************************************"
echo "exit with value=$RV ($*)"
echo "****************************************************"
echo 1 > /var/tmp/build-exitcode.txt
exit $RV
fi
}
# Look at the RPMs in the current directory and copy/move them to
# /tmp/repo, using the directory structure we use for the ZFS RPM repos.
#
# For example:
# /tmp/repo/epel-testing/9.5
# /tmp/repo/epel-testing/9.5/SRPMS
# /tmp/repo/epel-testing/9.5/SRPMS/zfs-2.3.99-1.el9.src.rpm
# /tmp/repo/epel-testing/9.5/SRPMS/zfs-kmod-2.3.99-1.el9.src.rpm
# /tmp/repo/epel-testing/9.5/kmod
# /tmp/repo/epel-testing/9.5/kmod/x86_64
# /tmp/repo/epel-testing/9.5/kmod/x86_64/debug
# /tmp/repo/epel-testing/9.5/kmod/x86_64/debug/kmod-zfs-debuginfo-2.3.99-1.el9.x86_64.rpm
# /tmp/repo/epel-testing/9.5/kmod/x86_64/debug/libnvpair3-debuginfo-2.3.99-1.el9.x86_64.rpm
# /tmp/repo/epel-testing/9.5/kmod/x86_64/debug/libuutil3-debuginfo-2.3.99-1.el9.x86_64.rpm
# ...
function copy_rpms_to_repo {
# Pick a RPM to query. It doesn't matter which one - we just want to extract
# the 'Build Host' value from it.
rpm=$(ls zfs-*.rpm | head -n 1)
# Get zfs version '2.2.99'
zfs_ver=$(rpm -qpi $rpm | awk '/Version/{print $3}')
# Get "2.1" or "2.2"
zfs_major=$(echo $zfs_ver | grep -Eo [0-9]+\.[0-9]+)
# Get 'almalinux9.5' or 'fedora41' type string
build_host=$(rpm -qpi $rpm | awk '/Build Host/{print $4}')
# Get '9.5' or '41' OS version
os_ver=$(echo $build_host | grep -Eo '[0-9\.]+$')
# Our ZFS version and OS name will determine which repo the RPMs
# will go in (regular or testing). Fedora always gets the newest
# releases, and Alma gets the older releases.
case $build_host in
almalinux*)
case $zfs_major in
2.2)
d="epel"
;;
*)
d="epel-testing"
;;
esac
;;
fedora*)
d="fedora"
;;
esac
prefix=/tmp/repo
dst="$prefix/$d/$os_ver"
# Special case: move zfs-release*.rpm out of the way first (if we built them).
# This will make filtering the other RPMs easier.
mkdir -p $dst
mv zfs-release*.rpm $dst || true
# Copy source RPMs
mkdir -p $dst/SRPMS
cp $(ls *.src.rpm) $dst/SRPMS/
if [[ "$build_host" =~ "almalinux" ]] ; then
# Copy kmods+userspace
mkdir -p $dst/kmod/x86_64/debug
cp $(ls *.rpm | grep -Ev 'src.rpm|dkms|debuginfo') $dst/kmod/x86_64
cp *debuginfo*.rpm $dst/kmod/x86_64/debug
fi
if [ -n "$DKMS" ] ; then
# Copy dkms+userspace
mkdir -p $dst/x86_64
cp $(ls *.rpm | grep -Ev 'src.rpm|kmod|debuginfo') $dst/x86_64
fi
# Copy debug
mkdir -p $dst/x86_64/debug
cp $(ls *debuginfo*.rpm | grep -v kmod) $dst/x86_64/debug
}
function freebsd() {
extra="${1:-}"
export MAKE="gmake"
echo "##[group]Autogen.sh"
run ./autogen.sh
echo "##[endgroup]"
echo "##[group]Configure"
run ./configure \
--prefix=/usr/local \
--with-libintl-prefix=/usr/local \
--enable-pyzfs \
--enable-debuginfo $extra
echo "##[endgroup]"
echo "##[group]Build"
run gmake -j$(nproc)
echo "##[endgroup]"
echo "##[group]Install"
run sudo gmake install
echo "##[endgroup]"
}
function linux() {
extra="${1:-}"
echo "##[group]Autogen.sh"
run ./autogen.sh
echo "##[endgroup]"
echo "##[group]Configure"
run ./configure \
--prefix=/usr \
--enable-pyzfs \
--enable-debuginfo $extra
echo "##[endgroup]"
echo "##[group]Build"
run make -j$(nproc)
echo "##[endgroup]"
echo "##[group]Install"
run sudo make install
echo "##[endgroup]"
}
function rpm_build_and_install() {
extra="${1:-}"
# Build RPMs with XZ compression by default (since gzip decompression is slow)
echo "%_binary_payload w7.xzdio" >> ~/.rpmmacros
echo "##[group]Autogen.sh"
run ./autogen.sh
echo "##[endgroup]"
if [ -n "$PATCH_LEVEL" ] ; then
sed -i -E 's/(Release:\s+)1/\1'$PATCH_LEVEL'/g' META
fi
echo "##[group]Configure"
run ./configure --enable-debuginfo $extra
echo "##[endgroup]"
echo "##[group]Build"
run make pkg-kmod pkg-utils
echo "##[endgroup]"
if [ -n "$DKMS" ] ; then
echo "##[group]DKMS"
make rpm-dkms
echo "##[endgroup]"
fi
if [ -n "$REPO" ] ; then
echo "Skipping install since we're only building RPMs and nothing else"
else
echo "##[group]Install"
run sudo dnf -y --nobest install $(ls *.rpm | grep -Ev 'dkms|src.rpm')
echo "##[endgroup]"
fi
# Optionally build the zfs-release.*.rpm
if [ -n "$RELEASE" ] ; then
echo "##[group]Release"
pushd ~
sudo dnf -y install rpm-build || true
# Check out a sparse copy of zfsonlinux.github.com.git so we don't get
# all the binaries. We just need a few kilobytes of files to build RPMs.
git clone --depth 1 --no-checkout \
https://github.com/zfsonlinux/zfsonlinux.github.com.git
cd zfsonlinux.github.com
git sparse-checkout set zfs-release
git checkout
cd zfs-release
mkdir -p ~/rpmbuild/{BUILDROOT,SPECS,RPMS,SRPMS,SOURCES,BUILD}
cp RPM-GPG-KEY-openzfs* *.repo ~/rpmbuild/SOURCES
cp zfs-release.spec ~/rpmbuild/SPECS/
rpmbuild -ba ~/rpmbuild/SPECS/zfs-release.spec
# ZFS release RPMs are built. Copy them to the ~/zfs directory just to
# keep all the RPMs in the same place.
cp ~/rpmbuild/RPMS/noarch/*.rpm ~/zfs
cp ~/rpmbuild/SRPMS/*.rpm ~/zfs
popd
rm -fr ~/rpmbuild
echo "##[endgroup]"
fi
if [ -n "$REPO" ] ; then
echo "##[group]Repo"
copy_rpms_to_repo
echo "##[endgroup]"
fi
}
function deb_build_and_install() {
extra="${1:-}"
echo "##[group]Autogen.sh"
run ./autogen.sh
echo "##[endgroup]"
echo "##[group]Configure"
run ./configure \
--prefix=/usr \
--enable-pyzfs \
--enable-debuginfo $extra
echo "##[endgroup]"
echo "##[group]Build"
run make native-deb-kmod native-deb-utils
echo "##[endgroup]"
echo "##[group]Install"
# Do kmod install. Note that when you build the native debs, the
# packages themselves are placed in parent directory '../' rather than
# in the source directory like the rpms are.
run sudo apt-get -y install $(find ../ | grep -E '\.deb$' \
| grep -Ev 'dkms|dracut')
echo "##[endgroup]"
}
function build_tarball {
if [ -n "$REPO" ] ; then
./autogen.sh
./configure --with-config=srpm
make dist
mkdir -p /tmp/repo/releases
# The tarball name is based off of 'Version' field in the META file.
mv *.tar.gz /tmp/repo/releases/
fi
}
# Debug: show kernel cmdline
if [ -f /proc/cmdline ] ; then
cat /proc/cmdline || true
fi
# Set our hostname to our OS name and version number. Specifically, we set the
# major and minor number so that when we query the Build Host field in the RPMs
# we build, we can see what specific version of Fedora/Almalinux we were using
# to build them. This is helpful for matching up KMOD versions.
#
# Examples:
#
# rhel8.10
# almalinux9.5
# fedora42
source /etc/os-release
if which hostnamectl &> /dev/null ; then
# Fedora 42+ use hostnamectl
sudo hostnamectl set-hostname "$ID$VERSION_ID"
sudo hostnamectl set-hostname --pretty "$ID$VERSION_ID"
else
sudo hostname "$ID$VERSION_ID"
fi
# save some sysinfo
uname -a > /var/tmp/uname.txt
cd $HOME/zfs
export PATH="$PATH:/sbin:/usr/sbin:/usr/local/sbin"
extra=""
if [ -n "$ENABLE_DEBUG" ] ; then
extra="--enable-debug"
fi
# build
case "$OS" in
freebsd*)
freebsd "$extra"
;;
alma*|centos*)
rpm_build_and_install "--with-spec=redhat $extra"
;;
fedora*)
rpm_build_and_install "$extra"
# Historically, we've always built the release tarballs on Fedora, since
# there was one instance long ago where we built them on CentOS 7, and they
# didn't work correctly for everyone.
if [ -n "$TARBALL" ] ; then
build_tarball
fi
;;
debian*|ubuntu*)
deb_build_and_install "$extra"
;;
*)
linux "$extra"
;;
esac
# building the zfs module was ok
echo 0 > /var/tmp/build-exitcode.txt
# reset cloud-init configuration and poweroff
if [ -n "$POWEROFF" ] ; then
sudo cloud-init clean --logs
sync && sleep 2 && sudo poweroff &
fi
exit 0
-11
View File
@@ -1,11 +0,0 @@
#!/usr/bin/env bash
######################################################################
# 4) configure and build openzfs modules
######################################################################
echo "Build modules in QEMU machine"
# Bring our VM back up and copy over ZFS source
.github/workflows/scripts/qemu-prepare-for-build.sh
ssh zfs@vm0 '$HOME/zfs/.github/workflows/scripts/qemu-4-build-vm.sh' $@
-145
View File
@@ -1,145 +0,0 @@
#!/usr/bin/env bash
######################################################################
# 5) start test machines and load openzfs module
######################################################################
set -eu
# read our defined variables
source /var/tmp/env.txt
# wait for poweroff to succeed
PID=$(pidof /usr/bin/qemu-system-x86_64)
tail --pid=$PID -f /dev/null
sudo virsh undefine --nvram openzfs
# cpu pinning
CPUSET=("0,1" "2,3")
# additional options for virt-install
OPTS[0]=""
OPTS[1]=""
case "$OS" in
freebsd*)
# FreeBSD needs only 6GiB
RAM=6
;;
debian13)
RAM=8
# Boot Debian 13 with uefi=on and secureboot=off (ZFS Kernel Module not signed)
OPTS[0]="--boot"
OPTS[1]="firmware=efi,firmware.feature0.name=secure-boot,firmware.feature0.enabled=no"
;;
*)
# Linux needs more memory, but can be optimized to share it via KSM
RAM=8
;;
esac
# create snapshot we can clone later
sudo zfs snapshot zpool/openzfs@now
# setup the testing vm's
PUBKEY=$(cat ~/.ssh/id_ed25519.pub)
# start testing VMs
for ((i=1; i<=VMs; i++)); do
echo "Creating disk for vm$i..."
DISK="/dev/zvol/zpool/vm$i"
FORMAT="raw"
sudo zfs clone zpool/openzfs@now zpool/vm$i-system
sudo zfs create -ps -b 64k -V 64g zpool/vm$i-tests
cat <<EOF > /tmp/user-data
#cloud-config
fqdn: vm$i
users:
- name: root
shell: /bin/bash
sudo: ['ALL=(ALL) NOPASSWD:ALL']
- name: zfs
shell: /bin/bash
sudo: ['ALL=(ALL) NOPASSWD:ALL']
ssh_authorized_keys:
- $PUBKEY
# Workaround for Alpine Linux.
lock_passwd: false
passwd: '*'
packages:
- sudo
- bash
growpart:
mode: auto
devices: ['/']
ignore_growroot_disabled: false
EOF
sudo virsh net-update default add ip-dhcp-host \
"<host mac='52:54:00:83:79:0$i' ip='192.168.122.1$i'/>" --live --config
sudo virt-install \
--os-variant $OSv \
--name "vm$i" \
--cpu host-passthrough \
--virt-type=kvm --hvm \
--vcpus=$CPU,sockets=1 \
--cpuset=${CPUSET[$((i-1))]} \
--memory $((1024*RAM)) \
--memballoon model=virtio \
--graphics none \
--cloud-init user-data=/tmp/user-data \
--network bridge=virbr0,model=$NIC,mac="52:54:00:83:79:0$i" \
--disk $DISK-system,bus=virtio,cache=none,format=$FORMAT,driver.discard=unmap \
--disk $DISK-tests,bus=virtio,cache=none,format=$FORMAT,driver.discard=unmap \
--import --noautoconsole ${OPTS[0]} ${OPTS[1]}
done
# generate some memory stats
cat <<EOF > cronjob.sh
exec 1>>/var/tmp/stats.txt
exec 2>&1
echo "********************************************************************************"
uptime
free -m
zfs list
EOF
sudo chmod +x cronjob.sh
sudo mv -f cronjob.sh /root/cronjob.sh
echo '*/5 * * * * /root/cronjob.sh' > crontab.txt
sudo crontab crontab.txt
rm crontab.txt
# Save the VM's serial output (ttyS0) to /var/tmp/console.txt
# - ttyS0 on the VM corresponds to a local /dev/pty/N entry
# - use 'virsh ttyconsole' to lookup the /dev/pty/N entry
for ((i=1; i<=VMs; i++)); do
mkdir -p $RESPATH/vm$i
read "pty" <<< $(sudo virsh ttyconsole vm$i)
# Create the file so we can tail it, even if there's no output.
touch $RESPATH/vm$i/console.txt
sudo nohup bash -c "cat $pty > $RESPATH/vm$i/console.txt" &
# Write all VM boot lines to the console to aid in debugging failed boots.
# The boot lines from all the VMs will be munged together, so prepend each
# line with the vm hostname (like 'vm1:').
(while IFS=$'\n' read -r line; do echo "vm$i: $line" ; done < <(sudo tail -f $RESPATH/vm$i/console.txt)) &
done
echo "Console logging for ${VMs}x $OS started."
# check if the machines are okay
echo "Waiting for vm's to come up... (${VMs}x CPU=$CPU RAM=$RAM)"
for ((i=1; i<=VMs; i++)); do
.github/workflows/scripts/qemu-wait-for-vm.sh vm$i
done
echo "All $VMs VMs are up now."
@@ -1,51 +0,0 @@
#!/usr/bin/env bash
######################################################################
# 6) Test if Lustre can still build against ZFS
######################################################################
set -e
# Build from the latest Lustre tag rather than the master branch. We do this
# under the assumption that master is going to have a lot of churn thus will be
# more prone to breaking the build than a point release. We don't want ZFS
# PR's reporting bad test results simply because upstream Lustre accidentally
# broke their build.
#
# Skip any RC tags, or any tags where the last version digit is 50 or more.
# Versions with 50 or more are development versions of Lustre.
repo=https://github.com/lustre/lustre-release.git
tag="$(git ls-remote --refs --exit-code --sort=version:refname --tags $repo | \
awk -F '_' '/-RC/{next}; /refs\/tags\/v/{if ($NF < 50){print}}' | \
tail -n 1 | sed 's/.*\///')"
echo "Cloning Lustre tag $tag"
git clone --depth 1 --branch "$tag" "$repo"
cd lustre-release
# Include Lustre patches to build against master/zfs-2.4.x. Once these
# patches are merged we can remove these lines.
patches=('https://review.whamcloud.com/changes/fs%2Flustre-release~62101/revisions/2/patch?download'
'https://review.whamcloud.com/changes/fs%2Flustre-release~63267/revisions/9/patch?download')
for p in "${patches[@]}" ; do
curl $p | base64 -d > patch
patch -p1 < patch || true
done
echo "Configure Lustre"
./autogen.sh
# EL 9 needs '--disable-gss-keyring'
./configure --with-zfs --disable-gss-keyring
echo "Building Lustre RPMs"
make rpms
ls *.rpm
# There's only a handful of Lustre RPMs we actually need to install
lustrerpms="$(ls *.rpm | grep -E 'kmod-lustre-osd-zfs-[0-9]|kmod-lustre-[0-9]|lustre-osd-zfs-mount-[0-9]')"
echo "Installing: $lustrerpms"
sudo dnf -y install $lustrerpms
sudo modprobe -v lustre
# Should see some Lustre lines in dmesg
sudo dmesg | grep -Ei 'lnet|lustre'
-231
View File
@@ -1,231 +0,0 @@
#!/usr/bin/env bash
######################################################################
# 6) load openzfs module and run the tests
#
# called on runner: qemu-6-tests.sh
# called on qemu-vm: qemu-6-tests.sh $OS $2 $3 [--lustre|--builtin] [quick|default]
#
# --lustre: Test build lustre in addition to the normal tests
# --builtin: Test build ZFS as a kernel built-in in addition to the normal tests
######################################################################
set -eu
function prefix() {
ID="$1"
LINE="$2"
CURRENT=$(date +%s)
TSSTART=$(cat /tmp/tsstart)
DIFF=$((CURRENT-TSSTART))
H=$((DIFF/3600))
DIFF=$((DIFF-(H*3600)))
M=$((DIFF/60))
S=$((DIFF-(M*60)))
CTR=$(cat /tmp/ctr)
echo $LINE| grep -q '^\[.*] Test[: ]' && CTR=$((CTR+1)) && echo $CTR > /tmp/ctr
BASE="$HOME/work/zfs/zfs"
COLOR="$BASE/scripts/zfs-tests-color.sh"
CLINE=$(echo $LINE| grep '^\[.*] Test[: ]' \
| sed -e 's|^\[.*] Test|Test|g' \
| sed -e 's|/usr/local|/usr|g' \
| sed -e 's| /usr/share/zfs/zfs-tests/tests/| |g' | $COLOR)
if [ -z "$CLINE" ]; then
printf "vm${ID}: %s\n" "$LINE"
else
# [vm2: 00:15:54 256] Test: functional/checksum/setup (run as root) [00:00] [PASS]
printf "[vm${ID}: %02d:%02d:%02d %4d] %s\n" \
"$H" "$M" "$S" "$CTR" "$CLINE"
fi
}
function do_lustre_build() {
local rc=0
$HOME/zfs/.github/workflows/scripts/qemu-6-lustre-tests-vm.sh &> /var/tmp/lustre.txt || rc=$?
echo "$rc" > /var/tmp/lustre-exitcode.txt
if [ "$rc" != "0" ] ; then
echo "$rc" > /var/tmp/tests-exitcode.txt
fi
}
export -f do_lustre_build
# Test build ZFS into the kernel directly
function do_builtin_build() {
local rc=0
# Get currently full kernel version (like '6.18.8')
fullver=$(uname -r | grep -Eo '^[0-9]+\.[0-9]+\.[0-9]+')
# Get just the major ('6')
major=$(echo $fullver | grep -Eo '^[0-9]+')
(
set -e
wget https://cdn.kernel.org/pub/linux/kernel/v${major}.x/linux-$fullver.tar.xz
tar -xf $HOME/linux-$fullver.tar.xz
cd $HOME/linux-$fullver
make tinyconfig
./scripts/config --enable EFI_PARTITON
./scripts/config --enable BLOCK
# BTRFS_FS is easiest config option to enable CONFIG_ZLIB_INFLATE|DEFLATE
./scripts/config --enable BTRFS_FS
yes "" | make oldconfig
make prepare
cd $HOME/zfs
./configure --with-linux=$HOME/linux-$fullver --enable-linux-builtin --enable-debug
./copy-builtin $HOME/linux-$fullver
cd $HOME/linux-$fullver
./scripts/config --enable ZFS
yes "" | make oldconfig
make -j `nproc`
) &> /var/tmp/builtin.txt || rc=$?
echo "$rc" > /var/tmp/builtin-exitcode.txt
if [ "$rc" != "0" ] ; then
echo "$rc" > /var/tmp/tests-exitcode.txt
fi
}
export -f do_builtin_build
# called directly on the runner
if [ -z ${1:-} ]; then
cd "/var/tmp"
source env.txt
SSH=$(which ssh)
TESTS='$HOME/zfs/.github/workflows/scripts/qemu-6-tests.sh'
echo 0 > /tmp/ctr
date "+%s" > /tmp/tsstart
for ((i=1; i<=VMs; i++)); do
IP="192.168.122.1$i"
# We do an additional test build of Lustre against ZFS if we're vm2
# on almalinux*. At the time of writing, the vm2 tests were
# completing roughly 15min before the vm1 tests, so it makes sense
# to have vm2 do the build.
#
# In addition, we do an additional test build of ZFS as a Linux
# kernel built-in on Fedora. Again, we do it on vm2 to exploit vm2's
# early finish time.
extra=""
if [[ "$OS" == almalinux* ]] && [[ "$i" == "2" ]] ; then
extra="--lustre"
elif [[ "$OS" == fedora* ]] && [[ "$i" == "2" ]] ; then
extra="--builtin"
fi
daemonize -c /var/tmp -p vm${i}.pid -o vm${i}log.txt -- \
$SSH zfs@$IP $TESTS $OS $i $VMs $extra $CI_TYPE
# handly line by line and add info prefix
stdbuf -oL tail -fq vm${i}log.txt \
| while read -r line; do prefix "$i" "$line"; done &
echo $! > vm${i}log.pid
# don't mix up the initial --- Configuration --- part
sleep 0.13
done
# wait for all vm's to finish
for ((i=1; i<=VMs; i++)); do
tail --pid=$(cat vm${i}.pid) -f /dev/null
pid=$(cat vm${i}log.pid)
rm -f vm${i}log.pid
kill $pid
done
exit 0
fi
#############################################
# Everything from here on runs inside qemu vm
#############################################
# Process cmd line args
OS="$1"
shift
NUM="$1"
shift
DEN="$1"
shift
BUILD_LUSTRE=0
BUILD_BUILTIN=0
if [ "$1" == "--lustre" ] ; then
BUILD_LUSTRE=1
shift
elif [ "$1" == "--builtin" ] ; then
BUILD_BUILTIN=1
shift
fi
if [ "$1" == "quick" ] ; then
export RUNFILES="sanity.run"
fi
export PATH="$PATH:/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/sbin:/usr/local/bin"
case "$OS" in
freebsd*)
TDIR="/usr/local/share/zfs"
sudo kldstat -n zfs 2>/dev/null && sudo kldunload zfs
sudo -E ./zfs/scripts/zfs.sh
sudo mv -f /var/tmp/*.txt /tmp
sudo newfs -U -t -L tmp /dev/vtbd1 >/dev/null
sudo mount -o noatime /dev/vtbd1 /var/tmp
sudo chmod 1777 /var/tmp
sudo mv -f /tmp/*.txt /var/tmp
;;
*)
# use xfs @ /var/tmp for all distros
TDIR="/usr/share/zfs"
sudo -E modprobe zfs
sudo mv -f /var/tmp/*.txt /tmp
sudo mkfs.xfs -fq /dev/vdb
sudo mount -o noatime /dev/vdb /var/tmp
sudo chmod 1777 /var/tmp
sudo mv -f /tmp/*.txt /var/tmp
;;
esac
# Distribution-specific settings.
case "$OS" in
almalinux9|almalinux10|centos-stream*)
# Enable io_uring on Enterprise Linux 9 and 10.
sudo sysctl kernel.io_uring_disabled=0 > /dev/null
;;
alpine*)
# Ensure `/etc/zfs/zpool.cache` exists.
sudo mkdir -p /etc/zfs
sudo touch /etc/zfs/zpool.cache
sudo chmod 644 /etc/zfs/zpool.cache
;;
esac
# Lustre calls a number of exported ZFS module symbols. To make sure we don't
# change the symbols and break Lustre, do a quick Lustre build of the latest
# released Lustre against ZFS.
#
# Note that we do the Lustre test build in parallel with ZTS. ZTS isn't very
# CPU intensive, so we can use idle CPU cycles "guilt free" for the build.
# The Lustre build on its own takes ~15min.
if [ "$BUILD_LUSTRE" == "1" ] ; then
do_lustre_build &
elif [ "$BUILD_BUILTIN" == "1" ] ; then
# Try building ZFS directly into the Linux kernel (not as a module)
do_builtin_build &
fi
# run functional testings and save exitcode
cd /var/tmp
TAGS=$NUM/$DEN
sudo dmesg -c > dmesg-prerun.txt
mount > mount.txt
df -h > df-prerun.txt
$TDIR/zfs-tests.sh -vKO -s 3GB -T $TAGS
RV=$?
df -h > df-postrun.txt
echo $RV > tests-exitcode.txt
sync
exit 0
-124
View File
@@ -1,124 +0,0 @@
#!/usr/bin/env bash
######################################################################
# 7) prepare output of the results
# - this script pre-creates all needed logfiles for later summary
######################################################################
set -eu
# read our defined variables
cd /var/tmp
source env.txt
mkdir -p $RESPATH
# check if building the module has failed
if [ -z ${VMs:-} ]; then
cd $RESPATH
echo ":exclamation: ZFS module didn't build successfully :exclamation:" \
| tee summary.txt | tee /tmp/summary.txt
cp /var/tmp/*.txt .
tar cf /tmp/qemu-$OS.tar -C $RESPATH -h . || true
exit 0
fi
# build was okay
BASE="$HOME/work/zfs/zfs"
MERGE="$BASE/.github/workflows/scripts/merge_summary.awk"
# catch result files of testings (vm's should be there)
for ((i=1; i<=VMs; i++)); do
rsync -arL zfs@vm$i:$RESPATH/current $RESPATH/vm$i || true
scp zfs@vm$i:"/var/tmp/*.txt" $RESPATH/vm$i || true
scp zfs@vm$i:"/var/tmp/*.rpm" $RESPATH/vm$i || true
done
cp -f /var/tmp/*.txt $RESPATH || true
cd $RESPATH
# prepare result files for summary
for ((i=1; i<=VMs; i++)); do
file="vm$i/build-stderr.txt"
test -s $file && mv -f $file build-stderr.txt
file="vm$i/build-exitcode.txt"
test -s $file && mv -f $file build-exitcode.txt
file="vm$i/uname.txt"
test -s $file && mv -f $file uname.txt
file="vm$i/tests-exitcode.txt"
if [ ! -s $file ]; then
# XXX - add some tests for kernel panic's here
# tail -n 80 vm$i/console.txt | grep XYZ
echo 1 > $file
fi
rv=$(cat vm$i/tests-exitcode.txt)
test $rv != 0 && touch /tmp/have_failed_tests
file="vm$i/current/log"
if [ -s $file ]; then
cat $file >> log
awk '/\[FAIL\]|\[KILLED\]/{ show=1; print; next; }; \
/\[SKIP\]|\[PASS\]/{ show=0; } show' \
$file > /tmp/vm${i}dbg.txt
fi
file="vm${i}log.txt"
fileC="/tmp/vm${i}log.txt"
if [ -s $file ]; then
cat $file >> summary
cat $file | $BASE/scripts/zfs-tests-color.sh > $fileC
fi
done
# create summary of tests
if [ -s summary ]; then
$MERGE summary | grep -v '^/' > summary.txt
$MERGE summary | $BASE/scripts/zfs-tests-color.sh > /tmp/summary.txt
rm -f summary
else
touch summary.txt /tmp/summary.txt
fi
# create file for debugging
if [ -s log ]; then
awk '/\[FAIL\]|\[KILLED\]/{ show=1; print; next; }; \
/\[SKIP\]|\[PASS\]/{ show=0; } show' \
log > summary-failure-logs.txt
rm -f log
else
touch summary-failure-logs.txt
fi
# create debug overview for failed tests
cat summary.txt \
| awk '/\(expected PASS\)/{ if ($1!="SKIP") print $2; next; } show' \
| while read t; do
cat summary-failure-logs.txt \
| awk '$0~/Test[: ]/{ show=0; } $0~v{ show=1; } show' v="$t" \
> /tmp/fail.txt
SIZE=$(stat --printf="%s" /tmp/fail.txt)
SIZE=$((SIZE/1024))
# Test Summary:
echo "##[group]$t ($SIZE KiB)" >> /tmp/failed.txt
cat /tmp/fail.txt | $BASE/scripts/zfs-tests-color.sh >> /tmp/failed.txt
echo "##[endgroup]" >> /tmp/failed.txt
# Job Summary:
echo -e "\n<details>\n<summary>$t ($SIZE KiB)</summary><pre>" >> failed.txt
cat /tmp/fail.txt >> failed.txt
echo "</pre></details>" >> failed.txt
done
if [ -e /tmp/have_failed_tests ]; then
echo ":warning: Some tests failed!" >> failed.txt
else
echo ":thumbsup: All tests passed." >> failed.txt
fi
if [ ! -s uname.txt ]; then
echo ":interrobang: Panic - where is my uname.txt?" > uname.txt
fi
# artifact ready now
tar cf /tmp/qemu-$OS.tar -C $RESPATH -h . || true
-103
View File
@@ -1,103 +0,0 @@
#!/usr/bin/env bash
######################################################################
# 8) show colored output of results
######################################################################
set -eu
# read our defined variables
source /var/tmp/env.txt
cd $RESPATH
# helper function for showing some content with headline
function showfile() {
content=$(dd if=$1 bs=1024 count=400k 2>/dev/null)
if [ -z "$2" ]; then
group1=""
group2=""
else
SIZE=$(stat --printf="%s" "$file")
SIZE=$((SIZE/1024))
group1="##[group]$2 ($SIZE KiB)"
group2="##[endgroup]"
fi
cat <<EOF > tmp$$
$group1
$content
$group2
EOF
cat tmp$$
rm -f tmp$$
}
function showfile_tail() {
echo "##[group]$2 (final lines)"
tail -n 80 $1
echo "##[endgroup]"
}
# overview
cat /tmp/summary.txt
echo ""
if [ -f /tmp/have_failed_tests -a -s /tmp/failed.txt ]; then
echo "Debuginfo of failed tests:"
cat /tmp/failed.txt
echo ""
cat /tmp/summary.txt | grep -v '^/'
echo ""
fi
echo -e "\nFull logs for download:\n $1\n"
for ((i=1; i<=VMs; i++)); do
# Print Lustre build test results (the build is only done on vm2)
if [ -f vm$i/lustre-exitcode.txt ] ; then
rv=$(< vm$i/lustre-exitcode.txt)
if [ $rv = 0 ]; then
vm="vm$i"
else
vm="vm$i"
touch /tmp/have_failed_tests
fi
file="vm$i/lustre.txt"
test -s "$file" && showfile_tail "$file" "$vm: Lustre build"
fi
if [ -f vm$i/builtin-exitcode.txt ] ; then
rv=$(< vm$i/builtin-exitcode.txt)
if [ $rv = 0 ]; then
vm="vm$i"
else
vm="vm$i"
touch /tmp/have_failed_tests
fi
file="vm$i/builtin.txt"
test -s "$file" && showfile_tail "$file" "$vm: Linux built-in build"
fi
rv=$(cat vm$i/tests-exitcode.txt)
if [ $rv = 0 ]; then
vm="vm$i"
else
vm="vm$i"
fi
file="vm$i/dmesg-prerun.txt"
test -s "$file" && showfile "$file" "$vm: dmesg kernel"
file="/tmp/vm${i}log.txt"
test -s "$file" && showfile "$file" "$vm: test results"
file="vm$i/console.txt"
test -s "$file" && showfile "$file" "$vm: serial console"
file="/tmp/vm${i}dbg.txt"
test -s "$file" && showfile "$file" "$vm: failure logfile"
done
test -f /tmp/have_failed_tests && exit 1
exit 0
@@ -1,57 +0,0 @@
#!/usr/bin/env bash
######################################################################
# 9) generate github summary page of all the testings
######################################################################
set -eu
function output() {
echo -e $* >> "out-$logfile.md"
}
function outfile() {
cat "$1" >> "out-$logfile.md"
}
function outfile_plain() {
output "<pre>"
cat "$1" >> "out-$logfile.md"
output "</pre>"
}
function send2github() {
test -f "$1" || exit 0
dd if="$1" bs=1023k count=1 >> $GITHUB_STEP_SUMMARY
}
# https://docs.github.com/en/enterprise-server@3.6/actions/using-workflows/workflow-commands-for-github-actions#step-isolation-and-limits
# Job summaries are isolated between steps and each step is restricted to a maximum size of 1MiB.
# [ ] can not show all error findings here
# [x] split files into smaller ones and create additional steps
# first call, generate all summaries
if [ ! -f out-1.md ]; then
logfile="1"
for tarfile in Logs-functional-*/qemu-*.tar; do
rm -rf vm* *.txt
if [ ! -s "$tarfile" ]; then
output "\n## Functional Tests: unknown\n"
output ":exclamation: Tarfile $tarfile is empty :exclamation:"
continue
fi
tar xf "$tarfile"
test -s env.txt || continue
source env.txt
# when uname.txt is there, the other files are also ok
test -s uname.txt || continue
output "\n## Functional Tests: $OSNAME\n"
outfile_plain uname.txt
outfile_plain summary.txt
outfile failed.txt
logfile=$((logfile+1))
done
send2github out-1.md
else
send2github out-$1.md
fi
@@ -1,8 +0,0 @@
#!/usr/bin/env bash
# Helper script to run after installing dependencies. This brings the VM back
# up and copies over the zfs source directory.
echo "Build modules in QEMU machine"
sudo virsh start openzfs
.github/workflows/scripts/qemu-wait-for-vm.sh vm0
rsync -ar $HOME/work/zfs/zfs zfs@vm0:./
@@ -1,115 +0,0 @@
#!/bin/bash
#
# Do a test install of ZFS from an external repository.
#
# USAGE:
#
# ./qemu-test-repo-vm [--install] [URL]
#
# --lookup: When testing a repo, only lookup the latest package versions,
# don't try to install them. Installing all of them takes over
# an hour, so this is much quicker.
#
# URL: URL to use instead of http://download.zfsonlinux.org
# If blank, use the default repo from zfs-release RPM.
set -e
source /etc/os-release
OS="$ID"
VERSION="$VERSION_ID"
LOOKUP=""
if [ -n "$1" ] && [ "$1" == "--lookup" ] ; then
LOOKUP=1
shift
fi
ALTHOST=""
if [ -n "$1" ] ; then
ALTHOST="$1"
fi
# Write summary to /tmp/repo so our artifacts scripts pick it up
mkdir /tmp/repo
SUMMARY=/tmp/repo/$OS-$VERSION-summary.txt
# $1: Repo 'zfs' 'zfs-kmod' 'zfs-testing' 'zfs-testing-kmod'
# $2: (optional) Alternate host than 'http://download.zfsonlinux.org' to
# install from. Blank means use default from zfs-release RPM.
function test_install {
repo=$1
host=""
if [ -n "$2" ] ; then
host=$2
fi
args="--disablerepo=zfs --enablerepo=$repo"
# If we supplied an alternate repo URL, and have not already edited
# zfs.repo, then update the repo file.
if [ -n "$host" ] && ! grep -q $host /etc/yum.repos.d/zfs.repo ; then
sudo sed -i "s;baseurl=http://download.zfsonlinux.org;baseurl=$host;g" /etc/yum.repos.d/zfs.repo
fi
baseurl=$(grep -A 5 "\[$repo\]" /etc/yum.repos.d/zfs.repo | awk -F'=' '/baseurl=/{print $2; exit}')
# Just do a version lookup - don't try to install any RPMs
if [ "$LOOKUP" == "1" ] ; then
package="$(dnf list $args zfs | tail -n 1 | awk '{print $2}')"
echo "$repo ${package} $baseurl" >> $SUMMARY
return
fi
if ! sudo dnf -y install $args zfs zfs-test ; then
echo "$repo ${package}...[FAILED] $baseurl" >> $SUMMARY
return
fi
# Load modules and create a simple pool as a sanity test.
sudo /usr/share/zfs/zfs.sh -r
truncate -s 100M /tmp/file
sudo zpool create tank /tmp/file
sudo zpool status
# Print out repo name, rpm installed (kmod or dkms), and repo URL
package=$(sudo rpm -qa | grep zfs | grep -E 'kmod|dkms')
echo "$repo $package $baseurl" >> $SUMMARY
sudo zpool destroy tank
sudo rm /tmp/file
sudo dnf -y remove zfs
}
echo "##[group]Installing from repo"
# The openzfs docs are the authoritative instructions for the install. Use
# the specific version of zfs-release RPM it recommends.
case $OS in
almalinux*)
url='https://raw.githubusercontent.com/openzfs/openzfs-docs/refs/heads/master/docs/Getting%20Started/RHEL-based%20distro/index.rst'
name=$(curl -Ls $url | grep 'dnf install' | grep -Eo 'zfs-release-[0-9]+-[0-9]+')
sudo dnf -y install https://zfsonlinux.org/epel/$name$(rpm --eval "%{dist}").noarch.rpm 2>&1
sudo rpm -qi zfs-release
for i in zfs zfs-kmod zfs-testing zfs-testing-kmod zfs-latest \
zfs-latest-kmod zfs-legacy zfs-legacy-kmod zfs-2.2 \
zfs-2.2-kmod zfs-2.3 zfs-2.3-kmod zfs-2.4 zfs-2.4-kmod; do
test_install $i $ALTHOST
done
;;
fedora*)
url='https://raw.githubusercontent.com/openzfs/openzfs-docs/refs/heads/master/docs/Getting%20Started/Fedora/index.rst'
name=$(curl -Ls $url | grep 'dnf install' | grep -Eo 'zfs-release-[0-9]+-[0-9]+')
sudo dnf -y install -y https://zfsonlinux.org/fedora/$name$(rpm --eval "%{dist}").noarch.rpm
for i in zfs zfs-latest zfs-legacy zfs-2.2 zfs-2.3 zfs-2.4 ; do
test_install $i $ALTHOST
done
;;
esac
echo "##[endgroup]"
# Write out a simple version of the summary here. Later on we will collate all
# the summaries and put them into a nice table in the workflow Summary page.
echo "Summary: "
cat $SUMMARY
@@ -1,10 +0,0 @@
#!/bin/bash
#
# Wait for a VM to boot up and become active. This is used in a number of our
# scripts.
#
# $1: VM hostname or IP address
while pidof /usr/bin/qemu-system-x86_64 >/dev/null; do
ssh 2>/dev/null zfs@$1 "uname -a" && break
done
@@ -1,32 +0,0 @@
#!/bin/bash
#
# Recursively go though a directory structure and replace duplicate files with
# symlinks. This cuts down our RPM repo size by ~25%.
#
# replace-dupes-with-symlinks.sh [DIR]
#
# DIR: Directory to traverse. Defaults to current directory if not specified.
#
src="$1"
if [ -z "$src" ] ; then
src="."
fi
declare -A db
pushd "$src"
while read line ; do
bn="$(basename $line)"
if [ -z "${db[$bn]}" ] ; then
# First time this file has been seen
db[$bn]="$line"
else
if diff -b "$line" "${db[$bn]}" &>/dev/null ; then
# Files are the same, make a symlink
rm "$line"
ln -sr "${db[$bn]}" "$line"
fi
fi
done <<< "$(find . -type f)"
popd
-52
View File
@@ -1,52 +0,0 @@
name: smatch
on:
push:
pull_request:
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
jobs:
smatch:
runs-on: ubuntu-24.04
steps:
- name: Checkout smatch
uses: actions/checkout@v4
with:
repository: error27/smatch
ref: master
path: smatch
- name: Install smatch dependencies
run: |
sudo apt-get install -y llvm gcc make sqlite3 libsqlite3-dev libdbd-sqlite3-perl libssl-dev libtry-tiny-perl
- name: Make smatch
run: |
cd $GITHUB_WORKSPACE/smatch
make -j$(nproc)
- name: Checkout OpenZFS
uses: actions/checkout@v4
with:
ref: ${{ github.event.pull_request.head.sha }}
path: zfs
- name: Install OpenZFS dependencies
run: |
cd $GITHUB_WORKSPACE/zfs
sudo apt-get purge -y snapd google-chrome-stable firefox
ONLY_DEPS=1 .github/workflows/scripts/qemu-3-deps-vm.sh ubuntu24
- name: Autogen.sh OpenZFS
run: |
cd $GITHUB_WORKSPACE/zfs
./autogen.sh
- name: Configure OpenZFS
run: |
cd $GITHUB_WORKSPACE/zfs
./configure --enable-debug
- name: Make OpenZFS
run: |
cd $GITHUB_WORKSPACE/zfs
make -j$(nproc) CHECK="$GITHUB_WORKSPACE/smatch/smatch" CC=$GITHUB_WORKSPACE/smatch/cgcc | tee $GITHUB_WORKSPACE/smatch.log
- name: Smatch results log
run: |
grep -E 'error:|warn:|warning:' $GITHUB_WORKSPACE/smatch.log
-157
View File
@@ -1,157 +0,0 @@
# This workflow is used to build and test RPM packages. It is a
# 'workflow_dispatch' workflow, which means it gets run manually.
#
# The workflow has a dropdown menu with two options:
#
# Build RPMs - Build release RPMs and tarballs and put them into an artifact
# ZIP file. The directory structure used in the ZIP file mirrors
# the ZFS yum repo.
#
# Test repo - Test install the ZFS RPMs from the ZFS repo. On EL distos, this
# will do a DKMS and KMOD test install from both the regular and
# testing repos. On Fedora, it will do a DKMS install from the
# regular repo. All test install results will be displayed in the
# Summary page. Note that the workflow provides an optional text
# text box where you can specify the full URL to an alternate repo.
# If left blank, it will install from the default repo from the
# zfs-release RPM (http://download.zfsonlinux.org).
#
# Most users will never need to use this workflow. It will be used primary by
# ZFS admins for building and testing releases.
#
name: zfs-qemu-packages
on:
workflow_dispatch:
inputs:
test_type:
type: choice
required: false
default: "Build RPMs"
description: "Build RPMs or test the repo?"
options:
- "Build RPMs"
- "Test repo"
patch_level:
type: string
required: false
default: ""
description: "(optional) patch level number"
repo_url:
type: string
required: false
default: ""
description: "(optional) repo URL (blank: use http://download.zfsonlinux.org)"
lookup:
type: boolean
required: false
default: false
description: "(optional) do version lookup only on repo test"
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
jobs:
zfs-qemu-packages-jobs:
name: qemu-VMs
strategy:
fail-fast: false
matrix:
os: ['almalinux8', 'almalinux9', 'almalinux10', 'fedora41', 'fedora42', 'fedora43']
runs-on: ubuntu-24.04
steps:
- uses: actions/checkout@v4
with:
ref: ${{ github.event.pull_request.head.sha }}
- name: Setup QEMU
run: .github/workflows/scripts/qemu-1-setup.sh
- name: Start build machine
run: .github/workflows/scripts/qemu-2-start.sh ${{ matrix.os }}
- name: Install dependencies
run: |
.github/workflows/scripts/qemu-3-deps.sh ${{ matrix.os }}
- name: Build modules or Test repo
run: |
set -e
if [ "${{ github.event.inputs.test_type }}" == "Test repo" ] ; then
# Bring VM back up and copy over zfs source
.github/workflows/scripts/qemu-prepare-for-build.sh
mkdir -p /tmp/repo
EXTRA=""
if [ "${{ github.event.inputs.lookup }}" == 'true' ] ; then
EXTRA="--lookup"
fi
ssh zfs@vm0 '$HOME/zfs/.github/workflows/scripts/qemu-test-repo-vm.sh' $EXTRA ${{ github.event.inputs.repo_url }}
else
EXTRA=""
if [ -n "${{ github.event.inputs.patch_level }}" ] ; then
EXTRA="--patch-level ${{ github.event.inputs.patch_level }}"
fi
.github/workflows/scripts/qemu-4-build.sh $EXTRA \
--repo --release --dkms --tarball ${{ matrix.os }}
fi
- name: Prepare artifacts
if: always()
run: |
rsync -a zfs@vm0:/tmp/repo /tmp || true
.github/workflows/scripts/replace-dupes-with-symlinks.sh /tmp/repo
tar -cf ${{ matrix.os }}-repo.tar -C /tmp repo
- uses: actions/upload-artifact@v4
id: artifact-upload
if: always()
with:
name: ${{ matrix.os }}-repo
path: ${{ matrix.os }}-repo.tar
compression-level: 0
retention-days: 2
if-no-files-found: ignore
combine_repos:
if: always()
needs: [zfs-qemu-packages-jobs]
name: "Results"
runs-on: ubuntu-latest
steps:
- uses: actions/download-artifact@v4
id: artifact-download
if: always()
- name: Test Summary
if: always()
run: |
for i in $(find . -type f -iname "*.tar") ; do
tar -xf $i -C /tmp
done
tar -cf all-repo.tar -C /tmp repo
# If we're installing from a repo, print out the summary of the versions
# that got installed using Markdown.
if [ "${{ github.event.inputs.test_type }}" == "Test repo" ] ; then
cd /tmp/repo
for i in $(ls *.txt) ; do
nicename="$(echo $i | sed 's/.txt//g; s/-/ /g')"
echo "### $nicename" >> $GITHUB_STEP_SUMMARY
echo "|repo|RPM|URL|" >> $GITHUB_STEP_SUMMARY
echo "|:---|:---|:---|" >> $GITHUB_STEP_SUMMARY
awk '{print "|"$1"|"$2"|"$3"|"}' $i >> $GITHUB_STEP_SUMMARY
done
fi
- uses: actions/upload-artifact@v4
id: artifact-upload2
if: always()
with:
name: all-repo
path: all-repo.tar
compression-level: 0
retention-days: 5
if-no-files-found: ignore
-192
View File
@@ -1,192 +0,0 @@
name: zfs-qemu
on:
push:
pull_request:
workflow_dispatch:
inputs:
fedora_kernel_ver:
type: string
required: false
default: ""
description: "(optional) Experimental kernel version to install on Fedora (like '6.14' or '6.13.3-0.rc3')"
specific_os:
type: string
required: false
default: ""
description: "(optional) Only run on this specific OS (like 'fedora42' or 'alpine3-23')"
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
jobs:
test-config:
name: Setup
runs-on: ubuntu-24.04
outputs:
test_os: ${{ steps.os.outputs.os }}
ci_type: ${{ steps.os.outputs.ci_type }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Generate OS config and CI type
id: os
run: |
ci_type="default"
# determine CI type when running on PR
if ${{ github.event_name == 'pull_request' }}; then
head=${{ github.event.pull_request.head.sha }}
base=${{ github.event.pull_request.base.sha }}
ci_type=$(python3 .github/workflows/scripts/generate-ci-type.py $head $base)
fi
case "$ci_type" in
quick)
os_selection='["almalinux8", "almalinux9", "almalinux10", "debian12", "fedora42", "freebsd15-0s", "ubuntu24"]'
;;
linux)
os_selection='["almalinux8", "almalinux9", "almalinux10", "centos-stream9", "centos-stream10", "debian11", "debian12", "debian13", "fedora41", "fedora42", "fedora43", "ubuntu22", "ubuntu24"]'
;;
freebsd)
os_selection='["freebsd13-5r", "freebsd14-3r", "freebsd13-5s", "freebsd14-3s", "freebsd15-0s", "freebsd16-0c"]'
;;
*)
# default list
os_selection='["almalinux8", "almalinux9", "almalinux10", "centos-stream9", "centos-stream10", "debian12", "debian13", "fedora42", "fedora43", "freebsd14-3r", "freebsd15-0s", "freebsd16-0c", "ubuntu22", "ubuntu24"]'
;;
esac
if ${{ github.event.inputs.fedora_kernel_ver != '' }}; then
# They specified a custom kernel version for Fedora.
# Use only Fedora runners.
os_json=$(echo ${os_selection} | jq -c '[.[] | select(startswith("fedora"))]')
elif ${{ github.event.inputs.specific_os != '' }}; then
# Use only the specified runner.
os_json=$(jq -cn --arg os "${{ github.event.inputs.specific_os }}" '[ $os ]')
else
# Normal case
os_json=$(echo ${os_selection} | jq -c)
fi
echo "os=$os_json" | tee -a $GITHUB_OUTPUT
echo "ci_type=$ci_type" | tee -a $GITHUB_OUTPUT
qemu-vm:
name: qemu-x86
needs: [ test-config ]
strategy:
fail-fast: false
matrix:
# rhl: almalinux8, almalinux9, centos-streamX, fedora4x
# debian: debian12, debian13, ubuntu22, ubuntu24
# misc: archlinux, tumbleweed
# FreeBSD variants of november 2025:
# FreeBSD Release: freebsd13-5r, freebsd14-3r, freebsd15-0r
# FreeBSD Stable: freebsd13-5s, freebsd14-3s, freebsd15-0s
# FreeBSD Current: freebsd16-0c
os: ${{ fromJson(needs.test-config.outputs.test_os) }}
runs-on: ubuntu-24.04
steps:
- uses: actions/checkout@v4
with:
ref: ${{ github.event.pull_request.head.sha }}
- name: Setup QEMU
timeout-minutes: 60
run: .github/workflows/scripts/qemu-1-setup.sh
- name: Start build machine
timeout-minutes: 10
run: .github/workflows/scripts/qemu-2-start.sh ${{ matrix.os }}
- name: Install dependencies
timeout-minutes: 60
run: .github/workflows/scripts/qemu-3-deps.sh ${{ matrix.os }} ${{ github.event.inputs.fedora_kernel_ver }}
- name: Build modules
timeout-minutes: 30
run: .github/workflows/scripts/qemu-4-build.sh --poweroff --enable-debug ${{ matrix.os }}
- name: Setup testing machines
timeout-minutes: 5
run: .github/workflows/scripts/qemu-5-setup.sh
- name: Run tests
timeout-minutes: 270
run: .github/workflows/scripts/qemu-6-tests.sh
env:
CI_TYPE: ${{ needs.test-config.outputs.ci_type }}
- name: Prepare artifacts
if: always()
timeout-minutes: 10
run: .github/workflows/scripts/qemu-7-prepare.sh
- uses: actions/upload-artifact@v4
id: artifact-upload
if: always()
with:
name: Logs-functional-${{ matrix.os }}
path: /tmp/qemu-${{ matrix.os }}.tar
if-no-files-found: ignore
- name: Test Summary
if: always()
run: .github/workflows/scripts/qemu-8-summary.sh '${{ steps.artifact-upload.outputs.artifact-url }}'
cleanup:
if: always()
name: Cleanup
runs-on: ubuntu-latest
needs: [ qemu-vm ]
steps:
- uses: actions/checkout@v4
with:
ref: ${{ github.event.pull_request.head.sha }}
- uses: actions/download-artifact@v4
- name: Generating summary
run: .github/workflows/scripts/qemu-9-summary-page.sh
- name: Generating summary...
run: .github/workflows/scripts/qemu-9-summary-page.sh 2
- name: Generating summary...
run: .github/workflows/scripts/qemu-9-summary-page.sh 3
- name: Generating summary...
run: .github/workflows/scripts/qemu-9-summary-page.sh 4
- name: Generating summary...
run: .github/workflows/scripts/qemu-9-summary-page.sh 5
- name: Generating summary...
run: .github/workflows/scripts/qemu-9-summary-page.sh 6
- name: Generating summary...
run: .github/workflows/scripts/qemu-9-summary-page.sh 7
- name: Generating summary...
run: .github/workflows/scripts/qemu-9-summary-page.sh 8
- name: Generating summary...
run: .github/workflows/scripts/qemu-9-summary-page.sh 9
- name: Generating summary...
run: .github/workflows/scripts/qemu-9-summary-page.sh 10
- name: Generating summary...
run: .github/workflows/scripts/qemu-9-summary-page.sh 11
- name: Generating summary...
run: .github/workflows/scripts/qemu-9-summary-page.sh 12
- name: Generating summary...
run: .github/workflows/scripts/qemu-9-summary-page.sh 13
- name: Generating summary...
run: .github/workflows/scripts/qemu-9-summary-page.sh 14
- name: Generating summary...
run: .github/workflows/scripts/qemu-9-summary-page.sh 15
- name: Generating summary...
run: .github/workflows/scripts/qemu-9-summary-page.sh 16
- name: Generating summary...
run: .github/workflows/scripts/qemu-9-summary-page.sh 17
- name: Generating summary...
run: .github/workflows/scripts/qemu-9-summary-page.sh 18
- name: Generating summary...
run: .github/workflows/scripts/qemu-9-summary-page.sh 19
- uses: actions/upload-artifact@v4
with:
name: Summary Files
path: out-*
-77
View File
@@ -1,77 +0,0 @@
name: zloop
on:
push:
pull_request:
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
jobs:
zloop:
runs-on: ubuntu-24.04
env:
WORK_DIR: /mnt/zloop
CORE_DIR: /mnt/zloop/cores
steps:
- uses: actions/checkout@v4
with:
ref: ${{ github.event.pull_request.head.sha }}
- name: Install dependencies
run: |
sudo apt-get purge -y snapd google-chrome-stable firefox
ONLY_DEPS=1 .github/workflows/scripts/qemu-3-deps-vm.sh ubuntu24
- name: Autogen.sh
run: |
sed -i '/DEBUG_CFLAGS="-Werror"/s/^/#/' config/zfs-build.m4
./autogen.sh
- name: Configure
run: |
./configure --prefix=/usr --enable-debug --enable-debuginfo \
--enable-asan --enable-ubsan \
--enable-debug-kmem --enable-debug-kmem-tracking
- name: Make
run: |
make -j$(nproc)
- name: Install
run: |
sudo make install
sudo depmod
sudo modprobe zfs
- name: Tests
run: |
sudo truncate -s 256G /mnt/vdev
sudo zpool create cipool -m $WORK_DIR -O compression=on -o autotrim=on /mnt/vdev
sudo /usr/share/zfs/zloop.sh -t 600 -I 6 -l -m 1 -c $CORE_DIR -f $WORK_DIR -- -T 120 -P 60
- name: Prepare artifacts
if: failure()
run: |
sudo chmod +r -R $WORK_DIR/
- name: Ztest log
if: failure()
run: |
grep -B10 -A1000 'ASSERT' $CORE_DIR/*/ztest.out || tail -n 1000 $CORE_DIR/*/ztest.out
- name: Gdb log
if: failure()
run: |
sed -n '/Backtraces (full)/q;p' $CORE_DIR/*/ztest.gdb
- name: Zdb log
if: failure()
run: |
cat $CORE_DIR/*/ztest.zdb
- uses: actions/upload-artifact@v4
if: failure()
with:
name: Logs
path: |
/mnt/zloop/*/
!/mnt/zloop/cores/*/vdev/
if-no-files-found: ignore
- uses: actions/upload-artifact@v4
if: failure()
with:
name: Pool files
path: |
/mnt/zloop/cores/*/vdev/
if-no-files-found: ignore
+31 -56
View File
@@ -1,7 +1,6 @@
#
# This is the top-level .gitignore file:
# ignore everything except a list of allowed files.
#
# N.B.
# This is the toplevel .gitignore file.
# This is not the place for entries that are specific to
# a subdirectory. Instead add those files to the
# .gitignore file in that subdirectory.
@@ -11,57 +10,6 @@
# command after changing this file, to see if there are
# any tracked files which get ignored after the change.
*
!.github
!cmd
!config
!contrib
!etc
!include
!lib
!man
!module
!rpm
!scripts
!tests
!udev
!.github/**
!cmd/**
!config/**
!contrib/**
!etc/**
!include/**
!lib/**
!man/**
!module/**
!rpm/**
!scripts/**
!tests/**
!udev/**
!.editorconfig
!.cirrus.yml
!.gitignore
!.gitmodules
!.mailmap
!AUTHORS
!autogen.sh
!CODE_OF_CONDUCT.md
!configure.ac
!copy-builtin
!COPYRIGHT
!LICENSE
!Makefile.am
!META
!NEWS
!NOTICE
!README.md
!RELEASES.md
!TEST
!zfs.release.in
#
# Normal rules
#
@@ -83,8 +31,35 @@
modules.order
Makefile
Makefile.in
changelog
#
# Top level generated files specific to this top level dir
#
/bin
/build
/configure
/config.log
/config.status
/libtool
/zfs_config.h
/zfs_config.h.in
/zfs.release
/stamp-h1
/aclocal.m4
/autom4te.cache
#
# Top level generic files
#
!.gitignore
tags
TAGS
current
cscope.*
*.rpm
*.deb
*.tar.gz
*.patch
*.orig
*.tmp
*.log
venv
+1 -1
View File
@@ -1,3 +1,3 @@
[submodule "scripts/zfs-images"]
path = scripts/zfs-images
url = https://github.com/openzfs/zfs-images
url = https://github.com/zfsonlinux/zfs-images
-237
View File
@@ -1,237 +0,0 @@
# This file maps the name+email seen in a commit back to a canonical
# name+email. Git will replace the commit name/email with the canonical version
# wherever it sees it.
#
# If there is a commit in the history with a "wrong" name or email, list it
# here. If you regularly commit with an alternate name or email address and
# would like to ensure that you are always listed consistently in the repo, add
# mapping here.
#
# On the other hand, if you use multiple names or email addresses legitimately
# (eg you use a company email address for your paid OpenZFS work, and a
# personal address for your evening side projects), then don't map one to the
# other here.
#
# The most common formats are:
#
# Canonical Name <canonical-email>
# Canonical Name <canonical-email> <commit-email>
# Canonical Name <canonical-email> Commit Name <commit-email>
#
# See https://git-scm.com/docs/gitmailmap for more info.
# These maps are making names consistent where they have varied but the email
# address has never changed. In most cases, the full name is in the
# Signed-off-by of a commit with a matching author.
Achill Gilgenast <achill@achill.org>
Ahelenia Ziemiańska <nabijaczleweli@gmail.com>
Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Alex John <alex@stty.io>
Andreas Dilger <adilger@dilger.ca>
Andrew Walker <awalker@ixsystems.com>
Benedikt Neuffer <github@itfriend.de>
Chengfei Zhu <chengfeix.zhu@intel.com>
ChenHao Lu <18302010006@fudan.edu.cn>
Chris Lindee <chris.lindee+github@gmail.com>
Colm Buckley <colm@tuatha.org>
Crag Wang <crag0715@gmail.com>
Damian Szuberski <szuberskidamian@gmail.com>
Daniel Kolesa <daniel@octaforge.org>
Debabrata Banerjee <dbavatar@gmail.com>
Diwakar Kristappagari <diwakar-k@hpe.com>
Finix Yan <yanchongwen@hotmail.com>
Gaurav Kumar <gauravk.18@gmail.com>
Gionatan Danti <g.danti@assyoma.it>
Glenn Washburn <development@efficientek.com>
Gordan Bobic <gordan.bobic@gmail.com>
Gregory Bartholomew <gregory.lee.bartholomew@gmail.com>
hedong zhang <h_d_zhang@163.com>
Ilkka Sovanto <github@ilkka.kapsi.fi>
InsanePrawn <Insane.Prawny@gmail.com>
Jason Cohen <jwittlincohen@gmail.com>
Jason Harmening <jason.harmening@gmail.com>
Jeremy Faulkner <gldisater@gmail.com>
Jinshan Xiong <jinshan.xiong@gmail.com>
John Poduska <jpoduska@datto.com>
Jo Zzsi <jozzsicsataban@gmail.com>
Justin Scholz <git@justinscholz.de>
Ka Ho Ng <khng300@gmail.com>
Kash Pande <github@tripleback.net>
Kay Pedersen <christianpe96@gmail.com>
KernelOfTruth <kerneloftruth@gmail.com>
Liu Hua <liu.hua130@zte.com.cn>
Liu Qing <winglq@gmail.com>
loli10K <ezomori.nozomu@gmail.com>
Mart Frauenlob <allkind@fastest.cc>
Matthias Blankertz <matthias@blankertz.org>
Michael Gmelin <grembo@FreeBSD.org>
Olivier Mazouffre <olivier.mazouffre@ims-bordeaux.fr>
Piotr Kubaj <pkubaj@anongoth.pl>
Quentin Zdanis <zdanisq@gmail.com>
Roberto Ricci <io@r-ricci.it>
Roberto Ricci <ricci@disroot.org>
Rob Norris <robn@despairlabs.com>
Rob Norris <rob.norris@klarasystems.com>
Sam Lunt <samuel.j.lunt@gmail.com>
Sanjeev Bagewadi <sanjeev.bagewadi@gmail.com>
Sebastian Wuerl <s.wuerl@mailbox.org>
SHENGYI HONG <aokblast@FreeBSD.org>
Stoiko Ivanov <github@nomore.at>
Tamas TEVESZ <ice@extreme.hu>
WHR <msl0000023508@gmail.com>
Yanping Gao <yanping.gao@xtaotech.com>
Youzhong Yang <youzhong@gmail.com>
# Signed-off-by: overriding Author:
Alexander Ziaee <ziaee@FreeBSD.org> <concussious@runbox.com>
Felix Schmidt <felixschmidt20@aol.com> <f.sch.prototype@gmail.com>
Jean-Sébastien Pédron <dumbbell@FreeBSD.org> <jean-sebastien.pedron@dumbbell.fr>
Konstantin Belousov <kib@FreeBSD.org> <kib@kib.kiev.ua>
Olivier Certner <olce@FreeBSD.org> <olce.freebsd@certner.fr>
Patrick Xia <patrickx@google.com> <octalc0de@aim.com>
Phil Sutter <phil@nwl.cc> <p.github@nwl.cc>
poscat <poscat@poscat.moe> <poscat0x04@outlook.com>
Qiuhao Chen <chenqiuhao1997@gmail.com> <haohao0924@126.com>
Ryan <errornointernet@envs.net> <error.nointernet@gmail.com>
Sietse <sietse@wizdom.nu> <uglymotha@wizdom.nu>
Yuxin Wang <yuxinwang9999@gmail.com> <Bi11gates9999@gmail.com>
Zhenlei Huang <zlei@FreeBSD.org> <zlei.huang@gmail.com>
# Commits from strange places, long ago
Brian Behlendorf <behlendorf1@llnl.gov> <behlendo@7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c>
Brian Behlendorf <behlendorf1@llnl.gov> <behlendo@fedora-17-amd64.(none)>
Brian Behlendorf <behlendorf1@llnl.gov> <behlendo@myhost.(none)>
Brian Behlendorf <behlendorf1@llnl.gov> <ubuntu@ip-172-31-16-145.us-west-1.compute.internal>
Brian Behlendorf <behlendorf1@llnl.gov> <ubuntu@ip-172-31-20-6.us-west-1.compute.internal>
Herb Wartens <wartens2@llnl.gov> <wartens2@7e1ea52c-4ff2-0310-8f11-9dd32ca42a1c>
Ned Bass <bass6@llnl.gov> <bass6@zeno1.(none)>
Tulsi Jain <tulsi.jain@delphix.com> <tulsi.jain@Tulsi-Jains-MacBook-Pro.local>
# Mappings from Github no-reply addresses
ajs124 <git@ajs124.de> <ajs124@users.noreply.github.com>
Alek Pinchuk <apinchuk@axcient.com> <alek-p@users.noreply.github.com>
Aleksandr Liber <aleksandr.liber@perforce.com> <61714074+AleksandrLiber@users.noreply.github.com>
Alexander Lobakin <alobakin@pm.me> <solbjorn@users.noreply.github.com>
Alexey Smirnoff <fling@member.fsf.org> <fling-@users.noreply.github.com>
Allen Holl <allen.m.holl@gmail.com> <65494904+allen-4@users.noreply.github.com>
Alphan Yılmaz <alphanyilmaz@gmail.com> <a1ea321@users.noreply.github.com>
Ameer Hamza <ahamza@ixsystems.com> <106930537+ixhamza@users.noreply.github.com>
Andrew J. Hesford <ajh@sideband.org> <48421688+ahesford@users.noreply.github.com>>
Andrew Sun <me@andrewsun.com> <as-com@users.noreply.github.com>
Aron Xu <happyaron.xu@gmail.com> <happyaron@users.noreply.github.com>
Arun KV <arun.kv@datacore.com> <65647132+arun-kv@users.noreply.github.com>
Ben Wolsieffer <benwolsieffer@gmail.com> <lopsided98@users.noreply.github.com>
bernie1995 <bernie.pikes@gmail.com> <42413912+bernie1995@users.noreply.github.com>
Bojan Novković <bnovkov@FreeBSD.org> <72801811+bnovkov@users.noreply.github.com>
Boris Protopopov <boris.protopopov@actifio.com> <bprotopopov@users.noreply.github.com>
Brad Forschinger <github@bnjf.id.au> <bnjf@users.noreply.github.com>
Brandon Thetford <brandon@dodecatec.com> <dodexahedron@users.noreply.github.com>
buzzingwires <buzzingwires@outlook.com> <131118055+buzzingwires@users.noreply.github.com>
Cedric Maunoury <cedric.maunoury@gmail.com> <38213715+cedricmaunoury@users.noreply.github.com>
Charles Suh <charles.suh@gmail.com> <charlessuh@users.noreply.github.com>
Chris Peredun <chris.peredun@ixsystems.com> <126915832+chrisperedun@users.noreply.github.com>
classabbyamp <dev@placeviolette.net> <5366828+classabbyamp@users.noreply.github.com>
Dacian Reece-Stremtan <dacianstremtan@gmail.com> <35844628+dacianstremtan@users.noreply.github.com>
Damian Szuberski <szuberskidamian@gmail.com> <30863496+szubersk@users.noreply.github.com>
Daniel Hiepler <d-git@coderdu.de> <32984777+heeplr@users.noreply.github.com>
Daniel Kobras <d.kobras@science-computing.de> <sckobras@users.noreply.github.com>
Daniel Reichelt <hacking@nachtgeist.net> <nachtgeist@users.noreply.github.com>
David Quigley <david.quigley@intel.com> <dpquigl@users.noreply.github.com>
Dennis R. Friedrichsen <dennis.r.friedrichsen@gmail.com> <31087738+dennisfriedrichsen@users.noreply.github.com>
Dex Wood <slash2314@gmail.com> <slash2314@users.noreply.github.com>
DHE <git@dehacked.net> <DeHackEd@users.noreply.github.com>
Dmitri John Ledkov <dimitri.ledkov@canonical.com> <19779+xnox@users.noreply.github.com>
Dries Michiels <driesm.michiels@gmail.com> <32487486+driesmp@users.noreply.github.com>
Edmund Nadolski <edmund.nadolski@ixsystems.com> <137826107+ednadolski-ix@users.noreply.github.com>
Érico Nogueira <erico.erc@gmail.com> <34201958+ericonr@users.noreply.github.com>
Fedor Uporov <fuporov.vstack@gmail.com> <60701163+fuporovvStack@users.noreply.github.com>
Felix Dörre <felix@dogcraft.de> <felixdoerre@users.noreply.github.com>
Felix Neumärker <xdch47@posteo.de> <34678034+xdch47@users.noreply.github.com>
Finix Yan <yancw@info2soft.com> <Finix1979@users.noreply.github.com>
Friedrich Weber <f.weber@proxmox.com> <56110206+frwbr@users.noreply.github.com>
Gaurav Kumar <gauravk.18@gmail.com> <gaurkuma@users.noreply.github.com>
George Gaydarov <git@gg7.io> <gg7@users.noreply.github.com>
Georgy Yakovlev <gyakovlev@gentoo.org> <168902+gyakovlev@users.noreply.github.com>
Gerardwx <gerardw@alum.mit.edu> <Gerardwx@users.noreply.github.com>
Germano Massullo <germano.massullo@gmail.com> <Germano0@users.noreply.github.com>
Gian-Carlo DeFazio <defazio1@llnl.gov> <defaziogiancarlo@users.noreply.github.com>
Giuseppe Di Natale <dinatale2@llnl.gov> <dinatale2@users.noreply.github.com>
Hajo Möller <dasjoe@gmail.com> <dasjoe@users.noreply.github.com>
Harry Mallon <hjmallon@gmail.com> <1816667+hjmallon@users.noreply.github.com>
Hiếu Lê <leorize+oss@disroot.org> <alaviss@users.noreply.github.com>
Jake Howard <git@theorangeone.net> <RealOrangeOne@users.noreply.github.com>
James Cowgill <james.cowgill@mips.com> <jcowgill@users.noreply.github.com>
Jaron Kent-Dobias <jaron@kent-dobias.com> <kentdobias@users.noreply.github.com>
Jason King <jason.king@joyent.com> <jasonbking@users.noreply.github.com>
Jeff Dike <jdike@akamai.com> <52420226+jdike@users.noreply.github.com>
Jitendra Patidar <jitendra.patidar@nutanix.com> <53164267+jsai20@users.noreply.github.com>
João Carlos Mendes Luís <jonny@jonny.eng.br> <dioni21@users.noreply.github.com>
John Eismeier <john.eismeier@gmail.com> <32205350+jeis2497052@users.noreply.github.com>
John L. Hammond <john.hammond@intel.com> <35266395+jhammond-intel@users.noreply.github.com>
John-Mark Gurney <jmg@funkthat.com> <jmgurney@users.noreply.github.com>
John Ramsden <johnramsden@riseup.net> <johnramsden@users.noreply.github.com>
Jonathon Fernyhough <jonathon@m2x.dev> <559369+jonathonf@users.noreply.github.com>
Jose Luis Duran <jlduran@gmail.com> <jlduran@users.noreply.github.com>
Justin Hibbits <chmeeedalf@gmail.com> <chmeeedalf@users.noreply.github.com>
Kaitlin Hoang <kthoang@amazon.com> <khoang98@users.noreply.github.com>
Kevin Greene <kevin.greene@delphix.com> <104801862+kxgreene@users.noreply.github.com>
Kevin Jin <lostking2008@hotmail.com> <33590050+jxdking@users.noreply.github.com>
Kevin P. Fleming <kevin@km6g.us> <kpfleming@users.noreply.github.com>
Krzysztof Piecuch <piecuch@kpiecuch.pl> <3964215+pikrzysztof@users.noreply.github.com>
Kyle Evans <kevans@FreeBSD.org> <kevans91@users.noreply.github.com>
Laurențiu Nicola <lnicola@dend.ro> <lnicola@users.noreply.github.com>
loli10K <ezomori.nozomu@gmail.com> <loli10K@users.noreply.github.com>
Lorenz Hüdepohl <dev@stellardeath.org> <lhuedepohl@users.noreply.github.com>
Luís Henriques <henrix@camandro.org> <73643340+lumigch@users.noreply.github.com>
Marcin Skarbek <git@skarbek.name> <mskarbek@users.noreply.github.com>
Matt Fiddaman <github@m.fiddaman.uk> <81489167+matt-fidd@users.noreply.github.com>
Maxim Filimonov <che@bein.link> <part1zano@users.noreply.github.com>
Max Zettlmeißl <max@zettlmeissl.de> <6818198+maxz@users.noreply.github.com>
Michael Niewöhner <foss@mniewoehner.de> <c0d3z3r0@users.noreply.github.com>
Michael Zhivich <mzhivich@akamai.com> <33133421+mzhivich@users.noreply.github.com>
MigeljanImeri <ImeriMigel@gmail.com> <78048439+MigeljanImeri@users.noreply.github.com>
Mo Zhou <cdluminate@gmail.com> <5723047+cdluminate@users.noreply.github.com>
nav1s <nav1s@proton.me> <42621369+nav1s@users.noreply.github.com>
Nick Mattis <nickm970@gmail.com> <nmattis@users.noreply.github.com>
omni <omni+vagant@hack.org> <79493359+omnivagant@users.noreply.github.com>
Pablo Correa Gómez <ablocorrea@hotmail.com> <32678034+pablofsf@users.noreply.github.com>
Paul Zuchowski <pzuchowski@datto.com> <31706010+PaulZ-98@users.noreply.github.com>
Peter Ashford <ashford@accs.com> <pashford@users.noreply.github.com>
Peter Dave Hello <hsu@peterdavehello.org> <PeterDaveHello@users.noreply.github.com>
Peter Wirdemo <peter.wirdemo@gmail.com> <4224155+pewo@users.noreply.github.com>
Petros Koutoupis <petros@petroskoutoupis.com> <pkoutoupis@users.noreply.github.com>
Ping Huang <huangping@smartx.com> <101400146+hpingfs@users.noreply.github.com>
Piotr P. Stefaniak <pstef@freebsd.org> <pstef@users.noreply.github.com>
Richard Allen <belperite@gmail.com> <33836503+belperite@users.noreply.github.com>
Rich Ercolani <rincebrain@gmail.com> <214141+rincebrain@users.noreply.github.com>
Rick Macklem <rmacklem@uoguelph.ca> <64620010+rmacklem@users.noreply.github.com>
Rob Wing <rob.wing@klarasystems.com> <98866084+rob-wing@users.noreply.github.com>
Roman Strashkin <roman.strashkin@nexenta.com> <Ramzec@users.noreply.github.com>
Ryan Hirasaki <ryanhirasaki@gmail.com> <4690732+RyanHir@users.noreply.github.com>
Samuel Wycliffe J <samwyc@hpe.com> <115969550+samwyc@users.noreply.github.com>
Samuel Wycliffe <samuelwycliffe@gmail.com> <50765275+npc203@users.noreply.github.com>
Savyasachee Jha <hi@savyasacheejha.com> <savyajha@users.noreply.github.com>
Scott Colby <scott@scolby.com> <scolby33@users.noreply.github.com>
Sean Eric Fagan <kithrup@mac.com> <kithrup@users.noreply.github.com>
Shreshth Srivastava <shreshthsrivastava2@gmail.com> <66148173+Shreshth3@users.noreply.github.com>
Spencer Kinny <spencerkinny1995@gmail.com> <30333052+Spencer-Kinny@users.noreply.github.com>
Srikanth N S <srikanth.nagasubbaraoseetharaman@hpe.com> <75025422+nssrikanth@users.noreply.github.com>
Stefan Lendl <s.lendl@proxmox.com> <1321542+stfl@users.noreply.github.com>
Thomas Bertschinger <bertschinger@lanl.gov> <101425190+bertschinger@users.noreply.github.com>
Thomas Geppert <geppi@digitx.de> <geppi@users.noreply.github.com>
Tim Crawford <tcrawford@datto.com> <crawfxrd@users.noreply.github.com>
Todd Seidelmann <18294602+seidelma@users.noreply.github.com>
Tom Matthews <tom@axiom-partners.com> <tomtastic@users.noreply.github.com>
Tony Perkins <tperkins@datto.com> <62951051+tony-zfs@users.noreply.github.com>
Torsten Wörtwein <twoertwein@gmail.com> <twoertwein@users.noreply.github.com>
Tulsi Jain <tulsi.jain@delphix.com> <TulsiJain@users.noreply.github.com>
Václav Skála <skala@vshosting.cz> <33496485+vaclavskala@users.noreply.github.com>
Vaibhav Bhanawat <vaibhav.bhanawat@delphix.com> <88050553+vaibhav-delphix@users.noreply.github.com>
Vandana Rungta <vrungta@amazon.com> <46906819+vandanarungta@users.noreply.github.com>
Violet Purcell <vimproved@inventati.org> <66446404+vimproved@users.noreply.github.com>
Vipin Kumar Verma <vipin.verma@hpe.com> <75025470+vermavipinkumar@users.noreply.github.com>
Wolfgang Bumiller <w.bumiller@proxmox.com> <Blub@users.noreply.github.com>
XDTG <click1799@163.com> <35128600+XDTG@users.noreply.github.com>
xtouqh <xtouqh@hotmail.com> <72357159+xtouqh@users.noreply.github.com>
Yuri Pankov <yuripv@FreeBSD.org> <113725409+yuripv@users.noreply.github.com>
Yuri Pankov <yuripv@FreeBSD.org> <82001006+yuripv@users.noreply.github.com>
+38
View File
@@ -0,0 +1,38 @@
language: c
sudo: required
env:
global:
# Travis limits maximum log size, we have to cut tests output
- ZFS_TEST_TRAVIS_LOG_MAX_LENGTH=800
matrix:
# tags are mainly in ascending order
- ZFS_TEST_TAGS='acl,atime,bootfs,cachefile,casenorm,chattr,checksum,clean_mirror,compression,ctime,delegate,devices,events,exec,fault,features,grow_pool,zdb,zfs,zfs_bookmark,zfs_change-key,zfs_clone,zfs_copies,zfs_create,zfs_diff,zfs_get,zfs_inherit,zfs_load-key,zfs_rename'
- ZFS_TEST_TAGS='cache,history,hkdf,inuse,zfs_property,zfs_receive,zfs_reservation,zfs_send,zfs_set,zfs_share,zfs_snapshot,zfs_unload-key,zfs_unmount,zfs_unshare,zfs_upgrade,zpool,zpool_add,zpool_attach,zpool_clear,zpool_create,zpool_destroy,zpool_detach'
- ZFS_TEST_TAGS='grow_replicas,mv_files,cli_user,zfs_mount,zfs_promote,zfs_rollback,zpool_events,zpool_expand,zpool_export,zpool_get,zpool_history,zpool_import,zpool_labelclear,zpool_offline,zpool_online,zpool_remove,zpool_reopen,zpool_replace,zpool_scrub,zpool_set,zpool_status,zpool_sync,zpool_upgrade'
- ZFS_TEST_TAGS='zfs_destroy,large_files,largest_pool,link_count,migration,mmap,mmp,mount,nestedfs,no_space,nopwrite,online_offline,pool_names,poolversion,privilege,quota,raidz,redundancy,rsend'
- ZFS_TEST_TAGS='inheritance,refquota,refreserv,rename_dirs,replacement,reservation,rootpool,scrub_mirror,slog,snapshot,snapused,sparse,threadsappend,tmpfile,truncate,upgrade,userquota,vdev_zaps,write_dirs,xattr,zvol,libzfs'
before_install:
- sudo apt-get -qq update
- sudo apt-get install --yes -qq build-essential autoconf libtool gawk alien fakeroot linux-headers-$(uname -r)
- sudo apt-get install --yes -qq zlib1g-dev uuid-dev libattr1-dev libblkid-dev libselinux-dev libudev-dev libssl-dev
# packages for tests
- sudo apt-get install --yes -qq parted lsscsi ksh attr acl nfs-kernel-server fio
install:
- git clone --depth=1 https://github.com/zfsonlinux/spl
- cd spl
- git checkout master
- sh autogen.sh
- ./configure
- make --no-print-directory -s pkg-utils pkg-kmod
- sudo dpkg -i *.deb
- cd ..
- sh autogen.sh
- ./configure
- make --no-print-directory -s pkg-utils pkg-kmod
- sudo dpkg -i *.deb
script:
- travis_wait 50 /usr/share/zfs/zfs-tests.sh -v -T $ZFS_TEST_TAGS
after_failure:
- find /var/tmp/test_results/current/log -type f -name '*' -printf "%f\n" -exec cut -c -$ZFS_TEST_TRAVIS_LOG_MAX_LENGTH {} \;
after_success:
- find /var/tmp/test_results/current/log -type f -name '*' -printf "%f\n" -exec cut -c -$ZFS_TEST_TRAVIS_LOG_MAX_LENGTH {} \;
+27 -451
View File
@@ -10,720 +10,296 @@ PAST MAINTAINERS:
CONTRIBUTORS:
Aaron Fineman <abyxcos@gmail.com>
Achill Gilgenast <achill@achill.org>
Adam D. Moss <c@yotes.com>
Adam Leventhal <ahl@delphix.com>
Adam Stevko <adam.stevko@gmail.com>
adisbladis <adis@blad.is>
Adrian Chadd <adrian@freebsd.org>
Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Ahmed G <ahmedg@delphix.com>
Aidan Harris <me@aidanharr.is>
AJ Jordan <alex@strugee.net>
ajs124 <git@ajs124.de>
Akash Ayare <aayare@delphix.com>
Akash B <akash-b@hpe.com>
Alan Somers <asomers@gmail.com>
Alar Aun <spamtoaun@gmail.com>
Albert Lee <trisk@nexenta.com>
Alec Salazar <alec.j.salazar@gmail.com>
Alejandro Colomar <Colomar.6.4.3@GMail.com>
Alejandro R. Sedeño <asedeno@mit.edu>
Alek Pinchuk <alek@nexenta.com>
Aleksandr Liber <aleksandr.liber@perforce.com>
Aleksa Sarai <cyphar@cyphar.com>
Alexander Eremin <a.eremin@nexenta.com>
Alexander Lobakin <alobakin@pm.me>
Alexander Motin <mav@freebsd.org>
Alexander Pyhalov <apyhalov@gmail.com>
Alexander Richardson <Alexander.Richardson@cl.cam.ac.uk>
Alexander Stetsenko <ams@nexenta.com>
Alexander Ziaee <ziaee@FreeBSD.org>
Alex Braunegg <alex.braunegg@gmail.com>
Alexey Shvetsov <alexxy@gentoo.org>
Alexey Smirnoff <fling@member.fsf.org>
Alex John <alex@stty.io>
Alex McWhirter <alexmcwhirter@triadic.us>
Alex Reece <alex@delphix.com>
Alex Wilson <alex.wilson@joyent.com>
Alex Zhuravlev <alexey.zhuravlev@intel.com>
Alexander Eremin <a.eremin@nexenta.com>
Alexander Motin <mav@freebsd.org>
Alexander Pyhalov <apyhalov@gmail.com>
Alexander Stetsenko <ams@nexenta.com>
Alexey Shvetsov <alexxy@gentoo.org>
Alexey Smirnoff <fling@member.fsf.org>
Allan Jude <allanjude@freebsd.org>
Allen Holl <allen.m.holl@gmail.com>
Alphan Yılmaz <alphanyilmaz@gmail.com>
alteriks <alteriks@gmail.com>
Alyssa Ross <hi@alyssa.is>
Ameer Hamza <ahamza@ixsystems.com>
Anatoly Borodin <anatoly.borodin@gmail.com>
AndCycle <andcycle@andcycle.idv.tw>
Andrea Gelmini <andrea.gelmini@gelma.net>
Andrea Righi <andrea.righi@canonical.com>
Andreas Buschmann <andreas.buschmann@tech.net.de>
Andreas Dilger <adilger@intel.com>
Andreas Vögele <andreas@andreasvoegele.com>
Andres <a-d-j-i@users.noreply.github.com>
Andrew Barnes <barnes333@gmail.com>
Andrew Hamilton <ahamilto@tjhsst.edu>
Andrew Innes <andrew.c12@gmail.com>
Andrew J. Hesford <ajh@sideband.org>
Andrew Reid <ColdCanuck@nailedtotheperch.com>
Andrew Stormont <andrew.stormont@nexenta.com>
Andrew Sun <me@andrewsun.com>
Andrew Tselischev <andrewtselischev@gmail.com>
Andrew Turner <andrew@fubar.geek.nz>
Andrew Walker <awalker@ixsystems.com>
Andrey Prokopenko <job@terem.fr>
Andrey Vesnovaty <andrey.vesnovaty@gmail.com>
Andriy Gapon <avg@freebsd.org>
Andriy Tkachuk <andriy.tkachuk@seagate.com>
Andy Bakun <github@thwartedefforts.org>
Andy Fiddaman <omnios@citrus-it.co.uk>
Aniruddha Shankar <k@191a.net>
Anton Gubarkov <anton.gubarkov@gmail.com>
Antonio Russo <antonio.e.russo@gmail.com>
Arkadiusz Bubała <arkadiusz.bubala@open-e.com>
Armin Wehrfritz <dkxls23@gmail.com>
Arne Jansen <arne@die-jansens.de>
Aron Xu <happyaron.xu@gmail.com>
Arshad Hussain <arshad.hussain@aeoncomputing.com>
Artem <artem.vlasenko@ossrevival.org>
Arun KV <arun.kv@datacore.com>
Arvind Sankar <nivedita@alum.mit.edu>
Attila Fülöp <attila@fueloep.org>
Avatat <kontakt@avatat.pl>
Bart Coddens <bart.coddens@gmail.com>
Basil Crow <basil.crow@delphix.com>
Bassu <bassu@phi9.com>
Huang Liu <liu.huang@zte.com.cn>
Ben Allen <bsallen@alcf.anl.gov>
Ben Cordero <bencord0@condi.me>
Benda Xu <orv@debian.org>
Benedikt Neuffer <github@itfriend.de>
Benjamin Albrecht <git@albrecht.io>
Benjamin Gentil <benjgentil.pro@gmail.com>
Benjamin Sherman <benjamin@holyarmy.org>
Ben McGough <bmcgough@fredhutch.org>
Ben Rubson <ben.rubson@gmail.com>
Ben Wolsieffer <benwolsieffer@gmail.com>
bernie1995 <bernie.pikes@gmail.com>
Benjamin Albrecht <git@albrecht.io>
Bill McGonigle <bill-github.com-public1@bfccomputing.com>
Bill Pijewski <wdp@joyent.com>
Bojan Novković <bnovkov@FreeBSD.org>
Boris Protopopov <boris.protopopov@nexenta.com>
Brad Forschinger <github@bnjf.id.au>
Brad Lewis <brad.lewis@delphix.com>
Brandon Thetford <brandon@dodecatec.com>
Brian Atkinson <bwa@g.clemson.edu>
Brian Behlendorf <behlendorf1@llnl.gov>
Brian J. Murrell <brian@sun.com>
Brooks Davis <brooks@one-eyed-alien.net>
BtbN <btbn@btbn.de>
bunder2015 <omfgbunder@gmail.com>
buzzingwires <buzzingwires@outlook.com>
bzzz77 <bzzz.tomas@gmail.com>
cable2999 <cable2999@users.noreply.github.com>
Caleb James DeLisle <calebdelisle@lavabit.com>
Cameron Harr <harr1@llnl.gov>
Cao Xuewen <cao.xuewen@zte.com.cn>
Carl George <carlwgeorge@gmail.com>
Carlo Landmeter <clandmeter@gmail.com>
Carlos Alberto Lopez Perez <clopez@igalia.com>
Cedric Maunoury <cedric.maunoury@gmail.com>
Chaoyu Zhang <zhang.chaoyu@zte.com.cn>
Charles Suh <charles.suh@gmail.com>
Chen Can <chen.can2@zte.com.cn>
Chengfei Zhu <chengfeix.zhu@intel.com>
Chen Haiquan <oc@yunify.com>
ChenHao Lu <18302010006@fudan.edu.cn>
Chip Parker <aparker@enthought.com>
Chris Burroughs <chris.burroughs@gmail.com>
Chris Davidson <christopher.davidson@gmail.com>
Chris Dunlap <cdunlap@llnl.gov>
Chris Dunlop <chris@onthe.net.au>
Chris Lindee <chris.lindee+github@gmail.com>
Chris McDonough <chrism@plope.com>
Chris Peredun <chris.peredun@ixsystems.com>
Chris Siden <chris.siden@delphix.com>
Chris Siebenmann <cks.github@cs.toronto.edu>
Chris Wedgwood <cw@f00f.org>
Chris Williamson <chris.williamson@delphix.com>
Chris Zubrzycki <github@mid-earth.net>
Christ Schlacta <aarcane@aarcane.info>
Christer Ekholm <che@chrekh.se>
Christian Kohlschütter <christian@kohlschutter.com>
Christian Neukirchen <chneukirchen@gmail.com>
Christian Schwarz <me@cschwarz.com>
Christopher Voltz <cjunk@voltz.ws>
Christ Schlacta <aarcane@aarcane.info>
Chris Wedgwood <cw@f00f.org>
Chris Williamson <chris.williamson@delphix.com>
Chris Zubrzycki <github@mid-earth.net>
Chuck Tuffli <ctuffli@gmail.com>
Chunwei Chen <david.chen@nutanix.com>
classabbyamp <dev@placeviolette.net>
Clemens Fruhwirth <clemens@endorphin.org>
Clemens Lang <cl@clang.name>
Clint Armstrong <clint@clintarmstrong.net>
Coleman Kane <ckane@colemankane.org>
Colin Ian King <colin.king@canonical.com>
Colin Percival <cperciva@tarsnap.com>
Colm Buckley <colm@tuatha.org>
Cong Zhang <congzhangzh@users.noreply.github.com>
Crag Wang <crag0715@gmail.com>
Craig Loomis <cloomis@astro.princeton.edu>
Craig Sanders <github@taz.net.au>
Cyril Plisko <cyril.plisko@infinidat.com>
Cy Schubert <cy@FreeBSD.org>
Cédric Berger <cedric@precidata.com>
Dacian Reece-Stremtan <dacianstremtan@gmail.com>
Dag-Erling Smørgrav <des@FreeBSD.org>
Damiano Albani <damiano.albani@gmail.com>
Damian Szuberski <szuberskidamian@gmail.com>
DHE <git@dehacked.net>
Damian Wojsław <damian@wojslaw.pl>
Daniel Berlin <dberlin@dberlin.org>
Daniel Hiepler <d-git@coderdu.de>
Daniel Hoffman <dj.hoffman@delphix.com>
Daniel Kobras <d.kobras@science-computing.de>
Daniel Kolesa <daniel@octaforge.org>
Daniel Perry <dtperry@amazon.com>
Daniel Reichelt <hacking@nachtgeist.net>
Daniel Stevenson <bot@dstev.net>
Daniel Verite <daniel@verite.pro>
Daniil Lunev <d.lunev.mail@gmail.com>
Dan Kimmel <dan.kimmel@delphix.com>
Dan McDonald <danmcd@nexenta.com>
Dan Swartzendruber <dswartz@druber.com>
Dan Vatca <dan.vatca@gmail.com>
Daniel Hoffman <dj.hoffman@delphix.com>
Daniel Verite <daniel@verite.pro>
Daniil Lunev <d.lunev.mail@gmail.com>
Darik Horn <dajhorn@vanadac.com>
Dave Eddy <dave@daveeddy.com>
David Hedberg <david@qzx.se>
David Lamparter <equinox@diac24.net>
David Qian <david.qian@intel.com>
David Quigley <david.quigley@intel.com>
Debabrata Banerjee <dbanerje@akamai.com>
D. Ebdrup <debdrup@freebsd.org>
Dennis R. Friedrichsen <dennis.r.friedrichsen@gmail.com>
Denys Rtveliashvili <denys@rtveliashvili.name>
Derek Dai <daiderek@gmail.com>
Derek Schrock <dereks@lifeofadishwasher.com>
Dex Wood <slash2314@gmail.com>
DHE <git@dehacked.net>
Didier Roche <didrocks@ubuntu.com>
Dimitri John Ledkov <xnox@ubuntu.com>
Dimitry Andric <dimitry@andric.com>
Dirkjan Bussink <d.bussink@gmail.com>
Diwakar Kristappagari <diwakar-k@hpe.com>
Dmitry Khasanov <pik4ez@gmail.com>
Dominic Pearson <dsp@technoanimal.net>
Dominik Hassler <hadfl@omniosce.org>
Dominik Honnef <dominikh@fork-bomb.org>
Don Brady <don.brady@delphix.com>
Doug Rabson <dfr@rabson.org>
Dr. András Korn <korn-github.com@elan.rulez.org>
Dries Michiels <driesm.michiels@gmail.com>
Edmund Nadolski <edmund.nadolski@ixsystems.com>
Eitan Adler <lists@eitanadler.com>
Eli Rosenthal <eli.rosenthal@delphix.com>
Eli Schwartz <eschwartz93@gmail.com>
Eric A. Borisch <eborisch@gmail.com>
Eric Desrochers <eric.desrochers@canonical.com>
Eric Dillmann <eric@jave.fr>
Eric Schrock <Eric.Schrock@delphix.com>
Ethan Coe-Renner <coerenner1@llnl.gov>
Etienne Dechamps <etienne@edechamps.fr>
Evan Allrich <eallrich@gmail.com>
Evan Harris <eharris@puremagic.com>
Evan Susarret <evansus@gmail.com>
Fabian Grünbichler <f.gruenbichler@proxmox.com>
Fabio Buso <dev.siroibaf@gmail.com>
Fabio Scaccabarozzi <fsvm88@gmail.com>
Fajar A. Nugraha <github@fajar.net>
Fan Yong <fan.yong@intel.com>
fbynite <fbynite@users.noreply.github.com>
Fedor Uporov <fuporov.vstack@gmail.com>
Felix Dörre <felix@dogcraft.de>
Felix Neumärker <xdch47@posteo.de>
Felix Schmidt <felixschmidt20@aol.com>
Feng Sun <loyou85@gmail.com>
Finix Yan <yancw@info2soft.com>
Francesco Mazzoli <f@mazzo.li>
Frederik Wessels <wessels147@gmail.com>
Friedrich Weber <f.weber@proxmox.com>
Frédéric Vanniere <f.vanniere@planet-work.com>
Gabriel A. Devenyi <gdevenyi@gmail.com>
Garrett D'Amore <garrett@nexenta.com>
Garrett Fields <ghfields@gmail.com>
Garrison Jensen <garrison.jensen@gmail.com>
Gary Mills <gary_mills@fastmail.fm>
Gaurav Kumar <gauravk.18@gmail.com>
GeLiXin <ge.lixin@zte.com.cn>
George Amanakis <g_amanakis@yahoo.com>
George Diamantopoulos <georgediam@gmail.com>
George Gaydarov <git@gg7.io>
George Melikov <mail@gmelikov.ru>
George Wilson <gwilson@delphix.com>
Georgy Yakovlev <ya@sysdump.net>
Gerardwx <gerardw@alum.mit.edu>
Germano Massullo <germano.massullo@gmail.com>
Gian-Carlo DeFazio <defazio1@llnl.gov>
Gionatan Danti <g.danti@assyoma.it>
Giuseppe Di Natale <guss80@gmail.com>
Gleb Smirnoff <glebius@FreeBSD.org>
Glenn Washburn <development@efficientek.com>
glibg10b <glibg10b@users.noreply.github.com>
gofaster <felix.gofaster@gmail.com>
Gordan Bobic <gordan@redsleeve.org>
Gordon Bergling <gbergling@googlemail.com>
Gordon Ross <gwr@nexenta.com>
Gordon Tetlow <gordon@freebsd.org>
Graham Christensen <graham@grahamc.com>
Graham Perrin <grahamperrin@gmail.com>
Gregor Kopka <gregor@kopka.net>
Gregory Bartholomew <gregory.lee.bartholomew@gmail.com>
grembo <freebsd@grem.de>
Grischa Zengel <github.zfsonlinux@zengel.info>
grodik <pat@litke.dev>
Gunnar Beutner <gunnar@beutner.name>
Gvozden Neskovic <neskovic@gmail.com>
Hajo Möller <dasjoe@gmail.com>
Han Gao <rabenda.cn@gmail.com>
Hans Rosenfeld <hans.rosenfeld@nexenta.com>
Harald van Dijk <harald@gigawatt.nl>
Harry Mallon <hjmallon@gmail.com>
Harry Sintonen <github-piru@kyber.fi>
HC <mmttdebbcc@yahoo.com>
hedong zhang <h_d_zhang@163.com>
Heitor Alves de Siqueira <halves@canonical.com>
Henrik Riomar <henrik.riomar@gmail.com>
Herb Wartens <wartens2@llnl.gov>
Hiếu Lê <leorize+oss@disroot.org>
hoshinomori <hoshinomorimorimo@gmail.com>
Huang Liu <liu.huang@zte.com.cn>
Håkan Johansson <f96hajo@chalmers.se>
Igor K <igor@dilos.org>
Igor Kozhukhov <ikozhukhov@gmail.com>
Igor Lvovsky <ilvovsky@gmail.com>
Igor Ostapenko <pm@igoro.pro>
ilbsmart <wgqimut@gmail.com>
Ilkka Sovanto <github@ilkka.kapsi.fi>
illiliti <illiliti@protonmail.com>
ilovezfs <ilovezfs@icloud.com>
InsanePrawn <Insane.Prawny@gmail.com>
Isaac Huang <he.huang@intel.com>
Ivan Shapovalov <intelfx@intelfx.name>
Ivan Volosyuk <Ivan.Volosyuk@gmail.com>
JK Dingwall <james@dingwall.me.uk>
Jacek Fefliński <feflik@gmail.com>
Jacob Adams <tookmund@gmail.com>
Jake Howard <git@theorangeone.net>
James Cowgill <james.cowgill@mips.com>
James H <james@kagisoft.co.uk>
James Lee <jlee@thestaticvoid.com>
James Pan <jiaming.pan@yahoo.com>
James Reilly <jreilly1821@gmail.com>
James Wah <james@laird-wah.net>
Jan Engelhardt <jengelh@inai.de>
Jan Kryl <jan.kryl@nexenta.com>
Jan Sanislo <oystr@cs.washington.edu>
Jaron Kent-Dobias <jaron@kent-dobias.com>
Jason Cohen <jwittlincohen@gmail.com>
Jason Harmening <jason.harmening@gmail.com>
Jason King <jason.brian.king@gmail.com>
Jason Lee <jasonlee@lanl.gov>
Jason Zaman <jasonzaman@gmail.com>
Javen Wu <wu.javen@gmail.com>
Jaydeep Kshirsagar <jkshirsagar@maxlinear.com>
Jean-Baptiste Lallement <jean-baptiste@ubuntu.com>
Jean-Sébastien Pédron <dumbbell@FreeBSD.org>
Jeff Dike <jdike@akamai.com>
Jeremy Faulkner <gldisater@gmail.com>
Jeremy Gill <jgill@parallax-innovations.com>
Jeremy Jones <jeremy@delphix.com>
Jeremy Visser <jeremy.visser@gmail.com>
Jerry Jelinek <jerry.jelinek@joyent.com>
Jerzy Kołosowski <jerzy@kolosowscy.pl>
Jessica Clarke <jrtc27@jrtc27.com>
Jinshan Xiong <jinshan.xiong@intel.com>
Jitendra Patidar <jitendra.patidar@nutanix.com>
JK Dingwall <james@dingwall.me.uk>
Joel Low <joel@joelsplace.sg>
Joe Stein <joe.stein@delphix.com>
John-Mark Gurney <jmg@funkthat.com>
John Albietz <inthecloud247@gmail.com>
John Eismeier <john.eismeier@gmail.com>
John Gallagher <john.gallagher@delphix.com>
John Layman <jlayman@sagecloud.com>
John L. Hammond <john.hammond@intel.com>
John M. Layman <jml@frijid.net>
Johnny Stenback <github@jstenback.com>
John Layman <jlayman@sagecloud.com>
John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
John Poduska <jpoduska@datto.com>
John Ramsden <johnramsden@riseup.net>
John Wren Kennedy <john.kennedy@delphix.com>
jokersus <lolivampireslave@gmail.com>
Jonathon Fernyhough <jonathon@m2x.dev>
Johnny Stenback <github@jstenback.com>
Jorgen Lundman <lundman@lundman.net>
Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
Jose Luis Duran <jlduran@gmail.com>
Josh Soref <jsoref@users.noreply.github.com>
Joshua M. Clulow <josh@sysmgr.org>
José Luis Salvador Rufo <salvador.joseluis@gmail.com>
Jo Zzsi <jozzsicsataban@gmail.com>
João Carlos Mendes Luís <jonny@jonny.eng.br>
JT Pennington <jt.pennington@klarasystems.com>
Julian Brunner <julian.brunner@gmail.com>
Julian Heuking <JulianH@beckhoff.com>
jumbi77 <jumbi77@users.noreply.github.com>
Justin Bedő <cu@cua0.org>
Justin Gottula <justin@jgottula.com>
Justin Hibbits <chmeeedalf@gmail.com>
Justin Keogh <github.com@v6y.net>
Justin Lecher <jlec@gentoo.org>
Justin Scholz <git@justinscholz.de>
Justin T. Gibbs <gibbs@FreeBSD.org>
jyxent <jordanp@gmail.com>
Jörg Thalheim <joerg@higgsboson.tk>
ka7 <ka7@la-evento.com>
Ka Ho Ng <khng@FreeBSD.org>
KORN Andras <korn@elan.rulez.org>
Kamil Domański <kamil@domanski.co>
Karsten Kretschmer <kkretschmer@gmail.com>
Kash Pande <kash@tripleback.net>
Kay Pedersen <christianpe96@gmail.com>
Keith M Wesolowski <wesolows@foobazco.org>
Kent Ross <k@mad.cash>
KernelOfTruth <kerneloftruth@gmail.com>
Kevin Bowling <kevin.bowling@kev009.com>
Kevin Greene <kevin.greene@delphix.com>
Kevin Jin <lostking2008@hotmail.com>
Kevin P. Fleming <kevin@km6g.us>
Kevin Tanguy <kevin.tanguy@ovh.net>
khoang98 <khoang98@users.noreply.github.com>
KireinaHoro <i@jsteward.moe>
Kjeld Schouten-Lebbing <kjeld@schouten-lebbing.nl>
Kleber Tarcísio <klebertarcisio@yahoo.com.br>
Kody A Kantor <kody.kantor@gmail.com>
Kohsuke Kawaguchi <kk@kohsuke.org>
Konstantin Belousov <kib@FreeBSD.org>
Konstantin Khorenko <khorenko@virtuozzo.com>
KORN Andras <korn@elan.rulez.org>
kotauskas <v.toncharov@gmail.com>
Kristof Provost <github@sigsegv.be>
Krzysztof Piecuch <piecuch@kpiecuch.pl>
Kyle Blatter <kyleblatter@llnl.gov>
Kyle Evans <kevans@FreeBSD.org>
Kyle Fuller <inbox@kylefuller.co.uk>
Laevos <Laevos@users.noreply.github.com>
Lalufu <Lalufu@users.noreply.github.com>
Lars Johannsen <laj@it.dk>
Laura Hild <lsh@jlab.org>
Laurențiu Nicola <lnicola@dend.ro>
Lauri Tirkkonen <lauri@hacktheplanet.fi>
liaoyuxiangqin <guo.yong33@zte.com.cn>
Li Dongyang <dongyang.li@anu.edu.au>
Liu Hua <liu.hua130@zte.com.cn>
Liu Qing <winglq@gmail.com>
Li Wei <W.Li@Sun.COM>
Loli <ezomori.nozomu@gmail.com>
lorddoskias <lorddoskias@gmail.com>
Lorenz Brun <lorenz@dolansoft.org>
Lorenz Hüdepohl <dev@stellardeath.org>
louwrentius <louwrentius@gmail.com>
Lars Johannsen <laj@it.dk>
Li Dongyang <dongyang.li@anu.edu.au>
Li Wei <W.Li@Sun.COM>
Lukas Wunner <lukas@wunner.de>
luozhengzheng <luo.zhengzheng@zte.com.cn>
Luís Henriques <henrix@camandro.org>
Madhav Suresh <madhav.suresh@delphix.com>
Maksym Shkolnyi <maksym.shkolnyi@workato.com>
manfromafar <jonsonb10@gmail.com>
Manoj Joseph <manoj.joseph@delphix.com>
Manuel Amador (Rudd-O) <rudd-o@rudd-o.com>
Marcel Huber <marcelhuberfoo@gmail.com>
Marcel Menzel <mail@mcl.gg>
Marcel Schilling <marcel.schilling@uni-luebeck.de>
Marcel Telka <marcel.telka@nexenta.com>
Marcel Wysocki <maci.stgn@gmail.com>
Marcin Skarbek <git@skarbek.name>
Mariusz Zaborski <mariusz.zaborski@klarasystems.com>
Mark Johnston <markj@FreeBSD.org>
Mark Maybee <mark.maybee@delphix.com>
Mark Roper <markroper@gmail.com>
Mark Shellenbaum <Mark.Shellenbaum@Oracle.COM>
marku89 <mar42@kola.li>
Mark Wright <markwright@internode.on.net>
Mart Frauenlob <allkind@fastest.cc>
Martin Matuska <mm@FreeBSD.org>
Martin Rüegg <martin.rueegg@metaworx.ch>
Martin Wagner <martin.wagner.dev@gmail.com>
Massimo Maggi <me@massimo-maggi.eu>
Mateusz Guzik <mjguzik@gmail.com>
Mateusz Piotrowski <0mp@FreeBSD.org>
Mathieu Velten <matmaul@gmail.com>
Matt Fiddaman <github@m.fiddaman.uk>
Matthew Ahrens <matt@delphix.com>
Matthew Heller <matthew.f.heller@gmail.com>
Matthew Thode <mthode@mthode.org>
Matthias Blankertz <matthias@blankertz.org>
Matt Johnston <matt@fugro-fsi.com.au>
Matt Kemp <matt@mattikus.com>
Matt Macy <mmacy@freebsd.org>
Matthew Ahrens <matt@delphix.com>
Matthew Thode <mthode@mthode.org>
Matus Kral <matuskral@me.com>
Mauricio Faria de Oliveira <mfo@canonical.com>
Max Grossman <max.grossman@delphix.com>
Maxim Filimonov <che@bein.link>
Maximilian Mehnert <maximilian.mehnert@gmx.de>
Max Zettlmeißl <max@zettlmeissl.de>
Md Islam <mdnahian@outlook.com>
megari <megari@iki.fi>
Meriel Luna Mittelbach <lunarlambda@gmail.com>
Michael D Labriola <michael.d.labriola@gmail.com>
Michael Franzl <michael@franzl.name>
Michael Gebetsroither <michael@mgeb.org>
Michael Kjorling <michael@kjorling.se>
Michael Martin <mgmartin.mgm@gmail.com>
Michael Niewöhner <foss@mniewoehner.de>
Michael Zhivich <mzhivich@akamai.com>
Michal Vasilek <michal@vasilek.cz>
MigeljanImeri <ImeriMigel@gmail.com>
Mike Gerdts <mike.gerdts@joyent.com>
Mike Harsch <mike@harschsystems.com>
Mike Leddy <mike.leddy@gmail.com>
Mike Swanson <mikeonthecomputer@gmail.com>
Milan Jurik <milan.jurik@xylab.cz>
Minsoo Choo <minsoochoo0122@proton.me>
mnrx <mnrx@users.noreply.github.com>
Mohamed Tawfik <m_tawfik@aucegypt.edu>
Morgan Jones <mjones@rice.edu>
Moritz Maxeiner <moritz@ucworks.org>
Mo Zhou <cdluminate@gmail.com>
naivekun <naivekun@outlook.com>
nathancheek <myself@nathancheek.com>
Nathaniel Clark <Nathaniel.Clark@misrule.us>
Nathaniel Wesley Filardo <nwf@cs.jhu.edu>
Nathan Lewis <linux.robotdude@gmail.com>
nav1s <nav1s@proton.me>
Nav Ravindranath <nav@delphix.com>
Neal Gompa (ニール・ゴンパ) <ngompa13@gmail.com>
Ned Bass <bass6@llnl.gov>
Neependra Khare <neependra@kqinfotech.com>
Neil Stockbridge <neil@dist.ro>
Nick Black <dank@qemfd.net>
Nick Garvey <garvey.nick@gmail.com>
Nick Mattis <nickm970@gmail.com>
Nick Terrell <terrelln@fb.com>
Niklas Haas <github-c6e1c8@haasn.xyz>
Nikolay Borisov <n.borisov.lkml@gmail.com>
nordaux <nordaux@gmail.com>
ofthesun9 <olivier@ofthesun.net>
Olaf Faaland <faaland1@llnl.gov>
Oleg Drokin <green@linuxhacker.ru>
Oleg Stepura <oleg@stepura.com>
Olivier Certner <olce@FreeBSD.org>
Olivier Mazouffre <olivier.mazouffre@ims-bordeaux.fr>
omni <omni+vagant@hack.org>
Orivej Desh <orivej@gmx.fr>
Pablo Correa Gómez <ablocorrea@hotmail.com>
Palash Gandhi <pbg4930@rit.edu>
Patrick Fasano <patrick@patrickfasano.com>
Patrick Mooney <pmooney@pfmooney.com>
Patrick Xia <patrickx@google.com>
Patrik Greco <sikevux@sikevux.se>
Paul B. Henson <henson@acm.org>
Paul Dagnelie <pcd@delphix.com>
Paul Zuchowski <pzuchowski@datto.com>
Pavel Boldin <boldin.pavel@gmail.com>
Pavel Snajdr <snajpa@snajpa.net>
Pavel Zakharov <pavel.zakharov@delphix.com>
Pawel Jakub Dawidek <pjd@FreeBSD.org>
Pedro Giffuni <pfg@freebsd.org>
Peng <peng.hse@xtaotech.com>
Peng Liu <littlenewton6@gmail.com>
Peter Ashford <ashford@accs.com>
Peter Dave Hello <hsu@peterdavehello.org>
Peter Doherty <peterd@acranox.org>
Peter Levine <plevine457@gmail.com>
Peter Wirdemo <peter.wirdemo@gmail.com>
Petros Koutoupis <petros@petroskoutoupis.com>
Philip Pokorny <ppokorny@penguincomputing.com>
Philipp Riederer <pt@philipptoelke.de>
Phil Kauffman <philip@kauffman.me>
Phil Sutter <phil@nwl.cc>
Ping Huang <huangping@smartx.com>
Piotr Kubaj <pkubaj@anongoth.pl>
Piotr P. Stefaniak <pstef@freebsd.org>
poscat <poscat@poscat.moe>
Prakash Surya <prakash.surya@delphix.com>
Prasad Joshi <prasadjoshi124@gmail.com>
privb0x23 <privb0x23@users.noreply.github.com>
P.SCH <p88@yahoo.com>
Qiuhao Chen <chenqiuhao1997@gmail.com>
Quartz <yyhran@163.com>
Quentin Thébault <quentin.thebault@defenso.fr>
Quentin Zdanis <zdanisq@gmail.com>
Rafael Kitover <rkitover@gmail.com>
RageLtMan <sempervictus@users.noreply.github.com>
Ralf Ertzinger <ralf@skytale.net>
Randall Mason <ClashTheBunny@gmail.com>
Remy Blank <remy.blank@pobox.com>
renelson <bnelson@nelsonbe.com>
Reno Reckling <e-github@wthack.de>
René Wirnata <rene.wirnata@pandascience.net>
Ricardo M. Correia <ricardo.correia@oracle.com>
Riccardo Schirone <rschirone91@gmail.com>
Richard Allen <belperite@gmail.com>
Rich Ercolani <rincebrain@gmail.com>
Richard Elling <Richard.Elling@RichardElling.com>
Richard Kojedzinszky <richard@kojedz.in>
Richard Laager <rlaager@wiktel.com>
Richard Lowe <richlowe@richlowe.net>
Richard Sharpe <rsharpe@samba.org>
Richard Yao <ryao@gentoo.org>
Rich Ercolani <rincebrain@gmail.com>
Rick Macklem <rmacklem@uoguelph.ca>
rilysh <nightquick@proton.me>
Robert Evans <evansr@google.com>
Robert Novak <sailnfool@gmail.com>
Roberto Ricci <ricci@disroot.org>
Rob Norris <robn@despairlabs.com>
Rob Wing <rew@FreeBSD.org>
Rohan Puri <rohan.puri15@gmail.com>
Romain Dolbeau <romain.dolbeau@atos.net>
Roman Strashkin <roman.strashkin@nexenta.com>
Ross Williams <ross@ross-williams.net>
Ruben Kerkhof <ruben@rubenkerkhof.com>
Ryan <errornointernet@envs.net>
Ryan Hirasaki <ryanhirasaki@gmail.com>
Ryan Lahfa <masterancpp@gmail.com>
Ryan Libby <rlibby@FreeBSD.org>
Ryan Moeller <freqlabs@FreeBSD.org>
Sam Atkinson <samatk@amazon.com>
Sam Hathaway <github.com@munkynet.org>
Sam James <sam@gentoo.org>
Sam Lunt <samuel.j.lunt@gmail.com>
Samuel VERSCHELDE <stormi-github@ylix.fr>
Samuel Wycliffe <samuelwycliffe@gmail.com>
Samuel Wycliffe J <samwyc@hpe.com>
Sanjeev Bagewadi <sanjeev.bagewadi@gmail.com>
Sara Hartse <sara.hartse@delphix.com>
Saso Kiselkov <saso.kiselkov@nexenta.com>
Satadru Pramanik <satadru@gmail.com>
Savyasachee Jha <genghizkhan91@hawkradius.com>
Scott Colby <scott@scolby.com>
Scot W. Stevenson <scot.stevenson@gmail.com>
Sean Eric Fagan <sef@ixsystems.com>
Sebastian Gottschall <s.gottschall@dd-wrt.com>
Sebastian Pauka <me@spauka.se>
Sebastian Wuerl <s.wuerl@mailbox.org>
Sebastien Roy <seb@delphix.com>
Sen Haerens <sen@senhaerens.be>
Serapheim Dimitropoulos <serapheim@delphix.com>
Seth Forshee <seth.forshee@canonical.com>
Seth Hoffert <Seth.Hoffert@gmail.com>
Seth Troisi <sethtroisi@google.com>
Shaan Nobee <sniper111@gmail.com>
Shampavman <sham.pavman@nexenta.com>
Shaun Tancheff <shaun@aeonazure.com>
Shawn Bayern <sbayern@law.fsu.edu>
Shengqi Chen <harry-chen@outlook.com>
SHENGYI HONG <aokblast@FreeBSD.org>
Shen Yan <shenyanxxxy@qq.com>
Shreshth Srivastava <shreshthsrivastava2@gmail.com>
Sietse <sietse@wizdom.nu>
Simon Guest <simon.guest@tesujimath.org>
Simon Howard <fraggle@soulsphere.org>
Simon Klinkert <simon.klinkert@gmail.com>
Sowrabha Gopal <sowrabha.gopal@delphix.com>
Spencer Kinny <spencerkinny1995@gmail.com>
Srikanth N S <srikanth.nagasubbaraoseetharaman@hpe.com>
Stanislav Seletskiy <s.seletskiy@gmail.com>
Stefan Lendl <s.lendl@proxmox.com>
Steffen Müthing <steffen.muething@iwr.uni-heidelberg.de>
Stephen Blinick <stephen.blinick@delphix.com>
sterlingjensen <sterlingjensen@users.noreply.github.com>
Steve Dougherty <sdougherty@barracuda.com>
Steve Mokris <smokris@softpixel.com>
Steven Burgess <sburgess@dattobackup.com>
Steven Hartland <smh@freebsd.org>
Steven Johnson <sjohnson@sakuraindustries.com>
Steven Noonan <steven@uplinklabs.net>
stf <s@ctrlc.hu>
Stian Ellingsen <stian@plaimi.net>
Stoiko Ivanov <github@nomore.at>
Stéphane Lesimple <speed47_github@speed47.net>
Suman Chakravartula <schakrava@gmail.com>
Sydney Vanda <sydney.m.vanda@intel.com>
Syed Shahrukh Hussain <syed.shahrukh@ossrevival.org>
Sören Tempel <soeren+git@soeren-tempel.net>
Tamas TEVESZ <ice@extreme.hu>
Teodor Spæren <teodor_spaeren@riseup.net>
TerraTech <TerraTech@users.noreply.github.com>
Theera K. <tkittich@hotmail.com>
Thijs Cramer <thijs.cramer@gmail.com>
Thomas Bertschinger <bertschinger@lanl.gov>
Thomas Geppert <geppi@digitx.de>
Thomas Lamprecht <guggentom@hotmail.de>
Till Maas <opensource@till.name>
Tim Chase <tim@chase2k.com>
Tim Connors <tconnors@rather.puzzling.org>
Tim Crawford <tcrawford@datto.com>
Tim Haley <Tim.Haley@Sun.COM>
timor <timor.dd@googlemail.com>
Timothy Day <tday141@gmail.com>
Tim Schumacher <timschumi@gmx.de>
Tim Smith <tim@mondoo.com>
Tino Reichardt <milky-zfs@mcmilk.de>
tleydxdy <shironeko.github@tesaguri.club>
Tobin Harding <me@tobin.cc>
Todd Seidelmann <seidelma@users.noreply.github.com>
Todd Zullinger <tmz@pobox.com>
Tom Caputi <tcaputi@datto.com>
Tom Matthews <tom@axiom-partners.com>
Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Tom Prince <tom.prince@ualberta.net>
Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Tony Hutter <hutter2@llnl.gov>
Tony Nguyen <tony.nguyen@delphix.com>
Tony Perkins <tperkins@datto.com>
Toomas Soome <tsoome@me.com>
Torsten Wörtwein <twoertwein@gmail.com>
Toyam Cox <aviator45003@gmail.com>
Trevor Bautista <trevrb@trevrb.net>
Trey Dockendorf <treydock@gmail.com>
trick2011 <trick2011@users.noreply.github.com>
Troels Nørgaard <tnn@tradeshift.com>
tstabrawa <tstabrawa@users.noreply.github.com>
Tulsi Jain <tulsi.jain@delphix.com>
Turbo Fredriksson <turbo@bayour.com>
Tyler J. Stachecki <stachecki.tyler@gmail.com>
Umer Saleem <usaleem@ixsystems.com>
Vaibhav Bhanawat <vaibhav.bhanawat@delphix.com>
Valmiky Arquissandas <kayvlim@gmail.com>
Val Packett <val@packett.cool>
Vandana Rungta <vrungta@amazon.com>
Vince van Oosten <techhazard@codeforyouand.me>
Violet Purcell <vimproved@inventati.org>
Vipin Kumar Verma <vipin.verma@hpe.com>
Vitaut Bajaryn <vitaut.bayaryn@gmail.com>
Volker Mauel <volkermauel@gmail.com>
Václav Skála <skala@vshosting.cz>
Walter Huf <hufman@gmail.com>
Warner Losh <imp@bsdimp.com>
Weigang Li <weigang.li@intel.com>
WHR <msl0000023508@gmail.com>
Will Andrews <will@freebsd.org>
Will Rouesnel <w.rouesnel@gmail.com>
Windel Bouwman <windel@windel.nl>
Wojciech Małota-Wójcik <outofforest@users.noreply.github.com>
Wolfgang Bumiller <w.bumiller@proxmox.com>
XDTG <click1799@163.com>
Xin Li <delphij@FreeBSD.org>
Xinliang Liu <xinliang.liu@linaro.org>
xtouqh <xtouqh@hotmail.com>
Yann Collet <cyan@fb.com>
Yanping Gao <yanping.gao@xtaotech.com>
Ying Zhu <casualfisher@gmail.com>
Youzhong Yang <youzhong@gmail.com>
yparitcher <y@paritcher.com>
yuina822 <ayuichi@club.kyutech.ac.jp>
YunQiang Su <syq@debian.org>
Yuri Pankov <yuri.pankov@gmail.com>
Yuxin Wang <yuxinwang9999@gmail.com>
Yuxuan Shui <yshuiv7@gmail.com>
Zachary Bedell <zac@thebedells.org>
Zach Dykstra <dykstra.zachary@gmail.com>
zgock <zgock@nuc.base.zgock-lab.net>
Zhao Yongming <zym@apache.org>
Zhenlei Huang <zlei@FreeBSD.org>
Zhu Chuang <chuang@melty.land>
Érico Nogueira <erico.erc@gmail.com>
Đoàn Trần Công Danh <congdanhqx@gmail.com>
韩朴宇 <w12101111@gmail.com>
+2 -2
View File
@@ -1,2 +1,2 @@
The [OpenZFS Code of Conduct](https://openzfs.org/wiki/Code_of_Conduct)
applies to spaces associated with the OpenZFS project, including GitHub.
The [OpenZFS Code of Conduct](http://www.open-zfs.org/wiki/Code_of_Conduct)
applies to spaces associated with the ZFS on Linux project, including GitHub.
+5 -5
View File
@@ -19,11 +19,11 @@ notable exceptions and their respective licenses include:
* AES Implementation: module/icp/asm-x86_64/aes/THIRDPARTYLICENSE.gladman
* AES Implementation: module/icp/asm-x86_64/aes/THIRDPARTYLICENSE.openssl
* PBKDF2 Implementation: lib/libzfs/THIRDPARTYLICENSE.openssl
* SPL Implementation: module/os/linux/spl/THIRDPARTYLICENSE.gplv2
* GCM Implementation: module/icp/asm-x86_64/modes/THIRDPARTYLICENSE.cryptogams
* GCM Implementation: module/icp/asm-x86_64/modes/THIRDPARTYLICENSE.openssl
* GHASH Implementation: module/icp/asm-x86_64/modes/THIRDPARTYLICENSE.cryptogams
* GHASH Implementation: module/icp/asm-x86_64/modes/THIRDPARTYLICENSE.openssl
* SPL Implementation: module/spl/THIRDPARTYLICENSE.gplv2
* GCM Implementaion: module/icp/asm-x86_64/modes/THIRDPARTYLICENSE.cryptogams
* GCM Implementaion: module/icp/asm-x86_64/modes/THIRDPARTYLICENSE.openssl
* GHASH Implementaion: module/icp/asm-x86_64/modes/THIRDPARTYLICENSE.cryptogams
* GHASH Implementaion: module/icp/asm-x86_64/modes/THIRDPARTYLICENSE.openssl
This product includes software developed by the OpenSSL Project for use
in the OpenSSL Toolkit (http://www.openssl.org/)
+4 -4
View File
@@ -1,10 +1,10 @@
Meta: 1
Name: zfs
Branch: 1.0
Version: 2.4.1
Version: 0.8.5
Release: 1
Release-Tags: relext
License: CDDL
Author: OpenZFS
Linux-Maximum: 6.19
Linux-Minimum: 4.18
Author: OpenZFS on Linux
Linux-Maximum: 5.6
Linux-Minimum: 2.6.32
+101 -142
View File
@@ -1,86 +1,51 @@
CLEANFILES =
dist_noinst_DATA =
INSTALL_DATA_HOOKS =
INSTALL_EXEC_HOOKS =
ALL_LOCAL =
CLEAN_LOCAL =
CHECKS = shellcheck checkbashisms
include $(top_srcdir)/config/Rules.am
include $(top_srcdir)/config/CppCheck.am
include $(top_srcdir)/config/Shellcheck.am
include $(top_srcdir)/config/Substfiles.am
include $(top_srcdir)/scripts/Makefile.am
ACLOCAL_AMFLAGS = -I config
SUBDIRS = include
if BUILD_LINUX
include $(srcdir)/%D%/rpm/Makefile.am
endif
include config/rpm.am
include config/deb.am
include config/tgz.am
SUBDIRS = include rpm
if CONFIG_USER
include $(srcdir)/%D%/cmd/Makefile.am
include $(srcdir)/%D%/contrib/Makefile.am
include $(srcdir)/%D%/etc/Makefile.am
include $(srcdir)/%D%/lib/Makefile.am
include $(srcdir)/%D%/man/Makefile.am
include $(srcdir)/%D%/tests/Makefile.am
if BUILD_LINUX
include $(srcdir)/%D%/udev/Makefile.am
SUBDIRS += udev etc man scripts lib tests cmd contrib
endif
endif
CPPCHECKDIRS += module
if CONFIG_KERNEL
SUBDIRS += module
extradir = $(prefix)/src/zfs-$(VERSION)
extra_HEADERS = zfs.release.in zfs_config.h.in
kerneldir = $(prefix)/src/zfs-$(VERSION)/$(LINUX_VERSION)
nodist_kernel_HEADERS = zfs.release zfs_config.h module/$(LINUX_SYMBOLS)
endif
dist_noinst_DATA += autogen.sh copy-builtin
dist_noinst_DATA += AUTHORS CODE_OF_CONDUCT.md COPYRIGHT LICENSE META NEWS NOTICE
dist_noinst_DATA += README.md RELEASES.md
dist_noinst_DATA += module/lua/README.zfs module/os/linux/spl/README.md
AUTOMAKE_OPTIONS = foreign
EXTRA_DIST = autogen.sh copy-builtin
EXTRA_DIST += config/config.awk config/rpm.am config/deb.am config/tgz.am
EXTRA_DIST += META AUTHORS COPYRIGHT LICENSE NEWS NOTICE README.md
EXTRA_DIST += CODE_OF_CONDUCT.md
# Include all the extra licensing information for modules
dist_noinst_DATA += module/icp/algs/skein/THIRDPARTYLICENSE
dist_noinst_DATA += module/icp/algs/skein/THIRDPARTYLICENSE.descrip
dist_noinst_DATA += module/icp/asm-x86_64/aes/THIRDPARTYLICENSE.gladman
dist_noinst_DATA += module/icp/asm-x86_64/aes/THIRDPARTYLICENSE.gladman.descrip
dist_noinst_DATA += module/icp/asm-x86_64/aes/THIRDPARTYLICENSE.openssl
dist_noinst_DATA += module/icp/asm-x86_64/aes/THIRDPARTYLICENSE.openssl.descrip
dist_noinst_DATA += module/icp/asm-x86_64/modes/THIRDPARTYLICENSE.cryptogams
dist_noinst_DATA += module/icp/asm-x86_64/modes/THIRDPARTYLICENSE.cryptogams.descrip
dist_noinst_DATA += module/icp/asm-x86_64/modes/THIRDPARTYLICENSE.openssl
dist_noinst_DATA += module/icp/asm-x86_64/modes/THIRDPARTYLICENSE.openssl.descrip
dist_noinst_DATA += module/os/linux/spl/THIRDPARTYLICENSE.gplv2
dist_noinst_DATA += module/os/linux/spl/THIRDPARTYLICENSE.gplv2.descrip
dist_noinst_DATA += module/zfs/THIRDPARTYLICENSE.cityhash
dist_noinst_DATA += module/zfs/THIRDPARTYLICENSE.cityhash.descrip
EXTRA_DIST += module/icp/algs/skein/THIRDPARTYLICENSE
EXTRA_DIST += module/icp/algs/skein/THIRDPARTYLICENSE.descrip
EXTRA_DIST += module/icp/asm-x86_64/aes/THIRDPARTYLICENSE.gladman
EXTRA_DIST += module/icp/asm-x86_64/aes/THIRDPARTYLICENSE.gladman.descrip
EXTRA_DIST += module/icp/asm-x86_64/aes/THIRDPARTYLICENSE.openssl
EXTRA_DIST += module/icp/asm-x86_64/aes/THIRDPARTYLICENSE.openssl.descrip
EXTRA_DIST += module/spl/THIRDPARTYLICENSE.gplv2
EXTRA_DIST += module/spl/THIRDPARTYLICENSE.gplv2.descrip
EXTRA_DIST += module/zfs/THIRDPARTYLICENSE.cityhash
EXTRA_DIST += module/zfs/THIRDPARTYLICENSE.cityhash.descrip
@CODE_COVERAGE_RULES@
GITREV = include/zfs_gitrev.h
CLEANFILES += $(GITREV)
PHONY += gitrev
.PHONY: gitrev
gitrev:
$(AM_V_GEN)$(top_srcdir)/scripts/make_gitrev.sh $(GITREV)
-${top_srcdir}/scripts/make_gitrev.sh
all: gitrev
BUILT_SOURCES = gitrev
PHONY += install-data-hook $(INSTALL_DATA_HOOKS)
install-data-hook: $(INSTALL_DATA_HOOKS)
PHONY += install-exec-hook $(INSTALL_EXEC_HOOKS)
install-exec-hook: $(INSTALL_EXEC_HOOKS)
PHONY += maintainer-clean-local
maintainer-clean-local:
-$(RM) $(GITREV)
PHONY += distclean-local
distclean-local:
# Double-colon rules are allowed; there are multiple independent definitions.
distclean-local::
-$(RM) -R autom4te*.cache build
-find . \( -name SCCS -o -name BitKeeper -o -name .svn -o -name CVS \
-o -name .pc -o -name .hg -o -name .git \) -prune -o \
@@ -90,126 +55,120 @@ distclean-local:
-o -name 'core' -o -name 'Makefile' -o -name 'Module.symvers' \
-o -name '*.order' -o -name '*.markers' -o -name '*.gcda' \
-o -name '*.gcno' \) \
-type f -delete
-type f -print | xargs $(RM)
PHONY += $(CLEAN_LOCAL)
clean-local: $(CLEAN_LOCAL)
all-local:
-[ -x ${top_builddir}/scripts/zfs-tests.sh ] && \
${top_builddir}/scripts/zfs-tests.sh -c
PHONY += $(ALL_LOCAL)
all-local: $(ALL_LOCAL)
dist-hook: gitrev
cp ${top_srcdir}/include/zfs_gitrev.h $(distdir)/include; \
sed -i 's/Release:[[:print:]]*/Release: $(RELEASE)/' \
$(distdir)/META
dist-hook:
$(top_srcdir)/scripts/make_gitrev.sh -D $(distdir) $(GITREV)
$(SED) $(ac_inplace) 's/\(Release:[[:space:]]*\).*/\1$(RELEASE)/' $(distdir)/META
# For compatibility, create a matching spl-x.y.z directly which contains
# symlinks to the updated header and object file locations. These
# compatibility links will be removed in the next major release.
if CONFIG_KERNEL
install-data-hook:
rm -rf $(DESTDIR)$(prefix)/src/spl-$(VERSION) && \
mkdir $(DESTDIR)$(prefix)/src/spl-$(VERSION) && \
cd $(DESTDIR)$(prefix)/src/spl-$(VERSION) && \
ln -s ../zfs-$(VERSION)/include/spl include && \
ln -s ../zfs-$(VERSION)/$(LINUX_VERSION) $(LINUX_VERSION) && \
ln -s ../zfs-$(VERSION)/zfs_config.h.in spl_config.h.in && \
ln -s ../zfs-$(VERSION)/zfs.release.in spl.release.in && \
cd $(DESTDIR)$(prefix)/src/zfs-$(VERSION)/$(LINUX_VERSION) && \
ln -fs zfs_config.h spl_config.h && \
ln -fs zfs.release spl.release
endif
PHONY += codecheck $(CHECKS)
codecheck: $(CHECKS)
codecheck: cstyle shellcheck flake8 mancheck testscheck vcscheck
SHELLCHECKSCRIPTS += autogen.sh
PHONY += checkstyle
checkstyle: codecheck commitcheck
PHONY += commitcheck
commitcheck:
$(AM_V_at)if git rev-parse --git-dir > /dev/null 2>&1; then \
@if git rev-parse --git-dir > /dev/null 2>&1; then \
${top_srcdir}/scripts/commitcheck.sh; \
fi
CHECKS += spdxcheck
spdxcheck:
$(AM_V_at)$(top_srcdir)/scripts/spdxcheck.pl
if HAVE_PARALLEL
cstyle_line = -print0 | parallel -X0 ${top_srcdir}/scripts/cstyle.pl -cpP {}
else
cstyle_line = -exec ${top_srcdir}/scripts/cstyle.pl -cpP {} +
endif
CHECKS += cstyle
cstyle:
$(AM_V_at)find $(top_srcdir) -name build -prune \
-o -type f -name '*.[hc]' \
! -name 'zfs_config.*' ! -name '*.mod.c' \
! -name 'opt_global.h' ! -name '*_if*.h' \
! -name 'zstd_compat_wrapper.h' \
! -path './module/zstd/lib/*' \
! -path './include/sys/lua/*' \
! -path './module/lua/l*.[ch]' \
! -path './module/zfs/lz4.c' \
$(cstyle_line)
@find ${top_srcdir} -name build -prune -o -name '*.[hc]' \
! -name 'zfs_config.*' ! -name '*.mod.c' -type f \
-exec ${top_srcdir}/scripts/cstyle.pl -cpP {} \+
shellcheck:
@if type shellcheck > /dev/null 2>&1; then \
shellcheck --exclude=SC1090 --format=gcc \
$$(find ${top_srcdir}/scripts/*.sh -type f) \
$$(find ${top_srcdir}/cmd/zed/zed.d/*.sh -type f) \
$$(find ${top_srcdir}/cmd/zpool/zpool.d/* -executable); \
else \
echo "skipping shellcheck because shellcheck is not installed"; \
fi
mancheck:
@if type mandoc > /dev/null 2>&1; then \
find ${top_srcdir}/man/man8 -type f -name 'zfs.8' \
-o -name 'zpool.8' -o -name 'zdb.8' \
-o -name 'zgenhostid.8' | \
xargs mandoc -Tlint -Werror; \
else \
echo "skipping mancheck because mandoc is not installed"; \
fi
filter_executable = -exec test -x '{}' \; -print
CHECKS += testscheck
testscheck:
$(AM_V_at)[ $$(find $(top_srcdir)/tests/zfs-tests -type f \
\( -name '*.ksh' -not $(filter_executable) \) -o \
\( -name '*.kshlib' $(filter_executable) \) -o \
\( -name '*.shlib' $(filter_executable) \) -o \
\( -name '*.cfg' $(filter_executable) \) | \
tee /dev/stderr | wc -l) -eq 0 ]
@find ${top_srcdir}/tests/zfs-tests -type f \
\( -name '*.ksh' -not -executable \) -o \
\( -name '*.kshlib' -executable \) -o \
\( -name '*.shlib' -executable \) -o \
\( -name '*.cfg' -executable \) | \
xargs -r stat -c '%A %n' | \
awk '{c++; print} END {if(c>0) exit 1}'
CHECKS += vcscheck
vcscheck:
$(AM_V_at)if git rev-parse --git-dir > /dev/null 2>&1; then \
@if git rev-parse --git-dir > /dev/null 2>&1; then \
git ls-files . --exclude-standard --others | \
awk '{c++; print} END {if(c>0) exit 1}' ; \
fi
CHECKS += zstdcheck
zstdcheck:
@$(MAKE) -C module check-zstd-symbols
PHONY += lint
lint: cppcheck paxcheck
PHONY += paxcheck
cppcheck:
@if type cppcheck > /dev/null 2>&1; then \
cppcheck --quiet --force --error-exitcode=2 --inline-suppr \
--suppressions-list=.github/suppressions.txt \
-UHAVE_SSE2 -UHAVE_AVX512F -UHAVE_UIO_ZEROCOPY \
${top_srcdir}; \
else \
echo "skipping cppcheck because cppcheck is not installed"; \
fi
paxcheck:
$(AM_V_at)if type scanelf > /dev/null 2>&1; then \
$(top_srcdir)/scripts/paxcheck.sh $(top_builddir); \
@if type scanelf > /dev/null 2>&1; then \
${top_srcdir}/scripts/paxcheck.sh ${top_srcdir}; \
else \
echo "skipping paxcheck because scanelf is not installed"; \
fi
CHECKS += flake8
flake8:
$(AM_V_at)if type flake8 > /dev/null 2>&1; then \
flake8 $(top_srcdir); \
@if type flake8 > /dev/null 2>&1; then \
flake8 ${top_srcdir}; \
else \
echo "skipping flake8 because flake8 is not installed"; \
fi
PHONY += regen-tests
regen-tests:
@$(MAKE) -C tests/zfs-tests/tests regen
PHONY += ctags
ctags:
$(RM) tags
find $(top_srcdir) -name '.?*' -prune \
-o -type f -name '*.[hcS]' -exec ctags -a {} +
find $(top_srcdir) -name .git -prune -o -name '*.[hc]' | xargs ctags
PHONY += etags
etags:
$(RM) TAGS
find $(top_srcdir) -name '.?*' -prune \
-o -type f -name '*.[hcS]' -exec etags -a {} +
find $(top_srcdir) -name .pc -prune -o -name '*.[hc]' | xargs etags -a
PHONY += cscopelist
cscopelist:
find $(top_srcdir) -name '.?*' -prune \
-o -type f -name '*.[hc]' -print >cscope.files
PHONY += tags
tags: ctags etags
PHONY += pkg pkg-dkms pkg-kmod pkg-utils
pkg: @DEFAULT_PACKAGE@
pkg-dkms: @DEFAULT_PACKAGE@-dkms
pkg-kmod: @DEFAULT_PACKAGE@-kmod
pkg-utils: @DEFAULT_PACKAGE@-utils
include config/rpm.am
include config/deb.am
include config/tgz.am
.PHONY: $(PHONY)
+1 -1
View File
@@ -1,3 +1,3 @@
Descriptions of all releases can be found on github:
https://github.com/openzfs/zfs/releases
https://github.com/zfsonlinux/zfs/releases
+12 -16
View File
@@ -1,35 +1,31 @@
![img](https://openzfs.github.io/openzfs-docs/_static/img/logo/480px-Open-ZFS-Secondary-Logo-Colour-halfsize.png)
![img](http://zfsonlinux.org/images/zfs-linux.png)
OpenZFS is an advanced file system and volume manager which was originally
ZFS on Linux is an advanced file system and volume manager which was originally
developed for Solaris and is now maintained by the OpenZFS community.
This repository contains the code for running OpenZFS on Linux and FreeBSD.
[![codecov](https://codecov.io/gh/openzfs/zfs/branch/master/graph/badge.svg)](https://codecov.io/gh/openzfs/zfs)
[![coverity](https://scan.coverity.com/projects/1973/badge.svg)](https://scan.coverity.com/projects/openzfs-zfs)
[![codecov](https://codecov.io/gh/zfsonlinux/zfs/branch/master/graph/badge.svg)](https://codecov.io/gh/zfsonlinux/zfs)
[![coverity](https://scan.coverity.com/projects/1973/badge.svg)](https://scan.coverity.com/projects/zfsonlinux-zfs)
# Official Resources
* [Documentation](https://openzfs.github.io/openzfs-docs/) - for using and developing this repo
* [ZoL site](https://zfsonlinux.org) - Linux release info & links
* [Mailing lists](https://openzfs.github.io/openzfs-docs/Project%20and%20Community/Mailing%20Lists.html)
* [OpenZFS site](https://openzfs.org/) - for conference videos and info on other platforms (illumos, OSX, Windows, etc)
* [Site](http://zfsonlinux.org)
* [Wiki](https://github.com/zfsonlinux/zfs/wiki)
* [Mailing lists](https://github.com/zfsonlinux/zfs/wiki/Mailing-Lists)
* [OpenZFS site](http://open-zfs.org/)
# Installation
Full documentation for installing OpenZFS on your favorite operating system can
be found at the [Getting Started Page](https://openzfs.github.io/openzfs-docs/Getting%20Started/index.html).
Full documentation for installing ZoL on your favorite Linux distribution can
be found at [our site](http://zfsonlinux.org/).
# Contribute & Develop
We have a separate document with [contribution guidelines](./.github/CONTRIBUTING.md).
We have a [Code of Conduct](./CODE_OF_CONDUCT.md).
# Release
OpenZFS is released under a CDDL license.
ZFS on Linux is released under a CDDL license.
For more details see the NOTICE, LICENSE and COPYRIGHT files; `UCRL-CODE-235197`
# Supported Kernels
* The `META` file contains the officially recognized supported Linux kernel versions.
* Supported FreeBSD versions are any supported branches and releases starting from 13.0-RELEASE.
* The `META` file contains the officially recognized supported kernel versions.
-37
View File
@@ -1,37 +0,0 @@
OpenZFS uses the MAJOR.MINOR.PATCH versioning scheme described here:
* MAJOR - Incremented at the discretion of the OpenZFS developers to indicate
a particularly noteworthy feature or change. An increase in MAJOR number
does not indicate any incompatible on-disk format change. The ability
to import a ZFS pool is controlled by the feature flags enabled on the
pool and the feature flags supported by the installed OpenZFS version.
Increasing the MAJOR version is expected to be an infrequent occurrence.
* MINOR - Incremented to indicate new functionality such as a new feature
flag, pool/dataset property, zfs/zpool sub-command, new user/kernel
interface, etc. MINOR releases may introduce incompatible changes to the
user space library APIs (libzfs.so). Existing user/kernel interfaces are
considered to be stable to maximize compatibility between OpenZFS releases.
Additions to the user/kernel interface are backwards compatible.
* PATCH - Incremented when applying documentation updates, important bug
fixes, minor performance improvements, and kernel compatibility patches.
The user space library APIs and user/kernel interface are considered to
be stable. PATCH releases for a MAJOR.MINOR are published as needed.
Two release branches are maintained for OpenZFS, they are:
* OpenZFS LTS - A designated MAJOR.MINOR release with periodic PATCH
releases that incorporate important changes backported from newer OpenZFS
releases. This branch is intended for use in environments using an
LTS, enterprise, or similarly managed kernel (RHEL, Ubuntu LTS, Debian).
Minor changes to support these distribution kernels will be applied as
needed. New kernel versions released after the OpenZFS LTS release are
not supported. LTS releases will receive patches for at least 2 years.
The current LTS release is OpenZFS 2.2.
* OpenZFS current - Tracks the newest MAJOR.MINOR release. This branch
includes support for the latest OpenZFS features and recently releases
kernels. When a new MINOR release is tagged the previous MINOR release
will no longer be maintained (unless it is an LTS release). New MINOR
releases are planned to occur roughly annually.
+61
View File
@@ -48,3 +48,64 @@
#TEST_ZFSSTRESS_VDEV="/var/tmp/vdev"
#TEST_ZFSSTRESS_DIR="/$TEST_ZFSSTRESS_POOL/$TEST_ZFSSTRESS_FS"
#TEST_ZFSSTRESS_OPTIONS=""
### per-builder customization
#
# BB_NAME=builder-name <distribution-version-architecture-type>
# - distribution=Amazon,Debian,Fedora,RHEL,SUSE,Ubuntu
# - version=x.y
# - architecture=x86_64,i686,arm,aarch64
# - type=build,test
#
case "$BB_NAME" in
Amazon*)
# ZFS enabled xfstests fails to build
TEST_XFSTESTS_SKIP="yes"
;;
CentOS-7*)
# ZFS enabled xfstests fails to build
TEST_XFSTESTS_SKIP="yes"
;;
CentOS-6*)
;;
Debian*)
;;
Fedora*)
;;
RHEL*)
;;
SUSE*)
;;
Ubuntu-16.04*)
# ZFS enabled xfstests fails to build
TEST_XFSTESTS_SKIP="yes"
;;
Ubuntu*)
;;
*)
;;
esac
###
#
# Run ztest longer on the "coverage" builders to gain more code coverage
# data out of ztest, libzpool, etc.
#
case "$BB_NAME" in
*coverage*)
TEST_ZTEST_TIMEOUT=3600
;;
*)
TEST_ZTEST_TIMEOUT=900
;;
esac
###
#
# Disable the following test suites on 32-bit systems.
#
if [ $(getconf LONG_BIT) = "32" ]; then
TEST_ZTEST_SKIP="yes"
TEST_XFSTESTS_SKIP="yes"
TEST_ZFSSTRESS_SKIP="yes"
fi
+2 -1
View File
@@ -1,3 +1,4 @@
#!/bin/sh
autoreconf -fiv "$(dirname "$0")" && rm -rf "$(dirname "$0")"/autom4te.cache
autoreconf -fiv || exit 1
rm -Rf autom4te.cache
+4 -109
View File
@@ -1,113 +1,8 @@
bin_SCRIPTS =
bin_PROGRAMS =
sbin_SCRIPTS =
sbin_PROGRAMS =
dist_bin_SCRIPTS =
zfsexec_PROGRAMS =
mounthelper_PROGRAMS =
sbin_SCRIPTS += fsck.zfs
SHELLCHECKSCRIPTS += fsck.zfs
CLEANFILES += fsck.zfs
dist_noinst_DATA += %D%/fsck.zfs.in
$(call SUBST,fsck.zfs,%D%/)
sbin_PROGRAMS += zfs_ids_to_path
CPPCHECKTARGETS += zfs_ids_to_path
zfs_ids_to_path_SOURCES = \
%D%/zfs_ids_to_path.c
zfs_ids_to_path_LDADD = \
libzfs.la
zhack_CPPFLAGS = $(AM_CPPFLAGS) $(LIBZPOOL_CPPFLAGS)
sbin_PROGRAMS += zhack
CPPCHECKTARGETS += zhack
zhack_SOURCES = \
%D%/zhack.c
zhack_LDADD = \
libzpool.la \
libzfs_core.la \
libnvpair.la
ztest_CFLAGS = $(AM_CFLAGS) $(KERNEL_CFLAGS)
ztest_CPPFLAGS = $(AM_CPPFLAGS) $(LIBZPOOL_CPPFLAGS)
sbin_PROGRAMS += ztest
CPPCHECKTARGETS += ztest
ztest_SOURCES = \
%D%/ztest.c
ztest_LDADD = \
libzpool.la \
libzfs_core.la \
libnvpair.la
ztest_LDADD += -lm
ztest_LDFLAGS = -pthread
include $(srcdir)/%D%/raidz_test/Makefile.am
include $(srcdir)/%D%/zdb/Makefile.am
include $(srcdir)/%D%/zfs/Makefile.am
include $(srcdir)/%D%/zinject/Makefile.am
include $(srcdir)/%D%/zpool/Makefile.am
include $(srcdir)/%D%/zpool_influxdb/Makefile.am
include $(srcdir)/%D%/zstream/Makefile.am
if BUILD_LINUX
mounthelper_PROGRAMS += mount.zfs
CPPCHECKTARGETS += mount.zfs
mount_zfs_SOURCES = \
%D%/mount_zfs.c
mount_zfs_LDADD = \
libzfs.la \
libzfs_core.la \
libnvpair.la
mount_zfs_LDADD += $(LTLIBINTL)
CPPCHECKTARGETS += raidz_test
sbin_PROGRAMS += zgenhostid
CPPCHECKTARGETS += zgenhostid
zgenhostid_SOURCES = \
%D%/zgenhostid.c
dist_bin_SCRIPTS += %D%/zvol_wait
SHELLCHECKSCRIPTS += %D%/zvol_wait
include $(srcdir)/%D%/zed/Makefile.am
endif
SUBDIRS = zfs zpool zdb zhack zinject zstreamdump ztest
SUBDIRS += fsck_zfs vdev_id raidz_test zgenhostid
if USING_PYTHON
bin_SCRIPTS += zarcsummary zarcstat dbufstat zilstat
CLEANFILES += zarcsummary zarcstat dbufstat zilstat
dist_noinst_DATA += %D%/zarcsummary %D%/zarcstat.in %D%/dbufstat.in %D%/zilstat.in
$(call SUBST,zarcstat,%D%/)
$(call SUBST,dbufstat,%D%/)
$(call SUBST,zilstat,%D%/)
zarcsummary: %D%/zarcsummary
$(AM_V_at)cp $< $@
SUBDIRS += arcstat arc_summary dbufstat
endif
PHONY += cmd
cmd: $(bin_SCRIPTS) $(bin_PROGRAMS) $(sbin_SCRIPTS) $(sbin_PROGRAMS) $(dist_bin_SCRIPTS) $(zfsexec_PROGRAMS) $(mounthelper_PROGRAMS)
SUBDIRS += mount_zfs zed zvol_id zvol_wait
+11
View File
@@ -0,0 +1,11 @@
EXTRA_DIST = arc_summary2 arc_summary3
if USING_PYTHON_2
dist_bin_SCRIPTS = arc_summary2
install-exec-hook:
mv $(DESTDIR)$(bindir)/arc_summary2 $(DESTDIR)$(bindir)/arc_summary
else
dist_bin_SCRIPTS = arc_summary3
install-exec-hook:
mv $(DESTDIR)$(bindir)/arc_summary3 $(DESTDIR)$(bindir)/arc_summary
endif
+1081
View File
File diff suppressed because it is too large Load Diff
+875
View File
@@ -0,0 +1,875 @@
#!/usr/bin/env python3
#
# Copyright (c) 2008 Ben Rockwood <benr@cuddletech.com>,
# Copyright (c) 2010 Martin Matuska <mm@FreeBSD.org>,
# Copyright (c) 2010-2011 Jason J. Hellenthal <jhell@DataIX.net>,
# Copyright (c) 2017 Scot W. Stevenson <scot.stevenson@gmail.com>
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
#
# 1. Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
#
# THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND
# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
# ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE
# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
# OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
# HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
# LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
# OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
# SUCH DAMAGE.
"""Print statistics on the ZFS ARC Cache and other information
Provides basic information on the ARC, its efficiency, the L2ARC (if present),
the Data Management Unit (DMU), Virtual Devices (VDEVs), and tunables. See
the in-source documentation and code at
https://github.com/zfsonlinux/zfs/blob/master/module/zfs/arc.c for details.
The original introduction to arc_summary can be found at
http://cuddletech.com/?p=454
"""
import argparse
import os
import subprocess
import sys
import time
DESCRIPTION = 'Print ARC and other statistics for ZFS on Linux'
INDENT = ' '*8
LINE_LENGTH = 72
PROC_PATH = '/proc/spl/kstat/zfs/'
SPL_PATH = '/sys/module/spl/parameters/'
TUNABLES_PATH = '/sys/module/zfs/parameters/'
DATE_FORMAT = '%a %b %d %H:%M:%S %Y'
TITLE = 'ZFS Subsystem Report'
SECTIONS = 'arc archits dmu l2arc spl tunables vdev zil'.split()
SECTION_HELP = 'print info from one section ('+' '.join(SECTIONS)+')'
# Tunables and SPL are handled separately because they come from
# different sources
SECTION_PATHS = {'arc': 'arcstats',
'dmu': 'dmu_tx',
'l2arc': 'arcstats', # L2ARC stuff lives in arcstats
'vdev': 'vdev_cache_stats',
'xuio': 'xuio_stats',
'zfetch': 'zfetchstats',
'zil': 'zil'}
parser = argparse.ArgumentParser(description=DESCRIPTION)
parser.add_argument('-a', '--alternate', action='store_true', default=False,
help='use alternate formatting for tunables and SPL',
dest='alt')
parser.add_argument('-d', '--description', action='store_true', default=False,
help='print descriptions with tunables and SPL',
dest='desc')
parser.add_argument('-g', '--graph', action='store_true', default=False,
help='print graph on ARC use and exit', dest='graph')
parser.add_argument('-p', '--page', type=int, dest='page',
help='print page by number (DEPRECATED, use "-s")')
parser.add_argument('-r', '--raw', action='store_true', default=False,
help='dump all available data with minimal formatting',
dest='raw')
parser.add_argument('-s', '--section', dest='section', help=SECTION_HELP)
ARGS = parser.parse_args()
def cleanup_line(single_line):
"""Format a raw line of data from /proc and isolate the name value
part, returning a tuple with each. Currently, this gets rid of the
middle '4'. For example "arc_no_grow 4 0" returns the tuple
("arc_no_grow", "0").
"""
name, _, value = single_line.split()
return name, value
def draw_graph(kstats_dict):
"""Draw a primitive graph representing the basic information on the
ARC -- its size and the proportion used by MFU and MRU -- and quit.
We use max size of the ARC to calculate how full it is. This is a
very rough representation.
"""
arc_stats = isolate_section('arcstats', kstats_dict)
GRAPH_INDENT = ' '*4
GRAPH_WIDTH = 60
arc_size = f_bytes(arc_stats['size'])
arc_perc = f_perc(arc_stats['size'], arc_stats['c_max'])
mfu_size = f_bytes(arc_stats['mfu_size'])
mru_size = f_bytes(arc_stats['mru_size'])
meta_limit = f_bytes(arc_stats['arc_meta_limit'])
meta_size = f_bytes(arc_stats['arc_meta_used'])
dnode_limit = f_bytes(arc_stats['arc_dnode_limit'])
dnode_size = f_bytes(arc_stats['dnode_size'])
info_form = ('ARC: {0} ({1}) MFU: {2} MRU: {3} META: {4} ({5}) '
'DNODE {6} ({7})')
info_line = info_form.format(arc_size, arc_perc, mfu_size, mru_size,
meta_size, meta_limit, dnode_size,
dnode_limit)
info_spc = ' '*int((GRAPH_WIDTH-len(info_line))/2)
info_line = GRAPH_INDENT+info_spc+info_line
graph_line = GRAPH_INDENT+'+'+('-'*(GRAPH_WIDTH-2))+'+'
mfu_perc = float(int(arc_stats['mfu_size'])/int(arc_stats['c_max']))
mru_perc = float(int(arc_stats['mru_size'])/int(arc_stats['c_max']))
arc_perc = float(int(arc_stats['size'])/int(arc_stats['c_max']))
total_ticks = float(arc_perc)*GRAPH_WIDTH
mfu_ticks = mfu_perc*GRAPH_WIDTH
mru_ticks = mru_perc*GRAPH_WIDTH
other_ticks = total_ticks-(mfu_ticks+mru_ticks)
core_form = 'F'*int(mfu_ticks)+'R'*int(mru_ticks)+'O'*int(other_ticks)
core_spc = ' '*(GRAPH_WIDTH-(2+len(core_form)))
core_line = GRAPH_INDENT+'|'+core_form+core_spc+'|'
for line in ('', info_line, graph_line, core_line, graph_line, ''):
print(line)
def f_bytes(byte_string):
"""Return human-readable representation of a byte value in
powers of 2 (eg "KiB" for "kibibytes", etc) to two decimal
points. Values smaller than one KiB are returned without
decimal points. Note "bytes" is a reserved keyword.
"""
prefixes = ([2**80, "YiB"], # yobibytes (yotta)
[2**70, "ZiB"], # zebibytes (zetta)
[2**60, "EiB"], # exbibytes (exa)
[2**50, "PiB"], # pebibytes (peta)
[2**40, "TiB"], # tebibytes (tera)
[2**30, "GiB"], # gibibytes (giga)
[2**20, "MiB"], # mebibytes (mega)
[2**10, "KiB"]) # kibibytes (kilo)
bites = int(byte_string)
if bites >= 2**10:
for limit, unit in prefixes:
if bites >= limit:
value = bites / limit
break
result = '{0:.1f} {1}'.format(value, unit)
else:
result = '{0} Bytes'.format(bites)
return result
def f_hits(hits_string):
"""Create a human-readable representation of the number of hits.
The single-letter symbols used are SI to avoid the confusion caused
by the different "short scale" and "long scale" representations in
English, which use the same words for different values. See
https://en.wikipedia.org/wiki/Names_of_large_numbers and:
https://physics.nist.gov/cuu/Units/prefixes.html
"""
numbers = ([10**24, 'Y'], # yotta (septillion)
[10**21, 'Z'], # zetta (sextillion)
[10**18, 'E'], # exa (quintrillion)
[10**15, 'P'], # peta (quadrillion)
[10**12, 'T'], # tera (trillion)
[10**9, 'G'], # giga (billion)
[10**6, 'M'], # mega (million)
[10**3, 'k']) # kilo (thousand)
hits = int(hits_string)
if hits >= 1000:
for limit, symbol in numbers:
if hits >= limit:
value = hits/limit
break
result = "%0.1f%s" % (value, symbol)
else:
result = "%d" % hits
return result
def f_perc(value1, value2):
"""Calculate percentage and return in human-readable form. If
rounding produces the result '0.0' though the first number is
not zero, include a 'less-than' symbol to avoid confusion.
Division by zero is handled by returning 'n/a'; no error
is called.
"""
v1 = float(value1)
v2 = float(value2)
try:
perc = 100 * v1/v2
except ZeroDivisionError:
result = 'n/a'
else:
result = '{0:0.1f} %'.format(perc)
if result == '0.0 %' and v1 > 0:
result = '< 0.1 %'
return result
def format_raw_line(name, value):
"""For the --raw option for the tunable and SPL outputs, decide on the
correct formatting based on the --alternate flag.
"""
if ARGS.alt:
result = '{0}{1}={2}'.format(INDENT, name, value)
else:
spc = LINE_LENGTH-(len(INDENT)+len(value))
result = '{0}{1:<{spc}}{2}'.format(INDENT, name, value, spc=spc)
return result
def get_kstats():
"""Collect information on the ZFS subsystem from the /proc Linux virtual
file system. The step does not perform any further processing, giving us
the option to only work on what is actually needed. The name "kstat" is a
holdover from the Solaris utility of the same name.
"""
result = {}
secs = SECTION_PATHS.values()
for section in secs:
with open(PROC_PATH+section, 'r') as proc_location:
lines = [line for line in proc_location]
del lines[0:2] # Get rid of header
result[section] = lines
return result
def get_spl_tunables(PATH):
"""Collect information on the Solaris Porting Layer (SPL) or the
tunables, depending on the PATH given. Does not check if PATH is
legal.
"""
result = {}
parameters = os.listdir(PATH)
for name in parameters:
with open(PATH+name, 'r') as para_file:
value = para_file.read()
result[name] = value.strip()
return result
def get_descriptions(request):
"""Get the descriptions of the Solaris Porting Layer (SPL) or the
tunables, return with minimal formatting.
"""
if request not in ('spl', 'zfs'):
print('ERROR: description of "{0}" requested)'.format(request))
sys.exit(1)
descs = {}
target_prefix = 'parm:'
# We would prefer to do this with /sys/modules -- see the discussion at
# get_version() -- but there isn't a way to get the descriptions from
# there, so we fall back on modinfo
command = ["/sbin/modinfo", request, "-0"]
# The recommended way to do this is with subprocess.run(). However,
# some installed versions of Python are < 3.5, so we offer them
# the option of doing it the old way (for now)
info = ''
try:
if 'run' in dir(subprocess):
info = subprocess.run(command, stdout=subprocess.PIPE,
universal_newlines=True)
raw_output = info.stdout.split('\0')
else:
info = subprocess.check_output(command, universal_newlines=True)
raw_output = info.split('\0')
except subprocess.CalledProcessError:
print("Error: Descriptions not available (can't access kernel module)")
sys.exit(1)
for line in raw_output:
if not line.startswith(target_prefix):
continue
line = line[len(target_prefix):].strip()
name, raw_desc = line.split(':', 1)
desc = raw_desc.rsplit('(', 1)[0]
if desc == '':
desc = '(No description found)'
descs[name.strip()] = desc.strip()
return descs
def get_version(request):
"""Get the version number of ZFS or SPL on this machine for header.
Returns an error string, but does not raise an error, if we can't
get the ZFS/SPL version via modinfo.
"""
if request not in ('spl', 'zfs'):
error_msg = '(ERROR: "{0}" requested)'.format(request)
return error_msg
# The original arc_summary called /sbin/modinfo/{spl,zfs} to get
# the version information. We switch to /sys/module/{spl,zfs}/version
# to make sure we get what is really loaded in the kernel
command = ["cat", "/sys/module/{0}/version".format(request)]
req = request.upper()
version = "(Can't get {0} version)".format(req)
# The recommended way to do this is with subprocess.run(). However,
# some installed versions of Python are < 3.5, so we offer them
# the option of doing it the old way (for now)
info = ''
if 'run' in dir(subprocess):
info = subprocess.run(command, stdout=subprocess.PIPE,
universal_newlines=True)
version = info.stdout.strip()
else:
info = subprocess.check_output(command, universal_newlines=True)
version = info.strip()
return version
def print_header():
"""Print the initial heading with date and time as well as info on the
Linux and ZFS versions. This is not called for the graph.
"""
# datetime is now recommended over time but we keep the exact formatting
# from the older version of arc_summary in case there are scripts
# that expect it in this way
daydate = time.strftime(DATE_FORMAT)
spc_date = LINE_LENGTH-len(daydate)
sys_version = os.uname()
sys_msg = sys_version.sysname+' '+sys_version.release
zfs = get_version('zfs')
spc_zfs = LINE_LENGTH-len(zfs)
machine_msg = 'Machine: '+sys_version.nodename+' ('+sys_version.machine+')'
spl = get_version('spl')
spc_spl = LINE_LENGTH-len(spl)
print('\n'+('-'*LINE_LENGTH))
print('{0:<{spc}}{1}'.format(TITLE, daydate, spc=spc_date))
print('{0:<{spc}}{1}'.format(sys_msg, zfs, spc=spc_zfs))
print('{0:<{spc}}{1}\n'.format(machine_msg, spl, spc=spc_spl))
def print_raw(kstats_dict):
"""Print all available data from the system in a minimally sorted format.
This can be used as a source to be piped through 'grep'.
"""
sections = sorted(kstats_dict.keys())
for section in sections:
print('\n{0}:'.format(section.upper()))
lines = sorted(kstats_dict[section])
for line in lines:
name, value = cleanup_line(line)
print(format_raw_line(name, value))
# Tunables and SPL must be handled separately because they come from a
# different source and have descriptions the user might request
print()
section_spl()
section_tunables()
def isolate_section(section_name, kstats_dict):
"""From the complete information on all sections, retrieve only those
for one section.
"""
try:
section_data = kstats_dict[section_name]
except KeyError:
print('ERROR: Data on {0} not available'.format(section_data))
sys.exit(1)
section_dict = dict(cleanup_line(l) for l in section_data)
return section_dict
# Formatted output helper functions
def prt_1(text, value):
"""Print text and one value, no indent"""
spc = ' '*(LINE_LENGTH-(len(text)+len(value)))
print('{0}{spc}{1}'.format(text, value, spc=spc))
def prt_i1(text, value):
"""Print text and one value, with indent"""
spc = ' '*(LINE_LENGTH-(len(INDENT)+len(text)+len(value)))
print(INDENT+'{0}{spc}{1}'.format(text, value, spc=spc))
def prt_2(text, value1, value2):
"""Print text and two values, no indent"""
values = '{0:>9} {1:>9}'.format(value1, value2)
spc = ' '*(LINE_LENGTH-(len(text)+len(values)+2))
print('{0}{spc} {1}'.format(text, values, spc=spc))
def prt_i2(text, value1, value2):
"""Print text and two values, with indent"""
values = '{0:>9} {1:>9}'.format(value1, value2)
spc = ' '*(LINE_LENGTH-(len(INDENT)+len(text)+len(values)+2))
print(INDENT+'{0}{spc} {1}'.format(text, values, spc=spc))
# The section output concentrates on important parameters instead of
# being exhaustive (that is what the --raw parameter is for)
def section_arc(kstats_dict):
"""Give basic information on the ARC, MRU and MFU. This is the first
and most used section.
"""
arc_stats = isolate_section('arcstats', kstats_dict)
throttle = arc_stats['memory_throttle_count']
if throttle == '0':
health = 'HEALTHY'
else:
health = 'THROTTLED'
prt_1('ARC status:', health)
prt_i1('Memory throttle count:', throttle)
print()
arc_size = arc_stats['size']
arc_target_size = arc_stats['c']
arc_max = arc_stats['c_max']
arc_min = arc_stats['c_min']
mfu_size = arc_stats['mfu_size']
mru_size = arc_stats['mru_size']
meta_limit = arc_stats['arc_meta_limit']
meta_size = arc_stats['arc_meta_used']
dnode_limit = arc_stats['arc_dnode_limit']
dnode_size = arc_stats['dnode_size']
target_size_ratio = '{0}:1'.format(int(arc_max) // int(arc_min))
prt_2('ARC size (current):',
f_perc(arc_size, arc_max), f_bytes(arc_size))
prt_i2('Target size (adaptive):',
f_perc(arc_target_size, arc_max), f_bytes(arc_target_size))
prt_i2('Min size (hard limit):',
f_perc(arc_min, arc_max), f_bytes(arc_min))
prt_i2('Max size (high water):',
target_size_ratio, f_bytes(arc_max))
caches_size = int(mfu_size)+int(mru_size)
prt_i2('Most Frequently Used (MFU) cache size:',
f_perc(mfu_size, caches_size), f_bytes(mfu_size))
prt_i2('Most Recently Used (MRU) cache size:',
f_perc(mru_size, caches_size), f_bytes(mru_size))
prt_i2('Metadata cache size (hard limit):',
f_perc(meta_limit, arc_max), f_bytes(meta_limit))
prt_i2('Metadata cache size (current):',
f_perc(meta_size, meta_limit), f_bytes(meta_size))
prt_i2('Dnode cache size (hard limit):',
f_perc(dnode_limit, meta_limit), f_bytes(dnode_limit))
prt_i2('Dnode cache size (current):',
f_perc(dnode_size, dnode_limit), f_bytes(dnode_size))
print()
print('ARC hash breakdown:')
prt_i1('Elements max:', f_hits(arc_stats['hash_elements_max']))
prt_i2('Elements current:',
f_perc(arc_stats['hash_elements'], arc_stats['hash_elements_max']),
f_hits(arc_stats['hash_elements']))
prt_i1('Collisions:', f_hits(arc_stats['hash_collisions']))
prt_i1('Chain max:', f_hits(arc_stats['hash_chain_max']))
prt_i1('Chains:', f_hits(arc_stats['hash_chains']))
print()
print('ARC misc:')
prt_i1('Deleted:', f_hits(arc_stats['deleted']))
prt_i1('Mutex misses:', f_hits(arc_stats['mutex_miss']))
prt_i1('Eviction skips:', f_hits(arc_stats['evict_skip']))
print()
def section_archits(kstats_dict):
"""Print information on how the caches are accessed ("arc hits").
"""
arc_stats = isolate_section('arcstats', kstats_dict)
all_accesses = int(arc_stats['hits'])+int(arc_stats['misses'])
actual_hits = int(arc_stats['mfu_hits'])+int(arc_stats['mru_hits'])
prt_1('ARC total accesses (hits + misses):', f_hits(all_accesses))
ta_todo = (('Cache hit ratio:', arc_stats['hits']),
('Cache miss ratio:', arc_stats['misses']),
('Actual hit ratio (MFU + MRU hits):', actual_hits))
for title, value in ta_todo:
prt_i2(title, f_perc(value, all_accesses), f_hits(value))
dd_total = int(arc_stats['demand_data_hits']) +\
int(arc_stats['demand_data_misses'])
prt_i2('Data demand efficiency:',
f_perc(arc_stats['demand_data_hits'], dd_total),
f_hits(dd_total))
dp_total = int(arc_stats['prefetch_data_hits']) +\
int(arc_stats['prefetch_data_misses'])
prt_i2('Data prefetch efficiency:',
f_perc(arc_stats['prefetch_data_hits'], dp_total),
f_hits(dp_total))
known_hits = int(arc_stats['mfu_hits']) +\
int(arc_stats['mru_hits']) +\
int(arc_stats['mfu_ghost_hits']) +\
int(arc_stats['mru_ghost_hits'])
anon_hits = int(arc_stats['hits'])-known_hits
print()
print('Cache hits by cache type:')
cl_todo = (('Most frequently used (MFU):', arc_stats['mfu_hits']),
('Most recently used (MRU):', arc_stats['mru_hits']),
('Most frequently used (MFU) ghost:',
arc_stats['mfu_ghost_hits']),
('Most recently used (MRU) ghost:',
arc_stats['mru_ghost_hits']))
for title, value in cl_todo:
prt_i2(title, f_perc(value, arc_stats['hits']), f_hits(value))
# For some reason, anon_hits can turn negative, which is weird. Until we
# have figured out why this happens, we just hide the problem, following
# the behavior of the original arc_summary.
if anon_hits >= 0:
prt_i2('Anonymously used:',
f_perc(anon_hits, arc_stats['hits']), f_hits(anon_hits))
print()
print('Cache hits by data type:')
dt_todo = (('Demand data:', arc_stats['demand_data_hits']),
('Demand prefetch data:', arc_stats['prefetch_data_hits']),
('Demand metadata:', arc_stats['demand_metadata_hits']),
('Demand prefetch metadata:',
arc_stats['prefetch_metadata_hits']))
for title, value in dt_todo:
prt_i2(title, f_perc(value, arc_stats['hits']), f_hits(value))
print()
print('Cache misses by data type:')
dm_todo = (('Demand data:', arc_stats['demand_data_misses']),
('Demand prefetch data:',
arc_stats['prefetch_data_misses']),
('Demand metadata:', arc_stats['demand_metadata_misses']),
('Demand prefetch metadata:',
arc_stats['prefetch_metadata_misses']))
for title, value in dm_todo:
prt_i2(title, f_perc(value, arc_stats['misses']), f_hits(value))
print()
def section_dmu(kstats_dict):
"""Collect information on the DMU"""
zfetch_stats = isolate_section('zfetchstats', kstats_dict)
zfetch_access_total = int(zfetch_stats['hits'])+int(zfetch_stats['misses'])
prt_1('DMU prefetch efficiency:', f_hits(zfetch_access_total))
prt_i2('Hit ratio:', f_perc(zfetch_stats['hits'], zfetch_access_total),
f_hits(zfetch_stats['hits']))
prt_i2('Miss ratio:', f_perc(zfetch_stats['misses'], zfetch_access_total),
f_hits(zfetch_stats['misses']))
print()
def section_l2arc(kstats_dict):
"""Collect information on L2ARC device if present. If not, tell user
that we're skipping the section.
"""
# The L2ARC statistics live in the same section as the normal ARC stuff
arc_stats = isolate_section('arcstats', kstats_dict)
if arc_stats['l2_size'] == '0':
print('L2ARC not detected, skipping section\n')
return
l2_errors = int(arc_stats['l2_writes_error']) +\
int(arc_stats['l2_cksum_bad']) +\
int(arc_stats['l2_io_error'])
l2_access_total = int(arc_stats['l2_hits'])+int(arc_stats['l2_misses'])
health = 'HEALTHY'
if l2_errors > 0:
health = 'DEGRADED'
prt_1('L2ARC status:', health)
l2_todo = (('Low memory aborts:', 'l2_abort_lowmem'),
('Free on write:', 'l2_free_on_write'),
('R/W clashes:', 'l2_rw_clash'),
('Bad checksums:', 'l2_cksum_bad'),
('I/O errors:', 'l2_io_error'))
for title, value in l2_todo:
prt_i1(title, f_hits(arc_stats[value]))
print()
prt_1('L2ARC size (adaptive):', f_bytes(arc_stats['l2_size']))
prt_i2('Compressed:', f_perc(arc_stats['l2_asize'], arc_stats['l2_size']),
f_bytes(arc_stats['l2_asize']))
prt_i2('Header size:',
f_perc(arc_stats['l2_hdr_size'], arc_stats['l2_size']),
f_bytes(arc_stats['l2_hdr_size']))
print()
prt_1('L2ARC breakdown:', f_hits(l2_access_total))
prt_i2('Hit ratio:',
f_perc(arc_stats['l2_hits'], l2_access_total),
f_hits(arc_stats['l2_hits']))
prt_i2('Miss ratio:',
f_perc(arc_stats['l2_misses'], l2_access_total),
f_hits(arc_stats['l2_misses']))
prt_i1('Feeds:', f_hits(arc_stats['l2_feeds']))
print()
print('L2ARC writes:')
if arc_stats['l2_writes_done'] != arc_stats['l2_writes_sent']:
prt_i2('Writes sent:', 'FAULTED', f_hits(arc_stats['l2_writes_sent']))
prt_i2('Done ratio:',
f_perc(arc_stats['l2_writes_done'],
arc_stats['l2_writes_sent']),
f_bytes(arc_stats['l2_writes_done']))
prt_i2('Error ratio:',
f_perc(arc_stats['l2_writes_error'],
arc_stats['l2_writes_sent']),
f_bytes(arc_stats['l2_writes_error']))
else:
prt_i2('Writes sent:', '100 %', f_bytes(arc_stats['l2_writes_sent']))
print()
print('L2ARC evicts:')
prt_i1('Lock retries:', f_hits(arc_stats['l2_evict_lock_retry']))
prt_i1('Upon reading:', f_hits(arc_stats['l2_evict_reading']))
print()
def section_spl(*_):
"""Print the SPL parameters, if requested with alternative format
and/or descriptions. This does not use kstats.
"""
spls = get_spl_tunables(SPL_PATH)
keylist = sorted(spls.keys())
print('Solaris Porting Layer (SPL):')
if ARGS.desc:
descriptions = get_descriptions('spl')
for key in keylist:
value = spls[key]
if ARGS.desc:
try:
print(INDENT+'#', descriptions[key])
except KeyError:
print(INDENT+'# (No description found)') # paranoid
print(format_raw_line(key, value))
print()
def section_tunables(*_):
"""Print the tunables, if requested with alternative format and/or
descriptions. This does not use kstasts.
"""
tunables = get_spl_tunables(TUNABLES_PATH)
keylist = sorted(tunables.keys())
print('Tunables:')
if ARGS.desc:
descriptions = get_descriptions('zfs')
for key in keylist:
value = tunables[key]
if ARGS.desc:
try:
print(INDENT+'#', descriptions[key])
except KeyError:
print(INDENT+'# (No description found)') # paranoid
print(format_raw_line(key, value))
print()
def section_vdev(kstats_dict):
"""Collect information on VDEV caches"""
# Currently [Nov 2017] the VDEV cache is disabled, because it is actually
# harmful. When this is the case, we just skip the whole entry. See
# https://github.com/zfsonlinux/zfs/blob/master/module/zfs/vdev_cache.c
# for details
tunables = get_spl_tunables(TUNABLES_PATH)
if tunables['zfs_vdev_cache_size'] == '0':
print('VDEV cache disabled, skipping section\n')
return
vdev_stats = isolate_section('vdev_cache_stats', kstats_dict)
vdev_cache_total = int(vdev_stats['hits']) +\
int(vdev_stats['misses']) +\
int(vdev_stats['delegations'])
prt_1('VDEV cache summary:', f_hits(vdev_cache_total))
prt_i2('Hit ratio:', f_perc(vdev_stats['hits'], vdev_cache_total),
f_hits(vdev_stats['hits']))
prt_i2('Miss ratio:', f_perc(vdev_stats['misses'], vdev_cache_total),
f_hits(vdev_stats['misses']))
prt_i2('Delegations:', f_perc(vdev_stats['delegations'], vdev_cache_total),
f_hits(vdev_stats['delegations']))
print()
def section_zil(kstats_dict):
"""Collect information on the ZFS Intent Log. Some of the information
taken from https://github.com/zfsonlinux/zfs/blob/master/include/sys/zil.h
"""
zil_stats = isolate_section('zil', kstats_dict)
prt_1('ZIL committed transactions:',
f_hits(zil_stats['zil_itx_count']))
prt_i1('Commit requests:', f_hits(zil_stats['zil_commit_count']))
prt_i1('Flushes to stable storage:',
f_hits(zil_stats['zil_commit_writer_count']))
prt_i2('Transactions to SLOG storage pool:',
f_bytes(zil_stats['zil_itx_metaslab_slog_bytes']),
f_hits(zil_stats['zil_itx_metaslab_slog_count']))
prt_i2('Transactions to non-SLOG storage pool:',
f_bytes(zil_stats['zil_itx_metaslab_normal_bytes']),
f_hits(zil_stats['zil_itx_metaslab_normal_count']))
print()
section_calls = {'arc': section_arc,
'archits': section_archits,
'dmu': section_dmu,
'l2arc': section_l2arc,
'spl': section_spl,
'tunables': section_tunables,
'vdev': section_vdev,
'zil': section_zil}
def main():
"""Run program. The options to draw a graph and to print all data raw are
treated separately because they come with their own call.
"""
kstats = get_kstats()
if ARGS.graph:
draw_graph(kstats)
sys.exit(0)
print_header()
if ARGS.raw:
print_raw(kstats)
elif ARGS.section:
try:
section_calls[ARGS.section](kstats)
except KeyError:
print('Error: Section "{0}" unknown'.format(ARGS.section))
sys.exit(1)
elif ARGS.page:
print('WARNING: Pages are deprecated, please use "--section"\n')
pages_to_calls = {1: 'arc',
2: 'archits',
3: 'l2arc',
4: 'dmu',
5: 'vdev',
6: 'tunables'}
try:
call = pages_to_calls[ARGS.page]
except KeyError:
print('Error: Page "{0}" not supported'.format(ARGS.page))
sys.exit(1)
else:
section_calls[call](kstats)
else:
# If no parameters were given, we print all sections. We might want to
# change the sequence by hand
calls = sorted(section_calls.keys())
for section in calls:
section_calls[section](kstats)
sys.exit(0)
if __name__ == '__main__':
main()
+13
View File
@@ -0,0 +1,13 @@
dist_bin_SCRIPTS = arcstat
#
# The arcstat script is compatible with both Python 2.6 and 3.4.
# As such the python 3 shebang can be replaced at install time when
# targeting a python 2 system. This allows us to maintain a single
# version of the source.
#
if USING_PYTHON_2
install-exec-hook:
sed --in-place 's|^#!/usr/bin/env python3|#!/usr/bin/env python2|' \
$(DESTDIR)$(bindir)/arcstat
endif
+470
View File
@@ -0,0 +1,470 @@
#!/usr/bin/env python3
#
# Print out ZFS ARC Statistics exported via kstat(1)
# For a definition of fields, or usage, use arctstat.pl -v
#
# This script is a fork of the original arcstat.pl (0.1) by
# Neelakanth Nadgir, originally published on his Sun blog on
# 09/18/2007
# http://blogs.sun.com/realneel/entry/zfs_arc_statistics
#
# This version aims to improve upon the original by adding features
# and fixing bugs as needed. This version is maintained by
# Mike Harsch and is hosted in a public open source repository:
# http://github.com/mharsch/arcstat
#
# Comments, Questions, or Suggestions are always welcome.
# Contact the maintainer at ( mike at harschsystems dot com )
#
# CDDL HEADER START
#
# The contents of this file are subject to the terms of the
# Common Development and Distribution License, Version 1.0 only
# (the "License"). You may not use this file except in compliance
# with the License.
#
# You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
# or http://www.opensolaris.org/os/licensing.
# See the License for the specific language governing permissions
# and limitations under the License.
#
# When distributing Covered Code, include this CDDL HEADER in each
# file and include the License file at usr/src/OPENSOLARIS.LICENSE.
# If applicable, add the following below this CDDL HEADER, with the
# fields enclosed by brackets "[]" replaced with your own identifying
# information: Portions Copyright [yyyy] [name of copyright owner]
#
# CDDL HEADER END
#
#
# Fields have a fixed width. Every interval, we fill the "v"
# hash with its corresponding value (v[field]=value) using calculate().
# @hdr is the array of fields that needs to be printed, so we
# just iterate over this array and print the values using our pretty printer.
#
# This script must remain compatible with Python 2.6+ and Python 3.4+.
#
import sys
import time
import getopt
import re
import copy
from decimal import Decimal
from signal import signal, SIGINT, SIGWINCH, SIG_DFL
cols = {
# HDR: [Size, Scale, Description]
"time": [8, -1, "Time"],
"hits": [4, 1000, "ARC reads per second"],
"miss": [4, 1000, "ARC misses per second"],
"read": [4, 1000, "Total ARC accesses per second"],
"hit%": [4, 100, "ARC Hit percentage"],
"miss%": [5, 100, "ARC miss percentage"],
"dhit": [4, 1000, "Demand hits per second"],
"dmis": [4, 1000, "Demand misses per second"],
"dh%": [3, 100, "Demand hit percentage"],
"dm%": [3, 100, "Demand miss percentage"],
"phit": [4, 1000, "Prefetch hits per second"],
"pmis": [4, 1000, "Prefetch misses per second"],
"ph%": [3, 100, "Prefetch hits percentage"],
"pm%": [3, 100, "Prefetch miss percentage"],
"mhit": [4, 1000, "Metadata hits per second"],
"mmis": [4, 1000, "Metadata misses per second"],
"mread": [5, 1000, "Metadata accesses per second"],
"mh%": [3, 100, "Metadata hit percentage"],
"mm%": [3, 100, "Metadata miss percentage"],
"arcsz": [5, 1024, "ARC Size"],
"c": [4, 1024, "ARC Target Size"],
"mfu": [4, 1000, "MFU List hits per second"],
"mru": [4, 1000, "MRU List hits per second"],
"mfug": [4, 1000, "MFU Ghost List hits per second"],
"mrug": [4, 1000, "MRU Ghost List hits per second"],
"eskip": [5, 1000, "evict_skip per second"],
"mtxmis": [6, 1000, "mutex_miss per second"],
"dread": [5, 1000, "Demand accesses per second"],
"pread": [5, 1000, "Prefetch accesses per second"],
"l2hits": [6, 1000, "L2ARC hits per second"],
"l2miss": [6, 1000, "L2ARC misses per second"],
"l2read": [6, 1000, "Total L2ARC accesses per second"],
"l2hit%": [6, 100, "L2ARC access hit percentage"],
"l2miss%": [7, 100, "L2ARC access miss percentage"],
"l2asize": [7, 1024, "Actual (compressed) size of the L2ARC"],
"l2size": [6, 1024, "Size of the L2ARC"],
"l2bytes": [7, 1024, "bytes read per second from the L2ARC"],
"grow": [4, 1000, "ARC Grow disabled"],
"need": [4, 1024, "ARC Reclaim need"],
"free": [4, 1024, "ARC Free memory"],
}
v = {}
hdr = ["time", "read", "miss", "miss%", "dmis", "dm%", "pmis", "pm%", "mmis",
"mm%", "arcsz", "c"]
xhdr = ["time", "mfu", "mru", "mfug", "mrug", "eskip", "mtxmis", "dread",
"pread", "read"]
sint = 1 # Default interval is 1 second
count = 1 # Default count is 1
hdr_intr = 20 # Print header every 20 lines of output
opfile = None
sep = " " # Default separator is 2 spaces
version = "0.4"
l2exist = False
cmd = ("Usage: arcstat [-hvx] [-f fields] [-o file] [-s string] [interval "
"[count]]\n")
cur = {}
d = {}
out = None
kstat = None
def detailed_usage():
sys.stderr.write("%s\n" % cmd)
sys.stderr.write("Field definitions are as follows:\n")
for key in cols:
sys.stderr.write("%11s : %s\n" % (key, cols[key][2]))
sys.stderr.write("\n")
sys.exit(0)
def usage():
sys.stderr.write("%s\n" % cmd)
sys.stderr.write("\t -h : Print this help message\n")
sys.stderr.write("\t -v : List all possible field headers and definitions"
"\n")
sys.stderr.write("\t -x : Print extended stats\n")
sys.stderr.write("\t -f : Specify specific fields to print (see -v)\n")
sys.stderr.write("\t -o : Redirect output to the specified file\n")
sys.stderr.write("\t -s : Override default field separator with custom "
"character or string\n")
sys.stderr.write("\nExamples:\n")
sys.stderr.write("\tarcstat -o /tmp/a.log 2 10\n")
sys.stderr.write("\tarcstat -s \",\" -o /tmp/a.log 2 10\n")
sys.stderr.write("\tarcstat -v\n")
sys.stderr.write("\tarcstat -f time,hit%,dh%,ph%,mh% 1\n")
sys.stderr.write("\n")
sys.exit(1)
def kstat_update():
global kstat
k = [line.strip() for line in open('/proc/spl/kstat/zfs/arcstats')]
if not k:
sys.exit(1)
del k[0:2]
kstat = {}
for s in k:
if not s:
continue
name, unused, value = s.split()
kstat[name] = Decimal(value)
def snap_stats():
global cur
global kstat
prev = copy.deepcopy(cur)
kstat_update()
cur = kstat
for key in cur:
if re.match(key, "class"):
continue
if key in prev:
d[key] = cur[key] - prev[key]
else:
d[key] = cur[key]
def prettynum(sz, scale, num=0):
suffix = [' ', 'K', 'M', 'G', 'T', 'P', 'E', 'Z']
index = 0
save = 0
# Special case for date field
if scale == -1:
return "%s" % num
# Rounding error, return 0
elif 0 < num < 1:
num = 0
while num > scale and index < 5:
save = num
num = num / scale
index += 1
if index == 0:
return "%*d" % (sz, num)
if (save / scale) < 10:
return "%*.1f%s" % (sz - 1, num, suffix[index])
else:
return "%*d%s" % (sz - 1, num, suffix[index])
def print_values():
global hdr
global sep
global v
for col in hdr:
sys.stdout.write("%s%s" % (
prettynum(cols[col][0], cols[col][1], v[col]),
sep
))
sys.stdout.write("\n")
sys.stdout.flush()
def print_header():
global hdr
global sep
for col in hdr:
sys.stdout.write("%*s%s" % (cols[col][0], col, sep))
sys.stdout.write("\n")
def get_terminal_lines():
try:
import fcntl
import termios
import struct
data = fcntl.ioctl(sys.stdout.fileno(), termios.TIOCGWINSZ, '1234')
sz = struct.unpack('hh', data)
return sz[0]
except Exception:
pass
def update_hdr_intr():
global hdr_intr
lines = get_terminal_lines()
if lines and lines > 3:
hdr_intr = lines - 3
def resize_handler(signum, frame):
update_hdr_intr()
def init():
global sint
global count
global hdr
global xhdr
global opfile
global sep
global out
global l2exist
desired_cols = None
xflag = False
hflag = False
vflag = False
i = 1
try:
opts, args = getopt.getopt(
sys.argv[1:],
"xo:hvs:f:",
[
"extended",
"outfile",
"help",
"verbose",
"separator",
"columns"
]
)
except getopt.error as msg:
sys.stderr.write("Error: %s\n" % str(msg))
usage()
opts = None
for opt, arg in opts:
if opt in ('-x', '--extended'):
xflag = True
if opt in ('-o', '--outfile'):
opfile = arg
i += 1
if opt in ('-h', '--help'):
hflag = True
if opt in ('-v', '--verbose'):
vflag = True
if opt in ('-s', '--separator'):
sep = arg
i += 1
if opt in ('-f', '--columns'):
desired_cols = arg
i += 1
i += 1
argv = sys.argv[i:]
sint = Decimal(argv[0]) if argv else sint
count = int(argv[1]) if len(argv) > 1 else count
if len(argv) > 1:
sint = Decimal(argv[0])
count = int(argv[1])
elif len(argv) > 0:
sint = Decimal(argv[0])
count = 0
if hflag or (xflag and desired_cols):
usage()
if vflag:
detailed_usage()
if xflag:
hdr = xhdr
update_hdr_intr()
# check if L2ARC exists
snap_stats()
l2_size = cur.get("l2_size")
if l2_size:
l2exist = True
if desired_cols:
hdr = desired_cols.split(",")
invalid = []
incompat = []
for ele in hdr:
if ele not in cols:
invalid.append(ele)
elif not l2exist and ele.startswith("l2"):
sys.stdout.write("No L2ARC Here\n%s\n" % ele)
incompat.append(ele)
if len(invalid) > 0:
sys.stderr.write("Invalid column definition! -- %s\n" % invalid)
usage()
if len(incompat) > 0:
sys.stderr.write("Incompatible field specified! -- %s\n" %
incompat)
usage()
if opfile:
try:
out = open(opfile, "w")
sys.stdout = out
except IOError:
sys.stderr.write("Cannot open %s for writing\n" % opfile)
sys.exit(1)
def calculate():
global d
global v
global l2exist
v = dict()
v["time"] = time.strftime("%H:%M:%S", time.localtime())
v["hits"] = d["hits"] / sint
v["miss"] = d["misses"] / sint
v["read"] = v["hits"] + v["miss"]
v["hit%"] = 100 * v["hits"] / v["read"] if v["read"] > 0 else 0
v["miss%"] = 100 - v["hit%"] if v["read"] > 0 else 0
v["dhit"] = (d["demand_data_hits"] + d["demand_metadata_hits"]) / sint
v["dmis"] = (d["demand_data_misses"] + d["demand_metadata_misses"]) / sint
v["dread"] = v["dhit"] + v["dmis"]
v["dh%"] = 100 * v["dhit"] / v["dread"] if v["dread"] > 0 else 0
v["dm%"] = 100 - v["dh%"] if v["dread"] > 0 else 0
v["phit"] = (d["prefetch_data_hits"] + d["prefetch_metadata_hits"]) / sint
v["pmis"] = (d["prefetch_data_misses"] +
d["prefetch_metadata_misses"]) / sint
v["pread"] = v["phit"] + v["pmis"]
v["ph%"] = 100 * v["phit"] / v["pread"] if v["pread"] > 0 else 0
v["pm%"] = 100 - v["ph%"] if v["pread"] > 0 else 0
v["mhit"] = (d["prefetch_metadata_hits"] +
d["demand_metadata_hits"]) / sint
v["mmis"] = (d["prefetch_metadata_misses"] +
d["demand_metadata_misses"]) / sint
v["mread"] = v["mhit"] + v["mmis"]
v["mh%"] = 100 * v["mhit"] / v["mread"] if v["mread"] > 0 else 0
v["mm%"] = 100 - v["mh%"] if v["mread"] > 0 else 0
v["arcsz"] = cur["size"]
v["c"] = cur["c"]
v["mfu"] = d["mfu_hits"] / sint
v["mru"] = d["mru_hits"] / sint
v["mrug"] = d["mru_ghost_hits"] / sint
v["mfug"] = d["mfu_ghost_hits"] / sint
v["eskip"] = d["evict_skip"] / sint
v["mtxmis"] = d["mutex_miss"] / sint
if l2exist:
v["l2hits"] = d["l2_hits"] / sint
v["l2miss"] = d["l2_misses"] / sint
v["l2read"] = v["l2hits"] + v["l2miss"]
v["l2hit%"] = 100 * v["l2hits"] / v["l2read"] if v["l2read"] > 0 else 0
v["l2miss%"] = 100 - v["l2hit%"] if v["l2read"] > 0 else 0
v["l2asize"] = cur["l2_asize"]
v["l2size"] = cur["l2_size"]
v["l2bytes"] = d["l2_read_bytes"] / sint
v["grow"] = 0 if cur["arc_no_grow"] else 1
v["need"] = cur["arc_need_free"]
v["free"] = cur["arc_sys_free"]
def main():
global sint
global count
global hdr_intr
i = 0
count_flag = 0
init()
if count > 0:
count_flag = 1
signal(SIGINT, SIG_DFL)
signal(SIGWINCH, resize_handler)
while True:
if i == 0:
print_header()
snap_stats()
calculate()
print_values()
if count_flag == 1:
if count <= 1:
break
count -= 1
i = 0 if i >= hdr_intr else i + 1
time.sleep(sint)
if out:
out.close()
if __name__ == '__main__':
main()
+13
View File
@@ -0,0 +1,13 @@
dist_bin_SCRIPTS = dbufstat
#
# The dbufstat script is compatible with both Python 2.6 and 3.4.
# As such the python 3 shebang can be replaced at install time when
# targeting a python 2 system. This allows us to maintain a single
# version of the source.
#
if USING_PYTHON_2
install-exec-hook:
sed --in-place 's|^#!/usr/bin/env python3|#!/usr/bin/env python2|' \
$(DESTDIR)$(bindir)/dbufstat
endif
+16 -34
View File
@@ -1,5 +1,4 @@
#!/usr/bin/env @PYTHON_SHEBANG@
# SPDX-License-Identifier: CDDL-1.0
#!/usr/bin/env python3
#
# Print out statistics for all cached dmu buffers. This information
# is available through the dbufs kstat and may be post-processed as
@@ -13,7 +12,7 @@
# with the License.
#
# You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
# or https://opensource.org/licenses/CDDL-1.0.
# or http://www.opensolaris.org/os/licensing.
# See the License for the specific language governing permissions
# and limitations under the License.
#
@@ -28,7 +27,7 @@
# Copyright (C) 2013 Lawrence Livermore National Security, LLC.
# Produced at Lawrence Livermore National Laboratory (cf, DISCLAIMER).
#
# This script must remain compatible with and Python 3.6+.
# This script must remain compatible with Python 2.6+ and Python 3.4+.
#
import sys
@@ -38,7 +37,7 @@ import re
bhdr = ["pool", "objset", "object", "level", "blkid", "offset", "dbsize"]
bxhdr = ["pool", "objset", "object", "level", "blkid", "offset", "dbsize",
"usize", "meta", "state", "dbholds", "dbc", "list", "atype", "flags",
"meta", "state", "dbholds", "dbc", "list", "atype", "flags",
"count", "asize", "access", "mru", "gmru", "mfu", "gmfu", "l2",
"l2_dattr", "l2_asize", "l2_comp", "aholds", "dtype", "btype",
"data_bs", "meta_bs", "bsize", "lvls", "dholds", "blocks", "dsize"]
@@ -48,17 +47,17 @@ dhdr = ["pool", "objset", "object", "dtype", "cached"]
dxhdr = ["pool", "objset", "object", "dtype", "btype", "data_bs", "meta_bs",
"bsize", "lvls", "dholds", "blocks", "dsize", "cached", "direct",
"indirect", "bonus", "spill"]
dincompat = ["level", "blkid", "offset", "dbsize", "usize", "meta", "state",
"dbholds", "dbc", "list", "atype", "flags", "count", "asize",
"access", "mru", "gmru", "mfu", "gmfu", "l2", "l2_dattr",
"l2_asize", "l2_comp", "aholds"]
dincompat = ["level", "blkid", "offset", "dbsize", "meta", "state", "dbholds",
"dbc", "list", "atype", "flags", "count", "asize", "access",
"mru", "gmru", "mfu", "gmfu", "l2", "l2_dattr", "l2_asize",
"l2_comp", "aholds"]
thdr = ["pool", "objset", "dtype", "cached"]
txhdr = ["pool", "objset", "dtype", "cached", "direct", "indirect",
"bonus", "spill"]
tincompat = ["object", "level", "blkid", "offset", "dbsize", "usize", "meta",
"state", "dbc", "dbholds", "list", "atype", "flags", "count",
"asize", "access", "mru", "gmru", "mfu", "gmfu", "l2", "l2_dattr",
tincompat = ["object", "level", "blkid", "offset", "dbsize", "meta", "state",
"dbc", "dbholds", "list", "atype", "flags", "count", "asize",
"access", "mru", "gmru", "mfu", "gmfu", "l2", "l2_dattr",
"l2_asize", "l2_comp", "aholds", "btype", "data_bs", "meta_bs",
"bsize", "lvls", "dholds", "blocks", "dsize"]
@@ -71,7 +70,6 @@ cols = {
"blkid": [8, -1, "block number of buffer"],
"offset": [12, 1024, "offset in object of buffer"],
"dbsize": [7, 1024, "size of buffer"],
"usize": [7, 1024, "size of attached user data"],
"meta": [4, -1, "is this buffer metadata?"],
"state": [5, -1, "state of buffer (read, cached, etc)"],
"dbholds": [7, 1000, "number of holds on buffer"],
@@ -115,25 +113,10 @@ cmd = ("Usage: dbufstat [-bdhnrtvx] [-i file] [-f fields] [-o file] "
raw = 0
if sys.platform.startswith("freebsd"):
import io
# Requires py-sysctl on FreeBSD
import sysctl
def default_ifile():
dbufs = sysctl.filter("kstat.zfs.misc.dbufs")[0].value
sys.stdin = io.StringIO(dbufs)
return "-"
elif sys.platform.startswith("linux"):
def default_ifile():
return "/proc/spl/kstat/zfs/dbufs"
def print_incompat_helper(incompat):
cnt = 0
for key in sorted(incompat):
if cnt == 0:
if cnt is 0:
sys.stderr.write("\t")
elif cnt > 8:
sys.stderr.write(",\n\t")
@@ -360,7 +343,7 @@ def get_compstring(c):
"ZIO_COMPRESS_GZIP_6", "ZIO_COMPRESS_GZIP_7",
"ZIO_COMPRESS_GZIP_8", "ZIO_COMPRESS_GZIP_9",
"ZIO_COMPRESS_ZLE", "ZIO_COMPRESS_LZ4",
"ZIO_COMPRESS_ZSTD", "ZIO_COMPRESS_FUNCTION"]
"ZIO_COMPRESS_FUNCTION"]
# If "-rr" option is used, don't convert to string representation
if raw > 1:
@@ -401,7 +384,6 @@ def update_dict(d, k, line, labels):
key = line[labels[k]]
dbsize = int(line[labels['dbsize']])
usize = int(line[labels['usize']])
blkid = int(line[labels['blkid']])
level = int(line[labels['level']])
@@ -419,7 +401,7 @@ def update_dict(d, k, line, labels):
d[pool][objset][key]['indirect'] = 0
d[pool][objset][key]['spill'] = 0
d[pool][objset][key]['cached'] += dbsize + usize
d[pool][objset][key]['cached'] += dbsize
if blkid == -1:
d[pool][objset][key]['bonus'] += dbsize
@@ -663,9 +645,9 @@ def main():
sys.exit(1)
if not ifile:
ifile = default_ifile()
ifile = '/proc/spl/kstat/zfs/dbufs'
if ifile != "-":
if ifile is not "-":
try:
tmp = open(ifile, "r")
sys.stdin = tmp
-44
View File
@@ -1,44 +0,0 @@
#!/bin/sh
#
# fsck.zfs: A fsck helper to accommodate distributions that expect
# to be able to execute a fsck on all filesystem types.
#
# This script simply bubbles up some already-known-about errors,
# see fsck.zfs(8)
#
if [ $# -eq 0 ]; then
echo "Usage: $0 [options] dataset…" >&2
exit 16
fi
ret=0
for dataset; do
case "$dataset" in
-*)
continue
;;
*)
;;
esac
pool="${dataset%%/*}"
case "$(@sbindir@/zpool list -Ho health "$pool")" in
DEGRADED)
ret=$(( ret | 4 ))
;;
FAULTED)
awk '!/^([[:space:]]*#.*)?$/ && $1 == "'"$dataset"'" && $3 == "zfs" {exit 1}' /etc/fstab || \
ret=$(( ret | 8 ))
;;
"")
# Pool not found, error printed by zpool(8)
ret=$(( ret | 8 ))
;;
*)
;;
esac
done
exit "$ret"
+1
View File
@@ -0,0 +1 @@
dist_sbin_SCRIPTS = fsck.zfs
+9
View File
@@ -0,0 +1,9 @@
#!/bin/sh
#
# fsck.zfs: A fsck helper to accommodate distributions that expect
# to be able to execute a fsck on all filesystem types. Currently
# this script does nothing but it could be extended to act as a
# compatibility wrapper for 'zpool scrub'.
#
exit 0
+1
View File
@@ -0,0 +1 @@
mount.zfs
+21
View File
@@ -0,0 +1,21 @@
include $(top_srcdir)/config/Rules.am
DEFAULT_INCLUDES += \
-I$(top_srcdir)/include \
-I$(top_srcdir)/lib/libspl/include
#
# Ignore the prefix for the mount helper. It must be installed in /sbin/
# because this path is hardcoded in the mount(8) for security reasons.
# However, if needed, the configure option --with-mounthelperdir= can be used
# to override the default install location.
#
sbindir=$(mounthelperdir)
sbin_PROGRAMS = mount.zfs
mount_zfs_SOURCES = \
mount_zfs.c
mount_zfs_LDADD = \
$(top_builddir)/lib/libnvpair/libnvpair.la \
$(top_builddir)/lib/libzfs/libzfs.la
+316 -86
View File
@@ -1,4 +1,3 @@
// SPDX-License-Identifier: CDDL-1.0
/*
* CDDL HEADER START
*
@@ -7,7 +6,7 @@
* You may not use this file except in compliance with the License.
*
* You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
* or https://opensource.org/licenses/CDDL-1.0.
* or http://www.opensolaris.org/os/licensing.
* See the License for the specific language governing permissions
* and limitations under the License.
*
@@ -43,46 +42,247 @@
libzfs_handle_t *g_zfs;
typedef struct option_map {
const char *name;
unsigned long mntmask;
unsigned long zfsmask;
} option_map_t;
static const option_map_t option_map[] = {
/* Canonicalized filesystem independent options from mount(8) */
{ MNTOPT_NOAUTO, MS_COMMENT, ZS_COMMENT },
{ MNTOPT_DEFAULTS, MS_COMMENT, ZS_COMMENT },
{ MNTOPT_NODEVICES, MS_NODEV, ZS_COMMENT },
{ MNTOPT_DIRSYNC, MS_DIRSYNC, ZS_COMMENT },
{ MNTOPT_NOEXEC, MS_NOEXEC, ZS_COMMENT },
{ MNTOPT_GROUP, MS_GROUP, ZS_COMMENT },
{ MNTOPT_NETDEV, MS_COMMENT, ZS_COMMENT },
{ MNTOPT_NOFAIL, MS_COMMENT, ZS_COMMENT },
{ MNTOPT_NOSUID, MS_NOSUID, ZS_COMMENT },
{ MNTOPT_OWNER, MS_OWNER, ZS_COMMENT },
{ MNTOPT_REMOUNT, MS_REMOUNT, ZS_COMMENT },
{ MNTOPT_RO, MS_RDONLY, ZS_COMMENT },
{ MNTOPT_RW, MS_COMMENT, ZS_COMMENT },
{ MNTOPT_SYNC, MS_SYNCHRONOUS, ZS_COMMENT },
{ MNTOPT_USER, MS_USERS, ZS_COMMENT },
{ MNTOPT_USERS, MS_USERS, ZS_COMMENT },
/* acl flags passed with util-linux-2.24 mount command */
{ MNTOPT_ACL, MS_POSIXACL, ZS_COMMENT },
{ MNTOPT_NOACL, MS_COMMENT, ZS_COMMENT },
{ MNTOPT_POSIXACL, MS_POSIXACL, ZS_COMMENT },
#ifdef MS_NOATIME
{ MNTOPT_NOATIME, MS_NOATIME, ZS_COMMENT },
#endif
#ifdef MS_NODIRATIME
{ MNTOPT_NODIRATIME, MS_NODIRATIME, ZS_COMMENT },
#endif
#ifdef MS_RELATIME
{ MNTOPT_RELATIME, MS_RELATIME, ZS_COMMENT },
#endif
#ifdef MS_STRICTATIME
{ MNTOPT_STRICTATIME, MS_STRICTATIME, ZS_COMMENT },
#endif
#ifdef MS_LAZYTIME
{ MNTOPT_LAZYTIME, MS_LAZYTIME, ZS_COMMENT },
#endif
{ MNTOPT_CONTEXT, MS_COMMENT, ZS_COMMENT },
{ MNTOPT_FSCONTEXT, MS_COMMENT, ZS_COMMENT },
{ MNTOPT_DEFCONTEXT, MS_COMMENT, ZS_COMMENT },
{ MNTOPT_ROOTCONTEXT, MS_COMMENT, ZS_COMMENT },
#ifdef MS_I_VERSION
{ MNTOPT_IVERSION, MS_I_VERSION, ZS_COMMENT },
#endif
#ifdef MS_MANDLOCK
{ MNTOPT_NBMAND, MS_MANDLOCK, ZS_COMMENT },
#endif
/* Valid options not found in mount(8) */
{ MNTOPT_BIND, MS_BIND, ZS_COMMENT },
#ifdef MS_REC
{ MNTOPT_RBIND, MS_BIND|MS_REC, ZS_COMMENT },
#endif
{ MNTOPT_COMMENT, MS_COMMENT, ZS_COMMENT },
#ifdef MS_NOSUB
{ MNTOPT_NOSUB, MS_NOSUB, ZS_COMMENT },
#endif
#ifdef MS_SILENT
{ MNTOPT_QUIET, MS_SILENT, ZS_COMMENT },
#endif
/* Custom zfs options */
{ MNTOPT_XATTR, MS_COMMENT, ZS_COMMENT },
{ MNTOPT_NOXATTR, MS_COMMENT, ZS_COMMENT },
{ MNTOPT_ZFSUTIL, MS_COMMENT, ZS_ZFSUTIL },
{ NULL, 0, 0 } };
/*
* Opportunistically convert a target string into a pool name. If the
* string does not represent a block device with a valid zfs label
* then it is passed through without modification.
* Break the mount option in to a name/value pair. The name is
* validated against the option map and mount flags set accordingly.
*/
static void
parse_dataset(const char *target, char **dataset)
static int
parse_option(char *mntopt, unsigned long *mntflags,
unsigned long *zfsflags, int sloppy)
{
const option_map_t *opt;
char *ptr, *name, *value = NULL;
int error = 0;
name = strdup(mntopt);
if (name == NULL)
return (ENOMEM);
for (ptr = name; ptr && *ptr; ptr++) {
if (*ptr == '=') {
*ptr = '\0';
value = ptr+1;
VERIFY3P(value, !=, NULL);
break;
}
}
for (opt = option_map; opt->name != NULL; opt++) {
if (strncmp(name, opt->name, strlen(name)) == 0) {
*mntflags |= opt->mntmask;
*zfsflags |= opt->zfsmask;
error = 0;
goto out;
}
}
if (!sloppy)
error = ENOENT;
out:
/* If required further process on the value may be done here */
free(name);
return (error);
}
/*
* Translate the mount option string in to MS_* mount flags for the
* kernel vfs. When sloppy is non-zero unknown options will be ignored
* otherwise they are considered fatal are copied in to badopt.
*/
static int
parse_options(char *mntopts, unsigned long *mntflags, unsigned long *zfsflags,
int sloppy, char *badopt, char *mtabopt)
{
int error = 0, quote = 0, flag = 0, count = 0;
char *ptr, *opt, *opts;
opts = strdup(mntopts);
if (opts == NULL)
return (ENOMEM);
*mntflags = 0;
opt = NULL;
/*
* Prior to util-linux 2.36.2, if a file or directory in the
* current working directory was named 'dataset' then mount(8)
* would prepend the current working directory to the dataset.
* Check for it and strip the prepended path when it is added.
* Scan through all mount options which must be comma delimited.
* We must be careful to notice regions which are double quoted
* and skip commas in these regions. Each option is then checked
* to determine if it is a known option.
*/
for (ptr = opts; ptr && !flag; ptr++) {
if (opt == NULL)
opt = ptr;
if (*ptr == '"')
quote = !quote;
if (quote)
continue;
if (*ptr == '\0')
flag = 1;
if ((*ptr == ',') || (*ptr == '\0')) {
*ptr = '\0';
error = parse_option(opt, mntflags, zfsflags, sloppy);
if (error) {
strcpy(badopt, opt);
goto out;
}
if (!(*mntflags & MS_REMOUNT) &&
!(*zfsflags & ZS_ZFSUTIL)) {
if (count > 0)
strlcat(mtabopt, ",", MNT_LINE_MAX);
strlcat(mtabopt, opt, MNT_LINE_MAX);
count++;
}
opt = NULL;
}
}
out:
free(opts);
return (error);
}
/*
* Return the pool/dataset to mount given the name passed to mount. This
* is expected to be of the form pool/dataset, however may also refer to
* a block device if that device contains a valid zfs label.
*/
static char *
parse_dataset(char *dataset)
{
char cwd[PATH_MAX];
if (getcwd(cwd, PATH_MAX) == NULL) {
perror("getcwd");
return;
struct stat64 statbuf;
int error;
int len;
/*
* We expect a pool/dataset to be provided, however if we're
* given a device which is a member of a zpool we attempt to
* extract the pool name stored in the label. Given the pool
* name we can mount the root dataset.
*/
error = stat64(dataset, &statbuf);
if (error == 0) {
nvlist_t *config;
char *name;
int fd;
fd = open(dataset, O_RDONLY);
if (fd < 0)
goto out;
error = zpool_read_label(fd, &config, NULL);
(void) close(fd);
if (error)
goto out;
error = nvlist_lookup_string(config,
ZPOOL_CONFIG_POOL_NAME, &name);
if (error) {
nvlist_free(config);
} else {
dataset = strdup(name);
nvlist_free(config);
return (dataset);
}
}
int len = strlen(cwd);
if (strncmp(cwd, target, len) == 0)
target += len;
out:
/*
* If a file or directory in your current working directory is
* named 'dataset' then mount(8) will prepend your current working
* directory to the dataset. There is no way to prevent this
* behavior so we simply check for it and strip the prepended
* patch when it is added.
*/
if (getcwd(cwd, PATH_MAX) == NULL)
return (dataset);
/* Assume pool/dataset is more likely */
strlcpy(*dataset, target, PATH_MAX);
len = strlen(cwd);
int fd = open(target, O_RDONLY | O_CLOEXEC);
if (fd < 0)
return;
/* Do not add one when cwd already ends in a trailing '/' */
if (strncmp(cwd, dataset, len) == 0)
return (dataset + len + (cwd[len-1] != '/'));
nvlist_t *cfg = NULL;
if (zpool_read_label(fd, &cfg, NULL) == 0) {
const char *nm = NULL;
if (!nvlist_lookup_string(cfg, ZPOOL_CONFIG_POOL_NAME, &nm))
strlcpy(*dataset, nm, PATH_MAX);
nvlist_free(cfg);
}
if (close(fd))
perror("close");
return (dataset);
}
/*
@@ -109,26 +309,25 @@ mtab_is_writeable(void)
}
static int
mtab_update(const char *dataset, const char *mntpoint, const char *type,
const char *mntopts)
mtab_update(char *dataset, char *mntpoint, char *type, char *mntopts)
{
struct mntent mnt;
FILE *fp;
int error;
mnt.mnt_fsname = (char *)dataset;
mnt.mnt_dir = (char *)mntpoint;
mnt.mnt_type = (char *)type;
mnt.mnt_opts = (char *)(mntopts ?: "");
mnt.mnt_fsname = dataset;
mnt.mnt_dir = mntpoint;
mnt.mnt_type = type;
mnt.mnt_opts = mntopts ? mntopts : "";
mnt.mnt_freq = 0;
mnt.mnt_passno = 0;
fp = setmntent("/etc/mtab", "a+e");
fp = setmntent("/etc/mtab", "a+");
if (!fp) {
(void) fprintf(stderr, gettext(
"filesystem '%s' was mounted, but /etc/mtab "
"could not be opened due to error: %s\n"),
dataset, strerror(errno));
"could not be opened due to error %d\n"),
dataset, errno);
return (MOUNT_FILEIO);
}
@@ -136,8 +335,8 @@ mtab_update(const char *dataset, const char *mntpoint, const char *type,
if (error) {
(void) fprintf(stderr, gettext(
"filesystem '%s' was mounted, but /etc/mtab "
"could not be updated due to error: %s\n"),
dataset, strerror(errno));
"could not be updated due to error %d\n"),
dataset, errno);
return (MOUNT_FILEIO);
}
@@ -146,6 +345,34 @@ mtab_update(const char *dataset, const char *mntpoint, const char *type,
return (MOUNT_SUCCESS);
}
static void
append_mntopt(const char *name, const char *val, char *mntopts,
char *mtabopt, boolean_t quote)
{
char tmp[MNT_LINE_MAX];
snprintf(tmp, MNT_LINE_MAX, quote ? ",%s=\"%s\"" : ",%s=%s", name, val);
if (mntopts)
strlcat(mntopts, tmp, MNT_LINE_MAX);
if (mtabopt)
strlcat(mtabopt, tmp, MNT_LINE_MAX);
}
static void
zfs_selinux_setcontext(zfs_handle_t *zhp, zfs_prop_t zpt, const char *name,
char *mntopts, char *mtabopt)
{
char context[ZFS_MAXPROPLEN];
if (zfs_prop_get(zhp, zpt, context, sizeof (context),
NULL, NULL, 0, B_FALSE) == 0) {
if (strcmp(context, "none") != 0)
append_mntopt(name, context, mntopts, mtabopt, B_TRUE);
}
}
int
main(int argc, char **argv)
{
@@ -156,13 +383,12 @@ main(int argc, char **argv)
char badopt[MNT_LINE_MAX] = { '\0' };
char mtabopt[MNT_LINE_MAX] = { '\0' };
char mntpoint[PATH_MAX];
char dataset[PATH_MAX], *pdataset = dataset;
char *dataset;
unsigned long mntflags = 0, zfsflags = 0, remount = 0;
int sloppy = 0, fake = 0, verbose = 0, nomtab = 0, zfsutil = 0;
int error, c;
(void) setlocale(LC_ALL, "");
(void) setlocale(LC_NUMERIC, "C");
(void) textdomain(TEXT_DOMAIN);
opterr = 0;
@@ -187,11 +413,10 @@ main(int argc, char **argv)
break;
case 'h':
case '?':
if (optopt)
(void) fprintf(stderr,
gettext("Invalid option '%c'\n"), optopt);
(void) fprintf(stderr, gettext("Invalid option '%c'\n"),
optopt);
(void) fprintf(stderr, gettext("Usage: mount.zfs "
"[-sfnvh] [-o options] <dataset> <mountpoint>\n"));
"[-sfnv] [-o options] <dataset> <mountpoint>\n"));
return (MOUNT_USAGE);
}
}
@@ -213,18 +438,18 @@ main(int argc, char **argv)
return (MOUNT_USAGE);
}
parse_dataset(argv[0], &pdataset);
dataset = parse_dataset(argv[0]);
/* canonicalize the mount point */
if (realpath(argv[1], mntpoint) == NULL) {
(void) fprintf(stderr, gettext("filesystem '%s' cannot be "
"mounted at '%s' due to canonicalization error: %s\n"),
dataset, argv[1], strerror(errno));
"mounted at '%s' due to canonicalization error %d.\n"),
dataset, argv[1], errno);
return (MOUNT_SYSERR);
}
/* validate mount options and set mntflags */
error = zfs_parse_mount_options(mntopts, &mntflags, &zfsflags, sloppy,
error = parse_options(mntopts, &mntflags, &zfsflags, sloppy,
badopt, mtabopt);
if (error) {
switch (error) {
@@ -248,6 +473,13 @@ main(int argc, char **argv)
}
}
if (verbose)
(void) fprintf(stdout, gettext("mount.zfs:\n"
" dataset: \"%s\"\n mountpoint: \"%s\"\n"
" mountflags: 0x%lx\n zfsflags: 0x%lx\n"
" mountopts: \"%s\"\n mtabopts: \"%s\"\n"),
dataset, mntpoint, mntflags, zfsflags, mntopts, mtabopt);
if (mntflags & MS_REMOUNT) {
nomtab = 1;
remount = 1;
@@ -270,10 +502,33 @@ main(int argc, char **argv)
return (MOUNT_USAGE);
}
if (sloppy || libzfs_envvar_is_set("ZFS_MOUNT_HELPER")) {
zfs_adjust_mount_options(zhp, mntpoint, mntopts, mtabopt);
/*
* Checks to see if the ZFS_PROP_SELINUX_CONTEXT exists
* if it does, create a tmp variable in case it's needed
* checks to see if the selinux context is set to the default
* if it is, allow the setting of the other context properties
* this is needed because the 'context' property overrides others
* if it is not the default, set the 'context' property
*/
if (zfs_prop_get(zhp, ZFS_PROP_SELINUX_CONTEXT, prop, sizeof (prop),
NULL, NULL, 0, B_FALSE) == 0) {
if (strcmp(prop, "none") == 0) {
zfs_selinux_setcontext(zhp, ZFS_PROP_SELINUX_FSCONTEXT,
MNTOPT_FSCONTEXT, mntopts, mtabopt);
zfs_selinux_setcontext(zhp, ZFS_PROP_SELINUX_DEFCONTEXT,
MNTOPT_DEFCONTEXT, mntopts, mtabopt);
zfs_selinux_setcontext(zhp,
ZFS_PROP_SELINUX_ROOTCONTEXT, MNTOPT_ROOTCONTEXT,
mntopts, mtabopt);
} else {
append_mntopt(MNTOPT_CONTEXT, prop,
mntopts, mtabopt, B_TRUE);
}
}
/* A hint used to determine an auto-mounted snapshot mount point */
append_mntopt(MNTOPT_MNTPOINT, mntpoint, mntopts, NULL, B_FALSE);
/* treat all snapshots as legacy mount points */
if (zfs_get_type(zhp) == ZFS_TYPE_SNAPSHOT)
(void) strlcpy(prop, ZFS_MOUNTPOINT_LEGACY, ZFS_MAXPROPLEN);
@@ -290,11 +545,12 @@ main(int argc, char **argv)
if (zfs_version == 0) {
fprintf(stderr, gettext("unable to fetch "
"ZFS version for filesystem '%s'\n"), dataset);
zfs_close(zhp);
libzfs_fini(g_zfs);
return (MOUNT_SYSERR);
}
zfs_close(zhp);
libzfs_fini(g_zfs);
/*
* Legacy mount points may only be mounted using 'mount', never using
* 'zfs mount'. However, since 'zfs mount' actually invokes 'mount'
@@ -312,8 +568,6 @@ main(int argc, char **argv)
"Use 'zfs set mountpoint=%s' or 'mount -t zfs %s %s'.\n"
"See zfs(8) for more information.\n"),
dataset, mntpoint, dataset, mntpoint);
zfs_close(zhp);
libzfs_fini(g_zfs);
return (MOUNT_USAGE);
}
@@ -324,38 +578,14 @@ main(int argc, char **argv)
"Use 'zfs set mountpoint=%s' or 'zfs mount %s'.\n"
"See zfs(8) for more information.\n"),
dataset, "legacy", dataset);
zfs_close(zhp);
libzfs_fini(g_zfs);
return (MOUNT_USAGE);
}
if (verbose)
(void) fprintf(stdout, gettext("mount.zfs:\n"
" dataset: \"%s\"\n mountpoint: \"%s\"\n"
" mountflags: 0x%lx\n zfsflags: 0x%lx\n"
" mountopts: \"%s\"\n mtabopts: \"%s\"\n"),
dataset, mntpoint, mntflags, zfsflags, mntopts, mtabopt);
if (!fake) {
if (!remount && !sloppy &&
!libzfs_envvar_is_set("ZFS_MOUNT_HELPER")) {
error = zfs_mount_at(zhp, mntopts, mntflags, mntpoint);
if (error) {
(void) fprintf(stderr, "zfs_mount_at() failed: "
"%s", libzfs_error_description(g_zfs));
zfs_close(zhp);
libzfs_fini(g_zfs);
return (MOUNT_SYSERR);
}
} else {
error = mount(dataset, mntpoint, MNTTYPE_ZFS,
mntflags, mntopts);
}
error = mount(dataset, mntpoint, MNTTYPE_ZFS,
mntflags, mntopts);
}
zfs_close(zhp);
libzfs_fini(g_zfs);
if (error) {
switch (errno) {
case ENOENT:
@@ -390,8 +620,8 @@ main(int argc, char **argv)
"mount the filesystem again.\n"), dataset);
return (MOUNT_SYSERR);
}
/* fallthru */
#endif
zfs_fallthrough;
default:
(void) fprintf(stderr, gettext("filesystem "
"'%s' can not be mounted: %s\n"), dataset,
+1
View File
@@ -0,0 +1 @@
/raidz_test
+17 -10
View File
@@ -1,16 +1,23 @@
raidz_test_CFLAGS = $(AM_CFLAGS) $(KERNEL_CFLAGS)
raidz_test_CPPFLAGS = $(AM_CPPFLAGS) $(LIBZPOOL_CPPFLAGS)
include $(top_srcdir)/config/Rules.am
bin_PROGRAMS += raidz_test
CPPCHECKTARGETS += raidz_test
# Includes kernel code, generate warnings for large stack frames
AM_CFLAGS += $(FRAME_LARGER_THAN)
# Unconditionally enable ASSERTs
AM_CPPFLAGS += -DDEBUG -UNDEBUG
DEFAULT_INCLUDES += \
-I$(top_srcdir)/include \
-I$(top_srcdir)/lib/libspl/include
bin_PROGRAMS = raidz_test
raidz_test_SOURCES = \
%D%/raidz_bench.c \
%D%/raidz_test.c \
%D%/raidz_test.h
raidz_test.h \
raidz_test.c \
raidz_bench.c
raidz_test_LDADD = \
libzpool.la \
libzfs_core.la
$(top_builddir)/lib/libzpool/libzpool.la
raidz_test_LDADD += -lm
raidz_test_LDADD += -lm -ldl
+10 -26
View File
@@ -1,4 +1,3 @@
// SPDX-License-Identifier: CDDL-1.0
/*
* CDDL HEADER START
*
@@ -7,7 +6,7 @@
* You may not use this file except in compliance with the License.
*
* You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
* or https://opensource.org/licenses/CDDL-1.0.
* or http://www.opensolaris.org/os/licensing.
* See the License for the specific language governing permissions
* and limitations under the License.
*
@@ -32,6 +31,8 @@
#include <sys/vdev_raidz_impl.h>
#include <stdio.h>
#include <sys/time.h>
#include "raidz_test.h"
#define GEN_BENCH_MEMORY (((uint64_t)1ULL)<<32)
@@ -64,7 +65,7 @@ bench_fini_raidz_maps(void)
{
/* tear down golden zio */
raidz_free(zio_bench.io_abd, max_data_size);
memset(&zio_bench, 0, sizeof (zio_t));
bzero(&zio_bench, sizeof (zio_t));
}
static inline void
@@ -82,17 +83,8 @@ run_gen_bench_impl(const char *impl)
/* create suitable raidz_map */
ncols = rto_opts.rto_dcols + fn + 1;
zio_bench.io_size = 1ULL << ds;
if (rto_opts.rto_expand) {
rm_bench = vdev_raidz_map_alloc_expanded(
&zio_bench,
rto_opts.rto_ashift, ncols+1, ncols,
fn+1, rto_opts.rto_expand_offset,
0, B_FALSE);
} else {
rm_bench = vdev_raidz_map_alloc(&zio_bench,
BENCH_ASHIFT, ncols, fn+1);
}
rm_bench = vdev_raidz_map_alloc(&zio_bench,
BENCH_ASHIFT, ncols, fn+1);
/* estimate iteration count */
iter_cnt = GEN_BENCH_MEMORY;
@@ -121,7 +113,7 @@ run_gen_bench_impl(const char *impl)
}
}
static void
void
run_gen_bench(void)
{
char **impl_name;
@@ -171,16 +163,8 @@ run_rec_bench_impl(const char *impl)
(1ULL << BENCH_ASHIFT))
continue;
if (rto_opts.rto_expand) {
rm_bench = vdev_raidz_map_alloc_expanded(
&zio_bench,
BENCH_ASHIFT, ncols+1, ncols,
PARITY_PQR,
rto_opts.rto_expand_offset, 0, B_FALSE);
} else {
rm_bench = vdev_raidz_map_alloc(&zio_bench,
BENCH_ASHIFT, ncols, PARITY_PQR);
}
rm_bench = vdev_raidz_map_alloc(&zio_bench,
BENCH_ASHIFT, ncols, PARITY_PQR);
/* estimate iteration count */
iter_cnt = (REC_BENCH_MEMORY);
@@ -213,7 +197,7 @@ run_rec_bench_impl(const char *impl)
}
}
static void
void
run_rec_bench(void)
{
char **impl_name;
+70 -144
View File
@@ -1,4 +1,3 @@
// SPDX-License-Identifier: CDDL-1.0
/*
* CDDL HEADER START
*
@@ -7,7 +6,7 @@
* You may not use this file except in compliance with the License.
*
* You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
* or https://opensource.org/licenses/CDDL-1.0.
* or http://www.opensolaris.org/os/licensing.
* See the License for the specific language governing permissions
* and limitations under the License.
*
@@ -38,11 +37,11 @@
static int *rand_data;
raidz_test_opts_t rto_opts;
static char pid_s[16];
static char gdb[256];
static const char gdb_tmpl[] = "gdb -ex \"set pagination 0\" -p %d";
static void sig_handler(int signo)
{
int old_errno = errno;
struct sigaction action;
/*
* Restore default action and re-raise signal so SIGSEGV and
@@ -53,32 +52,22 @@ static void sig_handler(int signo)
action.sa_flags = 0;
(void) sigaction(signo, &action, NULL);
if (rto_opts.rto_gdb) {
pid_t pid = fork();
if (pid == 0) {
execlp("gdb", "gdb", "-ex", "set pagination 0",
"-p", pid_s, NULL);
_exit(-1);
} else if (pid > 0)
while (waitpid(pid, NULL, 0) == -1 && errno == EINTR)
;
}
if (rto_opts.rto_gdb)
if (system(gdb)) { }
raise(signo);
errno = old_errno;
}
static void print_opts(raidz_test_opts_t *opts, boolean_t force)
{
const char *verbose;
char *verbose;
switch (opts->rto_v) {
case D_ALL:
case 0:
verbose = "no";
break;
case D_INFO:
case 1:
verbose = "info";
break;
case D_DEBUG:
default:
verbose = "debug";
break;
@@ -88,20 +77,16 @@ static void print_opts(raidz_test_opts_t *opts, boolean_t force)
(void) fprintf(stdout, DBLSEP "Running with options:\n"
" (-a) zio ashift : %zu\n"
" (-o) zio offset : 1 << %zu\n"
" (-e) expanded map : %s\n"
" (-r) reflow offset : %llx\n"
" (-d) number of raidz data columns : %zu\n"
" (-s) size of DATA : 1 << %zu\n"
" (-S) sweep parameters : %s \n"
" (-v) verbose : %s \n\n",
opts->rto_ashift, /* -a */
ilog2(opts->rto_offset), /* -o */
opts->rto_expand ? "yes" : "no", /* -e */
(u_longlong_t)opts->rto_expand_offset, /* -r */
opts->rto_dcols, /* -d */
ilog2(opts->rto_dsize), /* -s */
opts->rto_sweep ? "yes" : "no", /* -S */
verbose); /* -v */
opts->rto_ashift, /* -a */
ilog2(opts->rto_offset), /* -o */
opts->rto_dcols, /* -d */
ilog2(opts->rto_dsize), /* -s */
opts->rto_sweep ? "yes" : "no", /* -S */
verbose); /* -v */
}
}
@@ -119,9 +104,7 @@ static void usage(boolean_t requested)
"\t[-S parameter sweep (default: %s)]\n"
"\t[-t timeout for parameter sweep test]\n"
"\t[-B benchmark all raidz implementations]\n"
"\t[-e use expanded raidz map (default: %s)]\n"
"\t[-r expanded raidz map reflow offset (default: %llx)]\n"
"\t[-v increase verbosity (default: %d)]\n"
"\t[-v increase verbosity (default: %zu)]\n"
"\t[-h (print help)]\n"
"\t[-T test the test, see if failure would be detected]\n"
"\t[-D debug (attach gdb on SIGSEGV)]\n"
@@ -131,9 +114,7 @@ static void usage(boolean_t requested)
o->rto_dcols, /* -d */
ilog2(o->rto_dsize), /* -s */
rto_opts.rto_sweep ? "yes" : "no", /* -S */
rto_opts.rto_expand ? "yes" : "no", /* -e */
(u_longlong_t)o->rto_expand_offset, /* -r */
o->rto_v); /* -v */
o->rto_v); /* -d */
exit(requested ? 0 : 1);
}
@@ -142,22 +123,19 @@ static void process_options(int argc, char **argv)
{
size_t value;
int opt;
raidz_test_opts_t *o = &rto_opts;
memcpy(o, &rto_opts_defaults, sizeof (*o));
bcopy(&rto_opts_defaults, o, sizeof (*o));
while ((opt = getopt(argc, argv, "TDBSvha:o:d:s:t:")) != -1) {
value = 0;
while ((opt = getopt(argc, argv, "TDBSvha:er:o:d:s:t:")) != -1) {
switch (opt) {
case 'a':
value = strtoull(optarg, NULL, 0);
o->rto_ashift = MIN(13, MAX(9, value));
break;
case 'e':
o->rto_expand = 1;
break;
case 'r':
o->rto_expand_offset = strtoull(optarg, NULL, 0);
break;
case 'o':
value = strtoull(optarg, NULL, 0);
o->rto_offset = ((1ULL << MIN(12, value)) >> 9) << 9;
@@ -201,34 +179,25 @@ static void process_options(int argc, char **argv)
}
}
#define DATA_COL(rr, i) ((rr)->rr_col[rr->rr_firstdatacol + (i)].rc_abd)
#define DATA_COL_SIZE(rr, i) ((rr)->rr_col[rr->rr_firstdatacol + (i)].rc_size)
#define DATA_COL(rm, i) ((rm)->rm_col[raidz_parity(rm) + (i)].rc_abd)
#define DATA_COL_SIZE(rm, i) ((rm)->rm_col[raidz_parity(rm) + (i)].rc_size)
#define CODE_COL(rr, i) ((rr)->rr_col[(i)].rc_abd)
#define CODE_COL_SIZE(rr, i) ((rr)->rr_col[(i)].rc_size)
#define CODE_COL(rm, i) ((rm)->rm_col[(i)].rc_abd)
#define CODE_COL_SIZE(rm, i) ((rm)->rm_col[(i)].rc_size)
static int
cmp_code(raidz_test_opts_t *opts, const raidz_map_t *rm, const int parity)
{
int r, i, ret = 0;
int i, ret = 0;
VERIFY(parity >= 1 && parity <= 3);
for (r = 0; r < rm->rm_nrows; r++) {
raidz_row_t * const rr = rm->rm_row[r];
raidz_row_t * const rrg = opts->rm_golden->rm_row[r];
for (i = 0; i < parity; i++) {
if (CODE_COL_SIZE(rrg, i) == 0) {
VERIFY0(CODE_COL_SIZE(rr, i));
continue;
}
if (abd_cmp(CODE_COL(rr, i),
CODE_COL(rrg, i)) != 0) {
ret++;
LOG_OPT(D_DEBUG, opts,
"\nParity block [%d] different!\n", i);
}
for (i = 0; i < parity; i++) {
if (abd_cmp(CODE_COL(rm, i), CODE_COL(opts->rm_golden, i))
!= 0) {
ret++;
LOG_OPT(D_DEBUG, opts,
"\nParity block [%d] different!\n", i);
}
}
return (ret);
@@ -237,26 +206,16 @@ cmp_code(raidz_test_opts_t *opts, const raidz_map_t *rm, const int parity)
static int
cmp_data(raidz_test_opts_t *opts, raidz_map_t *rm)
{
int r, i, dcols, ret = 0;
int i, ret = 0;
int dcols = opts->rm_golden->rm_cols - raidz_parity(opts->rm_golden);
for (r = 0; r < rm->rm_nrows; r++) {
raidz_row_t *rr = rm->rm_row[r];
raidz_row_t *rrg = opts->rm_golden->rm_row[r];
dcols = opts->rm_golden->rm_row[0]->rr_cols -
raidz_parity(opts->rm_golden);
for (i = 0; i < dcols; i++) {
if (DATA_COL_SIZE(rrg, i) == 0) {
VERIFY0(DATA_COL_SIZE(rr, i));
continue;
}
for (i = 0; i < dcols; i++) {
if (abd_cmp(DATA_COL(opts->rm_golden, i), DATA_COL(rm, i))
!= 0) {
ret++;
if (abd_cmp(DATA_COL(rrg, i),
DATA_COL(rr, i)) != 0) {
ret++;
LOG_OPT(D_DEBUG, opts,
"\nData block [%d] different!\n", i);
}
LOG_OPT(D_DEBUG, opts,
"\nData block [%d] different!\n", i);
}
}
return (ret);
@@ -265,41 +224,31 @@ cmp_data(raidz_test_opts_t *opts, raidz_map_t *rm)
static int
init_rand(void *data, size_t size, void *private)
{
size_t *offsetp = (size_t *)private;
size_t offset = *offsetp;
int i;
int *dst = (int *)data;
VERIFY3U(offset + size, <=, SPA_MAXBLOCKSIZE);
memcpy(data, (char *)rand_data + offset, size);
*offsetp = offset + size;
return (0);
}
for (i = 0; i < size / sizeof (int); i++)
dst[i] = rand_data[i];
static int
corrupt_rand_fill(void *data, size_t size, void *private)
{
(void) private;
memset(data, 0xAA, size);
return (0);
}
static void
corrupt_colums(raidz_map_t *rm, const int *tgts, const int cnt)
{
for (int r = 0; r < rm->rm_nrows; r++) {
raidz_row_t *rr = rm->rm_row[r];
for (int i = 0; i < cnt; i++) {
raidz_col_t *col = &rr->rr_col[tgts[i]];
abd_iterate_func(col->rc_abd, 0, col->rc_size,
corrupt_rand_fill, NULL);
}
int i;
raidz_col_t *col;
for (i = 0; i < cnt; i++) {
col = &rm->rm_col[tgts[i]];
abd_iterate_func(col->rc_abd, 0, col->rc_size, init_rand, NULL);
}
}
void
init_zio_abd(zio_t *zio)
{
size_t offset = 0;
abd_iterate_func(zio->io_abd, 0, zio->io_size, init_rand, &offset);
abd_iterate_func(zio->io_abd, 0, zio->io_size, init_rand, NULL);
}
static void
@@ -339,20 +288,10 @@ init_raidz_golden_map(raidz_test_opts_t *opts, const int parity)
VERIFY0(vdev_raidz_impl_set("original"));
if (opts->rto_expand) {
opts->rm_golden =
vdev_raidz_map_alloc_expanded(opts->zio_golden,
opts->rto_ashift, total_ncols+1, total_ncols,
parity, opts->rto_expand_offset, 0, B_FALSE);
rm_test = vdev_raidz_map_alloc_expanded(zio_test,
opts->rto_ashift, total_ncols+1, total_ncols,
parity, opts->rto_expand_offset, 0, B_FALSE);
} else {
opts->rm_golden = vdev_raidz_map_alloc(opts->zio_golden,
opts->rto_ashift, total_ncols, parity);
rm_test = vdev_raidz_map_alloc(zio_test,
opts->rto_ashift, total_ncols, parity);
}
opts->rm_golden = vdev_raidz_map_alloc(opts->zio_golden,
opts->rto_ashift, total_ncols, parity);
rm_test = vdev_raidz_map_alloc(zio_test,
opts->rto_ashift, total_ncols, parity);
VERIFY(opts->zio_golden);
VERIFY(opts->rm_golden);
@@ -386,19 +325,13 @@ init_raidz_map(raidz_test_opts_t *opts, zio_t **zio, const int parity)
*zio = umem_zalloc(sizeof (zio_t), UMEM_NOFAIL);
(*zio)->io_offset = opts->rto_offset;
(*zio)->io_offset = 0;
(*zio)->io_size = alloc_dsize;
(*zio)->io_abd = raidz_alloc(alloc_dsize);
init_zio_abd(*zio);
if (opts->rto_expand) {
rm = vdev_raidz_map_alloc_expanded(*zio,
opts->rto_ashift, total_ncols+1, total_ncols,
parity, opts->rto_expand_offset, 0, B_FALSE);
} else {
rm = vdev_raidz_map_alloc(*zio, opts->rto_ashift,
total_ncols, parity);
}
rm = vdev_raidz_map_alloc(*zio, opts->rto_ashift,
total_ncols, parity);
VERIFY(rm);
/* Make sure code columns are destroyed */
@@ -487,7 +420,7 @@ run_rec_check_impl(raidz_test_opts_t *opts, raidz_map_t *rm, const int fn)
if (fn < RAIDZ_REC_PQ) {
/* can reconstruct 1 failed data disk */
for (x0 = 0; x0 < opts->rto_dcols; x0++) {
if (x0 >= rm->rm_row[0]->rr_cols - raidz_parity(rm))
if (x0 >= rm->rm_cols - raidz_parity(rm))
continue;
/* Check if should stop */
@@ -512,11 +445,10 @@ run_rec_check_impl(raidz_test_opts_t *opts, raidz_map_t *rm, const int fn)
} else if (fn < RAIDZ_REC_PQR) {
/* can reconstruct 2 failed data disk */
for (x0 = 0; x0 < opts->rto_dcols; x0++) {
if (x0 >= rm->rm_row[0]->rr_cols - raidz_parity(rm))
if (x0 >= rm->rm_cols - raidz_parity(rm))
continue;
for (x1 = x0 + 1; x1 < opts->rto_dcols; x1++) {
if (x1 >= rm->rm_row[0]->rr_cols -
raidz_parity(rm))
if (x1 >= rm->rm_cols - raidz_parity(rm))
continue;
/* Check if should stop */
@@ -543,15 +475,14 @@ run_rec_check_impl(raidz_test_opts_t *opts, raidz_map_t *rm, const int fn)
} else {
/* can reconstruct 3 failed data disk */
for (x0 = 0; x0 < opts->rto_dcols; x0++) {
if (x0 >= rm->rm_row[0]->rr_cols - raidz_parity(rm))
if (x0 >= rm->rm_cols - raidz_parity(rm))
continue;
for (x1 = x0 + 1; x1 < opts->rto_dcols; x1++) {
if (x1 >= rm->rm_row[0]->rr_cols -
raidz_parity(rm))
if (x1 >= rm->rm_cols - raidz_parity(rm))
continue;
for (x2 = x1 + 1; x2 < opts->rto_dcols; x2++) {
if (x2 >= rm->rm_row[0]->rr_cols -
raidz_parity(rm))
if (x2 >=
rm->rm_cols - raidz_parity(rm))
continue;
/* Check if should stop */
@@ -667,7 +598,7 @@ static kcondvar_t sem_cv;
static int max_free_slots;
static int free_slots;
static __attribute__((noreturn)) void
static void
sweep_thread(void *arg)
{
int err = 0;
@@ -767,10 +698,8 @@ run_sweep(void)
opts = umem_zalloc(sizeof (raidz_test_opts_t), UMEM_NOFAIL);
opts->rto_ashift = ashift_v[a];
opts->rto_dcols = dcols_v[d];
opts->rto_offset = (1ULL << ashift_v[a]) * rand();
opts->rto_offset = (1 << ashift_v[a]) * rand();
opts->rto_dsize = size_v[s];
opts->rto_expand = rto_opts.rto_expand;
opts->rto_expand_offset = rto_opts.rto_expand_offset;
opts->rto_v = 0; /* be quiet */
VERIFY3P(thread_create(NULL, 0, sweep_thread, (void *) opts,
@@ -803,7 +732,6 @@ exit:
return (sweep_state == SWEEP_ERROR ? SWEEP_ERROR : 0);
}
int
main(int argc, char **argv)
{
@@ -811,8 +739,8 @@ main(int argc, char **argv)
struct sigaction action;
int err = 0;
/* init gdb pid string early */
(void) sprintf(pid_s, "%d", getpid());
/* init gdb string early */
(void) sprintf(gdb, gdb_tmpl, getpid());
action.sa_handler = sig_handler;
sigemptyset(&action.sa_mask);
@@ -829,7 +757,7 @@ main(int argc, char **argv)
process_options(argc, argv);
kernel_init(SPA_MODE_READ);
kernel_init(FREAD);
/* setup random data because rand() is not reentrant */
rand_data = (int *)umem_alloc(SPA_MAXBLOCKSIZE, UMEM_NOFAIL);
@@ -847,8 +775,6 @@ main(int argc, char **argv)
err = run_test(NULL);
}
mprotect(rand_data, SPA_MAXBLOCKSIZE, PROT_READ | PROT_WRITE);
umem_free(rand_data, SPA_MAXBLOCKSIZE);
kernel_fini();
+15 -23
View File
@@ -1,4 +1,3 @@
// SPDX-License-Identifier: CDDL-1.0
/*
* CDDL HEADER START
*
@@ -7,7 +6,7 @@
* You may not use this file except in compliance with the License.
*
* You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
* or https://opensource.org/licenses/CDDL-1.0.
* or http://www.opensolaris.org/os/licensing.
* See the License for the specific language governing permissions
* and limitations under the License.
*
@@ -29,7 +28,7 @@
#include <sys/spa.h>
static const char *const raidz_impl_names[] = {
static const char *raidz_impl_names[] = {
"original",
"scalar",
"sse2",
@@ -39,27 +38,18 @@ static const char *const raidz_impl_names[] = {
"avx512bw",
"aarch64_neon",
"aarch64_neonx2",
"powerpc_altivec",
NULL
};
enum raidz_verbosity {
D_ALL,
D_INFO,
D_DEBUG,
};
typedef struct raidz_test_opts {
size_t rto_ashift;
uint64_t rto_offset;
size_t rto_offset;
size_t rto_dcols;
size_t rto_dsize;
enum raidz_verbosity rto_v;
size_t rto_v;
size_t rto_sweep;
size_t rto_sweep_timeout;
size_t rto_benchmark;
size_t rto_expand;
uint64_t rto_expand_offset;
size_t rto_sanity;
size_t rto_gdb;
@@ -72,14 +62,12 @@ typedef struct raidz_test_opts {
static const raidz_test_opts_t rto_opts_defaults = {
.rto_ashift = 9,
.rto_offset = 0,
.rto_offset = 1ULL << 0,
.rto_dcols = 8,
.rto_dsize = 1<<19,
.rto_v = D_ALL,
.rto_v = 0,
.rto_sweep = 0,
.rto_benchmark = 0,
.rto_expand = 0,
.rto_expand_offset = -1ULL,
.rto_sanity = 0,
.rto_gdb = 0,
.rto_should_stop = B_FALSE
@@ -93,19 +81,23 @@ static inline size_t ilog2(size_t a)
}
#define LOG(lvl, ...) \
#define D_ALL 0
#define D_INFO 1
#define D_DEBUG 2
#define LOG(lvl, a...) \
{ \
if (rto_opts.rto_v >= lvl) \
(void) fprintf(stdout, __VA_ARGS__); \
(void) fprintf(stdout, a); \
} \
#define LOG_OPT(lvl, opt, ...) \
#define LOG_OPT(lvl, opt, a...) \
{ \
if (opt->rto_v >= lvl) \
(void) fprintf(stdout, __VA_ARGS__); \
(void) fprintf(stdout, a); \
} \
#define ERR(...) (void) fprintf(stderr, __VA_ARGS__)
#define ERR(a...) (void) fprintf(stderr, a)
#define DBLSEP "================\n"
+1
View File
@@ -0,0 +1 @@
dist_udev_SCRIPTS = vdev_id
+606
View File
@@ -0,0 +1,606 @@
#!/bin/sh
#
# vdev_id: udev helper to generate user-friendly names for JBOD disks
#
# This script parses the file /etc/zfs/vdev_id.conf to map a
# physical path in a storage topology to a channel name. The
# channel name is combined with a disk enclosure slot number to
# create an alias that reflects the physical location of the drive.
# This is particularly helpful when it comes to tasks like replacing
# failed drives. Slot numbers may also be re-mapped in case the
# default numbering is unsatisfactory. The drive aliases will be
# created as symbolic links in /dev/disk/by-vdev.
#
# The currently supported topologies are sas_direct and sas_switch.
# A multipath mode is supported in which dm-mpath devices are
# handled by examining the first-listed running component disk. In
# multipath mode the configuration file should contain a channel
# definition with the same name for each path to a given enclosure.
#
# The alias keyword provides a simple way to map already-existing
# device symlinks to more convenient names. It is suitable for
# small, static configurations or for sites that have some automated
# way to generate the mapping file.
#
#
# Some example configuration files are given below.
# #
# # Example vdev_id.conf - sas_direct.
# #
#
# multipath no
# topology sas_direct
# phys_per_port 4
# slot bay
#
# # PCI_ID HBA PORT CHANNEL NAME
# channel 85:00.0 1 A
# channel 85:00.0 0 B
# channel 86:00.0 1 C
# channel 86:00.0 0 D
#
# # Custom mapping for Channel A
#
# # Linux Mapped
# # Slot Slot Channel
# slot 1 7 A
# slot 2 10 A
# slot 3 3 A
# slot 4 6 A
#
# # Default mapping for B, C, and D
# slot 1 4
# slot 2 2
# slot 3 1
# slot 4 3
# #
# # Example vdev_id.conf - sas_switch
# #
#
# topology sas_switch
#
# # SWITCH PORT CHANNEL NAME
# channel 1 A
# channel 2 B
# channel 3 C
# channel 4 D
# #
# # Example vdev_id.conf - multipath
# #
#
# multipath yes
#
# # PCI_ID HBA PORT CHANNEL NAME
# channel 85:00.0 1 A
# channel 85:00.0 0 B
# channel 86:00.0 1 A
# channel 86:00.0 0 B
# #
# # Example vdev_id.conf - alias
# #
#
# # by-vdev
# # name fully qualified or base name of device link
# alias d1 /dev/disk/by-id/wwn-0x5000c5002de3b9ca
# alias d2 wwn-0x5000c5002def789e
PATH=/bin:/sbin:/usr/bin:/usr/sbin
CONFIG=/etc/zfs/vdev_id.conf
PHYS_PER_PORT=
DEV=
MULTIPATH=
TOPOLOGY=
BAY=
usage() {
cat << EOF
Usage: vdev_id [-h]
vdev_id <-d device> [-c config_file] [-p phys_per_port]
[-g sas_direct|sas_switch|scsi] [-m]
-c specify name of an alternative config file [default=$CONFIG]
-d specify basename of device (i.e. sda)
-e Create enclose device symlinks only (/dev/by-enclosure)
-g Storage network topology [default="$TOPOLOGY"]
-m Run in multipath mode
-p number of phy's per switch port [default=$PHYS_PER_PORT]
-h show this summary
EOF
exit 0
}
map_slot() {
local LINUX_SLOT=$1
local CHANNEL=$2
local MAPPED_SLOT=
MAPPED_SLOT=`awk "\\$1 == \"slot\" && \\$2 == ${LINUX_SLOT} && \
\\$4 ~ /^${CHANNEL}$|^$/ { print \\$3; exit }" $CONFIG`
if [ -z "$MAPPED_SLOT" ] ; then
MAPPED_SLOT=$LINUX_SLOT
fi
printf "%d" ${MAPPED_SLOT}
}
map_channel() {
local MAPPED_CHAN=
local PCI_ID=$1
local PORT=$2
case $TOPOLOGY in
"sas_switch")
MAPPED_CHAN=`awk "\\$1 == \"channel\" && \\$2 == ${PORT} \
{ print \\$3; exit }" $CONFIG`
;;
"sas_direct"|"scsi")
MAPPED_CHAN=`awk "\\$1 == \"channel\" && \
\\$2 == \"${PCI_ID}\" && \\$3 == ${PORT} \
{ print \\$4; exit }" $CONFIG`
;;
esac
printf "%s" ${MAPPED_CHAN}
}
sas_handler() {
if [ -z "$PHYS_PER_PORT" ] ; then
PHYS_PER_PORT=`awk "\\$1 == \"phys_per_port\" \
{print \\$2; exit}" $CONFIG`
fi
PHYS_PER_PORT=${PHYS_PER_PORT:-4}
if ! echo $PHYS_PER_PORT | grep -q -E '^[0-9]+$' ; then
echo "Error: phys_per_port value $PHYS_PER_PORT is non-numeric"
exit 1
fi
if [ -z "$MULTIPATH_MODE" ] ; then
MULTIPATH_MODE=`awk "\\$1 == \"multipath\" \
{print \\$2; exit}" $CONFIG`
fi
# Use first running component device if we're handling a dm-mpath device
if [ "$MULTIPATH_MODE" = "yes" ] ; then
# If udev didn't tell us the UUID via DM_NAME, check /dev/mapper
if [ -z "$DM_NAME" ] ; then
DM_NAME=`ls -l --full-time /dev/mapper |
awk "/\/$DEV$/{print \\$9}"`
fi
# For raw disks udev exports DEVTYPE=partition when
# handling partitions, and the rules can be written to
# take advantage of this to append a -part suffix. For
# dm devices we get DEVTYPE=disk even for partitions so
# we have to append the -part suffix directly in the
# helper.
if [ "$DEVTYPE" != "partition" ] ; then
PART=`echo $DM_NAME | awk -Fp '/p/{print "-part"$2}'`
fi
# Strip off partition information.
DM_NAME=`echo $DM_NAME | sed 's/p[0-9][0-9]*$//'`
if [ -z "$DM_NAME" ] ; then
return
fi
# Get the raw scsi device name from multipath -ll. Strip off
# leading pipe symbols to make field numbering consistent.
DEV=`multipath -ll $DM_NAME |
awk '/running/{gsub("^[|]"," "); print $3 ; exit}'`
if [ -z "$DEV" ] ; then
return
fi
fi
if echo $DEV | grep -q ^/devices/ ; then
sys_path=$DEV
else
sys_path=`udevadm info -q path -p /sys/block/$DEV 2>/dev/null`
fi
# Use positional parameters as an ad-hoc array
set -- $(echo "$sys_path" | tr / ' ')
num_dirs=$#
scsi_host_dir="/sys"
# Get path up to /sys/.../hostX
i=1
while [ $i -le $num_dirs ] ; do
d=$(eval echo \${$i})
scsi_host_dir="$scsi_host_dir/$d"
echo $d | grep -q -E '^host[0-9]+$' && break
i=$(($i + 1))
done
if [ $i = $num_dirs ] ; then
return
fi
PCI_ID=$(eval echo \${$(($i -1))} | awk -F: '{print $2":"$3}')
# In sas_switch mode, the directory four levels beneath
# /sys/.../hostX contains symlinks to phy devices that reveal
# the switch port number. In sas_direct mode, the phy links one
# directory down reveal the HBA port.
port_dir=$scsi_host_dir
case $TOPOLOGY in
"sas_switch") j=$(($i + 4)) ;;
"sas_direct") j=$(($i + 1)) ;;
esac
i=$(($i + 1))
while [ $i -le $j ] ; do
port_dir="$port_dir/$(eval echo \${$i})"
i=$(($i + 1))
done
PHY=`ls -d $port_dir/phy* 2>/dev/null | head -1 | awk -F: '{print $NF}'`
if [ -z "$PHY" ] ; then
PHY=0
fi
PORT=$(( $PHY / $PHYS_PER_PORT ))
# Look in /sys/.../sas_device/end_device-X for the bay_identifier
# attribute.
end_device_dir=$port_dir
while [ $i -lt $num_dirs ] ; do
d=$(eval echo \${$i})
end_device_dir="$end_device_dir/$d"
if echo $d | grep -q '^end_device' ; then
end_device_dir="$end_device_dir/sas_device/$d"
break
fi
i=$(($i + 1))
done
SLOT=
case $BAY in
"bay")
SLOT=`cat $end_device_dir/bay_identifier 2>/dev/null`
;;
"phy")
SLOT=`cat $end_device_dir/phy_identifier 2>/dev/null`
;;
"port")
d=$(eval echo \${$i})
SLOT=`echo $d | sed -e 's/^.*://'`
;;
"id")
i=$(($i + 1))
d=$(eval echo \${$i})
SLOT=`echo $d | sed -e 's/^.*://'`
;;
"lun")
i=$(($i + 2))
d=$(eval echo \${$i})
SLOT=`echo $d | sed -e 's/^.*://'`
;;
"ses")
# look for this SAS path in all SCSI Enclosure Services
# (SES) enclosures
sas_address=`cat $end_device_dir/sas_address 2>/dev/null`
enclosures=`lsscsi -g | \
sed -n -e '/enclosu/s/^.* \([^ ][^ ]*\) *$/\1/p'`
for enclosure in $enclosures; do
set -- $(sg_ses -p aes $enclosure | \
awk "/device slot number:/{slot=\$12} \
/SAS address: $sas_address/\
{print slot}")
SLOT=$1
if [ -n "$SLOT" ] ; then
break
fi
done
;;
esac
if [ -z "$SLOT" ] ; then
return
fi
CHAN=`map_channel $PCI_ID $PORT`
SLOT=`map_slot $SLOT $CHAN`
if [ -z "$CHAN" ] ; then
return
fi
echo ${CHAN}${SLOT}${PART}
}
scsi_handler() {
if [ -z "$FIRST_BAY_NUMBER" ] ; then
FIRST_BAY_NUMBER=`awk "\\$1 == \"first_bay_number\" \
{print \\$2; exit}" $CONFIG`
fi
FIRST_BAY_NUMBER=${FIRST_BAY_NUMBER:-0}
if [ -z "$PHYS_PER_PORT" ] ; then
PHYS_PER_PORT=`awk "\\$1 == \"phys_per_port\" \
{print \\$2; exit}" $CONFIG`
fi
PHYS_PER_PORT=${PHYS_PER_PORT:-4}
if ! echo $PHYS_PER_PORT | grep -q -E '^[0-9]+$' ; then
echo "Error: phys_per_port value $PHYS_PER_PORT is non-numeric"
exit 1
fi
if [ -z "$MULTIPATH_MODE" ] ; then
MULTIPATH_MODE=`awk "\\$1 == \"multipath\" \
{print \\$2; exit}" $CONFIG`
fi
# Use first running component device if we're handling a dm-mpath device
if [ "$MULTIPATH_MODE" = "yes" ] ; then
# If udev didn't tell us the UUID via DM_NAME, check /dev/mapper
if [ -z "$DM_NAME" ] ; then
DM_NAME=`ls -l --full-time /dev/mapper |
awk "/\/$DEV$/{print \\$9}"`
fi
# For raw disks udev exports DEVTYPE=partition when
# handling partitions, and the rules can be written to
# take advantage of this to append a -part suffix. For
# dm devices we get DEVTYPE=disk even for partitions so
# we have to append the -part suffix directly in the
# helper.
if [ "$DEVTYPE" != "partition" ] ; then
PART=`echo $DM_NAME | awk -Fp '/p/{print "-part"$2}'`
fi
# Strip off partition information.
DM_NAME=`echo $DM_NAME | sed 's/p[0-9][0-9]*$//'`
if [ -z "$DM_NAME" ] ; then
return
fi
# Get the raw scsi device name from multipath -ll. Strip off
# leading pipe symbols to make field numbering consistent.
DEV=`multipath -ll $DM_NAME |
awk '/running/{gsub("^[|]"," "); print $3 ; exit}'`
if [ -z "$DEV" ] ; then
return
fi
fi
if echo $DEV | grep -q ^/devices/ ; then
sys_path=$DEV
else
sys_path=`udevadm info -q path -p /sys/block/$DEV 2>/dev/null`
fi
# expect sys_path like this, for example:
# /devices/pci0000:00/0000:00:0b.0/0000:09:00.0/0000:0a:05.0/0000:0c:00.0/host3/target3:1:0/3:1:0:21/block/sdv
# Use positional parameters as an ad-hoc array
set -- $(echo "$sys_path" | tr / ' ')
num_dirs=$#
scsi_host_dir="/sys"
# Get path up to /sys/.../hostX
i=1
while [ $i -le $num_dirs ] ; do
d=$(eval echo \${$i})
scsi_host_dir="$scsi_host_dir/$d"
echo $d | grep -q -E '^host[0-9]+$' && break
i=$(($i + 1))
done
if [ $i = $num_dirs ] ; then
return
fi
PCI_ID=$(eval echo \${$(($i -1))} | awk -F: '{print $2":"$3}')
# In scsi mode, the directory two levels beneath
# /sys/.../hostX reveals the port and slot.
port_dir=$scsi_host_dir
j=$(($i + 2))
i=$(($i + 1))
while [ $i -le $j ] ; do
port_dir="$port_dir/$(eval echo \${$i})"
i=$(($i + 1))
done
set -- $(echo $port_dir | sed -e 's/^.*:\([^:]*\):\([^:]*\)$/\1 \2/')
PORT=$1
SLOT=$(($2 + $FIRST_BAY_NUMBER))
if [ -z "$SLOT" ] ; then
return
fi
CHAN=`map_channel $PCI_ID $PORT`
SLOT=`map_slot $SLOT $CHAN`
if [ -z "$CHAN" ] ; then
return
fi
echo ${CHAN}${SLOT}${PART}
}
# Figure out the name for the enclosure symlink
enclosure_handler () {
# We get all the info we need from udev's DEVPATH variable:
#
# DEVPATH=/sys/devices/pci0000:00/0000:00:03.0/0000:05:00.0/host0/subsystem/devices/0:0:0:0/scsi_generic/sg0
# Get the enclosure ID ("0:0:0:0")
ENC=$(basename $(readlink -m "/sys/$DEVPATH/../.."))
if [ ! -d /sys/class/enclosure/$ENC ] ; then
# Not an enclosure, bail out
return
fi
# Get the long sysfs device path to our enclosure. Looks like:
# /devices/pci0000:00/0000:00:03.0/0000:05:00.0/host0/port-0:0/ ... /enclosure/0:0:0:0
ENC_DEVICE=$(readlink /sys/class/enclosure/$ENC)
# Grab the full path to the hosts port dir:
# /devices/pci0000:00/0000:00:03.0/0000:05:00.0/host0/port-0:0
PORT_DIR=$(echo $ENC_DEVICE | grep -Eo '.+host[0-9]+/port-[0-9]+:[0-9]+')
# Get the port number
PORT_ID=$(echo $PORT_DIR | grep -Eo "[0-9]+$")
# The PCI directory is two directories up from the port directory
# /sys/devices/pci0000:00/0000:00:03.0/0000:05:00.0
PCI_ID_LONG=$(basename $(readlink -m "/sys/$PORT_DIR/../.."))
# Strip down the PCI address from 0000:05:00.0 to 05:00.0
PCI_ID=$(echo "$PCI_ID_LONG" | sed -r 's/^[0-9]+://g')
# Name our device according to vdev_id.conf (like "L0" or "U1").
NAME=$(awk "/channel/{if (\$1 == \"channel\" && \$2 == \"$PCI_ID\" && \
\$3 == \"$PORT_ID\") {print \$4int(count[\$4])}; count[\$4]++}" $CONFIG)
echo "${NAME}"
}
alias_handler () {
# Special handling is needed to correctly append a -part suffix
# to partitions of device mapper devices. The DEVTYPE attribute
# is normally set to "disk" instead of "partition" in this case,
# so the udev rules won't handle that for us as they do for
# "plain" block devices.
#
# For example, we may have the following links for a device and its
# partitions,
#
# /dev/disk/by-id/dm-name-isw_dibgbfcije_ARRAY0 -> ../../dm-0
# /dev/disk/by-id/dm-name-isw_dibgbfcije_ARRAY0p1 -> ../../dm-1
# /dev/disk/by-id/dm-name-isw_dibgbfcije_ARRAY0p2 -> ../../dm-3
#
# and the following alias in vdev_id.conf.
#
# alias A0 dm-name-isw_dibgbfcije_ARRAY0
#
# The desired outcome is for the following links to be created
# without having explicitly defined aliases for the partitions.
#
# /dev/disk/by-vdev/A0 -> ../../dm-0
# /dev/disk/by-vdev/A0-part1 -> ../../dm-1
# /dev/disk/by-vdev/A0-part2 -> ../../dm-3
#
# Warning: The following grep pattern will misidentify whole-disk
# devices whose names end with 'p' followed by a string of
# digits as partitions, causing alias creation to fail. This
# ambiguity seems unavoidable, so devices using this facility
# must not use such names.
local DM_PART=
if echo $DM_NAME | grep -q -E 'p[0-9][0-9]*$' ; then
if [ "$DEVTYPE" != "partition" ] ; then
DM_PART=`echo $DM_NAME | awk -Fp '/p/{print "-part"$2}'`
fi
fi
# DEVLINKS attribute must have been populated by already-run udev rules.
for link in $DEVLINKS ; do
# Remove partition information to match key of top-level device.
if [ -n "$DM_PART" ] ; then
link=`echo $link | sed 's/p[0-9][0-9]*$//'`
fi
# Check both the fully qualified and the base name of link.
for l in $link `basename $link` ; do
alias=`awk "\\$1 == \"alias\" && \\$3 == \"${l}\" \
{ print \\$2; exit }" $CONFIG`
if [ -n "$alias" ] ; then
echo ${alias}${DM_PART}
return
fi
done
done
}
while getopts 'c:d:eg:mp:h' OPTION; do
case ${OPTION} in
c)
CONFIG=${OPTARG}
;;
d)
DEV=${OPTARG}
;;
e)
# When udev sees a scsi_generic device, it calls this script with -e to
# create the enclosure device symlinks only. We also need
# "enclosure_symlinks yes" set in vdev_id.config to actually create the
# symlink.
ENCLOSURE_MODE=$(awk '{if ($1 == "enclosure_symlinks") print $2}' $CONFIG)
if [ "$ENCLOSURE_MODE" != "yes" ] ; then
exit 0
fi
;;
g)
TOPOLOGY=$OPTARG
;;
p)
PHYS_PER_PORT=${OPTARG}
;;
m)
MULTIPATH_MODE=yes
;;
h)
usage
;;
esac
done
if [ ! -r $CONFIG ] ; then
exit 0
fi
if [ -z "$DEV" -a -z "$ENCLOSURE_MODE" ] ; then
echo "Error: missing required option -d"
exit 1
fi
if [ -z "$TOPOLOGY" ] ; then
TOPOLOGY=`awk "\\$1 == \"topology\" {print \\$2; exit}" $CONFIG`
fi
if [ -z "$BAY" ] ; then
BAY=`awk "\\$1 == \"slot\" {print \\$2; exit}" $CONFIG`
fi
TOPOLOGY=${TOPOLOGY:-sas_direct}
# Should we create /dev/by-enclosure symlinks?
if [ "$ENCLOSURE_MODE" = "yes" -a "$TOPOLOGY" = "sas_direct" ] ; then
ID_ENCLOSURE=$(enclosure_handler)
if [ -z "$ID_ENCLOSURE" ] ; then
exit 0
fi
# Just create the symlinks to the enclosure devices and then exit.
ENCLOSURE_PREFIX=$(awk '/enclosure_symlinks_prefix/{print $2}' $CONFIG)
if [ -z "$ENCLOSURE_PREFIX" ] ; then
ENCLOSURE_PREFIX="enc"
fi
echo "ID_ENCLOSURE=$ID_ENCLOSURE"
echo "ID_ENCLOSURE_PATH=by-enclosure/$ENCLOSURE_PREFIX-$ID_ENCLOSURE"
exit 0
fi
# First check if an alias was defined for this device.
ID_VDEV=`alias_handler`
if [ -z "$ID_VDEV" ] ; then
BAY=${BAY:-bay}
case $TOPOLOGY in
sas_direct|sas_switch)
ID_VDEV=`sas_handler`
;;
scsi)
ID_VDEV=`scsi_handler`
;;
*)
echo "Error: unknown topology $TOPOLOGY"
exit 1
;;
esac
fi
if [ -n "$ID_VDEV" ] ; then
echo "ID_VDEV=${ID_VDEV}"
echo "ID_VDEV_PATH=disk/by-vdev/${ID_VDEV}"
fi
-805
View File
@@ -1,805 +0,0 @@
#!/usr/bin/env @PYTHON_SHEBANG@
# SPDX-License-Identifier: CDDL-1.0
#
# Print out ZFS ARC Statistics exported via kstat(1)
# For a definition of fields, or usage, use zarcstat -v
#
# This script was originally a fork of the original arcstat.pl (0.1)
# by Neelakanth Nadgir, originally published on his Sun blog on
# 09/18/2007
# http://blogs.sun.com/realneel/entry/zfs_arc_statistics
#
# A new version aimed to improve upon the original by adding features
# and fixing bugs as needed. This version was maintained by Mike
# Harsch and was hosted in a public open source repository:
# http://github.com/mharsch/arcstat
#
# but has since moved to the illumos-gate repository.
#
# This Python port was written by John Hixson for FreeNAS, introduced
# in commit e2c29f:
# https://github.com/freenas/freenas
#
# and has been improved by many people since.
#
# CDDL HEADER START
#
# The contents of this file are subject to the terms of the
# Common Development and Distribution License, Version 1.0 only
# (the "License"). You may not use this file except in compliance
# with the License.
#
# You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
# or https://opensource.org/licenses/CDDL-1.0.
# See the License for the specific language governing permissions
# and limitations under the License.
#
# When distributing Covered Code, include this CDDL HEADER in each
# file and include the License file at usr/src/OPENSOLARIS.LICENSE.
# If applicable, add the following below this CDDL HEADER, with the
# fields enclosed by brackets "[]" replaced with your own identifying
# information: Portions Copyright [yyyy] [name of copyright owner]
#
# CDDL HEADER END
#
#
# Fields have a fixed width. Every interval, we fill the "v"
# hash with its corresponding value (v[field]=value) using calculate().
# @hdr is the array of fields that needs to be printed, so we
# just iterate over this array and print the values using our pretty printer.
#
# This script must remain compatible with Python 3.6+.
#
import sys
import time
import getopt
import re
import copy
import os
from signal import signal, SIGINT, SIGWINCH, SIG_DFL
cols = {
# HDR: [Size, Scale, Description]
"time": [8, -1, "Time"],
"hits": [4, 1000, "ARC hits per second"],
"iohs": [4, 1000, "ARC I/O hits per second"],
"miss": [4, 1000, "ARC misses per second"],
"read": [4, 1000, "Total ARC accesses per second"],
"hit%": [4, 100, "ARC hit percentage"],
"ioh%": [4, 100, "ARC I/O hit percentage"],
"miss%": [5, 100, "ARC miss percentage"],
"dhit": [4, 1000, "Demand hits per second"],
"dioh": [4, 1000, "Demand I/O hits per second"],
"dmis": [4, 1000, "Demand misses per second"],
"dh%": [3, 100, "Demand hit percentage"],
"di%": [3, 100, "Demand I/O hit percentage"],
"dm%": [3, 100, "Demand miss percentage"],
"ddhit": [5, 1000, "Demand data hits per second"],
"ddioh": [5, 1000, "Demand data I/O hits per second"],
"ddmis": [5, 1000, "Demand data misses per second"],
"ddh%": [4, 100, "Demand data hit percentage"],
"ddi%": [4, 100, "Demand data I/O hit percentage"],
"ddm%": [4, 100, "Demand data miss percentage"],
"dmhit": [5, 1000, "Demand metadata hits per second"],
"dmioh": [5, 1000, "Demand metadata I/O hits per second"],
"dmmis": [5, 1000, "Demand metadata misses per second"],
"dmh%": [4, 100, "Demand metadata hit percentage"],
"dmi%": [4, 100, "Demand metadata I/O hit percentage"],
"dmm%": [4, 100, "Demand metadata miss percentage"],
"phit": [4, 1000, "Prefetch hits per second"],
"pioh": [4, 1000, "Prefetch I/O hits per second"],
"pmis": [4, 1000, "Prefetch misses per second"],
"ph%": [3, 100, "Prefetch hits percentage"],
"pi%": [3, 100, "Prefetch I/O hits percentage"],
"pm%": [3, 100, "Prefetch miss percentage"],
"pdhit": [5, 1000, "Prefetch data hits per second"],
"pdioh": [5, 1000, "Prefetch data I/O hits per second"],
"pdmis": [5, 1000, "Prefetch data misses per second"],
"pdh%": [4, 100, "Prefetch data hits percentage"],
"pdi%": [4, 100, "Prefetch data I/O hits percentage"],
"pdm%": [4, 100, "Prefetch data miss percentage"],
"pmhit": [5, 1000, "Prefetch metadata hits per second"],
"pmioh": [5, 1000, "Prefetch metadata I/O hits per second"],
"pmmis": [5, 1000, "Prefetch metadata misses per second"],
"pmh%": [4, 100, "Prefetch metadata hits percentage"],
"pmi%": [4, 100, "Prefetch metadata I/O hits percentage"],
"pmm%": [4, 100, "Prefetch metadata miss percentage"],
"mhit": [4, 1000, "Metadata hits per second"],
"mioh": [4, 1000, "Metadata I/O hits per second"],
"mmis": [4, 1000, "Metadata misses per second"],
"mread": [5, 1000, "Metadata accesses per second"],
"mh%": [3, 100, "Metadata hit percentage"],
"mi%": [3, 100, "Metadata I/O hit percentage"],
"mm%": [3, 100, "Metadata miss percentage"],
"arcsz": [5, 1024, "ARC size"],
"size": [5, 1024, "ARC size"],
"c": [5, 1024, "ARC target size"],
"mfu": [4, 1000, "MFU list hits per second"],
"mru": [4, 1000, "MRU list hits per second"],
"mfug": [4, 1000, "MFU ghost list hits per second"],
"mrug": [4, 1000, "MRU ghost list hits per second"],
"unc": [4, 1000, "Uncached list hits per second"],
"eskip": [5, 1000, "evict_skip per second"],
"el2skip": [7, 1000, "evict skip, due to l2 writes, per second"],
"el2cach": [7, 1024, "Size of L2 cached evictions per second"],
"el2el": [5, 1024, "Size of L2 eligible evictions per second"],
"el2mfu": [6, 1024, "Size of L2 eligible MFU evictions per second"],
"el2mru": [6, 1024, "Size of L2 eligible MRU evictions per second"],
"el2inel": [7, 1024, "Size of L2 ineligible evictions per second"],
"mtxmis": [6, 1000, "mutex_miss per second"],
"dread": [5, 1000, "Demand accesses per second"],
"ddread": [6, 1000, "Demand data accesses per second"],
"dmread": [6, 1000, "Demand metadata accesses per second"],
"pread": [5, 1000, "Prefetch accesses per second"],
"pdread": [6, 1000, "Prefetch data accesses per second"],
"pmread": [6, 1000, "Prefetch metadata accesses per second"],
"l2hits": [6, 1000, "L2ARC hits per second"],
"l2miss": [6, 1000, "L2ARC misses per second"],
"l2read": [6, 1000, "Total L2ARC accesses per second"],
"l2hit%": [6, 100, "L2ARC access hit percentage"],
"l2miss%": [7, 100, "L2ARC access miss percentage"],
"l2pref": [6, 1024, "L2ARC prefetch allocated size"],
"l2mfu": [5, 1024, "L2ARC MFU allocated size"],
"l2mru": [5, 1024, "L2ARC MRU allocated size"],
"l2data": [6, 1024, "L2ARC data allocated size"],
"l2meta": [6, 1024, "L2ARC metadata allocated size"],
"l2pref%": [7, 100, "L2ARC prefetch percentage"],
"l2mfu%": [6, 100, "L2ARC MFU percentage"],
"l2mru%": [6, 100, "L2ARC MRU percentage"],
"l2data%": [7, 100, "L2ARC data percentage"],
"l2meta%": [7, 100, "L2ARC metadata percentage"],
"l2asize": [7, 1024, "Actual (compressed) size of the L2ARC"],
"l2size": [6, 1024, "Size of the L2ARC"],
"l2bytes": [7, 1024, "Bytes read per second from the L2ARC"],
"l2wbytes": [8, 1024, "Bytes written per second to the L2ARC"],
"grow": [4, 1000, "ARC grow disabled"],
"need": [5, 1024, "ARC reclaim need"],
"free": [5, 1024, "ARC free memory"],
"avail": [5, 1024, "ARC available memory"],
"waste": [5, 1024, "Wasted memory due to round up to pagesize"],
"ztotal": [6, 1000, "zfetch total prefetcher calls per second"],
"zhits": [5, 1000, "zfetch stream hits per second"],
"zahead": [6, 1000, "zfetch hits ahead of streams per second"],
"zpast": [5, 1000, "zfetch hits behind streams per second"],
"zmisses": [7, 1000, "zfetch stream misses per second"],
"zmax": [4, 1000, "zfetch limit reached per second"],
"zfuture": [7, 1000, "zfetch stream future per second"],
"zstride": [7, 1000, "zfetch stream strides per second"],
"zissued": [7, 1000, "zfetch prefetches issued per second"],
"zactive": [7, 1000, "zfetch prefetches active per second"],
}
# ARC structural breakdown from zarcsummary
structfields = {
"cmp": ["compressed", "Compressed"],
"ovh": ["overhead", "Overhead"],
"bon": ["bonus", "Bonus"],
"dno": ["dnode", "Dnode"],
"dbu": ["dbuf", "Dbuf"],
"hdr": ["hdr", "Header"],
"l2h": ["l2_hdr", "L2 header"],
"abd": ["abd_chunk_waste", "ABD chunk waste"],
}
structstats = { # size stats
"percent": "size", # percentage of this value
"sz": ["_size", "size"],
}
# ARC types breakdown from zarcsummary
typefields = {
"data": ["data", "ARC data"],
"meta": ["metadata", "ARC metadata"],
}
typestats = { # size stats
"percent": "cachessz", # percentage of this value
"tg": ["_target", "target"],
"sz": ["_size", "size"],
}
# ARC states breakdown from zarcsummary
statefields = {
"ano": ["anon", "Anonymous"],
"mfu": ["mfu", "MFU"],
"mru": ["mru", "MRU"],
"unc": ["uncached", "Uncached"],
}
targetstats = {
"percent": "cachessz", # percentage of this value
"fields": ["mfu", "mru"], # only applicable to these fields
"tg": ["_target", "target"],
"dt": ["_data_target", "data target"],
"mt": ["_metadata_target", "metadata target"],
}
statestats = { # size stats
"percent": "cachessz", # percentage of this value
"sz": ["_size", "size"],
"da": ["_data", "data size"],
"me": ["_metadata", "metadata size"],
"ed": ["_evictable_data", "evictable data size"],
"em": ["_evictable_metadata", "evictable metadata size"],
}
ghoststats = {
"fields": ["mfu", "mru"], # only applicable to these fields
"gsz": ["_ghost_size", "ghost size"],
"gd": ["_ghost_data", "ghost data size"],
"gm": ["_ghost_metadata", "ghost metadata size"],
}
# fields and stats
fieldstats = [
[structfields, structstats],
[typefields, typestats],
[statefields, targetstats, statestats, ghoststats],
]
for fs in fieldstats:
fields, stats = fs[0], fs[1:]
for field, fieldval in fields.items():
for group in stats:
for stat, statval in group.items():
if stat in ["fields", "percent"] or \
("fields" in group and field not in group["fields"]):
continue
colname = field + stat
coldesc = fieldval[1] + " " + statval[1]
cols[colname] = [len(colname), 1024, coldesc]
if "percent" in group:
cols[colname + "%"] = [len(colname) + 1, 100, \
coldesc + " percentage"]
v = {}
hdr = ["time", "read", "ddread", "ddh%", "dmread", "dmh%", "pread", "ph%",
"size", "c", "avail"]
xhdr = ["time", "mfu", "mru", "mfug", "mrug", "unc", "eskip", "mtxmis",
"dread", "pread", "read"]
zhdr = ["time", "ztotal", "zhits", "zahead", "zpast", "zmisses", "zmax",
"zfuture", "zstride", "zissued", "zactive"]
sint = 1 # Default interval is 1 second
count = 1 # Default count is 1
hdr_intr = 20 # Print header every 20 lines of output
opfile = None
sep = " " # Default separator is 2 spaces
l2exist = False
cmd = ("Usage: zarcstat [-havxp] [-f fields] [-o file] [-s string] [interval "
"[count]]\n")
cur = {}
d = {}
out = None
kstat = None
pretty_print = True
if sys.platform.startswith('freebsd'):
# Requires py-sysctl on FreeBSD
import sysctl
def kstat_update():
global kstat
k = [ctl for ctl in sysctl.filter('kstat.zfs.misc.arcstats')
if ctl.type != sysctl.CTLTYPE_NODE]
k += [ctl for ctl in sysctl.filter('kstat.zfs.misc.zfetchstats')
if ctl.type != sysctl.CTLTYPE_NODE]
if not k:
sys.exit(1)
kstat = {}
for s in k:
if not s:
continue
name, value = s.name, s.value
if "arcstats" in name:
# Trims 'kstat.zfs.misc.arcstats' from the name
kstat[name[24:]] = int(value)
else:
kstat["zfetch_" + name[27:]] = int(value)
elif sys.platform.startswith('linux'):
def kstat_update():
global kstat
k1 = [line.strip() for line in open('/proc/spl/kstat/zfs/arcstats')]
k2 = ["zfetch_" + line.strip() for line in
open('/proc/spl/kstat/zfs/zfetchstats')]
if k1 is None or k2 is None:
sys.exit(1)
del k1[0:2]
del k2[0:2]
k = k1 + k2
kstat = {}
for s in k:
if not s:
continue
name, unused, value = s.split()
kstat[name] = int(value)
def detailed_usage():
sys.stderr.write("%s\n" % cmd)
sys.stderr.write("Field definitions are as follows:\n")
for key in cols:
sys.stderr.write("%11s : %s\n" % (key, cols[key][2]))
sys.stderr.write("\n")
sys.exit(0)
def usage():
sys.stderr.write("%s\n" % cmd)
sys.stderr.write("\t -h : Print this help message\n")
sys.stderr.write("\t -a : Print all possible stats\n")
sys.stderr.write("\t -v : List all possible field headers and definitions"
"\n")
sys.stderr.write("\t -x : Print extended stats\n")
sys.stderr.write("\t -z : Print zfetch stats\n")
sys.stderr.write("\t -f : Specify specific fields to print (see -v)\n")
sys.stderr.write("\t -o : Redirect output to the specified file\n")
sys.stderr.write("\t -s : Override default field separator with custom "
"character or string\n")
sys.stderr.write("\t -p : Disable auto-scaling of numerical fields\n")
sys.stderr.write("\nExamples:\n")
sys.stderr.write("\tzarcstat -o /tmp/a.log 2 10\n")
sys.stderr.write("\tzarcstat -s \",\" -o /tmp/a.log 2 10\n")
sys.stderr.write("\tzarcstat -v\n")
sys.stderr.write("\tzarcstat -f time,hit%,dh%,ph%,mh% 1\n")
sys.stderr.write("\n")
sys.exit(1)
def snap_stats():
global cur
global kstat
prev = copy.deepcopy(cur)
kstat_update()
cur = kstat
# fill in additional values from zarcsummary
cur["caches_size"] = caches_size = cur["anon_data"]+cur["anon_metadata"]+\
cur["mfu_data"]+cur["mfu_metadata"]+cur["mru_data"]+cur["mru_metadata"]+\
cur["uncached_data"]+cur["uncached_metadata"]
s = 4294967296
pd = cur["pd"]
pm = cur["pm"]
meta = cur["meta"]
v = (s-int(pd))*(s-int(meta))/s
cur["mfu_data_target"] = v / 65536 * caches_size / 65536
v = (s-int(pm))*int(meta)/s
cur["mfu_metadata_target"] = v / 65536 * caches_size / 65536
v = int(pd)*(s-int(meta))/s
cur["mru_data_target"] = v / 65536 * caches_size / 65536
v = int(pm)*int(meta)/s
cur["mru_metadata_target"] = v / 65536 * caches_size / 65536
cur["data_target"] = cur["mfu_data_target"] + cur["mru_data_target"]
cur["metadata_target"] = cur["mfu_metadata_target"] + cur["mru_metadata_target"]
cur["mfu_target"] = cur["mfu_data_target"] + cur["mfu_metadata_target"]
cur["mru_target"] = cur["mru_data_target"] + cur["mru_metadata_target"]
for key in cur:
if re.match(key, "class"):
continue
if key in prev:
d[key] = cur[key] - prev[key]
else:
d[key] = cur[key]
def isint(num):
if isinstance(num, float):
return num.is_integer()
if isinstance(num, int):
return True
return False
def prettynum(sz, scale, num=0):
suffix = ['', 'K', 'M', 'G', 'T', 'P', 'E', 'Z']
index = 0
# Special case for date field
if scale == -1:
return "%s" % num
if scale != 100:
while abs(num) > scale and index < 5:
num = num / scale
index += 1
width = sz - (0 if index == 0 else 1)
intlen = len("%.0f" % num) # %.0f rounds to nearest int
if sint == 1 and isint(num) or width < intlen + 2:
decimal = 0
else:
decimal = 1
return "%*.*f%s" % (width, decimal, num, suffix[index])
def print_values():
global hdr
global sep
global v
global pretty_print
if pretty_print:
fmt = lambda col: prettynum(cols[col][0], cols[col][1], v[col])
else:
fmt = lambda col: str(v[col])
sys.stdout.write(sep.join(fmt(col) for col in hdr))
sys.stdout.write("\n")
sys.stdout.flush()
def print_header():
global hdr
global sep
global pretty_print
if pretty_print:
fmt = lambda col: "%*s" % (cols[col][0], col)
else:
fmt = lambda col: col
sys.stdout.write(sep.join(fmt(col) for col in hdr))
sys.stdout.write("\n")
def get_terminal_lines():
try:
import fcntl
import termios
import struct
data = fcntl.ioctl(sys.stdout.fileno(), termios.TIOCGWINSZ, '1234')
sz = struct.unpack('hh', data)
return sz[0]
except Exception:
pass
def update_hdr_intr():
global hdr_intr
lines = get_terminal_lines()
if lines and lines > 3:
hdr_intr = lines - 3
def resize_handler(signum, frame):
update_hdr_intr()
def init():
global sint
global count
global hdr
global xhdr
global zhdr
global opfile
global sep
global out
global l2exist
global pretty_print
desired_cols = None
aflag = False
xflag = False
hflag = False
vflag = False
zflag = False
i = 1
try:
opts, args = getopt.getopt(
sys.argv[1:],
"axzo:hvs:f:p",
[
"all",
"extended",
"zfetch",
"outfile",
"help",
"verbose",
"separator",
"columns",
"parsable"
]
)
except getopt.error as msg:
sys.stderr.write("Error: %s\n" % str(msg))
usage()
opts = None
for opt, arg in opts:
if opt in ('-a', '--all'):
aflag = True
if opt in ('-x', '--extended'):
xflag = True
if opt in ('-o', '--outfile'):
opfile = arg
i += 1
if opt in ('-h', '--help'):
hflag = True
if opt in ('-v', '--verbose'):
vflag = True
if opt in ('-s', '--separator'):
sep = arg
i += 1
if opt in ('-f', '--columns'):
desired_cols = arg
i += 1
if opt in ('-p', '--parsable'):
pretty_print = False
if opt in ('-z', '--zfetch'):
zflag = True
i += 1
argv = sys.argv[i:]
sint = int(argv[0]) if argv else sint
count = int(argv[1]) if len(argv) > 1 else (0 if len(argv) > 0 else 1)
if hflag or (xflag and zflag) or ((zflag or xflag) and desired_cols):
usage()
if vflag:
detailed_usage()
if xflag:
hdr = xhdr
if zflag:
hdr = zhdr
update_hdr_intr()
# check if L2ARC exists
snap_stats()
l2_size = cur.get("l2_size")
if l2_size:
l2exist = True
if desired_cols:
hdr = desired_cols.split(",")
invalid = []
incompat = []
for ele in hdr:
if ele not in cols:
invalid.append(ele)
elif not l2exist and ele.startswith("l2"):
sys.stdout.write("No L2ARC Here\n%s\n" % ele)
incompat.append(ele)
if len(invalid) > 0:
sys.stderr.write("Invalid column definition! -- %s\n" % invalid)
usage()
if len(incompat) > 0:
sys.stderr.write("Incompatible field specified! -- %s\n" %
incompat)
usage()
if aflag:
if l2exist:
hdr = cols.keys()
else:
hdr = [col for col in cols.keys() if not col.startswith("l2")]
if opfile:
try:
out = open(opfile, "w")
sys.stdout = out
except IOError:
sys.stderr.write("Cannot open %s for writing\n" % opfile)
sys.exit(1)
def calculate():
global d
global v
global l2exist
v = dict()
v["time"] = time.strftime("%H:%M:%S", time.localtime())
v["hits"] = d["hits"] / sint
v["iohs"] = d["iohits"] / sint
v["miss"] = d["misses"] / sint
v["read"] = v["hits"] + v["iohs"] + v["miss"]
v["hit%"] = 100 * v["hits"] / v["read"] if v["read"] > 0 else 0
v["ioh%"] = 100 * v["iohs"] / v["read"] if v["read"] > 0 else 0
v["miss%"] = 100 - v["hit%"] - v["ioh%"] if v["read"] > 0 else 0
v["dhit"] = (d["demand_data_hits"] + d["demand_metadata_hits"]) / sint
v["dioh"] = (d["demand_data_iohits"] + d["demand_metadata_iohits"]) / sint
v["dmis"] = (d["demand_data_misses"] + d["demand_metadata_misses"]) / sint
v["dread"] = v["dhit"] + v["dioh"] + v["dmis"]
v["dh%"] = 100 * v["dhit"] / v["dread"] if v["dread"] > 0 else 0
v["di%"] = 100 * v["dioh"] / v["dread"] if v["dread"] > 0 else 0
v["dm%"] = 100 - v["dh%"] - v["di%"] if v["dread"] > 0 else 0
v["ddhit"] = d["demand_data_hits"] / sint
v["ddioh"] = d["demand_data_iohits"] / sint
v["ddmis"] = d["demand_data_misses"] / sint
v["ddread"] = v["ddhit"] + v["ddioh"] + v["ddmis"]
v["ddh%"] = 100 * v["ddhit"] / v["ddread"] if v["ddread"] > 0 else 0
v["ddi%"] = 100 * v["ddioh"] / v["ddread"] if v["ddread"] > 0 else 0
v["ddm%"] = 100 - v["ddh%"] - v["ddi%"] if v["ddread"] > 0 else 0
v["dmhit"] = d["demand_metadata_hits"] / sint
v["dmioh"] = d["demand_metadata_iohits"] / sint
v["dmmis"] = d["demand_metadata_misses"] / sint
v["dmread"] = v["dmhit"] + v["dmioh"] + v["dmmis"]
v["dmh%"] = 100 * v["dmhit"] / v["dmread"] if v["dmread"] > 0 else 0
v["dmi%"] = 100 * v["dmioh"] / v["dmread"] if v["dmread"] > 0 else 0
v["dmm%"] = 100 - v["dmh%"] - v["dmi%"] if v["dmread"] > 0 else 0
v["phit"] = (d["prefetch_data_hits"] + d["prefetch_metadata_hits"]) / sint
v["pioh"] = (d["prefetch_data_iohits"] +
d["prefetch_metadata_iohits"]) / sint
v["pmis"] = (d["prefetch_data_misses"] +
d["prefetch_metadata_misses"]) / sint
v["pread"] = v["phit"] + v["pioh"] + v["pmis"]
v["ph%"] = 100 * v["phit"] / v["pread"] if v["pread"] > 0 else 0
v["pi%"] = 100 * v["pioh"] / v["pread"] if v["pread"] > 0 else 0
v["pm%"] = 100 - v["ph%"] - v["pi%"] if v["pread"] > 0 else 0
v["pdhit"] = d["prefetch_data_hits"] / sint
v["pdioh"] = d["prefetch_data_iohits"] / sint
v["pdmis"] = d["prefetch_data_misses"] / sint
v["pdread"] = v["pdhit"] + v["pdioh"] + v["pdmis"]
v["pdh%"] = 100 * v["pdhit"] / v["pdread"] if v["pdread"] > 0 else 0
v["pdi%"] = 100 * v["pdioh"] / v["pdread"] if v["pdread"] > 0 else 0
v["pdm%"] = 100 - v["pdh%"] - v["pdi%"] if v["pdread"] > 0 else 0
v["pmhit"] = d["prefetch_metadata_hits"] / sint
v["pmioh"] = d["prefetch_metadata_iohits"] / sint
v["pmmis"] = d["prefetch_metadata_misses"] / sint
v["pmread"] = v["pmhit"] + v["pmioh"] + v["pmmis"]
v["pmh%"] = 100 * v["pmhit"] / v["pmread"] if v["pmread"] > 0 else 0
v["pmi%"] = 100 * v["pmioh"] / v["pmread"] if v["pmread"] > 0 else 0
v["pmm%"] = 100 - v["pmh%"] - v["pmi%"] if v["pmread"] > 0 else 0
v["mhit"] = (d["prefetch_metadata_hits"] +
d["demand_metadata_hits"]) / sint
v["mioh"] = (d["prefetch_metadata_iohits"] +
d["demand_metadata_iohits"]) / sint
v["mmis"] = (d["prefetch_metadata_misses"] +
d["demand_metadata_misses"]) / sint
v["mread"] = v["mhit"] + v["mioh"] + v["mmis"]
v["mh%"] = 100 * v["mhit"] / v["mread"] if v["mread"] > 0 else 0
v["mi%"] = 100 * v["mioh"] / v["mread"] if v["mread"] > 0 else 0
v["mm%"] = 100 - v["mh%"] - v["mi%"] if v["mread"] > 0 else 0
v["arcsz"] = cur["size"]
v["size"] = cur["size"]
v["c"] = cur["c"]
v["mfu"] = d["mfu_hits"] / sint
v["mru"] = d["mru_hits"] / sint
v["mrug"] = d["mru_ghost_hits"] / sint
v["mfug"] = d["mfu_ghost_hits"] / sint
v["unc"] = d["uncached_hits"] / sint
v["eskip"] = d["evict_skip"] / sint
v["el2skip"] = d["evict_l2_skip"] / sint
v["el2cach"] = d["evict_l2_cached"] / sint
v["el2el"] = d["evict_l2_eligible"] / sint
v["el2mfu"] = d["evict_l2_eligible_mfu"] / sint
v["el2mru"] = d["evict_l2_eligible_mru"] / sint
v["el2inel"] = d["evict_l2_ineligible"] / sint
v["mtxmis"] = d["mutex_miss"] / sint
v["ztotal"] = (d["zfetch_hits"] + d["zfetch_future"] + d["zfetch_stride"] +
d["zfetch_past"] + d["zfetch_misses"]) / sint
v["zhits"] = d["zfetch_hits"] / sint
v["zahead"] = (d["zfetch_future"] + d["zfetch_stride"]) / sint
v["zpast"] = d["zfetch_past"] / sint
v["zmisses"] = d["zfetch_misses"] / sint
v["zmax"] = d["zfetch_max_streams"] / sint
v["zfuture"] = d["zfetch_future"] / sint
v["zstride"] = d["zfetch_stride"] / sint
v["zissued"] = d["zfetch_io_issued"] / sint
v["zactive"] = d["zfetch_io_active"] / sint
# ARC structural breakdown, ARC types breakdown, ARC states breakdown
v["cachessz"] = cur["caches_size"]
for fs in fieldstats:
fields, stats = fs[0], fs[1:]
for field, fieldval in fields.items():
for group in stats:
for stat, statval in group.items():
if stat in ["fields", "percent"] or \
("fields" in group and field not in group["fields"]):
continue
colname = field + stat
v[colname] = cur[fieldval[0] + statval[0]]
if "percent" in group:
v[colname + "%"] = 100 * v[colname] / \
v[group["percent"]] if v[group["percent"]] > 0 else 0
if l2exist:
l2asize = cur["l2_asize"]
v["l2hits"] = d["l2_hits"] / sint
v["l2miss"] = d["l2_misses"] / sint
v["l2read"] = v["l2hits"] + v["l2miss"]
v["l2hit%"] = 100 * v["l2hits"] / v["l2read"] if v["l2read"] > 0 else 0
v["l2miss%"] = 100 - v["l2hit%"] if v["l2read"] > 0 else 0
v["l2asize"] = l2asize
v["l2size"] = cur["l2_size"]
v["l2bytes"] = d["l2_read_bytes"] / sint
v["l2wbytes"] = d["l2_write_bytes"] / sint
v["l2pref"] = cur["l2_prefetch_asize"]
v["l2mfu"] = cur["l2_mfu_asize"]
v["l2mru"] = cur["l2_mru_asize"]
v["l2data"] = cur["l2_bufc_data_asize"]
v["l2meta"] = cur["l2_bufc_metadata_asize"]
v["l2pref%"] = 100 * v["l2pref"] / l2asize if l2asize > 0 else 0
v["l2mfu%"] = 100 * v["l2mfu"] / l2asize if l2asize > 0 else 0
v["l2mru%"] = 100 * v["l2mru"] / l2asize if l2asize > 0 else 0
v["l2data%"] = 100 * v["l2data"] / l2asize if l2asize > 0 else 0
v["l2meta%"] = 100 * v["l2meta"] / l2asize if l2asize > 0 else 0
v["grow"] = 0 if cur["arc_no_grow"] else 1
v["need"] = cur["arc_need_free"]
v["free"] = cur["memory_free_bytes"]
v["avail"] = cur["memory_available_bytes"]
v["waste"] = cur["abd_chunk_waste_size"]
def main():
global sint
global count
global hdr_intr
i = 0
count_flag = 0
init()
if count > 0:
count_flag = 1
signal(SIGINT, SIG_DFL)
signal(SIGWINCH, resize_handler)
while True:
if i == 0:
print_header()
snap_stats()
calculate()
print_values()
if count_flag == 1:
if count <= 1:
break
count -= 1
i = 0 if i >= hdr_intr else i + 1
time.sleep(sint)
if out:
out.close()
if __name__ == '__main__':
main()
-1073
View File
File diff suppressed because it is too large Load Diff
+1
View File
@@ -0,0 +1 @@
/zdb
+14 -13
View File
@@ -1,18 +1,19 @@
zdb_CPPFLAGS = $(AM_CPPFLAGS) $(LIBZPOOL_CPPFLAGS)
zdb_CFLAGS = $(AM_CFLAGS) $(LIBCRYPTO_CFLAGS)
include $(top_srcdir)/config/Rules.am
sbin_PROGRAMS += zdb
CPPCHECKTARGETS += zdb
# Unconditionally enable debugging for zdb
AM_CPPFLAGS += -DDEBUG -UNDEBUG
DEFAULT_INCLUDES += \
-I$(top_srcdir)/include \
-I$(top_srcdir)/lib/libspl/include
sbin_PROGRAMS = zdb
zdb_SOURCES = \
%D%/zdb.c \
%D%/zdb.h \
%D%/zdb_il.c
zdb.c \
zdb_il.c \
zdb.h
zdb_LDADD = \
libzdb.la \
libzpool.la \
libzfs_core.la \
libnvpair.la
zdb_LDADD += $(LIBCRYPTO_LIBS)
$(top_builddir)/lib/libnvpair/libnvpair.la \
$(top_builddir)/lib/libzpool/libzpool.la
+887 -4741
View File
File diff suppressed because it is too large Load Diff
+2 -3
View File
@@ -1,4 +1,3 @@
// SPDX-License-Identifier: CDDL-1.0
/*
* CDDL HEADER START
*
@@ -7,7 +6,7 @@
* You may not use this file except in compliance with the License.
*
* You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
* or https://opensource.org/licenses/CDDL-1.0.
* or http://www.opensolaris.org/os/licensing.
* See the License for the specific language governing permissions
* and limitations under the License.
*
@@ -29,6 +28,6 @@
#define _ZDB_H
void dump_intent_log(zilog_t *);
extern uint8_t dump_opt[512];
extern uint8_t dump_opt[256];
#endif /* _ZDB_H */
+50 -161
View File
@@ -1,4 +1,3 @@
// SPDX-License-Identifier: CDDL-1.0
/*
* CDDL HEADER START
*
@@ -7,7 +6,7 @@
* You may not use this file except in compliance with the License.
*
* You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
* or https://opensource.org/licenses/CDDL-1.0.
* or http://www.opensolaris.org/os/licensing.
* See the License for the specific language governing permissions
* and limitations under the License.
*
@@ -48,6 +47,8 @@
#include "zdb.h"
extern uint8_t dump_opt[256];
static char tab_prefix[4] = "\t\t\t";
static void
@@ -59,26 +60,25 @@ print_log_bp(const blkptr_t *bp, const char *prefix)
(void) printf("%s%s\n", prefix, blkbuf);
}
/* ARGSUSED */
static void
zil_prt_rec_create(zilog_t *zilog, int txtype, const void *arg)
zil_prt_rec_create(zilog_t *zilog, int txtype, void *arg)
{
(void) zilog;
const lr_create_t *lrc = arg;
const _lr_create_t *lr = &lrc->lr_create;
lr_create_t *lr = arg;
time_t crtime = lr->lr_crtime[0];
const char *name, *link;
char *name, *link;
lr_attr_t *lrattr;
name = (const char *)&lrc->lr_data[0];
name = (char *)(lr + 1);
if (lr->lr_common.lrc_txtype == TX_CREATE_ATTR ||
lr->lr_common.lrc_txtype == TX_MKDIR_ATTR) {
lrattr = (lr_attr_t *)&lrc->lr_data[0];
lrattr = (lr_attr_t *)(lr + 1);
name += ZIL_XVAT_SIZE(lrattr->lr_attr_masksize);
}
if (txtype == TX_SYMLINK) {
link = (const char *)&lrc->lr_data[strlen(name) + 1];
link = name + strlen(name) + 1;
(void) printf("%s%s -> %s\n", tab_prefix, name, link);
} else if (txtype != TX_MKXATTR) {
(void) printf("%s%s\n", tab_prefix, name);
@@ -96,53 +96,44 @@ zil_prt_rec_create(zilog_t *zilog, int txtype, const void *arg)
(u_longlong_t)lr->lr_gen, (u_longlong_t)lr->lr_rdev);
}
/* ARGSUSED */
static void
zil_prt_rec_remove(zilog_t *zilog, int txtype, const void *arg)
zil_prt_rec_remove(zilog_t *zilog, int txtype, void *arg)
{
(void) zilog, (void) txtype;
const lr_remove_t *lr = arg;
lr_remove_t *lr = arg;
(void) printf("%sdoid %llu, name %s\n", tab_prefix,
(u_longlong_t)lr->lr_doid, (const char *)&lr->lr_data[0]);
(u_longlong_t)lr->lr_doid, (char *)(lr + 1));
}
/* ARGSUSED */
static void
zil_prt_rec_link(zilog_t *zilog, int txtype, const void *arg)
zil_prt_rec_link(zilog_t *zilog, int txtype, void *arg)
{
(void) zilog, (void) txtype;
const lr_link_t *lr = arg;
lr_link_t *lr = arg;
(void) printf("%sdoid %llu, link_obj %llu, name %s\n", tab_prefix,
(u_longlong_t)lr->lr_doid, (u_longlong_t)lr->lr_link_obj,
(const char *)&lr->lr_data[0]);
(char *)(lr + 1));
}
/* ARGSUSED */
static void
zil_prt_rec_rename(zilog_t *zilog, int txtype, const void *arg)
zil_prt_rec_rename(zilog_t *zilog, int txtype, void *arg)
{
(void) zilog, (void) txtype;
const lr_rename_t *lrr = arg;
const _lr_rename_t *lr = &lrr->lr_rename;
const char *snm = (const char *)&lrr->lr_data[0];
const char *tnm = (const char *)&lrr->lr_data[strlen(snm) + 1];
lr_rename_t *lr = arg;
char *snm = (char *)(lr + 1);
char *tnm = snm + strlen(snm) + 1;
(void) printf("%ssdoid %llu, tdoid %llu\n", tab_prefix,
(u_longlong_t)lr->lr_sdoid, (u_longlong_t)lr->lr_tdoid);
(void) printf("%ssrc %s tgt %s\n", tab_prefix, snm, tnm);
switch (txtype) {
case TX_RENAME_EXCHANGE:
(void) printf("%sflags RENAME_EXCHANGE\n", tab_prefix);
break;
case TX_RENAME_WHITEOUT:
(void) printf("%sflags RENAME_WHITEOUT\n", tab_prefix);
break;
}
}
/* ARGSUSED */
static int
zil_prt_rec_write_cb(void *data, size_t len, void *unused)
{
(void) unused;
char *cdata = data;
for (size_t i = 0; i < len; i++) {
@@ -155,12 +146,13 @@ zil_prt_rec_write_cb(void *data, size_t len, void *unused)
return (0);
}
/* ARGSUSED */
static void
zil_prt_rec_write(zilog_t *zilog, int txtype, const void *arg)
zil_prt_rec_write(zilog_t *zilog, int txtype, void *arg)
{
const lr_write_t *lr = arg;
lr_write_t *lr = arg;
abd_t *data;
const blkptr_t *bp = &lr->lr_blkptr;
blkptr_t *bp = &lr->lr_blkptr;
zbookmark_phys_t zb;
int verbose = MAX(dump_opt['d'], dump_opt['i']);
int error;
@@ -169,31 +161,28 @@ zil_prt_rec_write(zilog_t *zilog, int txtype, const void *arg)
(u_longlong_t)lr->lr_foid, (u_longlong_t)lr->lr_offset,
(u_longlong_t)lr->lr_length);
if (txtype == TX_WRITE2 || verbose < 4)
if (txtype == TX_WRITE2 || verbose < 5)
return;
if (lr->lr_common.lrc_reclen == sizeof (lr_write_t)) {
(void) printf("%shas blkptr, %s\n", tab_prefix,
!BP_IS_HOLE(bp) && BP_GET_BIRTH(bp) >=
spa_min_claim_txg(zilog->zl_spa) ?
!BP_IS_HOLE(bp) &&
bp->blk_birth >= spa_min_claim_txg(zilog->zl_spa) ?
"will claim" : "won't claim");
print_log_bp(bp, tab_prefix);
if (verbose < 5)
return;
if (BP_IS_HOLE(bp)) {
(void) printf("\t\t\tLSIZE 0x%llx\n",
(u_longlong_t)BP_GET_LSIZE(bp));
(void) printf("%s<hole>\n", tab_prefix);
return;
}
if (BP_GET_BIRTH(bp) < zilog->zl_header->zh_claim_txg) {
if (bp->blk_birth < zilog->zl_header->zh_claim_txg) {
(void) printf("%s<block already committed>\n",
tab_prefix);
return;
}
ASSERT3U(BP_GET_LSIZE(bp), !=, 0);
SET_BOOKMARK(&zb, dmu_objset_id(zilog->zl_os),
lr->lr_foid, ZB_ZIL_LEVEL,
lr->lr_offset / BP_GET_LSIZE(bp));
@@ -205,12 +194,9 @@ zil_prt_rec_write(zilog_t *zilog, int txtype, const void *arg)
if (error)
goto out;
} else {
if (verbose < 5)
return;
/* data is stored after the end of the lr_write record */
data = abd_alloc(lr->lr_length, B_FALSE);
abd_copy_from_buf(data, &lr->lr_data[0], lr->lr_length);
abd_copy_from_buf(data, lr + 1, lr->lr_length);
}
(void) printf("%s", tab_prefix);
@@ -223,44 +209,22 @@ out:
abd_free(data);
}
/* ARGSUSED */
static void
zil_prt_rec_write_enc(zilog_t *zilog, int txtype, const void *arg)
zil_prt_rec_truncate(zilog_t *zilog, int txtype, void *arg)
{
(void) txtype;
const lr_write_t *lr = arg;
const blkptr_t *bp = &lr->lr_blkptr;
int verbose = MAX(dump_opt['d'], dump_opt['i']);
(void) printf("%s(encrypted)\n", tab_prefix);
if (verbose < 4)
return;
if (lr->lr_common.lrc_reclen == sizeof (lr_write_t)) {
(void) printf("%shas blkptr, %s\n", tab_prefix,
!BP_IS_HOLE(bp) && BP_GET_BIRTH(bp) >=
spa_min_claim_txg(zilog->zl_spa) ?
"will claim" : "won't claim");
print_log_bp(bp, tab_prefix);
}
}
static void
zil_prt_rec_truncate(zilog_t *zilog, int txtype, const void *arg)
{
(void) zilog, (void) txtype;
const lr_truncate_t *lr = arg;
lr_truncate_t *lr = arg;
(void) printf("%sfoid %llu, offset 0x%llx, length 0x%llx\n", tab_prefix,
(u_longlong_t)lr->lr_foid, (longlong_t)lr->lr_offset,
(u_longlong_t)lr->lr_length);
}
/* ARGSUSED */
static void
zil_prt_rec_setattr(zilog_t *zilog, int txtype, const void *arg)
zil_prt_rec_setattr(zilog_t *zilog, int txtype, void *arg)
{
(void) zilog, (void) txtype;
const lr_setattr_t *lr = arg;
lr_setattr_t *lr = arg;
time_t atime = (time_t)lr->lr_atime[0];
time_t mtime = (time_t)lr->lr_mtime[0];
@@ -302,83 +266,19 @@ zil_prt_rec_setattr(zilog_t *zilog, int txtype, const void *arg)
}
}
/* ARGSUSED */
static void
zil_prt_rec_setsaxattr(zilog_t *zilog, int txtype, const void *arg)
zil_prt_rec_acl(zilog_t *zilog, int txtype, void *arg)
{
(void) zilog, (void) txtype;
const lr_setsaxattr_t *lr = arg;
const char *name = (const char *)&lr->lr_data[0];
(void) printf("%sfoid %llu\n", tab_prefix,
(u_longlong_t)lr->lr_foid);
(void) printf("%sXAT_NAME %s\n", tab_prefix, name);
if (lr->lr_size == 0) {
(void) printf("%sXAT_VALUE NULL\n", tab_prefix);
} else {
(void) printf("%sXAT_VALUE ", tab_prefix);
const char *val = (const char *)&lr->lr_data[strlen(name) + 1];
for (int i = 0; i < lr->lr_size; i++) {
(void) printf("%c", *val);
val++;
}
}
}
static void
zil_prt_rec_acl(zilog_t *zilog, int txtype, const void *arg)
{
(void) zilog, (void) txtype;
const lr_acl_t *lr = arg;
lr_acl_t *lr = arg;
(void) printf("%sfoid %llu, aclcnt %llu\n", tab_prefix,
(u_longlong_t)lr->lr_foid, (u_longlong_t)lr->lr_aclcnt);
}
static void
zil_prt_rec_clone_range(zilog_t *zilog, int txtype, const void *arg)
{
(void) zilog, (void) txtype;
const lr_clone_range_t *lr = arg;
int verbose = MAX(dump_opt['d'], dump_opt['i']);
(void) printf("%sfoid %llu, offset %llx, length %llx, blksize %llx\n",
tab_prefix, (u_longlong_t)lr->lr_foid, (u_longlong_t)lr->lr_offset,
(u_longlong_t)lr->lr_length, (u_longlong_t)lr->lr_blksz);
if (verbose < 4)
return;
for (unsigned int i = 0; i < lr->lr_nbps; i++) {
(void) printf("%s[%u/%llu] ", tab_prefix, i + 1,
(u_longlong_t)lr->lr_nbps);
print_log_bp(&lr->lr_bps[i], "");
}
}
static void
zil_prt_rec_clone_range_enc(zilog_t *zilog, int txtype, const void *arg)
{
(void) zilog, (void) txtype;
const lr_clone_range_t *lr = arg;
int verbose = MAX(dump_opt['d'], dump_opt['i']);
(void) printf("%s(encrypted)\n", tab_prefix);
if (verbose < 4)
return;
for (unsigned int i = 0; i < lr->lr_nbps; i++) {
(void) printf("%s[%u/%llu] ", tab_prefix, i + 1,
(u_longlong_t)lr->lr_nbps);
print_log_bp(&lr->lr_bps[i], "");
}
}
typedef void (*zil_prt_rec_func_t)(zilog_t *, int, const void *);
typedef void (*zil_prt_rec_func_t)(zilog_t *, int, void *);
typedef struct zil_rec_info {
zil_prt_rec_func_t zri_print;
zil_prt_rec_func_t zri_print_enc;
const char *zri_name;
uint64_t zri_count;
} zil_rec_info_t;
@@ -393,9 +293,7 @@ static zil_rec_info_t zil_rec_info[TX_MAX_TYPE] = {
{.zri_print = zil_prt_rec_remove, .zri_name = "TX_RMDIR "},
{.zri_print = zil_prt_rec_link, .zri_name = "TX_LINK "},
{.zri_print = zil_prt_rec_rename, .zri_name = "TX_RENAME "},
{.zri_print = zil_prt_rec_write,
.zri_print_enc = zil_prt_rec_write_enc,
.zri_name = "TX_WRITE "},
{.zri_print = zil_prt_rec_write, .zri_name = "TX_WRITE "},
{.zri_print = zil_prt_rec_truncate, .zri_name = "TX_TRUNCATE "},
{.zri_print = zil_prt_rec_setattr, .zri_name = "TX_SETATTR "},
{.zri_print = zil_prt_rec_acl, .zri_name = "TX_ACL_V0 "},
@@ -407,19 +305,12 @@ static zil_rec_info_t zil_rec_info[TX_MAX_TYPE] = {
{.zri_print = zil_prt_rec_create, .zri_name = "TX_MKDIR_ATTR "},
{.zri_print = zil_prt_rec_create, .zri_name = "TX_MKDIR_ACL_ATTR "},
{.zri_print = zil_prt_rec_write, .zri_name = "TX_WRITE2 "},
{.zri_print = zil_prt_rec_setsaxattr,
.zri_name = "TX_SETSAXATTR "},
{.zri_print = zil_prt_rec_rename, .zri_name = "TX_RENAME_EXCHANGE "},
{.zri_print = zil_prt_rec_rename, .zri_name = "TX_RENAME_WHITEOUT "},
{.zri_print = zil_prt_rec_clone_range,
.zri_print_enc = zil_prt_rec_clone_range_enc,
.zri_name = "TX_CLONE_RANGE "},
};
/* ARGSUSED */
static int
print_log_record(zilog_t *zilog, const lr_t *lr, void *arg, uint64_t claim_txg)
print_log_record(zilog_t *zilog, lr_t *lr, void *arg, uint64_t claim_txg)
{
(void) arg, (void) claim_txg;
int txtype;
int verbose = MAX(dump_opt['d'], dump_opt['i']);
@@ -439,8 +330,6 @@ print_log_record(zilog_t *zilog, const lr_t *lr, void *arg, uint64_t claim_txg)
if (txtype && verbose >= 3) {
if (!zilog->zl_os->os_encrypted) {
zil_rec_info[txtype].zri_print(zilog, txtype, lr);
} else if (zil_rec_info[txtype].zri_print_enc) {
zil_rec_info[txtype].zri_print_enc(zilog, txtype, lr);
} else {
(void) printf("%s(encrypted)\n", tab_prefix);
}
@@ -452,11 +341,10 @@ print_log_record(zilog_t *zilog, const lr_t *lr, void *arg, uint64_t claim_txg)
return (0);
}
/* ARGSUSED */
static int
print_log_block(zilog_t *zilog, const blkptr_t *bp, void *arg,
uint64_t claim_txg)
print_log_block(zilog_t *zilog, blkptr_t *bp, void *arg, uint64_t claim_txg)
{
(void) arg;
char blkbuf[BP_SPRINTF_LEN + 10];
int verbose = MAX(dump_opt['d'], dump_opt['i']);
const char *claim;
@@ -474,7 +362,7 @@ print_log_block(zilog_t *zilog, const blkptr_t *bp, void *arg,
if (claim_txg != 0)
claim = "already claimed";
else if (BP_GET_BIRTH(bp) >= spa_min_claim_txg(zilog->zl_spa))
else if (bp->blk_birth >= spa_min_claim_txg(zilog->zl_spa))
claim = "will claim";
else
claim = "won't claim";
@@ -507,6 +395,7 @@ print_log_stats(int verbose)
(void) printf("\n");
}
/* ARGSUSED */
void
dump_intent_log(zilog_t *zilog)
{
+41 -39
View File
@@ -1,46 +1,48 @@
include $(srcdir)/%D%/zed.d/Makefile.am
SUBDIRS = zed.d
zed_CFLAGS = $(AM_CFLAGS)
zed_CFLAGS += $(LIBUDEV_CFLAGS) $(LIBUUID_CFLAGS)
include $(top_srcdir)/config/Rules.am
sbin_PROGRAMS += zed
CPPCHECKTARGETS += zed
DEFAULT_INCLUDES += \
-I$(top_srcdir)/include \
-I$(top_srcdir)/lib/libspl/include
zed_SOURCES = \
%D%/zed.c \
%D%/zed.h \
%D%/zed_conf.c \
%D%/zed_conf.h \
%D%/zed_disk_event.c \
%D%/zed_disk_event.h \
%D%/zed_event.c \
%D%/zed_event.h \
%D%/zed_exec.c \
%D%/zed_exec.h \
%D%/zed_file.c \
%D%/zed_file.h \
%D%/zed_log.c \
%D%/zed_log.h \
%D%/zed_strings.c \
%D%/zed_strings.h \
\
%D%/agents/fmd_api.c \
%D%/agents/fmd_api.h \
%D%/agents/fmd_serd.c \
%D%/agents/fmd_serd.h \
%D%/agents/zfs_agents.c \
%D%/agents/zfs_agents.h \
%D%/agents/zfs_diagnosis.c \
%D%/agents/zfs_mod.c \
%D%/agents/zfs_retire.c
sbin_PROGRAMS = zed
ZED_SRC = \
zed.c \
zed.h \
zed_conf.c \
zed_conf.h \
zed_disk_event.c \
zed_disk_event.h \
zed_event.c \
zed_event.h \
zed_exec.c \
zed_exec.h \
zed_file.c \
zed_file.h \
zed_log.c \
zed_log.h \
zed_strings.c \
zed_strings.h
FMA_SRC = \
agents/zfs_agents.c \
agents/zfs_agents.h \
agents/zfs_diagnosis.c \
agents/zfs_mod.c \
agents/zfs_retire.c \
agents/fmd_api.c \
agents/fmd_api.h \
agents/fmd_serd.c \
agents/fmd_serd.h
zed_SOURCES = $(ZED_SRC) $(FMA_SRC)
zed_LDADD = \
libzfs.la \
libzfs_core.la \
libnvpair.la \
libuutil.la
$(top_builddir)/lib/libnvpair/libnvpair.la \
$(top_builddir)/lib/libuutil/libuutil.la \
$(top_builddir)/lib/libzfs/libzfs.la
zed_LDADD += -lrt $(LIBATOMIC_LIBS) $(LIBUDEV_LIBS) $(LIBUUID_LIBS)
zed_LDADD += -lrt
zed_LDFLAGS = -pthread
dist_noinst_DATA += %D%/agents/README.md
+45 -62
View File
@@ -1,4 +1,3 @@
// SPDX-License-Identifier: CDDL-1.0
/*
* CDDL HEADER START
*
@@ -7,7 +6,7 @@
* You may not use this file except in compliance with the License.
*
* You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
* or https://opensource.org/licenses/CDDL-1.0.
* or http://www.opensolaris.org/os/licensing.
* See the License for the specific language governing permissions
* and limitations under the License.
*
@@ -23,7 +22,6 @@
* Copyright (c) 2004, 2010, Oracle and/or its affiliates. All rights reserved.
*
* Copyright (c) 2016, Intel Corporation.
* Copyright (c) 2023, Klara Inc.
*/
/*
@@ -40,7 +38,7 @@
#include <sys/fm/protocol.h>
#include <uuid/uuid.h>
#include <signal.h>
#include <string.h>
#include <strings.h>
#include <time.h>
#include "fmd_api.h"
@@ -99,7 +97,6 @@ _umem_logging_init(void)
int
fmd_hdl_register(fmd_hdl_t *hdl, int version, const fmd_hdl_info_t *mip)
{
(void) version;
fmd_module_t *mp = (fmd_module_t *)hdl;
mp->mod_info = mip;
@@ -182,21 +179,18 @@ fmd_hdl_getspecific(fmd_hdl_t *hdl)
void *
fmd_hdl_alloc(fmd_hdl_t *hdl, size_t size, int flags)
{
(void) hdl;
return (umem_alloc(size, flags));
}
void *
fmd_hdl_zalloc(fmd_hdl_t *hdl, size_t size, int flags)
{
(void) hdl;
return (umem_zalloc(size, flags));
}
void
fmd_hdl_free(fmd_hdl_t *hdl, void *data, size_t size)
{
(void) hdl;
umem_free(data, size);
}
@@ -223,8 +217,6 @@ fmd_hdl_debug(fmd_hdl_t *hdl, const char *format, ...)
int32_t
fmd_prop_get_int32(fmd_hdl_t *hdl, const char *name)
{
(void) hdl;
/*
* These can be looked up in mp->modinfo->fmdi_props
* For now we just hard code for phase 2. In the
@@ -233,6 +225,26 @@ fmd_prop_get_int32(fmd_hdl_t *hdl, const char *name)
if (strcmp(name, "spare_on_remove") == 0)
return (1);
if (strcmp(name, "io_N") == 0 || strcmp(name, "checksum_N") == 0)
return (10); /* N = 10 events */
return (0);
}
int64_t
fmd_prop_get_int64(fmd_hdl_t *hdl, const char *name)
{
/*
* These can be looked up in mp->modinfo->fmdi_props
* For now we just hard code for phase 2. In the
* future, there can be a ZED based override.
*/
if (strcmp(name, "remove_timeout") == 0)
return (15ULL * 1000ULL * 1000ULL * 1000ULL); /* 15 sec */
if (strcmp(name, "io_T") == 0 || strcmp(name, "checksum_T") == 0)
return (1000ULL * 1000ULL * 1000ULL * 600ULL); /* 10 min */
return (0);
}
@@ -322,24 +334,22 @@ fmd_case_uuresolved(fmd_hdl_t *hdl, const char *uuid)
fmd_hdl_debug(hdl, "case resolved by uuid (%s)", uuid);
}
boolean_t
int
fmd_case_solved(fmd_hdl_t *hdl, fmd_case_t *cp)
{
(void) hdl;
return (cp->ci_state >= FMD_CASE_SOLVED);
return ((cp->ci_state >= FMD_CASE_SOLVED) ? FMD_B_TRUE : FMD_B_FALSE);
}
void
fmd_case_add_ereport(fmd_hdl_t *hdl, fmd_case_t *cp, fmd_event_t *ep)
{
(void) hdl, (void) cp, (void) ep;
}
static void
zed_log_fault(nvlist_t *nvl, const char *uuid, const char *code)
{
nvlist_t *rsrc;
const char *strval;
char *strval;
uint64_t guid;
uint8_t byte;
@@ -352,7 +362,7 @@ zed_log_fault(nvlist_t *nvl, const char *uuid, const char *code)
if (code != NULL)
zed_log_msg(LOG_INFO, "\t%s: %s", FM_SUSPECT_DIAG_CODE, code);
if (nvlist_lookup_uint8(nvl, FM_FAULT_CERTAINTY, &byte) == 0)
zed_log_msg(LOG_INFO, "\t%s: %hhu", FM_FAULT_CERTAINTY, byte);
zed_log_msg(LOG_INFO, "\t%s: %llu", FM_FAULT_CERTAINTY, byte);
if (nvlist_lookup_nvlist(nvl, FM_FAULT_RESOURCE, &rsrc) == 0) {
if (nvlist_lookup_string(rsrc, FM_FMRI_SCHEME, &strval) == 0)
zed_log_msg(LOG_INFO, "\t%s: %s", FM_FMRI_SCHEME,
@@ -369,8 +379,7 @@ zed_log_fault(nvlist_t *nvl, const char *uuid, const char *code)
static const char *
fmd_fault_mkcode(nvlist_t *fault)
{
const char *class;
const char *code = "-";
char *class, *code = "-";
/*
* Note: message codes come from: openzfs/usr/src/cmd/fm/dicts/ZFS.po
@@ -419,8 +428,7 @@ fmd_case_add_suspect(fmd_hdl_t *hdl, fmd_case_t *cp, nvlist_t *fault)
err |= nvlist_add_string(nvl, FM_SUSPECT_DIAG_CODE, code);
err |= nvlist_add_int64_array(nvl, FM_SUSPECT_DIAG_TIME, tod, 2);
err |= nvlist_add_uint32(nvl, FM_SUSPECT_FAULT_SZ, 1);
err |= nvlist_add_nvlist_array(nvl, FM_SUSPECT_FAULT_LIST,
(const nvlist_t **)&fault, 1);
err |= nvlist_add_nvlist_array(nvl, FM_SUSPECT_FAULT_LIST, &fault, 1);
if (err)
zed_log_die("failed to populate nvlist");
@@ -435,21 +443,19 @@ fmd_case_add_suspect(fmd_hdl_t *hdl, fmd_case_t *cp, nvlist_t *fault)
void
fmd_case_setspecific(fmd_hdl_t *hdl, fmd_case_t *cp, void *data)
{
(void) hdl;
cp->ci_data = data;
}
void *
fmd_case_getspecific(fmd_hdl_t *hdl, fmd_case_t *cp)
{
(void) hdl;
return (cp->ci_data);
}
void
fmd_buf_create(fmd_hdl_t *hdl, fmd_case_t *cp, const char *name, size_t size)
{
assert(strcmp(name, "data") == 0), (void) name;
assert(strcmp(name, "data") == 0);
assert(cp->ci_bufptr == NULL);
assert(size < (1024 * 1024));
@@ -461,24 +467,22 @@ void
fmd_buf_read(fmd_hdl_t *hdl, fmd_case_t *cp,
const char *name, void *buf, size_t size)
{
(void) hdl;
assert(strcmp(name, "data") == 0), (void) name;
assert(strcmp(name, "data") == 0);
assert(cp->ci_bufptr != NULL);
assert(size <= cp->ci_bufsiz);
memcpy(buf, cp->ci_bufptr, size);
bcopy(cp->ci_bufptr, buf, size);
}
void
fmd_buf_write(fmd_hdl_t *hdl, fmd_case_t *cp,
const char *name, const void *buf, size_t size)
{
(void) hdl;
assert(strcmp(name, "data") == 0), (void) name;
assert(strcmp(name, "data") == 0);
assert(cp->ci_bufptr != NULL);
assert(cp->ci_bufsiz >= size);
memcpy(cp->ci_bufptr, buf, size);
bcopy(buf, cp->ci_bufptr, size);
}
/* SERD Engines */
@@ -515,19 +519,6 @@ fmd_serd_exists(fmd_hdl_t *hdl, const char *name)
return (fmd_serd_eng_lookup(&mp->mod_serds, name) != NULL);
}
int
fmd_serd_active(fmd_hdl_t *hdl, const char *name)
{
fmd_module_t *mp = (fmd_module_t *)hdl;
fmd_serd_eng_t *sgp;
if ((sgp = fmd_serd_eng_lookup(&mp->mod_serds, name)) == NULL) {
zed_log_msg(LOG_ERR, "serd engine '%s' does not exist", name);
return (0);
}
return (fmd_serd_eng_fired(sgp) || !fmd_serd_eng_empty(sgp));
}
void
fmd_serd_reset(fmd_hdl_t *hdl, const char *name)
{
@@ -536,10 +527,12 @@ fmd_serd_reset(fmd_hdl_t *hdl, const char *name)
if ((sgp = fmd_serd_eng_lookup(&mp->mod_serds, name)) == NULL) {
zed_log_msg(LOG_ERR, "serd engine '%s' does not exist", name);
} else {
fmd_serd_eng_reset(sgp);
fmd_hdl_debug(hdl, "serd_reset %s", name);
return;
}
fmd_serd_eng_reset(sgp);
fmd_hdl_debug(hdl, "serd_reset %s", name);
}
int
@@ -547,21 +540,16 @@ fmd_serd_record(fmd_hdl_t *hdl, const char *name, fmd_event_t *ep)
{
fmd_module_t *mp = (fmd_module_t *)hdl;
fmd_serd_eng_t *sgp;
int err;
if ((sgp = fmd_serd_eng_lookup(&mp->mod_serds, name)) == NULL) {
zed_log_msg(LOG_ERR, "failed to add record to SERD engine '%s'",
name);
return (0);
return (FMD_B_FALSE);
}
return (fmd_serd_eng_record(sgp, ep->ev_hrt));
}
err = fmd_serd_eng_record(sgp, ep->ev_hrt);
void
fmd_serd_gc(fmd_hdl_t *hdl)
{
fmd_module_t *mp = (fmd_module_t *)hdl;
fmd_serd_hash_apply(&mp->mod_serds, fmd_serd_eng_gc, NULL);
return (err);
}
/* FMD Timers */
@@ -575,10 +563,10 @@ _timer_notify(union sigval sv)
const fmd_hdl_ops_t *ops = mp->mod_info->fmdi_ops;
struct itimerspec its;
fmd_hdl_debug(hdl, "%s timer fired (%p)", mp->mod_name, ftp->ft_tid);
fmd_hdl_debug(hdl, "timer fired (%p)", ftp->ft_tid);
/* disarm the timer */
memset(&its, 0, sizeof (struct itimerspec));
bzero(&its, sizeof (struct itimerspec));
timer_settime(ftp->ft_tid, 0, &its, NULL);
/* Note that the fmdo_timeout can remove this timer */
@@ -594,7 +582,6 @@ _timer_notify(union sigval sv)
fmd_timer_t *
fmd_timer_install(fmd_hdl_t *hdl, void *arg, fmd_event_t *ep, hrtime_t delta)
{
(void) ep;
struct sigevent sev;
struct itimerspec its;
fmd_timer_t *ftp;
@@ -612,7 +599,6 @@ fmd_timer_install(fmd_hdl_t *hdl, void *arg, fmd_event_t *ep, hrtime_t delta)
sev.sigev_notify_function = _timer_notify;
sev.sigev_notify_attributes = NULL;
sev.sigev_value.sival_ptr = ftp;
sev.sigev_signo = 0;
timer_create(CLOCK_REALTIME, &sev, &ftp->ft_tid);
timer_settime(ftp->ft_tid, 0, &its, NULL);
@@ -639,7 +625,6 @@ nvlist_t *
fmd_nvl_create_fault(fmd_hdl_t *hdl, const char *class, uint8_t certainty,
nvlist_t *asru, nvlist_t *fru, nvlist_t *resource)
{
(void) hdl;
nvlist_t *nvl;
int err = 0;
@@ -703,8 +688,7 @@ fmd_strmatch(const char *s, const char *p)
int
fmd_nvl_class_match(fmd_hdl_t *hdl, nvlist_t *nvl, const char *pattern)
{
(void) hdl;
const char *class;
char *class;
return (nvl != NULL &&
nvlist_lookup_string(nvl, FM_CLASS, &class) == 0 &&
@@ -714,7 +698,6 @@ fmd_nvl_class_match(fmd_hdl_t *hdl, nvlist_t *nvl, const char *pattern)
nvlist_t *
fmd_nvl_alloc(fmd_hdl_t *hdl, int flags)
{
(void) hdl, (void) flags;
nvlist_t *nvl = NULL;
if (nvlist_alloc(&nvl, NV_UNIQUE_NAME, 0) != 0)
+8 -5
View File
@@ -1,4 +1,3 @@
// SPDX-License-Identifier: CDDL-1.0
/*
* CDDL HEADER START
*
@@ -7,7 +6,7 @@
* You may not use this file except in compliance with the License.
*
* You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
* or https://opensource.org/licenses/CDDL-1.0.
* or http://www.opensolaris.org/os/licensing.
* See the License for the specific language governing permissions
* and limitations under the License.
*
@@ -73,6 +72,10 @@ typedef struct fmd_case {
} fmd_case_t;
#define FMD_B_FALSE 0 /* false value for booleans as int */
#define FMD_B_TRUE 1 /* true value for booleans as int */
#define FMD_CASE_UNSOLVED 0 /* case is not yet solved (waiting) */
#define FMD_CASE_SOLVED 1 /* case is solved (suspects added) */
#define FMD_CASE_CLOSE_WAIT 2 /* case is executing fmdo_close() */
@@ -152,6 +155,7 @@ extern void fmd_hdl_vdebug(fmd_hdl_t *, const char *, va_list);
extern void fmd_hdl_debug(fmd_hdl_t *, const char *, ...);
extern int32_t fmd_prop_get_int32(fmd_hdl_t *, const char *);
extern int64_t fmd_prop_get_int64(fmd_hdl_t *, const char *);
#define FMD_STAT_NOALLOC 0x0 /* fmd should use caller's memory */
#define FMD_STAT_ALLOC 0x1 /* fmd should allocate stats memory */
@@ -172,7 +176,8 @@ extern int fmd_case_uuclosed(fmd_hdl_t *, const char *);
extern int fmd_case_uuisresolved(fmd_hdl_t *, const char *);
extern void fmd_case_uuresolved(fmd_hdl_t *, const char *);
extern boolean_t fmd_case_solved(fmd_hdl_t *, fmd_case_t *);
extern int fmd_case_solved(fmd_hdl_t *, fmd_case_t *);
extern int fmd_case_closed(fmd_hdl_t *, fmd_case_t *);
extern void fmd_case_add_ereport(fmd_hdl_t *, fmd_case_t *, fmd_event_t *);
extern void fmd_case_add_serd(fmd_hdl_t *, fmd_case_t *, const char *);
@@ -195,12 +200,10 @@ extern size_t fmd_buf_size(fmd_hdl_t *, fmd_case_t *, const char *);
extern void fmd_serd_create(fmd_hdl_t *, const char *, uint_t, hrtime_t);
extern void fmd_serd_destroy(fmd_hdl_t *, const char *);
extern int fmd_serd_exists(fmd_hdl_t *, const char *);
extern int fmd_serd_active(fmd_hdl_t *, const char *);
extern void fmd_serd_reset(fmd_hdl_t *, const char *);
extern int fmd_serd_record(fmd_hdl_t *, const char *, fmd_event_t *);
extern int fmd_serd_fired(fmd_hdl_t *, const char *);
extern int fmd_serd_empty(fmd_hdl_t *, const char *);
extern void fmd_serd_gc(fmd_hdl_t *);
extern id_t fmd_timer_install(fmd_hdl_t *, void *, fmd_event_t *, hrtime_t);
extern void fmd_timer_remove(fmd_hdl_t *, id_t);
+8 -29
View File
@@ -1,4 +1,3 @@
// SPDX-License-Identifier: CDDL-1.0
/*
* CDDL HEADER START
*
@@ -8,7 +7,7 @@
* with the License.
*
* You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
* or https://opensource.org/licenses/CDDL-1.0.
* or http://www.opensolaris.org/os/licensing.
* See the License for the specific language governing permissions
* and limitations under the License.
*
@@ -30,7 +29,7 @@
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>
#include <sys/list.h>
#include <sys/time.h>
@@ -75,18 +74,9 @@ fmd_serd_eng_alloc(const char *name, uint64_t n, hrtime_t t)
fmd_serd_eng_t *sgp;
sgp = malloc(sizeof (fmd_serd_eng_t));
if (sgp == NULL) {
perror("malloc");
exit(EXIT_FAILURE);
}
memset(sgp, 0, sizeof (fmd_serd_eng_t));
bzero(sgp, sizeof (fmd_serd_eng_t));
sgp->sg_name = strdup(name);
if (sgp->sg_name == NULL) {
perror("strdup");
exit(EXIT_FAILURE);
}
sgp->sg_flags = FMD_SERD_DIRTY;
sgp->sg_n = n;
sgp->sg_t = t;
@@ -133,12 +123,6 @@ fmd_serd_hash_create(fmd_serd_hash_t *shp)
shp->sh_hashlen = FMD_STR_BUCKETS;
shp->sh_hash = calloc(shp->sh_hashlen, sizeof (void *));
shp->sh_count = 0;
if (shp->sh_hash == NULL) {
perror("calloc");
exit(EXIT_FAILURE);
}
}
void
@@ -155,7 +139,7 @@ fmd_serd_hash_destroy(fmd_serd_hash_t *shp)
}
free(shp->sh_hash);
memset(shp, 0, sizeof (fmd_serd_hash_t));
bzero(shp, sizeof (fmd_serd_hash_t));
}
void
@@ -250,17 +234,13 @@ fmd_serd_eng_record(fmd_serd_eng_t *sgp, hrtime_t hrt)
if (sgp->sg_flags & FMD_SERD_FIRED) {
serd_log_msg(" SERD Engine: record %s already fired!",
sgp->sg_name);
return (B_FALSE);
return (FMD_B_FALSE);
}
while (sgp->sg_count >= sgp->sg_n)
fmd_serd_eng_discard(sgp, list_tail(&sgp->sg_list));
sep = malloc(sizeof (fmd_serd_elem_t));
if (sep == NULL) {
perror("malloc");
exit(EXIT_FAILURE);
}
sep->se_hrt = hrt;
list_insert_head(&sgp->sg_list, sep);
@@ -279,11 +259,11 @@ fmd_serd_eng_record(fmd_serd_eng_t *sgp, hrtime_t hrt)
fmd_event_delta(oep->se_hrt, sep->se_hrt) <= sgp->sg_t) {
sgp->sg_flags |= FMD_SERD_FIRED | FMD_SERD_DIRTY;
serd_log_msg(" SERD Engine: fired %s", sgp->sg_name);
return (B_TRUE);
return (FMD_B_TRUE);
}
sgp->sg_flags |= FMD_SERD_DIRTY;
return (B_FALSE);
return (FMD_B_FALSE);
}
int
@@ -311,9 +291,8 @@ fmd_serd_eng_reset(fmd_serd_eng_t *sgp)
}
void
fmd_serd_eng_gc(fmd_serd_eng_t *sgp, void *arg)
fmd_serd_eng_gc(fmd_serd_eng_t *sgp)
{
(void) arg;
fmd_serd_elem_t *sep, *nep;
hrtime_t hrt;
+2 -3
View File
@@ -1,4 +1,3 @@
// SPDX-License-Identifier: CDDL-1.0
/*
* CDDL HEADER START
*
@@ -8,7 +7,7 @@
* with the License.
*
* You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
* or https://opensource.org/licenses/CDDL-1.0.
* or http://www.opensolaris.org/os/licensing.
* See the License for the specific language governing permissions
* and limitations under the License.
*
@@ -78,7 +77,7 @@ extern int fmd_serd_eng_fired(fmd_serd_eng_t *);
extern int fmd_serd_eng_empty(fmd_serd_eng_t *);
extern void fmd_serd_eng_reset(fmd_serd_eng_t *);
extern void fmd_serd_eng_gc(fmd_serd_eng_t *, void *);
extern void fmd_serd_eng_gc(fmd_serd_eng_t *);
#ifdef __cplusplus
}
+32 -71
View File
@@ -1,4 +1,3 @@
// SPDX-License-Identifier: CDDL-1.0
/*
* CDDL HEADER START
*
@@ -14,7 +13,6 @@
/*
* Copyright (c) 2016, Intel Corporation.
* Copyright (c) 2018, loli10K <ezomori.nozomu@gmail.com>
* Copyright (c) 2021 Hewlett Packard Enterprise Development LP
*/
#include <libnvpair.h>
@@ -65,7 +63,7 @@ typedef enum device_type {
typedef struct guid_search {
uint64_t gs_pool_guid;
uint64_t gs_vdev_guid;
const char *gs_devid;
char *gs_devid;
device_type_t gs_vdev_type;
uint64_t gs_vdev_expandtime; /* vdev expansion time */
} guid_search_t;
@@ -78,10 +76,9 @@ static boolean_t
zfs_agent_iter_vdev(zpool_handle_t *zhp, nvlist_t *nvl, void *arg)
{
guid_search_t *gsp = arg;
const char *path = NULL;
char *path = NULL;
uint_t c, children;
nvlist_t **child;
uint64_t vdev_guid;
/*
* First iterate over any children.
@@ -102,7 +99,7 @@ zfs_agent_iter_vdev(zpool_handle_t *zhp, nvlist_t *nvl, void *arg)
&child, &children) == 0) {
for (c = 0; c < children; c++) {
if (zfs_agent_iter_vdev(zhp, child[c], gsp)) {
gsp->gs_vdev_type = DEVICE_TYPE_SPARE;
gsp->gs_vdev_type = DEVICE_TYPE_L2ARC;
return (B_TRUE);
}
}
@@ -111,7 +108,7 @@ zfs_agent_iter_vdev(zpool_handle_t *zhp, nvlist_t *nvl, void *arg)
&child, &children) == 0) {
for (c = 0; c < children; c++) {
if (zfs_agent_iter_vdev(zhp, child[c], gsp)) {
gsp->gs_vdev_type = DEVICE_TYPE_L2ARC;
gsp->gs_vdev_type = DEVICE_TYPE_SPARE;
return (B_TRUE);
}
}
@@ -128,23 +125,6 @@ zfs_agent_iter_vdev(zpool_handle_t *zhp, nvlist_t *nvl, void *arg)
&gsp->gs_vdev_expandtime);
return (B_TRUE);
}
/*
* Otherwise, on a vdev guid match, grab the devid and expansion
* time. The devid might be missing on removal since its not part
* of blkid cache and L2ARC VDEV does not contain pool guid in its
* blkid, so this is a special case for L2ARC VDEV.
*/
else if (gsp->gs_vdev_guid != 0 &&
nvlist_lookup_uint64(nvl, ZPOOL_CONFIG_GUID, &vdev_guid) == 0 &&
gsp->gs_vdev_guid == vdev_guid) {
if (gsp->gs_devid == NULL) {
(void) nvlist_lookup_string(nvl, ZPOOL_CONFIG_DEVID,
&gsp->gs_devid);
}
(void) nvlist_lookup_uint64(nvl, ZPOOL_CONFIG_EXPANSION_TIME,
&gsp->gs_vdev_expandtime);
return (B_TRUE);
}
return (B_FALSE);
}
@@ -158,28 +138,22 @@ zfs_agent_iter_pool(zpool_handle_t *zhp, void *arg)
/*
* For each vdev in this pool, look for a match by devid
*/
boolean_t found = B_FALSE;
uint64_t pool_guid;
if ((config = zpool_get_config(zhp, NULL)) != NULL) {
if (nvlist_lookup_nvlist(config, ZPOOL_CONFIG_VDEV_TREE,
&nvl) == 0) {
(void) zfs_agent_iter_vdev(zhp, nvl, gsp);
}
}
/*
* if a match was found then grab the pool guid
*/
if (gsp->gs_vdev_guid) {
(void) nvlist_lookup_uint64(config, ZPOOL_CONFIG_POOL_GUID,
&gsp->gs_pool_guid);
}
/* Get pool configuration and extract pool GUID */
if ((config = zpool_get_config(zhp, NULL)) == NULL ||
nvlist_lookup_uint64(config, ZPOOL_CONFIG_POOL_GUID,
&pool_guid) != 0)
goto out;
/* Skip this pool if we're looking for a specific pool */
if (gsp->gs_pool_guid != 0 && pool_guid != gsp->gs_pool_guid)
goto out;
if (nvlist_lookup_nvlist(config, ZPOOL_CONFIG_VDEV_TREE, &nvl) == 0)
found = zfs_agent_iter_vdev(zhp, nvl, gsp);
if (found && gsp->gs_pool_guid == 0)
gsp->gs_pool_guid = pool_guid;
out:
zpool_close(zhp);
return (found);
return (gsp->gs_vdev_guid != 0);
}
void
@@ -203,12 +177,10 @@ zfs_agent_post_event(const char *class, const char *subclass, nvlist_t *nvl)
}
/*
* On Linux, we don't get the expected FM_RESOURCE_REMOVED ereport
* from the vdev_disk layer after a hot unplug. Fortunately we do
* get an EC_DEV_REMOVE from our disk monitor and it is a suitable
* On ZFS on Linux, we don't get the expected FM_RESOURCE_REMOVED
* ereport from vdev_disk layer after a hot unplug. Fortunately we
* get a EC_DEV_REMOVE from our disk monitor and it is a suitable
* proxy so we remap it here for the benefit of the diagnosis engine.
* Starting in OpenZFS 2.0, we do get FM_RESOURCE_REMOVED from the spa
* layer. Processing multiple FM_RESOURCE_REMOVED events is not harmful.
*/
if ((strcmp(class, EC_DEV_REMOVE) == 0) &&
(strcmp(subclass, ESC_DISK) == 0) &&
@@ -220,13 +192,11 @@ zfs_agent_post_event(const char *class, const char *subclass, nvlist_t *nvl)
uint64_t pool_guid = 0, vdev_guid = 0;
guid_search_t search = { 0 };
device_type_t devtype = DEVICE_TYPE_PRIMARY;
const char *devid = NULL;
class = "resource.fs.zfs.removed";
subclass = "";
(void) nvlist_add_string(payload, FM_CLASS, class);
(void) nvlist_lookup_string(nvl, DEV_IDENTIFIER, &devid);
(void) nvlist_lookup_uint64(nvl, ZFS_EV_POOL_GUID, &pool_guid);
(void) nvlist_lookup_uint64(nvl, ZFS_EV_VDEV_GUID, &vdev_guid);
@@ -236,21 +206,14 @@ zfs_agent_post_event(const char *class, const char *subclass, nvlist_t *nvl)
(void) nvlist_add_int64_array(payload, FM_EREPORT_TIME, tod, 2);
/*
* If devid is missing but vdev_guid is available, find devid
* and pool_guid from vdev_guid.
* For multipath, spare and l2arc devices ZFS_EV_VDEV_GUID or
* ZFS_EV_POOL_GUID may be missing so find them.
*/
search.gs_devid = devid;
search.gs_vdev_guid = vdev_guid;
search.gs_pool_guid = pool_guid;
zpool_iter(g_zfs_hdl, zfs_agent_iter_pool, &search);
if (devid == NULL)
devid = search.gs_devid;
if (pool_guid == 0)
pool_guid = search.gs_pool_guid;
if (vdev_guid == 0)
vdev_guid = search.gs_vdev_guid;
(void) nvlist_lookup_string(nvl, DEV_IDENTIFIER,
&search.gs_devid);
(void) zpool_iter(g_zfs_hdl, zfs_agent_iter_pool, &search);
pool_guid = search.gs_pool_guid;
vdev_guid = search.gs_vdev_guid;
devtype = search.gs_vdev_type;
/*
@@ -263,9 +226,7 @@ zfs_agent_post_event(const char *class, const char *subclass, nvlist_t *nvl)
search.gs_vdev_expandtime + 10 > tv.tv_sec) {
zed_log_msg(LOG_INFO, "agent post event: ignoring '%s' "
"for recently expanded device '%s'", EC_DEV_REMOVE,
devid);
fnvlist_free(payload);
free(event);
search.gs_devid);
goto out;
}
@@ -357,8 +318,6 @@ zfs_agent_dispatch(const char *class, const char *subclass, nvlist_t *nvl)
static void *
zfs_agent_consumer_thread(void *arg)
{
(void) arg;
for (;;) {
agent_event_t *event;
@@ -375,7 +334,9 @@ zfs_agent_consumer_thread(void *arg)
return (NULL);
}
if ((event = list_remove_head(&agent_events)) != NULL) {
if ((event = (list_head(&agent_events))) != NULL) {
list_remove(&agent_events, event);
(void) pthread_mutex_unlock(&agent_lock);
/* dispatch to all event subscribers */
@@ -422,7 +383,6 @@ zfs_agent_init(libzfs_handle_t *zfs_hdl)
list_destroy(&agent_events);
zed_log_die("Failed to initialize agents");
}
pthread_setname_np(g_agents_tid, "agents");
}
void
@@ -438,7 +398,8 @@ zfs_agent_fini(void)
(void) pthread_join(g_agents_tid, NULL);
/* drain any pending events */
while ((event = list_remove_head(&agent_events)) != NULL) {
while ((event = (list_head(&agent_events))) != NULL) {
list_remove(&agent_events, event);
nvlist_free(event->ae_nvl);
free(event);
}
-1
View File
@@ -1,4 +1,3 @@
// SPDX-License-Identifier: CDDL-1.0
/*
* CDDL HEADER START
*
+58 -249
View File
@@ -1,4 +1,3 @@
// SPDX-License-Identifier: CDDL-1.0
/*
* CDDL HEADER START
*
@@ -7,7 +6,7 @@
* You may not use this file except in compliance with the License.
*
* You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
* or https://opensource.org/licenses/CDDL-1.0.
* or http://www.opensolaris.org/os/licensing.
* See the License for the specific language governing permissions
* and limitations under the License.
*
@@ -24,11 +23,11 @@
* Copyright (c) 2010, Oracle and/or its affiliates. All rights reserved.
* Copyright 2015 Nexenta Systems, Inc. All rights reserved.
* Copyright (c) 2016, Intel Corporation.
* Copyright (c) 2023, Klara Inc.
*/
#include <stddef.h>
#include <string.h>
#include <strings.h>
#include <libuutil.h>
#include <libzfs.h>
#include <sys/types.h>
@@ -36,29 +35,14 @@
#include <sys/fs/zfs.h>
#include <sys/fm/protocol.h>
#include <sys/fm/fs/zfs.h>
#include <sys/zio.h>
#include "zfs_agents.h"
#include "fmd_api.h"
/*
* Default values for the serd engine when processing checksum or io errors. The
* semantics are N <events> in T <seconds>.
*/
#define DEFAULT_CHECKSUM_N 10 /* events */
#define DEFAULT_CHECKSUM_T 600 /* seconds */
#define DEFAULT_IO_N 10 /* events */
#define DEFAULT_IO_T 600 /* seconds */
#define DEFAULT_SLOW_IO_N 10 /* events */
#define DEFAULT_SLOW_IO_T 30 /* seconds */
#define CASE_GC_TIMEOUT_SECS 43200 /* 12 hours */
/*
* Our serd engines are named in the following format:
* 'zfs_<pool_guid>_<vdev_guid>_{checksum,io,slow_io}'
* This #define reserves enough space for two 64-bit hex values plus the
* length of the longest string.
* Our serd engines are named 'zfs_<pool_guid>_<vdev_guid>_{checksum,io}'. This
* #define reserves enough space for two 64-bit hex values plus the length of
* the longest string.
*/
#define MAX_SERDLEN (16 * 2 + sizeof ("zfs___checksum"))
@@ -72,11 +56,9 @@ typedef struct zfs_case_data {
uint64_t zc_ena;
uint64_t zc_pool_guid;
uint64_t zc_vdev_guid;
uint64_t zc_parent_guid;
int zc_pool_state;
char zc_serd_checksum[MAX_SERDLEN];
char zc_serd_io[MAX_SERDLEN];
char zc_serd_slow_io[MAX_SERDLEN];
int zc_has_remove_timer;
} zfs_case_data_t;
@@ -123,8 +105,7 @@ zfs_de_stats_t zfs_stats = {
{ "resource_drops", FMD_TYPE_UINT64, "resource related ereports" }
};
/* wait 15 seconds after a removal */
static hrtime_t zfs_remove_timeout = SEC2NSEC(15);
static hrtime_t zfs_remove_timeout;
uu_list_pool_t *zfs_case_pool;
uu_list_t *zfs_cases;
@@ -134,13 +115,11 @@ uu_list_t *zfs_cases;
#define ZFS_MAKE_EREPORT(type) \
FM_EREPORT_CLASS "." ZFS_ERROR_CLASS "." type
static void zfs_purge_cases(fmd_hdl_t *hdl);
/*
* Write out the persistent representation of an active case.
*/
static void
zfs_case_serialize(zfs_case_t *zcp)
zfs_case_serialize(fmd_hdl_t *hdl, zfs_case_t *zcp)
{
zcp->zc_data.zc_version = CASE_DATA_VERSION_SERD;
}
@@ -182,64 +161,6 @@ zfs_case_unserialize(fmd_hdl_t *hdl, fmd_case_t *cp)
return (zcp);
}
/*
* Return count of other unique SERD cases under same vdev parent
*/
static uint_t
zfs_other_serd_cases(fmd_hdl_t *hdl, const zfs_case_data_t *zfs_case)
{
zfs_case_t *zcp;
uint_t cases = 0;
static hrtime_t next_check = 0;
/*
* Note that plumbing in some external GC would require adding locking,
* since most of this module code is not thread safe and assumes there
* is only one thread running against the module. So we perform GC here
* inline periodically so that future delay induced faults will be
* possible once the issue causing multiple vdev delays is resolved.
*/
if (gethrestime_sec() > next_check) {
/* Periodically purge old SERD entries and stale cases */
fmd_serd_gc(hdl);
zfs_purge_cases(hdl);
next_check = gethrestime_sec() + CASE_GC_TIMEOUT_SECS;
}
for (zcp = uu_list_first(zfs_cases); zcp != NULL;
zcp = uu_list_next(zfs_cases, zcp)) {
zfs_case_data_t *zcd = &zcp->zc_data;
/*
* must be same pool and parent vdev but different leaf vdev
*/
if (zcd->zc_pool_guid != zfs_case->zc_pool_guid ||
zcd->zc_parent_guid != zfs_case->zc_parent_guid ||
zcd->zc_vdev_guid == zfs_case->zc_vdev_guid) {
continue;
}
/*
* Check if there is another active serd case besides zfs_case
*
* Only one serd engine will be assigned to the case
*/
if (zcd->zc_serd_checksum[0] == zfs_case->zc_serd_checksum[0] &&
fmd_serd_active(hdl, zcd->zc_serd_checksum)) {
cases++;
}
if (zcd->zc_serd_io[0] == zfs_case->zc_serd_io[0] &&
fmd_serd_active(hdl, zcd->zc_serd_io)) {
cases++;
}
if (zcd->zc_serd_slow_io[0] == zfs_case->zc_serd_slow_io[0] &&
fmd_serd_active(hdl, zcd->zc_serd_slow_io)) {
cases++;
}
}
return (cases);
}
/*
* Iterate over any active cases. If any cases are associated with a pool or
* vdev which is no longer present on the system, close the associated case.
@@ -288,10 +209,10 @@ zfs_mark_vdev(uint64_t pool_guid, nvlist_t *vd, er_timeval_t *loaded)
}
}
/*ARGSUSED*/
static int
zfs_mark_pool(zpool_handle_t *zhp, void *unused)
{
(void) unused;
zfs_case_t *zcp;
uint64_t pool_guid;
uint64_t *tod;
@@ -446,21 +367,14 @@ zfs_serd_name(char *buf, uint64_t pool_guid, uint64_t vdev_guid,
(long long unsigned int)vdev_guid, type);
}
static void
zfs_case_retire(fmd_hdl_t *hdl, zfs_case_t *zcp)
{
fmd_hdl_debug(hdl, "retiring case");
fmd_case_close(hdl, zcp->zc_case);
}
/*
* Solve a given ZFS case. This first checks to make sure the diagnosis is
* still valid, as well as cleaning up any pending timer associated with the
* case.
*/
static void
zfs_case_solve(fmd_hdl_t *hdl, zfs_case_t *zcp, const char *faultname)
zfs_case_solve(fmd_hdl_t *hdl, zfs_case_t *zcp, const char *faultname,
boolean_t checkunusable)
{
nvlist_t *detector, *fault;
boolean_t serialize;
@@ -498,7 +412,7 @@ zfs_case_solve(fmd_hdl_t *hdl, zfs_case_t *zcp, const char *faultname)
serialize = B_TRUE;
}
if (serialize)
zfs_case_serialize(zcp);
zfs_case_serialize(hdl, zcp);
nvlist_free(detector);
}
@@ -510,10 +424,10 @@ timeval_earlier(er_timeval_t *a, er_timeval_t *b)
(a->ertv_sec == b->ertv_sec && a->ertv_nsec < b->ertv_nsec));
}
/*ARGSUSED*/
static void
zfs_ereport_when(fmd_hdl_t *hdl, nvlist_t *nvl, er_timeval_t *when)
{
(void) hdl;
int64_t *tod;
uint_t nelem;
@@ -526,51 +440,22 @@ zfs_ereport_when(fmd_hdl_t *hdl, nvlist_t *nvl, er_timeval_t *when)
}
}
/*
* Record the specified event in the SERD engine and return a
* boolean value indicating whether or not the engine fired as
* the result of inserting this event.
*
* When the pool has similar active cases on other vdevs, then
* the fired state is disregarded and the case is retired.
*/
static int
zfs_fm_serd_record(fmd_hdl_t *hdl, const char *name, fmd_event_t *ep,
zfs_case_t *zcp, const char *err_type)
{
int fired = fmd_serd_record(hdl, name, ep);
int peers = 0;
if (fired && (peers = zfs_other_serd_cases(hdl, &zcp->zc_data)) > 0) {
fmd_hdl_debug(hdl, "pool %llu is tracking %d other %s cases "
"-- skip faulting the vdev %llu",
(u_longlong_t)zcp->zc_data.zc_pool_guid,
peers, err_type,
(u_longlong_t)zcp->zc_data.zc_vdev_guid);
zfs_case_retire(hdl, zcp);
fired = 0;
}
return (fired);
}
/*
* Main fmd entry point.
*/
/*ARGSUSED*/
static void
zfs_fm_recv(fmd_hdl_t *hdl, fmd_event_t *ep, nvlist_t *nvl, const char *class)
{
zfs_case_t *zcp, *dcp;
int32_t pool_state;
uint64_t ena, pool_guid, vdev_guid, parent_guid;
uint64_t checksum_n, checksum_t;
uint64_t io_n, io_t;
uint64_t ena, pool_guid, vdev_guid;
er_timeval_t pool_load;
er_timeval_t er_when;
nvlist_t *detector;
boolean_t pool_found = B_FALSE;
boolean_t isresource;
const char *type;
char *type;
/*
* We subscribe to notifications for vdev or pool removal. In these
@@ -652,9 +537,6 @@ zfs_fm_recv(fmd_hdl_t *hdl, fmd_event_t *ep, nvlist_t *nvl, const char *class)
if (nvlist_lookup_uint64(nvl,
FM_EREPORT_PAYLOAD_ZFS_VDEV_GUID, &vdev_guid) != 0)
vdev_guid = 0;
if (nvlist_lookup_uint64(nvl,
FM_EREPORT_PAYLOAD_ZFS_PARENT_GUID, &parent_guid) != 0)
parent_guid = 0;
if (nvlist_lookup_uint64(nvl, FM_EREPORT_ENA, &ena) != 0)
ena = 0;
@@ -741,7 +623,9 @@ zfs_fm_recv(fmd_hdl_t *hdl, fmd_event_t *ep, nvlist_t *nvl, const char *class)
if (strcmp(class,
ZFS_MAKE_EREPORT(FM_EREPORT_ZFS_DATA)) == 0 ||
strcmp(class,
ZFS_MAKE_EREPORT(FM_EREPORT_ZFS_CONFIG_CACHE_WRITE)) == 0) {
ZFS_MAKE_EREPORT(FM_EREPORT_ZFS_CONFIG_CACHE_WRITE)) == 0 ||
strcmp(class,
ZFS_MAKE_EREPORT(FM_EREPORT_ZFS_DELAY)) == 0) {
zfs_stats.resource_drops.fmds_value.ui64++;
return;
}
@@ -765,7 +649,6 @@ zfs_fm_recv(fmd_hdl_t *hdl, fmd_event_t *ep, nvlist_t *nvl, const char *class)
data.zc_ena = ena;
data.zc_pool_guid = pool_guid;
data.zc_vdev_guid = vdev_guid;
data.zc_parent_guid = parent_guid;
data.zc_pool_state = (int)pool_state;
fmd_buf_write(hdl, cs, CASE_DATA, &data, sizeof (data));
@@ -803,16 +686,13 @@ zfs_fm_recv(fmd_hdl_t *hdl, fmd_event_t *ep, nvlist_t *nvl, const char *class)
if (zcp->zc_data.zc_has_remove_timer) {
fmd_timer_remove(hdl, zcp->zc_remove_timer);
zcp->zc_data.zc_has_remove_timer = 0;
zfs_case_serialize(zcp);
zfs_case_serialize(hdl, zcp);
}
if (zcp->zc_data.zc_serd_io[0] != '\0')
fmd_serd_reset(hdl, zcp->zc_data.zc_serd_io);
if (zcp->zc_data.zc_serd_checksum[0] != '\0')
fmd_serd_reset(hdl,
zcp->zc_data.zc_serd_checksum);
if (zcp->zc_data.zc_serd_slow_io[0] != '\0')
fmd_serd_reset(hdl,
zcp->zc_data.zc_serd_slow_io);
} else if (fmd_nvl_class_match(hdl, nvl,
ZFS_MAKE_RSRC(FM_RESOURCE_STATECHANGE))) {
uint64_t state = 0;
@@ -841,11 +721,7 @@ zfs_fm_recv(fmd_hdl_t *hdl, fmd_event_t *ep, nvlist_t *nvl, const char *class)
if (fmd_case_solved(hdl, zcp->zc_case))
return;
if (vdev_guid)
fmd_hdl_debug(hdl, "error event '%s', vdev %llu", class,
vdev_guid);
else
fmd_hdl_debug(hdl, "error event '%s'", class);
fmd_hdl_debug(hdl, "error event '%s'", class);
/*
* Determine if we should solve the case and generate a fault. We solve
@@ -875,18 +751,18 @@ zfs_fm_recv(fmd_hdl_t *hdl, fmd_event_t *ep, nvlist_t *nvl, const char *class)
fmd_case_close(hdl, dcp->zc_case);
}
zfs_case_solve(hdl, zcp, "fault.fs.zfs.pool");
zfs_case_solve(hdl, zcp, "fault.fs.zfs.pool", B_TRUE);
} else if (fmd_nvl_class_match(hdl, nvl,
ZFS_MAKE_EREPORT(FM_EREPORT_ZFS_LOG_REPLAY))) {
/*
* Pool level fault for reading the intent logs.
*/
zfs_case_solve(hdl, zcp, "fault.fs.zfs.log_replay");
zfs_case_solve(hdl, zcp, "fault.fs.zfs.log_replay", B_TRUE);
} else if (fmd_nvl_class_match(hdl, nvl, "ereport.fs.zfs.vdev.*")) {
/*
* Device fault.
*/
zfs_case_solve(hdl, zcp, "fault.fs.zfs.device");
zfs_case_solve(hdl, zcp, "fault.fs.zfs.device", B_TRUE);
} else if (fmd_nvl_class_match(hdl, nvl,
ZFS_MAKE_EREPORT(FM_EREPORT_ZFS_IO)) ||
fmd_nvl_class_match(hdl, nvl,
@@ -894,12 +770,9 @@ zfs_fm_recv(fmd_hdl_t *hdl, fmd_event_t *ep, nvlist_t *nvl, const char *class)
fmd_nvl_class_match(hdl, nvl,
ZFS_MAKE_EREPORT(FM_EREPORT_ZFS_IO_FAILURE)) ||
fmd_nvl_class_match(hdl, nvl,
ZFS_MAKE_EREPORT(FM_EREPORT_ZFS_DELAY)) ||
fmd_nvl_class_match(hdl, nvl,
ZFS_MAKE_EREPORT(FM_EREPORT_ZFS_PROBE_FAILURE))) {
const char *failmode = NULL;
char *failmode = NULL;
boolean_t checkremove = B_FALSE;
uint32_t pri = 0;
/*
* If this is a checksum or I/O error, then toss it into the
@@ -911,111 +784,30 @@ zfs_fm_recv(fmd_hdl_t *hdl, fmd_event_t *ep, nvlist_t *nvl, const char *class)
if (fmd_nvl_class_match(hdl, nvl,
ZFS_MAKE_EREPORT(FM_EREPORT_ZFS_IO))) {
if (zcp->zc_data.zc_serd_io[0] == '\0') {
if (nvlist_lookup_uint64(nvl,
FM_EREPORT_PAYLOAD_ZFS_VDEV_IO_N,
&io_n) != 0) {
io_n = DEFAULT_IO_N;
}
if (nvlist_lookup_uint64(nvl,
FM_EREPORT_PAYLOAD_ZFS_VDEV_IO_T,
&io_t) != 0) {
io_t = DEFAULT_IO_T;
}
zfs_serd_name(zcp->zc_data.zc_serd_io,
pool_guid, vdev_guid, "io");
fmd_serd_create(hdl, zcp->zc_data.zc_serd_io,
io_n,
SEC2NSEC(io_t));
zfs_case_serialize(zcp);
fmd_prop_get_int32(hdl, "io_N"),
fmd_prop_get_int64(hdl, "io_T"));
zfs_case_serialize(hdl, zcp);
}
if (zfs_fm_serd_record(hdl, zcp->zc_data.zc_serd_io,
ep, zcp, "io error")) {
if (fmd_serd_record(hdl, zcp->zc_data.zc_serd_io, ep))
checkremove = B_TRUE;
}
} else if (fmd_nvl_class_match(hdl, nvl,
ZFS_MAKE_EREPORT(FM_EREPORT_ZFS_DELAY))) {
uint64_t slow_io_n, slow_io_t;
/*
* Create a slow io SERD engine when the VDEV has the
* 'vdev_slow_io_n' and 'vdev_slow_io_n' properties.
*/
if (zcp->zc_data.zc_serd_slow_io[0] == '\0' &&
nvlist_lookup_uint64(nvl,
FM_EREPORT_PAYLOAD_ZFS_VDEV_SLOW_IO_N,
&slow_io_n) == 0 &&
nvlist_lookup_uint64(nvl,
FM_EREPORT_PAYLOAD_ZFS_VDEV_SLOW_IO_T,
&slow_io_t) == 0) {
zfs_serd_name(zcp->zc_data.zc_serd_slow_io,
pool_guid, vdev_guid, "slow_io");
fmd_serd_create(hdl,
zcp->zc_data.zc_serd_slow_io,
slow_io_n,
SEC2NSEC(slow_io_t));
zfs_case_serialize(zcp);
}
/* Pass event to SERD engine and see if this triggers */
if (zcp->zc_data.zc_serd_slow_io[0] != '\0' &&
zfs_fm_serd_record(hdl,
zcp->zc_data.zc_serd_slow_io, ep, zcp, "slow io")) {
zfs_case_solve(hdl, zcp,
"fault.fs.zfs.vdev.slow_io");
}
} else if (fmd_nvl_class_match(hdl, nvl,
ZFS_MAKE_EREPORT(FM_EREPORT_ZFS_CHECKSUM))) {
uint64_t flags = 0;
int32_t flags32 = 0;
/*
* We ignore ereports for checksum errors generated by
* scrub/resilver I/O to avoid potentially further
* degrading the pool while it's being repaired.
*
* Note that FM_EREPORT_PAYLOAD_ZFS_ZIO_FLAGS used to
* be int32. To allow newer zed to work on older
* kernels, if we don't find the flags, we look for
* the older ones too.
*/
if (((nvlist_lookup_uint32(nvl,
FM_EREPORT_PAYLOAD_ZFS_ZIO_PRIORITY, &pri) == 0) &&
(pri == ZIO_PRIORITY_SCRUB ||
pri == ZIO_PRIORITY_REBUILD)) ||
((nvlist_lookup_uint64(nvl,
FM_EREPORT_PAYLOAD_ZFS_ZIO_FLAGS, &flags) == 0) &&
(flags & (ZIO_FLAG_SCRUB | ZIO_FLAG_RESILVER))) ||
((nvlist_lookup_int32(nvl,
FM_EREPORT_PAYLOAD_ZFS_ZIO_FLAGS, &flags32) == 0) &&
(flags32 & (ZIO_FLAG_SCRUB | ZIO_FLAG_RESILVER)))) {
fmd_hdl_debug(hdl, "ignoring '%s' for "
"scrub/resilver I/O", class);
return;
}
if (zcp->zc_data.zc_serd_checksum[0] == '\0') {
if (nvlist_lookup_uint64(nvl,
FM_EREPORT_PAYLOAD_ZFS_VDEV_CKSUM_N,
&checksum_n) != 0) {
checksum_n = DEFAULT_CHECKSUM_N;
}
if (nvlist_lookup_uint64(nvl,
FM_EREPORT_PAYLOAD_ZFS_VDEV_CKSUM_T,
&checksum_t) != 0) {
checksum_t = DEFAULT_CHECKSUM_T;
}
zfs_serd_name(zcp->zc_data.zc_serd_checksum,
pool_guid, vdev_guid, "checksum");
fmd_serd_create(hdl,
zcp->zc_data.zc_serd_checksum,
checksum_n,
SEC2NSEC(checksum_t));
zfs_case_serialize(zcp);
fmd_prop_get_int32(hdl, "checksum_N"),
fmd_prop_get_int64(hdl, "checksum_T"));
zfs_case_serialize(hdl, zcp);
}
if (zfs_fm_serd_record(hdl,
zcp->zc_data.zc_serd_checksum, ep, zcp,
"checksum")) {
if (fmd_serd_record(hdl,
zcp->zc_data.zc_serd_checksum, ep)) {
zfs_case_solve(hdl, zcp,
"fault.fs.zfs.vdev.checksum");
"fault.fs.zfs.vdev.checksum", B_FALSE);
}
} else if (fmd_nvl_class_match(hdl, nvl,
ZFS_MAKE_EREPORT(FM_EREPORT_ZFS_IO_FAILURE)) &&
@@ -1025,11 +817,12 @@ zfs_fm_recv(fmd_hdl_t *hdl, fmd_event_t *ep, nvlist_t *nvl, const char *class)
if (strncmp(failmode, FM_EREPORT_FAILMODE_CONTINUE,
strlen(FM_EREPORT_FAILMODE_CONTINUE)) == 0) {
zfs_case_solve(hdl, zcp,
"fault.fs.zfs.io_failure_continue");
"fault.fs.zfs.io_failure_continue",
B_FALSE);
} else if (strncmp(failmode, FM_EREPORT_FAILMODE_WAIT,
strlen(FM_EREPORT_FAILMODE_WAIT)) == 0) {
zfs_case_solve(hdl, zcp,
"fault.fs.zfs.io_failure_wait");
"fault.fs.zfs.io_failure_wait", B_FALSE);
}
} else if (fmd_nvl_class_match(hdl, nvl,
ZFS_MAKE_EREPORT(FM_EREPORT_ZFS_PROBE_FAILURE))) {
@@ -1051,7 +844,7 @@ zfs_fm_recv(fmd_hdl_t *hdl, fmd_event_t *ep, nvlist_t *nvl, const char *class)
zfs_remove_timeout);
if (!zcp->zc_data.zc_has_remove_timer) {
zcp->zc_data.zc_has_remove_timer = 1;
zfs_case_serialize(zcp);
zfs_case_serialize(hdl, zcp);
}
}
}
@@ -1061,13 +854,14 @@ zfs_fm_recv(fmd_hdl_t *hdl, fmd_event_t *ep, nvlist_t *nvl, const char *class)
* The timeout is fired when we diagnosed an I/O error, and it was not due to
* device removal (which would cause the timeout to be cancelled).
*/
/* ARGSUSED */
static void
zfs_fm_timeout(fmd_hdl_t *hdl, id_t id, void *data)
{
zfs_case_t *zcp = data;
if (id == zcp->zc_remove_timer)
zfs_case_solve(hdl, zcp, "fault.fs.zfs.vdev.io");
zfs_case_solve(hdl, zcp, "fault.fs.zfs.vdev.io", B_FALSE);
}
/*
@@ -1083,8 +877,6 @@ zfs_fm_close(fmd_hdl_t *hdl, fmd_case_t *cs)
fmd_serd_destroy(hdl, zcp->zc_data.zc_serd_checksum);
if (zcp->zc_data.zc_serd_io[0] != '\0')
fmd_serd_destroy(hdl, zcp->zc_data.zc_serd_io);
if (zcp->zc_data.zc_serd_slow_io[0] != '\0')
fmd_serd_destroy(hdl, zcp->zc_data.zc_serd_slow_io);
if (zcp->zc_data.zc_has_remove_timer)
fmd_timer_remove(hdl, zcp->zc_remove_timer);
@@ -1093,15 +885,30 @@ zfs_fm_close(fmd_hdl_t *hdl, fmd_case_t *cs)
fmd_hdl_free(hdl, zcp, sizeof (zfs_case_t));
}
/*
* We use the fmd gc entry point to look for old cases that no longer apply.
* This allows us to keep our set of case data small in a long running system.
*/
static void
zfs_fm_gc(fmd_hdl_t *hdl)
{
zfs_purge_cases(hdl);
}
static const fmd_hdl_ops_t fmd_ops = {
zfs_fm_recv, /* fmdo_recv */
zfs_fm_timeout, /* fmdo_timeout */
zfs_fm_close, /* fmdo_close */
NULL, /* fmdo_stats */
NULL, /* fmdo_gc */
zfs_fm_gc, /* fmdo_gc */
};
static const fmd_prop_t fmd_props[] = {
{ "checksum_N", FMD_TYPE_UINT32, "10" },
{ "checksum_T", FMD_TYPE_TIME, "10min" },
{ "io_N", FMD_TYPE_UINT32, "10" },
{ "io_T", FMD_TYPE_TIME, "10min" },
{ "remove_timeout", FMD_TYPE_TIME, "15sec" },
{ NULL, 0, NULL }
};
@@ -1142,6 +949,8 @@ _zfs_diagnosis_init(fmd_hdl_t *hdl)
(void) fmd_stat_create(hdl, FMD_STAT_NOALLOC, sizeof (zfs_stats) /
sizeof (fmd_stat_t), (fmd_stat_t *)&zfs_stats);
zfs_remove_timeout = fmd_prop_get_int64(hdl, "remove_timeout");
}
void
+102 -518
View File
File diff suppressed because it is too large Load Diff
+24 -153
View File
@@ -1,4 +1,3 @@
// SPDX-License-Identifier: CDDL-1.0
/*
* CDDL HEADER START
*
@@ -7,7 +6,7 @@
* You may not use this file except in compliance with the License.
*
* You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
* or https://opensource.org/licenses/CDDL-1.0.
* or http://www.opensolaris.org/os/licensing.
* See the License for the specific language governing permissions
* and limitations under the License.
*
@@ -39,10 +38,8 @@
#include <sys/fs/zfs.h>
#include <sys/fm/protocol.h>
#include <sys/fm/fs/zfs.h>
#include <libzutil.h>
#include <libzfs.h>
#include <string.h>
#include <libgen.h>
#include "zfs_agents.h"
#include "fmd_api.h"
@@ -77,8 +74,6 @@ typedef struct find_cbdata {
uint64_t cb_guid;
zpool_handle_t *cb_zhp;
nvlist_t *cb_vdev;
uint64_t cb_vdev_guid;
uint64_t cb_num_spares;
} find_cbdata_t;
static int
@@ -144,64 +139,6 @@ find_vdev(libzfs_handle_t *zhdl, nvlist_t *nv, uint64_t search_guid)
return (NULL);
}
static int
remove_spares(zpool_handle_t *zhp, void *data)
{
nvlist_t *config, *nvroot;
nvlist_t **spares;
uint_t nspares;
char *devname;
find_cbdata_t *cbp = data;
uint64_t spareguid = 0;
vdev_stat_t *vs;
unsigned int c;
config = zpool_get_config(zhp, NULL);
if (nvlist_lookup_nvlist(config,
ZPOOL_CONFIG_VDEV_TREE, &nvroot) != 0) {
zpool_close(zhp);
return (0);
}
if (nvlist_lookup_nvlist_array(nvroot, ZPOOL_CONFIG_SPARES,
&spares, &nspares) != 0) {
zpool_close(zhp);
return (0);
}
for (int i = 0; i < nspares; i++) {
if (nvlist_lookup_uint64(spares[i], ZPOOL_CONFIG_GUID,
&spareguid) == 0 && spareguid == cbp->cb_vdev_guid) {
devname = zpool_vdev_name(NULL, zhp, spares[i],
B_FALSE);
nvlist_lookup_uint64_array(spares[i],
ZPOOL_CONFIG_VDEV_STATS, (uint64_t **)&vs, &c);
if (vs->vs_state != VDEV_STATE_REMOVED &&
zpool_vdev_remove_wanted(zhp, devname) == 0)
cbp->cb_num_spares++;
break;
}
}
zpool_close(zhp);
return (0);
}
/*
* Given a vdev guid, find and remove all spares associated with it.
*/
static int
find_and_remove_spares(libzfs_handle_t *zhdl, uint64_t vdev_guid)
{
find_cbdata_t cb;
cb.cb_num_spares = 0;
cb.cb_vdev_guid = vdev_guid;
zpool_iter(zhdl, remove_spares, &cb);
return (cb.cb_num_spares);
}
/*
* Given a (pool, vdev) GUID pair, find the matching pool and vdev.
*/
@@ -282,31 +219,25 @@ replace_with_spare(fmd_hdl_t *hdl, zpool_handle_t *zhp, nvlist_t *vdev)
* replace it.
*/
for (s = 0; s < nspares; s++) {
boolean_t rebuild = B_FALSE;
const char *spare_name, *type;
char *spare_name;
if (nvlist_lookup_string(spares[s], ZPOOL_CONFIG_PATH,
&spare_name) != 0)
continue;
/* prefer sequential resilvering for distributed spares */
if ((nvlist_lookup_string(spares[s], ZPOOL_CONFIG_TYPE,
&type) == 0) && strcmp(type, VDEV_TYPE_DRAID_SPARE) == 0)
rebuild = B_TRUE;
/* if set, add the "ashift" pool property to the spare nvlist */
if (source != ZPROP_SRC_DEFAULT)
(void) nvlist_add_uint64(spares[s],
ZPOOL_CONFIG_ASHIFT, ashift);
(void) nvlist_add_nvlist_array(replacement,
ZPOOL_CONFIG_CHILDREN, (const nvlist_t **)&spares[s], 1);
ZPOOL_CONFIG_CHILDREN, &spares[s], 1);
fmd_hdl_debug(hdl, "zpool_vdev_replace '%s' with spare '%s'",
dev_name, zfs_basename(spare_name));
dev_name, basename(spare_name));
if (zpool_vdev_attach(zhp, dev_name, spare_name,
replacement, B_TRUE, rebuild) == 0) {
replacement, B_TRUE) == 0) {
free(dev_name);
nvlist_free(replacement);
return (B_TRUE);
@@ -324,6 +255,7 @@ replace_with_spare(fmd_hdl_t *hdl, zpool_handle_t *zhp, nvlist_t *vdev)
* ASRU is now usable. ZFS has found the device to be present and
* functioning.
*/
/*ARGSUSED*/
static void
zfs_vdev_repair(fmd_hdl_t *hdl, nvlist_t *nvl)
{
@@ -362,11 +294,11 @@ zfs_vdev_repair(fmd_hdl_t *hdl, nvlist_t *nvl)
vdev_guid, pool_guid);
}
/*ARGSUSED*/
static void
zfs_retire_recv(fmd_hdl_t *hdl, fmd_event_t *ep, nvlist_t *nvl,
const char *class)
{
(void) ep;
uint64_t pool_guid, vdev_guid;
zpool_handle_t *zhp;
nvlist_t *resource, *fault;
@@ -376,61 +308,30 @@ zfs_retire_recv(fmd_hdl_t *hdl, fmd_event_t *ep, nvlist_t *nvl,
libzfs_handle_t *zhdl = zdp->zrd_hdl;
boolean_t fault_device, degrade_device;
boolean_t is_repair;
boolean_t l2arc = B_FALSE;
boolean_t spare = B_FALSE;
const char *scheme;
char *scheme;
nvlist_t *vdev = NULL;
const char *uuid;
char *uuid;
int repair_done = 0;
boolean_t retire;
boolean_t is_disk;
vdev_aux_t aux;
uint64_t state = 0;
vdev_stat_t *vs;
unsigned int c;
fmd_hdl_debug(hdl, "zfs_retire_recv: '%s'", class);
(void) nvlist_lookup_uint64(nvl, FM_EREPORT_PAYLOAD_ZFS_VDEV_STATE,
&state);
/*
* If this is a resource notifying us of device removal then simply
* check for an available spare and continue unless the device is a
* l2arc vdev, in which case we just offline it.
*/
if (strcmp(class, "resource.fs.zfs.removed") == 0 ||
(strcmp(class, "resource.fs.zfs.statechange") == 0 &&
(state == VDEV_STATE_REMOVED || state == VDEV_STATE_FAULTED))) {
const char *devtype;
if (strcmp(class, "resource.fs.zfs.removed") == 0) {
char *devtype;
char *devname;
boolean_t skip_removal = B_FALSE;
if (nvlist_lookup_string(nvl, FM_EREPORT_PAYLOAD_ZFS_VDEV_TYPE,
&devtype) == 0) {
if (strcmp(devtype, VDEV_TYPE_SPARE) == 0)
spare = B_TRUE;
else if (strcmp(devtype, VDEV_TYPE_L2CACHE) == 0)
l2arc = B_TRUE;
}
if (nvlist_lookup_uint64(nvl,
FM_EREPORT_PAYLOAD_ZFS_VDEV_GUID, &vdev_guid) != 0)
return;
if (vdev_guid == 0) {
fmd_hdl_debug(hdl, "Got a zero GUID");
return;
}
if (spare) {
int nspares = find_and_remove_spares(zhdl, vdev_guid);
fmd_hdl_debug(hdl, "%d spares removed", nspares);
return;
}
if (nvlist_lookup_uint64(nvl, FM_EREPORT_PAYLOAD_ZFS_POOL_GUID,
&pool_guid) != 0)
&pool_guid) != 0 ||
nvlist_lookup_uint64(nvl, FM_EREPORT_PAYLOAD_ZFS_VDEV_GUID,
&vdev_guid) != 0)
return;
if ((zhp = find_by_guid(zhdl, pool_guid, vdev_guid,
@@ -439,40 +340,13 @@ zfs_retire_recv(fmd_hdl_t *hdl, fmd_event_t *ep, nvlist_t *nvl,
devname = zpool_vdev_name(NULL, zhp, vdev, B_FALSE);
nvlist_lookup_uint64_array(vdev, ZPOOL_CONFIG_VDEV_STATS,
(uint64_t **)&vs, &c);
if (vs->vs_state == VDEV_STATE_OFFLINE)
return;
/*
* If state removed is requested for already removed vdev,
* its a loopback event from spa_async_remove(). Just
* ignore it.
*/
if ((vs->vs_state == VDEV_STATE_REMOVED &&
state == VDEV_STATE_REMOVED)) {
if (strcmp(class, "resource.fs.zfs.removed") == 0 &&
nvlist_exists(nvl, "by_kernel")) {
skip_removal = B_TRUE;
} else {
return;
}
}
/* Remove the vdev since device is unplugged */
int remove_status = 0;
if (!skip_removal && (l2arc ||
(strcmp(class, "resource.fs.zfs.removed") == 0))) {
remove_status = zpool_vdev_remove_wanted(zhp, devname);
fmd_hdl_debug(hdl, "zpool_vdev_remove_wanted '%s'"
", err:%d", devname, libzfs_errno(zhdl));
}
/* Replace the vdev with a spare if its not a l2arc */
if (!l2arc && !remove_status &&
(!fmd_prop_get_int32(hdl, "spare_on_remove") ||
replace_with_spare(hdl, zhp, vdev) == B_FALSE)) {
/* Can't replace l2arc with a spare: offline the device */
if (nvlist_lookup_string(nvl, FM_EREPORT_PAYLOAD_ZFS_VDEV_TYPE,
&devtype) == 0 && strcmp(devtype, VDEV_TYPE_L2CACHE) == 0) {
fmd_hdl_debug(hdl, "zpool_vdev_offline '%s'", devname);
zpool_vdev_offline(zhp, devname, B_TRUE);
} else if (!fmd_prop_get_int32(hdl, "spare_on_remove") ||
replace_with_spare(hdl, zhp, vdev) == B_FALSE) {
/* Could not handle with spare */
fmd_hdl_debug(hdl, "no spare for '%s'", devname);
}
@@ -486,11 +360,12 @@ zfs_retire_recv(fmd_hdl_t *hdl, fmd_event_t *ep, nvlist_t *nvl,
return;
/*
* Note: on Linux statechange events are more than just
* Note: on zfsonlinux statechange events are more than just
* healthy ones so we need to confirm the actual state value.
*/
if (strcmp(class, "resource.fs.zfs.statechange") == 0 &&
state == VDEV_STATE_HEALTHY) {
nvlist_lookup_uint64(nvl, FM_EREPORT_PAYLOAD_ZFS_VDEV_STATE,
&state) == 0 && state == VDEV_STATE_HEALTHY) {
zfs_vdev_repair(hdl, nvl);
return;
}
@@ -535,9 +410,6 @@ zfs_retire_recv(fmd_hdl_t *hdl, fmd_event_t *ep, nvlist_t *nvl,
} else if (fmd_nvl_class_match(hdl, fault,
"fault.fs.zfs.vdev.checksum")) {
degrade_device = B_TRUE;
} else if (fmd_nvl_class_match(hdl, fault,
"fault.fs.zfs.vdev.slow_io")) {
degrade_device = B_TRUE;
} else if (fmd_nvl_class_match(hdl, fault,
"fault.fs.zfs.device")) {
fault_device = B_FALSE;
@@ -624,7 +496,6 @@ zfs_retire_recv(fmd_hdl_t *hdl, fmd_event_t *ep, nvlist_t *nvl,
* Attempt to substitute a hot spare.
*/
(void) replace_with_spare(hdl, zhp, vdev);
zpool_close(zhp);
}
+23 -53
View File
@@ -1,10 +1,9 @@
// SPDX-License-Identifier: CDDL-1.0
/*
* This file is part of the ZFS Event Daemon (ZED).
*
* This file is part of the ZFS Event Daemon (ZED)
* for ZFS on Linux (ZoL) <http://zfsonlinux.org/>.
* Developed at Lawrence Livermore National Laboratory (LLNL-CODE-403049).
* Copyright (C) 2013-2014 Lawrence Livermore National Security, LLC.
* Refer to the OpenZFS git commit log for authoritative copyright attribution.
* Refer to the ZoL git commit log for authoritative copyright attribution.
*
* The contents of this file are subject to the terms of the
* Common Development and Distribution License Version 1.0 (CDDL-1.0).
@@ -37,7 +36,6 @@ static volatile sig_atomic_t _got_hup = 0;
static void
_exit_handler(int signum)
{
(void) signum;
_got_exit = 1;
}
@@ -47,7 +45,6 @@ _exit_handler(int signum)
static void
_hup_handler(int signum)
{
(void) signum;
_got_hup = 1;
}
@@ -63,8 +60,8 @@ _setup_sig_handlers(void)
zed_log_die("Failed to initialize sigset");
sa.sa_flags = SA_RESTART;
sa.sa_handler = SIG_IGN;
if (sigaction(SIGPIPE, &sa, NULL) < 0)
zed_log_die("Failed to ignore SIGPIPE");
@@ -78,10 +75,6 @@ _setup_sig_handlers(void)
sa.sa_handler = _hup_handler;
if (sigaction(SIGHUP, &sa, NULL) < 0)
zed_log_die("Failed to register SIGHUP handler");
(void) sigaddset(&sa.sa_mask, SIGCHLD);
if (pthread_sigmask(SIG_BLOCK, &sa.sa_mask, NULL) < 0)
zed_log_die("Failed to block SIGCHLD");
}
/*
@@ -219,20 +212,22 @@ _finish_daemonize(void)
int
main(int argc, char *argv[])
{
struct zed_conf zcp;
struct zed_conf *zcp;
uint64_t saved_eid;
int64_t saved_etime[2];
zed_log_init(argv[0]);
zed_log_stderr_open(LOG_NOTICE);
zed_conf_init(&zcp);
zed_conf_parse_opts(&zcp, argc, argv);
if (zcp.do_verbose)
zcp = zed_conf_create();
zed_conf_parse_opts(zcp, argc, argv);
if (zcp->do_verbose)
zed_log_stderr_open(LOG_INFO);
if (geteuid() != 0)
zed_log_die("Must be run as root");
zed_conf_parse_file(zcp);
zed_file_close_from(STDERR_FILENO + 1);
(void) umask(0);
@@ -240,72 +235,47 @@ main(int argc, char *argv[])
if (chdir("/") < 0)
zed_log_die("Failed to change to root directory");
if (zed_conf_scan_dir(&zcp) < 0)
if (zed_conf_scan_dir(zcp) < 0)
exit(EXIT_FAILURE);
if (!zcp.do_foreground) {
if (!zcp->do_foreground) {
_start_daemonize();
zed_log_syslog_open(LOG_DAEMON);
}
_setup_sig_handlers();
if (zcp.do_memlock)
if (zcp->do_memlock)
_lock_memory();
if ((zed_conf_write_pid(&zcp) < 0) && (!zcp.do_force))
if ((zed_conf_write_pid(zcp) < 0) && (!zcp->do_force))
exit(EXIT_FAILURE);
if (!zcp.do_foreground)
if (!zcp->do_foreground)
_finish_daemonize();
zed_log_msg(LOG_NOTICE,
"ZFS Event Daemon %s-%s (PID %d)",
ZFS_META_VERSION, ZFS_META_RELEASE, (int)getpid());
if (zed_conf_open_state(&zcp) < 0)
if (zed_conf_open_state(zcp) < 0)
exit(EXIT_FAILURE);
if (zed_conf_read_state(&zcp, &saved_eid, saved_etime) < 0)
if (zed_conf_read_state(zcp, &saved_eid, saved_etime) < 0)
exit(EXIT_FAILURE);
idle:
/*
* If -I is specified, attempt to open /dev/zfs repeatedly until
* successful.
*/
do {
if (!zed_event_init(&zcp))
break;
/* Wait for some time and try again. tunable? */
sleep(30);
} while (!_got_exit && zcp.do_idle);
if (_got_exit)
goto out;
zed_event_seek(&zcp, saved_eid, saved_etime);
zed_event_init(zcp);
zed_event_seek(zcp, saved_eid, saved_etime);
while (!_got_exit) {
int rv;
if (_got_hup) {
_got_hup = 0;
(void) zed_conf_scan_dir(&zcp);
(void) zed_conf_scan_dir(zcp);
}
rv = zed_event_service(&zcp);
/* ENODEV: When kernel module is unloaded (osx) */
if (rv != 0)
break;
zed_event_service(zcp);
}
zed_log_msg(LOG_NOTICE, "Exiting");
zed_event_fini(&zcp);
if (zcp.do_idle && !_got_exit)
goto idle;
out:
zed_conf_destroy(&zcp);
zed_event_fini(zcp);
zed_conf_destroy(zcp);
zed_log_fini();
exit(EXIT_SUCCESS);
}
+39 -41
View File
@@ -1,59 +1,57 @@
include $(top_srcdir)/config/Rules.am
EXTRA_DIST = \
README \
history_event-zfs-list-cacher.sh.in
zedconfdir = $(sysconfdir)/zfs/zed.d
dist_zedconf_DATA = \
%D%/zed-functions.sh \
%D%/zed.rc
zed-functions.sh \
zed.rc
zedexecdir = $(zfsexecdir)/zed.d
dist_zedexec_SCRIPTS = \
%D%/all-debug.sh \
%D%/all-syslog.sh \
%D%/data-notify.sh \
%D%/deadman-sync-slot_off.sh \
%D%/generic-notify.sh \
%D%/pool_import-sync-led.sh \
%D%/resilver_finish-notify.sh \
%D%/resilver_finish-start-scrub.sh \
%D%/scrub_finish-notify.sh \
%D%/statechange-sync-led.sh \
%D%/statechange-notify.sh \
%D%/statechange-sync-slot_off.sh \
%D%/trim_finish-notify.sh \
%D%/vdev_attach-sync-led.sh \
%D%/vdev_clear-sync-led.sh
all-debug.sh \
all-syslog.sh \
data-notify.sh \
generic-notify.sh \
resilver_finish-notify.sh \
scrub_finish-notify.sh \
statechange-led.sh \
statechange-notify.sh \
vdev_clear-led.sh \
vdev_attach-led.sh \
pool_import-led.sh \
resilver_finish-start-scrub.sh
nodist_zedexec_SCRIPTS = \
%D%/history_event-zfs-list-cacher.sh
nodist_zedexec_SCRIPTS = history_event-zfs-list-cacher.sh
SUBSTFILES += $(nodist_zedexec_SCRIPTS)
$(nodist_zedexec_SCRIPTS): %: %.in
-$(SED) -e 's,@bindir\@,$(bindir),g' \
-e 's,@runstatedir\@,$(runstatedir),g' \
-e 's,@sbindir\@,$(sbindir),g' \
-e 's,@sysconfdir\@,$(sysconfdir),g' \
$< >'$@'
zedconfdefaults = \
all-syslog.sh \
data-notify.sh \
deadman-sync-slot_off.sh \
history_event-zfs-list-cacher.sh \
pool_import-sync-led.sh \
resilver_finish-notify.sh \
resilver_finish-start-scrub.sh \
scrub_finish-notify.sh \
statechange-sync-led.sh \
statechange-led.sh \
statechange-notify.sh \
statechange-sync-slot_off.sh \
vdev_attach-sync-led.sh \
vdev_clear-sync-led.sh
vdev_clear-led.sh \
vdev_attach-led.sh \
pool_import-led.sh \
resilver_finish-start-scrub.sh
dist_noinst_DATA += %D%/README
INSTALL_DATA_HOOKS += zed-install-data-hook
zed-install-data-hook:
install-data-hook:
$(MKDIR_P) "$(DESTDIR)$(zedconfdir)"
set -x; for f in $(zedconfdefaults); do \
[ -f "$(DESTDIR)$(zedconfdir)/$${f}" ] ||\
[ -L "$(DESTDIR)$(zedconfdir)/$${f}" ] || \
$(LN_S) "$(zedexecdir)/$${f}" "$(DESTDIR)$(zedconfdir)"; \
for f in $(zedconfdefaults); do \
test -f "$(DESTDIR)$(zedconfdir)/$${f}" -o \
-L "$(DESTDIR)$(zedconfdir)/$${f}" || \
ln -s "$(zedexecdir)/$${f}" "$(DESTDIR)$(zedconfdir)"; \
done
SHELLCHECKSCRIPTS += $(dist_zedconf_DATA) $(dist_zedexec_SCRIPTS) $(nodist_zedexec_SCRIPTS)
$(call SHELLCHECK_OPTS,$(dist_zedconf_DATA) $(dist_zedexec_SCRIPTS) $(nodist_zedexec_SCRIPTS)): SHELLCHECK_SHELL = sh
# False positive: 1>&"${ZED_FLOCK_FD}" looks suspiciously similar to a >&filename bash extension
$(call SHELLCHECK_OPTS,$(dist_zedconf_DATA) $(dist_zedexec_SCRIPTS) $(nodist_zedexec_SCRIPTS)): CHECKBASHISMS_IGNORE = -e 'should be >word 2>&1' -e '&"$${ZED_FLOCK_FD}"'
chmod 0600 "$(DESTDIR)$(zedconfdir)/zed.rc"
+10 -7
View File
@@ -1,5 +1,4 @@
#!/bin/sh
# shellcheck disable=SC2154
#
# Log all environment variables to ZED_DEBUG_LOG.
#
@@ -13,11 +12,15 @@
zed_exit_if_ignoring_this_event
zed_lock "${ZED_DEBUG_LOG}"
{
env | sort
echo
} 1>&"${ZED_FLOCK_FD}"
zed_unlock "${ZED_DEBUG_LOG}"
lockfile="$(basename -- "${ZED_DEBUG_LOG}").lock"
umask 077
zed_lock "${lockfile}"
exec >> "${ZED_DEBUG_LOG}"
printenv | sort
echo
exec >&-
zed_unlock "${lockfile}"
exit 0
+4 -42
View File
@@ -1,52 +1,14 @@
#!/bin/sh
# shellcheck disable=SC2154
#
# Copyright (C) 2013-2014 Lawrence Livermore National Security, LLC.
# Copyright (c) 2020 by Delphix. All rights reserved.
#
#
# Log the zevent via syslog.
#
[ -f "${ZED_ZEDLET_DIR}/zed.rc" ] && . "${ZED_ZEDLET_DIR}/zed.rc"
. "${ZED_ZEDLET_DIR}/zed-functions.sh"
zed_exit_if_ignoring_this_event
# build a string of name=value pairs for this event
msg="eid=${ZEVENT_EID} class=${ZEVENT_SUBCLASS}"
if [ "${ZED_SYSLOG_DISPLAY_GUIDS}" = "1" ]; then
[ -n "${ZEVENT_POOL_GUID}" ] && msg="${msg} pool_guid=${ZEVENT_POOL_GUID}"
[ -n "${ZEVENT_VDEV_GUID}" ] && msg="${msg} vdev_guid=${ZEVENT_VDEV_GUID}"
else
[ -n "${ZEVENT_POOL}" ] && msg="${msg} pool='${ZEVENT_POOL}'"
[ -n "${ZEVENT_VDEV_PATH}" ] && msg="${msg} vdev=${ZEVENT_VDEV_PATH##*/}"
fi
# log pool state if state is anything other than 'ACTIVE'
[ -n "${ZEVENT_POOL_STATE_STR}" ] && [ "$ZEVENT_POOL_STATE" -ne 0 ] && \
msg="${msg} pool_state=${ZEVENT_POOL_STATE_STR}"
# Log the following payload nvpairs if they are present
[ -n "${ZEVENT_VDEV_STATE_STR}" ] && msg="${msg} vdev_state=${ZEVENT_VDEV_STATE_STR}"
[ -n "${ZEVENT_CKSUM_ALGORITHM}" ] && msg="${msg} algorithm=${ZEVENT_CKSUM_ALGORITHM}"
[ -n "${ZEVENT_ZIO_SIZE}" ] && msg="${msg} size=${ZEVENT_ZIO_SIZE}"
[ -n "${ZEVENT_ZIO_OFFSET}" ] && msg="${msg} offset=${ZEVENT_ZIO_OFFSET}"
[ -n "${ZEVENT_ZIO_PRIORITY}" ] && msg="${msg} priority=${ZEVENT_ZIO_PRIORITY}"
[ -n "${ZEVENT_ZIO_ERR}" ] && msg="${msg} err=${ZEVENT_ZIO_ERR}"
[ -n "${ZEVENT_ZIO_FLAGS}" ] && msg="${msg} flags=$(printf '0x%x' "${ZEVENT_ZIO_FLAGS}")"
# log delays that are >= 10 milisec
[ -n "${ZEVENT_ZIO_DELAY}" ] && [ "$ZEVENT_ZIO_DELAY" -gt 10000000 ] && \
msg="${msg} delay=$((ZEVENT_ZIO_DELAY / 1000000))ms"
# list the bookmark data together
# shellcheck disable=SC2153
[ -n "${ZEVENT_ZIO_OBJSET}" ] && \
msg="${msg} bookmark=${ZEVENT_ZIO_OBJSET}:${ZEVENT_ZIO_OBJECT}:${ZEVENT_ZIO_LEVEL}:${ZEVENT_ZIO_BLKID}"
zed_log_msg "${msg}"
zed_log_msg "eid=${ZEVENT_EID}" "class=${ZEVENT_SUBCLASS}" \
"${ZEVENT_POOL_GUID:+"pool_guid=${ZEVENT_POOL_GUID}"}" \
"${ZEVENT_VDEV_PATH:+"vdev_path=${ZEVENT_VDEV_PATH}"}" \
"${ZEVENT_VDEV_STATE_STR:+"vdev_state=${ZEVENT_VDEV_STATE_STR}"}"
exit 0
+1 -2
View File
@@ -1,5 +1,4 @@
#!/bin/sh
# shellcheck disable=SC2154
#
# Send notification in response to a DATA error.
#
@@ -26,7 +25,7 @@ zed_rate_limit "${rate_limit_tag}" || exit 3
umask 077
note_subject="ZFS ${ZEVENT_SUBCLASS} error for ${ZEVENT_POOL} on $(hostname)"
note_pathname="$(mktemp)"
note_pathname="${TMPDIR:="/tmp"}/$(basename -- "$0").${ZEVENT_EID}.$$"
{
echo "ZFS has detected a data error:"
echo

Some files were not shown because too many files have changed in this diff Show More