Compare commits

270 Commits

Author SHA1 Message Date
Tony Hutter a8c2b7ebc6 Tag zfs-0.7.13
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2019-02-22 09:47:55 -08:00
John Wren Kennedy 2af898ee24 test-runner: python3 support
Updated to be compatible with Python 2.6, 2.7, 3.5 or newer.

Reviewed-by: John Ramsden <johnramsden@riseup.net>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: John Wren Kennedy <john.kennedy@delphix.com>
Closes #8096
2019-02-22 09:47:34 -08:00
Gregor Kopka c32c2f17d0 Fix flake 8 style warnings
Ran zts-report.py and test-runner.py from ./tests/test-runner/bin/
through the 2to3 (https://docs.python.org/2/library/2to3.html).
Checked the result, fixed:
- 'maxint' -> 'maxsize' that 2to3 missed.
- 'cmp=' parameter for a 'sorted()' with a 'key=' version.
- try/except wrapping of configparser import as there are still
  python 2.7 systems that lack a compatibility shim
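
The three manual fixes listed above can be sketched as follows. This is a minimal illustration and not the code from zts-report.py or test-runner.py; the sample names and the helper `by_length_then_name` are made up for the example.

```python
import sys

# configparser was named ConfigParser on Python 2; fall back to the old
# name when the Python 3 name (or a compatibility shim) is unavailable.
try:
    import configparser
except ImportError:
    import ConfigParser as configparser

# sys.maxint no longer exists on Python 3; sys.maxsize exists on both.
LARGEST = sys.maxsize


# sorted() lost its cmp= parameter in Python 3. A key= function that
# returns a tuple expresses the same ordering on Python 2 and 3.
def by_length_then_name(name):
    return (len(name), name)


tests = ["zpool_add", "zfs_get", "zdb", "zfs_set"]
ordered = sorted(tests, key=by_length_then_name)
```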

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gregor Kopka <gregor@kopka.net>
Closes #7925
Closes #7952
2019-02-22 09:47:34 -08:00
Tony Hutter 2254b2bbbe GCC 9.0: Fix ztest "directive argument is not a nul-terminated string"
GCC 9.0 is complaining because we're trying to print strings that
are defined like this:

.zo_pool = { 'z', 't', 'e', 's', 't', '\0' },

Fix them by making them actual strings.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #8330
2019-02-22 09:47:34 -08:00
Brian Behlendorf 5c4ec382a7 Linux 5.0 compat: Fix bio_set_dev()
The Linux 5.0 kernel updated the bio_set_dev() macro so it calls the
GPL-only bio_associate_blkg() symbol thus inadvertently converting
the entire macro.  Provide a minimal version which always assigns the
request queue's root_blkg to the bio.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #8287
2019-02-22 09:47:34 -08:00
Tony Hutter e22bfd8149 Linux 5.0 compat: Disable vector instructions on 5.0+ kernels
The 5.0 kernel no longer exports the functions we need to do vector
(SSE/SSE2/SSE3/AVX...) instructions.  Disable vector-based checksum
algorithms when building against those kernels.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #8259
2019-02-22 09:47:34 -08:00
Tony Hutter f45ad7bff6 Linux 5.0 compat: Fix SUBDIRs
SUBDIRs has been deprecated for a long time, and was finally removed in
the 5.0 kernel.  Use "M=" instead.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #8257
2019-02-22 09:47:34 -08:00
Tony Hutter 0a3a4d067a Linux 5.0 compat: Convert MS_* macros to SB_*
In the 5.0 kernel, only the mount namespace code should use the MS_*
macros. Filesystems should use the SB_* ones.

https://patchwork.kernel.org/patch/10552493/

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #8264
2019-02-22 09:47:34 -08:00
Tony Hutter ba8024a284 Linux 5.0 compat: Use totalram_pages()
totalram_pages was converted to an atomic variable in 5.0:

https://patchwork.kernel.org/patch/10652795/

Its value should now be read through the totalram_pages() helper
function.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #8263
2019-02-22 09:47:34 -08:00
Tony Hutter edc2675aed Linux 5.0 compat: access_ok() drops 'type' parameter
access_ok no longer needs a 'type' parameter in the 5.0 kernel.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #8261
2019-02-22 09:47:34 -08:00
ilbsmart 98bb45e27a deadlock between mm_sem and tx assign in zfs_write() and page fault
The sequence leading to the deadlock:
1. Thread #1 calls `zfs_write` and is assigned txg "n".
2. Thread #2, in the same process, takes an mmap page fault (which
   means `mm_sem` is held). `zfs_dirty_inode` fails to open a txg
   and waits for the previous txg "n" to complete.
3. Thread #1 calls `uiomove` to write, but a page fault occurs in
   `uiomove`, which needs `mm_sem`. Since `mm_sem` is held by
   thread #2, thread #1 blocks, so txg "n" never completes.

So thread #1 and thread #2 are deadlocked.

Reviewed-by: Chunwei Chen <tuxoko@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Grady Wong <grady.w@xtaotech.com>
Closes #7939
2019-02-22 09:47:34 -08:00
Neal Gompa (ニール・ゴンパ) 44f463824b dkms: Enable debuginfo option to be set with zfs sysconfig file
On some Linux distributions, the kernel module build will not
default to building with debuginfo symbols, which can make it
difficult for debugging and testing.

For this case, we provide a flag to override the build to force
debuginfo to be produced for the kernel module build.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Neal Gompa <ngompa@datto.com>
Co-authored-by: Simon Watson <swatson@datto.com>
Signed-off-by: Neal Gompa <ngompa@datto.com>
Signed-off-by: Simon Watson <swatson@datto.com>
Closes #8304
2019-02-22 09:47:34 -08:00
Neal Gompa (ニール・ゴンパ) b0d579bc55 Bump commit subject length to 72 characters
There's not really a reason to keep the subject length so short,
since the reason to make it this short was for making nice renders
of a summary list of the git log. With 72 characters, this still
works out fine, so let's just raise it to that so that it's easier
to give slightly more descriptive change summaries.

Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Neal Gompa <ngompa@datto.com>
Closes #8250
2019-02-22 09:47:34 -08:00
Benjamin Gentil 7e5def8ae0 zfs.8 uses wrong snapshot names in Example 15
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: bunder2015 <omfgbunder@gmail.com>
Signed-off-by: Benjamin Gentil <benjamin@gentil.io>
Closes #8241
2019-02-22 09:47:34 -08:00
Tony Hutter 89019a846b Add enclosure_symlinks option to vdev_id
Add an 'enclosure_symlinks' option to vdev_id.conf.  This creates
consistently named symlinks to the enclosure devices (/dev/sg*) based
off the configuration in vdev_id.conf.  The enclosure symlinks show
up in /dev/by-enclosure/<prefix>-<channel><num>.  The links make it
easy to run sg_ses on a particular enclosure device.  The
enclosure links are created in addition to the normal
/dev/disk/by-vdev links.

'enclosure_symlinks' is only valid in sas_direct configurations.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Simon Guest <simon.guest@tesujimath.org>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #8194
2019-02-22 09:47:34 -08:00
Simon Guest 41f7723e9c vdev_id: new slot type ses
This extends vdev_id to support a new slot type, ses, for SCSI Enclosure
Services.  With slot type ses, the disk slot numbers are determined by
using the device slot number reported by sg_ses for the device with
matching SAS address, found by querying all available enclosures.

This is primarily of use on systems with a deficient driver omitting
support for bay_identifier in /sys/devices.  In my testing, I found that
the existing slot types of port and id were not stable across disk
replacement, so an alternative was required.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Simon Guest <simon.guest@tesujimath.org>
Closes #6956
2019-02-22 09:47:34 -08:00
Simon Guest 2b8c3cb0c8 vdev_id: extension for new scsi topology
On systems with SCSI rather than SAS disk topology, this change enables
the vdev_id script to match against the block device path, and therefore
create a vdev alias in /dev/disk/by-vdev.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Simon Guest <simon.guest@tesujimath.org>
Closes #6592
2019-02-22 09:47:34 -08:00
Olaf Faaland f325d76e96 Rename macro ZFS_MINOR due to Lustre conflict
Macro ZFS_MINOR, introduced in commit a6cc9756 to record the chosen
static minor number for /dev/zfs, conflicts with an existing macro
in Lustre.  The lustre macro (along with _MAJOR, _PATCH, _FIX) is
used to record the zfsonlinux version Lustre is being built against.

Since the Lustre macro came first, and is used in past versions of
lustre at least going back to 2.10, it makes sense to rename the
macro in ZFS instead of doing so in Lustre which would require
backporting the patch.

Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #8195
2019-02-22 09:47:34 -08:00
Brian Behlendorf e3fb781c5f Add kernel module auto-loading
Historically a dynamic misc minor number was registered for the
/dev/zfs device in order to prevent minor number collisions.  This
was fine but it prevented us from being able to use the kernel
module auto-loader, which requires a known reserved value.

Resolve this issue by adding a configure test to find an available
misc minor number which can then be used in MODULE_ALIAS_MISCDEV at
build time.  By adding this alias the zfs kmod is added to the list
of known static-nodes and the systemd-tmpfiles-setup-dev service
will create a /dev/zfs character device at boot time.

This in turn allows us to update the 90-zfs.rules file to make it
aware this is a static node.  The upshot of this is that whenever
a process (zpool, zfs, zed) opens /dev/zfs the kmods will be
automatically loaded.  This even works for unprivileged users so there
is no longer a need to manually load the modules at boot time.

As an additional bonus the zed now no longer needs to start after
the zfs-import.service since it will trigger the module load.

In the unlikely event the minor number we selected conflicts with
another out of tree unregistered minor number the code falls back
to dynamically allocating it.  In this case the modules again
must be manually loaded.

Note that due to the change in the method of registering the minor
number the zimport.sh test case may incorrectly fail when the
static node for the installed packages is created instead of the
dynamic one.  This issue will only transiently impact zimport.sh
for this single commit when we transition and are mixing and
matching methods.

Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
TEST_ZIMPORT_SKIP="yes"
Closes #7287
2019-02-22 09:47:34 -08:00
Ben Wolsieffer 14a5e48fb9 Use autoconf variable for C preprocessor
This fixes the build when cross-compiling, where the preprocessor might
be prefixed.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Ben Wolsieffer <benwolsieffer@gmail.com>
Closes #8180
2019-02-22 09:47:34 -08:00
Matthew Ahrens 01937958ce OpenZFS 9577 - remove zfs_dbuf_evict_key tsd
The zfs_dbuf_evict_key TSD (thread-specific data) is not necessary -
we can instead pass a flag down in a few places to prevent recursive
dbuf eviction. Making this change has 3 benefits:

1. The code semantics are easier to understand.
2. On Linux, performance is improved, because creating/removing
   TSD values (by setting to NULL vs non-NULL) is expensive, and
   we do it very often.
3. According to Nexenta, the current semantics can cause a
   deadlock when concurrently calling dmu_objset_evict_dbufs()
   (which is rare today, but they are working on a "parallel
   unmount" change that triggers this more easily):

Porting Notes:
* Minor conflict with OpenZFS 9337 which has not yet been ported.

Authored by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com>
Reviewed by: Brad Lewis <brad.lewis@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>

OpenZFS-issue: https://illumos.org/issues/9577
OpenZFS-commit: https://github.com/openzfs/openzfs/pull/645
External-issue: DLPX-58547
Closes #7602
2019-02-22 09:47:34 -08:00
LOLi edb504f9db Honor --with-mounthelperdir where applicable
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6962
2019-02-22 09:47:34 -08:00
LOLi 2428fbbfcf contrib/initramfs: switch to automake
Use automake to build initramfs scripts and hooks.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6761
2019-02-22 09:47:33 -08:00
Tony Hutter 16d298188f Tag zfs-0.7.12
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2018-11-08 14:38:37 -08:00
Tony Hutter f42f8702ce Add BuildRequires gcc, make, elfutils-libelf-devel
This adds a BuildRequires for gcc, make, and elfutils-libelf-devel
into our spec files.  gcc has been a packaging requirement for
a while now:

https://fedoraproject.org/wiki/Packaging:C_and_C%2B%2B

These additional BuildRequires allow us to mock build in
Fedora 29.

Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by:  Tony Hutter <hutter2@llnl.gov>
Closes #8095
Closes #8102
2018-11-08 14:38:28 -08:00
Brian Behlendorf 9e58d5ef38 Fix flake8 "invalid escape sequence 'x'" warning
From, https://lintlyci.github.io/Flake8Rules/rules/W605.html

As of Python 3.6, a backslash-character pair that is not a valid
escape sequence now generates a DeprecationWarning. Although this
will eventually become a SyntaxError, that will not be for several
Python releases.

Note 'float_pobj' was simply removed from arcstat.py since it
was entirely unused.
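
A minimal sketch of the usual W605 fix, assuming the offending literal is a regular expression. The pattern and sample text are invented for illustration and are not taken from the scripts this commit touches.

```python
import re

# A backslash followed by a character that is not a recognised string
# escape (for example backslash-d) triggers W605 in a plain literal.
# A raw string hands the backslash to the regex engine unchanged.
DIGITS = re.compile(r"\d+")

# Doubling the backslash in a plain literal yields the same characters.
same_pattern = (r"\d+" == "\\d+")

match = DIGITS.search("txg 4242 synced")
number = match.group(0)
```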

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #8056
2018-11-08 14:38:28 -08:00
Brian Behlendorf 320f9de8ab ZTS: Update O_TMPFILE support check
In CentOS 7.5 the kernel provided a compatibility wrapper to support
O_TMPFILE.  This results in the test setup script correctly detecting
kernel support.  But the ZFS module was built without O_TMPFILE
support due to the non-standard CentOS kernel interface.

Handle this case by updating the setup check to fail either when
the kernel or the ZFS module fail to provide support.  The reason
will be clearly logged in the test results.

Reviewed-by: Chunwei Chen <tuxoko@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7528
2018-11-08 14:38:28 -08:00
George Melikov 262275ab26 Allow use of pool GUID as root pool
This is helpful when several pools share the same name
but only one of them should be used.

The main use case is twin servers where some software
(e.g. Proxmox) requires the pools to have the same name.

Reviewed-by: Kash Pande <kash@tripleback.net>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Igor ‘guardian’ Lidin of Moscow, Russia
Closes #8052
2018-11-08 14:38:28 -08:00
Brian Behlendorf 55f39a01e6 Fix arc_release() refcount
Update arc_release to use arc_buf_size().  This hunk was accidentally
dropped when porting compressed send/recv, 2aa34383b.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #8000
2018-11-08 14:38:28 -08:00
Tim Schumacher b884768e46 Prefix all refcount functions with zfs_
Recent changes in the Linux kernel made it necessary to prefix
the refcount_add() function with zfs_ due to a name collision.

To bring the other functions in line with that and to avoid future
collisions, prefix the other refcount functions as well.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tim Schumacher <timschumi@gmx.de>
Closes #7963
2018-11-08 14:38:28 -08:00
Tim Schumacher f8f4e13776 Linux 4.19-rc3+ compat: Remove refcount_t compat
torvalds/linux@59b57717f ("blkcg: delay blkg destruction until
after writeback has finished") added a refcount_t to the blkcg
structure. Due to the refcount_t compatibility code, zfs_refcount_t
was used by mistake.

Resolve this by removing the compatibility code and replacing the
occurrences of refcount_t with zfs_refcount_t.

Reviewed-by: Franz Pletz <fpletz@fnordicwalking.de>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tim Schumacher <timschumi@gmx.de>
Closes #7885
Closes #7932
2018-11-08 14:38:28 -08:00
Gregor Kopka 5f07d51751 Zpool iostat: remove latency/queue scaling
Bandwidth and iops are average per second while *_wait are averages
per request for latency or, for queue depths, an instantaneous
measurement at the end of an interval (according to man zpool).

When calculating the first two it makes sense to do
x/interval_duration (x being the increase in total bytes or number of
requests over the duration of the interval, interval_duration in
seconds) to 'scale' from amount/interval_duration to amount/second.

But applying the same math for the latter (*_wait latencies/queue) is
wrong as there is no interval_duration component in the values (these
are time/requests to get to average_time/request or already an
absolute number).

This bug leads to the only correct continuous *_wait figures for both
latencies and queue depths from 'zpool iostat -l/q' being with
duration=1 as then the wrong math cancels itself (x/1 is a nop).

This removes temporal scaling from latency and queue depth figures.
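
The arithmetic can be sketched as below. The sample values and function names are hypothetical; only the distinction between per-second rates and per-request averages comes from the description above.

```python
def per_second(delta, interval_seconds):
    """Scale a counter increase over an interval to a per-second rate."""
    return delta / float(interval_seconds)


def average_latency(total_wait_ns, requests):
    """Average wait per request; the interval length plays no part."""
    if requests == 0:
        return 0.0
    return total_wait_ns / float(requests)


# Hypothetical sample taken over a 5 second interval.
interval = 5
bytes_delta = 50 * 1024 * 1024   # bytes transferred during the interval
requests_delta = 1000            # requests completed during the interval
wait_ns_delta = 2000000000       # total time spent waiting, in ns

bandwidth = per_second(bytes_delta, interval)      # bytes per second
iops = per_second(requests_delta, interval)        # requests per second
latency = average_latency(wait_ns_delta, requests_delta)

# The bug amounts to dividing the per-request average by the interval
# as well, which is only harmless when the interval is 1 second.
wrongly_scaled = latency / interval
```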

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gregor Kopka <gregor@kopka.net>
Closes #7945
Closes #7694
2018-11-08 14:38:28 -08:00
Brian Behlendorf b2f003c4f4 Fix statfs(2) for 32-bit user space
When handling a 32-bit statfs() system call the returned fields,
although 64-bit in the kernel, must be limited to 32-bits or an
EOVERFLOW error will be returned.

This is less of an issue for block counts since the default
reported block size is 128KiB. But since it is possible to
set a smaller block size, these values will be scaled as
needed to fit in a 32-bit unsigned long.

Unlike most other filesystems the total possible file counts
are more likely to overflow because they are calculated based
on the available free space in the pool. In order to prevent
this the reported value must be capped at 2^32-1. This is
only for statfs(2) reporting, there are no changes to the
internal ZFS limits.
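
One way to picture the scaling and capping is the sketch below. It is an illustration under assumptions, not the kernel code: the function name, the doubling strategy, and the sample numbers are all hypothetical.

```python
UINT32_MAX = 2**32 - 1


def fit_statfs_32(block_size, total_blocks, free_blocks, total_files):
    """Return values that fit in 32-bit fields for a 32-bit caller."""
    # Block counts can be rescaled: report a larger block size and
    # proportionally fewer blocks until the total fits.
    while total_blocks > UINT32_MAX:
        block_size *= 2
        total_blocks //= 2
        free_blocks //= 2
    # File counts have no unit to rescale, so they are capped instead.
    total_files = min(total_files, UINT32_MAX)
    return block_size, total_blocks, free_blocks, total_files


result = fit_statfs_32(512, 2**34, 2**33, 2**40)
```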

Reviewed-by: Andreas Dilger <andreas.dilger@whamcloud.com>
Reviewed-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #7927
Closes #7122
Closes #7937
2018-11-08 14:38:28 -08:00
Olaf Faaland 9014da2b01 Skip import activity test in more zdb code paths
Since zdb opens the pools read-only, it cannot damage the pool in the
event the pool is already imported either on the same host or on
another one.

If the pool vdev structure is changing while zdb is importing the
pool, it may cause zdb to crash.  However this is unlikely, and in any
case it's a user space process and can simply be run again.

For this reason, zdb should disable the multihost activity test on
import that is normally run.

This commit fixes a few zdb code paths where that had been overlooked.
It also adds tests to ensure that several common use cases handle this
properly in the future.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Gu Zheng <guzheng2331314@163.com>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #7797
Closes #7801
2018-11-08 14:38:28 -08:00
Matthew Ahrens 45579c9515 Reduce taskq and context-switch cost of zio pipe
When doing a read from disk, ZFS creates 3 ZIO's: a zio_null(), the
logical zio_read(), and then a physical zio. Currently, each of these
results in a separate taskq_dispatch(zio_execute).

On high-read-iops workloads, this causes a significant performance
impact. By processing all 3 ZIO's in a single taskq entry, we reduce the
overhead on taskq locking and context switching.  We accomplish this by
allowing zio_done() to return a "next zio to execute" to zio_execute().

This results in a ~12% performance increase for random reads, from
96,000 iops to 108,000 iops (with recordsize=8k, on SSD's).
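
The "next zio to execute" idea can be sketched as a simple loop. The `Zio` class below is a hypothetical stand-in, and the functions only borrow the names zio_done() and zio_execute() from the description; the real pipeline is considerably more involved.

```python
class Zio(object):
    """Hypothetical stand-in for a ZIO; only what the sketch needs."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def zio_done(zio, completed):
    """Finish one ZIO and return the next one to execute, or None."""
    completed.append(zio.name)
    return zio.parent


def zio_execute(zio, completed):
    """Run a whole chain of ZIOs inside a single dispatched task."""
    while zio is not None:
        zio = zio_done(zio, completed)


null_zio = Zio("zio_null")
logical = Zio("zio_read", parent=null_zio)
physical = Zio("physical", parent=logical)

completed = []
zio_execute(physical, completed)   # one dispatch instead of three

# Relative increase for the figures quoted above.
gain = (108000 - 96000) / 96000.0
```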

Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: George Wilson <george.wilson@delphix.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-59292
Closes #7736
2018-11-08 14:38:28 -08:00
Tom Caputi b32f1279d4 Fix race in dnode_check_slots_free()
Currently, dnode_check_slots_free() works by checking dn->dn_type
in the dnode to determine if the dnode is reclaimable. However,
there is a small window of time between dnode_free_sync() in the
first call to dsl_dataset_sync() and when the useraccounting code
is run when the type is set to DMU_OT_NONE, but the dnode is not yet
evictable, leading to crashes. This patch adds the ability for
dnodes to track which txg they were last dirtied in and adds a
check for this before performing the reclaim.

This patch also corrects several instances when dn_dirty_link was
treated as a list_node_t when it is technically a multilist_node_t.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #7147
Closes #7388
2018-11-08 14:38:28 -08:00
Tony Hutter 1b0cd07131 Tag zfs-0.7.11
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2018-09-13 10:13:41 -07:00
Dr. András Korn 8c6867dae4 tx_waited -> tx_dirty_delayed in trace_dmu.h
This change was missed in 0735ecb334.

Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: András Korn <korn-github.com@elan.rulez.org>
Closes #7096
2018-09-13 10:12:22 -07:00
Tony Hutter 99310c0aa0 Revert "zpool reopen should detect expanded devices"
This reverts commit 2a16d4cfaf.

The commit was causing an "attempt to access beyond the end
of device" error:

list.zfsonlinux.org/pipermail/zfs-discuss/2018-September/032217.html
2018-09-13 10:11:42 -07:00
Tony Hutter d126980e5f Tag zfs-0.7.10
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2018-09-05 10:37:32 -07:00
Chris Siebenmann 88ef5b238b Correctly handle errors from kern_path
As a regular kernel function, kern_path() returns errors as negative
errnos, such as -ELOOP. zfsctl_snapdir_vget() must convert these into
the positive errnos used throughout the ZFS code when it returns them
to other ZFS functions so that the ZFS code properly sees them as
errors.
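
The sign convention can be illustrated with a small sketch. Both function names are hypothetical, and the caller's check is an assumed example of code that compares against positive errno values.

```python
import errno


def to_positive_errno(kernel_ret):
    """Map a Linux-style result (0 or negative errno) to 0 or positive errno."""
    return -kernel_ret if kernel_ret < 0 else kernel_ret


def caller_sees_loop(error):
    """Assumed caller that compares against positive errno values."""
    return error == errno.ELOOP


raw = -errno.ELOOP                 # what a kernel helper would return
missed = caller_sees_loop(raw)     # the unconverted value is not recognised
converted = to_positive_errno(raw)
caught = caller_sees_loop(converted)
```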

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chris Siebenmann <cks.git01@cs.toronto.edu>
Closes #7764
Closes #7864
2018-07-06 02:46:51 -07:00
Georgy Yakovlev 30d8b85702 Fix build with CONFIG_GCC_PLUGIN_RANDSTRUCT
fs/zfs/zfs/metaslab.c:1055:2: error: positional initialization of field
in ‘struct’ declared with ‘designated_init’ attribute
[-Werror=designated-init]
  metaslab_rt_remove,

Signed-off-by: Georgy Yakovlev <ya@sysdump.net>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes: #7069
2018-07-06 02:46:51 -07:00
Tom Caputi 45f0437912 Fix 'zfs recv' of non large_dnode send streams
Currently, there is a bug where older send streams without the
DMU_BACKUP_FEATURE_LARGE_DNODE flag are not handled correctly.
The code in receive_object() fails to handle cases where
drro->drr_dn_slots is set to 0, which is always the case when the
sending code does not support this feature flag. This patch fixes
the issue by ensuring that a value of 0 is treated as
DNODE_MIN_SLOTS.
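
The fallback amounts to the following sketch. The function name is hypothetical, and the value 1 for DNODE_MIN_SLOTS is an assumption made for the example.

```python
DNODE_MIN_SLOTS = 1  # assumed value for this sketch


def effective_dn_slots(drr_dn_slots):
    """Treat 0, as sent by streams without large_dnode, as the minimum."""
    if drr_dn_slots == 0:
        return DNODE_MIN_SLOTS
    return drr_dn_slots


legacy = effective_dn_slots(0)   # stream from a sender without the flag
large = effective_dn_slots(4)    # stream that records a multi-slot dnode
```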

Tested-by:  DHE <git@dehacked.net>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #7617
Closes #7662
2018-07-06 02:46:51 -07:00
Tom Caputi dc3eea871a Fix object reclaim when using large dnodes
Currently, when the receive_object() code wants to reclaim an
object, it always assumes that the dnode is the legacy 512 bytes,
even when the incoming bonus buffer exceeds this length. This
causes a buffer overflow if --enable-debug is not provided and
triggers an ASSERT if it is. This patch resolves this issue and
adds an ASSERT to ensure this can't happen again.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #7097
Closes #7433
2018-07-06 02:46:51 -07:00
Tim Chase d2c8103a68 Fix problems receiving reallocated dnodes
This is a port of 047116ac - Raw sends must be able to decrease nlevels,
to the zfs-0.7-stable branch.  It includes the various fixes to the
problem of receiving incremental streams which include reallocated dnodes
in which the number of dnode slots has changed but excludes the parts
which are related to raw streams.

From 047116ac:

    Currently, when a raw zfs send file includes a
    DRR_OBJECT record that would decrease the number of
    levels of an existing object, the object is reallocated
    with dmu_object_reclaim() which creates the new dnode
    using the old object's nlevels. For non-raw sends this
    doesn't really matter, but raw sends require that
    nlevels on the receive side match that of the send
    side so that the checksum-of-MAC tree can be properly
    maintained. This patch corrects the issue by freeing
    the object completely before allocating it again in
    this case.

    This patch also corrects several issues with
    dnode_hold_impl() and related functions that prevented
    dnodes (particularly multi-slot dnodes) from being
    reallocated properly due to the fact that existing
    dnodes were not being fully cleaned up when they
    were freed.

    This patch adds a test to make sure that zfs recv
    functions properly with incremental streams containing
    dnodes of different sizes.

This also includes a one-liner fix from loli10K to fix a test failure:
https://github.com/zfsonlinux/zfs/pull/7792#discussion_r212769264

Authored-by: Tom Caputi <tcaputi@datto.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tim Chase <tim@chase2k.com>
Ported-by: Tim Chase <tim@chase2k.com>

Closes #6821
Closes #6864

NOTE: This is the first of the port of 3 related patches to the
zfs-0.7-release branch of ZoL.  The other two patches should immediately
follow this one.
2018-07-06 02:46:51 -07:00
Joao Carlos Mendes Luis 3ea1f7f193 Fedora 28: Fix misc bounds check compiler warnings
Fix a bunch of truncation compiler warnings that show up
on Fedora 28 (GCC 8.0.1).

Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #7368
Closes #7826
Closes #7830
2018-07-06 02:46:51 -07:00
LOLi 4356dd23a9 Fix libaio-devel requirement for Debian-based distributions
BuildRequires tags for "-devel" packages in the RPM spec file do not
work when building on Debian-based distributions.

Fix this issue by making this requirement conditional to RPM-based
distributions.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7829
Closes #7831
2018-07-06 02:46:51 -07:00
Brian Behlendorf 75318ec497 Add libaio-devel BuildRequires
The zfs-test package needs a build requirement on the libaio-devel
package.  Without it ./configure will correctly determine that
mmap_libaio cannot be built and it will be skipped.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7821
Closes #7824
2018-07-06 02:46:51 -07:00
Brian Behlendorf c1629734ab Add missing zfs-dracut RPM dependencies
The zfs-dracut package requires the hostid, basename, head, awk,
and grep utilities be installed.  The first three are provided by
coreutils but additional dependencies are required for awk and grep.

Reviewed-by: Manuel Amador (Rudd-O) <rudd-o@rudd-o.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7729
Closes #7747
2018-07-06 02:46:51 -07:00
DeHackEd 778290d5bc Don't modify argv[] in user tools
argv[] gets modified during string parsing for input arguments. This
is reflected in the live process listing. Don't do that.

Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: DHE <git@dehacked.net>
Closes #7760
2018-07-06 02:46:51 -07:00
LOLi 98bc8e0b23 Fix arcstat.py handling of unsupported options
This change allows the arcstat.py script to handle unsupported options
gracefully and print both error and usage messages when one such option
is provided.

Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7799
2018-07-06 02:46:51 -07:00
LOLi caafa436eb Allow inherited properties in zfs_check_settable()
This change modifies how 'checksum' and 'dedup' properties are verified
in zfs_check_settable() handling the case where they are explicitly
inherited in the dataset hierarchy when receiving a recursive send
stream.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7755
Closes #7576
Closes #7757
2018-07-06 02:46:51 -07:00
LOLi fe8de1c8a6 Fix zfs incremental send remove '-o' properties
When receiving an incremental send stream with intermediary snapshots
zfs_receive_one() does not correctly identify the top-level dataset:
consequently we restore said snapshots as if they were children
datasets in the hierarchy, forcing inheritance of any property received
with 'zfs send -o' and effectively removing any locally set value.

The test case did not correctly verify this situation because it uses
adjacent snapshots, basically testing 'zfs send -i' instead of
'zfs send -I': this commit adds an additional intermediary snapshot to
the test script.

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7478
2018-07-06 02:46:51 -07:00
Toomas Soome 1bd93ea1e0 OpenZFS 8906 - uts: illumos rootfs should support salted cksum
Porting notes:
* As of grub-2.02 these checksums are not supported.  However, as
  pointed out in #6501 there are alternatives such as EFISTUB which
  work and have no such restriction.  A warning was added to the
  checksum property section of the zfs.8 man page.

Authored by: Toomas Soome <tsoome@me.com>
Reviewed by: C Fraire <cfraire@me.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Reviewed by: Yuri Pankov <yuripv@yuripv.net>
Approved by: Dan McDonald <danmcd@joyent.com>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>

OpenZFS-issue: https://illumos.org/issues/8906
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/7dec52f
Closes #6501
Closes #7714
2018-07-06 02:46:51 -07:00
Brian Behlendorf 6857950e46 Fix zpl_mount() deadlock
Commit 93b43af10 inadvertently introduced the following scenario which
can result in a deadlock.  This issue was most easily reproduced by
LXD containers using a ZFS storage backend but should be reproducible
under any workload which is frequently mounting and unmounting.

-- THREAD A --
spa_sync()
  spa_sync_upgrades()
    rrw_enter(&dp->dp_config_rwlock, RW_WRITER, FTAG); <- Waiting on B

-- THREAD B --
mount_fs()
  zpl_mount()
    zpl_mount_impl()
      dmu_objset_hold()
        dmu_objset_hold_flags()
          dsl_pool_hold()
            dsl_pool_config_enter()
              rrw_enter(&dp->dp_config_rwlock, RW_READER, tag);
    sget()
      sget_userns()
        grab_super()
          down_write(&s->s_umount); <- Waiting on C

-- THREAD C --
cleanup_mnt()
  deactivate_super()
    down_write(&s->s_umount);
    deactivate_locked_super()
      zpl_kill_sb()
        kill_anon_super()
          generic_shutdown_super()
            sync_filesystem()
              zpl_sync_fs()
                zfs_sync()
                  zil_commit()
                    txg_wait_synced() <- Waiting on A
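
The three traces above form a cycle in a wait-for graph: A waits on B,
B waits on C, and C waits on A.  A minimal Python sketch of that check
follows.  The thread labels come from the traces; the dictionary
representation and the `find_cycle` helper are illustrative assumptions
and not part of ZFS.

```python
# Wait-for graph taken from the traces above: each thread maps to the
# thread that holds the resource it is blocked on.
WAITS_ON = {
    "A": "B",  # spa_sync() wants dp_config_rwlock as writer; B holds it as reader
    "B": "C",  # zpl_mount() wants s_umount; C holds it
    "C": "A",  # txg_wait_synced() needs spa_sync() in A to finish
}


def find_cycle(waits_on, start):
    """Follow wait-for edges from start; return the cycle if one is reached."""
    path = []
    node = start
    while node is not None and node not in path:
        path.append(node)
        node = waits_on.get(node)
    if node is None:
        return None  # chain ended at a thread that is not blocked
    return path[path.index(node):] + [node]


print(find_cycle(WAITS_ON, "A"))
```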

Reviewed by: Alek Pinchuk <apinchuk@datto.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7598 
Closes #7659 
Closes #7691 
Closes #7693
2018-07-06 02:46:51 -07:00
Brian Behlendorf 716ce2b89e Fix kernel unaligned access on sparc64
Update the SA_COPY_DATA macro to check if the architecture supports
efficient unaligned memory accesses at compile time.  Otherwise
fall back to using the sa_copy_data() function.

The kernel provided CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is
used to determine availability in kernel space.  In user space
the x86_64, x86, powerpc, and sometimes arm architectures will
define the HAVE_EFFICIENT_UNALIGNED_ACCESS macro.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7642
Closes #7684
2018-07-06 02:46:51 -07:00
Troels Nørgaard 9daae583d8 Default ashift for Amazon EC2 NVMe devices
Add a default 4 KiB ashift for Amazon EC2 NVMe devices on instances with
NVMe ephemeral devices, such as the types c5d, f1, i3 and m5d.
As per the official documentation [1] a 4096 byte blocksize should be
used to match the underlying hardware.
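
For reference, ashift is the base-2 logarithm of the block size, so a
4096 byte block size corresponds to ashift=12.  A minimal sketch of that
arithmetic (the helper name `ashift_for` is hypothetical):

```python
def ashift_for(block_size):
    """Return log2(block_size) for a power-of-two block size in bytes."""
    if block_size <= 0 or block_size & (block_size - 1):
        raise ValueError("block size must be a positive power of two")
    return block_size.bit_length() - 1


print(ashift_for(4096))  # 12
print(ashift_for(512))   # 9
```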

The string was identified via:

$ sudo sginfo -M /dev/nvme0n1
INQUIRY response (cmd: 0x12)
----------------------------
Device Type                        0
Vendor:                    NVMe
Product:                   Amazon EC2 NVMe
Revision level:

$ lsblk -io KNAME,TYPE,SIZE,MODEL
KNAME   TYPE    SIZE MODEL
nvme0n1 disk  442.4G Amazon EC2 NVMe Instance Storage

[1] https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/
    storage-optimized-instances.html
    Retrieved 2018-07-03

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Troels Nørgaard <tnn@tradeshift.com>
Closes #7676
2018-07-06 02:46:51 -07:00
Brian Behlendorf b5ee3df776 Linux 4.14 compat: blk_queue_stackable()
The blk_queue_stackable() function was replaced in the 4.14 kernel
by queue_is_rq_based(), commit torvalds/linux@5fdee212.  This change
resulted in the default elevator being used which can negatively
impact performance.

Rather than adding additional compatibility code to detect the
new interface, unconditionally attempt to set the elevator.  Since
we expect this to fail for block devices without an elevator, the
error message has been moved into zfs_dbgmsg().

Finally, it was observed that elevator_change() was removed
from the 4.12 kernel, commit torvalds/linux@c033269.  Update the
comment to clearly specify which kernels are expected to export the
elevator_change() symbol.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7645
2018-07-06 02:46:51 -07:00
Tony Hutter 17cd9a8e0c Add pool state /proc entry, "SUSPENDED" pools
1. Add a proc entry to display the pool's state:

$ cat /proc/spl/kstat/zfs/tank/state
ONLINE

This is done without using the spa config locks, so it will
never hang.

2. Fix 'zpool status' and 'zpool list -o health' output to print
"SUSPENDED" instead of "ONLINE" for suspended pools.

Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #7331
Closes #7563
2018-07-06 02:46:51 -07:00
Sara Hartse 2a16d4cfaf zpool reopen should detect expanded devices
Update bdev_capacity to have wholedisk vdevs query the
size of the underlying block device (correcting for the size
of the EFI partition and partition alignment) and therefore detect
expanded space.

Correct vdev_get_stats_ex so that the expandsize is aligned
to metaslab size and new space is only reported if it is large
enough for a new metaslab.
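
The alignment rule described above can be sketched as follows.  This is
an illustrative model only: the function name and the example sizes are
assumptions, and the real logic lives in vdev_get_stats_ex().

```python
def reported_expandsize(new_space, metaslab_size):
    """Round new_space down to a whole number of metaslabs.

    Returns 0 when the new space is too small to hold one metaslab.
    """
    if metaslab_size <= 0:
        raise ValueError("metaslab_size must be positive")
    return (new_space // metaslab_size) * metaslab_size


GIB = 1 << 30
MS = 512 << 20  # hypothetical 512 MiB metaslab size

print(reported_expandsize(1 * GIB + 100, MS) == 1 * GIB)  # True
print(reported_expandsize(MS - 1, MS))                     # 0
```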

Reviewed by: Don Brady <don.brady@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: John Wren Kennedy <jwk404@gmail.com>
Signed-off-by: sara hartse <sara.hartse@delphix.com>
External-issue: LX-165
Closes #7546
Issue #7582
2018-07-06 02:46:51 -07:00
Antonio Russo 3350a33908 Support Debian DKMS builds
scripts/dkms.mkconf calls configure with
`--with-linux=${kernel_source_dir}`, but Debian puts its kernel source at
`/lib/modules/<version>/source`. This patch adds the same logic to the
DKMS file produced by `scripts/dkms.mkconf` that Debian has shipped in
its official ZFS packaging: at DKMS build time, it checks if the system
is a Debian system, and adjusts the path accordingly.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Antonio Russo <antonio.e.russo@gmail.com>
Closes #7358 
Closes #7540 
Closes #7554
2018-07-06 02:46:51 -07:00
Olaf Faaland 3eef58c9b6 module param callbacks check for initialized spa
Callbacks provided for module parameters are executed both
after the module is loaded, when a user alters it via sysfs, e.g.
	echo bar > /sys/module/zfs/parameters/foo

as well as when the module is loaded with an argument, e.g.
	modprobe zfs foo=bar

In the latter case, the init functions likely have not run yet,
including spa_init() which initializes the namespace lock so it is safe
to use.

Instead of immediately taking the namespace lock and attempting to
iterate over initialized spa structures, check whether spa_mode_global
is nonzero.  This is set by spa_init() after it has initialized the
namespace lock.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tim Chase <tim@chase2k.com>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #7496 
Closes #7521
2018-07-06 02:46:51 -07:00
Brian Behlendorf 4805781c74 Trim new line from zfs_vdev_scheduler
Add a helper function to trim the trailing newline.  While we're
here use this new hook to immediately apply the new scheduler.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #3356 
Closes #6573
2018-07-06 02:46:51 -07:00
Chunwei Chen b06f40ea9b Fix ENOSPC in "Handle zap_add() failures in ..."
Commit cc63068 caused an ENOSPC error when copying a large number of files
between two directories. The reason is that the patch limits zap leaf
expansion to 2 retries, and returns ENOSPC on failure.

The intent of limiting retries is to prevent pointlessly growing the table
to max size when adding a block full of entries with the same name in
different case in mixed mode. However, it turns out we cannot use any
limit on the retry. When we copy files from one directory in readdir
order, we are copying in hash order, one leaf block at a time. This
means that if the leaf block in the source directory has expanded 6 times,
and you copy the entries in that block, by the time you need to expand
the leaf in the destination directory, you need to expand it 6 times in one
go. So any limit on the retry will result in an error where it shouldn't.

Note that while we do use different salt for different directories, it
seems that the salt/hash function doesn't provide enough randomization
to the hash distance to prevent this from happening.

Since cc63068 has already been reverted, this patch adds it back and
removes the retry limit.

Also, as it turns out, failing on zap_add() has a serious side effect for
mzap_upgrade(). When upgrading from micro zap to fat zap, it will
call zap_add() to transfer entries one at a time. If it hits any error
halfway through, the remaining entries will be lost, causing those files
to become orphans. This patch adds a VERIFY to catch it.

Reviewed-by: Sanjeev Bagewadi <sanjeev.bagewadi@gmail.com>
Reviewed-by: Richard Yao <ryao@gentoo.org>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Albert Lee <trisk@forkgnu.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #7401 
Closes #7421
2018-07-06 02:46:51 -07:00
Olaf Faaland 6b5cc49d81 Fix divide-by-zero in mmp_delay_update()
vdev_count_leaves() in the denominator may return 0, caught by Coverity.
Introduced by

* 533ea04 Update mmp_delay on sync or skipped, failed write
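
One common way to guard such a division is to clamp the divisor to at
least 1.  The sketch below assumes that approach; the commit message
does not spell out the exact fix, and the function and parameter names
are hypothetical.  The quotient modelled here is the lower bound for
mmp_delay described in 533ea04.

```python
def mmp_delay_floor_ns(multihost_interval_ms, leaf_count):
    """Lower bound for mmp_delay: the interval split across leaf vdevs.

    leaf_count may be 0, so the divisor is clamped to 1 to avoid
    dividing by zero.
    """
    interval_ns = multihost_interval_ms * 1_000_000  # milliseconds to nanoseconds
    return interval_ns // max(1, leaf_count)


print(mmp_delay_floor_ns(1000, 4))  # 250000000
print(mmp_delay_floor_ns(1000, 0))  # 1000000000
```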

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #7391
2018-07-06 02:46:51 -07:00
Prakash Surya ef7a79488a OpenZFS 8997 - ztest assertion failure in zil_lwb_write_issue
PROBLEM
=======

When `dmu_tx_assign` is called from `zil_lwb_write_issue`, it's possible
for either `ERESTART` or `EIO` to be returned.

If `ERESTART` is returned, this will cause an assertion to fail directly
in `zil_lwb_write_issue`, where the code assumes the return value is
`EIO` if `dmu_tx_assign` returns a non-zero value. This can occur if the
SPA is suspended when `dmu_tx_assign` is called, and most often occurs
when running `zloop`.

If `EIO` is returned, this can cause assertions to fail elsewhere in the
ZIL code. For example, `zil_commit_waiter_timeout` contains the
following logic:

    lwb_t *nlwb = zil_lwb_write_issue(zilog, lwb);
    ASSERT3S(lwb->lwb_state, !=, LWB_STATE_OPENED);

In this case, if `dmu_tx_assign` returned `EIO` from within
`zil_lwb_write_issue`, the `lwb` variable passed in will not be issued
to disk. Thus, its `lwb_state` field will remain `LWB_STATE_OPENED` and
this assertion will fail. `zil_commit_waiter_timeout` assumes that after
it calls `zil_lwb_write_issue`, the `lwb` will be issued to disk, and
doesn't handle the case where this is not true; i.e. it doesn't handle
the case where `dmu_tx_assign` returns `EIO`.

SOLUTION
========

This change modifies the `dmu_tx_assign` function such that `txg_how` is
a bitmask, rather than of the `txg_how_t` enum type. Now, the previous
`TXG_WAITED` semantics can be used via `TXG_NOTHROTTLE`, along with
specifying either `TXG_NOWAIT` or `TXG_WAIT` semantics.

Previously, when `TXG_WAITED` was specified, `TXG_NOWAIT` semantics were
automatically invoked. This was not ideal when using `TXG_WAITED` within
`zil_lwb_write_issue`, leading to the problem described above. Rather, we
want to achieve the semantics of `TXG_WAIT`, while also preventing the
`tx` from being penalized via the dirty delay throttling.

With this change, `zil_lwb_write_issue` can achieve the semantics that
it requires by passing in the value `TXG_WAIT | TXG_NOTHROTTLE` to
`dmu_tx_assign`.

Further, consumers of `dmu_tx_assign` wishing to achieve the old
`TXG_WAITED` semantics can pass in the value `TXG_NOWAIT | TXG_NOTHROTTLE`.
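
The bitmask combinations described above can be modelled as follows.
This is a sketch only: the flag names come from the text, but the bit
values and the `describe` helper are placeholders chosen for
illustration and are not taken from the ZFS headers.

```python
# Placeholder bit values for this sketch; not taken from the ZFS headers.
TXG_NOWAIT = 0x0
TXG_WAIT = 0x1
TXG_NOTHROTTLE = 0x2


def describe(txg_how):
    """Split a txg_how bitmask into its wait mode and throttle behaviour."""
    wait = "wait" if txg_how & TXG_WAIT else "nowait"
    throttle = "unthrottled" if txg_how & TXG_NOTHROTTLE else "throttled"
    return wait, throttle


print(describe(TXG_WAIT | TXG_NOTHROTTLE))    # ('wait', 'unthrottled')
print(describe(TXG_NOWAIT | TXG_NOTHROTTLE))  # ('nowait', 'unthrottled')
print(describe(TXG_WAIT))                     # ('wait', 'throttled')
```

Treating the wait mode and the throttle bit as independent is what lets
a caller ask for blocking behaviour without the dirty delay penalty.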

Authored by: Prakash Surya <prakash.surya@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Andriy Gapon <avg@FreeBSD.org>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>

Porting Notes:
- Additionally updated `zfs_tmpfile` to use `TXG_NOTHROTTLE`

OpenZFS-issue: https://www.illumos.org/issues/8997
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/19ea6cb0f9
Closes #7084
2018-07-06 02:46:51 -07:00
Brian Behlendorf a2f759146d Linux compat 4.18: check_disk_size_change()
Added support for the bops->check_events() interface which was
added in the 2.6.38 kernel to replace bops->media_changed().
Fully implementing this functionality allows the volume resize
code to rely on revalidate_disk(), which is the preferred
mechanism, and removes the need to use check_disk_size_change().

In order for bops->check_events() to look up the zvol_state_t
stored in the disk->private_data the zvol_state_lock needs to
be held.  Since the check events interface may poll, the mutex
has been converted to a rwlock for better concurrency.  The
rwlock need only be taken as a writer in the zvol_free() path
when disk->private_data is set to NULL.

The configure checks for the block_device_operations structure
were consolidated in a single kernel-block-device-operations.m4
file.

The ZFS_AC_KERNEL_BDEV_BLOCK_DEVICE_OPERATIONS configure checks
and associated dead code were removed.  This interface was added
to the 2.6.28 kernel which predates the oldest supported 2.6.32
kernel and will therefore always be available.

Updated maximum Linux version in META file.  The 4.17 kernel
was released on 2018-06-03 and ZoL is compatible with the
finalized kernel.

Reviewed-by: Boris Protopopov <boris.protopopov@actifio.com>
Reviewed-by: Sara Hartse <sara.hartse@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7611
2018-07-06 02:46:51 -07:00
Brian Behlendorf f79c0de208 Linux 4.18 compat: inode timespec -> timespec64
Commit torvalds/linux@95582b0 changes the inode i_atime, i_mtime,
and i_ctime members from timespec's to timespec64's to make them
2038 safe.  As part of this change the current_time() function was
also updated to return the timespec64 type.

Resolve this issue by introducing a new inode_timespec_t type which
is defined to match the timespec type used by the inode.  It should
be used when working with inode timestamps to ensure matching types.

The timestruc_t type under Illumos was used in a similar fashion but
was specified to always be a timespec_t.  Rather than incorrectly
define this type all timespec_t types have been replaced by the new
inode_timespec_t type.

Finally, the kernel and user space 'sys/time.h' headers were aligned
with each other.  They define, as appropriate for the context, several
constants as macros and include static inline implementations of
gethrestime(), gethrestime_sec(), and gethrtime().

Reviewed-by: Chunwei Chen <tuxoko@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7643
Backported-by: Richard Yao <ryao@gentoo.org>
2018-07-06 02:46:51 -07:00
Boris Protopopov 1667816089 zv_suspend_lock in zvol_open()/zvol_release()
Acquire zv_suspend_lock on first open and last close only.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Boris Protopopov <boris.protopopov@actifio.com>
Closes #6342
2018-07-06 02:46:51 -07:00
Tony Hutter d1ed1be3cd Tag zfs-0.7.9
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2018-05-08 13:33:38 -07:00
Tony Hutter e749242a99 Remove DEBUG_STACKFLAGS to bypass compiler error
'Support -fsanitize=address with --enable-asan' (fed9035) removed
DEBUG_STACKFLAGS="-fstack-check" from zfs-build.m4 in master.
However, that's too heavyweight a patch to merge in to the 0.7.x branch,
so just take the one-liner we need to get around a compiler error
on Fedora 28:

$ ./configure --enable-debug --enable-debuginfo && make pkg-utils
  CC       gethrtime.lo
cc1: error: '-fstack-check=' and '-fstack-clash_protection' are mutually
exclusive.  Disabling '-fstack-check=' [-Werror]

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>

Requires-spl: #701
2018-05-07 17:19:58 -07:00
Tony Hutter 9267ef84fd Fedora 28: Add BuildRequires: libtirpc-devel
Add "BuildRequires: libtirpc-devel" to fix mock builds on Fedora 28.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #7494
Closes #7495
2018-05-07 17:19:57 -07:00
Brian Behlendorf 0ee129199f RHEL 7.5 compat: FMODE_KABI_ITERATE
As of RHEL 7.5 the mainline fops.iterate() method was added to
the file_operations structure and is correctly detected by the
configure script.

Normally this is what we want, but in order to maintain KABI
compatibility the RHEL change additionally does the following:

* Requires that callers intending to use this extended interface
  set the FMODE_KABI_ITERATE flag on the file structure when
  opening the directory.
* Adds the fops.iterate() method to the end of the structure,
  without removing fops.readdir().

This change updates the configure check to ignore the RHEL 7.5+
variant of fops.iterate() when detected.  Instead fall back to
the fops.readdir() interface which will be available.

Finally, add the 'zpl_' prefix to the directory context wrappers
to avoid colliding with the kernel provided symbols when both
the fops.iterate() and fops.readdir() are provided by the kernel.

Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7460
Closes #7463
2018-05-07 17:19:57 -07:00
George Melikov 245be00597 Add back iostat -y or -w descriptions
The iostat -y and -w descriptions were dropped in cda0317e;
restore them.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #7479
Closes #7483
2018-05-07 17:19:57 -07:00
Antonio Russo c38d702330 Add test with two kinds of file creation orders
Data loss was identified in #7401 when many small files were copied.
This adds a reproducer for this bug and other similar ones: randomly
generate N files. Then, listing M of them by `ls -U` order, produce
those same files in a directory of the same name.

This triggers the bug consistently, provided N and M are large enough.
Here, N=2^16 and M=2^13.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Antonio Russo <antonio.e.russo@gmail.com>
Closes #7411
2018-05-07 17:19:57 -07:00
Seth Forshee 3f729907c8 Allow mounting datasets more than once
Currently mounting an already mounted zfs dataset results in an
error, whereas it is typically allowed with other filesystems.
This causes some bad interactions with mount namespaces. Take
this sequence for example:

- Create a dataset
- Create a snapshot of the dataset
- Create a clone of the snapshot
- Create a new mount namespace
- Rename the original dataset

The rename results in unmounting and remounting the clone in the
original mount namespace, however the remount fails because the
dataset is still mounted in the new mount namespace. (Note that
this means the mount in the new mount namespace is never being
unmounted, so perhaps the unmount/remount of the clone isn't
actually necessary.)

The problem here is a result of the way mounting is implemented
in the kernel module. Since it is not mounting block devices it
uses mount_nodev() instead of the usual mount_bdev(). However,
mount_nodev() is written for filesystems for which each mount is
a new instance (i.e. a new super block), and zfs should be able
to detect when a mount request can be satisfied using an existing
super block.

Change zpl_mount() to call sget() directly with its own test
callback. Passing the objset_t object as the fs data allows
checking if a superblock already exists for the dataset, and in
that case we just need to return a new reference for the sb's
root dentry.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Alek Pinchuk <apinchuk@datto.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Closes #5796
Closes #7207
2018-05-07 17:19:57 -07:00
beren12 cca220d7c6 Fix zfs_arc_max minimum tuning
When setting `zfs_arc_max` its minimum value is allowed
to be 64 MiB.  There was an off-by-1 error which can matter
on tiny systems.
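
The boundary condition can be illustrated as follows.  The commit
message does not show the code, so the comparison operators here are an
assumption about what an off-by-1 at the 64 MiB limit looks like, and
the function names are hypothetical.

```python
MIN_ARC_MAX = 64 << 20  # 64 MiB, the minimum stated above


def accepts_strict(value):
    """Off-by-one variant: rejects a value of exactly 64 MiB."""
    return value > MIN_ARC_MAX


def accepts_inclusive(value):
    """Intended behaviour: 64 MiB itself is an allowed setting."""
    return value >= MIN_ARC_MAX


print(accepts_strict(MIN_ARC_MAX))     # False
print(accepts_inclusive(MIN_ARC_MAX))  # True
```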

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chris Zubrzycki <github@mid-earth.net>
Closes #7417
2018-05-07 17:19:57 -07:00
Brian Behlendorf 4ed30958ce Linux compat 4.16: blk_queue_flag_{set,clear}
The HAVE_BLK_QUEUE_WRITE_CACHE_GPL_ONLY case was overlooked in
the original 10f88c5c commit because blk_queue_write_cache()
was available for the in-kernel builds.

Update the blk_queue_flag_{set,clear} wrappers to call the locked
versions to avoid confusion.  This is safe for all existing callers.

The blk_queue_set_write_cache() function has been updated to use
these wrappers.  This means setting/clearing both QUEUE_FLAG_WC
and QUEUE_FLAG_FUA is no longer atomic but this is only done early
in zvol_alloc() prior to any requests so there is no issue.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Kash Pande <kash@tripleback.net>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7428
Closes #7431
2018-05-07 17:19:57 -07:00
Giuseppe Di Natale 2f118072cb Linux compat 4.16: blk_queue_flag_{set,clear}
queue_flag_{set,clear}_unlocked are now private interfaces in
the Linux kernel (https://github.com/torvalds/linux/commit/8a0ac14).
Use blk_queue_flag_{set,clear} interfaces which were introduced as
of https://github.com/torvalds/linux/commit/8814ce8.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #7410
2018-05-07 17:19:57 -07:00
Brian Behlendorf 7440f10ec1 Fix 'zfs send/recv' hang with 16M blocks
When using 16MB blocks the send/recv queues aren't quite big
enough.  This change leaves the default 16M queue size, which is a
good value for most pools.  But it additionally ensures that the
queue sizes are at least twice the allowed zfs_max_recordsize.
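
The sizing rule amounts to taking the larger of the configured queue
size and twice the maximum record size.  A minimal sketch, assuming
that reading; the function and constant names are hypothetical.

```python
DEFAULT_QUEUE_LENGTH = 16 << 20  # 16M default queue size from the text above


def effective_queue_length(configured, max_recordsize):
    """Return the queue size, raised to at least twice the largest record."""
    return max(configured, 2 * max_recordsize)


print(effective_queue_length(DEFAULT_QUEUE_LENGTH, 1 << 20) >> 20)   # 16
print(effective_queue_length(DEFAULT_QUEUE_LENGTH, 16 << 20) >> 20)  # 32
```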

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7365
Closes #7404
2018-05-07 17:19:57 -07:00
Giuseppe Di Natale 8bb800d6b4 Clean up (k)shlib and cfg file shebangs
Most kshlib files are imported by other scripts
and do not have a shebang at the top of their files.
Make all kshlib follow this convention.

Remove shebangs from cfg files as well.

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Close #7406
2018-05-07 17:19:57 -07:00
Tony Hutter bbf61c118f Fix "file is executable, but no shebang" warnings
Fedora 28's RPM build checks warn when executable files don't have a
shebang line.  These warnings are caused when we (incorrectly)
include data & config files in the _SCRIPTS automake lines. Files in
_SCRIPTS are marked executable by automake. This patch fixes the
issue by including non-executable scripts in a _DATA line instead.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #7359
Closes #7395
2018-05-07 17:19:57 -07:00
Tony Hutter d296b09456 Exclude python scripts from RPM shebang check
The newest Fedora packaging rules print warnings for scripts using the
/usr/bin/python shebang:

    *** WARNING: mangling shebang in /usr/bin/arc_summary.py from
    #!/usr/bin/python to #!/usr/bin/python2. This will become an ERROR,
    fix it manually!

Fedora wants all cross compatible scripts to pick python3.  Since we
don't want our users to have to pick a specific version of python, we
exclude our scripts from the RPM build check.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #7360
Closes #7399
2018-05-07 17:19:57 -07:00
Olaf Faaland 5ac017fc04 Update mmp_delay on sync or skipped, failed write
When an MMP write is skipped, or fails, and time since
mts->mmp_last_write is already greater than mts->mmp_delay, increase
mts->mmp_delay.  The original code only updated mts->mmp_delay when a
write succeeded, but this results in the write(s) after delays and
failed write(s) reporting an ub_mmp_delay which is too low.

Update mmp_last_write and mmp_delay if a txg sync was successful.  At
least one uberblock was written, thus extending the time we can be sure
the pool will not be imported by another host.

Do not allow mmp_delay to go below (MSEC2NSEC(zfs_multihost_interval) /
vdev_count_leaves()) so that a period of frequent successful MMP writes,
e.g. due to frequent txg syncs, does not result in an import activity
check so short it is not reliable based on mmp thread writes alone.

Remove unnecessary local variable, start.  We do not use the start time
of the loop iteration.

Add a debug message in spa_activity_check() to allow verification of the
import_delay value and to prove the activity check occurred.

Alter the tests that import pools and attempt to detect an activity
check.  Calculate the expected duration of spa_activity_check() based on
module parameters at the time the import is performed, rather than a
fixed time set in mmp.cfg.  The fixed time may be wrong.  Also, use the
default zfs_multihost_interval value so the activity check is longer and
easier to recognize.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #7330
2018-05-07 17:19:57 -07:00
Tony Hutter f5ecab3aef Fedora 28: Fix misc bounds check compiler warnings
Fix a bunch of (mostly) sprintf/snprintf truncation compiler
warnings that show up on Fedora 28 (GCC 8.0.1).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #7361
Closes #7368
2018-05-07 17:19:57 -07:00
LOLi fd01167ffd Fix hung z_zvol tasks during 'zfs receive'
During a receive operation zvol_create_minors_impl() can wait
needlessly for the prefetch thread because both share the same tasks
queue.  This results in hung tasks:

<3>INFO: task z_zvol:5541 blocked for more than 120 seconds.
<3>      Tainted: P           O  3.16.0-4-amd64
<3>"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

The first z_zvol:5541 (zvol_task_cb) is waiting for the long running
traverse_prefetch_thread:260

root@linux:~# cat /proc/spl/taskq
taskq                       act  nthr  spwn  maxt   pri  mina
spl_system_taskq/0            1     2     0    64   100     1
	active: [260]traverse_prefetch_thread [zfs](0xffff88003347ae40)
	wait: 5541
spl_delay_taskq/0             0     1     0     4   100     1
	delay: spa_deadman [zfs](0xffff880039924000)
z_zvol/1                      1     1     0     1   120     1
	active: [5541]zvol_task_cb [zfs](0xffff88001fde6400)
	pend: zvol_task_cb [zfs](0xffff88001fde6800)

This change adds a dedicated, per-pool, prefetch taskq to prevent the
traverse code from monopolizing the global (and limited) system_taskq by
inappropriately scheduling long running tasks on it.

Reviewed-by: Albert Lee <trisk@forkgnu.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6330
Closes #6890
Closes #7343
2018-05-07 17:19:57 -07:00
Don Brady 3b118f0a34 Add support for nvme based devids
Adds a devid for nvme devices. This is very similar to how the
other 'bus' (scsi|sata|usb) devids are generated. The devid
resides in a name/value pair in the leaf vdevs in a zpool config.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Don Brady <don.brady@delphix.com>
Closes #7356
2018-05-07 17:19:57 -07:00
Tony Hutter ebe443c8ff chmod -x on etc/init.d/zfs-*.in automake files
Clear executable bit on zfs-import.in, zfs-mount.in,
zfs-share.in, and zfs-zed.in.  These are automake files and
should not be marked executable.  This fixes a RPM build error
on Fedora 28.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #7355
Closes #7327
2018-05-07 17:19:57 -07:00
Brian Behlendorf 63f3396233 Fix mmap / libaio deadlock
Calling uiomove() in mappedread() under the page lock can result
in a deadlock if the user space page needs to be faulted in.

Resolve the issue by dropping the page lock before the uiomove().
The inode range lock protects against concurrent updates via
zfs_read() and zfs_write().

Reviewed-by: Albert Lee <trisk@forkgnu.org>
Reviewed-by: Chunwei Chen <david.chen@nutanix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7335
Closes #7339
2018-05-07 17:19:57 -07:00
DeHackEd 2deb4526ee Remove libattr requirement
RHEL/CentOS 6 supports sys/xattr.h eliminating the need for
libattr-devel as a dependency.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: DHE <git@dehacked.net>
Closes #7344
Closes #7351
2018-05-07 17:19:57 -07:00
Tony Hutter a1662ffcaa Fedora 28: Fix "Macro %_dracutdir has empty body"
If you run ./configure --with-config=srpm, it will not trigger
the user m4 scripts to populate the dracut and udev directories.
This causes a build error on Fedora 28.  Make the dracut and
udev lines conditional to get around this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #7326
Closes #7328
2018-05-07 17:19:57 -07:00
kpande ea921bf6a6 modprobe zfs during dracut mount
Resolves importing root pool during boot in dracut.  This case was
inadvertently broken with the module autoloading change in #7287.

Reviewed-by: Matthew Thode <prometheanfire@gentoo.org>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Kash Pande <kash@tripleback.net>
Closes #7322
2018-05-07 17:19:57 -07:00
timor 6e627cc468 Add support for nvme disk detection
This treats /dev/nvme.. devices the same way as /dev/sd... devices.  The
motivation behind this is that whole disk detection did not work on nvme
SSDs without that, because DKC_UNKNOWN was returned for such devices.

Perhaps there should be a separate DKC_ type for this, but I don't know
enough about the code to know the implications of that.

Reviewed-by: Don Brady <don.brady@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: timor <timor.dd@googlemail.com>
Closes #7304
2018-05-07 17:19:56 -07:00
Olaf Faaland 3eb3a13628 Report pool suspended due to MMP
When the pool is suspended, record whether it was due to an I/O error or
due to MMP writes failing to succeed within the required time.

Change spa_suspended from uint8_t to zio_suspend_reason_t to store the
reason.

When userspace queries pool status via spa_tryimport(), report the
reason the pool was suspended in a new key,
ZPOOL_CONFIG_SUSPENDED_REASON.

In libzfs, when interpreting the returned config nvlist, report
suspension due to MMP with a new pool status enum value,
ZPOOL_STATUS_IO_FAILURE_MMP.

In status_callback(), which generates and emits the message when 'zpool
status' is executed, add a case to print an appropriate message for the
new pool status enum value.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #7296
2018-05-07 17:19:56 -07:00
Tim Chase c234706270 Add zfs_scan_ignore_errors tunable
When it's set, a DTL range will be cleared even if its scan/scrub had
errors.  This makes it possible to work around resilver/scrub upon import
when the pool has errors.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tim Chase <tim@chase2k.com>
Closes #7293
2018-05-07 17:19:56 -07:00
Tony Hutter 6059ba27c4 Allow to limit zed's syslog chattiness
Some usage patterns like send/recv of replication streams can
produce a large number of events. In such a case, the current
all-syslog.sh zedlet will live up to its name, and flood the
logs with mostly redundant information. To mitigate this
situation, this changeset introduces two new variables
ZED_SYSLOG_SUBCLASS_INCLUDE and ZED_SYSLOG_SUBCLASS_EXCLUDE
to zed.rc that give more control over which event classes end
up in the syslog.
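
A minimal sketch of how an include/exclude filter of this kind might behave.
The '|'-separated glob format and the should_log() helper are assumptions
made for illustration; the real zedlet is a shell script and may differ.

```python
from fnmatch import fnmatch


def should_log(subclass, include="", exclude=""):
    """Return True if an event subclass should reach syslog.

    include and exclude are '|'-separated glob patterns (an assumed format).
    An empty include list means "log everything that is not excluded".
    """
    inc = [p for p in include.split("|") if p]
    exc = [p for p in exclude.split("|") if p]
    if inc and not any(fnmatch(subclass, p) for p in inc):
        return False
    return not any(fnmatch(subclass, p) for p in exc)
```

With such a filter, excluding the chatty subclasses produced by send/recv
keeps the remaining events visible in the log.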

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Daniel Kobras <d.kobras@science-computing.de>
Closes #6886
Closes #7260
2018-05-07 17:19:56 -07:00
Olaf Faaland 927f40d089 Record skipped MMP writes in multihost_history
Once per pass through the MMP thread's loop, the vdev tree is walked to
find a suitable leaf to write the next MMP block to.  If no such leaf is
found, the thread sleeps for a while and resumes at the top of the loop.

Add an entry to multihost_history when no leaf can be found, and record
the reason in the error column.  The error code for such entries is a
bitfield, displayed in hex:

0x1  At least one vdev (interior or leaf) was not writeable.
0x2  At least one writeable leaf vdev was found, but it had a pending
MMP write.
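
The bitfield above could be decoded along these lines. The helper name is
hypothetical; only the two bit meanings come from the description above.

```python
SKIP_REASONS = {
    0x1: "at least one vdev (interior or leaf) was not writeable",
    0x2: "a writeable leaf vdev was found, but it had a pending MMP write",
}


def decode_skip_error(value):
    """Return a description for each bit set in a skipped-write error code.

    Accepts either an int or the hex string shown in the error column.
    """
    code = int(value, 16) if isinstance(value, str) else value
    return [text for bit, text in SKIP_REASONS.items() if code & bit]
```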

timestamp = the time in seconds since the epoch when no leaf could be
found originally.

duration = the time (in ns) during which no MMP block was written for
this reason.  This does not include the preceding inter-write period
nor the following inter-write period.

vdev_guid = the number of sequential cycles of the MMP thread loop when
this occurred.

For records of skipped MMP writes the right-most column, vdev_path, is
reported as "-".

Sample output, truncated to fit:

id   txg  timestamp   error  duration    mmp_delay  vdev_guid     ...
936  11   1520036441  0      146264      891422313  1740883117838 ...
937  11   1520036441  0      163956      888356657  7320395061548 ...
938  11   1520036442  0      130690      885314969  7320395061548 ...
939  11   1520036442  0      2001068577  882296582  1740883117838 ...
940  11   1520036443  0      161806      882296582  7320395061548 ...
941  11   1520036443  0x2    0           998020546  1             ...
942  11   1520036444  0      136585      998020546  7320395061548 ...
943  11   1520036444  0x2    0           998020257  1             ...
944  11   1520036445  5      2002662964  994160219  1740883117838 ...
945  11   1520036445  0x2    998073118   994160219  3             ...
946  11   1520036447  0      247136      994160219  7320395061548 ...

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #7212
2018-05-07 17:19:56 -07:00
Giuseppe Di Natale 6356d50e67 Introduce a destroy_dataset helper
Datasets can be busy when calling zfs destroy. Introduce
a helper function to destroy datasets and use it to destroy
datasets in zfs_allow_004_pos, zfs_promote_008_pos, and
zfs_destroy_002_pos.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #7224
Closes #7246
Closes #7249
Closes #7267
2018-05-07 17:19:56 -07:00
Tony Hutter bd69ae3b53 Tag zfs-0.7.8
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2018-04-09 14:31:57 -07:00
Tony Hutter 9a2e90c9fc Revert "Handle zap_add() failures in mixed ... "
This reverts commit cc63068e95.

Under certain circumstances this change can result in an ENOSPC
error when adding new files to a directory.  See #7401 for full
details.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Issue #7401
Closes #7416
2018-04-09 17:29:59 -04:00
Tony Hutter 240ccfc13a Tag zfs-0.7.7
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2018-03-14 16:16:43 -07:00
Brian Behlendorf c30e716c81 Fix MMP write frequency for large pools
When a single pool contains more vdevs than the CONFIG_HZ for
the kernel, the mmp thread will not delay properly.  Switch
to using cv_timedwait_sig_hires() to handle higher resolution
delays.

This issue was reported on Arch Linux where HZ defaults to only
100 and this could be fairly easily reproduced with a reasonably
large pool.  Most distribution kernels set CONFIG_HZ=250 or
CONFIG_HZ=1000 and thus are unlikely to be impacted.
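
A worked example of the arithmetic, assuming the per-write delay is the
multihost interval divided by the number of leaf vdevs, that the interval is
1000 ms, and that a tick-based wait truncates to whole ticks. All three are
assumptions made for this sketch.

```python
def delay_in_ticks(interval_ms, leaf_vdevs, hz):
    """Per-write delay in whole timer ticks, using truncating division.

    One tick lasts 1000/hz ms, so the delay in ticks is
    (interval_ms / leaf_vdevs) / (1000 / hz), rounded down.
    """
    return (interval_ms * hz) // (leaf_vdevs * 1000)
```

For 200 leaf vdevs the delay is 5 ms per write: that is 0 ticks at HZ=100
(so the thread does not sleep at all), 1 tick at HZ=250, and 5 ticks at
HZ=1000.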

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7205
Closes #7289
2018-03-14 16:10:38 -07:00
Olaf Faaland 267fd7b0f1 Handle zio_resume and mmp => off
When multihost is disabled on a pool, and the pool is resumed via zpool
clear, within a single cycle of the mmp thread's loop (e.g.  while it's
in the cv_timedwait call), both mmp_last_write and mmp_delay should be
updated.

The original code mistakenly treated the two cases as if they could not
occur at the same time.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #7286
2018-03-14 16:10:38 -07:00
LOLi dc0176eeec Fix zfs-kmod builds when using rpm >= 4.14
With rpm-software-management/rpm@5e94633 a package version containing
invalid characters (most commonly a double '-') causes the kmod package
generation to terminate with an error.  This change takes advantage of
the newly introduced rpm macro "_wrong_version_format_terminate_build"
to allow kmod packages to be built.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by:  loli10K <ezomori.nozomu@gmail.com>
Closes #7284
2018-03-14 16:10:38 -07:00
Paul Zuchowski 0a0af41bd9 zdb and inuse tests don't pass with real disks
Due to zpool create auto-partitioning in Linux (i.e. sdb1),
certain utilities need to use the partition (sdb1) while
others use the whole disk name (sdb).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Zuchowski <pzuchowski@datto.com>
Closes #6939
Closes #7261
2018-03-14 16:10:38 -07:00
Wolfgang Bumiller 3808006edf Take user namespaces into account in policy checks
Change file related checks to use user namespaces and make
sure involved uids/gids are mappable in the current
namespace.

Note that checks without file ownership information will
still not take user namespaces into account, as some of
these should be handled via 'zfs allow' (otherwise root in a
user namespace could issue commands such as `zpool export`).

This also adds an initial user namespace regression test
for the setgid bit loss, with a user_ns_exec helper usable
in further tests.

Additionally, configure checks for the required user
namespace related features are added for:
  * ns_capable
  * kuid/kgid_has_mapping()
  * user_ns in cred_t

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Closes #6800
Closes #7270
2018-03-14 16:10:38 -07:00
Olaf Faaland c17922b8a9 Detect long config lock acquisition in mmp
If something holds the config lock as a writer for too long, MMP will
fail to issue MMP writes in a timely manner.  This will result either in
the pool being suspended, or in an extreme case, in the pool not being
protected.

If the time to acquire the config lock exceeds 1/10 of the minimum
zfs_multihost_interval, report it in the zfs debug log.
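
The threshold check can be sketched as below. The 100 ms minimum interval
and the helper name are assumptions made for illustration.

```python
MMP_MIN_INTERVAL_MS = 100  # assumed minimum zfs_multihost_interval, in ms


def lock_wait_is_long(wait_ms, min_interval_ms=MMP_MIN_INTERVAL_MS):
    """Return True if acquiring the config lock took more than 1/10 of the
    minimum interval, i.e. the case that would be reported in the debug log."""
    return wait_ms > min_interval_ms / 10
```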

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #7212
2018-03-14 16:10:38 -07:00
Giuseppe Di Natale 8d7f17798d Linux 4.16 compat: get_disk_and_module()
As of https://github.com/torvalds/linux/commit/fb6d47a, get_disk()
is now get_disk_and_module(). Add a configure check to determine
if we need to use get_disk_and_module().

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #7264
2018-03-14 16:10:38 -07:00
Tony Hutter 6dc40e2ada Change checksum & IO delay ratelimit values
Change checksum & IO delay ratelimit thresholds from 5/sec to 20/sec.
This allows zed to actually trigger if a bunch of these events arrive in
a short period of time (zed has a threshold of 10 events in 10 sec).
Previously, if you had, say, 100 checksum errors in 1 sec, it would get
ratelimited to 5/sec which wouldn't trigger zed to fault the drive.
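
The effect can be illustrated with a simplified model in which a burst of
errors arrives within one second, events beyond the ratelimit are dropped,
and zed faults the drive once it has seen 10 events. The function name and
the single-burst simplification are assumptions of this sketch.

```python
def zed_would_fault(errors_in_burst, ratelimit_per_sec, zed_threshold=10):
    """Model a one-second burst: events past the ratelimit never reach zed."""
    delivered = min(errors_in_burst, ratelimit_per_sec)
    return delivered >= zed_threshold
```

With 100 checksum errors in one second, a limit of 5/sec delivers only 5
events and zed does not fault the drive, while a limit of 20/sec delivers 20
and it does.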

Also, convert the checksum and IO delay thresholds to module params for
easy testing.

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #7252
2018-03-14 16:10:38 -07:00
chrisrd 792f88131c Increment zil_itx_needcopy_bytes properly
In zil_lwb_commit() with TX_WRITE, we copy the log write record (lrw)
into the log write block (lwb) and send it off using zil_lwb_add_txg().
If we also have WR_NEED_COPY, we additionally copy the lrw's data into
the lwb to be sent off.  If the lrw + data doesn't fit into the lwb, we
send the lrw and as much data as will fit (dnow bytes), then go back
and do the same with the remaining data.

Each time through this loop we're sending dnow data bytes. I.e.
zil_itx_needcopy_bytes should be incremented by dnow.
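
A simplified model of the loop, assuming every log block offers the same
amount of space for data and ignoring the size of the record header. The
function and parameter names are hypothetical.

```python
def commit_write(data_len, lwb_space):
    """Split data_len bytes across log blocks holding lwb_space bytes each.

    Returns (chunks, needcopy_bytes): the dnow size of each pass, and the
    counter, which is incremented by dnow on every pass through the loop.
    """
    if lwb_space <= 0:
        raise ValueError("lwb_space must be positive")
    chunks = []
    needcopy_bytes = 0
    remaining = data_len
    while remaining > 0:
        dnow = min(remaining, lwb_space)  # as much data as fits this time
        chunks.append(dnow)
        needcopy_bytes += dnow            # count what was actually copied
        remaining -= dnow
    return chunks, needcopy_bytes
```

After the loop the counter equals the total data length, regardless of how
many passes were needed.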

Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chris Dunlop <chris@onthe.net.au>
Closes #6988
Closes #7176
2018-03-14 16:10:38 -07:00
John Eismeier 33bb1e8256 Fix some typos
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: George Melikov <mail@gmelikov.ru>
Signed-off-by: John Eismeier <john.eismeier@gmail.com>
Closes #7237
2018-03-14 16:10:38 -07:00
Tomohiro Kusumi bcaba38e42 Fix zpool(8) list example to match actual format
a05dfd00 (Illumos 5147) has swapped FRAG and EXPANDSZ,
so it's natural to modify these examples.

 # zpool list | head -1
 NAME     SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
                              ^^^^^^^^^^^^^^^

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@osnexus.com>
Closes #7244
2018-03-14 16:10:38 -07:00
Tony Hutter 5e3085e360 Add SMART self-test results to zpool status -c
Add in SMART self-test results to zpool status|iostat -c.  This
works for both SAS and SATA drives.

Also, add plumbing to allow the 'smart' script to take smartctl
output from a directory of output text files instead of running
it against the vdevs.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #7178
2018-03-14 16:10:37 -07:00
Tony Hutter 99920d823e Add scrub after resilver zed script
* Add a zed script to kick off a scrub after a resilver.  The script is
disabled by default.

* Add an optional $PATH (-P) option to zed to allow it to use a custom
$PATH for its zedlets.  This is needed when you're running zed under
the ZTS in a local workspace.

* Update test scripts to not copy in all-debug.sh and all-syslog.sh by
default.  They can be optionally copied in as part of zed_setup().
These scripts slow down zed considerably under heavy event loads and
can cause events to be dropped or their delivery delayed. This was
causing some sporadic failures in the 'fault' tests.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #4662
Closes #7086
2018-03-14 16:10:37 -07:00
chrisrd 338523dd6e Fix free memory calculation on v3.14+
Provide infrastructure to auto-configure to enum and API changes in the
global page stats used for our free memory calculations.

arc_free_memory has been broken since an API change in Linux v3.14:

2016-07-28 v4.8 599d0c95 mm, vmscan: move LRU lists to node
2016-07-28 v4.8 75ef7184 mm, vmstat: add infrastructure for per-node
  vmstats

These commits moved some of global_page_state() into
global_node_page_state(). The API change was particularly egregious as,
instead of breaking the old code, it silently did the wrong thing and we
continued using global_page_state() where we should have been using
global_node_page_state(), thus indexing into the wrong array via
NR_SLAB_RECLAIMABLE et al.

There have been further API changes along the way:

2017-07-06 v4.13 385386cf mm: vmstat: move slab statistics from zone to
  node counters
2017-09-06 v4.14 c41f012a mm: rename global_page_state to
  global_zone_page_state

...and various (incomplete, as it turns out) attempts to accommodate
these changes in ZoL:

2017-08-24 2209e409 Linux 4.8+ compatibility fix for vm stats
2017-09-16 787acae0 Linux 3.14 compat: IO acct, global_page_state, etc
2017-09-19 661907e6 Linux 4.14 compat: IO acct, global_page_state, etc

The config infrastructure provided here resolves these issues going back
to the original API change in v3.14 and is robust against further Linux
changes in this area.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Chris Dunlop <chris@onthe.net.au>
Closes #7170
2018-03-14 16:10:37 -07:00
Olaf Faaland 2644784f49 Report duration and error in mmp_history entries
After an MMP write completes, update the relevant mmp_history entry
with the time between submission and completion, and the error
status of the write.

[faaland1@toss3a zfs]$ cat /proc/spl/kstat/zfs/pool/multihost
39 0 0x01 100 8800 69147946270893 72723903122926
id       txg     timestamp  error  duration   mmp_delay    vdev_guid
10607    1166    1518985089 0      138301     637785455    4882...
10608    1166    1518985089 0      136154     635407747    1151...
10609    1166    1518985089 0      803618560  633048078    9740...
10610    1166    1518985090 0      144826     633048078    4882...
10611    1166    1518985090 0      164527     666187671    1151...

Where duration = gethrtime_in_done_fn - gethrtime_at_submission, and
error = zio->io_error.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #7190
2018-03-14 16:10:37 -07:00
Olaf Faaland b1f61f05b4 Do not initiate MMP writes while pool is suspended
While the pool is suspended on host A, it may be imported on host B.
If host A continued to write MMP blocks, it would be blindly
overwriting MMP blocks written by host B, and the blocks written by
host A would have outdated txg information.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #7182
2018-03-14 16:10:37 -07:00
Tony Hutter e5ba614d05 Linux 4.16 compat: use correct *_dec_and_test()
Use refcount_dec_and_test() on 4.16+ kernels, atomic_dec_and_test()
on older kernels.  https://lwn.net/Articles/714974/

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes: #7179
Closes: #7211
2018-03-14 16:10:37 -07:00
Matthew Thode 30ac8de48a Allow modprobe to fail when called within systemd
This allows for systems with zfs built into the kernel manually to run
these services.  Otherwise the service will fail to start.

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Kash Pande <kash@tripleback.net>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Thode <mthode@mthode.org>
Closes #7174
2018-03-14 16:10:37 -07:00
bunder2015 c705d8386b Add SMART attributes for SSD and NVMe
This adds the SMART attributes required to probe Samsung SSD and NVMe
(and possibly others) disks when using the "zpool status -c" command.

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: bunder2015 <omfgbunder@gmail.com>
Closes #7183
Closes #7193
2018-03-14 16:10:37 -07:00
Giuseppe Di Natale d5b10b3ef3 Correct count_uberblocks in mmp.kshlib
A log_must call was causing count_uberblocks to return more
than just the uberblock count. Remove the log_must since it
was only logging a sleep.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #7191
2018-03-14 16:10:37 -07:00
chrisrd 5a84c60fb9 Fix config issues: frame size and headers
1. With various (debug and/or tracing?) kernel options enabled it's
possible for 'struct inode' and 'struct super_block' to exceed the
default frame size, leaving errors like this in config.log:

build/conftest.c:116:1: error: the frame size of 1048 bytes is larger
than 1024 bytes [-Werror=frame-larger-than=]

Fix this by removing the frame size warning for config checks

2. Without the correct headers included, it's possible for declarations
to be missed, leaving errors like this in the config.log:

build/conftest.c:131:14: error: ‘struct nameidata’ declared inside
parameter list [-Werror]

Fix this by adding appropriate headers.

Note: Both these issues can result in silent config failures because
the compile failure is taken to mean "this option is not supported by
this kernel" rather than "there's something wrong with the config
test". The consequences range from merely annoying (compile failures) to
potentially serious (miscompiled or misused kernel primitives
or functions). E.g. the fixes included here resulted in these
additional defines in zfs_config.h with linux v4.14.19:

Also, drive-by whitespace fixes in config/* files which don't mention
"GNU" (those ones look to be imported from elsewhere so leave them
alone).

Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chris Dunlop <chris@onthe.net.au>
Closes #7169
2018-03-14 16:10:37 -07:00
Olaf Faaland 26941ce90b Clarify zinject(8) explanation of -e
Error injection of EIO or ENXIO simply sets the zio's io_error value,
rather than preventing the read or write from occurring.  This is
important information as it affects how the probes must be used.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #7172
2018-03-14 16:10:37 -07:00
George Wilson 07ce5d7390 OpenZFS 8857 - zio_remove_child() panic due to already destroyed parent zio
PROBLEM
=======
It's possible for a parent zio to complete even though it has children
which have not completed. This can result in the following panic:
    > $C
    ffffff01809128c0 vpanic()
    ffffff01809128e0 mutex_panic+0x58(fffffffffb94c904, ffffff597dde7f80)
    ffffff0180912950 mutex_vector_enter+0x347(ffffff597dde7f80)
    ffffff01809129b0 zio_remove_child+0x50(ffffff597dde7c58, ffffff32bd901ac0,
    ffffff3373370908)
    ffffff0180912a40 zio_done+0x390(ffffff32bd901ac0)
    ffffff0180912a70 zio_execute+0x78(ffffff32bd901ac0)
    ffffff0180912b30 taskq_thread+0x2d0(ffffff33bae44140)
    ffffff0180912b40 thread_start+8()
    > ::status
    debugging crash dump vmcore.2 (64-bit) from batfs0390
    operating system: 5.11 joyent_20170911T171900Z (i86pc)
    image uuid: (not set)
    panic message: mutex_enter: bad mutex, lp=ffffff597dde7f80
    owner=ffffff3c59b39480 thread=ffffff0180912c40
    dump content: kernel pages only
The problem is that dbuf_prefetch along with l2arc can create a zio tree
which confuses the parent zio and allows it to complete while children
still exist. Here's the scenario:
    zio tree:
        pio
         |--- lio
The parent zio, pio, has entered the zio_done stage and begins to check its
children to see whether there are still some that have not completed. In zio_done(),
the children are checked in the following order:
    zio_wait_for_children(zio, ZIO_CHILD_VDEV, ZIO_WAIT_DONE)
    zio_wait_for_children(zio, ZIO_CHILD_GANG, ZIO_WAIT_DONE)
    zio_wait_for_children(zio, ZIO_CHILD_DDT, ZIO_WAIT_DONE)
    zio_wait_for_children(zio, ZIO_CHILD_LOGICAL, ZIO_WAIT_DONE)
If pio finds any child which has not completed then it stops executing and
goes to sleep. Each call to zio_wait_for_children() will grab the io_lock
while checking the particular child.
In this scenario, the pio has completed the first call to
zio_wait_for_children() to check for any ZIO_CHILD_VDEV children. Since
the only zio in the zio tree right now is the logical zio, lio, then it
completes that call and prepares to check the next child type.
In the meantime, the lio completes and in its callback creates a child vdev
zio, cio. The zio tree looks like this:
    zio tree:
        pio
         |--- lio
         |--- cio
The lio then grabs the parent's io_lock and removes itself.
    zio tree:
        pio
         |--- cio
The pio continues to run but has already completed its check for ZIO_CHILD_VDEV
and will erroneously complete. When the child zio, cio, completes it will panic
the system trying to reference the parent zio which has been destroyed.
SOLUTION
========
The fix is to rework the zio_wait_for_children() logic to accept a bitfield
for all the children types that it's interested in checking. The
io_lock is held the entire time we check all the children types. Since
the function now accepts a bitfield, a simple ZIO_CHILD_BIT() macro is provided
to allow for the conversion between a ZIO_CHILD type and the bitfield used by
the zio_wait_for_children() logic.
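
The idea behind the fix can be modeled as follows. This is a sketch only:
the numeric values of the child types, the Zio class, and its per-type
counters are assumptions made for illustration.

```python
import threading

# Child type numbering is an assumption made for this sketch.
ZIO_CHILD_VDEV, ZIO_CHILD_GANG, ZIO_CHILD_DDT, ZIO_CHILD_LOGICAL = range(4)


def ZIO_CHILD_BIT(child_type):
    """Convert a child type into its bit in the bitfield."""
    return 1 << child_type


class Zio:
    def __init__(self):
        self.io_lock = threading.Lock()
        self.children = [0, 0, 0, 0]  # outstanding children per type

    def wait_for_children(self, childbits):
        """Return True if any requested child type is still outstanding.

        Every requested type is checked under a single hold of io_lock, so
        no child can appear for an already-checked type between checks.
        """
        with self.io_lock:
            return any(
                count > 0 and (childbits & ZIO_CHILD_BIT(ctype)) != 0
                for ctype, count in enumerate(self.children)
            )
```

A caller that previously made four separate calls would instead pass the
OR of the four ZIO_CHILD_BIT() values in one call.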

Authored by: George Wilson <george.wilson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Andriy Gapon <avg@FreeBSD.org>
Reviewed by: Youzhong Yang <youzhong@gmail.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Approved by: Dan McDonald <danmcd@omniti.com>
Ported-by: Giuseppe Di Natale <dinatale2@llnl.gov>

OpenZFS-issue: https://www.illumos.org/issues/8857
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/862ff6d99c
Issue #5918
Closes #7168
2018-03-14 16:10:37 -07:00
LOLi 1d805a534b 'zfs receive' fails with "dataset is busy"
Receiving an incremental stream after an interrupted "zfs receive -s"
fails with the message "dataset is busy": this is because we still have
the hidden clone ../%recv from the resumable receive.

Improve the error message suggesting the existence of a partially
complete resumable stream from "zfs receive -s" which can be either
aborted ("zfs receive -A") or resumed ("zfs send -t").

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7129
Closes #7154
2018-03-14 16:10:37 -07:00
LOLi a9ff89e05c contrib/initramfs: add missing conf.d/zfs
When upgrading from the distribution-provided zfs-initramfs package on
root-on-zfs Ubuntu and Debian the system may fail to boot: this change
adds the missing initramfs configuration file.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7158
2018-03-14 16:10:37 -07:00
sanjeevbagewadi d85011ed69 mmp should use a fixed tag for spa_config locks
mmp_write_uberblock() and mmp_write_done() should use the same tag
for spa_config_locks.

Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Sanjeev Bagewadi <sanjeev.bagewadi@gmail.com>
Closes #6530
Closes #7155
2018-03-14 16:10:37 -07:00
sanjeevbagewadi b3da003ebf Handle zap_add() failures in mixed case mode
With "casesensitivity=mixed", zap_add() could fail when the number of
files/directories with the same name (varying in case) exceeds the
capacity of the leaf node of a Fatzap. This results in an ASSERT()
failure as zfs_link_create() does not expect zap_add() to fail. The fix
is to handle these failures and roll back the transactions.

Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: Chunwei Chen <david.chen@nutanix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Sanjeev Bagewadi <sanjeev.bagewadi@gmail.com>
Closes #7011
Closes #7054
2018-03-14 16:10:37 -07:00
Chunwei Chen 478754a8f5 Fix zdb -ed on objset for exported pool
zdb -ed on objset for exported pool would fail with:
  failed to own dataset 'qq/fs0': No such file or directory

The reason is that zdb passes the objset name to spa_import, which uses
that name to create a spa. Later, when dmu_objset_own tries to look up the
spa using the real pool name, it can't find one.

We fix this by making sure we pass the pool name rather than the objset
name to spa_import.
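
Deriving the pool name from a target such as 'qq/fs0' can be sketched as
below. The helper is hypothetical, and treating '/', '@' and '#' as the
separators is an assumption of this sketch.

```python
def pool_name(target):
    """Return the pool component of a dataset, snapshot, or bookmark name."""
    for sep in "/@#":
        target = target.split(sep, 1)[0]
    return target
```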

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #7099
Closes #6464
2018-03-14 16:10:37 -07:00
Chunwei Chen 31ff122aa2 Fix zdb -E segfault
SPA_MAXBLOCKSIZE is too large for stack.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #7099
2018-03-14 16:10:36 -07:00
Chunwei Chen 18c662b845 Fix zdb -R decompression
There are some issues in the zdb -R decompression implementation.

The first is that ZLE can easily decompress non-ZLE streams. So we add
ZDB_NO_ZLE env to make zdb skip ZLE.

The second is the random bytes appended to pabd, the pbuf2 stuff. These
serve no purpose at all; those bytes shouldn't be read during decompression
anyway. Instead, we randomize lbuf2, so that we can make sure
decompression fills exactly to lsize by bcmp of lbuf and lbuf2.
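
The two-buffer check can be illustrated with a toy model. Everything here is
an assumption made for the sketch: fake_decompress() stands in for a real
decompressor, and lbuf2 is filled with the bitwise complement of a random
lbuf so that any byte left unwritten is guaranteed to differ.

```python
import os


def fake_decompress(src, dst):
    """Stand-in decompressor: copies src into dst, returns bytes written."""
    n = min(len(src), len(dst))
    dst[:n] = src[:n]
    return n


def fills_exactly(decompress, src, lsize):
    """Check that decompress() overwrites all lsize bytes of its output.

    The two output buffers start out different in every byte, so they can
    only compare equal afterwards if every byte was written.
    """
    lbuf = bytearray(os.urandom(lsize))
    lbuf2 = bytearray(b ^ 0xFF for b in lbuf)
    decompress(src, lbuf)
    decompress(src, lbuf2)
    return lbuf == lbuf2
```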

The last one is that the condition used to detect failure is wrong.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #7099
Closes #4984
2018-03-14 16:10:36 -07:00
Chunwei Chen c797f0898e Fix racy assignment of zcb.zcb_haderrors
zcb_haderrors will be modified in zdb_blkptr_done, which is
asynchronous. So we must move this assignment after zio_wait.
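
The ordering issue can be modeled with a thread standing in for the
asynchronous callback and join() standing in for zio_wait. The class and
function names are hypothetical.

```python
import threading


class CallbackState:
    def __init__(self):
        self.haderrors = False


def blkptr_done(state):
    """Runs asynchronously and records that an error was seen."""
    state.haderrors = True


def run_check():
    state = CallbackState()
    worker = threading.Thread(target=blkptr_done, args=(state,))
    worker.start()
    # Reading state.haderrors here would race with the callback.
    worker.join()           # plays the role of the wait
    return state.haderrors  # safe: read only after the wait has returned
```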

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #7099
2018-03-14 16:10:36 -07:00
Chunwei Chen 5e566c5772 Fix zle_decompress out of bound access
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #7099
2018-03-14 16:10:36 -07:00
Chunwei Chen 23227313a2 Fix zdb -c traverse stop on damaged objset root
If a corruption happens to be on a root block of an objset, zdb -c will
not correctly report the error, and it will not traverse the datasets
that come after. This is because traverse_visitbp, which does the
callback and resets the error for TRAVERSE_HARD, is skipped when
traversing the ZIL fails in traverse_impl.

Here's an example of what the 'zdb -eLcc' command looks like on a pool with
damaged objset root:

== before patch:

Traversing all blocks to verify checksums ...

Error counts:

	errno  count
block traversal size 379392 != alloc 33987072 (unreachable 33607680)

	bp count:             172
	ganged count:           0
	bp logical:       1678336      avg:   9757
	bp physical:       130560      avg:    759     compression:  12.85
	bp allocated:      379392      avg:   2205     compression:   4.42
	bp deduped:             0    ref>1:      0   deduplication:   1.00
	SPA allocated:   33987072     used:  0.80%

	additional, non-pointer bps of type 0:         71
	Dittoed blocks on same vdev: 101

== after patch:

Traversing all blocks to verify checksums ...

zdb_blkptr_cb: Got error 52 reading <54, 0, -1, 0>  -- skipping

Error counts:

	errno  count
	   52  1
block traversal size 33963520 != alloc 33987072 (unreachable 23552)

	bp count:             447
	ganged count:           0
	bp logical:      36093440      avg:  80745
	bp physical:     33699840      avg:  75391     compression:   1.07
	bp allocated:    33963520      avg:  75981     compression:   1.06
	bp deduped:             0    ref>1:      0   deduplication:   1.00
	SPA allocated:   33987072     used:  0.80%

	additional, non-pointer bps of type 0:         76
	Dittoed blocks on same vdev: 115

==

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #7099
2018-03-14 16:10:36 -07:00
Brian Behlendorf 3713b73335 Linux 4.11 compat: avoid refcount_t name conflict
Related to commit 4859fe796, when directly using the kernel's
refcount functions in kernel compatibility code, do not map
refcount_t to zfs_refcount_t.  Doing so leads to a type mismatch.

Longer term we should consider renaming refcount_t to
zfs_refcount_t in the zfs code base.

Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Chunwei Chen <david.chen@nutanix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7148
2018-03-14 16:10:36 -07:00
Brian Behlendorf 310e63dfd1 Linux 4.16 compat: inode_set_iversion()
A new interface was added to manipulate the version field of an
inode.  Add an inode_set_iversion() wrapper for older kernels and
use the new interface when available.

The i_version field was dropped from the trace point due to the
switch to an atomic64_t i_version type.

Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Chunwei Chen <david.chen@nutanix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7148
2018-03-14 16:10:36 -07:00
WHR a196b3bc3d OpenZFS 8966 - Source file zfs_acl.c, function zfs_aclset_common contains a use after end of the lifetime of a local variable
Authored by: WHR <msl0000023508@gmail.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Andriy Gapon <avg@FreeBSD.org>
Reviewed by: George Melikov <mail@gmelikov.ru>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Approved by: Richard Lowe <richlowe@richlowe.net>
Ported-by: Giuseppe Di Natale <dinatale2@llnl.gov>

OpenZFS-issue: https://www.illumos.org/issues/8966
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/c95549fcdc
Closes #7141
2018-03-14 16:10:36 -07:00
Richard Elling a58e1284d8 Remove deprecated zfs_arc_p_aggressive_disable
zfs_arc_p_aggressive_disable is no more. This PR removes docs
and module parameters for zfs_arc_p_aggressive_disable.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Elling <Richard.Elling@RichardElling.com>
Closes #7135
2018-03-14 16:10:36 -07:00
Brian Behlendorf f1dde3fb20 Fix default libdir for Debian/Ubuntu
The distribution provided architecture specific RPM macro files
for x86_64 and other architectures on Debian/Ubuntu specify the
wrong default libdir install location.  When building deb packages
override _lib with the correct location.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7083
Closes #7101
2018-03-14 16:10:36 -07:00
wli5 5f38142e7b Bug fix in qat_compress.c for vmalloc addr check
Remove the unused vmalloc address check.  The mem_to_page() function
handles non-vmalloc addresses when mapping them to a physical
address.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Weigang Li <weigang.li@intel.com>
Closes #7125
2018-03-14 16:10:36 -07:00
LOLi 29b79dcfe9 Fix systemd_ RPM macros usage on Debian-based distributions
Debian-based distributions do not seem to provide RPM macros for
dealing with systemd pre- and post- (un)install actions: this results
in errors when installing or upgrading .deb packages because the
resulting control scripts contain the following unresolved macros:

 * %systemd_post
 * %systemd_preun
 * %systemd_postun

Fix this by providing default values for postinstall, preuninstall and
postuninstall scripts when these macros are not defined.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7074
Closes #7100
2018-03-14 16:10:36 -07:00
John L. Hammond ecc972c7f0 Emit an error message before MMP suspends pool
In mmp_thread(), emit an MMP specific error message before calling
zio_suspend() so that the administrator will understand why the pool
is being suspended.

Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Closes #7048
2018-03-14 16:10:36 -07:00
LOLi 6c891ade8b ZTS: Fix create-o_ashift test case
The function that fills the uberblock ring buffer on every device label
has been reworked to avoid occasional failures caused by a race
condition that prevents 'zpool sync' from writing some uberblock
sequentially: this happens when the pool sync ioctl dispatch code calls
txg_wait_synced() while we're already waiting for a TXG to sync.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6924
Closes #6977
2018-03-14 16:10:36 -07:00
LOLi 03658d5081 Fix --with-systemd on Debian-based distributions (#6963)
These changes propagate the "--with-systemd" configure option to the
RPM spec file, allowing Debian-based distributions to package
systemd-related files.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6591
Closes #6963
2018-03-14 16:10:36 -07:00
Brian Behlendorf 5d62588032 Remove vn_rename and vn_remove dependency
The only place vn_rename and vn_remove are used is when writing
out an updated pool configuration file.  By truncating the file
instead of renaming and removing it we can avoid having to implement
these interfaces entirely.  Functionally an empty cache file is
treated the same as a missing cache file.  This is particularly
advantageous because the Linux kernel has never provided a way
to reliably implement vn_rename and vn_remove.

The cachefile_004_pos.ksh test case was updated to understand
that an empty cache file is the same as a missing one.

The zfs-import-* systemd service files were not updated to use
ConditionFileNotEmpty in place of ConditionPathExists.  This
means that after exporting all pools and rebooting new pools
will not be scanned for on the next boot.  This small change
should not impact normal usage since pools are not exported
as part of a normal shutdown.

Documentation was updated accordingly.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Arkadiusz Bubała <arkadiusz.bubala@open-e.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes zfsonlinux/spl#648
Closes #6753
2018-03-14 16:10:36 -07:00
Brian Behlendorf 6897ea475f Fix "--enable-code-coverage" debug build
When --enable-code-coverage is provided it should not result
in NDEBUG being defined.  This is controlled by --enable-debug.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6674
2018-03-14 16:10:36 -07:00
Brian Behlendorf 3790bfa80f Update codecov.yml
Update the codecov.yml to make the following functional changes.

* Do not require the CI testing to pass before posting results.
* Set red-yellow-green coverage percent from 50%-100%
* Allow a 1% drop in coverage to still be considered a pass.
* Reduce the size of the comment posted to the issue.

Additionally, the top level README.markdown has been updated
to include the codecov.io badge and the project summary reworded.

Reviewed-by: Prakash Surya <prakash.surya@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6669
2018-03-14 16:10:36 -07:00
Prakash Surya 6b278f3223 Add support for "--enable-code-coverage" option
This change adds support for a new option that can be passed to the
configure script: "--enable-code-coverage". Further, the "--enable-gcov"
option has been removed, as this new option provides the same
functionality (plus more).

When using this new option the following make targets are available:

 * check-code-coverage
 * code-coverage-capture
 * code-coverage-clean

Note: these make targets can only be run from the root of the project.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Prakash Surya <prakash.surya@delphix.com>
Closes #6670
2018-03-14 16:10:36 -07:00
Prakash Surya f1236ebf35 Make "-fno-inline" compile option more accessible
When functions are inlined, it can make the system much more difficult
to instrument using tools such as ftrace, BPF, crash, etc. Thus, to aid
development and increase the system's observability, when the
"--enable-debuginfo" flag is specified, the "-fno-inline" compilation
option will be used for both userspace and kernel modules.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Prakash Surya <prakash.surya@delphix.com>
Closes #6605
2018-03-14 16:10:36 -07:00
Brian Behlendorf 184087f822 Add configure option to enable gcov analysis
* Add configure option to enable gcov analysis.
* Includes a few minor ctime fixes.
* Add codecov.yml configuration.

Reviewed-by: Prakash Surya <prakash.surya@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6642
2018-03-14 16:10:36 -07:00
Richard Yao 834815e9f7 Implement --enable-debuginfo to force debuginfo
Inspection of an Ubuntu 14.04 x64 system revealed that the config file
used to build the kernel image differs from the config file used to
build kernel modules by the presence of CONFIG_DEBUG_INFO=y.

This in itself is insufficient to show that the kernel is built with
debuginfo, but a cursory analysis of the debuginfo provided and the
size of the kernel strongly suggests that it was built with
CONFIG_DEBUG_INFO=y while the modules were not. Installing
linux-image-$(uname -r)-dbgsym had no obvious effect on the debuginfo
provided by either the modules or the kernel.

The consequence is that issue reports from distributions such as Ubuntu
and its derivatives, which build kernel modules without debuginfo, contain
nonsensical backtraces. It is therefore desirable to force generation
of debuginfo, so we implement --enable-debuginfo. Since the build system
can build both userspace components and kernel modules, the generic
--enable-debuginfo option will force debuginfo for both. However, it
also supports --enable-debuginfo=kernel and --enable-debuginfo=user for
finer grained control.

Enabling debuginfo for the kernel modules works by injecting
CONFIG_DEBUG_INFO=y into the make environment. This enables
generation of debuginfo by the kernel build systems on all Linux
kernels, but the build environment is slightly different in that
CONFIG_DEBUG_INFO is not defined in the CPP. Adding -DCONFIG_DEBUG_INFO
would fix that, but it would also cause build failures on kernels where
CONFIG_DEBUG_INFO=y is already set. That would complicate its use in
DKMS environments that support a range of kernels and is therefore
undesirable. We could write a compatibility shim to enable
CONFIG_DEBUG_INFO only when it is explicitly disabled, but we forgo
doing that because it is unnecessary. Nothing in ZoL or the kernel uses
CONFIG_DEBUG_INFO in the CPP at this time and that is unlikely to
change.

Enabling debuginfo for the userspace components is done by injecting -g
into CPPFLAGS. This is not necessary because the build system honors the
environment's CPPFLAGS by appending them to the actual CPPFLAGS used,
but it is supported for consistency.

Reviewed-by: Chunwei Chen <tuxoko@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@clusterhq.com>
Closes #2734
2018-03-14 16:10:35 -07:00
Richard Yao 0f1ff38476 Make --enable-debug fail when given bogus args
Currently, bogus options to --enable-debug become --disable-debug. That
means that passing --enable-debug=true is equivalent to --disable-debug,
which is counterintuitive. We switch to AS_CASE to allow us to
fail when given a bogus option.

Also, we modify the text printed to clarify that --enable-debug enables
assertions.

Reviewed-by: Chunwei Chen <tuxoko@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@clusterhq.com>
Closes #2734
2018-03-14 16:10:35 -07:00
Tony Hutter e3b28e16ce Tag zfs-0.7.6
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2018-02-01 10:02:58 -08:00
LOLi 2f62fdd644 Fix 'zfs receive -o' when used with '-e|-d'
When used in conjunction with one of '-e' or '-d' zfs receive options
none of the properties requested to be set (-o) are actually applied:
this is caused by a wrong assumption made about the toplevel dataset
in zfs_receive_one().

Fix this by correctly detecting the toplevel dataset.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7088

Requires-spl: refs/pull/679/head
2018-01-30 10:27:32 -06:00
Brian Behlendorf 137b3e6cff Extend zloop.sh for automated testing
In order to debug issues encountered by ztest during automated
testing it's important that as much debugging information as
possible be dumped at the time of the failure.  The following
changes extend the zloop.sh script in order to make it easier
to integrate with buildbot.

* Add the `-m <maximum cores>` option to zloop.sh to place a
  limit on the number of core dumps generated.  By default, the
  existing behavior is maintained and no limit is set.

* Add the `-l` option to create a 'ztest.core.N' symlink in the
  current directory to the core directory. This functionality
  is provided primarily for buildbot which expects log files to
  have well known names.

* Rename 'ztest.ddt' to 'ztest.zdb' and extend it to dump
  additional basic information on failure for later analysis.

Reviewed-by: Tim Chase <tim@chase2k.com>
Reviewed by: Thomas Caputi <tcaputi@datto.com>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6999

Conflicts:
	scripts/zloop.sh
2018-01-30 10:27:31 -06:00
Don Brady d1630dda58 Cleanup zloop working directory after each pass
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed by: John Kennedy <jwk404@gmail.com>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Signed-off-by: Don Brady <don.brady@delphix.com>
Issue #6595
Closes #6663
2018-01-30 10:27:31 -06:00
Alexander Motin 701ebd014a OpenZFS 8835 - Speculative prefetch in ZFS not working for misaligned reads
In the case of misaligned I/O, sequential requests are not detected as such
due to overlaps in logical block sequence:

    dmu_zfetch(fffff80198dd0ae0, 27347, 9, 1)
    dmu_zfetch(fffff80198dd0ae0, 27355, 9, 1)
    dmu_zfetch(fffff80198dd0ae0, 27363, 9, 1)
    dmu_zfetch(fffff80198dd0ae0, 27371, 9, 1)
    dmu_zfetch(fffff80198dd0ae0, 27379, 9, 1)
    dmu_zfetch(fffff80198dd0ae0, 27387, 9, 1)

This patch makes a single block overlap count as a stream hit,
improving performance up to several times.
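
A minimal Python sketch of the idea, not the actual C change: the
function names is_stream_hit() and count_hits() are hypothetical, and
the "expected next block" bookkeeping is an assumption about how a
stream is tracked.  It replays the (blkid, nblks) pairs from the trace
above and counts a request as a hit when it starts either exactly at
the expected block or one block before it.

```python
def is_stream_hit(expected_blkid, blkid, nblks):
    """Treat an exact match or a one-block overlap as a sequential hit."""
    if blkid == expected_blkid:
        return True
    # A misaligned read touches the last block of the previous request again.
    return nblks > 1 and blkid + 1 == expected_blkid


def count_hits(accesses):
    """Count stream hits over a list of (blkid, nblks) requests."""
    hits = 0
    expected = None
    for blkid, nblks in accesses:
        if expected is not None and is_stream_hit(expected, blkid, nblks):
            hits += 1
        # The next sequential request is expected right after this one.
        expected = blkid + nblks
    return hits


# (blkid, nblks) pairs taken from the dmu_zfetch() trace above.
TRACE = [(27347, 9), (27355, 9), (27363, 9),
         (27371, 9), (27379, 9), (27387, 9)]
```

With exact matching only, none of the requests in TRACE would line up
with the expected block; with the one-block overlap rule every request
after the first counts as a hit.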

Authored by: Alexander Motin <mav@FreeBSD.org>
Approved by: Gordon Ross <gwr@nexenta.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Allan Jude <allanjude@freebsd.org>
Reviewed by: Gvozden Neskovic <neskovic@gmail.com>
Reviewed by: George Melikov <mail@gmelikov.ru>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>

OpenZFS-issue: https://www.illumos.org/issues/8835
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/aab6dd482a
Closes #7062
2018-01-30 10:27:31 -06:00
LOLi 5b8ec2cf39 Fix Debian packaging on ARMv7/ARM64
When building packages on Debian-based systems specify the target
architecture used by 'alien' to convert .rpm packages into .deb: this
avoids detecting an incorrect value which results in the following
errors:

<package>.aarch64.rpm is for architecture aarch64 ; the package cannot be built on this system
<package>.armv7l.rpm is for architecture armel ; the package cannot be built on this system

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #7046
Closes #7058
2018-01-30 10:27:31 -06:00
Brian Behlendorf 9d1a39cec6 Fix shellcheck v0.4.6 warnings
Resolve new warnings reported after upgrading to shellcheck
version 0.4.6.  This patch contains no functional changes.

* egrep is non-standard and deprecated. Use grep -E instead. [SC2196]
* Check exit code directly with e.g. 'if mycmd;', not indirectly
  with $?.  [SC2181]  Suppressed.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7040
2018-01-30 10:27:31 -06:00
DeHackEd 2a7b736dce Remove l2arc_nocompress from zfs-module-parameters(5)
Parameter was removed in d3c2ae1c08
(OpenZFS 6950 - ARC should cache compressed data)

Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: DHE <git@dehacked.net>
Closes #7043
2018-01-30 10:27:31 -06:00
Richard Yao ecc8af1812 Fix incompatibility with Reiser4 patched kernels
In ZFSOnLinux, our sources and build system are self contained such that
we do not need to make changes to the Linux kernel sources. Reiser4 on
the other hand exists solely as a kernel tree patch and opts to make
changes to the kernel rather than adapt to it. After Linux 4.1 made a
VFS change that replaced new_sync_read with do_sync_read, Reiser4's
maintainer decided to modify the kernel VFS to export the old function.
This caused our autotools check to misidentify the kernel API as
predating Linux 4.1 on kernels that have been patched with Reiser4
support, which breaks our build.

Reiser4 really should be patched to stop doing this, but let's modify our
check to be more strict to help the affected users of both filesystems.

Also, we were not checking the types of arguments and return value of
new_sync_read() and new_sync_write(). Let's fix that too.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Richard Yao <ryao@gentoo.org>
Closes #6241
Closes #7021
2018-01-30 10:27:31 -06:00
Alex Zhuravlev 129e3e8dc3 Use zap_count instead of cached z_size for unlink
As a performance optimization Lustre does not strictly update
the SA_ZPL_SIZE when adding/removing from non-directory entries.
This results in entries which cannot be removed through the ZPL
layer even though the ZAP is empty and safe to remove.

Resolve this issue by checking the zap_count() directly instead
of relying on the cached SA_ZPL_SIZE.  Micro-benchmarks show no
significant performance impact due to the additional overhead
of using zap_count().

Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7019
2018-01-30 10:27:31 -06:00
Nathaniel Wesley Filardo 9fb09f79e5 Revert raidz_map and _col structure types
As part of the refactoring of ab9f4b0b82,
several uint64_t-s and uint8_t-s were changed to other types.  This
caused ZoL github issue #6981, an overflow of a size_t on a 32-bit ARM
machine.  In absence of any strong motivation for the type changes, this
simply puts them back, modulo the changes accumulated for ABD.

Compile-tested on amd64 and run-tested on armhf.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Gvozden Neskovic <neskovic@gmail.com>
Signed-off-by: Nathaniel Wesley Filardo <nwf@cs.jhu.edu>
Closes #6981
Closes #7023
2018-01-30 10:27:31 -06:00
Nathaniel Wesley Filardo a2ee6568c6 zhack: fix getopt return type
This fixes zhack's command processing on ARM.  On ARM char
is unsigned, and so, in promotion to an int, it will never
compare equal to -1.
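
The effect can be illustrated from Python with ctypes; this is only a
sketch of the integer conversion, not the zhack code, and the variable
names are made up for the example.  Storing getopt's end-of-options
value (-1) in an unsigned 8-bit type yields 255, which no longer
compares equal to -1 once widened, whereas an int keeps the value.

```python
import ctypes

GETOPT_DONE = -1  # value getopt() returns when the options are exhausted

# An unsigned 8-bit char (as on ARM) wraps -1 around to 255.
as_unsigned_char = ctypes.c_ubyte(GETOPT_DONE).value
# A signed 8-bit char (as on x86) happens to preserve -1.
as_signed_char = ctypes.c_byte(GETOPT_DONE).value
# An int, which is what getopt() returns, always preserves -1.
as_int = ctypes.c_int(GETOPT_DONE).value

loop_would_terminate_unsigned = (as_unsigned_char == GETOPT_DONE)
loop_would_terminate_int = (as_int == GETOPT_DONE)
```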

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Nathaniel Wesley Filardo <nwf@cs.jhu.edu>
Closes #7016
2018-01-30 10:27:31 -06:00
Brian Behlendorf 9c1a8eaa51 Fix ARC hit rate
When the compressed ARC feature was added in commit d3c2ae1
the method of reference counting in the ARC was modified.  As
part of this accounting change the arc_buf_add_ref() function
was removed entirely.

This would have been fine but the arc_buf_add_ref() function
served a second undocumented purpose of updating the ARC access
information when taking a hold on a dbuf.  Without this logic
in place a cached dbuf would not migrate its associated
arc_buf_hdr_t to the MFU list.  This would negatively impact
the ARC hit rate, particularly on systems with a small ARC.

This change reinstates the missing call to arc_access() from
dbuf_hold() by implementing a new arc_buf_access() function.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Tim Chase <tim@chase2k.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6171
Closes #6852
Closes #6989
2018-01-30 10:27:31 -06:00
LOLi a8fa31b50b Fix 'zpool add' handling of nested interior VDEVs
When replacing a faulted device which was previously handled by a spare,
multiple levels of nested interior VDEVs will be present in the pool
configuration; the following example illustrates one of the possible
situations:

   NAME                          STATE     READ WRITE CKSUM
   testpool                      DEGRADED     0     0     0
     raidz1-0                    DEGRADED     0     0     0
       spare-0                   DEGRADED     0     0     0
         replacing-0             DEGRADED     0     0     0
           /var/tmp/fault-dev    UNAVAIL      0     0     0  cannot open
           /var/tmp/replace-dev  ONLINE       0     0     0
         /var/tmp/spare-dev1     ONLINE       0     0     0
       /var/tmp/safe-dev         ONLINE       0     0     0
   spares
     /var/tmp/spare-dev1         INUSE     currently in use

This is safe and allowed, but get_replication() needs to handle this
situation gracefully to let zpool add new devices to the pool.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6678
Closes #6996
2018-01-30 10:27:31 -06:00
lidongyang 8d82a19def Call commit callbacks from the tail of the list
Our zfs backed Lustre MDT had soft lockups while under heavy metadata
workloads while handling transaction callbacks from osd_zfs.

The problem is zfs is not taking advantage of the fast path in
Lustre's trans callback handling, where Lustre will skip the calls
to ptlrpc_commit_replies() when it already saw a higher transaction
number.

This patch corrects this.  It also has a positive impact on metadata
performance on Lustre with osd_zfs, plus some cleanup in the headers.
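
A toy Python model of why the call order matters.  It assumes the
callback list is ordered by ascending transaction number and that the
consumer only does expensive work when it sees a number higher than
any seen so far; count_slow_path_calls() is a hypothetical name, not a
Lustre or ZFS function.

```python
def count_slow_path_calls(transaction_numbers):
    """Count callbacks that miss the 'already saw a higher number' fast path."""
    highest_seen = 0
    slow_calls = 0
    for txn in transaction_numbers:
        if txn > highest_seen:
            # New high-water mark: the expensive path has to run.
            highest_seen = txn
            slow_calls += 1
        # Otherwise a higher transaction was already handled; skip the work.
    return slow_calls


callbacks = [101, 102, 103, 104]                            # list order, head first
from_head = count_slow_path_calls(callbacks)                # every call is slow
from_tail = count_slow_path_calls(reversed(callbacks))      # only the first is slow
```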

A similar issue for ext4/ldiskfs is described on:
https://jira.hpdd.intel.com/browse/LU-6527

Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Li Dongyang <dongyang.li@anu.edu.au>
Closes #6986
2018-01-30 10:27:31 -06:00
Giuseppe Di Natale c2aacf2087 Handle broken pipes in arc_summary
Using a command similar to 'arc_summary.py | head' causes
a broken pipe exception. Gracefully exit in the case of a
broken pipe in arc_summary.py.
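
A minimal sketch of one way to handle this, assuming the output is
written through a file-like object; print_report() is a hypothetical
helper and not the function used in arc_summary.py.  It catches the
EPIPE error raised when the reader (for example 'head') closes the
pipe and reports it to the caller instead of raising.

```python
import errno
import sys


def print_report(lines, out=sys.stdout):
    """Write lines to out; return False if the reader closed the pipe."""
    try:
        for line in lines:
            out.write(line + "\n")
        out.flush()
    except IOError as err:
        # Python 3 raises BrokenPipeError here, which is a subclass of
        # OSError (IOError is an alias), so checking errno covers both
        # Python 2 and Python 3.
        if err.errno == errno.EPIPE:
            return False
        raise
    return True
```

A caller could then exit quietly when print_report() returns False.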

Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #6965 
Closes #6969
2018-01-30 10:27:31 -06:00
LOLi 9a6c57845a Handle invalid options in arc_summary
If an invalid option is provided to arc_summary.py we handle any error
thrown from the getopt Python module and print the usage help message.
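
A short sketch of the pattern, using only the standard getopt module.
The option letters and the parse_args() helper are illustrative
assumptions, not the exact set accepted by arc_summary.py.

```python
import getopt


def parse_args(argv):
    """Return getopt's (options, args) pair, or None for an invalid option."""
    try:
        return getopt.getopt(argv, "adh",
                             ["alternate", "description", "help"])
    except getopt.GetoptError:
        # Unknown option or missing argument: let the caller print usage.
        return None
```

The caller prints the usage message and exits when parse_args()
returns None.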

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6983
2018-01-30 10:27:31 -06:00
Dominik Hassler d27a40d28f OpenZFS 8794 - cstyle generates warnings with recent perl
Authored by: Dominik Hassler <hadfl@omniosce.org>
Reviewed by: Andy Fiddaman <andy@omniosce.org>
Reviewed by: Igor Kozhukhov <igor@dilos.org>
Reviewed by: Toomas Soome <tsoome@me.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Approved by: Dan McDonald <danmcd@joyent.com>
Ported-by: Giuseppe Di Natale <dinatale2@llnl.gov>

OpenZFS-issue: https://www.illumos.org/issues/8794
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/578f67364c
Closes #6973
2018-01-30 10:27:31 -06:00
Brian Behlendorf aebc5df418 Update for cppcheck v1.80
Resolve new warnings and errors from cppcheck v1.80.

* [lib/libshare/libshare.c:543]: (warning)
  Possible null pointer dereference: protocol
* [lib/libzfs/libzfs_dataset.c:2323]: (warning)
  Possible null pointer dereference: srctype
* [lib/libzfs/libzfs_import.c:318]: (error)
  Uninitialized variable: link
* [module/zfs/abd.c:353]: (error) Uninitialized variable: sg
* [module/zfs/abd.c:353]: (error) Uninitialized variable: i
* [module/zfs/abd.c:385]: (error) Uninitialized variable: sg
* [module/zfs/abd.c:385]: (error) Uninitialized variable: i
* [module/zfs/abd.c:553]: (error) Uninitialized variable: i
* [module/zfs/abd.c:553]: (error) Uninitialized variable: sg
* [module/zfs/abd.c:763]: (error) Uninitialized variable: i
* [module/zfs/abd.c:763]: (error) Uninitialized variable: sg
* [module/zfs/abd.c:305]: (error) Uninitialized variable: tmp_page
* [module/zfs/zpl_xattr.c:342]: (warning)
   Possible null pointer dereference: value
* [module/zfs/zvol.c:208]: (error) Uninitialized variable: p

Convert the following suppression to inline.

* [module/zfs/zfs_vnops.c:840]: (error)
  Possible null pointer dereference: aiov

Exclude HAVE_UIO_ZEROCOPY and HAVE_DNLC from analysis since
these macros will never be defined until this functionality
is implemented.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6879
2018-01-30 10:27:31 -06:00
Scot W. Stevenson 7a8bef3983 Fix data on evict_skips in arc_summary.py
Display correct data from kstat arcstats for evict_skips,
which is currently repeating the data from mutex_misses.
Fixes #6882

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Scot W. Stevenson <scot.stevenson@gmail.com>
Closes #6882 
Closes #6883
2018-01-30 10:27:31 -06:00
Scot W. Stevenson d486dee89e Minor code cleanups in arc_python.py
Remove unused library re and associated variable kstat_pobj. Add note
to documentation at start of program about required support for old
versions of Python. Change variable "format" (which is a built-in
function) to "fmt".

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Scot W. Stevenson <scot.stevenson@gmail.com>
Closes #6869
2018-01-30 10:27:31 -06:00
Scot W. Stevenson 7de8fb33a2 Fix arc_summary.py -d crash with Python3
Prevents arc_summary.py crashing when called with parameter -d or
long form --description with Python3.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Scot W. Stevenson <scot.stevenson@gmail.com>
Closes #6849 
Closes #6850
2018-01-30 10:27:31 -06:00
Scot W. Stevenson 904c03672b Sort output of tunables in arc_summary.py
Sort list of tunables printed by _tunable_summary()
alphabetically

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Scot W. Stevenson <scot.stevenson@gmail.com>
Closes #6828
2018-01-30 10:27:30 -06:00
Scot W. Stevenson 03955e3488 Add documentation strings to arc_summary.py
Include docstrings (PEP8, PEP257) for module and all functions.
Separately, remove outdated section in comment at start of
module. Separately, remove unused global constant "usetunable".

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Scot W. Stevenson <scot.stevenson@gmail.com>
Closes #6818
2018-01-30 10:27:30 -06:00
Scot W. Stevenson 88e4e0d5dd Rewrite fHits() in arc_summary.py with SI units
Complete rewrite of fHits(). Move units from non-standard English
abbreviations to SI units, thereby avoiding confusion because of
"long scale" and "short scale" numbers. Remove unused parameter
"Decimal". Add function string. Aim to conform to PEP8.
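
A rough sketch of SI-prefix scaling in Python, to show the kind of
output meant here.  format_hits() and its two-decimal output format
are assumptions for illustration and are not copied from fHits().

```python
def format_hits(value):
    """Return a count scaled with SI prefixes (k, M, G, ...)."""
    prefixes = ["", "k", "M", "G", "T", "P", "E", "Z", "Y"]
    size = float(value)
    index = 0
    # Divide by 1000 until the value is below 1000 or prefixes run out.
    while size >= 1000 and index < len(prefixes) - 1:
        size /= 1000
        index += 1
    if index == 0:
        # Small counts are shown as plain integers.
        return str(int(value))
    return "{0:.2f}{1}".format(size, prefixes[index])
```

SI prefixes are powers of 1000 and mean the same thing in every
locale, which is what avoids the "long scale" versus "short scale"
ambiguity of words such as "billion".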

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Scot W. Stevenson <scot.stevenson@gmail.com>
Closes #6815
2018-01-30 10:27:30 -06:00
Scot W. Stevenson 03f638a8ef Minor code cleanup in arc_summary.py
Simplify and inline single-use function div1(); inline twice-used
function div2(); add function comment to zfs_header(); replace
variable "unused" in get_Kstat() with "_" following convention.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Scot W. Stevenson <scot.stevenson@gmail.com>
Closes #6802
2018-01-30 10:27:30 -06:00
Scot W. Stevenson 5dc25de668 Rewrite of function fBytes() in arc_summary.py
Replace if-elif-else construction with shorter loop;
remove unused parameter "Decimal"; centralize format
string; add function documentation string; conform to
PEP8.
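
For illustration, a loop-based formatter along these lines might look
like the following Python sketch.  format_bytes(), the unit list and
the output format are assumptions, not the code in arc_summary.py.

```python
def format_bytes(value):
    """Return a human-readable string for a byte count (binary units)."""
    units = ["Bytes", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "ZiB", "YiB"]
    size = float(value)
    # Walk up the unit list until the value drops below 1024.
    for unit in units[:-1]:
        if size < 1024:
            break
        size /= 1024
    else:
        # Still 1024 or more after every division: use the largest unit.
        unit = units[-1]
    if unit == "Bytes":
        return "{0:d} Bytes".format(int(value))
    # One central format string for every scaled unit.
    return "{0:.2f} {1}".format(size, unit)
```

A single loop over a unit list replaces one if/elif branch per unit,
and adding a unit only means extending the list.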

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Scot W. Stevenson <scot.stevenson@gmail.com>
Closes #6784
2018-01-30 10:27:30 -06:00
David Quigley 53e5890cff Fix bug in distclean which removes needed files
Running distclean removes the following files because of an error
in Makefile.am

deleted:    tests/zfs-tests/include/commands.cfg
deleted:    tests/zfs-tests/include/libtest.shlib
deleted:    tests/zfs-tests/include/math.shlib
deleted:    tests/zfs-tests/include/properties.shlib
deleted:    tests/zfs-tests/include/zpool_script.shlib

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: David Quigley <david.quigley@intel.com>
Closes #6636
2018-01-30 10:27:30 -06:00
Gvozden Neskovic a94447ddf3 dmu_objset: release bonus buffer in failure path
Reported by kmemleak during testing of a new patch:

```
unreferenced object 0xffff9f1c12e38800 (size 1024):
  comm "z_upgrade", pid 17842, jiffies 4296870904 (age 8746.268s)
  backtrace:
    kmemleak_alloc+0x7a/0x100
    __kmalloc_node+0x26c/0x510
    range_tree_create+0x39/0xa0 [zfs]
    dmu_zfetch_init+0x73/0xe0 [zfs]
    dnode_create+0x12c/0x3b0 [zfs]
    dnode_hold_impl+0x1096/0x1130 [zfs]
    dnode_hold+0x23/0x30 [zfs]
    dmu_bonus_hold_impl+0x6b/0x370 [zfs]
    dmu_bonus_hold+0x1e/0x30 [zfs]
    dmu_objset_space_upgrade+0x114/0x310 [zfs]
    dmu_objset_userobjspace_upgrade_cb+0xd8/0x150 [zfs]
    dmu_objset_upgrade_task_cb+0x136/0x1e0 [zfs]    
    kthread+0x119/0x150
```

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Gvozden Neskovic <neskovic@gmail.com>
Closes #6575
2018-01-30 10:27:30 -06:00
Chunwei Chen 7192ec7942 Fix zfs_ioc_pool_sync should not use fnvlist
Using fnvlist on user input would allow a user to easily panic zfs.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Alek Pinchuk <apinchuk@datto.com>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes #6529
2018-01-30 10:27:30 -06:00
Gvozden Neskovic 06acbbc429 vdev_mirror: load balancing fixes
vdev_queue:
- Track the last position of each vdev, including the io size,
  in order to detect linear access of the following zio.
- Remove duplicate `vq_lastoffset`

vdev_mirror:
- Correctly calculate the zio offset (signedness issue)
- Deprecate `vdev_queue_register_lastoffset()`
- Add `VDEV_LABEL_START_SIZE` to zio offset of leaf vdevs

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gvozden Neskovic <neskovic@gmail.com>
Closes #6461
2018-01-30 10:27:30 -06:00
BtbN 6116bbd744 Use /sbin/openrc-run for openrc init scripts
Using /sbin/runscript is deprecated and throws a QA warning
when still used in init scripts.

Reviewed-by: bunder2015 <omfgbunder@gmail.com>
Signed-off-by: BtbN <btbn@btbn.de>
Closes #6519
2018-01-30 10:27:30 -06:00
Tony Hutter a803eacf26 Tag zfs-0.7.5
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2017-12-18 10:57:47 -08:00
Brian Behlendorf 504bfc8b49 Fix multihost stale cache file import
When the multihost property is enabled it should be impossible to
import an active pool even using the force (-f) option.  This patch
prevents a forced import from succeeding when importing with a
stale cache file.

The root cause of the problem is that the kernel modules trusted
the hostid provided in configuration.  This is always correct when
the configuration is generated by scanning for the pool.  However,
when using an existing cache file the hostid could be stale which
would result in the activity check being skipped.

Resolve the issue by always using the hostid read from the label
configuration where the best uberblock was found.

Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6933
Closes #6971
2017-12-18 10:31:01 -08:00
Olaf Faaland 53a8cbd70e Fix ZTS MMP tests and ztest -M behavior
Quote "$MMP_IMPORT_MSG" when it is passed as an argument, as it is a
multi-word string.  Some tests were passing when they should not have,
because the grep was only testing for the first word.

Correct the message expected when no hostid is set and the test attempts
to enable multihost.  It did not match the actual output in that
situation.

Disable ztest_reguid() when ztest is invoked with the -M option.  If
ztest performs a reguid, a concurrent import attempt may fail with the
error "one or more devices is currently unavailable" if the guid sum is
calculated on the original device guids but compared against the guid
sum ztest wrote based on the new device guids.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #6666
2017-12-18 10:14:39 -08:00
David Qian 505b97ae20 Enable QAT support in zfs-dkms RPM
Enable QAT accelerated gzip compression in the zfs-dkms RPM package when
the environment variable ICP_ROOT is set to the QAT driver source code
folder and QAT hardware is present.  Otherwise, use default gzip compression.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: David Qian <david.qian@intel.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6932
2017-12-18 10:02:19 -08:00
Lalufu 30a64ebaed Add zfs-import.target services in spec file
Add missing zfs-import.target to list of systemd services in zfs
RPM spec file.

Reviewed-by: Niklas Wagner <Skaro@Skaronator.com>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ralf Ertzinger <ralf@skytale.net>
Issue #6953
Closes #6955
2017-12-18 09:45:01 -08:00
Antonio Russo da16fc5739 Enable zfs-import.target in systemd preset (#6968)
Cherry picked line from PR #6822, this enables the new
target introduced in PR #6764.

Signed-off-by: Antonio Russo <antonio.e.russo@gmail.com>
2017-12-18 09:43:55 -08:00
Tony Hutter 3c7fa6ca33 Tag zfs-0.7.4
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2017-12-07 10:25:36 -08:00
Tony Hutter 36e0ddb744 Revert "Long hold the dataset during upgrade"
This reverts commit a5c8119eba.

The commit (which was modified to remove encryption) was hitting
ASSERT(dsl_pool_config_held(dmu_objset_pool(os))) in
dmu_objset_upgrade() during automated testing.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2017-12-06 13:25:40 -06:00
Giuseppe Di Natale cf21b5b5b2 Allow test-runner to filter test groups by tag
Enable test-runner to accept a list of tags to identify
which test groups the user wishes to run.

Also allow test-runner to perform multiple iterations
of a test run.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Wren Kennedy <john.kennedy@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #6788
2017-12-06 13:25:40 -06:00
Brian Behlendorf 1030f807ba Fix NFS sticky bit permission denied error
When zfs_sticky_remove_access() was originally adapted for Linux
a typo was made which altered the intended behavior.  As described
in the block comment, the intended behavior is that permission
should be granted when the entry is a regular file and you have
write access.  That is, S_ISREG should have been used instead of
S_ISDIR.

Restricting permission to regular files made good sense for older
systems where setting the bit on executable files would instruct
the system to save the program's text segment on the swap device.

On modern systems this behavior has been replaced by the sticky
bit acting as a restricted deletion flag and the plain file
restriction has been relaxed.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6889
Closes #6910
2017-12-04 17:22:47 -08:00
JKDingwall d45702bcfa Add /usr/bin/env to COPY_EXEC_LIST initramfs hook
5dc1ff29 changed the user space program to mount a zfs snapshot
from /bin/sh to /usr/bin/env.  If the executable is not present
in the initramfs then snapshots cannot be automounted.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: James Dingwall <james.dingwall@zynstra.com>
Closes #5360
Closes #6913
Conflicts:
	contrib/initramfs/hooks/zfs
2017-12-04 17:22:36 -08:00
Brian Behlendorf ddd20dbe0b Fix 'zpool create|add' replication level check
When the pool configuration contains a hole due to a previous device
removal, ignore this top level vdev.  Failure to do so will result in
the current configuration being assessed to have a non-uniform
replication level and the expected warning will be disabled.

The zpool_add_010_pos test case was extended to cover this scenario.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6907
Closes #6911
2017-12-04 17:21:39 -08:00
Brian Behlendorf 4a98780933 Preserve itx alloc size for zio_data_buf_free()
Using zio_data_buf_alloc() to allocate the itx's may be unsafe
because the itx->itx_lr.lrc_reclen field is not constant from
allocation to free.  Using a different itx->itx_lr.lrc_reclen
size in zio_data_buf_free() can result in the allocation being
returned to the wrong kmem cache.

This issue can be avoided entirely by storing the allocation size
in itx->itx_size and using that for zio_data_buf_free().

Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6912
2017-12-04 17:21:39 -08:00
LOLi 6db8f1a0d1 Fix 'zfs get {user|group}objused@' functionality
Fix a regression accidentally introduced in 1b81ab4 that prevents
'zfs get {user|group}objused@' from correctly reporting the requested
value.

Update "userspace_003_pos.ksh" and "groupspace_003_pos.ksh" to verify
this functionality.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6908
2017-12-04 17:21:39 -08:00
Mark Wright e06711412b Linux 4.14 compat: CONFIG_GCC_PLUGIN_RANDSTRUCT
Fix build errors with gcc 7.2.0 on Gentoo with kernel 4.14
built with CONFIG_GCC_PLUGIN_RANDSTRUCT=y such as:

module/nvpair/nvpair.c:2810:2:error:
positional initialization of field in 'struct' declared with
'designated_init' attribute [-Werror=designated-init]
  nvs_native_nvlist,
  ^~~~~~~~~~~~~~~~~

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Mark Wright <gienah@gentoo.org>
Closes #5390
Closes #6903
2017-12-04 17:21:39 -08:00
Richard Laager 68ba1d2fa9 initramfs: Honor canmount=off
The initramfs script was not honoring canmount=off.  With this change,
it does.  If the administrator has asked that a filesystem not be
mounted, that should be honored.

As an exception, the initramfs script ignores canmount=off on the
rootfs.  The rootfs should not have canmount=off set either.  However,
mounting it anyway seems harmless because it is being asked for
explicitly.  The point of this exception is to avoid the risk of
breaking existing systems, just in case someone has canmount=off set on
their rootfs.

The initramfs still mounts filesystems with canmount=noauto.  This is
necessary because it is typical to set that on the rootfs so that it can
be cloned.  Without canmount=noauto, the clones' duplicate mountpoints
would conflict.

This is the remainder of the fix for:
https://github.com/zfsonlinux/pkg-zfs/issues/221

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Richard Laager <rlaager@wiktel.com>
Closes #6897
2017-12-04 17:21:38 -08:00
Richard Laager 4e11137989 initramfs: Honor mountpoint=none/legacy
For filesystems that are children of the rootfs, when mountpoint=none or
mountpoint=legacy, the initramfs script would assume a mountpoint based
on the dataset path.  Given that the rootfs should have mountpoint=/ and
mountpoint inheritance is the default behavior of ZFS, this behavior
seems unnecessary.  In any event, it turns mountpoint=none into a no-op.
That removes this option from the administrator, and if someone uses it,
it does not work as expected.  Worse yet, if the mountpoint directory
does not exist (which is the typical case for mountpoint=none), the
mounting and thus the boot process will fail.  For the case of
mountpoint=legacy, the assumed mountpoint may not be the correct value
set in /etc/fstab.

This change makes the initramfs script not mount the filesystem in
either case.  For mountpoint=none, this means we are correctly honoring
the setting.  For mountpoint=legacy, there are two scenarios:  If
canmount=on, the filesystem will be mounted by the normal mechanisms
later in the boot process.  If canmount=noauto, the filesystem will not
be mounted at all, unless the administrator has done something special.
If they're not doing something special and they want it mounted by the
initramfs, they can simply not set mountpoint=legacy.

This is part of the fix for:
https://github.com/zfsonlinux/pkg-zfs/issues/221

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Richard Laager <rlaager@wiktel.com>
Closes #6897
2017-12-04 17:21:38 -08:00
DeHackEd be9be1cc3e zpool(8): Fix "zpool import -t"
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: DHE <git@dehacked.net>
Closes #6894
2017-12-04 17:21:38 -08:00
George G eab4536081 Fix column alignment with long zpool names
`zpool status` normally aligns NAME/STATE/etc columns:

    NAME                       STATE     READ WRITE CKSUM
    dummy                      ONLINE       0     0     0
      mirror-0                 ONLINE       0     0     0
        /tmp/dummy-long-1.bin  ONLINE       0     0     0
        /tmp/dummy-long-2.bin  ONLINE       0     0     0
      mirror-1                 ONLINE       0     0     0
        /tmp/dummy-long-3.bin  ONLINE       0     0     0
        /tmp/dummy-long-4.bin  ONLINE       0     0     0

However, if the zpool name is longer than the vdev names, alignment
issues arise:

    NAME                  STATE     READ WRITE CKSUM
    dummy-very-very-long-zpool-name  ONLINE       0     0     0
      mirror-0            ONLINE       0     0     0
        /tmp/dummy-1.bin  ONLINE       0     0     0
        /tmp/dummy-2.bin  ONLINE       0     0     0
      mirror-1            ONLINE       0     0     0
        /tmp/dummy-3.bin  ONLINE       0     0     0
        /tmp/dummy-4.bin  ONLINE       0     0     0

`zpool iostat` and `zpool import` are also affected:

                  capacity     operations     bandwidth
    pool        alloc   free   read  write   read  write
    ----------  -----  -----  -----  -----  -----  -----
    dummy        104K  1.97G      0      0    152  9.84K
    dummy-very-very-long-zpool-name   152K  1.97G      0      1    144  13.1K
    ----------  -----  -----  -----  -----  -----  -----

    dummy-very-very-long-zpool-name  ONLINE
      mirror-0            ONLINE
        /tmp/dummy-1.bin  ONLINE
        /tmp/dummy-2.bin  ONLINE
      mirror-1            ONLINE
        /tmp/dummy-3.bin  ONLINE
        /tmp/dummy-4.bin  ONLINE
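
A minimal sketch of one way to size the NAME column so that it also
accounts for the pool name.  It is written in Python for brevity (the
zpool utility itself is C), and the `Node` type and the `name_width()`
and `render()` helpers are hypothetical, not taken from this patch:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a pool or vdev in the config tree."""
    name: str
    children: List["Node"] = field(default_factory=list)


def name_width(node: Node, depth: int = 0, indent: int = 2) -> int:
    """Widest indented name in the tree, including the root (pool) name."""
    width = depth * indent + len(node.name)
    for child in node.children:
        width = max(width, name_width(child, depth + 1, indent))
    return width


def render(node: Node, width: int, depth: int = 0, indent: int = 2) -> List[str]:
    """Return one 'NAME  STATE' row per node, padded to a common width."""
    label = " " * (depth * indent) + node.name
    rows = [f"{label:<{width}}  ONLINE"]
    for child in node.children:
        rows.extend(render(child, width, depth + 1, indent))
    return rows


pool = Node("dummy-very-very-long-zpool-name", [
    Node("mirror-0", [Node("/tmp/dummy-1.bin"), Node("/tmp/dummy-2.bin")]),
])
rows = render(pool, name_width(pool))
```

Because the width is taken over every node including the root, the
STATE column starts at the same offset on every row regardless of which
name is longest.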

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Gaydarov <git@gg7.io>
Closes #6786
2017-12-04 17:21:38 -08:00
Brian Behlendorf 954516cec1 Emit history events for 'zpool create'
History commands and events were being suppressed for the
'zpool create' command since the history object did not
yet exist.  Create the object earlier so this history
doesn't get lost.

Split the pool_destroy event into pool_destroy and
pool_export so they may be distinguished.

Updated events_001_pos and events_002_pos test cases.  They
now check for the expected history events and were reworked
to be more reliable.

Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6712
Closes #6486
Conflicts:
	tests/zfs-tests/tests/functional/events/events_002_pos.ksh
2017-12-04 17:21:03 -08:00
Brian Behlendorf 841cb5ee2a Fix dirty check in dmu_offset_next()
The correct way to determine if a dnode is dirty is to check
if any of the dn->dn_dirty_link's are active.  Relying solely
on the dn->dn_dirtyctx can result in the dnode being mistakenly
reported as clean.

Reviewed-by: Chunwei Chen <tuxoko@gmail.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #3125 
Closes #6867
2017-11-21 13:11:29 -06:00
Brian Behlendorf d4cf31275b Disable automatic dependencies in zfs-test package
All of the ZTS test scripts specify /bin/ksh as the interpreter.
Unfortunately, as of Fedora 27 only /usr/bin/ksh is provided by
the package manager.  Rather than change all the scripts to
accommodate the latest Fedora, disable automatic dependencies
for the zfs-test package.  Functionally this will not cause
any problems since /bin is a symlink to /usr/bin.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6868
2017-11-21 13:11:29 -06:00
LOLi fedc1d96a8 Fix truncate(2) mtime and ctime handling
On Linux, ftruncate(2) always changes the file timestamps, even if the
file size is not changed. However, in case of a successful
truncate(2), the timestamps are updated only if the file size changes.
This translates to the VFS calling the ZFS Posix Layer "setattr"
function (zpl_setattr) with ATTR_MTIME and ATTR_CTIME unconditionally
set on the iattr mask only when doing a ftruncate(2), while the
truncate(2) is left to the filesystem implementation to be dealt with.

This behaviour is consistent with POSIX:2004/SUSv3 specifications
where there's no explicit requirement for file size changes to update
the timestamps only for ftruncate(2):

http://pubs.opengroup.org/onlinepubs/009695399/functions/truncate.html
http://pubs.opengroup.org/onlinepubs/009695399/functions/ftruncate.html

This has been later updated in POSIX:2008/SUSv4 where, for both
truncate(2)/ftruncate(2), there's no mention of this size change
requirement:

http://austingroupbugs.net/view.php?id=489
http://pubs.opengroup.org/onlinepubs/9699919799/functions/truncate.html
http://pubs.opengroup.org/onlinepubs/9699919799/functions/ftruncate.html

Unfortunately the Linux VFS is still calling into the ZPL without
ATTR_MTIME/ATTR_CTIME set in the truncate(2) case: we fix this by
explicitly updating the timestamps when detecting the ATTR_SIZE bit,
which is always set in do_truncate(), on the iattr mask.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6811
Closes #6819
2017-11-21 13:11:29 -06:00
benrubson 59511072b4 OpenZFS 7531 - Assign correct flags to prefetched buffers
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Authored by: abraunegg <alex.braunegg@gmail.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>

OpenZFS-issue: https://www.illumos.org/issues/7531
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/468008cb
2017-11-21 13:11:29 -06:00
Arkadiusz Bubała a5c8119eba Long hold the dataset during upgrade
If a receive or rollback is performed while the filesystem is upgrading,
the objset may be evicted in `dsl_dataset_clone_swap_sync_impl`. This
leads to a NULL pointer dereference when the upgrade tries to access
the evicted objset.

This commit adds a long hold on the dataset for the whole upgrade
process.  Receive and rollback will return an EBUSY error until the
upgrade has finished.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arkadiusz Bubała <arkadiusz.bubala@open-e.com>
Closes #5295
Closes #6837
2017-11-21 13:03:21 -06:00
Tim Chase d7881a6dca Handle compressed buffers in __dbuf_hold_impl()
In __dbuf_hold_impl(), if a buffer is currently syncing and is still
referenced from db_data, a copy is made in case it is dirtied again in
the txg.  Previously, the buffer for the copy was simply allocated with
arc_alloc_buf() which doesn't handle compressed or encrypted buffers
(which are a special case of a compressed buffer).  The result was
typically an invalid memory access because the newly-allocated buffer
was of the uncompressed size.

This commit fixes the problem by handling the 2 compressed cases,
encrypted and unencrypted, respectively, with arc_alloc_raw_buf() and
arc_alloc_compressed_buf().

Although using the proper allocation functions fixes the invalid memory
access by allocating a buffer of the compressed size, another unrelated
issue made it impossible to properly detect compressed buffers in the
first place.  The header's compression flag was set to ZIO_COMPRESS_OFF
in arc_write() when it was possible that an attached buffer was actually
compressed.  This commit adds logic to only set ZIO_COMPRESS_OFF in
the non-ZIO_RAW case which will handle both cases of compressed buffers
(encrypted or unencrypted).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tim Chase <tim@chase2k.com>
Closes #5742
Closes #6797
2017-11-21 13:01:30 -06:00
LOLi 951e62169e Fix undefined %{systemd_svcs} in RPM scriptlets
This allows RPM-based systems to properly control package installation
and removal when using systemd.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6838 
Closes #6841
2017-11-20 16:48:26 -06:00
wli5 9add19b37d Bug fix in qat_compress.c when compressed size is < 4KB
When a 128KB block is compressed to less than 4KB, the pointer
to the Footer is not at the end of the compressed buffer because
the Header offset was added twice in this case. This leaves a gap
between the Footer and the compressed buffer.
1. Always compute the Footer pointer address from the start of the
last page.
2. Remove the unused workaround code; the underlying issue has been
verified fixed with the latest driver and this fix.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Weigang Li <weigang.li@intel.com>
Closes #6827
2017-11-20 16:48:26 -06:00
Brian Behlendorf b2d633202d Disable automatic dependencies in DKMS package
By default additional dependencies are generated automatically for
packages.  This is normally a good thing because it helps ensure
things just work.  It doesn't make sense for the DKMS package which
requires minimal dependencies that can be easily listed.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6467 
Closes #6835
2017-11-20 16:48:25 -06:00
Brian Behlendorf 414f4a9c54 Initramfs fixes
* initramfs: Fix inconsistent whitespace
* initramfs: Fix a spelling error
* initramfs: Set elevator=noop on the rpool's disks

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Richard Laager <rlaager@wiktel.com>
Closes #6807
2017-11-20 16:23:33 -06:00
Antonio Russo 1c4f5e7d92 systemd zfs-import.target and documentation
zfs-import-{cache,scan}.service must complete before any mounting of
filesystems can occur. To simplify this dependency, create a target
that is reached After (in the systemd sense) the pool is imported.

Additionally, recommend that legacy zfs mounts use the option

x-systemd.requires=zfs-import.target

to codify this requirement.

Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Antonio Russo <antonio.e.russo@gmail.com>
Closes #6764
2017-11-20 16:20:08 -06:00
abraunegg 246e515cf8 Update zfs module parameters man5
Update zfs module parameters man5 with missing parameter details
for multiple tunings.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Alex Braunegg <alex.braunegg@gmail.com>
Closes #6785
2017-11-20 16:19:54 -06:00
Brian Behlendorf 2d41e75e52 Fix status command options in zpool(8)
The 'zpool status' command supports the -P option for printing full
path names.  It does not support the -p parsable option for printing
exact values.
    
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6792 
Closes #6794
2017-11-20 16:19:23 -06:00
Fabian-Gruenbichler d834d6811b arcstat: flush stdout / outfile after each line
Otherwise, if arcstat gets interrupted before the desired number of
iterations is reached, the output file will be empty (both if set via
'-o' or via shell redirection).
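
A minimal sketch of the flush-per-line pattern described above.  The
`report()` function and its parameters are hypothetical and are not
arcstat's actual code:

```python
import sys
import time


def report(samples, out=sys.stdout, interval=0.0):
    """Write one line per sample and flush so partial output survives."""
    for sample in samples:
        out.write(f"{sample}\n")
        out.flush()  # push the line to the file or pipe immediately
        time.sleep(interval)
```

Without the explicit flush, output sent to a file or pipe is typically
block-buffered, so an interrupted run can lose lines that were already
"written".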

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Closes #6775
2017-11-20 16:19:23 -06:00
Giuseppe Di Natale 029a1b0c20 Ensure arc_size_break is filled in arc_summary.py
Use mfu_size and mru_size pulled from the arcstats
kstat file to calculate the mfu and mru percentages
for arc size breakdown.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: AndCycle <andcycle@andcycle.idv.tw>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #5526 
Closes #6770
2017-11-20 16:19:23 -06:00
Giuseppe Di Natale c45254b0ec Correct flake8 errors after STYLE builder update
Fix new flake8 errors related to bare excepts and ambiguous
variable names due to a STYLE builder update.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #6776
2017-11-20 16:19:23 -06:00
wli5 318fdeb51f Support integration with new QAT products
Support integration with new QAT products: Intel(R) C62x Chipset,
or Atom(R) C3000 Processor Product Family SoC:
1. Detect new file name in auto-conf.
2. Change MAX_INSTANCES to 48.
3. Change "num_inst" to U16 to clean a build warning.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Weigang Li <weigang.li@intel.com>
Closes #6767
2017-11-20 16:19:23 -06:00
Olaf Faaland d3d20bf442 Reimplement vdev_random_leaf and rename it
Rename it as mmp_random_leaf() since it is defined in mmp.c.

The earlier implementation could end up spinning forever if a pool had a
vdev marked writeable, none of whose children were writeable.  It also
did not guarantee that if a writeable leaf vdev existed, it would be
found.

Reimplement to recursively walk the device tree to select the leaf.  It
searches the entire tree, so that a return value of (NULL) indicates
there were no usable leaves in the pool; all were either not writeable
or had pending mmp writes.

It still chooses the starting child randomly at each level of the tree,
so if the pool's devices are healthy, the mmp writes go to random leaves
with an even distribution.  This was verified by testing using
zfs_multihost_history enabled.
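
The recursive walk can be sketched as follows.  This is an illustrative
Python model, not the C code in mmp.c; the `Vdev` class and its
`writeable` and `mmp_pending` fields are simplified, hypothetical
stand-ins:

```python
import random
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Vdev:
    """Hypothetical, simplified vdev: only the fields this sketch needs."""
    name: str
    writeable: bool = True
    mmp_pending: bool = False
    children: List["Vdev"] = field(default_factory=list)


def random_leaf(vd: Vdev, rng: random.Random) -> Optional[Vdev]:
    """Return a usable leaf under vd, or None if the subtree has none."""
    if not vd.writeable:
        return None
    if not vd.children:
        # A leaf is usable only if it has no pending mmp write.
        return None if vd.mmp_pending else vd
    n = len(vd.children)
    start = rng.randrange(n)  # random starting child at this level
    for i in range(n):
        # Visit every child once, wrapping around from the random start.
        leaf = random_leaf(vd.children[(start + i) % n], rng)
        if leaf is not None:
            return leaf
    return None
```

Each level visits every child at most once, so the walk always
terminates, and a None result means the whole subtree was searched
without finding a usable leaf.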

Reviewed by: Thomas Caputi <tcaputi@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #6631 
Closes #6665
2017-11-20 16:19:23 -06:00
Tony Hutter 99598264fc Tag zfs-0.7.3
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2017-10-18 11:00:26 -07:00
Neal Gompa (ニール・ゴンパ) abe30b7b40 Add DKMS package on Debian-based distributions
* config/deb.am: Enable building DKMS packages for Debian
* rpm/generic/zfs-dkms.spec.in: Adjust spec to be Debian-compatible
  * Condition kernel-devel Req to RPM distros
  * Adjust the DKMS Req to have a minimum of a version only
  * Ensure that --rpm_safe_upgrade isn't used on non-RPM distros
* config/deb.am: Drop CONFIG_KERNEL and CONFIG_USER guards
* Makefile.am: Add pkg-dkms target

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Neal Gompa <ngompa@datto.com>
Closes #6044
Closes #6731
2017-10-17 16:49:19 -07:00
Tobin Harding f90ee0ca3d Fix function documentation to correctly mirror code
Currently the function documentation states that two strings are
allocated, but this is outdated. Only one char ** parameter is passed
into the function now, so only a pointer to a single string
is returned and needs to be freed.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Closes #6754
2017-10-17 16:49:14 -07:00
Brian Behlendorf 4ed955e280 Increase default zloop.sh vdev size
The default 128M vdev size used by zloop.sh isn't always large
enough and can result in ENOSPC failures which suspend the pool.
Increase the default size to 512M and provide a -s option which
can be used to specify an alternate size.

This does increase the free space requirements to run zloop.sh.
However, since the vdevs are sparse 4x the space is not required.

Reviewed-by: Don Brady <don.brady@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6758
2017-10-17 16:49:08 -07:00
Damian Wojsław 1721f13e76 Typo in dsl_dataset.h
The parameters dsl_dataset_t *os in function prototype should be
renamed to dsl_dataset_t *ds.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Damian Wojsław <damian@wojslaw.pl>
Closes #6756
Closes #6273
2017-10-17 16:49:03 -07:00
Brian Behlendorf 6e893ef62a Fix chattr/cleanup failure
The chattr cleanup step may fail to delete the user if there is still
an active process running as that user.  Retry the userdel when this
occurs to eliminate spurious false positives.

  ERROR: userdel quser1 exited 8
  userdel: user quser1 is currently used by process 26814

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6749
2017-10-17 16:48:58 -07:00
Brian Behlendorf e0eaaf8144 Fixes for SPARC support
The current code base almost compiles on SPARC, but a few fixes are
required for the code to compile (and work efficiently). Code in this
PR comes from the OpenZFS project; it was initially dropped when porting
the crypto framework.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Pengcheng Xu <i@jsteward.moe>
Closes #6733
Closes #6738
Closes #6750
2017-10-16 10:57:55 -07:00
Antonio Russo cb8a074dcb Explicitly depend on icp module in initramfs hook
Automatic dependency resolution is unreliable on many systems.
Follow suit with existing code, and explicitly include icp
in module dependencies.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Antonio Russo <antonio.e.russo@gmail.com>
Closes #6751
2017-10-16 10:57:55 -07:00
aun c3ac4ccabb Fix boot from ZFS issues
* Correct ZFS snapshot listing
* Disable "lvm is not available" message on quiet boot

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alar Aun <spamtoaun@gmail.com>
Closes #6700
Closes #6747
2017-10-16 10:57:55 -07:00
Fabian Grünbichler 8d688ce66a Skip FREEOBJECTS for objects which can't exist
When sending an incremental stream based on a snapshot, the receiving
side must have the same base snapshot.  Thus we do not need to send
FREEOBJECTS records for any objects past the maximum one which exists
locally.

This allows us to send incremental streams (again) to older ZFS
implementations (e.g. ZoL < 0.7) which actually try to free all objects
in a FREEOBJECTS record, instead of bailing out early.

Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Closes #5699
Closes #6507
Closes #6616
2017-10-16 10:57:55 -07:00
Fabian Grünbichler b544fe4123 Free objects when receiving full stream as clone
All objects after the last written or freed object are not supposed to
exist after receiving the stream.  Free them accordingly, as if a
freeobjects record for them had been included in the stream.

Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Closes #5699
Closes #6507
Closes #6616
2017-10-16 10:57:55 -07:00
LOLi 926c6ec453 Fix intra-pool resumable 'zfs send -t <token>'
Because resuming from a token requires "guid" -> "snapshot" mapping
we have to walk the whole dataset hierarchy to find the right snapshot
to send; when both source and destination exist, for an incremental
resumable stream, libzfs gets confused and picks up the wrong snapshot
to send from: this results in attempting to send

   "destination@snap1 -> source@snap2"

instead of

   "source@snap1 -> source@snap2"

which fails with an "Invalid cross-device link" error (EXDEV).

Fix this by adjusting the logic behind dataset traversal in
zfs_iter_children() to pick the right snapshot to send from.

Additionally update dry-run 'zfs send -t' to print its output to
stderr: this is consistent with other dry-run commands.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6618
Closes #6619
Closes #6623
2017-10-16 10:57:55 -07:00
Brian Behlendorf 91b2f6ab1c Fix ARC behavior on 32-bit systems
With the addition of the ABD changes consumption of the virtual
address space has been greatly reduced.  This exposed an issue on
CONFIG_HIGHMEM systems where free memory was being calculated
incorrectly.  Functionally this didn't cause any major problems
prior to ABD because a lack of available virtual address space
was used as an indicator of low memory.

This patch makes the following changes to address the issue and
in the process realigns the code further with OpenZFS.  There
are no substantive changes in behavior for 64-bit systems.

* Added CONFIG_HIGHMEM case to the arc_all_memory() and
  arc_free_memory() functions to only consider low memory pages
  on CONFIG_HIGHMEM systems.

* The arc_free_memory() function was updated to return bytes
  instead of pages to be consistent with the other helper
  functions.  In user space we make up some reasonable values
  since currently only testing is performed in this context.

* Adds three new values to the arcstats kstat to provide visibility
  into the ARC's assessment of the memory situation:
  memory_all_bytes, memory_free_bytes, and memory_available_bytes.

* Added kmem_reap() call to arc_available_memory() for 32-bit
  builds to realign code with OpenZFS.

* Reduced size of test file in /async_destroy_001_pos.ksh to
  speed up test case.  Multiple txgs are still required.

* Move vdevs used by zpool_clear_001_pos and zpool_upgrade_002_pos
  to TEST_BASE_DIR location to speed up test cases.

Reviewed-by: David Quigley <david.quigley@intel.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #5352
Closes #6734
2017-10-16 10:57:55 -07:00
privb0x23 851a7cd833 Fix inclusion of libgcc_s.so on Void
On Void Linux (x86_64 musl) libgcc_s.so is located in "/usr/lib"
so it is not found by dracut and it produces an error.

Add a simple additional path check for "/usr/lib/libgcc_s.so*"
and install it in the initramfs.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: privb0x23 <privb0x23@users.noreply.github.com>
Closes #6715
2017-10-16 10:57:55 -07:00
Tobin Harding 83d4d1a784 Use logical '&&' instead of bitwise '&'
Make two instances of the same change. Change bitwise AND (&) to logical
AND (&&).

Currently the code uses a bitwise AND between two boolean values.

In the first instance;

The first operand is a flag that has been bitwise combined with a bit
mask to get a boolean value as to whether a file has group write
permissions set.

The second operand used is a struct member that is intended as a
boolean flag not a bit mask.

In the second instance the argument is the same except with world write
permissions instead of group write (S_IWOTH, S_IWGRP).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Closes #6684
Closes #6722
2017-10-16 10:57:55 -07:00
Tobin Harding 80cc2f6111 Remove unnecessary equality check
Currently the `if` statement includes an assignment (from a function
return value) and an equality check. The parentheses are in the wrong
place, so the code currently clobbers the function return value.

We can fix this by simplifying the `if` statement.

`if (foo != 0)`

can be more succinctly expressed as

`if (foo)`

Remove the equality check and add parentheses to correct the statement.
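
The precedence problem described above can be reproduced in a small
standalone sketch. The function names and the value 22 are illustrative
assumptions, not the code touched by this commit.

```c
/* Stand-in for a function that returns an errno-style failure code. */
static int
lookup_status(void)
{
	return (22);
}

/*
 * Misplaced parentheses: `!=` binds tighter than `=`, so err receives
 * the result of the comparison rather than the function return value.
 */
static int
misplaced_parens(void)
{
	int err;

	if ((err = lookup_status() != 0))
		return (err);
	return (0);
}

/* Simplified form: err holds the actual return value. */
static int
simplified(void)
{
	int err;

	if ((err = lookup_status()))
		return (err);
	return (0);
}
```

Here misplaced_parens() reports 1 regardless of the underlying code,
while simplified() preserves the original value.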

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Closes #6685
Close #6719
2017-10-16 10:57:55 -07:00
Isaac Huang b97948276d Use linear abd in vdev_copy_uberblocks()
The vdev_copy_uberblocks() function should use abd_alloc_linear() to
allocate ub_abd, because abd_to_buf(ub_abd) is used later.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Isaac Huang <he.huang@intel.com>
Closes #6718
Closes #6713
2017-10-16 10:57:55 -07:00
Ned Bass 4cfc086e4d receive_freeobjects() skips freeing some objects
When receiving a FREEOBJECTS record, receive_freeobjects()
incorrectly skips a freed object in some cases. Specifically, this
happens when the first object in the range to be freed doesn't exist,
but the second object does. This leaves an object allocated on disk
on the receiving side which is unallocated on the sending side, which
may cause receiving subsequent incremental streams to fail.

The bug was caused by an incorrect increment of the object index
variable when the current object being freed doesn't exist.  The
increment is incorrect because incrementing the object index is
handled by a call to dmu_object_next() in the increment portion of
the for loop statement.
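
The effect of the redundant increment can be modelled with a simplified
sketch. The array-backed object table, next_object() and free_range()
are hypothetical stand-ins for the DMU interfaces, not the actual
receive code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define	NOBJ	8

/* Advance *obj to the next allocated object after it; -1 if none. */
static int
next_object(const bool *allocated, size_t *obj)
{
	for (size_t i = *obj + 1; i < NOBJ; i++) {
		if (allocated[i]) {
			*obj = i;
			return (0);
		}
	}
	return (-1);
}

/* Free every allocated object in [first, first + count). */
static int
free_range(bool *allocated, size_t first, size_t count, bool extra_increment)
{
	int freed = 0;
	int next_err = 0;

	assert(first + count <= NOBJ);
	for (size_t obj = first; obj < first + count && next_err == 0;
	    next_err = next_object(allocated, &obj)) {
		if (!allocated[obj]) {
			if (extra_increment)
				obj++;	/* redundant: the loop advances obj */
			continue;
		}
		allocated[obj] = false;
		freed++;
	}
	return (freed);
}
```

With object 0 unallocated and objects 1 and 2 allocated, the variant
with the extra increment steps past object 1 and frees only object 2,
while the variant without it frees both.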

Add test case that exposes this bug.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ned Bass <bass6@llnl.gov>
Closes #6694
Closes #6695
2017-10-16 10:57:55 -07:00
chrisrd 25d232f407 Scale the dbuf cache with arc_c
Commit d3c2ae1 introduced a dbuf cache with a default size of the
minimum of 100M or 1/32 maximum ARC size. (These figures may be adjusted
using dbuf_cache_max_bytes and dbuf_cache_max_shift.) The dbuf cache
is counted as metadata for the purposes of ARC size calculations.

On a 1GB box the ARC maximum size defaults to c_max 493M which gives a
dbuf cache default minimum size of 15.4M, and the ARC metadata defaults
to minimum 16M. I.e. the dbuf cache is a significant proportion of the
minimum metadata size. With other overheads involved this actually means
the ARC metadata doesn't get down to the minimum.

This patch dynamically scales the dbuf cache to the target ARC size
instead of statically scaling it to the maximum ARC size. (The scale is
still set by dbuf_cache_max_shift and the maximum size is still fixed by
dbuf_cache_max_bytes.) Using the target ARC size rather than the current
ARC size is done to help the ARC reach the target rather than simply
focusing on the current size.
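
The sizing rule described above amounts to taking the smaller of a fixed
cap and a shifted ARC target. A minimal sketch, assuming a hypothetical
helper named dbuf_cache_target() whose parameters mirror
dbuf_cache_max_bytes and dbuf_cache_max_shift:

```c
#include <stdint.h>

/*
 * Return the dbuf cache size for a given ARC target: the target scaled
 * down by max_shift, capped at max_bytes.
 */
static uint64_t
dbuf_cache_target(uint64_t arc_target, uint64_t max_bytes, int max_shift)
{
	uint64_t scaled = arc_target >> max_shift;

	return (scaled < max_bytes ? scaled : max_bytes);
}
```

For a 493M target with a shift of 5 this gives roughly 15.4M, matching
the figure quoted above. A 64G target is limited by the 100M cap.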

Reviewed-by: Chunwei Chen <tuxoko@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Chris Dunlop <chris@onthe.net.au>
Issue #6506
Closes #6561
2017-10-16 10:57:54 -07:00
Tony Hutter edd7c24623 Tag zfs-0.7.2
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2017-09-22 11:14:01 -07:00
Giuseppe Di Natale bef6a8bc3a Correct cppcheck errors (#6662)
ZFS buildbot STYLE builder was moved to Ubuntu 17.04
which has a newer version of cppcheck. Handle the
new cppcheck errors.

uu_* functions removed in this commit were unused
and effectively dead code. They are now retired.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #6653
2017-09-20 12:59:21 -07:00
Brian Behlendorf 266b181e75 Increase default arc_c_min
Increase the default arc_c_min value to whichever is larger,
either 32M or 1/32 of total system memory.  This is advantageous for
systems with more than 1G of memory where performance issues may
occur when the ARC is allowed to collapse below a minimum size.
At the same time we want to use the bare minimum value which is
still functional so the filesystem can be used in very low memory
environments.
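
The new default can be sketched as a simple maximum. The helper name
arc_c_min_default() and the macro below are assumptions made for
illustration. Only the 32M floor and the 1/32 ratio come from the text
above.

```c
#include <stdint.h>

#define	ARC_MIN_FLOOR	(32ULL << 20)	/* 32M floor */

/* Return the larger of the 32M floor and 1/32 of total system memory. */
static uint64_t
arc_c_min_default(uint64_t allmem)
{
	uint64_t scaled = allmem / 32;

	return (scaled > ARC_MIN_FLOOR ? scaled : ARC_MIN_FLOOR);
}
```

Under this rule a system with 1G of memory or less stays at the 32M
floor, while a 16G system gets a 512M minimum.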

Reviewed-by: Tim Chase <tim@chase2k.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6659
2017-09-20 10:25:54 -07:00
Brian Behlendorf c474f5e9a7 Export symbol dmu_tx_mark_netfree()
This symbol is needed by Lustre for the same reason it was needed
by the ZPL.  It should have been exported when the original patch
was merged.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6660
2017-09-20 10:25:54 -07:00
Brian Behlendorf 4e6a9e4598 ZTS fix slog_replay_volume.ksh failure
The slog_replay_volume.ksh test case will fail when the pool is
layered on files in a filesystem which does not support discard.
Avoid this issue by creating the pool using DISKS which will
either be loopback devices or real disks.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6654
2017-09-20 10:25:54 -07:00
Brian Behlendorf 661907e6bc Linux 4.14 compat: IO acct, global_page_state, etc (#6655)
generic_start_io_acct/generic_end_io_acct in the master
branch of the linux kernel requires that the request_queue
be provided.

Move the logic from freemem in the spl to arc_free_memory
in arc.c. Do this so we can take advantage of global_page_state
interface checks in zfs.

Upstream kernel replaced struct block_device with
struct gendisk in struct bio. Determine if the
function bio_set_dev exists during configure
and have zfs use that if it exists.

bio_set_dev https://github.com/torvalds/linux/commit/74d4699
global_node_page_state https://github.com/torvalds/linux/commit/75ef718
io acct https://github.com/torvalds/linux/commit/d62e26b

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #6635

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2017-09-19 14:24:34 -07:00
Gaurav Kumar d3e7d981d4 Modifying XATTRs doesnt change the ctime
Changing any metadata should modify the ctime.

Reviewed-by: Chunwei Chen <tuxoko@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: gaurkuma <gauravk.18@gmail.com>
Closes #3644
Closes #6586
2017-09-13 16:05:18 -07:00
Brian Behlendorf a2a0440918 Fix volume WR_INDIRECT log replay (#6620)
The portion of the zvol_replay_write() handler responsible for
replaying indirect log records for some reason never existed.
As a result indirect log records were not being correctly replayed.

This went largely unnoticed since the majority of zvol log records
were of the type WR_COPIED or WR_NEED_COPY prior to OpenZFS 7578.

This patch updates zvol_replay_write() to correctly handle these
log records and adds a new test case which verifies volume replay
to prevent any regression.  The existing test case which verified
replay on filesystem was renamed slog_replay_fs.ksh for clarity.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6603
2017-09-13 16:04:16 -07:00
Giuseppe Di Natale 45d1abc74d Improved dnode allocation and dmu_hold_impl() (#6611)
Refactor dmu_object_alloc_dnsize() and dnode_hold_impl() to simplify the
code, fix errors introduced by commit dbeb879 (PR #6117) interacting
badly with large dnodes, and improve performance.

* When allocating a new dnode in dmu_object_alloc_dnsize(), update the
percpu object ID for the core's metadnode chunk immediately.  This
eliminates most lock contention when taking the hold and creating the
dnode.

* Correct detection of the chunk boundary to work properly with large
dnodes.

* Separate the dnode_hold_impl() code for the FREE case from the code for
the ALLOCATED case to make it easier to read.

* Fully populate the dnode handle array immediately after reading a
block of the metadnode from disk.  Subsequently the dnode handle array
provides enough information to determine which dnode slots are in use
and which are free.

* Add several kstats to allow the behavior of the code to be examined.

* Verify dnode packing in large_dnode_008_pos.ksh.  Since the test
performs only creates, it should leave very few holes in the metadnode.

* Add test large_dnode_009_pos.ksh, which performs concurrent creates
and deletes, to complement existing test which does only creates.

With the above fixes, there is very little contention in a test of about
200,000 racing dnode allocations produced by tests 'large_dnode_008_pos'
and 'large_dnode_009_pos'.

name                            type data
dnode_hold_dbuf_hold            4    0
dnode_hold_dbuf_read            4    0
dnode_hold_alloc_hits           4    3804690
dnode_hold_alloc_misses         4    216
dnode_hold_alloc_interior       4    3
dnode_hold_alloc_lock_retry     4    0
dnode_hold_alloc_lock_misses    4    0
dnode_hold_alloc_type_none      4    0
dnode_hold_free_hits            4    203105
dnode_hold_free_misses          4    4
dnode_hold_free_lock_misses     4    0
dnode_hold_free_lock_retry      4    0
dnode_hold_free_overflow        4    0
dnode_hold_free_refcount        4    57
dnode_hold_free_txg             4    0
dnode_allocate                  4    203154
dnode_reallocate                4    0
dnode_buf_evict                 4    23918
dnode_alloc_next_chunk          4    4887
dnode_alloc_race                4    0
dnode_alloc_next_block          4    18

The performance is slightly improved for concurrent creates with
16+ threads, and unchanged for low thread counts.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
2017-09-13 15:46:15 -07:00
dbavatar 89950722c6 Linux 4.8+ compatibility fix for vm stats
vm_node_stat must be used instead of vm_zone_stat. Unfortunately the
old code still compiles, potentially leading to silent failure of
arc_evictable_memory().

AKAMAI: CR 3816601: Regression in zfs dropcache test

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Chunwei Chen <tuxoko@gmail.com>
Signed-off-by: Debabrata Banerjee <dbanerje@akamai.com>
Closes #6528
2017-09-13 14:21:59 -07:00
LOLi 4810a108e8 Disable mount(8) canonical paths in do_mount()
By default the mount(8) command, as invoked by 'zfs mount', will try
to resolve any path parameter in its canonical form: this could lead
to mount failures when the cwd contains a symlink having the same name
as the dataset being mounted.

Fix this by explicitly disabling mount(8) path canonicalization.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #1791 
Closes #6429 
Closes #6437
2017-08-21 16:46:55 -07:00
LOLi ae5b4a05ff Fix range locking in ZIL commit codepath
Since OpenZFS 7578 (1b7c1e5) if we have a ZVOL with logbias=throughput
we will force WR_INDIRECT itxs in zvol_log_write() setting itx->itx_lr
offset and length to the offset and length of the BIO from
zvol_write()->zvol_log_write(): these offset and length are later used
to take a range lock in zillog->zl_get_data function: zvol_get_data().

Now suppose we have a ZVOL with blocksize=8K and push 4K writes to
offset 0: we will only be range-locking 0-4096. This means the
ASSERTion we make in dbuf_unoverride() is no longer valid because now
dmu_sync() is called from zilog's get_data functions holding a partial
lock on the dbuf.

Fix this by taking a range lock on the whole block in zvol_get_data().
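
One way to picture the widened lock is to align the write offset down to
the start of its block and lock a full block from there. This is a
sketch only, for a write contained in a single block. The struct and
function names are hypothetical and do not reflect the actual
zvol_get_data() code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of a byte range to lock. */
struct lock_range {
	uint64_t lr_start;
	uint64_t lr_len;
};

/* Expand a write at `offset` to cover the whole block that contains it. */
static struct lock_range
whole_block_range(uint64_t offset, uint64_t blocksize)
{
	struct lock_range r;

	assert(blocksize != 0);
	r.lr_start = offset - (offset % blocksize);
	r.lr_len = blocksize;
	return (r);
}
```

With blocksize=8K, a 4K write at offset 0 would lock 0-8192 rather than
0-4096, and a write at offset 12288 would lock 8192-16384.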

Reviewed-by: Chunwei Chen <tuxoko@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6238 
Closes #6315 
Closes #6356 
Closes #6477
2017-08-21 16:46:54 -07:00
LOLi 3468fdbd34 Fix remounting snapshots read-write
It's not enough to preserve/restore MS_RDONLY on the superblock flags
to avoid remounting a snapshot read-write: be explicit about our
intentions to the VFS layer so the readonly bit is updated correctly
in do_remount_sb().

Reviewed-by: Chunwei Chen <tuxoko@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6510
Closes #6515
2017-08-21 16:46:52 -07:00
Brian Behlendorf fb3f1fdbd6 Fix ZTS grow_pool/setup
The addition of the large_dnode_008_pos test case, which runs
right before this one, exposed some racy behavior in grow_pool
setup.sh on the Ubuntu kmemleak builder.  Before creating
partitions on a device, destroy any existing ones.

  ERROR: set_partition 1  100mb loop0 exited 1

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6499 
Closes #6516
2017-08-21 16:41:22 -07:00
sckobras 426563be70 vdev_id: implement slot numbering by port id
With HPE hardware and hpsa-driven SAS adapters, only a single phy is
reported, but no individual per-port phys (ie. no phy* entry below
port_dir), which breaks topology detection in the current sas_handler
code. Instead, slot information can be derived directly from the port
number. This change implements a new slot keyword "port" similar to
"id" and "lun", and assumes a default phy/port of 0 if no individual
phy entry can be found. This allows the "sas_direct" topology to be
used with current HPE Dxxxx and Apollo 45xx JBODs.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Daniel Kobras <d.kobras@science-computing.de>
Closes #6484
2017-08-21 16:41:22 -07:00
Chunwei Chen aec4318870 Fix NULL pointer when O_SYNC read in snapshot
When reading a file opened with O_SYNC, zil_commit is triggered.
However, a snapshot has no zil, so we shouldn't be doing that.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes #6478 
Closes #6494
2017-08-21 16:41:22 -07:00
sanjeevbagewadi 2d9b57d39f zio_dva_throttle_done() should allow zinjected ZIO
If fault injection is enabled, the ZIO_FLAG_IO_RETRY could be set by
zio_handle_device_injection() to generate the FMA events and update
stats. Hence, ignore the flag and process such zios.

A better fix would be to add another flag in the zio_t to indicate that
the zio is failed because of a zinject rule. However, considering the
fact that we do this in debug bits, we could do with the crude check
using the global flag zio_injection_enabled which is set to 1 when
zinject records are added.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Sanjeev Bagewadi <sanjeev.bagewadi@gmail.com>
Closes #6383 
Closes #6384
2017-08-21 16:41:22 -07:00
Fabian-Gruenbichler 4bdb8fcfa8 Man page fixes
* ztest.1 man page: fix typo
* zfs-module-parameters.5 man page: fix grammar

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Closes #6492
2017-08-21 16:41:22 -07:00
gaurkuma 58c1c40a5e Crash in dbuf_evict_one with DTRACE_PROBE
Update the dbuf__evict__one() tracepoint so that it can safely
handle a NULL dmu_buf_impl_t pointer.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>    
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: gaurkuma <gauravk.18@gmail.com>
Closes #6463
2017-08-21 16:41:22 -07:00
Tony Hutter 751575fe6f Tag zfs-0.7.1
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2017-08-08 13:14:32 -07:00
Brian Behlendorf 751941e248 Fix dnode allocation race
When performing concurrent object allocations using the new
multi-threaded allocator and large dnodes it's possible to
allocate overlapping large dnodes.

This case should have been handled by detecting an error
returned by dnode_hold_impl().  But that logic only checked
the returned dnp was not-NULL, and the dnp variable was not
reset to NULL when retrying.  Resolve this issue by properly
checking the return value of dnode_hold_impl().

Additionally, it was possible that dnode_hold_impl() would
misreport a dnode as free when it was in fact in use.  This
could occur for two reasons:

* The per-slot zrl_lock must be held over the entire critical
  section which includes the alloc/free until the new dnode
  is assigned to children_dnodes.  Additionally, all of the
  zrl_lock's in the range must be held to protect moving
  dnodes.

* The dn->dn_ot_type cannot be solely relied upon to check
  the type.  When allocating a new dnode its type will be
  DMU_OT_NONE after dnode_create().  Only later when
  dnode_allocate() is called will it transition to the new
  type.  This means there's a window when allocating where
  it can be mistaken for a free dnode.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Ned Bass <bass6@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6414
Closes #6439
2017-08-08 10:17:33 -07:00
Ned Bass ef605a5517 Add debug log entries for failed receive records
Log contents of a receive record if an error occurs while writing
it out to the pool. This may help determine the cause when backup
streams are rejected as invalid.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ned Bass <bass6@llnl.gov>
Closes #6465
2017-08-08 10:17:23 -07:00
Karsten Kretschmer 8eb6dcec7d dracut: Install commands required for vdev_id
The vdev_id script requires awk, grep, and head.  Use dracut_install to
ensure that these commands are available in the initrd environment.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Karsten Kretschmer <kkretschmer@gmail.com>
Closes #6443
Closes #6452
2017-08-07 09:37:30 -07:00
Tony Hutter 07cbcd5089 Only record zio->io_delay on reads and writes
While investigating https://github.com/zfsonlinux/zfs/issues/6425 I
noticed that ioctl ZIOs were not setting zio->io_delay correctly.  They
would set the start time in zio_vdev_io_start(), but never set the end
time in zio_vdev_io_done(), since ioctls skip it and go straight to
zio_done().  This was causing spurious "delayed IO" events to appear,
which would eventually get rate-limited and displayed as
"Missed events" messages in zed.

To get around the problem, this patch only sets zio->io_delay for read
and write ZIOs, since that's all we care about anyway.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #6425
Closes #6440
2017-08-02 11:37:18 -07:00
Giuseppe Di Natale 12acabe2a4 mmp_on_uberblocks: Use kstat for uberblock counts
Use kstat to get a more accurate count of uberblock updates.
Using a loop with zdb can potentially miss some uberblocks.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #6407
Closes #6419
2017-08-02 11:21:33 -07:00
LOLi 20c88dc3ef Fix volmode=none property behavior at import time
At import time spa_import() calls zvol_create_minors() directly: with
the current implementation we have no way to avoid device node
creation when volmode=none.

Fix this by enforcing volmode=none directly in zvol_alloc().

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6426
2017-08-02 11:21:14 -07:00
Brian Behlendorf 0c8fedeb35 Fix aarch64 build
Add aarch64 to the list of architectures which do not sanitize the
LDFLAGS from the environment.  See fb963d33 for details.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6424
2017-08-02 11:20:50 -07:00
Giuseppe Di Natale affb7141d7 Disable zfs_send_007_pos
Test case zfs_send_007_pos is regularly killed
by test-runner during zfs-tests on buildbot. Disable
it for now until further investigation can be done.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #6422
2017-08-02 11:20:32 -07:00
bunder2015 e0031d86b7 Correct man page generation
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: bunder2015 <omfgbunder@gmail.com>
Closes #6409
Closes #6411
2017-07-28 11:01:53 -07:00
2860 changed files with 65739 additions and 354133 deletions
-10
@@ -1,10 +0,0 @@
root = true
[*]
end_of_line = lf
insert_final_newline = true
trim_trailing_whitespace = true
[*.{c,h}]
tab_width = 8
indent_style = tab
+66 -122
@@ -1,12 +1,10 @@
# Contributing to OpenZFS
<p align="center">
<img alt="OpenZFS Logo"
src="https://openzfs.github.io/openzfs-docs/_static/img/logo/480px-Open-ZFS-Secondary-Logo-Colour-halfsize.png"/>
</p>
# Contributing to ZFS on Linux
<p align="center"><img src="http://zfsonlinux.org/images/zfs-linux.png"/></p>
*First of all, thank you for taking the time to contribute!*
By using the following guidelines, you can help us make OpenZFS even better.
By using the following guidelines, you can help us make ZFS on Linux even
better.
## Table Of Contents
[What should I know before I get
@@ -29,22 +27,19 @@ started?](#what-should-i-know-before-i-get-started)
* [Commit Message Formats](#commit-message-formats)
* [New Changes](#new-changes)
* [OpenZFS Patch Ports](#openzfs-patch-ports)
* [Coverity Defect Fixes](#coverity-defect-fixes)
* [Signed Off By](#signed-off-by)
Helpful resources
* [OpenZFS Documentation](https://openzfs.github.io/openzfs-docs/)
* [OpenZFS Developer Resources](http://open-zfs.org/wiki/Developer_resources)
* [Git and GitHub for beginners](https://openzfs.github.io/openzfs-docs/Developer%20Resources/Git%20and%20GitHub%20for%20beginners.html)
* [ZFS on Linux wiki](https://github.com/zfsonlinux/zfs/wiki)
* [OpenZFS Documentation](http://open-zfs.org/wiki/Developer_resources)
## What should I know before I get started?
### Get ZFS
You can build zfs packages by following [these
instructions](https://openzfs.github.io/openzfs-docs/Developer%20Resources/Building%20ZFS.html),
instructions](https://github.com/zfsonlinux/zfs/wiki/Building-ZFS),
or install stable packages from [your distribution's
repository](https://openzfs.github.io/openzfs-docs/Getting%20Started/index.html).
repository](https://github.com/zfsonlinux/zfs/wiki/Getting-Started).
### Debug ZFS
A variety of methods and tools are available to aid ZFS developers.
@@ -53,30 +48,28 @@ configure option should be set. This will enable additional correctness
checks and all the ASSERTs to help quickly catch potential issues.
In addition, there are numerous utilities and debugging files which
provide visibility into the inner workings of ZFS. The most useful
of these tools are discussed in detail on the [Troubleshooting
page](https://openzfs.github.io/openzfs-docs/Basic%20Concepts/Troubleshooting.html).
provide visibility in to the inner workings of ZFS. The most useful
of these tools are discussed in detail on the [debugging ZFS wiki
page](https://github.com/zfsonlinux/zfs/wiki/Debugging).
### Where can I ask for help?
The [zfs-discuss mailing
list](https://openzfs.github.io/openzfs-docs/Project%20and%20Community/Mailing%20Lists.html)
or IRC are the best places to ask for help. Please do not file
support requests on the GitHub issue tracker.
The [mailing list](https://github.com/zfsonlinux/zfs/wiki/Mailing-Lists)
is the best place to ask for help.
## How Can I Contribute?
### Reporting Bugs
*Please* contact us via the [zfs-discuss mailing
list](https://openzfs.github.io/openzfs-docs/Project%20and%20Community/Mailing%20Lists.html)
or IRC if you aren't certain that you are experiencing a bug.
*Please* contact us via the [mailing
list](https://github.com/zfsonlinux/zfs/wiki/Mailing-Lists) if you aren't
certain that you are experiencing a bug.
If you run into an issue, please search our [issue
tracker](https://github.com/openzfs/zfs/issues) *first* to ensure the
tracker](https://github.com/zfsonlinux/zfs/issues) *first* to ensure the
issue hasn't been reported before. Open a new issue only if you haven't
found anything similar to your issue.
You can open a new issue and search existing issues using the public [issue
tracker](https://github.com/openzfs/zfs/issues).
tracker](https://github.com/zfsonlinux/zfs/issues).
#### When opening a new issue, please include the following information at the top of the issue:
* What distribution (with version) you are using.
@@ -108,13 +101,13 @@ information like:
* Stack traces which may be logged to `dmesg`.
### Suggesting Enhancements
OpenZFS is a widely deployed production filesystem which is under active
development. The team's primary focus is on fixing known issues, improving
performance, and adding compelling new features.
ZFS on Linux is a widely deployed production filesystem which is under
active development. The team's primary focus is on fixing known issues,
improving performance, and adding compelling new features.
You can view the list of proposed features
by filtering the issue tracker by the ["Type: Feature"
label](https://github.com/openzfs/zfs/issues?q=is%3Aopen+is%3Aissue+label%3A%22Type%3A+Feature%22).
by filtering the issue tracker by the ["Feature"
label](https://github.com/zfsonlinux/zfs/issues?q=is%3Aopen+is%3Aissue+label%3AFeature).
If you have an idea for a feature first check this list. If your idea already
appears then add a +1 to the top most comment, this helps us gauge interest
in that feature.
@@ -123,11 +116,8 @@ Otherwise, open a new issue and describe your proposed feature. Why is this
feature needed? What problem does it solve?
### Pull Requests
#### General
* All pull requests, except backports and releases, must be based on the current master branch
and should apply without conflicts.
* All pull requests must be based on the current master branch and apply
without conflicts.
* Please attempt to limit pull requests to a single commit which resolves
one specific issue.
* Make sure your commit messages are in the correct format. See the
@@ -139,28 +129,16 @@ logically independent patches which build on each other. This makes large
changes easier to review and approve which speeds up the merging process.
* Try to keep pull requests simple. Simple code with comments is much easier
to review and approve.
* All proposed changes must be approved by an OpenZFS organization member.
* If you have an idea you'd like to discuss or which requires additional testing, consider opening it as a draft pull request.
Once everything is in good shape and the details have been worked out you can remove its draft status.
Any required reviews can then be finalized and the pull request merged.
#### Tests and Benchmarks
* Every pull request will by tested by the buildbot on multiple platforms by running the [zfs-tests.sh and zloop.sh](
https://openzfs.github.io/openzfs-docs/Developer%20Resources/Building%20ZFS.html#running-zloop-sh-and-zfs-tests-sh) test suites.
* To verify your changes conform to the [style guidelines](
https://github.com/openzfs/zfs/blob/master/.github/CONTRIBUTING.md#style-guides
), please run `make checkstyle` and resolve any warnings.
* Static code analysis of each pull request is performed by the buildbot; run `make lint` to check your changes.
* Test cases should be provided when appropriate.
This includes making sure new features have adequate code coverage.
* If your pull request improves performance, please include some benchmarks.
* The pull request must pass all required [ZFS
Buildbot](http://build.zfsonlinux.org/) builders before
being accepted. If you are experiencing intermittent TEST
builder failures, you may be experiencing a [test suite
issue](https://github.com/openzfs/zfs/issues?q=is%3Aissue+is%3Aopen+label%3A%22Type%3A+Test+Suite%22).
There are also various [buildbot options](https://openzfs.github.io/openzfs-docs/Developer%20Resources/Buildbot%20Options.html)
issue](https://github.com/zfsonlinux/zfs/issues?q=is%3Aissue+is%3Aopen+label%3A%22Test+Suite%22).
There are also various [buildbot options](https://github.com/zfsonlinux/zfs/wiki/Buildbot-Options)
to control how changes are tested.
* All proposed changes must be approved by a ZFS on Linux organization member.
### Testing
All help is appreciated! If you're in a position to run the latest code
@@ -170,41 +148,16 @@ range of realistic workloads, configurations and architectures we're better
able quickly identify and resolve potential issues.
Users can also run the [ZFS Test
Suite](https://github.com/openzfs/zfs/tree/master/tests) on their systems
Suite](https://github.com/zfsonlinux/zfs/tree/master/tests) on their systems
to verify ZFS is behaving as intended.
## Style Guides
### Repository Structure
OpenZFS uses a standardised branching structure.
- The "development and main branch", is the branch all development should be based on.
- "Release branches" contain the latest released code for said version.
- "Staging branches" contain selected commits prior to being released.
**Branch Names:**
- Development and Main branch: `master`
- Release branches: `zfs-$VERSION-release`
- Staging branches: `zfs-$VERSION-staging`
`$VERSION` should be replaced with the `major.minor` version number.
_(This is the version number without the `.patch` version at the end)_
### Coding Conventions
We currently use [C Style and Coding Standards for
SunOS](http://www.cis.upenn.edu/%7Elee/06cse480/data/cstyle.ms.pdf) as our
coding convention.
This repository has an `.editorconfig` file. If your editor [supports
editorconfig](https://editorconfig.org/#download), it will
automatically respect most of this project's whitespace preferences.
Additionally, Git can help warn on whitespace problems as well:
```
git config --local core.whitespace trailing-space,space-before-tab,indent-with-non-tab,-tab-in-indent
```
### Commit Message Formats
#### New Changes
Commit messages for new changes must meet the following guidelines:
@@ -214,10 +167,18 @@ first line in the commit message.
please summarize important information such as why the proposed
approach was chosen or a brief description of the bug you are resolving.
Each line of the body must be 72 characters or less.
* The last line must be a `Signed-off-by:` tag. See the
[Signed Off By](#signed-off-by) section for more information.
* The last line must be a `Signed-off-by:` tag with the developer's
name followed by their email. This is the developer's certification
that they have the right to submit the patch for inclusion into
the code base and indicates agreement to the [Developer's Certificate
of Origin](https://www.kernel.org/doc/html/latest/process/submitting-patches.html#sign-your-work-the-developer-s-certificate-of-origin).
Code without a proper signoff cannot be merged.
An example commit message for new changes is provided below.
Git can append the `Signed-off-by` line to your commit messages. Simply
provide the `-s` or `--signoff` option when performing a `git commit`.
For more information about writing commit messages, visit [How to Write
a Git Commit Message](https://chris.beams.io/posts/git-commit/).
An example commit message is provided below.
```
This line is a brief summary of your change
@@ -230,52 +191,35 @@ attempting to solve.
Signed-off-by: Contributor <contributor@email.com>
```
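Two of the guidelines above (body lines of at most 72 characters, and a closing `Signed-off-by:` tag carrying a name and an email) are mechanical enough to check automatically. The following is a minimal sketch of such a check, not a tool shipped with the project; the function name `check_commit_message` and the loose email pattern are assumptions made for illustration, and the length limit is applied to every line after the summary as a simplification.

```python
import re

# Loose, illustrative pattern: "Signed-off-by: Name <user@host>".
SIGNOFF = re.compile(r"Signed-off-by: .+ <[^<>\s]+@[^<>\s]+>")


def check_commit_message(message):
    """Return a list of problems found in a commit message (empty if none)."""
    problems = []
    lines = message.rstrip("\n").split("\n")
    # Everything after the summary line is treated as the body.
    for number, line in enumerate(lines[1:], start=2):
        if len(line) > 72:
            problems.append("line %d is longer than 72 characters" % number)
    # The final line must be the sign-off tag.
    if not SIGNOFF.fullmatch(lines[-1]):
        problems.append("last line is not a Signed-off-by: tag")
    return problems
```

A check like this could be run from a local `commit-msg` hook, although committing with `git commit -s` already covers the sign-off part.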
#### Coverity Defect Fixes
If you are submitting a fix to a
[Coverity defect](https://scan.coverity.com/projects/zfsonlinux-zfs),
the commit message should meet the following guidelines:
* Provides a subject line in the format of
`Fix coverity defects: CID dddd, dddd...` where `dddd` represents
each CID fixed by the commit.
* Provides a body which lists each Coverity defect and how it was corrected.
* The last line must be a `Signed-off-by:` tag. See the
[Signed Off By](#signed-off-by) section for more information.
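The subject-line format above can be expressed as a regular expression. The sketch below is illustrative only: the helper name `coverity_cids` is hypothetical, and it assumes CIDs are plain decimal numbers separated by a comma and a single space, as in the format string.

```python
import re

# "Fix coverity defects: CID dddd, dddd..." with one or more numeric CIDs.
COVERITY_SUBJECT = re.compile(r"Fix coverity defects: CID (\d+(?:, \d+)*)")


def coverity_cids(subject):
    """Return the list of CIDs named in a subject line, or None if it
    does not follow the expected format."""
    match = COVERITY_SUBJECT.fullmatch(subject)
    if match is None:
        return None
    return [int(cid) for cid in match.group(1).split(", ")]
```

The returned list can then be compared against the defects described in the commit body, since the body is expected to cover each CID in turn.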
#### OpenZFS Patch Ports
If you are porting an OpenZFS patch, the commit message must meet
the following guidelines:
* The first line must be the summary line from the OpenZFS commit.
It must begin with `OpenZFS dddd - ` where `dddd` is the OpenZFS issue number.
* Provides an `Authored by:` line to attribute the patch to the original author.
* Provides the `Reviewed by:` and `Approved by:` lines from the original
OpenZFS commit.
* Provides a `Ported-by:` line with the developer's name followed by
their email.
* Provides an `OpenZFS-issue:` line which is a link to the original illumos
issue.
* Provides an `OpenZFS-commit:` line which links back to the original OpenZFS
commit.
* If necessary, provide some porting notes to describe any deviations from
the original OpenZFS commit.
An example Coverity defect fix commit message is provided below.
An example OpenZFS patch port commit message is provided below.
```
Fix coverity defects: CID 12345, 67890
OpenZFS 1234 - Summary from the original OpenZFS commit
CID 12345: Logically dead code (DEADCODE)
Authored by: Original Author <original@email.com>
Reviewed by: Reviewer One <reviewer1@email.com>
Reviewed by: Reviewer Two <reviewer2@email.com>
Approved by: Approver One <approver1@email.com>
Ported-by: ZFS Contributor <contributor@email.com>
Removed the if(var != 0) block because the condition could never be
satisfied.
Provide some porting notes here if necessary.
CID 67890: Resource Leak (RESOURCE_LEAK)
Ensure free is called after allocating memory in function().
Signed-off-by: Contributor <contributor@email.com>
OpenZFS-issue: https://www.illumos.org/issues/1234
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/abcd1234
```
#### Signed Off By
A line tagged as `Signed-off-by:` must contain the developer's
name followed by their email. This is the developer's certification
that they have the right to submit the patch for inclusion into
the code base and indicates agreement to the [Developer's Certificate
of Origin](https://www.kernel.org/doc/html/latest/process/submitting-patches.html#sign-your-work-the-developer-s-certificate-of-origin).
Code without a proper signoff cannot be merged.
Git can append the `Signed-off-by` line to your commit messages. Simply
provide the `-s` or `--signoff` option when performing a `git commit`.
For more information about writing commit messages, visit [How to Write
a Git Commit Message](https://chris.beams.io/posts/git-commit/).
#### Co-authored By
If someone else had a part in your pull request, please add the following to the commit:
`Co-authored-by: Name <gitregistered@email.address>`
This is useful if their authorship was lost during squashing, rebasing, etc.,
but may be used in any situation where there are co-authors.
The email address used here should be the same as on the GitHub profile of said user.
If said user does not have their email address public, please use the following instead:
`Co-authored-by: Name <[username]@users.noreply.github.com>`
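The two address forms above can be combined into one small rule: use the public email when there is one, otherwise fall back to the GitHub noreply address. The sketch below shows that rule; the function `co_author_trailer` and its parameters are hypothetical names chosen for illustration.

```python
def co_author_trailer(name, username, public_email=None):
    """Build a Co-authored-by trailer for a commit message.

    Falls back to the user's GitHub noreply address when no public
    email address is known.
    """
    email = public_email or "%s@users.noreply.github.com" % username
    return "Co-authored-by: %s <%s>" % (name, email)
```

For instance, `co_author_trailer("Name", "username")` produces `Co-authored-by: Name <username@users.noreply.github.com>`.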
+46
@@ -0,0 +1,46 @@
<!--
Thank you for reporting an issue.
*IMPORTANT* - Please search our issue tracker *before* making a new issue.
If you cannot find a similar issue, then create a new issue.
https://github.com/zfsonlinux/zfs/issues
*IMPORTANT* - This issue tracker is for *bugs* and *issues* only.
Please search the wiki and the mailing list archives before asking
questions on the mailing list.
https://github.com/zfsonlinux/zfs/wiki/Mailing-Lists
Please fill in as much of the template as possible.
-->
### System information
<!-- add version after "|" character -->
Type | Version/Name
--- | ---
Distribution Name |
Distribution Version |
Linux Kernel |
Architecture |
ZFS Version |
SPL Version |
<!--
Commands to find ZFS/SPL versions:
modinfo zfs | grep -iw version
modinfo spl | grep -iw version
-->
### Describe the problem you're observing
### Describe how to reproduce the problem
### Include any warning/errors/backtraces from the system logs
<!--
*IMPORTANT* - Please mark logs and text output from terminal commands
or else GitHub will not display them correctly.
An example is provided below.
Example:
```
this is an example of how log text should be marked (wrap it with ```)
```
-->
-53
@@ -1,53 +0,0 @@
---
name: Bug report
about: Create a report to help us improve OpenZFS
title: ''
labels: 'Type: Defect'
assignees: ''
---
<!-- Please fill out the following template, which will help other contributors address your issue. -->
<!--
Thank you for reporting an issue.
*IMPORTANT* - Please check our issue tracker before opening a new issue.
Additional valuable information can be found in the OpenZFS documentation
and mailing list archives.
Please fill in as much of the template as possible.
-->
### System information
<!-- add version after "|" character -->
Type | Version/Name
--- | ---
Distribution Name |
Distribution Version |
Linux Kernel |
Architecture |
ZFS Version |
SPL Version |
<!--
Commands to find ZFS/SPL versions:
modinfo zfs | grep -iw version
modinfo spl | grep -iw version
-->
### Describe the problem you're observing
### Describe how to reproduce the problem
### Include any warning/errors/backtraces from the system logs
<!--
*IMPORTANT* - Please mark logs and text output from terminal commands
or else GitHub will not display them correctly.
An example is provided below.
Example:
```
this is an example of how log text should be marked (wrap it with ```)
```
-->
-11
@@ -1,11 +0,0 @@
blank_issues_enabled: false
contact_links:
- name: OpenZFS Community Support Mailing list (Linux)
url: https://zfsonlinux.topicbox.com/groups/zfs-discuss
about: Get community support for OpenZFS on Linux
- name: FreeBSD Community Support Mailing list
url: https://lists.freebsd.org/mailman/listinfo/freebsd-fs
about: Get community support for OpenZFS on FreeBSD
- name: OpenZFS on IRC
url: https://webchat.freenode.net/#openzfs
about: Use IRC to get community support for OpenZFS
-33
@@ -1,33 +0,0 @@
---
name: Feature request
about: Suggest a feature for OpenZFS
title: ''
labels: 'Type: Feature'
assignees: ''
---
<!--
Thank you for suggesting a feature.
Please check our issue tracker before opening a new feature request.
Filling out the following template will help other contributors better understand your proposed feature.
-->
### Describe the feature you would like to see added to OpenZFS
<!--
Provide a clear and concise description of the feature.
-->
### How will this feature improve OpenZFS?
<!--
What problem does this feature solve?
-->
### Additional context
<!--
Any additional information you can add about the proposal?
-->
-37
@@ -1,37 +0,0 @@
---
name: Code Question
about: Ask a question about the code
title: ''
labels: 'Type: Question'
assignees: ''
---
<!--
Thank you for taking an interest in the OpenZFS codebase.
Please be aware that most questions are preferably asked in the mailing list first.
This form is primarily meant for asking questions about the code itself.
Please also check our issue tracker before opening a new question.
Filling out the following template will help other contributors better understand your question.
-->
### Ask your question!
<!--
Please provide a clear and concise question.
-->
### Which portion of the codebase does your question involve?
<!--
Optional: Please describe what portion of the codebase your issue involved.
Example: "Testsuite", "Buildbots", "CLI", a code snippet etc.
-->
### Additional context
<!--
Any additional information you want to add?
-->
+10 -12
@@ -1,25 +1,22 @@
<!--- Please fill out the following template, which will help other contributors review your Pull Request. -->
<!--- Provide a general summary of your changes in the Title above -->
<!---
Documentation on ZFS Buildbot options can be found at
https://openzfs.github.io/openzfs-docs/Developer%20Resources/Buildbot%20Options.html
https://github.com/zfsonlinux/zfs/wiki/Buildbot-Options
-->
### Description
<!--- Describe your changes in detail -->
### Motivation and Context
<!--- Why is this change required? What problem does it solve? -->
<!--- If it fixes an open issue, please link to the issue here. -->
### Description
<!--- Describe your changes in detail -->
### How Has This Been Tested?
<!--- Please describe in detail how you tested your changes. -->
<!--- Include details of your testing environment, and the tests you ran to -->
<!--- see how your change affects other areas of the code, etc. -->
<!--- If your change is a performance enhancement, please provide benchmarks here. -->
<!--- Please think about using the draft PR feature if appropriate -->
### Types of changes
<!--- What types of changes does your code introduce? Put an `x` in all the boxes that apply: -->
@@ -33,9 +30,10 @@ https://openzfs.github.io/openzfs-docs/Developer%20Resources/Buildbot%20Options.
### Checklist:
<!--- Go over all the following points, and put an `x` in all the boxes that apply. -->
<!--- If you're unsure about any of these, don't hesitate to ask. We're here to help! -->
- [ ] My code follows the ZFS on Linux [code style requirements](https://github.com/zfsonlinux/zfs/blob/master/.github/CONTRIBUTING.md#coding-conventions).
- [ ] My code follows the ZFS on Linux code style requirements.
- [ ] I have updated the documentation accordingly.
- [ ] I have read the [**contributing** document](https://github.com/zfsonlinux/zfs/blob/master/.github/CONTRIBUTING.md).
- [ ] I have added [tests](https://github.com/zfsonlinux/zfs/tree/master/tests) to cover my changes.
- [ ] I have run the ZFS Test Suite with this change applied.
- [ ] All commit messages are properly formatted and contain [`Signed-off-by`](https://github.com/zfsonlinux/zfs/blob/master/.github/CONTRIBUTING.md#signed-off-by).
- [ ] I have read the **CONTRIBUTING** document.
- [ ] I have added tests to cover my changes.
- [ ] All new and existing tests passed.
- [ ] All commit messages are properly formatted and contain `Signed-off-by`.
- [ ] Change has been approved by a ZFS on Linux member.
+17 -12
@@ -1,25 +1,30 @@
codecov:
notify:
require_ci_to_pass: false # always post
after_n_builds: 2 # user and kernel
require_ci_to_pass: no
coverage:
precision: 0 # 0 decimals of precision
round: nearest # Round to nearest precision point
range: "50...90" # red -> yellow -> green
precision: 2
round: down
range: "50...100"
status:
project:
default:
threshold: 1% # allow 1% coverage variance
threshold: 1%
patch:
default:
threshold: 1% # allow 1% coverage variance
threshold: 1%
parsers:
gcov:
branch_detection:
conditional: yes
loop: yes
method: no
macro: no
comment:
layout: "reach, diff, flags, footer"
behavior: once # update if exists; post new; skip if deleted
require_changes: yes # only post when coverage changes
# ignore: Please place any ignores in config/ax_code_coverage.m4 instead
layout: "header, sunburst, diff"
behavior: default
require_changes: no
-17
@@ -1,17 +0,0 @@
# Number of days of inactivity before an issue becomes stale
daysUntilStale: 365
# Number of days of inactivity before a stale issue is closed
daysUntilClose: 90
# Limit to only `issues` or `pulls`
only: issues
# Issues with these labels will never be considered stale
exemptLabels:
- "Type: Feature"
- "Type: Understood"
# Label to use when marking an issue as stale
staleLabel: "Status: Stale"
# Comment to post when marking an issue as stale. Set to `false` to disable
markComment: >
This issue has been automatically marked as "stale" because it has not had
any activity for a while. It will be closed in 90 days if no further activity occurs.
Thank you for your contributions.
+3
@@ -0,0 +1,3 @@
preprocessorErrorDirective:./module/zfs/vdev_raidz_math_avx512f.c:243
preprocessorErrorDirective:./module/zfs/vdev_raidz_math_sse2.c:266
-36
@@ -1,36 +0,0 @@
name: checkstyle
on:
push:
pull_request_target:
jobs:
checkstyle:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
with:
ref: ${{ github.event.pull_request.head.sha }}
- name: Install dependencies
run: |
sudo apt-get update
sudo apt-get install --yes -qq build-essential autoconf libtool gawk alien fakeroot linux-headers-$(uname -r)
sudo apt-get install --yes -qq zlib1g-dev uuid-dev libattr1-dev libblkid-dev libselinux-dev libudev-dev libssl-dev python-dev python-setuptools python-cffi python3 python3-dev python3-setuptools python3-cffi
# packages for tests
sudo apt-get install --yes -qq parted lsscsi ksh attr acl nfs-kernel-server fio
sudo apt-get install --yes -qq mandoc cppcheck pax-utils abigail-tools # devscripts - enable when bashisms are fixed
sudo -E pip --quiet install flake8
- name: Prepare
run: |
sh ./autogen.sh
./configure
- name: Checkstyle
run: |
make checkstyle
- name: Lint
run: |
make lint
- name: CheckABI
run: |
make -j$(nproc)
make checkabi
+2 -11
@@ -14,7 +14,6 @@
# Normal rules
#
*.[oa]
*.o.ur-safe
*.lo
*.la
*.mod.c
@@ -22,8 +21,6 @@
*.swp
*.gcno
*.gcda
*.pyc
*.pyo
.deps
.libs
.dirstamp
@@ -36,7 +33,6 @@ Makefile.in
# Top level generated files specific to this top level dir
#
/bin
/build
/configure
/config.log
/config.status
@@ -45,6 +41,8 @@ Makefile.in
/zfs_config.h.in
/zfs.release
/stamp-h1
/.script-config
/zfs-script-config.sh
/aclocal.m4
/autom4te.cache
@@ -61,10 +59,3 @@ cscope.*
*.tar.gz
*.patch
*.orig
*.log
*.tmp
venv
*.so
*.so.debug
*.so.full
+90 -303
@@ -1,308 +1,95 @@
MAINTAINERS:
Brian Behlendorf is the principal developer of the ZFS on Linux port.
He works full time as a computer scientist at Lawrence Livermore
National Laboratory on the ZFS and Lustre filesystems. However,
this port would not have been possible without the help of many
others who have contributed their time, effort, and insight.
Brian Behlendorf <behlendorf1@llnl.gov>
Tony Hutter <hutter2@llnl.gov>
Brian Behlendorf <behlendorf1@llnl.gov>
PAST MAINTAINERS:
First and foremost the hard working ZFS developers at Sun/Oracle.
They are responsible for the bulk of the code in this project and
without their efforts there never would have been a ZFS filesystem.
Ned Bass <bass6@llnl.gov>
The ZFS Development Team at Sun/Oracle
CONTRIBUTORS:
Next all the developers at KQ Infotech who implemented a prototype
ZFS Posix Layer (ZPL). Their implementation provided an excellent
reference for adding the ZPL functionality.
Aaron Fineman <abyxcos@gmail.com>
Adam Leventhal <ahl@delphix.com>
Adam Stevko <adam.stevko@gmail.com>
Ahmed G <ahmedg@delphix.com>
Akash Ayare <aayare@delphix.com>
Alan Somers <asomers@gmail.com>
Alar Aun <spamtoaun@gmail.com>
Albert Lee <trisk@nexenta.com>
Alec Salazar <alec.j.salazar@gmail.com>
Alejandro R. Sedeño <asedeno@mit.edu>
Alek Pinchuk <alek@nexenta.com>
Alex Braunegg <alex.braunegg@gmail.com>
Alex McWhirter <alexmcwhirter@triadic.us>
Alex Reece <alex@delphix.com>
Alex Wilson <alex.wilson@joyent.com>
Alex Zhuravlev <alexey.zhuravlev@intel.com>
Alexander Eremin <a.eremin@nexenta.com>
Alexander Motin <mav@freebsd.org>
Alexander Pyhalov <apyhalov@gmail.com>
Alexander Stetsenko <ams@nexenta.com>
Alexey Shvetsov <alexxy@gentoo.org>
Alexey Smirnoff <fling@member.fsf.org>
Allan Jude <allanjude@freebsd.org>
AndCycle <andcycle@andcycle.idv.tw>
Andreas Buschmann <andreas.buschmann@tech.net.de>
Andreas Dilger <adilger@intel.com>
Andrew Barnes <barnes333@gmail.com>
Andrew Hamilton <ahamilto@tjhsst.edu>
Andrew Reid <ColdCanuck@nailedtotheperch.com>
Andrew Stormont <andrew.stormont@nexenta.com>
Andrew Tselischev <andrewtselischev@gmail.com>
Andrey Vesnovaty <andrey.vesnovaty@gmail.com>
Andriy Gapon <avg@freebsd.org>
Andy Bakun <github@thwartedefforts.org>
Aniruddha Shankar <k@191a.net>
Antonio Russo <antonio.e.russo@gmail.com>
Arkadiusz Bubała <arkadiusz.bubala@open-e.com>
Arne Jansen <arne@die-jansens.de>
Aron Xu <happyaron.xu@gmail.com>
Bart Coddens <bart.coddens@gmail.com>
Basil Crow <basil.crow@delphix.com>
Huang Liu <liu.huang@zte.com.cn>
Ben Allen <bsallen@alcf.anl.gov>
Ben Rubson <ben.rubson@gmail.com>
Benjamin Albrecht <git@albrecht.io>
Bill McGonigle <bill-github.com-public1@bfccomputing.com>
Bill Pijewski <wdp@joyent.com>
Boris Protopopov <boris.protopopov@nexenta.com>
Brad Lewis <brad.lewis@delphix.com>
Brian Behlendorf <behlendorf1@llnl.gov>
Brian J. Murrell <brian@sun.com>
Caleb James DeLisle <calebdelisle@lavabit.com>
Cao Xuewen <cao.xuewen@zte.com.cn>
Carlo Landmeter <clandmeter@gmail.com>
Carlos Alberto Lopez Perez <clopez@igalia.com>
Chaoyu Zhang <zhang.chaoyu@zte.com.cn>
Chen Can <chen.can2@zte.com.cn>
Chen Haiquan <oc@yunify.com>
Chip Parker <aparker@enthought.com>
Chris Burroughs <chris.burroughs@gmail.com>
Chris Dunlap <cdunlap@llnl.gov>
Chris Dunlop <chris@onthe.net.au>
Chris Siden <chris.siden@delphix.com>
Chris Wedgwood <cw@f00f.org>
Chris Williamson <chris.williamson@delphix.com>
Chris Zubrzycki <github@mid-earth.net>
Christ Schlacta <aarcane@aarcane.info>
Christer Ekholm <che@chrekh.se>
Christian Kohlschütter <christian@kohlschutter.com>
Christian Neukirchen <chneukirchen@gmail.com>
Christian Schwarz <me@cschwarz.com>
Christopher Voltz <cjunk@voltz.ws>
Chunwei Chen <david.chen@nutanix.com>
Clemens Fruhwirth <clemens@endorphin.org>
Coleman Kane <ckane@colemankane.org>
Colin Ian King <colin.king@canonical.com>
Craig Loomis <cloomis@astro.princeton.edu>
Craig Sanders <github@taz.net.au>
Cyril Plisko <cyril.plisko@infinidat.com>
DHE <git@dehacked.net>
Damian Wojsław <damian@wojslaw.pl>
Dan Kimmel <dan.kimmel@delphix.com>
Dan McDonald <danmcd@nexenta.com>
Dan Swartzendruber <dswartz@druber.com>
Dan Vatca <dan.vatca@gmail.com>
Daniel Hoffman <dj.hoffman@delphix.com>
Daniel Verite <daniel@verite.pro>
Daniil Lunev <d.lunev.mail@gmail.com>
Darik Horn <dajhorn@vanadac.com>
Dave Eddy <dave@daveeddy.com>
David Lamparter <equinox@diac24.net>
David Qian <david.qian@intel.com>
David Quigley <david.quigley@intel.com>
Debabrata Banerjee <dbanerje@akamai.com>
Denys Rtveliashvili <denys@rtveliashvili.name>
Derek Dai <daiderek@gmail.com>
Dimitri John Ledkov <xnox@ubuntu.com>
Dmitry Khasanov <pik4ez@gmail.com>
Dominik Hassler <hadfl@omniosce.org>
Dominik Honnef <dominikh@fork-bomb.org>
Don Brady <don.brady@delphix.com>
Dr. András Korn <korn-github.com@elan.rulez.org>
Eli Rosenthal <eli.rosenthal@delphix.com>
Eric Desrochers <eric.desrochers@canonical.com>
Eric Dillmann <eric@jave.fr>
Eric Schrock <Eric.Schrock@delphix.com>
Etienne Dechamps <etienne@edechamps.fr>
Evan Susarret <evansus@gmail.com>
Fabian Grünbichler <f.gruenbichler@proxmox.com>
Fajar A. Nugraha <github@fajar.net>
Fan Yong <fan.yong@intel.com>
Feng Sun <loyou85@gmail.com>
Frederik Wessels <wessels147@gmail.com>
Frédéric Vanniere <f.vanniere@planet-work.com>
Garrett D'Amore <garrett@nexenta.com>
Garrison Jensen <garrison.jensen@gmail.com>
Gary Mills <gary_mills@fastmail.fm>
Gaurav Kumar <gauravk.18@gmail.com>
GeLiXin <ge.lixin@zte.com.cn>
George Amanakis <g_amanakis@yahoo.com>
George Melikov <mail@gmelikov.ru>
George Wilson <gwilson@delphix.com>
Georgy Yakovlev <ya@sysdump.net>
Giuseppe Di Natale <guss80@gmail.com>
Gordan Bobic <gordan@redsleeve.org>
Gordon Ross <gwr@nexenta.com>
Gregor Kopka <gregor@kopka.net>
Grischa Zengel <github.zfsonlinux@zengel.info>
Gunnar Beutner <gunnar@beutner.name>
Gvozden Neskovic <neskovic@gmail.com>
Hajo Möller <dasjoe@gmail.com>
Hans Rosenfeld <hans.rosenfeld@nexenta.com>
Håkan Johansson <f96hajo@chalmers.se>
Igor Kozhukhov <ikozhukhov@gmail.com>
Igor Lvovsky <ilvovsky@gmail.com>
Isaac Huang <he.huang@intel.com>
JK Dingwall <james@dingwall.me.uk>
Jacek Fefliński <feflik@gmail.com>
James Cowgill <james.cowgill@mips.com>
James Lee <jlee@thestaticvoid.com>
James Pan <jiaming.pan@yahoo.com>
Jan Engelhardt <jengelh@inai.de>
Jan Kryl <jan.kryl@nexenta.com>
Jan Sanislo <oystr@cs.washington.edu>
Jason King <jason.brian.king@gmail.com>
Jason Zaman <jasonzaman@gmail.com>
Javen Wu <wu.javen@gmail.com>
Jeremy Gill <jgill@parallax-innovations.com>
Jeremy Jones <jeremy@delphix.com>
Jerry Jelinek <jerry.jelinek@joyent.com>
Jinshan Xiong <jinshan.xiong@intel.com>
Joe Stein <joe.stein@delphix.com>
John Albietz <inthecloud247@gmail.com>
John Eismeier <john.eismeier@gmail.com>
John L. Hammond <john.hammond@intel.com>
John Layman <jlayman@sagecloud.com>
John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
John Wren Kennedy <john.kennedy@delphix.com>
Johnny Stenback <github@jstenback.com>
Jorgen Lundman <lundman@lundman.net>
Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
Joshua M. Clulow <josh@sysmgr.org>
Justin Bedő <cu@cua0.org>
Justin Lecher <jlec@gentoo.org>
Justin T. Gibbs <gibbs@FreeBSD.org>
Jörg Thalheim <joerg@higgsboson.tk>
KORN Andras <korn@elan.rulez.org>
Kamil Domański <kamil@domanski.co>
Karsten Kretschmer <kkretschmer@gmail.com>
Kash Pande <kash@tripleback.net>
Keith M Wesolowski <wesolows@foobazco.org>
Kevin Tanguy <kevin.tanguy@ovh.net>
KireinaHoro <i@jsteward.moe>
Kjeld Schouten-Lebbing <kjeld@schouten-lebbing.nl>
Kohsuke Kawaguchi <kk@kohsuke.org>
Kyle Blatter <kyleblatter@llnl.gov>
Kyle Fuller <inbox@kylefuller.co.uk>
Loli <ezomori.nozomu@gmail.com>
Lars Johannsen <laj@it.dk>
Li Dongyang <dongyang.li@anu.edu.au>
Li Wei <W.Li@Sun.COM>
Lukas Wunner <lukas@wunner.de>
Madhav Suresh <madhav.suresh@delphix.com>
Manoj Joseph <manoj.joseph@delphix.com>
Manuel Amador (Rudd-O) <rudd-o@rudd-o.com>
Marcel Huber <marcelhuberfoo@gmail.com>
Marcel Telka <marcel.telka@nexenta.com>
Marcel Wysocki <maci.stgn@gmail.com>
Mark Shellenbaum <Mark.Shellenbaum@Oracle.COM>
Mark Wright <markwright@internode.on.net>
Martin Matuska <mm@FreeBSD.org>
Massimo Maggi <me@massimo-maggi.eu>
Matt Johnston <matt@fugro-fsi.com.au>
Matt Kemp <matt@mattikus.com>
Matthew Ahrens <matt@delphix.com>
Matthew Thode <mthode@mthode.org>
Matus Kral <matuskral@me.com>
Max Grossman <max.grossman@delphix.com>
Maximilian Mehnert <maximilian.mehnert@gmx.de>
Michael Gebetsroither <michael@mgeb.org>
Michael Kjorling <michael@kjorling.se>
Michael Martin <mgmartin.mgm@gmail.com>
Michael Niewöhner <foss@mniewoehner.de>
Mike Gerdts <mike.gerdts@joyent.com>
Mike Harsch <mike@harschsystems.com>
Mike Leddy <mike.leddy@gmail.com>
Mike Swanson <mikeonthecomputer@gmail.com>
Milan Jurik <milan.jurik@xylab.cz>
Morgan Jones <mjones@rice.edu>
Moritz Maxeiner <moritz@ucworks.org>
Nathaniel Clark <Nathaniel.Clark@misrule.us>
Nathaniel Wesley Filardo <nwf@cs.jhu.edu>
Nav Ravindranath <nav@delphix.com>
Neal Gompa (ニール・ゴンパ) <ngompa13@gmail.com>
Ned Bass <bass6@llnl.gov>
Neependra Khare <neependra@kqinfotech.com>
Neil Stockbridge <neil@dist.ro>
Nick Garvey <garvey.nick@gmail.com>
Nikolay Borisov <n.borisov.lkml@gmail.com>
Olaf Faaland <faaland1@llnl.gov>
Oleg Drokin <green@linuxhacker.ru>
Oleg Stepura <oleg@stepura.com>
Patrik Greco <sikevux@sikevux.se>
Paul B. Henson <henson@acm.org>
Paul Dagnelie <pcd@delphix.com>
Paul Zuchowski <pzuchowski@datto.com>
Pavel Boldin <boldin.pavel@gmail.com>
Pavel Zakharov <pavel.zakharov@delphix.com>
Pawel Jakub Dawidek <pjd@FreeBSD.org>
Pedro Giffuni <pfg@freebsd.org>
Peng <peng.hse@xtaotech.com>
Peter Ashford <ashford@accs.com>
Prakash Surya <prakash.surya@delphix.com>
Prasad Joshi <prasadjoshi124@gmail.com>
Ralf Ertzinger <ralf@skytale.net>
Randall Mason <ClashTheBunny@gmail.com>
Remy Blank <remy.blank@pobox.com>
Ricardo M. Correia <ricardo.correia@oracle.com>
Rich Ercolani <rincebrain@gmail.com>
Richard Elling <Richard.Elling@RichardElling.com>
Richard Laager <rlaager@wiktel.com>
Richard Lowe <richlowe@richlowe.net>
Richard Sharpe <rsharpe@samba.org>
Richard Yao <ryao@gentoo.org>
Rohan Puri <rohan.puri15@gmail.com>
Romain Dolbeau <romain.dolbeau@atos.net>
Roman Strashkin <roman.strashkin@nexenta.com>
Ruben Kerkhof <ruben@rubenkerkhof.com>
Saso Kiselkov <saso.kiselkov@nexenta.com>
Scot W. Stevenson <scot.stevenson@gmail.com>
Sean Eric Fagan <sef@ixsystems.com>
Sebastian Gottschall <s.gottschall@dd-wrt.com>
Sen Haerens <sen@senhaerens.be>
Serapheim Dimitropoulos <serapheim@delphix.com>
Seth Forshee <seth.forshee@canonical.com>
Shampavman <sham.pavman@nexenta.com>
Shen Yan <shenyanxxxy@qq.com>
Simon Guest <simon.guest@tesujimath.org>
Simon Klinkert <simon.klinkert@gmail.com>
Sowrabha Gopal <sowrabha.gopal@delphix.com>
Stanislav Seletskiy <s.seletskiy@gmail.com>
Steffen Müthing <steffen.muething@iwr.uni-heidelberg.de>
Stephen Blinick <stephen.blinick@delphix.com>
Steve Dougherty <sdougherty@barracuda.com>
Steven Burgess <sburgess@dattobackup.com>
Steven Hartland <smh@freebsd.org>
Steven Johnson <sjohnson@sakuraindustries.com>
Stian Ellingsen <stian@plaimi.net>
Suman Chakravartula <schakrava@gmail.com>
Sydney Vanda <sydney.m.vanda@intel.com>
Sören Tempel <soeren+git@soeren-tempel.net>
Thijs Cramer <thijs.cramer@gmail.com>
Tim Chase <tim@chase2k.com>
Tim Connors <tconnors@rather.puzzling.org>
Tim Crawford <tcrawford@datto.com>
Tim Haley <Tim.Haley@Sun.COM>
Tobin Harding <me@tobin.cc>
Tom Caputi <tcaputi@datto.com>
Tom Matthews <tom@axiom-partners.com>
Tom Prince <tom.prince@ualberta.net>
Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Tony Hutter <hutter2@llnl.gov>
Toomas Soome <tsoome@me.com>
Trey Dockendorf <treydock@gmail.com>
Turbo Fredriksson <turbo@bayour.com>
Tyler J. Stachecki <stachecki.tyler@gmail.com>
Vitaut Bajaryn <vitaut.bayaryn@gmail.com>
Weigang Li <weigang.li@intel.com>
Will Andrews <will@freebsd.org>
Will Rouesnel <w.rouesnel@gmail.com>
Wolfgang Bumiller <w.bumiller@proxmox.com>
Xin Li <delphij@FreeBSD.org>
Ying Zhu <casualfisher@gmail.com>
YunQiang Su <syq@debian.org>
Yuri Pankov <yuri.pankov@gmail.com>
Yuxuan Shui <yshuiv7@gmail.com>
Zachary Bedell <zac@thebedells.org>
Anand Mitra <mitra@kqinfotech.com>
Anurag Agarwal <anurag@kqinfotech.com>
Neependra Khare <neependra@kqinfotech.com>
Prasad Joshi <prasad@kqinfotech.com>
Rohan Puri <rohan@kqinfotech.com>
Sandip Divekar <sandipd@kqinfotech.com>
Shoaib <shoaib@kqinfotech.com>
Shrirang <shrirang@kqinfotech.com>
Additionally the following individuals have all made contributions
to the project and deserve to be acknowledged.
Albert Lee <trisk@nexenta.com>
Alejandro R. Sedeño <asedeno@mit.edu>
Alex Zhuravlev <bzzz@whamcloud.com>
Alexander Eremin <a.eremin@nexenta.com>
Alexander Stetsenko <ams@nexenta.com>
Alexey Shvetsov <alexxy@gentoo.org>
Andreas Dilger <adilger@whamcloud.com>
Andrew Reid <ColdCanuck@nailedtotheperch.com>
Andrew Stormont <andrew.stormont@nexenta.com>
Andrew Tselischev <andrewtselischev@gmail.com>
Andriy Gapon <avg@FreeBSD.org>
Aniruddha Shankar <k@191a.net>
Bill Pijewski <wdp@joyent.com>
Chris Dunlap <cdunlap@llnl.gov>
Chris Dunlop <chris@onthe.net.au>
Chris Siden <chris.siden@delphix.com>
Chris Wedgwood <cw@f00f.org>
Christian Kohlschütter <christian@kohlschutter.com>
Christopher Siden <chris.siden@delphix.com>
Craig Sanders <github@taz.net.au>
Cyril Plisko <cyril.plisko@mountall.com>
Dan McDonald <danmcd@nexenta.com>
Daniel Verite <daniel@verite.pro>
Darik Horn <dajhorn@vanadac.com>
Eric Schrock <Eric.Schrock@delphix.com>
Etienne Dechamps <etienne.dechamps@ovh.net>
Fajar A. Nugraha <github@fajar.net>
Frederik Wessels <wessels147@gmail.com>
Garrett D'Amore <garrett@nexenta.com>
George Wilson <george.wilson@delphix.com>
Gordon Ross <gwr@nexenta.com>
Gregor Kopka <mailfrom-github.com@kopka.net>
Gunnar Beutner <gunnar@beutner.name>
James H <james@kagisoft.co.uk>
Javen Wu <wu.javen@gmail.com>
Jeremy Gill <jgill@parallax-innovations.com>
Jorgen Lundman <lundman@lundman.net>
KORN Andras <korn@elan.rulez.org>
Kyle Fuller <inbox@kylefuller.co.uk>
Manuel Amador (Rudd-O) <rudd-o@rudd-o.com>
Martin Matuska <mm@FreeBSD.org>
Massimo Maggi <massimo@mmmm.it>
Matthew Ahrens <mahrens@delphix.com>
Michael Martin <mgmartin.mgm@gmail.com>
Mike Harsch <mike@harschsystems.com>
Ned Bass <bass6@llnl.gov>
Oleg Stepura <oleg@stepura.com>
P.SCH <p88@yahoo.com>
Pawel Jakub Dawidek <pawel@dawidek.net>
Prakash Surya <surya1@llnl.gov>
Prasad Joshi <pjoshi@stec-inc.com>
Ricardo M. Correia <Ricardo.M.Correia@Sun.COM>
Richard Laager <rlaager@wiktel.com>
Richard Lowe <richlowe@richlowe.net>
Richard Yao <ryao@cs.stonybrook.edu>
Rohan Puri <rohan.puri15@gmail.com>
Shampavman <sham.pavman@nexenta.com>
Simon Klinkert <klinkert@webgods.de>
Suman Chakravartula <suman@gogrid.com>
Tim Haley <Tim.Haley@Sun.COM>
Turbo Fredriksson <turbo@bayour.com>
Xin Li <delphij@FreeBSD.org>
Yuxuan Shui <yshuiv7@gmail.com>
Zachary Bedell <zac@thebedells.org>
nordaux <nordaux@gmail.com>
-2
@@ -1,2 +0,0 @@
The [OpenZFS Code of Conduct](http://www.open-zfs.org/wiki/Code_of_Conduct)
applies to spaces associated with the OpenZFS project, including GitHub.
+27 -25
@@ -1,31 +1,33 @@
Refer to the git commit log for authoritative copyright attribution.
The majority of the code in the ZFS on Linux port comes from OpenSolaris
which has been released under the terms of the CDDL open source license.
This includes the core ZFS code, libavl, libnvpair, libefi, libunicode,
and libutil. The original OpenSolaris source can be downloaded from:
The original ZFS source code was obtained from OpenSolaris which was
released under the terms of the CDDL open source license. Additional
changes have been included from OpenZFS and the Illumos project which
are similarly licensed. These projects can be found on GitHub at:
http://dlc.sun.com/osol/on/downloads/b121/on-src.tar.bz2
* https://github.com/illumos/illumos-gate
* https://github.com/openzfs/openzfs
Files which do not originate from OpenSolaris are noted in the file header
and attributed properly. These exceptions include, but are not limited
to, the vdev_disk.c and zvol.c implementation which are licensed under
the CDDL.
The zpios test code is originally derived from the Lustre pios test code
which is licensed under the GPLv2. As such the heavily modified zpios
kernel test code also remains licensed under the GPLv2.
The latest stable and development versions of this port can be downloaded
from the official ZFS on Linux site located at:
http://zfsonlinux.org/
This ZFS on Linux port was produced at the Lawrence Livermore National
Laboratory (LLNL) under Contract No. DE-AC52-07NA27344 (Contract 44)
between the U.S. Department of Energy (DOE) and Lawrence Livermore
National Security, LLC (LLNS) for the operation of LLNL. It has been
approved for release under LLNL-CODE-403049.
Unless otherwise noted, all files in this distribution are released
under the Common Development and Distribution License (CDDL).
Exceptions are noted within the associated source files. See the file
OPENSOLARIS.LICENSE for more information.
Exceptions are noted within the associated source file headers and
by including a THIRDPARTYLICENSE file with the license terms. A few
notable exceptions and their respective licenses include:
* Skein Checksum Implementation: module/icp/algs/skein/THIRDPARTYLICENSE
* AES Implementation: module/icp/asm-x86_64/aes/THIRDPARTYLICENSE.gladman
* AES Implementation: module/icp/asm-x86_64/aes/THIRDPARTYLICENSE.openssl
* PBKDF2 Implementation: lib/libzfs/THIRDPARTYLICENSE.openssl
* SPL Implementation: module/os/linux/spl/THIRDPARTYLICENSE.gplv2
* GCM Implementation: module/icp/asm-x86_64/modes/THIRDPARTYLICENSE.cryptogams
* GCM Implementation: module/icp/asm-x86_64/modes/THIRDPARTYLICENSE.openssl
* GHASH Implementation: module/icp/asm-x86_64/modes/THIRDPARTYLICENSE.cryptogams
* GHASH Implementation: module/icp/asm-x86_64/modes/THIRDPARTYLICENSE.openssl
This product includes software developed by the OpenSSL Project for use
in the OpenSSL Toolkit (http://www.openssl.org/)
See the LICENSE and NOTICE for more information.
Refer to the git commit log for authoritative copyright attribution.
+24
@@ -0,0 +1,24 @@
This work was produced at the Lawrence Livermore National Laboratory
(LLNL) under Contract No. DE-AC52-07NA27344 (Contract 44) between
the U.S. Department of Energy (DOE) and Lawrence Livermore National
Security, LLC (LLNS) for the operation of LLNL.
This work was prepared as an account of work sponsored by an agency of
the United States Government. Neither the United States Government nor
Lawrence Livermore National Security, LLC nor any of their employees,
makes any warranty, express or implied, or assumes any liability or
responsibility for the accuracy, completeness, or usefulness of any
information, apparatus, product, or process disclosed, or represents
that its use would not infringe privately-owned rights.
Reference herein to any specific commercial products, process, or
services by trade name, trademark, manufacturer or otherwise does
not necessarily constitute or imply its endorsement, recommendation,
or favoring by the United States Government or Lawrence Livermore
National Security, LLC. The views and opinions of authors expressed
herein do not necessarily state or reflect those of the United States
Government or Lawrence Livermore National Security, LLC, and shall
not be used for advertising or product endorsement purposes.
The precise terms and conditions for copying, distribution, and
modification are specified in the file OPENSOLARIS.LICENSE.
+8 -10
@@ -1,10 +1,8 @@
Meta: 1
Name: zfs
Branch: 1.0
Version: 2.0.0
Release: 1
Release-Tags: relext
License: CDDL
Author: OpenZFS
Linux-Maximum: 5.9
Linux-Minimum: 3.10
Meta: 1
Name: zfs
Branch: 1.0
Version: 0.7.13
Release: 1
Release-Tags: relext
License: CDDL
Author: OpenZFS on Linux
+26 -191
@@ -1,68 +1,32 @@
ACLOCAL_AMFLAGS = -I config
SUBDIRS = include
if BUILD_LINUX
SUBDIRS += rpm
endif
include config/rpm.am
include config/deb.am
include config/tgz.am
SUBDIRS = include rpm
if CONFIG_USER
SUBDIRS += etc man scripts lib tests cmd contrib
if BUILD_LINUX
SUBDIRS += udev
endif
SUBDIRS += udev etc man scripts lib tests cmd contrib
endif
if CONFIG_KERNEL
SUBDIRS += module
extradir = $(prefix)/src/zfs-$(VERSION)
extradir = @prefix@/src/zfs-$(VERSION)
extra_HEADERS = zfs.release.in zfs_config.h.in
if BUILD_LINUX
kerneldir = $(prefix)/src/zfs-$(VERSION)/$(LINUX_VERSION)
kerneldir = @prefix@/src/zfs-$(VERSION)/$(LINUX_VERSION)
nodist_kernel_HEADERS = zfs.release zfs_config.h module/$(LINUX_SYMBOLS)
endif
endif
AUTOMAKE_OPTIONS = foreign
EXTRA_DIST = autogen.sh copy-builtin
EXTRA_DIST += cppcheck-suppressions.txt
EXTRA_DIST += config/config.awk config/rpm.am config/deb.am config/tgz.am
EXTRA_DIST += META AUTHORS COPYRIGHT LICENSE NEWS NOTICE README.md
EXTRA_DIST += CODE_OF_CONDUCT.md
EXTRA_DIST += module/lua/README.zfs module/os/linux/spl/README.md
# Include all the extra licensing information for modules
EXTRA_DIST += module/icp/algs/skein/THIRDPARTYLICENSE
EXTRA_DIST += module/icp/algs/skein/THIRDPARTYLICENSE.descrip
EXTRA_DIST += module/icp/asm-x86_64/aes/THIRDPARTYLICENSE.gladman
EXTRA_DIST += module/icp/asm-x86_64/aes/THIRDPARTYLICENSE.gladman.descrip
EXTRA_DIST += module/icp/asm-x86_64/aes/THIRDPARTYLICENSE.openssl
EXTRA_DIST += module/icp/asm-x86_64/aes/THIRDPARTYLICENSE.openssl.descrip
EXTRA_DIST += module/icp/asm-x86_64/modes/THIRDPARTYLICENSE.cryptogams
EXTRA_DIST += module/icp/asm-x86_64/modes/THIRDPARTYLICENSE.cryptogams.descrip
EXTRA_DIST += module/icp/asm-x86_64/modes/THIRDPARTYLICENSE.openssl
EXTRA_DIST += module/icp/asm-x86_64/modes/THIRDPARTYLICENSE.openssl.descrip
EXTRA_DIST += module/os/linux/spl/THIRDPARTYLICENSE.gplv2
EXTRA_DIST += module/os/linux/spl/THIRDPARTYLICENSE.gplv2.descrip
EXTRA_DIST += module/zfs/THIRDPARTYLICENSE.cityhash
EXTRA_DIST += module/zfs/THIRDPARTYLICENSE.cityhash.descrip
EXTRA_DIST += META DISCLAIMER COPYRIGHT README.markdown OPENSOLARIS.LICENSE
@CODE_COVERAGE_RULES@
GITREV = include/zfs_gitrev.h
PHONY = gitrev
gitrev:
$(AM_V_GEN)$(top_srcdir)/scripts/make_gitrev.sh $(GITREV)
all: gitrev
# Double-colon rules are allowed; there are multiple independent definitions.
maintainer-clean-local::
-$(RM) $(GITREV)
distclean-local::
-$(RM) -R autom4te*.cache build
-$(RM) -R autom4te*.cache
-find . \( -name SCCS -o -name BitKeeper -o -name .svn -o -name CVS \
-o -name .pc -o -name .hg -o -name .git \) -prune -o \
\( -name '*.orig' -o -name '*.rej' -o -name '*~' \
@@ -73,192 +37,63 @@ distclean-local::
-o -name '*.gcno' \) \
-type f -print | xargs $(RM)
all-local:
-[ -x ${top_builddir}/scripts/zfs-tests.sh ] && \
${top_builddir}/scripts/zfs-tests.sh -c
dist-hook:
$(AM_V_GEN)$(top_srcdir)/scripts/make_gitrev.sh -D $(distdir) $(GITREV)
$(SED) ${ac_inplace} -e 's/Release:[[:print:]]*/Release: $(RELEASE)/' \
sed -i 's/Release:[[:print:]]*/Release: $(RELEASE)/' \
$(distdir)/META
if BUILD_LINUX
# For compatibility, create a matching spl-x.y.z directory which contains
# symlinks to the updated header and object file locations. These
# compatibility links will be removed in the next major release.
if CONFIG_KERNEL
install-data-hook:
rm -rf $(DESTDIR)$(prefix)/src/spl-$(VERSION) && \
mkdir $(DESTDIR)$(prefix)/src/spl-$(VERSION) && \
cd $(DESTDIR)$(prefix)/src/spl-$(VERSION) && \
ln -s ../zfs-$(VERSION)/include/spl include && \
ln -s ../zfs-$(VERSION)/$(LINUX_VERSION) $(LINUX_VERSION) && \
ln -s ../zfs-$(VERSION)/zfs_config.h.in spl_config.h.in && \
ln -s ../zfs-$(VERSION)/zfs.release.in spl.release.in && \
cd $(DESTDIR)$(prefix)/src/zfs-$(VERSION)/$(LINUX_VERSION) && \
ln -fs zfs_config.h spl_config.h && \
ln -fs zfs.release spl.release
endif
endif
checkstyle: cstyle shellcheck flake8 commitcheck
PHONY += codecheck
codecheck: cstyle shellcheck checkbashisms flake8 mancheck testscheck vcscheck
PHONY += checkstyle
checkstyle: codecheck commitcheck
PHONY += commitcheck
commitcheck:
@if git rev-parse --git-dir > /dev/null 2>&1; then \
${top_srcdir}/scripts/commitcheck.sh; \
scripts/commitcheck.sh; \
fi
PHONY += cstyle
cstyle:
@find ${top_srcdir} -name build -prune \
-o -type f -name '*.[hc]' \
! -name 'zfs_config.*' ! -name '*.mod.c' \
! -name 'opt_global.h' ! -name '*_if*.h' \
! -path './module/zstd/lib/*' \
-exec ${top_srcdir}/scripts/cstyle.pl -cpP {} \+
@find ${top_srcdir} -name '*.[hc]' ! -name 'zfs_config.*' \
! -name '*.mod.c' -type f -exec scripts/cstyle.pl -cpP {} \+
filter_executable = -exec test -x '{}' \; -print
PHONY += shellcheck
shellcheck:
@if type shellcheck > /dev/null 2>&1; then \
shellcheck --exclude=SC1090 --exclude=SC1117 --format=gcc \
$$(find ${top_srcdir}/scripts/*.sh -type f) \
$$(find ${top_srcdir}/cmd/zed/zed.d/*.sh -type f) \
$$(find ${top_srcdir}/cmd/zpool/zpool.d/* \
-type f ${filter_executable}); \
else \
echo "skipping shellcheck because shellcheck is not installed"; \
shellcheck --exclude=SC1090 --format=gcc scripts/paxcheck.sh \
scripts/zloop.sh \
scripts/zfs-tests.sh \
scripts/zfs.sh \
scripts/commitcheck.sh \
$$(find cmd/zed/zed.d/*.sh -type f) \
$$(find cmd/zpool/zpool.d/* -executable); \
fi
PHONY += checkabi storeabi
checkabi: lib
$(MAKE) -C lib checkabi
storeabi: lib
$(MAKE) -C lib storeabi
PHONY += checkbashisms
checkbashisms:
@if type checkbashisms > /dev/null 2>&1; then \
checkbashisms -n -p -x \
$$(find ${top_srcdir} \
-name '.git' -prune \
-o -name 'build' -prune \
-o -name 'tests' -prune \
-o -name 'config' -prune \
-o -name 'zed-functions.sh*' -prune \
-o -name 'zfs-import*' -prune \
-o -name 'zfs-mount*' -prune \
-o -name 'zfs-zed*' -prune \
-o -name 'smart' -prune \
-o -name 'paxcheck.sh' -prune \
-o -name 'make_gitrev.sh' -prune \
-o -name '90zfs' -prune \
-o -type f ! -name 'config*' \
! -name 'libtool' \
-exec sh -c 'awk "NR==1 && /\#\!.*bin\/sh.*/ {print FILENAME;}" "{}"' \;); \
else \
echo "skipping checkbashisms because checkbashisms is not installed"; \
fi
PHONY += mancheck
mancheck:
@if type mandoc > /dev/null 2>&1; then \
find ${top_srcdir}/man/man8 -type f -name 'zfs.8' \
-o -name 'zpool.8' -o -name 'zdb.8' \
-o -name 'zgenhostid.8' | \
xargs mandoc -Tlint -Werror; \
else \
echo "skipping mancheck because mandoc is not installed"; \
fi
if BUILD_LINUX
stat_fmt = -c '%A %n'
else
stat_fmt = -f '%Sp %N'
endif
PHONY += testscheck
testscheck:
@find ${top_srcdir}/tests/zfs-tests -type f \
\( -name '*.ksh' -not ${filter_executable} \) -o \
\( -name '*.kshlib' ${filter_executable} \) -o \
\( -name '*.shlib' ${filter_executable} \) -o \
\( -name '*.cfg' ${filter_executable} \) | \
xargs -r stat ${stat_fmt} | \
awk '{c++; print} END {if(c>0) exit 1}'
PHONY += vcscheck
vcscheck:
@if git rev-parse --git-dir > /dev/null 2>&1; then \
git ls-files . --exclude-standard --others | \
awk '{c++; print} END {if(c>0) exit 1}' ; \
fi
PHONY += lint
lint: cppcheck paxcheck
PHONY += cppcheck
cppcheck:
@if type cppcheck > /dev/null 2>&1; then \
cppcheck --quiet --force --error-exitcode=2 --inline-suppr \
--suppressions-list=${top_srcdir}/cppcheck-suppressions.txt \
--suppressions-list=.github/suppressions.txt \
-UHAVE_SSE2 -UHAVE_AVX512F -UHAVE_UIO_ZEROCOPY \
${top_srcdir}; \
else \
echo "skipping cppcheck because cppcheck is not installed"; \
-UHAVE_DNLC ${top_srcdir}; \
fi
PHONY += paxcheck
paxcheck:
@if type scanelf > /dev/null 2>&1; then \
${top_srcdir}/scripts/paxcheck.sh ${top_builddir}; \
else \
echo "skipping paxcheck because scanelf is not installed"; \
scripts/paxcheck.sh ${top_srcdir}; \
fi
PHONY += flake8
flake8:
@if type flake8 > /dev/null 2>&1; then \
flake8 ${top_srcdir}; \
else \
echo "skipping flake8 because flake8 is not installed"; \
fi
PHONY += ctags
ctags:
$(RM) tags
find $(top_srcdir) -name '.?*' -prune \
-o -type f -name '*.[hcS]' -print | xargs ctags -a
find $(top_srcdir) -name .git -prune -o -name '*.[hc]' | xargs ctags
PHONY += etags
etags:
$(RM) TAGS
find $(top_srcdir) -name '.?*' -prune \
-o -type f -name '*.[hcS]' -print | xargs etags -a
find $(top_srcdir) -name .pc -prune -o -name '*.[hc]' | xargs etags -a
PHONY += cscopelist
cscopelist:
find $(top_srcdir) -name '.?*' -prune \
-o -type f -name '*.[hc]' -print >cscope.files
PHONY += tags
tags: ctags etags
PHONY += pkg pkg-dkms pkg-kmod pkg-utils
pkg: @DEFAULT_PACKAGE@
pkg-dkms: @DEFAULT_PACKAGE@-dkms
pkg-kmod: @DEFAULT_PACKAGE@-kmod
pkg-utils: @DEFAULT_PACKAGE@-utils
include config/rpm.am
include config/deb.am
include config/tgz.am
.PHONY: $(PHONY)
-3
@@ -1,3 +0,0 @@
Descriptions of all releases can be found on github:
https://github.com/openzfs/zfs/releases
-16
@@ -1,16 +0,0 @@
This work was produced under the auspices of the U.S. Department of Energy by
Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.
This work was prepared as an account of work sponsored by an agency of the
United States Government. Neither the United States Government nor Lawrence
Livermore National Security, LLC, nor any of their employees makes any warranty,
expressed or implied, or assumes any legal liability or responsibility for the
accuracy, completeness, or usefulness of any information, apparatus, product, or
process disclosed, or represents that its use would not infringe privately owned
rights. Reference herein to any specific commercial product, process, or service
by trade name, trademark, manufacturer, or otherwise does not necessarily
constitute or imply its endorsement, recommendation, or favoring by the United
States Government or Lawrence Livermore National Security, LLC. The views and
opinions of authors expressed herein do not necessarily state or reflect those
of the United States Government or Lawrence Livermore National Security, LLC,
and shall not be used for advertising or product endorsement purposes.
+19
@@ -0,0 +1,19 @@
![img](http://zfsonlinux.org/images/zfs-linux.png)
ZFS on Linux is an advanced file system and volume manager which was originally
developed for Solaris and is now maintained by the OpenZFS community.
[![codecov](https://codecov.io/gh/zfsonlinux/zfs/branch/master/graph/badge.svg)](https://codecov.io/gh/zfsonlinux/zfs)
# Official Resources
* [Site](http://zfsonlinux.org)
* [Wiki](https://github.com/zfsonlinux/zfs/wiki)
* [Mailing lists](https://github.com/zfsonlinux/zfs/wiki/Mailing-Lists)
* [OpenZFS site](http://open-zfs.org/)
# Installation
Full documentation for installing ZoL on your favorite Linux distribution can
be found at [our site](http://zfsonlinux.org/).
# Contribute & Develop
We have a separate document with [contribution guidelines](./.github/CONTRIBUTING.md).
-35
@@ -1,35 +0,0 @@
![img](https://openzfs.github.io/openzfs-docs/_static/img/logo/480px-Open-ZFS-Secondary-Logo-Colour-halfsize.png)
OpenZFS is an advanced file system and volume manager which was originally
developed for Solaris and is now maintained by the OpenZFS community.
This repository contains the code for running OpenZFS on Linux and FreeBSD.
[![codecov](https://codecov.io/gh/openzfs/zfs/branch/master/graph/badge.svg)](https://codecov.io/gh/openzfs/zfs)
[![coverity](https://scan.coverity.com/projects/1973/badge.svg)](https://scan.coverity.com/projects/openzfs-zfs)
# Official Resources
* [Documentation](https://openzfs.github.io/openzfs-docs/) - for using and developing this repo
* [ZoL Site](https://zfsonlinux.org) - Linux release info & links
* [Mailing lists](https://openzfs.github.io/openzfs-docs/Project%20and%20Community/Mailing%20Lists.html)
* [OpenZFS site](http://open-zfs.org/) - for conference videos and info on other platforms (illumos, OSX, Windows, etc)
# Installation
Full documentation for installing OpenZFS on your favorite operating system can
be found at the [Getting Started Page](https://openzfs.github.io/openzfs-docs/Getting%20Started/index.html).
# Contribute & Develop
We have a separate document with [contribution guidelines](./.github/CONTRIBUTING.md).
We have a [Code of Conduct](./CODE_OF_CONDUCT.md).
# Release
OpenZFS is released under a CDDL license.
For more details see the NOTICE, LICENSE and COPYRIGHT files; `UCRL-CODE-235197`
# Supported Kernels
* The `META` file contains the officially recognized supported Linux kernel versions.
* Supported FreeBSD versions are 12-STABLE and 13-CURRENT.
+52 -8
@@ -1,15 +1,17 @@
#!/bin/sh
### prepare
#TEST_PREPARE_WATCHDOG="yes"
#TEST_PREPARE_SHARES="yes"
#TEST_PREPARE_WATCHDOG="no"
### SPLAT
#TEST_SPLAT_SKIP="yes"
#TEST_SPLAT_OPTIONS="-acvx"
### ztest
#TEST_ZTEST_SKIP="yes"
#TEST_ZTEST_TIMEOUT=1800
#TEST_ZTEST_DIR="/var/tmp/"
#TEST_ZTEST_OPTIONS="-V"
#TEST_ZTEST_CORE_DIR="/mnt/zloop"
### zimport
#TEST_ZIMPORT_SKIP="yes"
@@ -29,13 +31,9 @@
### zfs-tests.sh
#TEST_ZFSTESTS_SKIP="yes"
#TEST_ZFSTESTS_DIR="/mnt/"
#TEST_ZFSTESTS_DISKS="vdb vdc vdd"
#TEST_ZFSTESTS_DISKSIZE="8G"
#TEST_ZFSTESTS_ITERS="1"
#TEST_ZFSTESTS_OPTIONS="-vx"
#TEST_ZFSTESTS_RUNFILE="linux.run"
#TEST_ZFSTESTS_TAGS="functional"
### zfsstress
#TEST_ZFSSTRESS_SKIP="yes"
@@ -44,7 +42,53 @@
#TEST_ZFSSTRESS_RUNTIME=300
#TEST_ZFSSTRESS_POOL="tank"
#TEST_ZFSSTRESS_FS="fish"
#TEST_ZFSSTRESS_FSOPT="-o overlay=on"
#TEST_ZFSSTRESS_VDEV="/var/tmp/vdev"
#TEST_ZFSSTRESS_DIR="/$TEST_ZFSSTRESS_POOL/$TEST_ZFSSTRESS_FS"
#TEST_ZFSSTRESS_OPTIONS=""
### per-builder customization
#
# BB_NAME=builder-name <distribution-version-architecture-type>
# - distribution=Amazon,Debian,Fedora,RHEL,SUSE,Ubuntu
# - version=x.y
# - architecture=x86_64,i686,arm,aarch64
# - type=build,test
#
case "$BB_NAME" in
Amazon*)
# ZFS enabled xfstests fails to build
TEST_XFSTESTS_SKIP="yes"
;;
CentOS-7*)
# ZFS enabled xfstests fails to build
TEST_XFSTESTS_SKIP="yes"
;;
CentOS-6*)
;;
Debian*)
;;
Fedora*)
;;
RHEL*)
;;
SUSE*)
;;
Ubuntu-16.04*)
# ZFS enabled xfstests fails to build
TEST_XFSTESTS_SKIP="yes"
;;
Ubuntu*)
;;
*)
;;
esac
###
#
# Disable the following test suites on 32-bit systems.
#
if [ $(getconf LONG_BIT) = "32" ]; then
TEST_ZTEST_SKIP="yes"
TEST_XFSTESTS_SKIP="yes"
TEST_ZFSSTRESS_SKIP="yes"
fi
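The per-builder `case` block above relies on first-match-wins glob matching, which is why `Ubuntu-16.04*` has to appear before the broader `Ubuntu*`. A minimal Python sketch of that ordering follows. The rule table mirrors the patterns in the script, while `skip_xfstests` and the sample builder names are hypothetical and used only for illustration. The later 32-bit override is not modelled.

```python
from fnmatch import fnmatchcase

# (pattern, skip_xfstests) pairs in the same order as the case arms.
RULES = [
    ("Amazon*", True),
    ("CentOS-7*", True),
    ("CentOS-6*", False),
    ("Debian*", False),
    ("Fedora*", False),
    ("RHEL*", False),
    ("SUSE*", False),
    ("Ubuntu-16.04*", True),   # must precede the generic Ubuntu* arm
    ("Ubuntu*", False),
    ("*", False),
]


def skip_xfstests(bb_name):
    """Return the setting of the first pattern that matches bb_name."""
    for pattern, skip in RULES:
        if fnmatchcase(bb_name, pattern):
            return skip
    return False


# Hypothetical builder names in <distribution-version-architecture-type> form.
for name in ("Ubuntu-16.04-x86_64-test", "Ubuntu-18.04-x86_64-test"):
    print(name, skip_xfstests(name))
```

If the two Ubuntu arms were swapped, the specific pattern would never be reached, because the generic one already matches every Ubuntu builder.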
+1 -1
@@ -1,4 +1,4 @@
#!/bin/sh
autoreconf -fiv || exit 1
autoreconf -fiv
rm -Rf autom4te.cache
+3 -10
@@ -1,10 +1,3 @@
SUBDIRS = zfs zpool zdb zhack zinject zstream zstreamdump ztest
SUBDIRS += fsck_zfs vdev_id raidz_test zfs_ids_to_path
if USING_PYTHON
SUBDIRS += arcstat arc_summary dbufstat
endif
if BUILD_LINUX
SUBDIRS += mount_zfs zed zgenhostid zvol_id zvol_wait
endif
SUBDIRS = zfs zpool zdb zhack zinject zstreamdump ztest zpios
SUBDIRS += mount_zfs fsck_zfs zvol_id vdev_id arcstat dbufstat zed
SUBDIRS += arc_summary raidz_test zgenhostid
-1
@@ -1 +0,0 @@
arc_summary
+1 -13
@@ -1,13 +1 @@
bin_SCRIPTS = arc_summary
CLEANFILES = arc_summary
EXTRA_DIST = arc_summary2 arc_summary3
if USING_PYTHON_2
SCRIPT = arc_summary2
else
SCRIPT = arc_summary3
endif
arc_summary: $(SCRIPT)
cp $< $@
dist_bin_SCRIPTS = arc_summary.py
@@ -1,4 +1,4 @@
#!/usr/bin/env python2
#!/usr/bin/python
#
# $Id: arc_summary.pl,v 388:e27800740aa2 2011-07-08 02:53:29Z jhell $
#
@@ -35,14 +35,12 @@
# Note some of this code uses older code (eg getopt instead of argparse,
# subprocess.Popen() instead of subprocess.run()) because we need to support
# some very old versions of Python.
#
"""Print statistics on the ZFS Adjustable Replacement Cache (ARC)
Provides basic information on the ARC, its efficiency, the L2ARC (if present),
the Data Management Unit (DMU), Virtual Devices (VDEVs), and tunables. See the
in-source documentation and code at
https://github.com/openzfs/zfs/blob/master/module/zfs/arc.c for details.
https://github.com/zfsonlinux/zfs/blob/master/module/zfs/arc.c for details.
"""
import getopt
@@ -54,44 +52,6 @@ import errno
from subprocess import Popen, PIPE
from decimal import Decimal as D
if sys.platform.startswith('freebsd'):
# Requires py27-sysctl on FreeBSD
import sysctl
def load_kstats(namespace):
"""Collect information on a specific subsystem of the ARC"""
base = 'kstat.zfs.misc.%s.' % namespace
return [(kstat.name, D(kstat.value)) for kstat in sysctl.filter(base)]
def load_tunables():
return dict((ctl.name, ctl.value) for ctl in sysctl.filter('vfs.zfs'))
elif sys.platform.startswith('linux'):
def load_kstats(namespace):
"""Collect information on a specific subsystem of the ARC"""
kstat = 'kstat.zfs.misc.%s.%%s' % namespace
path = '/proc/spl/kstat/zfs/%s' % namespace
with open(path) as f:
entries = [line.strip().split() for line in f][2:] # Skip header
return [(kstat % name, D(value)) for name, _, value in entries]
def load_tunables():
basepath = '/sys/module/zfs/parameters'
tunables = {}
for name in os.listdir(basepath):
if not name:
continue
path = '%s/%s' % (basepath, name)
with open(path) as f:
value = f.read()
tunables[name] = value.strip()
return tunables
show_tunable_descriptions = False
alternate_tunable_layout = False
@@ -114,10 +74,24 @@ def get_Kstat():
of the same name.
"""
def load_proc_kstats(fn, namespace):
"""Collect information on a specific subsystem of the ARC"""
kstats = [line.strip() for line in open(fn)]
del kstats[0:2]
for kstat in kstats:
kstat = kstat.strip()
name, _, value = kstat.split()
Kstat[namespace + name] = D(value)
Kstat = {}
Kstat.update(load_kstats('arcstats'))
Kstat.update(load_kstats('zfetchstats'))
Kstat.update(load_kstats('vdev_cache_stats'))
load_proc_kstats('/proc/spl/kstat/zfs/arcstats',
'kstat.zfs.misc.arcstats.')
load_proc_kstats('/proc/spl/kstat/zfs/zfetchstats',
'kstat.zfs.misc.zfetchstats.')
load_proc_kstats('/proc/spl/kstat/zfs/vdev_cache_stats',
'kstat.zfs.misc.vdev_cache_stats.')
return Kstat
@@ -230,10 +204,6 @@ def get_arc_summary(Kstat):
arc_size = Kstat["kstat.zfs.misc.arcstats.size"]
mru_size = Kstat["kstat.zfs.misc.arcstats.mru_size"]
mfu_size = Kstat["kstat.zfs.misc.arcstats.mfu_size"]
meta_limit = Kstat["kstat.zfs.misc.arcstats.arc_meta_limit"]
meta_size = Kstat["kstat.zfs.misc.arcstats.arc_meta_used"]
dnode_limit = Kstat["kstat.zfs.misc.arcstats.arc_dnode_limit"]
dnode_size = Kstat["kstat.zfs.misc.arcstats.dnode_size"]
target_max_size = Kstat["kstat.zfs.misc.arcstats.c_max"]
target_min_size = Kstat["kstat.zfs.misc.arcstats.c_min"]
target_size = Kstat["kstat.zfs.misc.arcstats.c"]
@@ -258,22 +228,6 @@ def get_arc_summary(Kstat):
'per': fPerc(target_size, target_max_size),
'num': fBytes(target_size),
}
output['arc_sizing']['meta_limit'] = {
'per': fPerc(meta_limit, target_max_size),
'num': fBytes(meta_limit),
}
output['arc_sizing']['meta_size'] = {
'per': fPerc(meta_size, meta_limit),
'num': fBytes(meta_size),
}
output['arc_sizing']['dnode_limit'] = {
'per': fPerc(dnode_limit, meta_limit),
'num': fBytes(dnode_limit),
}
output['arc_sizing']['dnode_size'] = {
'per': fPerc(dnode_size, dnode_limit),
'num': fBytes(dnode_size),
}
# ARC Hash Breakdown
output['arc_hash_break'] = {}
@@ -379,26 +333,6 @@ def _arc_summary(Kstat):
arc['arc_size_break']['frequently_used_cache_size']['num'],
)
)
sys.stdout.write("\tMetadata Size (Hard Limit):\t%s\t%s\n" % (
arc['arc_sizing']['meta_limit']['per'],
arc['arc_sizing']['meta_limit']['num'],
)
)
sys.stdout.write("\tMetadata Size:\t\t\t%s\t%s\n" % (
arc['arc_sizing']['meta_size']['per'],
arc['arc_sizing']['meta_size']['num'],
)
)
sys.stdout.write("\tDnode Size (Hard Limit):\t%s\t%s\n" % (
arc['arc_sizing']['dnode_limit']['per'],
arc['arc_sizing']['dnode_limit']['num'],
)
)
sys.stdout.write("\tDnode Size:\t\t\t%s\t%s\n" % (
arc['arc_sizing']['dnode_size']['per'],
arc['arc_sizing']['dnode_size']['num'],
)
)
sys.stdout.write("\n")
@@ -945,7 +879,14 @@ def _tunable_summary(Kstat):
global show_tunable_descriptions
global alternate_tunable_layout
tunables = load_tunables()
names = os.listdir("/sys/module/zfs/parameters/")
values = {}
for name in names:
with open("/sys/module/zfs/parameters/" + name) as f:
value = f.read()
values[name] = value.strip()
descriptions = {}
if show_tunable_descriptions:
@@ -983,17 +924,22 @@ def _tunable_summary(Kstat):
sys.stderr.write("Tunable descriptions will be disabled.\n")
sys.stdout.write("ZFS Tunables:\n")
names.sort()
if alternate_tunable_layout:
fmt = "\t%s=%s\n"
else:
fmt = "\t%-50s%s\n"
for name in sorted(tunables.keys()):
for name in names:
if not name:
continue
if show_tunable_descriptions and name in descriptions:
sys.stdout.write("\t# %s\n" % descriptions[name])
sys.stdout.write(fmt % (name, tunables[name]))
sys.stdout.write(fmt % (name, values[name]))
unSub = [
@@ -1019,7 +965,7 @@ def zfs_header():
def usage():
"""Print usage information"""
sys.stdout.write("Usage: arc_summary [-h] [-a] [-d] [-p PAGE]\n\n")
sys.stdout.write("Usage: arc_summary.py [-h] [-a] [-d] [-p PAGE]\n\n")
sys.stdout.write("\t -h, --help : "
"Print this help message and exit\n")
sys.stdout.write("\t -a, --alternate : "
@@ -1032,10 +978,10 @@ def usage():
"should be an integer between 1 and " +
str(len(unSub)) + "\n\n")
sys.stdout.write("Examples:\n")
sys.stdout.write("\tarc_summary -a\n")
sys.stdout.write("\tarc_summary -p 4\n")
sys.stdout.write("\tarc_summary -ad\n")
sys.stdout.write("\tarc_summary --page=2\n")
sys.stdout.write("\tarc_summary.py -a\n")
sys.stdout.write("\tarc_summary.py -p 4\n")
sys.stdout.write("\tarc_summary.py -ad\n")
sys.stdout.write("\tarc_summary.py --page=2\n")
def main():
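Both variants of the kstat loader in this diff do the same thing: drop the two header lines of a `/proc/spl/kstat/zfs/<section>` file, split each remaining row into name, type and value, and store the value as a `Decimal` under a `kstat.zfs.misc.<section>.` key. The sketch below shows that parsing step on an in-memory sample so it can run without a loaded ZFS module. `parse_kstats` is a hypothetical helper, and the sample header and counter values are made up for illustration.

```python
from decimal import Decimal

# Hypothetical sample in the layout of /proc/spl/kstat/zfs/arcstats:
# two header lines followed by "name type value" rows.
SAMPLE = """\
13 1 0x01 96 26112 8456927362 1225438947321
name                            type data
hits                            4    1024
misses                          4    256
size                            4    4294967296
"""


def parse_kstats(text, prefix):
    """Map prefix + name to Decimal(value) for every data row in text."""
    stats = {}
    for line in text.splitlines()[2:]:    # skip the two header lines
        fields = line.split()
        if len(fields) != 3:              # ignore blank or malformed rows
            continue
        name, _type, value = fields
        stats[prefix + name] = Decimal(value)
    return stats


kstat = parse_kstats(SAMPLE, "kstat.zfs.misc.arcstats.")
print(kstat["kstat.zfs.misc.arcstats.size"])
```

Keeping the values as `Decimal` rather than `float` follows the original script and avoids rounding surprises when large byte counters are turned into percentages.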
-943
@@ -1,943 +0,0 @@
#!/usr/bin/env python3
#
# Copyright (c) 2008 Ben Rockwood <benr@cuddletech.com>,
# Copyright (c) 2010 Martin Matuska <mm@FreeBSD.org>,
# Copyright (c) 2010-2011 Jason J. Hellenthal <jhell@DataIX.net>,
# Copyright (c) 2017 Scot W. Stevenson <scot.stevenson@gmail.com>
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
#
# 1. Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
#
# THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND
# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
# ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE
# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
# OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
# HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
# LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
# OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
# SUCH DAMAGE.
"""Print statistics on the ZFS ARC Cache and other information
Provides basic information on the ARC, its efficiency, the L2ARC (if present),
the Data Management Unit (DMU), Virtual Devices (VDEVs), and tunables. See
the in-source documentation and code at
https://github.com/openzfs/zfs/blob/master/module/zfs/arc.c for details.
The original introduction to arc_summary can be found at
http://cuddletech.com/?p=454
"""
import argparse
import os
import subprocess
import sys
import time
DESCRIPTION = 'Print ARC and other statistics for OpenZFS'
INDENT = ' '*8
LINE_LENGTH = 72
DATE_FORMAT = '%a %b %d %H:%M:%S %Y'
TITLE = 'ZFS Subsystem Report'
SECTIONS = 'arc archits dmu l2arc spl tunables vdev zil'.split()
SECTION_HELP = 'print info from one section ('+' '.join(SECTIONS)+')'
# Tunables and SPL are handled separately because they come from
# different sources
SECTION_PATHS = {'arc': 'arcstats',
'dmu': 'dmu_tx',
'l2arc': 'arcstats', # L2ARC stuff lives in arcstats
'vdev': 'vdev_cache_stats',
'xuio': 'xuio_stats',
'zfetch': 'zfetchstats',
'zil': 'zil'}
parser = argparse.ArgumentParser(description=DESCRIPTION)
parser.add_argument('-a', '--alternate', action='store_true', default=False,
help='use alternate formatting for tunables and SPL',
dest='alt')
parser.add_argument('-d', '--description', action='store_true', default=False,
help='print descriptions with tunables and SPL',
dest='desc')
parser.add_argument('-g', '--graph', action='store_true', default=False,
help='print graph on ARC use and exit', dest='graph')
parser.add_argument('-p', '--page', type=int, dest='page',
help='print page by number (DEPRECATED, use "-s")')
parser.add_argument('-r', '--raw', action='store_true', default=False,
help='dump all available data with minimal formatting',
dest='raw')
parser.add_argument('-s', '--section', dest='section', help=SECTION_HELP)
ARGS = parser.parse_args()
if sys.platform.startswith('freebsd'):
# Requires py36-sysctl on FreeBSD
import sysctl
VDEV_CACHE_SIZE = 'vdev.cache_size'
def load_kstats(section):
base = 'kstat.zfs.misc.{section}.'.format(section=section)
# base is removed from the name
fmt = lambda kstat: '{name} : {value}'.format(name=kstat.name[len(base):],
value=kstat.value)
return [fmt(kstat) for kstat in sysctl.filter(base)]
def get_params(base):
cut = 8 # = len('vfs.zfs.')
return {ctl.name[cut:]: str(ctl.value) for ctl in sysctl.filter(base)}
def get_tunable_params():
return get_params('vfs.zfs')
def get_vdev_params():
return get_params('vfs.zfs.vdev')
def get_version_impl(request):
# FreeBSD reports versions for zpl and spa instead of zfs and spl.
name = {'zfs': 'zpl',
'spl': 'spa'}[request]
mib = 'vfs.zfs.version.{}'.format(name)
version = sysctl.filter(mib)[0].value
return '{} version {}'.format(name, version)
def get_descriptions(_request):
# py-sysctl doesn't give descriptions, so we have to shell out.
command = ['sysctl', '-d', 'vfs.zfs']
# The recommended way to do this is with subprocess.run(). However,
# some installed versions of Python are < 3.5, so we offer them
# the option of doing it the old way (for now)
if 'run' in dir(subprocess):
info = subprocess.run(command, stdout=subprocess.PIPE,
universal_newlines=True)
lines = info.stdout.split('\n')
else:
info = subprocess.check_output(command, universal_newlines=True)
lines = info.split('\n')
def fmt(line):
name, desc = line.split(':', 1)
return (name.strip(), desc.strip())
return dict([fmt(line) for line in lines if len(line) > 0])
elif sys.platform.startswith('linux'):
KSTAT_PATH = '/proc/spl/kstat/zfs'
SPL_PATH = '/sys/module/spl/parameters'
TUNABLES_PATH = '/sys/module/zfs/parameters'
VDEV_CACHE_SIZE = 'zfs_vdev_cache_size'
def load_kstats(section):
path = os.path.join(KSTAT_PATH, section)
with open(path) as f:
return list(f)[2:] # Get rid of header
def get_params(basepath):
"""Collect information on the Solaris Porting Layer (SPL) or the
tunables, depending on the PATH given. Does not check if PATH is
legal.
"""
result = {}
for name in os.listdir(basepath):
path = os.path.join(basepath, name)
with open(path) as f:
value = f.read()
result[name] = value.strip()
return result
def get_spl_params():
return get_params(SPL_PATH)
def get_tunable_params():
return get_params(TUNABLES_PATH)
def get_vdev_params():
return get_params(TUNABLES_PATH)
def get_version_impl(request):
# The original arc_summary called /sbin/modinfo/{spl,zfs} to get
# the version information. We switch to /sys/module/{spl,zfs}/version
# to make sure we get what is really loaded in the kernel
command = ["cat", "/sys/module/{0}/version".format(request)]
req = request.upper()
# The recommended way to do this is with subprocess.run(). However,
# some installed versions of Python are < 3.5, so we offer them
# the option of doing it the old way (for now)
if 'run' in dir(subprocess):
info = subprocess.run(command, stdout=subprocess.PIPE,
universal_newlines=True)
version = info.stdout.strip()
else:
info = subprocess.check_output(command, universal_newlines=True)
version = info.strip()
return version
def get_descriptions(request):
"""Get the descriptions of the Solaris Porting Layer (SPL) or the
tunables, return with minimal formatting.
"""
if request not in ('spl', 'zfs'):
print('ERROR: description of "{0}" requested'.format(request))
sys.exit(1)
descs = {}
target_prefix = 'parm:'
# We would prefer to do this with /sys/module -- see the discussion at
# get_version_impl() -- but there isn't a way to get the descriptions from
# there, so we fall back on modinfo
command = ["/sbin/modinfo", request, "-0"]
# The recommended way to do this is with subprocess.run(). However,
# some installed versions of Python are < 3.5, so we offer them
# the option of doing it the old way (for now)
info = ''
try:
if 'run' in dir(subprocess):
info = subprocess.run(command, stdout=subprocess.PIPE,
universal_newlines=True)
raw_output = info.stdout.split('\0')
else:
info = subprocess.check_output(command,
universal_newlines=True)
raw_output = info.split('\0')
except subprocess.CalledProcessError:
print("Error: Descriptions not available",
"(can't access kernel module)")
sys.exit(1)
for line in raw_output:
if not line.startswith(target_prefix):
continue
line = line[len(target_prefix):].strip()
name, raw_desc = line.split(':', 1)
desc = raw_desc.rsplit('(', 1)[0]
if desc == '':
desc = '(No description found)'
descs[name.strip()] = desc.strip()
return descs
def cleanup_line(single_line):
"""Format a raw line of data from /proc and isolate the name value
part, returning a tuple with each. Currently, this gets rid of the
middle '4'. For example "arc_no_grow 4 0" returns the tuple
("arc_no_grow", "0").
"""
name, _, value = single_line.split()
return name, value
def draw_graph(kstats_dict):
"""Draw a primitive graph representing the basic information on the
ARC -- its size and the proportion used by MFU and MRU -- and quit.
We use max size of the ARC to calculate how full it is. This is a
very rough representation.
"""
arc_stats = isolate_section('arcstats', kstats_dict)
GRAPH_INDENT = ' '*4
GRAPH_WIDTH = 60
arc_size = f_bytes(arc_stats['size'])
arc_perc = f_perc(arc_stats['size'], arc_stats['c_max'])
mfu_size = f_bytes(arc_stats['mfu_size'])
mru_size = f_bytes(arc_stats['mru_size'])
meta_limit = f_bytes(arc_stats['arc_meta_limit'])
meta_size = f_bytes(arc_stats['arc_meta_used'])
dnode_limit = f_bytes(arc_stats['arc_dnode_limit'])
dnode_size = f_bytes(arc_stats['dnode_size'])
info_form = ('ARC: {0} ({1}) MFU: {2} MRU: {3} META: {4} ({5}) '
'DNODE {6} ({7})')
info_line = info_form.format(arc_size, arc_perc, mfu_size, mru_size,
meta_size, meta_limit, dnode_size,
dnode_limit)
info_spc = ' '*int((GRAPH_WIDTH-len(info_line))/2)
info_line = GRAPH_INDENT+info_spc+info_line
graph_line = GRAPH_INDENT+'+'+('-'*(GRAPH_WIDTH-2))+'+'
mfu_perc = float(int(arc_stats['mfu_size'])/int(arc_stats['c_max']))
mru_perc = float(int(arc_stats['mru_size'])/int(arc_stats['c_max']))
arc_perc = float(int(arc_stats['size'])/int(arc_stats['c_max']))
total_ticks = float(arc_perc)*GRAPH_WIDTH
mfu_ticks = mfu_perc*GRAPH_WIDTH
mru_ticks = mru_perc*GRAPH_WIDTH
other_ticks = total_ticks-(mfu_ticks+mru_ticks)
core_form = 'F'*int(mfu_ticks)+'R'*int(mru_ticks)+'O'*int(other_ticks)
core_spc = ' '*(GRAPH_WIDTH-(2+len(core_form)))
core_line = GRAPH_INDENT+'|'+core_form+core_spc+'|'
for line in ('', info_line, graph_line, core_line, graph_line, ''):
print(line)
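The tick arithmetic in draw_graph() is easier to follow with numbers plugged in. The sketch below isolates the bar construction in a hypothetical helper, `graph_core_line`, which is not part of the original script and takes plain integers instead of the kstat dictionary. It assumes Python 3 division semantics.

```python
GRAPH_INDENT = ' ' * 4
GRAPH_WIDTH = 60


def graph_core_line(size, mfu_size, mru_size, c_max):
    """Build the bar line: F = MFU, R = MRU, O = other, scaled to c_max."""
    total_ticks = size / c_max * GRAPH_WIDTH
    mfu_ticks = mfu_size / c_max * GRAPH_WIDTH
    mru_ticks = mru_size / c_max * GRAPH_WIDTH
    other_ticks = total_ticks - (mfu_ticks + mru_ticks)
    # int() truncates, so fractional ticks are dropped per category
    core_form = ('F' * int(mfu_ticks) + 'R' * int(mru_ticks) +
                 'O' * int(other_ticks))
    core_spc = ' ' * (GRAPH_WIDTH - (2 + len(core_form)))
    return GRAPH_INDENT + '|' + core_form + core_spc + '|'


# ARC half full: a quarter of c_max in MFU, an eighth in MRU
line = graph_core_line(size=512, mfu_size=256, mru_size=128, c_max=1024)
print(line)
```

Because each category is truncated separately, the bar can come out a tick or two shorter than the rounded total, which fits the docstring's description of a very rough representation.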
def f_bytes(byte_string):
"""Return human-readable representation of a byte value in
powers of 2 (eg "KiB" for "kibibytes", etc) to one decimal
place. Values smaller than one KiB are returned without
decimal points. Note "bytes" is the name of a built-in type.
"""
prefixes = ([2**80, "YiB"], # yobibytes (yotta)
[2**70, "ZiB"], # zebibytes (zetta)
[2**60, "EiB"], # exbibytes (exa)
[2**50, "PiB"], # pebibytes (peta)
[2**40, "TiB"], # tebibytes (tera)
[2**30, "GiB"], # gibibytes (giga)
[2**20, "MiB"], # mebibytes (mega)
[2**10, "KiB"]) # kibibytes (kilo)
bites = int(byte_string)
if bites >= 2**10:
for limit, unit in prefixes:
if bites >= limit:
value = bites / limit
break
result = '{0:.1f} {1}'.format(value, unit)
else:
result = '{0} Bytes'.format(bites)
return result
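A few sample calls show how the prefix table is walked from largest to smallest. The function body below is a re-indented copy of the one above with a shortened prefix table, so treat it as a sketch rather than the script itself.

```python
def f_bytes(byte_string):
    """Format a byte count with binary prefixes, one decimal place."""
    prefixes = ([2**40, "TiB"],
                [2**30, "GiB"],
                [2**20, "MiB"],
                [2**10, "KiB"])
    bites = int(byte_string)
    if bites >= 2**10:
        # The first limit that fits is the largest suitable unit
        for limit, unit in prefixes:
            if bites >= limit:
                value = bites / limit
                break
        result = '{0:.1f} {1}'.format(value, unit)
    else:
        result = '{0} Bytes'.format(bites)
    return result


print(f_bytes('512'))           # below one KiB, no decimals
print(f_bytes('1536'))          # 1.5 KiB
print(f_bytes(str(3 * 2**30)))  # 3.0 GiB
```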
def f_hits(hits_string):
"""Create a human-readable representation of the number of hits.
The single-letter symbols used are SI to avoid the confusion caused
by the different "short scale" and "long scale" representations in
English, which use the same words for different values. See
https://en.wikipedia.org/wiki/Names_of_large_numbers and:
https://physics.nist.gov/cuu/Units/prefixes.html
"""
numbers = ([10**24, 'Y'], # yotta (septillion)
[10**21, 'Z'], # zetta (sextillion)
[10**18, 'E'], # exa (quintillion)
[10**15, 'P'], # peta (quadrillion)
[10**12, 'T'], # tera (trillion)
[10**9, 'G'], # giga (billion)
[10**6, 'M'], # mega (million)
[10**3, 'k']) # kilo (thousand)
hits = int(hits_string)
if hits >= 1000:
for limit, symbol in numbers:
if hits >= limit:
value = hits/limit
break
result = "%0.1f%s" % (value, symbol)
else:
result = "%d" % hits
return result
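The same pattern with decimal (SI) prefixes can be checked with a few inputs. This is a re-indented sketch of the function above with a shortened table, not a replacement for it.

```python
def f_hits(hits_string):
    """Format a hit count with single-letter SI prefixes."""
    numbers = ([10**12, 'T'],
               [10**9, 'G'],
               [10**6, 'M'],
               [10**3, 'k'])
    hits = int(hits_string)
    if hits >= 1000:
        for limit, symbol in numbers:
            if hits >= limit:
                value = hits / limit
                break
        result = "%0.1f%s" % (value, symbol)
    else:
        result = "%d" % hits
    return result


print(f_hits('999'))      # small values are printed as-is
print(f_hits('1500'))     # 1.5k
print(f_hits('2500000'))  # 2.5M
```

Unlike f_bytes(), there is no space between the number and the symbol.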
def f_perc(value1, value2):
"""Calculate percentage and return in human-readable form. If
rounding produces the result '0.0' though the first number is
not zero, include a 'less-than' symbol to avoid confusion.
Division by zero is handled by returning 'n/a'; no error
is raised.
"""
v1 = float(value1)
v2 = float(value2)
try:
perc = 100 * v1/v2
except ZeroDivisionError:
result = 'n/a'
else:
result = '{0:0.1f} %'.format(perc)
if result == '0.0 %' and v1 > 0:
result = '< 0.1 %'
return result
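The three cases named in the docstring (normal value, tiny non-zero value, zero denominator) can be exercised directly. The block below is a re-indented sketch of the function above with sample inputs chosen for illustration.

```python
def f_perc(value1, value2):
    """Return value1 as a percentage of value2 in human-readable form."""
    v1 = float(value1)
    v2 = float(value2)
    try:
        perc = 100 * v1/v2
    except ZeroDivisionError:
        result = 'n/a'
    else:
        result = '{0:0.1f} %'.format(perc)
        # A non-zero value that rounds to 0.0 would look like "nothing"
        if result == '0.0 %' and v1 > 0:
            result = '< 0.1 %'
    return result


print(f_perc('1', '3'))       # ordinary case
print(f_perc('1', '100000'))  # rounds to 0.0, flagged as "< 0.1 %"
print(f_perc('0', '5'))       # a true zero stays "0.0 %"
print(f_perc('1', '0'))       # zero denominator gives "n/a"
```

Float division by zero raises ZeroDivisionError in Python, so the except branch is reached even though both operands are floats.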
def format_raw_line(name, value):
"""For the --raw option for the tunable and SPL outputs, decide on the
correct formatting based on the --alternate flag.
"""
if ARGS.alt:
result = '{0}{1}={2}'.format(INDENT, name, value)
else:
spc = LINE_LENGTH-(len(INDENT)+len(value))
result = '{0}{1:<{spc}}{2}'.format(INDENT, name, value, spc=spc)
return result
def get_kstats():
"""Collect information on the ZFS subsystem. This step does not perform any
further processing, giving us the option to only work on what is actually
needed. The name "kstat" is a holdover from the Solaris utility of the same
name.
"""
result = {}
for section in SECTION_PATHS.values():
if section not in result:
result[section] = load_kstats(section)
return result
def get_version(request):
"""Get the version number of ZFS or SPL on this machine for header.
Returns an error string, but does not raise an error, if we can't
get the ZFS/SPL version.
"""
if request not in ('spl', 'zfs'):
error_msg = '(ERROR: "{0}" requested)'.format(request)
return error_msg
return get_version_impl(request)
def print_header():
"""Print the initial heading with date and time as well as info on the
kernel and ZFS versions. This is not called for the graph.
"""
# datetime is now recommended over time but we keep the exact formatting
# from the older version of arc_summary in case there are scripts
# that expect it in this way
daydate = time.strftime(DATE_FORMAT)
spc_date = LINE_LENGTH-len(daydate)
sys_version = os.uname()
sys_msg = sys_version.sysname+' '+sys_version.release
zfs = get_version('zfs')
spc_zfs = LINE_LENGTH-len(zfs)
machine_msg = 'Machine: '+sys_version.nodename+' ('+sys_version.machine+')'
spl = get_version('spl')
spc_spl = LINE_LENGTH-len(spl)
print('\n'+('-'*LINE_LENGTH))
print('{0:<{spc}}{1}'.format(TITLE, daydate, spc=spc_date))
print('{0:<{spc}}{1}'.format(sys_msg, zfs, spc=spc_zfs))
print('{0:<{spc}}{1}\n'.format(machine_msg, spl, spc=spc_spl))
def print_raw(kstats_dict):
"""Print all available data from the system in a minimally sorted format.
This can be used as a source to be piped through 'grep'.
"""
sections = sorted(kstats_dict.keys())
for section in sections:
print('\n{0}:'.format(section.upper()))
lines = sorted(kstats_dict[section])
for line in lines:
name, value = cleanup_line(line)
print(format_raw_line(name, value))
# Tunables and SPL must be handled separately because they come from a
# different source and have descriptions the user might request
print()
section_spl()
section_tunables()
def isolate_section(section_name, kstats_dict):
"""From the complete information on all sections, retrieve only those
for one section.
"""
try:
section_data = kstats_dict[section_name]
except KeyError:
print('ERROR: Data on {0} not available'.format(section_name))
sys.exit(1)
section_dict = dict(cleanup_line(l) for l in section_data)
return section_dict
# Formatted output helper functions
def prt_1(text, value):
"""Print text and one value, no indent"""
spc = ' '*(LINE_LENGTH-(len(text)+len(value)))
print('{0}{spc}{1}'.format(text, value, spc=spc))
def prt_i1(text, value):
"""Print text and one value, with indent"""
spc = ' '*(LINE_LENGTH-(len(INDENT)+len(text)+len(value)))
print(INDENT+'{0}{spc}{1}'.format(text, value, spc=spc))
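The padding arithmetic in these helpers right-aligns the value at a fixed line width. The sketch below returns the string instead of printing it so the width can be checked. LINE_LENGTH and INDENT are defined outside this section, so the values used here (72 columns, 8 spaces) are assumptions, and `fmt_1` and `fmt_i1` are hypothetical names.

```python
LINE_LENGTH = 72   # assumed value; the real constant is defined elsewhere
INDENT = ' ' * 8   # assumed value; the real constant is defined elsewhere


def fmt_1(text, value):
    """Text on the left, value flush right, no indent."""
    spc = ' ' * (LINE_LENGTH - (len(text) + len(value)))
    return '{0}{spc}{1}'.format(text, value, spc=spc)


def fmt_i1(text, value):
    """Same as fmt_1, but the text is indented."""
    spc = ' ' * (LINE_LENGTH - (len(INDENT) + len(text) + len(value)))
    return INDENT + '{0}{spc}{1}'.format(text, value, spc=spc)


plain = fmt_1('ARC status:', 'HEALTHY')
indented = fmt_i1('Memory throttle count:', '0')
print(plain)
print(indented)
```

Both lines come out exactly LINE_LENGTH characters wide, so the values line up in a column regardless of indentation.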
def prt_2(text, value1, value2):
"""Print text and two values, no indent"""
values = '{0:>9} {1:>9}'.format(value1, value2)
spc = ' '*(LINE_LENGTH-(len(text)+len(values)+2))
print('{0}{spc}  {1}'.format(text, values, spc=spc))
def prt_i2(text, value1, value2):
"""Print text and two values, with indent"""
values = '{0:>9} {1:>9}'.format(value1, value2)
spc = ' '*(LINE_LENGTH-(len(INDENT)+len(text)+len(values)+2))
print(INDENT+'{0}{spc}  {1}'.format(text, values, spc=spc))
# The section output concentrates on important parameters instead of
# being exhaustive (that is what the --raw parameter is for)
def section_arc(kstats_dict):
"""Give basic information on the ARC, MRU and MFU. This is the first
and most used section.
"""
arc_stats = isolate_section('arcstats', kstats_dict)
throttle = arc_stats['memory_throttle_count']
if throttle == '0':
health = 'HEALTHY'
else:
health = 'THROTTLED'
prt_1('ARC status:', health)
prt_i1('Memory throttle count:', throttle)
print()
arc_size = arc_stats['size']
arc_target_size = arc_stats['c']
arc_max = arc_stats['c_max']
arc_min = arc_stats['c_min']
mfu_size = arc_stats['mfu_size']
mru_size = arc_stats['mru_size']
meta_limit = arc_stats['arc_meta_limit']
meta_size = arc_stats['arc_meta_used']
dnode_limit = arc_stats['arc_dnode_limit']
dnode_size = arc_stats['dnode_size']
target_size_ratio = '{0}:1'.format(int(arc_max) // int(arc_min))
prt_2('ARC size (current):',
f_perc(arc_size, arc_max), f_bytes(arc_size))
prt_i2('Target size (adaptive):',
f_perc(arc_target_size, arc_max), f_bytes(arc_target_size))
prt_i2('Min size (hard limit):',
f_perc(arc_min, arc_max), f_bytes(arc_min))
prt_i2('Max size (high water):',
target_size_ratio, f_bytes(arc_max))
caches_size = int(mfu_size)+int(mru_size)
prt_i2('Most Frequently Used (MFU) cache size:',
f_perc(mfu_size, caches_size), f_bytes(mfu_size))
prt_i2('Most Recently Used (MRU) cache size:',
f_perc(mru_size, caches_size), f_bytes(mru_size))
prt_i2('Metadata cache size (hard limit):',
f_perc(meta_limit, arc_max), f_bytes(meta_limit))
prt_i2('Metadata cache size (current):',
f_perc(meta_size, meta_limit), f_bytes(meta_size))
prt_i2('Dnode cache size (hard limit):',
f_perc(dnode_limit, meta_limit), f_bytes(dnode_limit))
prt_i2('Dnode cache size (current):',
f_perc(dnode_size, dnode_limit), f_bytes(dnode_size))
print()
print('ARC hash breakdown:')
prt_i1('Elements max:', f_hits(arc_stats['hash_elements_max']))
prt_i2('Elements current:',
f_perc(arc_stats['hash_elements'], arc_stats['hash_elements_max']),
f_hits(arc_stats['hash_elements']))
prt_i1('Collisions:', f_hits(arc_stats['hash_collisions']))
prt_i1('Chain max:', f_hits(arc_stats['hash_chain_max']))
prt_i1('Chains:', f_hits(arc_stats['hash_chains']))
print()
print('ARC misc:')
prt_i1('Deleted:', f_hits(arc_stats['deleted']))
prt_i1('Mutex misses:', f_hits(arc_stats['mutex_miss']))
prt_i1('Eviction skips:', f_hits(arc_stats['evict_skip']))
print()
def section_archits(kstats_dict):
"""Print information on how the caches are accessed ("arc hits").
"""
arc_stats = isolate_section('arcstats', kstats_dict)
all_accesses = int(arc_stats['hits'])+int(arc_stats['misses'])
actual_hits = int(arc_stats['mfu_hits'])+int(arc_stats['mru_hits'])
prt_1('ARC total accesses (hits + misses):', f_hits(all_accesses))
ta_todo = (('Cache hit ratio:', arc_stats['hits']),
('Cache miss ratio:', arc_stats['misses']),
('Actual hit ratio (MFU + MRU hits):', actual_hits))
for title, value in ta_todo:
prt_i2(title, f_perc(value, all_accesses), f_hits(value))
dd_total = int(arc_stats['demand_data_hits']) +\
int(arc_stats['demand_data_misses'])
prt_i2('Data demand efficiency:',
f_perc(arc_stats['demand_data_hits'], dd_total),
f_hits(dd_total))
dp_total = int(arc_stats['prefetch_data_hits']) +\
int(arc_stats['prefetch_data_misses'])
prt_i2('Data prefetch efficiency:',
f_perc(arc_stats['prefetch_data_hits'], dp_total),
f_hits(dp_total))
known_hits = int(arc_stats['mfu_hits']) +\
int(arc_stats['mru_hits']) +\
int(arc_stats['mfu_ghost_hits']) +\
int(arc_stats['mru_ghost_hits'])
anon_hits = int(arc_stats['hits'])-known_hits
print()
print('Cache hits by cache type:')
cl_todo = (('Most frequently used (MFU):', arc_stats['mfu_hits']),
('Most recently used (MRU):', arc_stats['mru_hits']),
('Most frequently used (MFU) ghost:',
arc_stats['mfu_ghost_hits']),
('Most recently used (MRU) ghost:',
arc_stats['mru_ghost_hits']))
for title, value in cl_todo:
prt_i2(title, f_perc(value, arc_stats['hits']), f_hits(value))
# For some reason, anon_hits can turn negative, which is weird. Until we
# have figured out why this happens, we just hide the problem, following
# the behavior of the original arc_summary.
if anon_hits >= 0:
prt_i2('Anonymously used:',
f_perc(anon_hits, arc_stats['hits']), f_hits(anon_hits))
print()
print('Cache hits by data type:')
dt_todo = (('Demand data:', arc_stats['demand_data_hits']),
('Demand prefetch data:', arc_stats['prefetch_data_hits']),
('Demand metadata:', arc_stats['demand_metadata_hits']),
('Demand prefetch metadata:',
arc_stats['prefetch_metadata_hits']))
for title, value in dt_todo:
prt_i2(title, f_perc(value, arc_stats['hits']), f_hits(value))
print()
print('Cache misses by data type:')
dm_todo = (('Demand data:', arc_stats['demand_data_misses']),
('Demand prefetch data:',
arc_stats['prefetch_data_misses']),
('Demand metadata:', arc_stats['demand_metadata_misses']),
('Demand prefetch metadata:',
arc_stats['prefetch_metadata_misses']))
for title, value in dm_todo:
prt_i2(title, f_perc(value, arc_stats['misses']), f_hits(value))
print()
def section_dmu(kstats_dict):
"""Collect information on the DMU"""
zfetch_stats = isolate_section('zfetchstats', kstats_dict)
zfetch_access_total = int(zfetch_stats['hits'])+int(zfetch_stats['misses'])
prt_1('DMU prefetch efficiency:', f_hits(zfetch_access_total))
prt_i2('Hit ratio:', f_perc(zfetch_stats['hits'], zfetch_access_total),
f_hits(zfetch_stats['hits']))
prt_i2('Miss ratio:', f_perc(zfetch_stats['misses'], zfetch_access_total),
f_hits(zfetch_stats['misses']))
print()
def section_l2arc(kstats_dict):
"""Collect information on L2ARC device if present. If not, tell user
that we're skipping the section.
"""
# The L2ARC statistics live in the same section as the normal ARC stuff
arc_stats = isolate_section('arcstats', kstats_dict)
if arc_stats['l2_size'] == '0':
print('L2ARC not detected, skipping section\n')
return
l2_errors = int(arc_stats['l2_writes_error']) +\
int(arc_stats['l2_cksum_bad']) +\
int(arc_stats['l2_io_error'])
l2_access_total = int(arc_stats['l2_hits'])+int(arc_stats['l2_misses'])
health = 'HEALTHY'
if l2_errors > 0:
health = 'DEGRADED'
prt_1('L2ARC status:', health)
l2_todo = (('Low memory aborts:', 'l2_abort_lowmem'),
('Free on write:', 'l2_free_on_write'),
('R/W clashes:', 'l2_rw_clash'),
('Bad checksums:', 'l2_cksum_bad'),
('I/O errors:', 'l2_io_error'))
for title, value in l2_todo:
prt_i1(title, f_hits(arc_stats[value]))
print()
prt_1('L2ARC size (adaptive):', f_bytes(arc_stats['l2_size']))
prt_i2('Compressed:', f_perc(arc_stats['l2_asize'], arc_stats['l2_size']),
f_bytes(arc_stats['l2_asize']))
prt_i2('Header size:',
f_perc(arc_stats['l2_hdr_size'], arc_stats['l2_size']),
f_bytes(arc_stats['l2_hdr_size']))
print()
prt_1('L2ARC breakdown:', f_hits(l2_access_total))
prt_i2('Hit ratio:',
f_perc(arc_stats['l2_hits'], l2_access_total),
f_hits(arc_stats['l2_hits']))
prt_i2('Miss ratio:',
f_perc(arc_stats['l2_misses'], l2_access_total),
f_hits(arc_stats['l2_misses']))
prt_i1('Feeds:', f_hits(arc_stats['l2_feeds']))
print()
print('L2ARC writes:')
if arc_stats['l2_writes_done'] != arc_stats['l2_writes_sent']:
prt_i2('Writes sent:', 'FAULTED', f_hits(arc_stats['l2_writes_sent']))
prt_i2('Done ratio:',
f_perc(arc_stats['l2_writes_done'],
arc_stats['l2_writes_sent']),
f_hits(arc_stats['l2_writes_done']))
prt_i2('Error ratio:',
f_perc(arc_stats['l2_writes_error'],
arc_stats['l2_writes_sent']),
f_hits(arc_stats['l2_writes_error']))
else:
prt_i2('Writes sent:', '100 %', f_hits(arc_stats['l2_writes_sent']))
print()
print('L2ARC evicts:')
prt_i1('Lock retries:', f_hits(arc_stats['l2_evict_lock_retry']))
prt_i1('Upon reading:', f_hits(arc_stats['l2_evict_reading']))
print()
def section_spl(*_):
"""Print the SPL parameters, if requested with alternative format
and/or descriptions. This does not use kstats.
"""
if sys.platform.startswith('freebsd'):
# No SPL support in FreeBSD
return
spls = get_spl_params()
keylist = sorted(spls.keys())
print('Solaris Porting Layer (SPL):')
if ARGS.desc:
descriptions = get_descriptions('spl')
for key in keylist:
value = spls[key]
if ARGS.desc:
try:
print(INDENT+'#', descriptions[key])
except KeyError:
print(INDENT+'# (No description found)') # paranoid
print(format_raw_line(key, value))
print()
def section_tunables(*_):
"""Print the tunables, if requested with alternative format and/or
descriptions. This does not use kstats.
"""
tunables = get_tunable_params()
keylist = sorted(tunables.keys())
print('Tunables:')
if ARGS.desc:
descriptions = get_descriptions('zfs')
for key in keylist:
value = tunables[key]
if ARGS.desc:
try:
print(INDENT+'#', descriptions[key])
except KeyError:
print(INDENT+'# (No description found)') # paranoid
print(format_raw_line(key, value))
print()
def section_vdev(kstats_dict):
"""Collect information on VDEV caches"""
# Currently [Nov 2017] the VDEV cache is disabled, because it is actually
# harmful. When this is the case, we just skip the whole entry. See
# https://github.com/openzfs/zfs/blob/master/module/zfs/vdev_cache.c
# for details
tunables = get_vdev_params()
if tunables[VDEV_CACHE_SIZE] == '0':
print('VDEV cache disabled, skipping section\n')
return
vdev_stats = isolate_section('vdev_cache_stats', kstats_dict)
vdev_cache_total = int(vdev_stats['hits']) +\
int(vdev_stats['misses']) +\
int(vdev_stats['delegations'])
prt_1('VDEV cache summary:', f_hits(vdev_cache_total))
prt_i2('Hit ratio:', f_perc(vdev_stats['hits'], vdev_cache_total),
f_hits(vdev_stats['hits']))
prt_i2('Miss ratio:', f_perc(vdev_stats['misses'], vdev_cache_total),
f_hits(vdev_stats['misses']))
prt_i2('Delegations:', f_perc(vdev_stats['delegations'], vdev_cache_total),
f_hits(vdev_stats['delegations']))
print()
def section_zil(kstats_dict):
"""Collect information on the ZFS Intent Log. Some of the information is
taken from https://github.com/openzfs/zfs/blob/master/include/sys/zil.h
"""
zil_stats = isolate_section('zil', kstats_dict)
prt_1('ZIL committed transactions:',
f_hits(zil_stats['zil_itx_count']))
prt_i1('Commit requests:', f_hits(zil_stats['zil_commit_count']))
prt_i1('Flushes to stable storage:',
f_hits(zil_stats['zil_commit_writer_count']))
prt_i2('Transactions to SLOG storage pool:',
f_bytes(zil_stats['zil_itx_metaslab_slog_bytes']),
f_hits(zil_stats['zil_itx_metaslab_slog_count']))
prt_i2('Transactions to non-SLOG storage pool:',
f_bytes(zil_stats['zil_itx_metaslab_normal_bytes']),
f_hits(zil_stats['zil_itx_metaslab_normal_count']))
print()
section_calls = {'arc': section_arc,
'archits': section_archits,
'dmu': section_dmu,
'l2arc': section_l2arc,
'spl': section_spl,
'tunables': section_tunables,
'vdev': section_vdev,
'zil': section_zil}
def main():
"""Run program. The options to draw a graph and to print all data raw are
treated separately because they come with their own call.
"""
kstats = get_kstats()
if ARGS.graph:
draw_graph(kstats)
sys.exit(0)
print_header()
if ARGS.raw:
print_raw(kstats)
elif ARGS.section:
try:
section_calls[ARGS.section](kstats)
except KeyError:
print('Error: Section "{0}" unknown'.format(ARGS.section))
sys.exit(1)
elif ARGS.page:
print('WARNING: Pages are deprecated, please use "--section"\n')
pages_to_calls = {1: 'arc',
2: 'archits',
3: 'l2arc',
4: 'dmu',
5: 'vdev',
6: 'tunables'}
try:
call = pages_to_calls[ARGS.page]
except KeyError:
print('Error: Page "{0}" not supported'.format(ARGS.page))
sys.exit(1)
else:
section_calls[call](kstats)
else:
# If no parameters were given, we print all sections. We might want to
# change the sequence by hand
calls = sorted(section_calls.keys())
for section in calls:
section_calls[section](kstats)
sys.exit(0)
if __name__ == '__main__':
main()
-1
@@ -1 +0,0 @@
arcstat
+1 -5
@@ -1,5 +1 @@
include $(top_srcdir)/config/Substfiles.am
bin_SCRIPTS = arcstat
SUBSTFILES += $(bin_SCRIPTS)
dist_bin_SCRIPTS = arcstat.py
+66 -124
@@ -1,25 +1,20 @@
#!/usr/bin/env @PYTHON_SHEBANG@
#!/usr/bin/python
#
# Print out ZFS ARC Statistics exported via kstat(1)
# For a definition of fields, or usage, use arcstat -v
# For a definition of fields, or usage, use arcstat.pl -v
#
# This script was originally a fork of the original arcstat.pl (0.1)
# by Neelakanth Nadgir, originally published on his Sun blog on
# This script is a fork of the original arcstat.pl (0.1) by
# Neelakanth Nadgir, originally published on his Sun blog on
# 09/18/2007
# http://blogs.sun.com/realneel/entry/zfs_arc_statistics
#
# A new version aimed to improve upon the original by adding features
# and fixing bugs as needed. This version was maintained by Mike
# Harsch and was hosted in a public open source repository:
# This version aims to improve upon the original by adding features
# and fixing bugs as needed. This version is maintained by
# Mike Harsch and is hosted in a public open source repository:
# http://github.com/mharsch/arcstat
#
# but has since moved to the illumos-gate repository.
#
# This Python port was written by John Hixson for FreeNAS, introduced
# in commit e2c29f:
# https://github.com/freenas/freenas
#
# and has been improved by many people since.
# Comments, Questions, or Suggestions are always welcome.
# Contact the maintainer at ( mike at harschsystems dot com )
#
# CDDL HEADER START
#
@@ -47,8 +42,7 @@
# @hdr is the array of fields that needs to be printed, so we
# just iterate over this array and print the values using our pretty printer.
#
# This script must remain compatible with Python 2.6+ and Python 3.4+.
#
import sys
import time
@@ -56,16 +50,16 @@ import getopt
import re
import copy
from decimal import Decimal
from signal import signal, SIGINT, SIGWINCH, SIG_DFL
cols = {
# HDR: [Size, Scale, Description]
"time": [8, -1, "Time"],
"hits": [4, 1000, "ARC reads per second"],
"miss": [4, 1000, "ARC misses per second"],
"read": [4, 1000, "Total ARC accesses per second"],
"hit%": [4, 100, "ARC hit percentage"],
"hit%": [4, 100, "ARC Hit percentage"],
"miss%": [5, 100, "ARC miss percentage"],
"dhit": [4, 1000, "Demand hits per second"],
"dmis": [4, 1000, "Demand misses per second"],
@@ -77,16 +71,15 @@ cols = {
"pm%": [3, 100, "Prefetch miss percentage"],
"mhit": [4, 1000, "Metadata hits per second"],
"mmis": [4, 1000, "Metadata misses per second"],
"mread": [5, 1000, "Metadata accesses per second"],
"mread": [4, 1000, "Metadata accesses per second"],
"mh%": [3, 100, "Metadata hit percentage"],
"mm%": [3, 100, "Metadata miss percentage"],
"arcsz": [5, 1024, "ARC size"],
"size": [4, 1024, "ARC size"],
"c": [4, 1024, "ARC target size"],
"mfu": [4, 1000, "MFU list hits per second"],
"mru": [4, 1000, "MRU list hits per second"],
"mfug": [4, 1000, "MFU ghost list hits per second"],
"mrug": [4, 1000, "MRU ghost list hits per second"],
"arcsz": [5, 1024, "ARC Size"],
"c": [4, 1024, "ARC Target Size"],
"mfu": [4, 1000, "MFU List hits per second"],
"mru": [4, 1000, "MRU List hits per second"],
"mfug": [4, 1000, "MFU Ghost List hits per second"],
"mrug": [4, 1000, "MRU Ghost List hits per second"],
"eskip": [5, 1000, "evict_skip per second"],
"mtxmis": [6, 1000, "mutex_miss per second"],
"dread": [5, 1000, "Demand accesses per second"],
@@ -98,17 +91,12 @@ cols = {
"l2miss%": [7, 100, "L2ARC access miss percentage"],
"l2asize": [7, 1024, "Actual (compressed) size of the L2ARC"],
"l2size": [6, 1024, "Size of the L2ARC"],
"l2bytes": [7, 1024, "Bytes read per second from the L2ARC"],
"grow": [4, 1000, "ARC grow disabled"],
"need": [4, 1024, "ARC reclaim need"],
"free": [4, 1024, "ARC free memory"],
"avail": [5, 1024, "ARC available memory"],
"waste": [5, 1024, "Wasted memory due to round up to pagesize"],
"l2bytes": [7, 1024, "bytes read per second from the L2ARC"],
}
v = {}
hdr = ["time", "read", "miss", "miss%", "dmis", "dm%", "pmis", "pm%", "mmis",
"mm%", "size", "c", "avail"]
"mm%", "arcsz", "c"]
xhdr = ["time", "mfu", "mru", "mfug", "mrug", "eskip", "mtxmis", "dread",
"pread", "read"]
sint = 1 # Default interval is 1 second
@@ -118,55 +106,12 @@ opfile = None
sep = "  "  # Default separator is 2 spaces
version = "0.4"
l2exist = False
cmd = ("Usage: arcstat [-havxp] [-f fields] [-o file] [-s string] [interval "
cmd = ("Usage: arcstat.py [-hvx] [-f fields] [-o file] [-s string] [interval "
"[count]]\n")
cur = {}
d = {}
out = None
kstat = None
pretty_print = True
if sys.platform.startswith('freebsd'):
# Requires py27-sysctl on FreeBSD
import sysctl
def kstat_update():
global kstat
k = sysctl.filter('kstat.zfs.misc.arcstats')
if not k:
sys.exit(1)
kstat = {}
for s in k:
if not s:
continue
name, value = s.name, s.value
# Trims 'kstat.zfs.misc.arcstats' from the name
kstat[name[24:]] = int(value)
elif sys.platform.startswith('linux'):
def kstat_update():
global kstat
k = [line.strip() for line in open('/proc/spl/kstat/zfs/arcstats')]
if not k:
sys.exit(1)
del k[0:2]
kstat = {}
for s in k:
if not s:
continue
name, unused, value = s.split()
kstat[name] = int(value)
def detailed_usage():
@@ -182,7 +127,6 @@ def detailed_usage():
def usage():
sys.stderr.write("%s\n" % cmd)
sys.stderr.write("\t -h : Print this help message\n")
sys.stderr.write("\t -a : Print all possible stats\n")
sys.stderr.write("\t -v : List all possible field headers and definitions"
"\n")
sys.stderr.write("\t -x : Print extended stats\n")
@@ -190,17 +134,35 @@ def usage():
sys.stderr.write("\t -o : Redirect output to the specified file\n")
sys.stderr.write("\t -s : Override default field separator with custom "
"character or string\n")
sys.stderr.write("\t -p : Disable auto-scaling of numerical fields\n")
sys.stderr.write("\nExamples:\n")
sys.stderr.write("\tarcstat -o /tmp/a.log 2 10\n")
sys.stderr.write("\tarcstat -s \",\" -o /tmp/a.log 2 10\n")
sys.stderr.write("\tarcstat -v\n")
sys.stderr.write("\tarcstat -f time,hit%,dh%,ph%,mh% 1\n")
sys.stderr.write("\tarcstat.py -o /tmp/a.log 2 10\n")
sys.stderr.write("\tarcstat.py -s \",\" -o /tmp/a.log 2 10\n")
sys.stderr.write("\tarcstat.py -v\n")
sys.stderr.write("\tarcstat.py -f time,hit%,dh%,ph%,mh% 1\n")
sys.stderr.write("\n")
sys.exit(1)
def kstat_update():
global kstat
k = [line.strip() for line in open('/proc/spl/kstat/zfs/arcstats')]
if not k:
sys.exit(1)
del k[0:2]
kstat = {}
for s in k:
if not s:
continue
name, unused, value = s.split()
kstat[name] = Decimal(value)
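The parsing steps in kstat_update() can be tested without a ZFS system by feeding them text in the same layout. The sketch below does that with a hypothetical helper, `parse_arcstats`, that takes any iterable of lines. The sample text is hand-written to mimic the two header lines followed by "name type data" rows and is not real output.

```python
from decimal import Decimal


def parse_arcstats(lines):
    """Parse kstat text: skip the two header lines, keep name -> value."""
    k = [line.strip() for line in lines]
    del k[0:2]
    kstat = {}
    for s in k:
        if not s:
            continue
        name, unused, value = s.split()
        kstat[name] = Decimal(value)
    return kstat


# Hand-written sample in the assumed layout of the arcstats file
sample = """13 1 0x01 96 26112 1000 2000
name                            type data
hits                            4    1200
misses                          4    300
size                            4    1073741824
"""
stats = parse_arcstats(sample.splitlines())
hit_pct = 100 * stats["hits"] / (stats["hits"] + stats["misses"])
print(hit_pct)
```

Taking the lines as a parameter, rather than opening the /proc file inside the function, is a choice made for this sketch so that it runs anywhere.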
def snap_stats():
global cur
global kstat
@@ -231,7 +193,7 @@ def prettynum(sz, scale, num=0):
elif 0 < num < 1:
num = 0
while abs(num) > scale and index < 5:
while num > scale and index < 5:
save = num
num = num / scale
index += 1
@@ -239,7 +201,7 @@ def prettynum(sz, scale, num=0):
if index == 0:
return "%*d" % (sz, num)
if abs(save / scale) < 10:
if (save / scale) < 10:
return "%*.1f%s" % (sz - 1, num, suffix[index])
else:
return "%*d%s" % (sz - 1, num, suffix[index])
@@ -249,14 +211,12 @@ def print_values():
global hdr
global sep
global v
global pretty_print
if pretty_print:
fmt = lambda col: prettynum(cols[col][0], cols[col][1], v[col])
else:
fmt = lambda col: v[col]
sys.stdout.write(sep.join(fmt(col) for col in hdr))
for col in hdr:
sys.stdout.write("%s%s" % (
prettynum(cols[col][0], cols[col][1], v[col]),
sep
))
sys.stdout.write("\n")
sys.stdout.flush()
@@ -264,14 +224,9 @@ def print_values():
def print_header():
global hdr
global sep
global pretty_print
if pretty_print:
fmt = lambda col: "%*s" % (cols[col][0], col)
else:
fmt = lambda col: col
sys.stdout.write(sep.join(fmt(col) for col in hdr))
for col in hdr:
sys.stdout.write("%*s%s" % (cols[col][0], col, sep))
sys.stdout.write("\n")
@@ -308,10 +263,8 @@ def init():
global sep
global out
global l2exist
global pretty_print
desired_cols = None
aflag = False
xflag = False
hflag = False
vflag = False
@@ -320,16 +273,14 @@ def init():
try:
opts, args = getopt.getopt(
sys.argv[1:],
"axo:hvs:f:p",
"xo:hvs:f:",
[
"all",
"extended",
"outfile",
"help",
"verbose",
"separator",
"columns",
"parsable"
"columns"
]
)
except getopt.error as msg:
@@ -338,8 +289,6 @@ def init():
opts = None
for opt, arg in opts:
if opt in ('-a', '--all'):
aflag = True
if opt in ('-x', '--extended'):
xflag = True
if opt in ('-o', '--outfile'):
@@ -355,13 +304,19 @@ def init():
if opt in ('-f', '--columns'):
desired_cols = arg
i += 1
if opt in ('-p', '--parsable'):
pretty_print = False
i += 1
argv = sys.argv[i:]
sint = int(argv[0]) if argv else sint
count = int(argv[1]) if len(argv) > 1 else (0 if len(argv) > 0 else 1)
sint = Decimal(argv[0]) if argv else sint
count = int(argv[1]) if len(argv) > 1 else count
if len(argv) > 1:
sint = Decimal(argv[0])
count = int(argv[1])
elif len(argv) > 0:
sint = Decimal(argv[0])
count = 0
if hflag or (xflag and desired_cols):
usage()
@@ -401,12 +356,6 @@ def init():
incompat)
usage()
if aflag:
if l2exist:
hdr = cols.keys()
else:
hdr = [col for col in cols.keys() if not col.startswith("l2")]
if opfile:
try:
out = open(opfile, "w")
@@ -455,7 +404,6 @@ def calculate():
v["mm%"] = 100 - v["mh%"] if v["mread"] > 0 else 0
v["arcsz"] = cur["size"]
v["size"] = cur["size"]
v["c"] = cur["c"]
v["mfu"] = d["mfu_hits"] / sint
v["mru"] = d["mru_hits"] / sint
@@ -475,12 +423,6 @@ def calculate():
v["l2size"] = cur["l2_size"]
v["l2bytes"] = d["l2_read_bytes"] / sint
v["grow"] = 0 if cur["arc_no_grow"] else 1
v["need"] = cur["arc_need_free"]
v["free"] = cur["memory_free_bytes"]
v["avail"] = cur["memory_available_bytes"]
v["waste"] = cur["abd_chunk_waste_size"]
def main():
global sint
-1
@@ -1 +0,0 @@
dbufstat
+1 -5
@@ -1,5 +1 @@
include $(top_srcdir)/config/Substfiles.am
bin_SCRIPTS = dbufstat
SUBSTFILES += $(bin_SCRIPTS)
dist_bin_SCRIPTS = dbufstat.py
@@ -1,4 +1,4 @@
#!/usr/bin/env @PYTHON_SHEBANG@
#!/usr/bin/python
#
# Print out statistics for all cached dmu buffers. This information
# is available through the dbufs kstat and may be post-processed as
@@ -27,17 +27,14 @@
# Copyright (C) 2013 Lawrence Livermore National Security, LLC.
# Produced at Lawrence Livermore National Laboratory (cf, DISCLAIMER).
#
# This script must remain compatible with Python 2.6+ and Python 3.4+.
#
import sys
import getopt
import errno
import re
bhdr = ["pool", "objset", "object", "level", "blkid", "offset", "dbsize"]
bxhdr = ["pool", "objset", "object", "level", "blkid", "offset", "dbsize",
"meta", "state", "dbholds", "dbc", "list", "atype", "flags",
"meta", "state", "dbholds", "list", "atype", "flags",
"count", "asize", "access", "mru", "gmru", "mfu", "gmfu", "l2",
"l2_dattr", "l2_asize", "l2_comp", "aholds", "dtype", "btype",
"data_bs", "meta_bs", "bsize", "lvls", "dholds", "blocks", "dsize"]
@@ -48,7 +45,7 @@ dxhdr = ["pool", "objset", "object", "dtype", "btype", "data_bs", "meta_bs",
"bsize", "lvls", "dholds", "blocks", "dsize", "cached", "direct",
"indirect", "bonus", "spill"]
dincompat = ["level", "blkid", "offset", "dbsize", "meta", "state", "dbholds",
"dbc", "list", "atype", "flags", "count", "asize", "access",
"list", "atype", "flags", "count", "asize", "access",
"mru", "gmru", "mfu", "gmfu", "l2", "l2_dattr", "l2_asize",
"l2_comp", "aholds"]
@@ -56,7 +53,7 @@ thdr = ["pool", "objset", "dtype", "cached"]
txhdr = ["pool", "objset", "dtype", "cached", "direct", "indirect",
"bonus", "spill"]
tincompat = ["object", "level", "blkid", "offset", "dbsize", "meta", "state",
"dbc", "dbholds", "list", "atype", "flags", "count", "asize",
"dbholds", "list", "atype", "flags", "count", "asize",
"access", "mru", "gmru", "mfu", "gmfu", "l2", "l2_dattr",
"l2_asize", "l2_comp", "aholds", "btype", "data_bs", "meta_bs",
"bsize", "lvls", "dholds", "blocks", "dsize"]
@@ -73,10 +70,9 @@ cols = {
"meta": [4, -1, "is this buffer metadata?"],
"state": [5, -1, "state of buffer (read, cached, etc)"],
"dbholds": [7, 1000, "number of holds on buffer"],
"dbc": [3, -1, "in dbuf cache"],
"list": [4, -1, "which ARC list contains this buffer"],
"atype": [7, -1, "ARC header type (data or metadata)"],
"flags": [9, -1, "ARC read flags"],
"flags": [8, -1, "ARC read flags"],
"count": [5, -1, "ARC data count"],
"asize": [7, 1024, "size of this ARC buffer"],
"access": [10, -1, "time this ARC buffer was last accessed"],
@@ -108,26 +104,11 @@ cols = {
hdr = None
xhdr = None
sep = "  "  # Default separator is 2 spaces
cmd = ("Usage: dbufstat [-bdhnrtvx] [-i file] [-f fields] [-o file] "
"[-s string] [-F filter]\n")
cmd = ("Usage: dbufstat.py [-bdhrtvx] [-i file] [-f fields] [-o file] "
"[-s string]\n")
raw = 0
if sys.platform.startswith("freebsd"):
import io
# Requires py-sysctl on FreeBSD
import sysctl
def default_ifile():
dbufs = sysctl.filter("kstat.zfs.misc.dbufs")[0].value
sys.stdin = io.StringIO(dbufs)
return "-"
elif sys.platform.startswith("linux"):
def default_ifile():
return "/proc/spl/kstat/zfs/dbufs"
def print_incompat_helper(incompat):
cnt = 0
for key in sorted(incompat):
@@ -170,7 +151,6 @@ def usage():
sys.stderr.write("\t -b : Print table of information for each dbuf\n")
sys.stderr.write("\t -d : Print table of information for each dnode\n")
sys.stderr.write("\t -h : Print this help message\n")
sys.stderr.write("\t -n : Exclude header from output\n")
sys.stderr.write("\t -r : Print raw values\n")
sys.stderr.write("\t -t : Print table of information for each dnode type"
"\n")
@@ -182,13 +162,11 @@ def usage():
sys.stderr.write("\t -o : Redirect output to the specified file\n")
sys.stderr.write("\t -s : Override default field separator with custom "
"character or string\n")
sys.stderr.write("\t -F : Filter output by value or regex\n")
sys.stderr.write("\nExamples:\n")
sys.stderr.write("\tdbufstat -d -o /tmp/d.log\n")
sys.stderr.write("\tdbufstat -t -s \",\" -o /tmp/t.log\n")
sys.stderr.write("\tdbufstat -v\n")
sys.stderr.write("\tdbufstat -d -f pool,object,objset,dsize,cached\n")
sys.stderr.write("\tdbufstat -bx -F dbc=1,objset=54,pool=testpool\n")
sys.stderr.write("\tdbufstat.py -d -o /tmp/d.log\n")
sys.stderr.write("\tdbufstat.py -t -s \",\" -o /tmp/t.log\n")
sys.stderr.write("\tdbufstat.py -v\n")
sys.stderr.write("\tdbufstat.py -d -f pool,object,objset,dsize,cached\n")
sys.stderr.write("\n")
sys.exit(1)
@@ -250,8 +228,7 @@ def print_header():
def get_typestring(t):
ot_strings = [
"DMU_OT_NONE",
type_strings = ["DMU_OT_NONE",
# general:
"DMU_OT_OBJECT_DIRECTORY",
"DMU_OT_OBJECT_ARRAY",
@@ -314,39 +291,15 @@ def get_typestring(t):
"DMU_OT_DEADLIST_HDR",
"DMU_OT_DSL_CLONES",
"DMU_OT_BPOBJ_SUBOBJ"]
otn_strings = {
0x80: "DMU_OTN_UINT8_DATA",
0xc0: "DMU_OTN_UINT8_METADATA",
0x81: "DMU_OTN_UINT16_DATA",
0xc1: "DMU_OTN_UINT16_METADATA",
0x82: "DMU_OTN_UINT32_DATA",
0xc2: "DMU_OTN_UINT32_METADATA",
0x83: "DMU_OTN_UINT64_DATA",
0xc3: "DMU_OTN_UINT64_METADATA",
0x84: "DMU_OTN_ZAP_DATA",
0xc4: "DMU_OTN_ZAP_METADATA",
0xa0: "DMU_OTN_UINT8_ENC_DATA",
0xe0: "DMU_OTN_UINT8_ENC_METADATA",
0xa1: "DMU_OTN_UINT16_ENC_DATA",
0xe1: "DMU_OTN_UINT16_ENC_METADATA",
0xa2: "DMU_OTN_UINT32_ENC_DATA",
0xe2: "DMU_OTN_UINT32_ENC_METADATA",
0xa3: "DMU_OTN_UINT64_ENC_DATA",
0xe3: "DMU_OTN_UINT64_ENC_METADATA",
0xa4: "DMU_OTN_ZAP_ENC_DATA",
0xe4: "DMU_OTN_ZAP_ENC_METADATA"}
# If "-rr" option is used, don't convert to string representation
if raw > 1:
return "%i" % t
try:
if t < len(ot_strings):
return ot_strings[t]
else:
return otn_strings[t]
except (IndexError, KeyError):
return "(UNKNOWN)"
return type_strings[t]
except IndexError:
return "%i" % t
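The hunk above shows two shapes of `get_typestring`: one indexes a single list and falls back to printing the number, the other consults a list for the `DMU_OT_*` values and a dictionary for the `DMU_OTN_*` values, returning "(UNKNOWN)" on a miss. A minimal sketch of the list-plus-dictionary variant follows. The tables are shortened for illustration, `lookup_type` is a hypothetical name, and the lower-bound check on `t` is an addition that is not in the diff.

```python
# Shortened tables for illustration; the real script lists every type.
OT_STRINGS = ["DMU_OT_NONE", "DMU_OT_OBJECT_DIRECTORY", "DMU_OT_OBJECT_ARRAY"]
OTN_STRINGS = {0x80: "DMU_OTN_UINT8_DATA", 0xc4: "DMU_OTN_ZAP_METADATA"}


def lookup_type(t, raw=0):
    """Map a numeric object type to its name (hypothetical helper)."""
    # With "-rr" (raw > 1) the number is printed without conversion.
    if raw > 1:
        return "%i" % t
    # Small values index the list; the 0 <= t guard is an assumption
    # added here so that negative numbers do not wrap around.
    if 0 <= t < len(OT_STRINGS):
        return OT_STRINGS[t]
    # Everything else is looked up in the sparse dictionary.
    return OTN_STRINGS.get(t, "(UNKNOWN)")
```

Using `dict.get` with a default avoids the `try`/`except` pair while keeping the same fallback string.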
def get_compstring(c):
@@ -358,7 +311,7 @@ def get_compstring(c):
"ZIO_COMPRESS_GZIP_6", "ZIO_COMPRESS_GZIP_7",
"ZIO_COMPRESS_GZIP_8", "ZIO_COMPRESS_GZIP_9",
"ZIO_COMPRESS_ZLE", "ZIO_COMPRESS_LZ4",
"ZIO_COMPRESS_ZSTD", "ZIO_COMPRESS_FUNCTION"]
"ZIO_COMPRESS_FUNCTION"]
# If "-rr" option is used, don't convert to string representation
if raw > 1:
@@ -431,32 +384,12 @@ def update_dict(d, k, line, labels):
return d
def skip_line(vals, filters):
'''
Determines if a line should be skipped during printing
based on a set of filters
'''
if len(filters) == 0:
return False
for key in vals:
if key in filters:
val = prettynum(cols[key][0], cols[key][1], vals[key]).strip()
# we want a full match here
if re.match("(?:" + filters[key] + r")\Z", val) is None:
return True
return False
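The `-F` handling in this diff has two halves: `main()` splits `field=regex` pairs and validates them, and `skip_line()` drops any row whose field value is not a full match. The sketch below combines both under simplifying assumptions: `parse_filters` is a hypothetical helper that raises `ValueError` instead of writing to stderr and exiting, and values are compared via `str()` rather than the script's `prettynum()` formatting.

```python
import re


def parse_filters(arg, known_fields):
    """Parse 'field=regex,field=regex' into a dict (hypothetical helper)."""
    filters = {}
    for fil in (x.strip() for x in arg.split(",")):
        parts = [x.strip() for x in fil.split("=")]
        if len(parts) != 2:
            raise ValueError("Invalid filter '%s'" % fil)
        field, pattern = parts
        if field not in known_fields:
            raise ValueError("Invalid field '%s' in filter" % field)
        if field in filters:
            raise ValueError("Field '%s' specified multiple times" % field)
        # Compile once up front so a bad regex surfaces early (re.error).
        re.compile("(?:" + pattern + r")\Z")
        filters[field] = pattern
    return filters


def skip_line(vals, filters):
    """Return True when any filtered field fails a full match."""
    for key, pattern in filters.items():
        if key in vals:
            # Wrapping in (?:...)\Z forces the whole value to match.
            if re.match("(?:" + pattern + r")\Z", str(vals[key])) is None:
                return True
    return False
```

With the filter `dbc=1`, a row whose `dbc` value is `10` is skipped, because the `\Z` anchor rejects partial matches.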
def print_dict(d, filters, noheader):
if not noheader:
print_header()
def print_dict(d):
print_header()
for pool in list(d.keys()):
for objset in list(d[pool].keys()):
for v in list(d[pool][objset].values()):
if not skip_line(v, filters):
print_values(v)
print_values(v)
def dnodes_build_dict(filehandle):
@@ -497,7 +430,7 @@ def types_build_dict(filehandle):
return types
def buffers_print_all(filehandle, filters, noheader):
def buffers_print_all(filehandle):
labels = dict()
# First 3 lines are header information, skip the first two
@@ -508,14 +441,11 @@ def buffers_print_all(filehandle, filters, noheader):
for i, v in enumerate(next(filehandle).split()):
labels[v] = i
if not noheader:
print_header()
print_header()
# The rest of the file is buffer information
for line in filehandle:
vals = parse_line(line.split(), labels)
if not skip_line(vals, filters):
print_values(vals)
print_values(parse_line(line.split(), labels))
def main():
@@ -532,13 +462,11 @@ def main():
tflag = False
vflag = False
xflag = False
nflag = False
filters = dict()
try:
opts, args = getopt.getopt(
sys.argv[1:],
"bdf:hi:o:rs:tvxF:n",
"bdf:hi:o:rs:tvx",
[
"buffers",
"dnodes",
@@ -549,8 +477,7 @@ def main():
"separator",
"types",
"verbose",
"extended",
"filter"
"extended"
]
)
except getopt.error:
@@ -580,35 +507,6 @@ def main():
vflag = True
if opt in ('-x', '--extended'):
xflag = True
if opt in ('-n', '--noheader'):
nflag = True
if opt in ('-F', '--filter'):
fils = [x.strip() for x in arg.split(",")]
for fil in fils:
f = [x.strip() for x in fil.split("=")]
if len(f) != 2:
sys.stderr.write("Invalid filter '%s'.\n" % fil)
sys.exit(1)
if f[0] not in cols:
sys.stderr.write("Invalid field '%s' in filter.\n" % f[0])
sys.exit(1)
if f[0] in filters:
sys.stderr.write("Field '%s' specified multiple times in "
"filter.\n" % f[0])
sys.exit(1)
try:
re.compile("(?:" + f[1] + r")\Z")
except re.error:
sys.stderr.write("Invalid regex for field '%s' in "
"filter.\n" % f[0])
sys.exit(1)
filters[f[0]] = f[1]
if hflag or (xflag and desired_cols):
usage()
@@ -660,7 +558,7 @@ def main():
sys.exit(1)
if not ifile:
ifile = default_ifile()
ifile = '/proc/spl/kstat/zfs/dbufs'
if ifile is not "-":
try:
@@ -671,13 +569,13 @@ def main():
sys.exit(1)
if bflag:
buffers_print_all(sys.stdin, filters, nflag)
buffers_print_all(sys.stdin)
if dflag:
print_dict(dnodes_build_dict(sys.stdin), filters, nflag)
print_dict(dnodes_build_dict(sys.stdin))
if tflag:
print_dict(types_build_dict(sys.stdin), filters, nflag)
print_dict(types_build_dict(sys.stdin))
if __name__ == '__main__':
+1 -1
@@ -1,6 +1,6 @@
#!/bin/sh
#
# fsck.zfs: A fsck helper to accommodate distributions that expect
# fsck.zfs: A fsck helper to accomidate distributions that expect
# to be able to execute a fsck on all filesystem types. Currently
# this script does nothing but it could be extended to act as a
# compatibility wrapper for 'zpool scrub'.
+9 -5
@@ -1,5 +1,9 @@
include $(top_srcdir)/config/Rules.am
DEFAULT_INCLUDES += \
-I$(top_srcdir)/include \
-I$(top_srcdir)/lib/libspl/include
#
# Ignore the prefix for the mount helper. It must be installed in /sbin/
# because this path is hardcoded in the mount(8) for security reasons.
@@ -13,8 +17,8 @@ mount_zfs_SOURCES = \
mount_zfs.c
mount_zfs_LDADD = \
$(abs_top_builddir)/lib/libzfs/libzfs.la \
$(abs_top_builddir)/lib/libzfs_core/libzfs_core.la \
$(abs_top_builddir)/lib/libnvpair/libnvpair.la
mount_zfs_LDADD += $(LTLIBINTL)
$(top_builddir)/lib/libnvpair/libnvpair.la \
$(top_builddir)/lib/libuutil/libuutil.la \
$(top_builddir)/lib/libzpool/libzpool.la \
$(top_builddir)/lib/libzfs/libzfs.la \
$(top_builddir)/lib/libzfs_core/libzfs_core.la
+276 -33
@@ -31,50 +31,239 @@
#include <sys/mntent.h>
#include <sys/stat.h>
#include <libzfs.h>
#include <libzutil.h>
#include <locale.h>
#include <getopt.h>
#include <fcntl.h>
#include <errno.h>
#define ZS_COMMENT 0x00000000 /* comment */
#define ZS_ZFSUTIL 0x00000001 /* caller is zfs(8) */
libzfs_handle_t *g_zfs;
typedef struct option_map {
const char *name;
unsigned long mntmask;
unsigned long zfsmask;
} option_map_t;
static const option_map_t option_map[] = {
/* Canonicalized filesystem independent options from mount(8) */
{ MNTOPT_NOAUTO, MS_COMMENT, ZS_COMMENT },
{ MNTOPT_DEFAULTS, MS_COMMENT, ZS_COMMENT },
{ MNTOPT_NODEVICES, MS_NODEV, ZS_COMMENT },
{ MNTOPT_DIRSYNC, MS_DIRSYNC, ZS_COMMENT },
{ MNTOPT_NOEXEC, MS_NOEXEC, ZS_COMMENT },
{ MNTOPT_GROUP, MS_GROUP, ZS_COMMENT },
{ MNTOPT_NETDEV, MS_COMMENT, ZS_COMMENT },
{ MNTOPT_NOFAIL, MS_COMMENT, ZS_COMMENT },
{ MNTOPT_NOSUID, MS_NOSUID, ZS_COMMENT },
{ MNTOPT_OWNER, MS_OWNER, ZS_COMMENT },
{ MNTOPT_REMOUNT, MS_REMOUNT, ZS_COMMENT },
{ MNTOPT_RO, MS_RDONLY, ZS_COMMENT },
{ MNTOPT_RW, MS_COMMENT, ZS_COMMENT },
{ MNTOPT_SYNC, MS_SYNCHRONOUS, ZS_COMMENT },
{ MNTOPT_USER, MS_USERS, ZS_COMMENT },
{ MNTOPT_USERS, MS_USERS, ZS_COMMENT },
/* acl flags passed with util-linux-2.24 mount command */
{ MNTOPT_ACL, MS_POSIXACL, ZS_COMMENT },
{ MNTOPT_NOACL, MS_COMMENT, ZS_COMMENT },
{ MNTOPT_POSIXACL, MS_POSIXACL, ZS_COMMENT },
#ifdef MS_NOATIME
{ MNTOPT_NOATIME, MS_NOATIME, ZS_COMMENT },
#endif
#ifdef MS_NODIRATIME
{ MNTOPT_NODIRATIME, MS_NODIRATIME, ZS_COMMENT },
#endif
#ifdef MS_RELATIME
{ MNTOPT_RELATIME, MS_RELATIME, ZS_COMMENT },
#endif
#ifdef MS_STRICTATIME
{ MNTOPT_STRICTATIME, MS_STRICTATIME, ZS_COMMENT },
#endif
#ifdef MS_LAZYTIME
{ MNTOPT_LAZYTIME, MS_LAZYTIME, ZS_COMMENT },
#endif
{ MNTOPT_CONTEXT, MS_COMMENT, ZS_COMMENT },
{ MNTOPT_FSCONTEXT, MS_COMMENT, ZS_COMMENT },
{ MNTOPT_DEFCONTEXT, MS_COMMENT, ZS_COMMENT },
{ MNTOPT_ROOTCONTEXT, MS_COMMENT, ZS_COMMENT },
#ifdef MS_I_VERSION
{ MNTOPT_IVERSION, MS_I_VERSION, ZS_COMMENT },
#endif
#ifdef MS_MANDLOCK
{ MNTOPT_NBMAND, MS_MANDLOCK, ZS_COMMENT },
#endif
/* Valid options not found in mount(8) */
{ MNTOPT_BIND, MS_BIND, ZS_COMMENT },
#ifdef MS_REC
{ MNTOPT_RBIND, MS_BIND|MS_REC, ZS_COMMENT },
#endif
{ MNTOPT_COMMENT, MS_COMMENT, ZS_COMMENT },
#ifdef MS_NOSUB
{ MNTOPT_NOSUB, MS_NOSUB, ZS_COMMENT },
#endif
#ifdef MS_SILENT
{ MNTOPT_QUIET, MS_SILENT, ZS_COMMENT },
#endif
/* Custom zfs options */
{ MNTOPT_XATTR, MS_COMMENT, ZS_COMMENT },
{ MNTOPT_NOXATTR, MS_COMMENT, ZS_COMMENT },
{ MNTOPT_ZFSUTIL, MS_COMMENT, ZS_ZFSUTIL },
{ NULL, 0, 0 } };
/*
* Break the mount option in to a name/value pair. The name is
* validated against the option map and mount flags set accordingly.
*/
static int
parse_option(char *mntopt, unsigned long *mntflags,
unsigned long *zfsflags, int sloppy)
{
const option_map_t *opt;
char *ptr, *name, *value = NULL;
int error = 0;
name = strdup(mntopt);
if (name == NULL)
return (ENOMEM);
for (ptr = name; ptr && *ptr; ptr++) {
if (*ptr == '=') {
*ptr = '\0';
value = ptr+1;
VERIFY3P(value, !=, NULL);
break;
}
}
for (opt = option_map; opt->name != NULL; opt++) {
if (strncmp(name, opt->name, strlen(name)) == 0) {
*mntflags |= opt->mntmask;
*zfsflags |= opt->zfsmask;
error = 0;
goto out;
}
}
if (!sloppy)
error = ENOENT;
out:
/* If required further process on the value may be done here */
free(name);
return (error);
}
/*
* Translate the mount option string in to MS_* mount flags for the
* kernel vfs. When sloppy is non-zero unknown options will be ignored
* otherwise they are considered fatal are copied in to badopt.
*/
static int
parse_options(char *mntopts, unsigned long *mntflags, unsigned long *zfsflags,
int sloppy, char *badopt, char *mtabopt)
{
int error = 0, quote = 0, flag = 0, count = 0;
char *ptr, *opt, *opts;
opts = strdup(mntopts);
if (opts == NULL)
return (ENOMEM);
*mntflags = 0;
opt = NULL;
/*
* Scan through all mount options which must be comma delimited.
* We must be careful to notice regions which are double quoted
* and skip commas in these regions. Each option is then checked
* to determine if it is a known option.
*/
for (ptr = opts; ptr && !flag; ptr++) {
if (opt == NULL)
opt = ptr;
if (*ptr == '"')
quote = !quote;
if (quote)
continue;
if (*ptr == '\0')
flag = 1;
if ((*ptr == ',') || (*ptr == '\0')) {
*ptr = '\0';
error = parse_option(opt, mntflags, zfsflags, sloppy);
if (error) {
strcpy(badopt, opt);
goto out;
}
if (!(*mntflags & MS_REMOUNT) &&
!(*zfsflags & ZS_ZFSUTIL)) {
if (count > 0)
strlcat(mtabopt, ",", MNT_LINE_MAX);
strlcat(mtabopt, opt, MNT_LINE_MAX);
count++;
}
opt = NULL;
}
}
out:
free(opts);
return (error);
}
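The scanning rule in `parse_options` (split on commas, but not on commas inside double quotes) and the name/value split in `parse_option` can be modelled in a few lines of Python. This is only a sketch of the splitting logic: `split_mount_options` and `split_name_value` are hypothetical names, flag lookup and the mtab string are omitted, and an unbalanced quote is not handled specially.

```python
def split_mount_options(mntopts):
    """Split on commas that are not inside double quotes."""
    opts, start, quoted = [], 0, False
    for i, ch in enumerate(mntopts):
        if ch == '"':
            # Toggle the quoted state on every double quote.
            quoted = not quoted
        elif ch == "," and not quoted:
            opts.append(mntopts[start:i])
            start = i + 1
    # The final option has no trailing comma.
    opts.append(mntopts[start:])
    return opts


def split_name_value(opt):
    """Break 'name=value' at the first '='; value is None if absent."""
    name, sep, value = opt.partition("=")
    return name, (value if sep else None)
```

For example, `ro,context="a,b",noexec` yields three options, and the quoted comma stays inside the `context` value.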
/*
* Return the pool/dataset to mount given the name passed to mount. This
* is expected to be of the form pool/dataset, however may also refer to
* a block device if that device contains a valid zfs label.
*/
static void
parse_dataset(const char *target, char **dataset)
static char *
parse_dataset(char *dataset)
{
char cwd[PATH_MAX];
struct stat64 statbuf;
int error;
int len;
/*
* We expect a pool/dataset to be provided, however if we're
* given a device which is a member of a zpool we attempt to
* extract the pool name stored in the label. Given the pool
* name we can mount the root dataset.
*/
int fd = open(target, O_RDONLY);
if (fd >= 0) {
nvlist_t *config = NULL;
if (zpool_read_label(fd, &config, NULL) != 0)
config = NULL;
if (close(fd))
perror("close");
error = stat64(dataset, &statbuf);
if (error == 0) {
nvlist_t *config;
char *name;
int fd;
if (config) {
char *name = NULL;
if (!nvlist_lookup_string(config,
ZPOOL_CONFIG_POOL_NAME, &name))
(void) strlcpy(*dataset, name, PATH_MAX);
fd = open(dataset, O_RDONLY);
if (fd < 0)
goto out;
error = zpool_read_label(fd, &config, NULL);
(void) close(fd);
if (error)
goto out;
error = nvlist_lookup_string(config,
ZPOOL_CONFIG_POOL_NAME, &name);
if (error) {
nvlist_free(config);
if (name)
return;
} else {
dataset = strdup(name);
nvlist_free(config);
return (dataset);
}
}
out:
/*
* If a file or directory in your current working directory is
* named 'dataset' then mount(8) will prepend your current working
@@ -82,14 +271,16 @@ parse_dataset(const char *target, char **dataset)
* behavior so we simply check for it and strip the prepended
* patch when it is added.
*/
char cwd[PATH_MAX];
if (getcwd(cwd, PATH_MAX) != NULL) {
int len = strlen(cwd);
/* Do not add one when cwd already ends in a trailing '/' */
if (strncmp(cwd, target, len) == 0)
target += len + (cwd[len-1] != '/');
}
strlcpy(*dataset, target, PATH_MAX);
if (getcwd(cwd, PATH_MAX) == NULL)
return (dataset);
len = strlen(cwd);
/* Do not add one when cwd already ends in a trailing '/' */
if (strncmp(cwd, dataset, len) == 0)
return (dataset + len + (cwd[len-1] != '/'));
return (dataset);
}
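Both variants of `parse_dataset` end with the same step: if mount(8) has prepended the current working directory to the dataset name, strip it again. A small Python model of that step is sketched here. `strip_cwd_prefix` is a hypothetical name, and the optional `cwd` parameter exists only so the behaviour can be shown without changing directory.

```python
import os


def strip_cwd_prefix(target, cwd=None):
    """Remove a leading working-directory prefix from a dataset name."""
    cwd = os.getcwd() if cwd is None else cwd
    if target.startswith(cwd):
        # Skip the separating '/' unless cwd already ends with one.
        extra = 0 if cwd.endswith("/") else 1
        return target[len(cwd) + extra:]
    return target
```

Like the `strncmp` call in the C code, this is a plain prefix comparison and does not check that the prefix ends on a path-component boundary.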
/*
@@ -152,6 +343,34 @@ mtab_update(char *dataset, char *mntpoint, char *type, char *mntopts)
return (MOUNT_SUCCESS);
}
static void
append_mntopt(const char *name, const char *val, char *mntopts,
char *mtabopt, boolean_t quote)
{
char tmp[MNT_LINE_MAX];
snprintf(tmp, MNT_LINE_MAX, quote ? ",%s=\"%s\"" : ",%s=%s", name, val);
if (mntopts)
strlcat(mntopts, tmp, MNT_LINE_MAX);
if (mtabopt)
strlcat(mtabopt, tmp, MNT_LINE_MAX);
}
static void
zfs_selinux_setcontext(zfs_handle_t *zhp, zfs_prop_t zpt, const char *name,
char *mntopts, char *mtabopt)
{
char context[ZFS_MAXPROPLEN];
if (zfs_prop_get(zhp, zpt, context, sizeof (context),
NULL, NULL, 0, B_FALSE) == 0) {
if (strcmp(context, "none") != 0)
append_mntopt(name, context, mntopts, mtabopt, B_TRUE);
}
}
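Together with the call site in `main()` later in this diff, the helper above implements a simple precedence rule: a `context` value other than "none" is passed alone, otherwise each of the other three context properties is passed when it is set. A hedged Python model of that decision follows. The dictionary stands in for `zfs_prop_get`, the option names are assumed to match what the `MNTOPT_*CONTEXT` macros expand to, and `selinux_mount_options` is a hypothetical name.

```python
def selinux_mount_options(props):
    """Return the SELinux mount options implied by dataset properties.

    props maps an option name to its value; "none" (or a missing key)
    means the property is unset.
    """
    def is_set(name):
        return props.get(name, "none") != "none"

    def quoted(name):
        # Values are wrapped in double quotes, as append_mntopt does
        # when its quote argument is true.
        return '%s="%s"' % (name, props[name])

    # 'context' overrides the other three, so it is emitted alone.
    if is_set("context"):
        return [quoted("context")]
    return [quoted(n) for n in ("fscontext", "defcontext", "rootcontext")
            if is_set(n)]
```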
int
main(int argc, char **argv)
{
@@ -162,13 +381,12 @@ main(int argc, char **argv)
char badopt[MNT_LINE_MAX] = { '\0' };
char mtabopt[MNT_LINE_MAX] = { '\0' };
char mntpoint[PATH_MAX];
char dataset[PATH_MAX], *pdataset = dataset;
char *dataset;
unsigned long mntflags = 0, zfsflags = 0, remount = 0;
int sloppy = 0, fake = 0, verbose = 0, nomtab = 0, zfsutil = 0;
int error, c;
(void) setlocale(LC_ALL, "");
(void) setlocale(LC_NUMERIC, "C");
(void) textdomain(TEXT_DOMAIN);
opterr = 0;
@@ -218,7 +436,7 @@ main(int argc, char **argv)
return (MOUNT_USAGE);
}
parse_dataset(argv[0], &pdataset);
dataset = parse_dataset(argv[0]);
/* canonicalize the mount point */
if (realpath(argv[1], mntpoint) == NULL) {
@@ -229,7 +447,7 @@ main(int argc, char **argv)
}
/* validate mount options and set mntflags */
error = zfs_parse_mount_options(mntopts, &mntflags, &zfsflags, sloppy,
error = parse_options(mntopts, &mntflags, &zfsflags, sloppy,
badopt, mtabopt);
if (error) {
switch (error) {
@@ -269,7 +487,7 @@ main(int argc, char **argv)
zfsutil = 1;
if ((g_zfs = libzfs_init()) == NULL) {
(void) fprintf(stderr, "%s\n", libzfs_error_init(errno));
(void) fprintf(stderr, "%s", libzfs_error_init(errno));
return (MOUNT_SYSERR);
}
@@ -282,7 +500,32 @@ main(int argc, char **argv)
return (MOUNT_USAGE);
}
zfs_adjust_mount_options(zhp, mntpoint, mntopts, mtabopt);
/*
* Checks to see if the ZFS_PROP_SELINUX_CONTEXT exists
* if it does, create a tmp variable in case it's needed
* checks to see if the selinux context is set to the default
* if it is, allow the setting of the other context properties
* this is needed because the 'context' property overrides others
* if it is not the default, set the 'context' property
*/
if (zfs_prop_get(zhp, ZFS_PROP_SELINUX_CONTEXT, prop, sizeof (prop),
NULL, NULL, 0, B_FALSE) == 0) {
if (strcmp(prop, "none") == 0) {
zfs_selinux_setcontext(zhp, ZFS_PROP_SELINUX_FSCONTEXT,
MNTOPT_FSCONTEXT, mntopts, mtabopt);
zfs_selinux_setcontext(zhp, ZFS_PROP_SELINUX_DEFCONTEXT,
MNTOPT_DEFCONTEXT, mntopts, mtabopt);
zfs_selinux_setcontext(zhp,
ZFS_PROP_SELINUX_ROOTCONTEXT, MNTOPT_ROOTCONTEXT,
mntopts, mtabopt);
} else {
append_mntopt(MNTOPT_CONTEXT, prop,
mntopts, mtabopt, B_TRUE);
}
}
/* A hint used to determine an auto-mounted snapshot mount point */
append_mntopt(MNTOPT_MNTPOINT, mntpoint, mntopts, NULL, B_FALSE);
/* treat all snapshots as legacy mount points */
if (zfs_get_type(zhp) == ZFS_TYPE_SNAPSHOT)
+9 -7
@@ -1,10 +1,11 @@
include $(top_srcdir)/config/Rules.am
# Includes kernel code, generate warnings for large stack frames
AM_CFLAGS += $(FRAME_LARGER_THAN)
AM_CFLAGS += $(DEBUG_STACKFLAGS) $(FRAME_LARGER_THAN)
AM_CPPFLAGS += -DDEBUG
# Unconditionally enable ASSERTs
AM_CPPFLAGS += -DDEBUG -UNDEBUG -DZFS_DEBUG
DEFAULT_INCLUDES += \
-I$(top_srcdir)/include \
-I$(top_srcdir)/lib/libspl/include
bin_PROGRAMS = raidz_test
@@ -14,7 +15,8 @@ raidz_test_SOURCES = \
raidz_bench.c
raidz_test_LDADD = \
$(abs_top_builddir)/lib/libzpool/libzpool.la \
$(abs_top_builddir)/lib/libzfs_core/libzfs_core.la
$(top_builddir)/lib/libnvpair/libnvpair.la \
$(top_builddir)/lib/libuutil/libuutil.la \
$(top_builddir)/lib/libzpool/libzpool.la
raidz_test_LDADD += -lm
raidz_test_LDADD += -lm -ldl
+2 -2
@@ -113,7 +113,7 @@ run_gen_bench_impl(const char *impl)
}
}
static void
void
run_gen_bench(void)
{
char **impl_name;
@@ -197,7 +197,7 @@ run_rec_bench_impl(const char *impl)
}
}
static void
void
run_rec_bench(void)
{
char **impl_name;
+5 -3
@@ -702,8 +702,10 @@ run_sweep(void)
opts->rto_dsize = size_v[s];
opts->rto_v = 0; /* be quiet */
VERIFY3P(thread_create(NULL, 0, sweep_thread, (void *) opts,
0, NULL, TS_RUN, defclsyspri), !=, NULL);
VERIFY3P(zk_thread_create(NULL, 0,
(thread_func_t)sweep_thread,
(void *) opts, 0, NULL, TS_RUN, 0,
PTHREAD_CREATE_JOINABLE), !=, NULL);
}
exit:
@@ -757,7 +759,7 @@ main(int argc, char **argv)
process_options(argc, argv);
kernel_init(SPA_MODE_READ);
kernel_init(FREAD);
/* setup random data because rand() is not reentrant */
rand_data = (int *)umem_alloc(SPA_MAXBLOCKSIZE, UMEM_NOFAIL);
-1
@@ -38,7 +38,6 @@ static const char *raidz_impl_names[] = {
"avx512bw",
"aarch64_neon",
"aarch64_neonx2",
"powerpc_altivec",
NULL
};
+10 -9
@@ -102,7 +102,7 @@ Usage: vdev_id [-h]
vdev_id <-d device> [-c config_file] [-p phys_per_port]
[-g sas_direct|sas_switch|scsi] [-m]
-c specify name of an alternative config file [default=$CONFIG]
-c specify name of alernate config file [default=$CONFIG]
-d specify basename of device (i.e. sda)
-e Create enclose device symlinks only (/dev/by-enclosure)
-g Storage network topology [default="$TOPOLOGY"]
@@ -114,8 +114,9 @@ EOF
}
map_slot() {
LINUX_SLOT=$1
CHANNEL=$2
local LINUX_SLOT=$1
local CHANNEL=$2
local MAPPED_SLOT=
MAPPED_SLOT=`awk "\\$1 == \"slot\" && \\$2 == ${LINUX_SLOT} && \
\\$4 ~ /^${CHANNEL}$|^$/ { print \\$3; exit }" $CONFIG`
@@ -126,9 +127,9 @@ map_slot() {
}
map_channel() {
MAPPED_CHAN=
PCI_ID=$1
PORT=$2
local MAPPED_CHAN=
local PCI_ID=$1
local PORT=$2
case $TOPOLOGY in
"sas_switch")
@@ -486,7 +487,7 @@ alias_handler () {
# digits as partitions, causing alias creation to fail. This
# ambiguity seems unavoidable, so devices using this facility
# must not use such names.
DM_PART=
local DM_PART=
if echo $DM_NAME | grep -q -E 'p[0-9][0-9]*$' ; then
if [ "$DEVTYPE" != "partition" ] ; then
DM_PART=`echo $DM_NAME | awk -Fp '/p/{print "-part"$2}'`
@@ -548,7 +549,7 @@ if [ ! -r $CONFIG ] ; then
exit 0
fi
if [ -z "$DEV" ] && [ -z "$ENCLOSURE_MODE" ] ; then
if [ -z "$DEV" -a -z "$ENCLOSURE_MODE" ] ; then
echo "Error: missing required option -d"
exit 1
fi
@@ -564,7 +565,7 @@ fi
TOPOLOGY=${TOPOLOGY:-sas_direct}
# Should we create /dev/by-enclosure symlinks?
if [ "$ENCLOSURE_MODE" = "yes" ] && [ "$TOPOLOGY" = "sas_direct" ] ; then
if [ "$ENCLOSURE_MODE" = "yes" -a "$TOPOLOGY" = "sas_direct" ] ; then
ID_ENCLOSURE=$(enclosure_handler)
if [ -z "$ID_ENCLOSURE" ] ; then
exit 0
+11 -7
@@ -1,16 +1,20 @@
include $(top_srcdir)/config/Rules.am
# Unconditionally enable debugging for zdb
AM_CPPFLAGS += -DDEBUG -UNDEBUG -DZFS_DEBUG
AM_CPPFLAGS += -DDEBUG
DEFAULT_INCLUDES += \
-I$(top_srcdir)/include \
-I$(top_srcdir)/lib/libspl/include
sbin_PROGRAMS = zdb
zdb_SOURCES = \
zdb.c \
zdb_il.c \
zdb.h
zdb_il.c
zdb_LDADD = \
$(abs_top_builddir)/lib/libzpool/libzpool.la \
$(abs_top_builddir)/lib/libzfs_core/libzfs_core.la \
$(abs_top_builddir)/lib/libnvpair/libnvpair.la
$(top_builddir)/lib/libnvpair/libnvpair.la \
$(top_builddir)/lib/libuutil/libuutil.la \
$(top_builddir)/lib/libzpool/libzpool.la \
$(top_builddir)/lib/libzfs/libzfs.la \
$(top_builddir)/lib/libzfs_core/libzfs_core.la
+560 -4642
File diff suppressed because it is too large
+70 -97
@@ -25,7 +25,7 @@
*/
/*
* Copyright (c) 2013, 2017 by Delphix. All rights reserved.
* Copyright (c) 2013, 2016 by Delphix. All rights reserved.
*/
/*
@@ -42,14 +42,11 @@
#include <sys/resource.h>
#include <sys/zil.h>
#include <sys/zil_impl.h>
#include <sys/spa_impl.h>
#include <sys/abd.h>
#include "zdb.h"
extern uint8_t dump_opt[256];
static char tab_prefix[4] = "\t\t\t";
static char prefix[4] = "\t\t\t";
static void
print_log_bp(const blkptr_t *bp, const char *prefix)
@@ -62,9 +59,8 @@ print_log_bp(const blkptr_t *bp, const char *prefix)
/* ARGSUSED */
static void
zil_prt_rec_create(zilog_t *zilog, int txtype, const void *arg)
zil_prt_rec_create(zilog_t *zilog, int txtype, lr_create_t *lr)
{
const lr_create_t *lr = arg;
time_t crtime = lr->lr_crtime[0];
char *name, *link;
lr_attr_t *lrattr;
@@ -79,55 +75,49 @@ zil_prt_rec_create(zilog_t *zilog, int txtype, const void *arg)
if (txtype == TX_SYMLINK) {
link = name + strlen(name) + 1;
(void) printf("%s%s -> %s\n", tab_prefix, name, link);
(void) printf("%s%s -> %s\n", prefix, name, link);
} else if (txtype != TX_MKXATTR) {
(void) printf("%s%s\n", tab_prefix, name);
(void) printf("%s%s\n", prefix, name);
}
(void) printf("%s%s", tab_prefix, ctime(&crtime));
(void) printf("%sdoid %llu, foid %llu, slots %llu, mode %llo\n",
tab_prefix, (u_longlong_t)lr->lr_doid,
(void) printf("%s%s", prefix, ctime(&crtime));
(void) printf("%sdoid %llu, foid %llu, slots %llu, mode %llo\n", prefix,
(u_longlong_t)lr->lr_doid,
(u_longlong_t)LR_FOID_GET_OBJ(lr->lr_foid),
(u_longlong_t)LR_FOID_GET_SLOTS(lr->lr_foid),
(longlong_t)lr->lr_mode);
(void) printf("%suid %llu, gid %llu, gen %llu, rdev 0x%llx\n",
tab_prefix,
(void) printf("%suid %llu, gid %llu, gen %llu, rdev 0x%llx\n", prefix,
(u_longlong_t)lr->lr_uid, (u_longlong_t)lr->lr_gid,
(u_longlong_t)lr->lr_gen, (u_longlong_t)lr->lr_rdev);
}
/* ARGSUSED */
static void
zil_prt_rec_remove(zilog_t *zilog, int txtype, const void *arg)
zil_prt_rec_remove(zilog_t *zilog, int txtype, lr_remove_t *lr)
{
const lr_remove_t *lr = arg;
(void) printf("%sdoid %llu, name %s\n", tab_prefix,
(void) printf("%sdoid %llu, name %s\n", prefix,
(u_longlong_t)lr->lr_doid, (char *)(lr + 1));
}
/* ARGSUSED */
static void
zil_prt_rec_link(zilog_t *zilog, int txtype, const void *arg)
zil_prt_rec_link(zilog_t *zilog, int txtype, lr_link_t *lr)
{
const lr_link_t *lr = arg;
(void) printf("%sdoid %llu, link_obj %llu, name %s\n", tab_prefix,
(void) printf("%sdoid %llu, link_obj %llu, name %s\n", prefix,
(u_longlong_t)lr->lr_doid, (u_longlong_t)lr->lr_link_obj,
(char *)(lr + 1));
}
/* ARGSUSED */
static void
zil_prt_rec_rename(zilog_t *zilog, int txtype, const void *arg)
zil_prt_rec_rename(zilog_t *zilog, int txtype, lr_rename_t *lr)
{
const lr_rename_t *lr = arg;
char *snm = (char *)(lr + 1);
char *tnm = snm + strlen(snm) + 1;
(void) printf("%ssdoid %llu, tdoid %llu\n", tab_prefix,
(void) printf("%ssdoid %llu, tdoid %llu\n", prefix,
(u_longlong_t)lr->lr_sdoid, (u_longlong_t)lr->lr_tdoid);
(void) printf("%ssrc %s tgt %s\n", tab_prefix, snm, tnm);
(void) printf("%ssrc %s tgt %s\n", prefix, snm, tnm);
}
/* ARGSUSED */
@@ -135,8 +125,9 @@ static int
zil_prt_rec_write_cb(void *data, size_t len, void *unused)
{
char *cdata = data;
int i;
for (size_t i = 0; i < len; i++) {
for (i = 0; i < len; i++) {
if (isprint(*cdata))
(void) printf("%c ", *cdata);
else
@@ -148,16 +139,15 @@ zil_prt_rec_write_cb(void *data, size_t len, void *unused)
/* ARGSUSED */
static void
zil_prt_rec_write(zilog_t *zilog, int txtype, const void *arg)
zil_prt_rec_write(zilog_t *zilog, int txtype, lr_write_t *lr)
{
const lr_write_t *lr = arg;
abd_t *data;
const blkptr_t *bp = &lr->lr_blkptr;
blkptr_t *bp = &lr->lr_blkptr;
zbookmark_phys_t zb;
int verbose = MAX(dump_opt['d'], dump_opt['i']);
int error;
(void) printf("%sfoid %llu, offset %llx, length %llx\n", tab_prefix,
(void) printf("%sfoid %llu, offset %llx, length %llx\n", prefix,
(u_longlong_t)lr->lr_foid, (u_longlong_t)lr->lr_offset,
(u_longlong_t)lr->lr_length);
@@ -165,21 +155,20 @@ zil_prt_rec_write(zilog_t *zilog, int txtype, const void *arg)
return;
if (lr->lr_common.lrc_reclen == sizeof (lr_write_t)) {
(void) printf("%shas blkptr, %s\n", tab_prefix,
(void) printf("%shas blkptr, %s\n", prefix,
!BP_IS_HOLE(bp) &&
bp->blk_birth >= spa_min_claim_txg(zilog->zl_spa) ?
bp->blk_birth >= spa_first_txg(zilog->zl_spa) ?
"will claim" : "won't claim");
print_log_bp(bp, tab_prefix);
print_log_bp(bp, prefix);
if (BP_IS_HOLE(bp)) {
(void) printf("\t\t\tLSIZE 0x%llx\n",
(u_longlong_t)BP_GET_LSIZE(bp));
(void) printf("%s<hole>\n", tab_prefix);
(void) printf("%s<hole>\n", prefix);
return;
}
if (bp->blk_birth < zilog->zl_header->zh_claim_txg) {
(void) printf("%s<block already committed>\n",
tab_prefix);
(void) printf("%s<block already committed>\n", prefix);
return;
}
@@ -199,7 +188,7 @@ zil_prt_rec_write(zilog_t *zilog, int txtype, const void *arg)
abd_copy_from_buf(data, lr + 1, lr->lr_length);
}
(void) printf("%s", tab_prefix);
(void) printf("%s", prefix);
(void) abd_iterate_func(data,
0, MIN(lr->lr_length, (verbose < 6 ? 20 : SPA_MAXBLOCKSIZE)),
zil_prt_rec_write_cb, NULL);
@@ -211,55 +200,52 @@ out:
/* ARGSUSED */
static void
zil_prt_rec_truncate(zilog_t *zilog, int txtype, const void *arg)
zil_prt_rec_truncate(zilog_t *zilog, int txtype, lr_truncate_t *lr)
{
const lr_truncate_t *lr = arg;
(void) printf("%sfoid %llu, offset 0x%llx, length 0x%llx\n", tab_prefix,
(void) printf("%sfoid %llu, offset 0x%llx, length 0x%llx\n", prefix,
(u_longlong_t)lr->lr_foid, (longlong_t)lr->lr_offset,
(u_longlong_t)lr->lr_length);
}
/* ARGSUSED */
static void
zil_prt_rec_setattr(zilog_t *zilog, int txtype, const void *arg)
zil_prt_rec_setattr(zilog_t *zilog, int txtype, lr_setattr_t *lr)
{
const lr_setattr_t *lr = arg;
time_t atime = (time_t)lr->lr_atime[0];
time_t mtime = (time_t)lr->lr_mtime[0];
(void) printf("%sfoid %llu, mask 0x%llx\n", tab_prefix,
(void) printf("%sfoid %llu, mask 0x%llx\n", prefix,
(u_longlong_t)lr->lr_foid, (u_longlong_t)lr->lr_mask);
if (lr->lr_mask & AT_MODE) {
(void) printf("%sAT_MODE %llo\n", tab_prefix,
(void) printf("%sAT_MODE %llo\n", prefix,
(longlong_t)lr->lr_mode);
}
if (lr->lr_mask & AT_UID) {
(void) printf("%sAT_UID %llu\n", tab_prefix,
(void) printf("%sAT_UID %llu\n", prefix,
(u_longlong_t)lr->lr_uid);
}
if (lr->lr_mask & AT_GID) {
(void) printf("%sAT_GID %llu\n", tab_prefix,
(void) printf("%sAT_GID %llu\n", prefix,
(u_longlong_t)lr->lr_gid);
}
if (lr->lr_mask & AT_SIZE) {
(void) printf("%sAT_SIZE %llu\n", tab_prefix,
(void) printf("%sAT_SIZE %llu\n", prefix,
(u_longlong_t)lr->lr_size);
}
if (lr->lr_mask & AT_ATIME) {
(void) printf("%sAT_ATIME %llu.%09llu %s", tab_prefix,
(void) printf("%sAT_ATIME %llu.%09llu %s", prefix,
(u_longlong_t)lr->lr_atime[0],
(u_longlong_t)lr->lr_atime[1],
ctime(&atime));
}
if (lr->lr_mask & AT_MTIME) {
(void) printf("%sAT_MTIME %llu.%09llu %s", tab_prefix,
(void) printf("%sAT_MTIME %llu.%09llu %s", prefix,
(u_longlong_t)lr->lr_mtime[0],
(u_longlong_t)lr->lr_mtime[1],
ctime(&mtime));
@@ -268,48 +254,46 @@ zil_prt_rec_setattr(zilog_t *zilog, int txtype, const void *arg)
/* ARGSUSED */
static void
zil_prt_rec_acl(zilog_t *zilog, int txtype, const void *arg)
zil_prt_rec_acl(zilog_t *zilog, int txtype, lr_acl_t *lr)
{
const lr_acl_t *lr = arg;
(void) printf("%sfoid %llu, aclcnt %llu\n", tab_prefix,
(void) printf("%sfoid %llu, aclcnt %llu\n", prefix,
(u_longlong_t)lr->lr_foid, (u_longlong_t)lr->lr_aclcnt);
}
typedef void (*zil_prt_rec_func_t)(zilog_t *, int, const void *);
typedef void (*zil_prt_rec_func_t)(zilog_t *, int, void *);
typedef struct zil_rec_info {
zil_prt_rec_func_t zri_print;
const char *zri_name;
char *zri_name;
uint64_t zri_count;
} zil_rec_info_t;
static zil_rec_info_t zil_rec_info[TX_MAX_TYPE] = {
{.zri_print = NULL, .zri_name = "Total "},
{.zri_print = zil_prt_rec_create, .zri_name = "TX_CREATE "},
{.zri_print = zil_prt_rec_create, .zri_name = "TX_MKDIR "},
{.zri_print = zil_prt_rec_create, .zri_name = "TX_MKXATTR "},
{.zri_print = zil_prt_rec_create, .zri_name = "TX_SYMLINK "},
{.zri_print = zil_prt_rec_remove, .zri_name = "TX_REMOVE "},
{.zri_print = zil_prt_rec_remove, .zri_name = "TX_RMDIR "},
{.zri_print = zil_prt_rec_link, .zri_name = "TX_LINK "},
{.zri_print = zil_prt_rec_rename, .zri_name = "TX_RENAME "},
{.zri_print = zil_prt_rec_write, .zri_name = "TX_WRITE "},
{.zri_print = zil_prt_rec_truncate, .zri_name = "TX_TRUNCATE "},
{.zri_print = zil_prt_rec_setattr, .zri_name = "TX_SETATTR "},
{.zri_print = zil_prt_rec_acl, .zri_name = "TX_ACL_V0 "},
{.zri_print = zil_prt_rec_acl, .zri_name = "TX_ACL_ACL "},
{.zri_print = zil_prt_rec_create, .zri_name = "TX_CREATE_ACL "},
{.zri_print = zil_prt_rec_create, .zri_name = "TX_CREATE_ATTR "},
{.zri_print = zil_prt_rec_create, .zri_name = "TX_CREATE_ACL_ATTR "},
{.zri_print = zil_prt_rec_create, .zri_name = "TX_MKDIR_ACL "},
{.zri_print = zil_prt_rec_create, .zri_name = "TX_MKDIR_ATTR "},
{.zri_print = zil_prt_rec_create, .zri_name = "TX_MKDIR_ACL_ATTR "},
{.zri_print = zil_prt_rec_write, .zri_name = "TX_WRITE2 "},
{ NULL, "Total " },
{ (zil_prt_rec_func_t)zil_prt_rec_create, "TX_CREATE " },
{ (zil_prt_rec_func_t)zil_prt_rec_create, "TX_MKDIR " },
{ (zil_prt_rec_func_t)zil_prt_rec_create, "TX_MKXATTR " },
{ (zil_prt_rec_func_t)zil_prt_rec_create, "TX_SYMLINK " },
{ (zil_prt_rec_func_t)zil_prt_rec_remove, "TX_REMOVE " },
{ (zil_prt_rec_func_t)zil_prt_rec_remove, "TX_RMDIR " },
{ (zil_prt_rec_func_t)zil_prt_rec_link, "TX_LINK " },
{ (zil_prt_rec_func_t)zil_prt_rec_rename, "TX_RENAME " },
{ (zil_prt_rec_func_t)zil_prt_rec_write, "TX_WRITE " },
{ (zil_prt_rec_func_t)zil_prt_rec_truncate, "TX_TRUNCATE " },
{ (zil_prt_rec_func_t)zil_prt_rec_setattr, "TX_SETATTR " },
{ (zil_prt_rec_func_t)zil_prt_rec_acl, "TX_ACL_V0 " },
{ (zil_prt_rec_func_t)zil_prt_rec_acl, "TX_ACL_ACL " },
{ (zil_prt_rec_func_t)zil_prt_rec_create, "TX_CREATE_ACL " },
{ (zil_prt_rec_func_t)zil_prt_rec_create, "TX_CREATE_ATTR " },
{ (zil_prt_rec_func_t)zil_prt_rec_create, "TX_CREATE_ACL_ATTR " },
{ (zil_prt_rec_func_t)zil_prt_rec_create, "TX_MKDIR_ACL " },
{ (zil_prt_rec_func_t)zil_prt_rec_create, "TX_MKDIR_ATTR " },
{ (zil_prt_rec_func_t)zil_prt_rec_create, "TX_MKDIR_ACL_ATTR " },
{ (zil_prt_rec_func_t)zil_prt_rec_write, "TX_WRITE2 " },
};
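The `zil_rec_info` table pairs each transaction type with a print callback, a label and a running count, and `print_log_record` indexes it by `txtype`. The pattern is sketched below in Python. The three-row table, its indices and the record dictionaries are illustrative only and do not reproduce the real `TX_*` numbering; `RecInfo`, `print_remove` and `record` are hypothetical names.

```python
class RecInfo:
    """One row of the dispatch table: printer, label, running count."""

    def __init__(self, name, printer=None):
        self.name = name
        self.printer = printer
        self.count = 0


def print_remove(lr):
    # Stand-in for a zil_prt_rec_* callback; returns instead of printing.
    return "doid %d, name %s" % (lr["doid"], lr["name"])


def record(table, txtype, lr, verbose, out):
    """Dispatch one log record and update the per-type and total counts."""
    info = table[txtype]
    # Row 0 is the total and has no printer; details need verbose >= 3.
    if txtype and verbose >= 3 and info.printer is not None:
        out.append(info.printer(lr))
    info.count += 1
    table[0].count += 1


# Illustrative table: index 0 is the total, the rest are per-type rows.
TABLE = [RecInfo("Total"), RecInfo("TX_REMOVE", print_remove),
         RecInfo("TX_RMDIR", print_remove)]
```

Keeping the total in row 0 means the statistics printer can walk a single array for both the summary line and the per-type lines.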
/* ARGSUSED */
static int
print_log_record(zilog_t *zilog, const lr_t *lr, void *arg, uint64_t claim_txg)
print_log_record(zilog_t *zilog, lr_t *lr, void *arg, uint64_t claim_txg)
{
int txtype;
int verbose = MAX(dump_opt['d'], dump_opt['i']);
@@ -327,13 +311,8 @@ print_log_record(zilog_t *zilog, const lr_t *lr, void *arg, uint64_t claim_txg)
(u_longlong_t)lr->lrc_txg,
(u_longlong_t)lr->lrc_seq);
if (txtype && verbose >= 3) {
if (!zilog->zl_os->os_encrypted) {
zil_rec_info[txtype].zri_print(zilog, txtype, lr);
} else {
(void) printf("%s(encrypted)\n", tab_prefix);
}
}
if (txtype && verbose >= 3)
zil_rec_info[txtype].zri_print(zilog, txtype, lr);
zil_rec_info[txtype].zri_count++;
zil_rec_info[0].zri_count++;
@@ -343,12 +322,11 @@ print_log_record(zilog_t *zilog, const lr_t *lr, void *arg, uint64_t claim_txg)
/* ARGSUSED */
static int
print_log_block(zilog_t *zilog, const blkptr_t *bp, void *arg,
uint64_t claim_txg)
print_log_block(zilog_t *zilog, blkptr_t *bp, void *arg, uint64_t claim_txg)
{
char blkbuf[BP_SPRINTF_LEN + 10];
int verbose = MAX(dump_opt['d'], dump_opt['i']);
const char *claim;
char *claim;
if (verbose <= 3)
return (0);
@@ -363,7 +341,7 @@ print_log_block(zilog_t *zilog, const blkptr_t *bp, void *arg,
if (claim_txg != 0)
claim = "already claimed";
else if (bp->blk_birth >= spa_min_claim_txg(zilog->zl_spa))
else if (bp->blk_birth >= spa_first_txg(zilog->zl_spa))
claim = "will claim";
else
claim = "won't claim";
@@ -377,7 +355,7 @@ print_log_block(zilog_t *zilog, const blkptr_t *bp, void *arg,
static void
print_log_stats(int verbose)
{
unsigned i, w, p10;
int i, w, p10;
if (verbose > 3)
(void) printf("\n");
@@ -418,15 +396,10 @@ dump_intent_log(zilog_t *zilog)
for (i = 0; i < TX_MAX_TYPE; i++)
zil_rec_info[i].zri_count = 0;
/* see comment in zil_claim() or zil_check_log_chain() */
if (zilog->zl_spa->spa_uberblock.ub_checkpoint_txg != 0 &&
zh->zh_claim_txg == 0)
return;
if (verbose >= 2) {
(void) printf("\n");
(void) zil_parse(zilog, print_log_block, print_log_record, NULL,
zh->zh_claim_txg, B_FALSE);
zh->zh_claim_txg);
print_log_stats(verbose);
}
}
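Note on the hunk above: `zil_rec_info` is a table indexed by transaction type, where each slot pairs a print callback with a running count and slot 0 accumulates the total. Below is a minimal, self-contained sketch of that table-driven pattern. The `REC_*` constants, `rec_info_t`, and `account_record()` are hypothetical names chosen for illustration and are not part of zdb. Unlike the original, the sketch also rejects out-of-range types.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical record types; the real TX_* constants live in the ZFS headers. */
enum { REC_TOTAL = 0, REC_CREATE = 1, REC_REMOVE = 2, REC_MAX = 3 };

typedef void (*rec_print_func_t)(int type);

typedef struct rec_info {
	rec_print_func_t ri_print;	/* callback used at high verbosity */
	const char *ri_name;		/* label for the summary line */
	uint64_t ri_count;		/* records seen of this type */
} rec_info_t;

static void
print_generic(int type)
{
	(void) printf("\trecord type %d\n", type);
}

/* Slot 0 has no callback; it only holds the grand total. */
static rec_info_t rec_info[REC_MAX] = {
	{ NULL,		 "Total",  0 },
	{ print_generic, "CREATE", 0 },
	{ print_generic, "REMOVE", 0 },
};

/* Count one record; print its body only when verbose >= 3. */
static void
account_record(int type, int verbose)
{
	if (type <= REC_TOTAL || type >= REC_MAX)
		return;		/* assumption: ignore unknown types */
	if (verbose >= 3)
		rec_info[type].ri_print(type);
	rec_info[type].ri_count++;
	rec_info[REC_TOTAL].ri_count++;
}
```

Keeping the callback and the counter in the same slot means a new record type needs one table entry and no change to the accounting logic.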
+54 -9
@@ -1,8 +1,10 @@
include $(top_srcdir)/config/Rules.am
AM_CFLAGS += $(LIBUDEV_CFLAGS) $(LIBUUID_CFLAGS)
DEFAULT_INCLUDES += \
-I$(top_srcdir)/include \
-I$(top_srcdir)/lib/libspl/include
SUBDIRS = zed.d
EXTRA_DIST = zed.d/README
sbin_PROGRAMS = zed
@@ -38,12 +40,55 @@ FMA_SRC = \
zed_SOURCES = $(ZED_SRC) $(FMA_SRC)
zed_LDADD = \
$(abs_top_builddir)/lib/libzfs/libzfs.la \
$(abs_top_builddir)/lib/libzfs_core/libzfs_core.la \
$(abs_top_builddir)/lib/libnvpair/libnvpair.la \
$(abs_top_builddir)/lib/libuutil/libuutil.la
$(top_builddir)/lib/libavl/libavl.la \
$(top_builddir)/lib/libnvpair/libnvpair.la \
$(top_builddir)/lib/libspl/libspl.la \
$(top_builddir)/lib/libuutil/libuutil.la \
$(top_builddir)/lib/libzpool/libzpool.la \
$(top_builddir)/lib/libzfs/libzfs.la \
$(top_builddir)/lib/libzfs_core/libzfs_core.la
zed_LDADD += -lrt $(LIBUDEV_LIBS) $(LIBUUID_LIBS)
zed_LDFLAGS = -pthread
zed_LDFLAGS = -lrt -pthread
EXTRA_DIST = agents/README.md
zedconfdir = $(sysconfdir)/zfs/zed.d
dist_zedconf_DATA = \
zed.d/zed-functions.sh \
zed.d/zed.rc
zedexecdir = $(libexecdir)/zfs/zed.d
dist_zedexec_SCRIPTS = \
zed.d/all-debug.sh \
zed.d/all-syslog.sh \
zed.d/data-notify.sh \
zed.d/generic-notify.sh \
zed.d/resilver_finish-notify.sh \
zed.d/scrub_finish-notify.sh \
zed.d/statechange-led.sh \
zed.d/statechange-notify.sh \
zed.d/vdev_clear-led.sh \
zed.d/vdev_attach-led.sh \
zed.d/pool_import-led.sh \
zed.d/resilver_finish-start-scrub.sh
zedconfdefaults = \
all-syslog.sh \
data-notify.sh \
resilver_finish-notify.sh \
scrub_finish-notify.sh \
statechange-led.sh \
statechange-notify.sh \
vdev_clear-led.sh \
vdev_attach-led.sh \
pool_import-led.sh \
resilver_finish-start-scrub.sh
install-data-hook:
$(MKDIR_P) "$(DESTDIR)$(zedconfdir)"
for f in $(zedconfdefaults); do \
test -f "$(DESTDIR)$(zedconfdir)/$${f}" -o \
-L "$(DESTDIR)$(zedconfdir)/$${f}" || \
ln -s "$(zedexecdir)/$${f}" "$(DESTDIR)$(zedconfdir)"; \
done
chmod 0600 "$(DESTDIR)$(zedconfdir)/zed.rc"
+1 -1
@@ -25,7 +25,7 @@
*/
/*
* This file implements the minimal FMD module API required to support the
* This file imlements the minimal FMD module API required to support the
* fault logic modules in ZED. This support includes module registration,
* memory allocation, module property accessors, basic case management,
* one-shot timers and SERD engines.
+1 -1
@@ -281,7 +281,7 @@ fmd_serd_eng_empty(fmd_serd_eng_t *sgp)
void
fmd_serd_eng_reset(fmd_serd_eng_t *sgp)
{
serd_log_msg(" SERD Engine: resetting %s", sgp->sg_name);
serd_log_msg(" SERD Engine: reseting %s", sgp->sg_name);
while (sgp->sg_count != 0)
fmd_serd_eng_discard(sgp, list_head(&sgp->sg_list));
+41 -95
@@ -12,7 +12,6 @@
/*
* Copyright (c) 2016, Intel Corporation.
* Copyright (c) 2018, loli10K <ezomori.nozomu@gmail.com>
*/
#include <libnvpair.h>
@@ -54,25 +53,13 @@ pthread_t g_agents_tid;
libzfs_handle_t *g_zfs_hdl;
/* guid search data */
typedef enum device_type {
DEVICE_TYPE_L2ARC, /* l2arc device */
DEVICE_TYPE_SPARE, /* spare device */
DEVICE_TYPE_PRIMARY /* any primary pool storage device */
} device_type_t;
typedef struct guid_search {
uint64_t gs_pool_guid;
uint64_t gs_vdev_guid;
char *gs_devid;
device_type_t gs_vdev_type;
uint64_t gs_vdev_expandtime; /* vdev expansion time */
} guid_search_t;
/*
* Walks the vdev tree recursively looking for a matching devid.
* Returns B_TRUE as soon as a matching device is found, B_FALSE otherwise.
*/
static boolean_t
static void
zfs_agent_iter_vdev(zpool_handle_t *zhp, nvlist_t *nvl, void *arg)
{
guid_search_t *gsp = arg;
@@ -85,48 +72,19 @@ zfs_agent_iter_vdev(zpool_handle_t *zhp, nvlist_t *nvl, void *arg)
*/
if (nvlist_lookup_nvlist_array(nvl, ZPOOL_CONFIG_CHILDREN,
&child, &children) == 0) {
for (c = 0; c < children; c++) {
if (zfs_agent_iter_vdev(zhp, child[c], gsp)) {
gsp->gs_vdev_type = DEVICE_TYPE_PRIMARY;
return (B_TRUE);
}
}
for (c = 0; c < children; c++)
zfs_agent_iter_vdev(zhp, child[c], gsp);
return;
}
/*
* Iterate over any spares and cache devices
* On a devid match, grab the vdev guid
*/
if (nvlist_lookup_nvlist_array(nvl, ZPOOL_CONFIG_SPARES,
&child, &children) == 0) {
for (c = 0; c < children; c++) {
if (zfs_agent_iter_vdev(zhp, child[c], gsp)) {
gsp->gs_vdev_type = DEVICE_TYPE_L2ARC;
return (B_TRUE);
}
}
}
if (nvlist_lookup_nvlist_array(nvl, ZPOOL_CONFIG_L2CACHE,
&child, &children) == 0) {
for (c = 0; c < children; c++) {
if (zfs_agent_iter_vdev(zhp, child[c], gsp)) {
gsp->gs_vdev_type = DEVICE_TYPE_SPARE;
return (B_TRUE);
}
}
}
/*
* On a devid match, grab the vdev guid and expansion time, if any.
*/
if (gsp->gs_devid != NULL &&
if ((gsp->gs_vdev_guid == 0) &&
(nvlist_lookup_string(nvl, ZPOOL_CONFIG_DEVID, &path) == 0) &&
(strcmp(gsp->gs_devid, path) == 0)) {
(void) nvlist_lookup_uint64(nvl, ZPOOL_CONFIG_GUID,
&gsp->gs_vdev_guid);
(void) nvlist_lookup_uint64(nvl, ZPOOL_CONFIG_EXPANSION_TIME,
&gsp->gs_vdev_expandtime);
return (B_TRUE);
}
return (B_FALSE);
}
static int
@@ -141,7 +99,7 @@ zfs_agent_iter_pool(zpool_handle_t *zhp, void *arg)
if ((config = zpool_get_config(zhp, NULL)) != NULL) {
if (nvlist_lookup_nvlist(config, ZPOOL_CONFIG_VDEV_TREE,
&nvl) == 0) {
(void) zfs_agent_iter_vdev(zhp, nvl, gsp);
zfs_agent_iter_vdev(zhp, nvl, gsp);
}
}
/*
@@ -177,9 +135,9 @@ zfs_agent_post_event(const char *class, const char *subclass, nvlist_t *nvl)
}
/*
* On Linux, we don't get the expected FM_RESOURCE_REMOVED ereport
* from the vdev_disk layer after a hot unplug. Fortunately we do
* get an EC_DEV_REMOVE from our disk monitor and it is a suitable
* On ZFS on Linux, we don't get the expected FM_RESOURCE_REMOVED
* ereport from vdev_disk layer after a hot unplug. Fortunately we
* get a EC_DEV_REMOVE from our disk monitor and it is a suitable
* proxy so we remap it here for the benefit of the diagnosis engine.
*/
if ((strcmp(class, EC_DEV_REMOVE) == 0) &&
@@ -190,8 +148,6 @@ zfs_agent_post_event(const char *class, const char *subclass, nvlist_t *nvl)
struct timeval tv;
int64_t tod[2];
uint64_t pool_guid = 0, vdev_guid = 0;
guid_search_t search = { 0 };
device_type_t devtype = DEVICE_TYPE_PRIMARY;
class = "resource.fs.zfs.removed";
subclass = "";
@@ -200,55 +156,30 @@ zfs_agent_post_event(const char *class, const char *subclass, nvlist_t *nvl)
(void) nvlist_lookup_uint64(nvl, ZFS_EV_POOL_GUID, &pool_guid);
(void) nvlist_lookup_uint64(nvl, ZFS_EV_VDEV_GUID, &vdev_guid);
(void) gettimeofday(&tv, NULL);
tod[0] = tv.tv_sec;
tod[1] = tv.tv_usec;
(void) nvlist_add_int64_array(payload, FM_EREPORT_TIME, tod, 2);
/*
* For multipath, spare and l2arc devices ZFS_EV_VDEV_GUID or
* ZFS_EV_POOL_GUID may be missing so find them.
* For multipath, ZFS_EV_VDEV_GUID is missing so find it.
*/
(void) nvlist_lookup_string(nvl, DEV_IDENTIFIER,
&search.gs_devid);
(void) zpool_iter(g_zfs_hdl, zfs_agent_iter_pool, &search);
pool_guid = search.gs_pool_guid;
vdev_guid = search.gs_vdev_guid;
devtype = search.gs_vdev_type;
if (vdev_guid == 0) {
guid_search_t search = { 0 };
/*
* We want to avoid reporting "remove" events coming from
* libudev for VDEVs which were expanded recently (10s) and
* avoid activating spares in response to partitions being
* deleted and created in rapid succession.
*/
if (search.gs_vdev_expandtime != 0 &&
search.gs_vdev_expandtime + 10 > tv.tv_sec) {
zed_log_msg(LOG_INFO, "agent post event: ignoring '%s' "
"for recently expanded device '%s'", EC_DEV_REMOVE,
search.gs_devid);
goto out;
(void) nvlist_lookup_string(nvl, DEV_IDENTIFIER,
&search.gs_devid);
(void) zpool_iter(g_zfs_hdl, zfs_agent_iter_pool,
&search);
pool_guid = search.gs_pool_guid;
vdev_guid = search.gs_vdev_guid;
}
(void) nvlist_add_uint64(payload,
FM_EREPORT_PAYLOAD_ZFS_POOL_GUID, pool_guid);
(void) nvlist_add_uint64(payload,
FM_EREPORT_PAYLOAD_ZFS_VDEV_GUID, vdev_guid);
switch (devtype) {
case DEVICE_TYPE_L2ARC:
(void) nvlist_add_string(payload,
FM_EREPORT_PAYLOAD_ZFS_VDEV_TYPE,
VDEV_TYPE_L2CACHE);
break;
case DEVICE_TYPE_SPARE:
(void) nvlist_add_string(payload,
FM_EREPORT_PAYLOAD_ZFS_VDEV_TYPE, VDEV_TYPE_SPARE);
break;
case DEVICE_TYPE_PRIMARY:
(void) nvlist_add_string(payload,
FM_EREPORT_PAYLOAD_ZFS_VDEV_TYPE, VDEV_TYPE_DISK);
break;
}
(void) gettimeofday(&tv, NULL);
tod[0] = tv.tv_sec;
tod[1] = tv.tv_usec;
(void) nvlist_add_int64_array(payload, FM_EREPORT_TIME, tod, 2);
zed_log_msg(LOG_INFO, "agent post event: mapping '%s' to '%s'",
EC_DEV_REMOVE, class);
@@ -262,7 +193,6 @@ zfs_agent_post_event(const char *class, const char *subclass, nvlist_t *nvl)
list_insert_tail(&agent_events, event);
(void) pthread_mutex_unlock(&agent_lock);
out:
(void) pthread_cond_signal(&agent_cond);
}
@@ -420,3 +350,19 @@ zfs_agent_fini(void)
g_zfs_hdl = NULL;
}
/*
* In ZED context, all the FMA agents run in the same thread
* and do not require a unique libzfs instance. Modules should
* use these stubs.
*/
libzfs_handle_t *
__libzfs_init(void)
{
return (g_zfs_hdl);
}
void
__libzfs_fini(libzfs_handle_t *hdl)
{
}
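Note on `zfs_agent_post_event()` above: one side of this diff suppresses a udev "remove" event when the vdev's recorded expansion time is less than 10 seconds old, so that a partition being deleted and recreated during an expansion does not activate a spare. The test reduces to the small predicate sketched below. `recently_expanded()` and `EXPAND_GRACE_SECS` are hypothetical names for illustration; the original writes the comparison inline against `tv.tv_sec`.

```c
#include <stdbool.h>
#include <stdint.h>

#define	EXPAND_GRACE_SECS	10	/* the diff hard-codes this as "+ 10" */

/*
 * Return true when a vdev was expanded within the grace window.
 * An expansion time of 0 means "never expanded" and never matches.
 */
static bool
recently_expanded(uint64_t expandtime, uint64_t now_sec)
{
	return (expandtime != 0 &&
	    expandtime + EXPAND_GRACE_SECS > now_sec);
}
```

With `now_sec = 100`, an expansion stamped at 95 is still inside the window, while one stamped at 90 is exactly at the boundary and is no longer suppressed.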
+7
@@ -39,6 +39,13 @@ extern int zfs_slm_init(void);
extern void zfs_slm_fini(void);
extern void zfs_slm_event(const char *, const char *, nvlist_t *);
/*
* In ZED context, all the FMA agents run in the same thread
* and do not require a unique libzfs instance.
*/
extern libzfs_handle_t *__libzfs_init(void);
extern void __libzfs_fini(libzfs_handle_t *);
#ifdef __cplusplus
}
#endif
+73 -12
@@ -26,7 +26,6 @@
*/
#include <stddef.h>
#include <string.h>
#include <strings.h>
#include <libuutil.h>
#include <libzfs.h>
@@ -168,12 +167,14 @@ zfs_case_unserialize(fmd_hdl_t *hdl, fmd_case_t *cp)
static void
zfs_mark_vdev(uint64_t pool_guid, nvlist_t *vd, er_timeval_t *loaded)
{
uint64_t vdev_guid = 0;
uint64_t vdev_guid;
uint_t c, children;
nvlist_t **child;
zfs_case_t *zcp;
int ret;
(void) nvlist_lookup_uint64(vd, ZPOOL_CONFIG_GUID, &vdev_guid);
ret = nvlist_lookup_uint64(vd, ZPOOL_CONFIG_GUID, &vdev_guid);
assert(ret == 0);
/*
* Mark any cases associated with this (pool, vdev) pair.
@@ -252,10 +253,7 @@ zfs_mark_pool(zpool_handle_t *zhp, void *unused)
}
ret = nvlist_lookup_nvlist(config, ZPOOL_CONFIG_VDEV_TREE, &vd);
if (ret) {
zpool_close(zhp);
return (-1);
}
assert(ret == 0);
zfs_mark_vdev(pool_guid, vd, &loaded);
@@ -379,6 +377,11 @@ zfs_case_solve(fmd_hdl_t *hdl, zfs_case_t *zcp, const char *faultname,
nvlist_t *detector, *fault;
boolean_t serialize;
nvlist_t *fru = NULL;
#ifdef HAVE_LIBTOPO
nvlist_t *fmri;
topo_hdl_t *thp;
int err;
#endif
fmd_hdl_debug(hdl, "solving fault '%s'", faultname);
/*
@@ -397,6 +400,64 @@ zfs_case_solve(fmd_hdl_t *hdl, zfs_case_t *zcp, const char *faultname,
zcp->zc_data.zc_vdev_guid);
}
#ifdef HAVE_LIBTOPO
/*
* We also want to make sure that the detector (pool or vdev) properly
* reflects the diagnosed state, when the fault corresponds to internal
* ZFS state (i.e. not checksum or I/O error-induced). Otherwise, a
* device which was unavailable early in boot (because the driver/file
* wasn't available) and is now healthy will be mis-diagnosed.
*/
if (!fmd_nvl_fmri_present(hdl, detector) ||
(checkunusable && !fmd_nvl_fmri_unusable(hdl, detector))) {
fmd_case_close(hdl, zcp->zc_case);
nvlist_free(detector);
return;
}
fru = NULL;
if (zcp->zc_fru != NULL &&
(thp = fmd_hdl_topo_hold(hdl, TOPO_VERSION)) != NULL) {
/*
* If the vdev had an associated FRU, then get the FRU nvlist
* from the topo handle and use that in the suspect list. We
* explicitly lookup the FRU because the fmri reported from the
* kernel may not have up to date details about the disk itself
* (serial, part, etc).
*/
if (topo_fmri_str2nvl(thp, zcp->zc_fru, &fmri, &err) == 0) {
libzfs_handle_t *zhdl = fmd_hdl_getspecific(hdl);
/*
* If the disk is part of the system chassis, but the
* FRU indicates a different chassis ID than our
* current system, then ignore the error. This
* indicates that the device was part of another
* cluster head, and for obvious reasons cannot be
* imported on this system.
*/
if (libzfs_fru_notself(zhdl, zcp->zc_fru)) {
fmd_case_close(hdl, zcp->zc_case);
nvlist_free(fmri);
fmd_hdl_topo_rele(hdl, thp);
nvlist_free(detector);
return;
}
/*
* If the device is no longer present on the system, or
* topo_fmri_fru() fails for other reasons, then fall
* back to the fmri specified in the vdev.
*/
if (topo_fmri_fru(thp, fmri, &fru, &err) != 0)
fru = fmd_nvl_dup(hdl, fmri, FMD_SLEEP);
nvlist_free(fmri);
}
fmd_hdl_topo_rele(hdl, thp);
}
#endif
fault = fmd_nvl_create_fault(hdl, faultname, 100, detector,
fru, detector);
fmd_case_add_suspect(hdl, zcp->zc_case, fault);
@@ -921,27 +982,27 @@ _zfs_diagnosis_init(fmd_hdl_t *hdl)
{
libzfs_handle_t *zhdl;
if ((zhdl = libzfs_init()) == NULL)
if ((zhdl = __libzfs_init()) == NULL)
return;
if ((zfs_case_pool = uu_list_pool_create("zfs_case_pool",
sizeof (zfs_case_t), offsetof(zfs_case_t, zc_node),
NULL, UU_LIST_POOL_DEBUG)) == NULL) {
libzfs_fini(zhdl);
__libzfs_fini(zhdl);
return;
}
if ((zfs_cases = uu_list_create(zfs_case_pool, NULL,
UU_LIST_DEBUG)) == NULL) {
uu_list_pool_destroy(zfs_case_pool);
libzfs_fini(zhdl);
__libzfs_fini(zhdl);
return;
}
if (fmd_hdl_register(hdl, FMD_API_VERSION, &fmd_info) != 0) {
uu_list_destroy(zfs_cases);
uu_list_pool_destroy(zfs_case_pool);
libzfs_fini(zhdl);
__libzfs_fini(zhdl);
return;
}
@@ -977,5 +1038,5 @@ _zfs_diagnosis_fini(fmd_hdl_t *hdl)
uu_list_pool_destroy(zfs_case_pool);
zhdl = fmd_hdl_getspecific(hdl);
libzfs_fini(zhdl);
__libzfs_fini(zhdl);
}
+72 -108
@@ -23,7 +23,6 @@
* Copyright (c) 2012 by Delphix. All rights reserved.
* Copyright 2014 Nexenta Systems, Inc. All rights reserved.
* Copyright (c) 2016, 2017, Intel Corporation.
* Copyright (c) 2017 Open-E, Inc. All Rights Reserved.
*/
/*
@@ -63,14 +62,17 @@
* If the device could not be replaced, then the second online attempt will
* trigger the FMA fault that we skipped earlier.
*
* On Linux udev provides a disk insert for both the disk and the partition.
* ZFS on Linux porting notes:
* In lieu of a thread pool, just spawn a thread on demmand.
* Linux udev provides a disk insert for both the disk and the partition
*
*/
#include <ctype.h>
#include <devid.h>
#include <fcntl.h>
#include <libnvpair.h>
#include <libzfs.h>
#include <libzutil.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>
@@ -80,10 +82,8 @@
#include <sys/sunddi.h>
#include <sys/sysevent/eventdefs.h>
#include <sys/sysevent/dev.h>
#include <thread_pool.h>
#include <pthread.h>
#include <unistd.h>
#include <errno.h>
#include "zfs_agents.h"
#include "../zed_log.h"
@@ -96,12 +96,12 @@ typedef void (*zfs_process_func_t)(zpool_handle_t *, nvlist_t *, boolean_t);
libzfs_handle_t *g_zfshdl;
list_t g_pool_list; /* list of unavailable pools at initialization */
list_t g_device_list; /* list of disks with asynchronous label request */
tpool_t *g_tpool;
boolean_t g_enumeration_done;
pthread_t g_zfs_tid; /* zfs_enum_pools() thread */
pthread_t g_zfs_tid;
typedef struct unavailpool {
zpool_handle_t *uap_zhp;
pthread_t uap_enable_tid; /* dataset enable thread if activated */
list_node_t uap_node;
} unavailpool_t;
@@ -134,6 +134,7 @@ zfs_unavail_pool(zpool_handle_t *zhp, void *data)
unavailpool_t *uap;
uap = malloc(sizeof (unavailpool_t));
uap->uap_zhp = zhp;
uap->uap_enable_tid = 0;
list_insert_tail((list_t *)data, uap);
} else {
zpool_close(zhp);
@@ -154,7 +155,7 @@ zfs_unavail_pool(zpool_handle_t *zhp, void *data)
* 1. physical match with no fs, no partition
* tag it top, partition disk
*
* 2. physical match again, see partition and tag
* 2. physical match again, see partion and tag
*
*/
@@ -189,8 +190,8 @@ zfs_process_add(zpool_handle_t *zhp, nvlist_t *vdev, boolean_t labeled)
char rawpath[PATH_MAX], fullpath[PATH_MAX];
char devpath[PATH_MAX];
int ret;
boolean_t is_dm = B_FALSE;
boolean_t is_sd = B_FALSE;
int is_dm = 0;
int is_sd = 0;
uint_t c;
vdev_stat_t *vs;
@@ -218,8 +219,8 @@ zfs_process_add(zpool_handle_t *zhp, nvlist_t *vdev, boolean_t labeled)
is_dm = zfs_dev_is_dm(path);
zed_log_msg(LOG_INFO, "zfs_process_add: pool '%s' vdev '%s', phys '%s'"
" wholedisk %d, %s dm (guid %llu)", zpool_get_name(zhp), path,
physpath ? physpath : "NULL", wholedisk, is_dm ? "is" : "not",
" wholedisk %d, dm %d (%llu)", zpool_get_name(zhp), path,
physpath ? physpath : "NULL", wholedisk, is_dm,
(long long unsigned int)guid);
/*
@@ -264,7 +265,7 @@ zfs_process_add(zpool_handle_t *zhp, nvlist_t *vdev, boolean_t labeled)
* testing)
*/
if (physpath != NULL && strcmp("scsidebug", physpath) == 0)
is_sd = B_TRUE;
is_sd = 1;
/*
* If the pool doesn't have the autoreplace property set, then use
@@ -425,17 +426,9 @@ zfs_process_add(zpool_handle_t *zhp, nvlist_t *vdev, boolean_t labeled)
nvlist_free(newvd);
/*
* Wait for udev to verify the links exist, then auto-replace
* the leaf disk at same physical location.
* auto replace a leaf disk at same physical location
*/
if (zpool_label_disk_wait(path, 3000) != 0) {
zed_log_msg(LOG_WARNING, "zfs_mod: expected replacement "
"disk %s is missing", path);
nvlist_free(nvroot);
return;
}
ret = zpool_vdev_attach(zhp, fullpath, path, nvroot, B_TRUE, B_FALSE);
ret = zpool_vdev_attach(zhp, fullpath, path, nvroot, B_TRUE);
zed_log_msg(LOG_INFO, " zpool_vdev_replace: %s with %s (%s)",
fullpath, path, (ret == 0) ? "no errors" :
@@ -473,20 +466,7 @@ zfs_iter_vdev(zpool_handle_t *zhp, nvlist_t *nvl, void *data)
&child, &children) == 0) {
for (c = 0; c < children; c++)
zfs_iter_vdev(zhp, child[c], data);
}
/*
* Iterate over any spares and cache devices
*/
if (nvlist_lookup_nvlist_array(nvl, ZPOOL_CONFIG_SPARES,
&child, &children) == 0) {
for (c = 0; c < children; c++)
zfs_iter_vdev(zhp, child[c], data);
}
if (nvlist_lookup_nvlist_array(nvl, ZPOOL_CONFIG_L2CACHE,
&child, &children) == 0) {
for (c = 0; c < children; c++)
zfs_iter_vdev(zhp, child[c], data);
return;
}
/* once a vdev was matched and processed there is nothing left to do */
@@ -531,14 +511,19 @@ zfs_iter_vdev(zpool_handle_t *zhp, nvlist_t *nvl, void *data)
(dp->dd_func)(zhp, nvl, dp->dd_islabeled);
}
static void
static void *
zfs_enable_ds(void *arg)
{
unavailpool_t *pool = (unavailpool_t *)arg;
assert(pool->uap_enable_tid = pthread_self());
(void) zpool_enable_datasets(pool->uap_zhp, NULL, 0);
zpool_close(pool->uap_zhp);
free(pool);
pool->uap_zhp = NULL;
/* Note: zfs_slm_fini() will cleanup this pool entry on exit */
return (NULL);
}
static int
@@ -573,13 +558,15 @@ zfs_iter_pool(zpool_handle_t *zhp, void *data)
for (pool = list_head(&g_pool_list); pool != NULL;
pool = list_next(&g_pool_list, pool)) {
if (pool->uap_enable_tid != 0)
continue; /* entry already processed */
if (strcmp(zpool_get_name(zhp),
zpool_get_name(pool->uap_zhp)))
continue;
if (zfs_toplevel_state(zhp) >= VDEV_STATE_DEGRADED) {
list_remove(&g_pool_list, pool);
(void) tpool_dispatch(g_tpool, zfs_enable_ds,
pool);
/* send to a background thread; keep on list */
(void) pthread_create(&pool->uap_enable_tid,
NULL, zfs_enable_ds, pool);
break;
}
}
@@ -671,7 +658,7 @@ zfs_deliver_add(nvlist_t *nvl, boolean_t is_lofi)
devid, devpath ? devpath : "NULL", is_slice);
/*
* Iterate over all vdevs looking for a match in the following order:
* Iterate over all vdevs looking for a match in the folllowing order:
* 1. ZPOOL_CONFIG_DEVID (identifies the unique disk)
* 2. ZPOOL_CONFIG_PHYS_PATH (identifies disk physical location).
*
@@ -716,8 +703,8 @@ zfsdle_vdev_online(zpool_handle_t *zhp, void *data)
{
char *devname = data;
boolean_t avail_spare, l2cache;
vdev_state_t newstate;
nvlist_t *tgt;
int error;
zed_log_msg(LOG_INFO, "zfsdle_vdev_online: searching for '%s' in '%s'",
devname, zpool_get_name(zhp));
@@ -725,58 +712,40 @@ zfsdle_vdev_online(zpool_handle_t *zhp, void *data)
if ((tgt = zpool_find_vdev_by_physpath(zhp, devname,
&avail_spare, &l2cache, NULL)) != NULL) {
char *path, fullpath[MAXPATHLEN];
uint64_t wholedisk;
uint64_t wholedisk = 0ULL;
error = nvlist_lookup_string(tgt, ZPOOL_CONFIG_PATH, &path);
if (error) {
zpool_close(zhp);
return (0);
}
error = nvlist_lookup_uint64(tgt, ZPOOL_CONFIG_WHOLE_DISK,
&wholedisk);
if (error)
wholedisk = 0;
verify(nvlist_lookup_string(tgt, ZPOOL_CONFIG_PATH,
&path) == 0);
verify(nvlist_lookup_uint64(tgt, ZPOOL_CONFIG_WHOLE_DISK,
&wholedisk) == 0);
(void) strlcpy(fullpath, path, sizeof (fullpath));
if (wholedisk) {
path = strrchr(path, '/');
if (path != NULL) {
path = zfs_strip_partition(path + 1);
if (path == NULL) {
zpool_close(zhp);
return (0);
}
} else {
zpool_close(zhp);
char *spath = zfs_strip_partition(fullpath);
if (!spath) {
zed_log_msg(LOG_INFO, "%s: Can't alloc",
__func__);
return (0);
}
(void) strlcpy(fullpath, path, sizeof (fullpath));
free(path);
(void) strlcpy(fullpath, spath, sizeof (fullpath));
free(spath);
/*
* We need to reopen the pool associated with this
* device so that the kernel can update the size of
* the expanded device. When expanding there is no
* need to restart the scrub from the beginning.
* device so that the kernel can update the size
* of the expanded device.
*/
boolean_t scrub_restart = B_FALSE;
(void) zpool_reopen_one(zhp, &scrub_restart);
} else {
(void) strlcpy(fullpath, path, sizeof (fullpath));
(void) zpool_reopen(zhp);
}
if (zpool_get_prop_int(zhp, ZPOOL_PROP_AUTOEXPAND, NULL)) {
vdev_state_t newstate;
if (zpool_get_state(zhp) != POOL_STATE_UNAVAIL) {
error = zpool_vdev_online(zhp, fullpath, 0,
zed_log_msg(LOG_INFO, "zfsdle_vdev_online: setting "
"device '%s' to ONLINE state in pool '%s'",
fullpath, zpool_get_name(zhp));
if (zpool_get_state(zhp) != POOL_STATE_UNAVAIL)
(void) zpool_vdev_online(zhp, fullpath, 0,
&newstate);
zed_log_msg(LOG_INFO, "zfsdle_vdev_online: "
"setting device '%s' to ONLINE state "
"in pool '%s': %d", fullpath,
zpool_get_name(zhp), error);
}
}
zpool_close(zhp);
return (1);
@@ -786,32 +755,23 @@ zfsdle_vdev_online(zpool_handle_t *zhp, void *data)
}
/*
* This function handles the ESC_DEV_DLE device change event. Use the
* provided vdev guid when looking up a disk or partition, when the guid
* is not present assume the entire disk is owned by ZFS and append the
* expected -part1 partition information then lookup by physical path.
* This function handles the ESC_DEV_DLE event.
*/
static int
zfs_deliver_dle(nvlist_t *nvl)
{
char *devname, name[MAXPATHLEN];
uint64_t guid;
char *devname;
if (nvlist_lookup_uint64(nvl, ZFS_EV_VDEV_GUID, &guid) == 0) {
sprintf(name, "%llu", (u_longlong_t)guid);
} else if (nvlist_lookup_string(nvl, DEV_PHYS_PATH, &devname) == 0) {
strlcpy(name, devname, MAXPATHLEN);
zfs_append_partition(name, MAXPATHLEN);
} else {
zed_log_msg(LOG_INFO, "zfs_deliver_dle: no guid or physpath");
if (nvlist_lookup_string(nvl, DEV_PHYS_PATH, &devname) != 0) {
zed_log_msg(LOG_INFO, "zfs_deliver_dle: no physpath");
return (-1);
}
if (zpool_iter(g_zfshdl, zfsdle_vdev_online, name) != 1) {
if (zpool_iter(g_zfshdl, zfsdle_vdev_online, devname) != 1) {
zed_log_msg(LOG_INFO, "zfs_deliver_dle: device '%s' not "
"found", name);
"found", devname);
return (1);
}
return (0);
}
@@ -889,12 +849,12 @@ zfs_enum_pools(void *arg)
*
* sent messages from zevents or udev monitor
*
* For now, each agent has its own libzfs instance
* For now, each agent has it's own libzfs instance
*/
int
zfs_slm_init()
{
if ((g_zfshdl = libzfs_init()) == NULL)
if ((g_zfshdl = __libzfs_init()) == NULL)
return (-1);
/*
@@ -906,7 +866,7 @@ zfs_slm_init()
if (pthread_create(&g_zfs_tid, NULL, zfs_enum_pools, NULL) != 0) {
list_destroy(&g_pool_list);
libzfs_fini(g_zfshdl);
__libzfs_fini(g_zfshdl);
return (-1);
}
@@ -924,15 +884,19 @@ zfs_slm_fini()
/* wait for zfs_enum_pools thread to complete */
(void) pthread_join(g_zfs_tid, NULL);
/* destroy the thread pool */
if (g_tpool != NULL) {
tpool_wait(g_tpool);
tpool_destroy(g_tpool);
}
while ((pool = (list_head(&g_pool_list))) != NULL) {
/*
* each pool entry has two possibilities
* 1. was made available (so wait for zfs_enable_ds thread)
* 2. still unavailable (just close the pool)
*/
if (pool->uap_enable_tid)
(void) pthread_join(pool->uap_enable_tid, NULL);
else if (pool->uap_zhp != NULL)
zpool_close(pool->uap_zhp);
list_remove(&g_pool_list, pool);
zpool_close(pool->uap_zhp);
free(pool);
}
list_destroy(&g_pool_list);
@@ -943,7 +907,7 @@ zfs_slm_fini()
}
list_destroy(&g_device_list);
libzfs_fini(g_zfshdl);
__libzfs_fini(g_zfshdl);
}
void
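Note on `zfs_deliver_dle()` above: one side of this diff chooses the lookup key for an `ESC_DEV_DLE` event in order of preference. It uses the vdev GUID rendered as a decimal string when present, otherwise the physical path with the expected first-partition suffix appended, and fails if neither is available. The sketch below reproduces only that selection logic. `dle_lookup_name()` is a hypothetical helper, and appending a literal `-part1` is a simplifying assumption standing in for `zfs_append_partition()`, whose actual suffix rules are not shown in this diff.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Build the name used to search for the vdev.
 * guid may be NULL when the event carries no vdev GUID;
 * physpath may be NULL when it carries no physical path.
 * Returns 0 on success, -1 when neither key is available.
 */
static int
dle_lookup_name(char *name, size_t len, const uint64_t *guid,
    const char *physpath)
{
	if (guid != NULL) {
		(void) snprintf(name, len, "%llu",
		    (unsigned long long)*guid);
		return (0);
	}
	if (physpath != NULL) {
		/* assumption: whole-disk vdevs use a "-part1" suffix */
		(void) snprintf(name, len, "%s-part1", physpath);
		return (0);
	}
	return (-1);
}
```

Using `snprintf()` with the buffer length bounds both branches, whereas the GUID branch in the diff uses an unbounded `sprintf()` into a `MAXPATHLEN` buffer (safe there only because a 64-bit decimal is at most 20 digits).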
+158 -73
@@ -22,7 +22,6 @@
* Copyright (c) 2006, 2010, Oracle and/or its affiliates. All rights reserved.
*
* Copyright (c) 2016, Intel Corporation.
* Copyright (c) 2018, loli10K <ezomori.nozomu@gmail.com>
*/
/*
@@ -72,6 +71,7 @@ zfs_retire_clear_data(fmd_hdl_t *hdl, zfs_retire_data_t *zdp)
*/
typedef struct find_cbdata {
uint64_t cb_guid;
const char *cb_fru;
zpool_handle_t *cb_zhp;
nvlist_t *cb_vdev;
} find_cbdata_t;
@@ -95,18 +95,26 @@ find_pool(zpool_handle_t *zhp, void *data)
* Find a vdev within a tree with a matching GUID.
*/
static nvlist_t *
find_vdev(libzfs_handle_t *zhdl, nvlist_t *nv, uint64_t search_guid)
find_vdev(libzfs_handle_t *zhdl, nvlist_t *nv, const char *search_fru,
uint64_t search_guid)
{
uint64_t guid;
nvlist_t **child;
uint_t c, children;
nvlist_t *ret;
char *fru;
if (nvlist_lookup_uint64(nv, ZPOOL_CONFIG_GUID, &guid) == 0 &&
guid == search_guid) {
fmd_hdl_debug(fmd_module_hdl("zfs-retire"),
"matched vdev %llu", guid);
return (nv);
if (search_fru != NULL) {
if (nvlist_lookup_string(nv, ZPOOL_CONFIG_FRU, &fru) == 0 &&
libzfs_fru_compare(zhdl, fru, search_fru))
return (nv);
} else {
if (nvlist_lookup_uint64(nv, ZPOOL_CONFIG_GUID, &guid) == 0 &&
guid == search_guid) {
fmd_hdl_debug(fmd_module_hdl("zfs-retire"),
"matched vdev %llu", guid);
return (nv);
}
}
if (nvlist_lookup_nvlist_array(nv, ZPOOL_CONFIG_CHILDREN,
@@ -114,7 +122,8 @@ find_vdev(libzfs_handle_t *zhdl, nvlist_t *nv, uint64_t search_guid)
return (NULL);
for (c = 0; c < children; c++) {
if ((ret = find_vdev(zhdl, child[c], search_guid)) != NULL)
if ((ret = find_vdev(zhdl, child[c], search_fru,
search_guid)) != NULL)
return (ret);
}
@@ -123,16 +132,8 @@ find_vdev(libzfs_handle_t *zhdl, nvlist_t *nv, uint64_t search_guid)
return (NULL);
for (c = 0; c < children; c++) {
if ((ret = find_vdev(zhdl, child[c], search_guid)) != NULL)
return (ret);
}
if (nvlist_lookup_nvlist_array(nv, ZPOOL_CONFIG_SPARES,
&child, &children) != 0)
return (NULL);
for (c = 0; c < children; c++) {
if ((ret = find_vdev(zhdl, child[c], search_guid)) != NULL)
if ((ret = find_vdev(zhdl, child[c], search_fru,
search_guid)) != NULL)
return (ret);
}
@@ -166,7 +167,8 @@ find_by_guid(libzfs_handle_t *zhdl, uint64_t pool_guid, uint64_t vdev_guid,
}
if (vdev_guid != 0) {
if ((*vdevp = find_vdev(zhdl, nvroot, vdev_guid)) == NULL) {
if ((*vdevp = find_vdev(zhdl, nvroot, NULL,
vdev_guid)) == NULL) {
zpool_close(zhp);
return (NULL);
}
@@ -175,37 +177,72 @@ find_by_guid(libzfs_handle_t *zhdl, uint64_t pool_guid, uint64_t vdev_guid,
return (zhp);
}
#ifdef HAVE_LIBTOPO
static int
search_pool(zpool_handle_t *zhp, void *data)
{
find_cbdata_t *cbp = data;
nvlist_t *config;
nvlist_t *nvroot;
config = zpool_get_config(zhp, NULL);
if (nvlist_lookup_nvlist(config, ZPOOL_CONFIG_VDEV_TREE,
&nvroot) != 0) {
zpool_close(zhp);
return (0);
}
if ((cbp->cb_vdev = find_vdev(zpool_get_handle(zhp), nvroot,
cbp->cb_fru, 0)) != NULL) {
cbp->cb_zhp = zhp;
return (1);
}
zpool_close(zhp);
return (0);
}
/*
* Given a FRU FMRI, find the matching pool and vdev.
*/
static zpool_handle_t *
find_by_fru(libzfs_handle_t *zhdl, const char *fru, nvlist_t **vdevp)
{
find_cbdata_t cb;
cb.cb_fru = fru;
cb.cb_zhp = NULL;
if (zpool_iter(zhdl, search_pool, &cb) != 1)
return (NULL);
*vdevp = cb.cb_vdev;
return (cb.cb_zhp);
}
#endif /* HAVE_LIBTOPO */
/*
* Given a vdev, attempt to replace it with every known spare until one
* succeeds or we run out of devices to try.
* Return whether we were successful or not in replacing the device.
* succeeds.
*/
static boolean_t
static void
replace_with_spare(fmd_hdl_t *hdl, zpool_handle_t *zhp, nvlist_t *vdev)
{
nvlist_t *config, *nvroot, *replacement;
nvlist_t **spares;
uint_t s, nspares;
char *dev_name;
zprop_source_t source;
int ashift;
config = zpool_get_config(zhp, NULL);
if (nvlist_lookup_nvlist(config, ZPOOL_CONFIG_VDEV_TREE,
&nvroot) != 0)
return (B_FALSE);
return;
/*
* Find out if there are any hot spares available in the pool.
*/
if (nvlist_lookup_nvlist_array(nvroot, ZPOOL_CONFIG_SPARES,
&spares, &nspares) != 0)
return (B_FALSE);
/*
* lookup "ashift" pool property, we may need it for the replacement
*/
ashift = zpool_get_prop_int(zhp, ZPOOL_PROP_ASHIFT, &source);
return;
replacement = fmd_nvl_alloc(hdl, FMD_SLEEP);
@@ -225,11 +262,6 @@ replace_with_spare(fmd_hdl_t *hdl, zpool_handle_t *zhp, nvlist_t *vdev)
&spare_name) != 0)
continue;
/* if set, add the "ashift" pool property to the spare nvlist */
if (source != ZPROP_SRC_DEFAULT)
(void) nvlist_add_uint64(spares[s],
ZPOOL_CONFIG_ASHIFT, ashift);
(void) nvlist_add_nvlist_array(replacement,
ZPOOL_CONFIG_CHILDREN, &spares[s], 1);
@@ -237,17 +269,12 @@ replace_with_spare(fmd_hdl_t *hdl, zpool_handle_t *zhp, nvlist_t *vdev)
dev_name, basename(spare_name));
if (zpool_vdev_attach(zhp, dev_name, spare_name,
replacement, B_TRUE, B_FALSE) == 0) {
free(dev_name);
nvlist_free(replacement);
return (B_TRUE);
}
replacement, B_TRUE) == 0)
break;
}
free(dev_name);
nvlist_free(replacement);
return (B_FALSE);
}
/*
@@ -262,6 +289,10 @@ zfs_vdev_repair(fmd_hdl_t *hdl, nvlist_t *nvl)
zfs_retire_data_t *zdp = fmd_hdl_getspecific(hdl);
zfs_retire_repaired_t *zrp;
uint64_t pool_guid, vdev_guid;
#ifdef HAVE_LIBTOPO
nvlist_t *asru;
#endif
if (nvlist_lookup_uint64(nvl, FM_EREPORT_PAYLOAD_ZFS_POOL_GUID,
&pool_guid) != 0 || nvlist_lookup_uint64(nvl,
FM_EREPORT_PAYLOAD_ZFS_VDEV_GUID, &vdev_guid) != 0)
@@ -284,6 +315,47 @@ zfs_vdev_repair(fmd_hdl_t *hdl, nvlist_t *nvl)
return;
}
#ifdef HAVE_LIBTOPO
asru = fmd_nvl_alloc(hdl, FMD_SLEEP);
(void) nvlist_add_uint8(asru, FM_VERSION, ZFS_SCHEME_VERSION0);
(void) nvlist_add_string(asru, FM_FMRI_SCHEME, FM_FMRI_SCHEME_ZFS);
(void) nvlist_add_uint64(asru, FM_FMRI_ZFS_POOL, pool_guid);
(void) nvlist_add_uint64(asru, FM_FMRI_ZFS_VDEV, vdev_guid);
/*
* We explicitly check for the unusable state here to make sure we
* aren't responding to a transient state change. As part of opening a
* vdev, it's possible to see the 'statechange' event, only to be
* followed by a vdev failure later. If we don't check the current
* state of the vdev (or pool) before marking it repaired, then we risk
* generating spurious repair events followed immediately by the same
* diagnosis.
*
* This assumes that the ZFS scheme code associated unusable (i.e.
* isolated) with its own definition of faulty state. In the case of a
* DEGRADED leaf vdev (due to checksum errors), this is not the case.
* This works, however, because the transient state change is not
* posted in this case. This could be made more explicit by not
* relying on the scheme's unusable callback and instead directly
* checking the vdev state, where we could correctly account for
* DEGRADED state.
*/
if (!fmd_nvl_fmri_unusable(hdl, asru) && fmd_nvl_fmri_has_fault(hdl,
asru, FMD_HAS_FAULT_ASRU, NULL)) {
topo_hdl_t *thp;
char *fmri = NULL;
int err;
thp = fmd_hdl_topo_hold(hdl, TOPO_VERSION);
if (topo_fmri_nvl2str(thp, asru, &fmri, &err) == 0)
(void) fmd_repair_asru(hdl, fmri);
fmd_hdl_topo_rele(hdl, thp);
topo_hdl_strfree(thp, fmri);
}
nvlist_free(asru);
#endif
zrp = fmd_hdl_alloc(hdl, sizeof (zfs_retire_repaired_t), FMD_SLEEP);
zrp->zrr_next = zdp->zrd_repaired;
zrp->zrr_pool = pool_guid;
@@ -319,19 +391,11 @@ zfs_retire_recv(fmd_hdl_t *hdl, fmd_event_t *ep, nvlist_t *nvl,
fmd_hdl_debug(hdl, "zfs_retire_recv: '%s'", class);
nvlist_lookup_uint64(nvl, FM_EREPORT_PAYLOAD_ZFS_VDEV_STATE, &state);
/*
* If this is a resource notifying us of device removal then simply
* check for an available spare and continue unless the device is a
* l2arc vdev, in which case we just offline it.
* If this is a resource notifying us of device removal, then simply
* check for an available spare and continue.
*/
if (strcmp(class, "resource.fs.zfs.removed") == 0 ||
(strcmp(class, "resource.fs.zfs.statechange") == 0 &&
state == VDEV_STATE_REMOVED)) {
char *devtype;
char *devname;
if (strcmp(class, "resource.fs.zfs.removed") == 0) {
if (nvlist_lookup_uint64(nvl, FM_EREPORT_PAYLOAD_ZFS_POOL_GUID,
&pool_guid) != 0 ||
nvlist_lookup_uint64(nvl, FM_EREPORT_PAYLOAD_ZFS_VDEV_GUID,
@@ -342,20 +406,8 @@ zfs_retire_recv(fmd_hdl_t *hdl, fmd_event_t *ep, nvlist_t *nvl,
&vdev)) == NULL)
return;
devname = zpool_vdev_name(NULL, zhp, vdev, B_FALSE);
/* Can't replace l2arc with a spare: offline the device */
if (nvlist_lookup_string(nvl, FM_EREPORT_PAYLOAD_ZFS_VDEV_TYPE,
&devtype) == 0 && strcmp(devtype, VDEV_TYPE_L2CACHE) == 0) {
fmd_hdl_debug(hdl, "zpool_vdev_offline '%s'", devname);
zpool_vdev_offline(zhp, devname, B_TRUE);
} else if (!fmd_prop_get_int32(hdl, "spare_on_remove") ||
replace_with_spare(hdl, zhp, vdev) == B_FALSE) {
/* Could not handle with spare */
fmd_hdl_debug(hdl, "no spare for '%s'", devname);
}
free(devname);
if (fmd_prop_get_int32(hdl, "spare_on_remove"))
replace_with_spare(hdl, zhp, vdev);
zpool_close(zhp);
return;
}
@@ -364,11 +416,12 @@ zfs_retire_recv(fmd_hdl_t *hdl, fmd_event_t *ep, nvlist_t *nvl,
return;
/*
* Note: on Linux statechange events are more than just
* Note: on zfsonlinux statechange events are more than just
* healthy ones so we need to confirm the actual state value.
*/
if (strcmp(class, "resource.fs.zfs.statechange") == 0 &&
state == VDEV_STATE_HEALTHY) {
nvlist_lookup_uint64(nvl, FM_EREPORT_PAYLOAD_ZFS_VDEV_STATE,
&state) == 0 && state == VDEV_STATE_HEALTHY) {
zfs_vdev_repair(hdl, nvl);
return;
}
@@ -424,7 +477,39 @@ zfs_retire_recv(fmd_hdl_t *hdl, fmd_event_t *ep, nvlist_t *nvl,
}
if (is_disk) {
#ifdef HAVE_LIBTOPO
/*
* This is a disk fault. Lookup the FRU, convert it to
* an FMRI string, and attempt to find a matching vdev.
*/
if (nvlist_lookup_nvlist(fault, FM_FAULT_FRU,
&fru) != 0 ||
nvlist_lookup_string(fru, FM_FMRI_SCHEME,
&scheme) != 0)
continue;
if (strcmp(scheme, FM_FMRI_SCHEME_HC) != 0)
continue;
thp = fmd_hdl_topo_hold(hdl, TOPO_VERSION);
if (topo_fmri_nvl2str(thp, fru, &fmri, &err) != 0) {
fmd_hdl_topo_rele(hdl, thp);
continue;
}
zhp = find_by_fru(zhdl, fmri, &vdev);
topo_hdl_strfree(thp, fmri);
fmd_hdl_topo_rele(hdl, thp);
if (zhp == NULL)
continue;
(void) nvlist_lookup_uint64(vdev,
ZPOOL_CONFIG_GUID, &vdev_guid);
aux = VDEV_AUX_EXTERNAL;
#else
continue;
#endif
} else {
/*
* This is a ZFS fault. Lookup the resource, and
@@ -498,7 +583,7 @@ zfs_retire_recv(fmd_hdl_t *hdl, fmd_event_t *ep, nvlist_t *nvl,
/*
* Attempt to substitute a hot spare.
*/
(void) replace_with_spare(hdl, zhp, vdev);
replace_with_spare(hdl, zhp, vdev);
zpool_close(zhp);
}
@@ -530,7 +615,7 @@ _zfs_retire_init(fmd_hdl_t *hdl)
zfs_retire_data_t *zdp;
libzfs_handle_t *zhdl;
if ((zhdl = libzfs_init()) == NULL)
if ((zhdl = __libzfs_init()) == NULL)
return;
if (fmd_hdl_register(hdl, FMD_API_VERSION, &fmd_info) != 0) {
@@ -551,7 +636,7 @@ _zfs_retire_fini(fmd_hdl_t *hdl)
if (zdp != NULL) {
zfs_retire_clear_data(hdl, zdp);
libzfs_fini(zdp->zrd_hdl);
__libzfs_fini(zdp->zrd_hdl);
fmd_hdl_free(hdl, zdp, sizeof (zfs_retire_data_t));
}
}
+4 -29
@@ -1,6 +1,6 @@
/*
* This file is part of the ZFS Event Daemon (ZED).
*
* This file is part of the ZFS Event Daemon (ZED)
* for ZFS on Linux (ZoL) <http://zfsonlinux.org/>.
* Developed at Lawrence Livermore National Laboratory (LLNL-CODE-403049).
* Copyright (C) 2013-2014 Lawrence Livermore National Security, LLC.
* Refer to the ZoL git commit log for authoritative copyright attribution.
@@ -263,43 +263,18 @@ main(int argc, char *argv[])
if (zed_conf_read_state(zcp, &saved_eid, saved_etime) < 0)
exit(EXIT_FAILURE);
idle:
/*
* If -I is specified, attempt to open /dev/zfs repeatedly until
* successful.
*/
do {
if (!zed_event_init(zcp))
break;
/* Wait for some time and try again. tunable? */
sleep(30);
} while (!_got_exit && zcp->do_idle);
if (_got_exit)
goto out;
zed_event_init(zcp);
zed_event_seek(zcp, saved_eid, saved_etime);
while (!_got_exit) {
int rv;
if (_got_hup) {
_got_hup = 0;
(void) zed_conf_scan_dir(zcp);
}
rv = zed_event_service(zcp);
/* ENODEV: When kernel module is unloaded (osx) */
if (rv == ENODEV)
break;
zed_event_service(zcp);
}
zed_log_msg(LOG_NOTICE, "Exiting");
zed_event_fini(zcp);
if (zcp->do_idle && !_got_exit)
goto idle;
out:
zed_conf_destroy(zcp);
zed_log_fini();
exit(EXIT_SUCCESS);
-1
@@ -1 +0,0 @@
history_event-zfs-list-cacher.sh
-53
@@ -1,53 +0,0 @@
include $(top_srcdir)/config/Rules.am
include $(top_srcdir)/config/Substfiles.am
EXTRA_DIST += README
zedconfdir = $(sysconfdir)/zfs/zed.d
dist_zedconf_DATA = \
zed-functions.sh \
zed.rc
zedexecdir = $(zfsexecdir)/zed.d
dist_zedexec_SCRIPTS = \
all-debug.sh \
all-syslog.sh \
data-notify.sh \
generic-notify.sh \
resilver_finish-notify.sh \
scrub_finish-notify.sh \
statechange-led.sh \
statechange-notify.sh \
vdev_clear-led.sh \
vdev_attach-led.sh \
pool_import-led.sh \
resilver_finish-start-scrub.sh \
trim_finish-notify.sh
nodist_zedexec_SCRIPTS = history_event-zfs-list-cacher.sh
SUBSTFILES += $(nodist_zedexec_SCRIPTS)
zedconfdefaults = \
all-syslog.sh \
data-notify.sh \
history_event-zfs-list-cacher.sh \
resilver_finish-notify.sh \
scrub_finish-notify.sh \
statechange-led.sh \
statechange-notify.sh \
vdev_clear-led.sh \
vdev_attach-led.sh \
pool_import-led.sh \
resilver_finish-start-scrub.sh
install-data-hook:
$(MKDIR_P) "$(DESTDIR)$(zedconfdir)"
for f in $(zedconfdefaults); do \
test -f "$(DESTDIR)$(zedconfdir)/$${f}" -o \
-L "$(DESTDIR)$(zedconfdir)/$${f}" || \
ln -s "$(zedexecdir)/$${f}" "$(DESTDIR)$(zedconfdir)"; \
done
chmod 0600 "$(DESTDIR)$(zedconfdir)/zed.rc"
+4 -40
@@ -1,50 +1,14 @@
#!/bin/sh
#
# Copyright (C) 2013-2014 Lawrence Livermore National Security, LLC.
# Copyright (c) 2020 by Delphix. All rights reserved.
#
#
# Log the zevent via syslog.
#
[ -f "${ZED_ZEDLET_DIR}/zed.rc" ] && . "${ZED_ZEDLET_DIR}/zed.rc"
. "${ZED_ZEDLET_DIR}/zed-functions.sh"
zed_exit_if_ignoring_this_event
# build a string of name=value pairs for this event
msg="eid=${ZEVENT_EID} class=${ZEVENT_SUBCLASS}"
if [ "${ZED_SYSLOG_DISPLAY_GUIDS}" = "1" ]; then
[ -n "${ZEVENT_POOL_GUID}" ] && msg="${msg} pool_guid=${ZEVENT_POOL_GUID}"
[ -n "${ZEVENT_VDEV_GUID}" ] && msg="${msg} vdev_guid=${ZEVENT_VDEV_GUID}"
else
[ -n "${ZEVENT_POOL}" ] && msg="${msg} pool='${ZEVENT_POOL}'"
[ -n "${ZEVENT_VDEV_PATH}" ] && msg="${msg} vdev=$(basename "${ZEVENT_VDEV_PATH}")"
fi
# log pool state if state is anything other than 'ACTIVE'
[ -n "${ZEVENT_POOL_STATE_STR}" ] && [ "$ZEVENT_POOL_STATE" -ne 0 ] && \
msg="${msg} pool_state=${ZEVENT_POOL_STATE_STR}"
# Log the following payload nvpairs if they are present
[ -n "${ZEVENT_VDEV_STATE_STR}" ] && msg="${msg} vdev_state=${ZEVENT_VDEV_STATE_STR}"
[ -n "${ZEVENT_CKSUM_ALGORITHM}" ] && msg="${msg} algorithm=${ZEVENT_CKSUM_ALGORITHM}"
[ -n "${ZEVENT_ZIO_SIZE}" ] && msg="${msg} size=${ZEVENT_ZIO_SIZE}"
[ -n "${ZEVENT_ZIO_OFFSET}" ] && msg="${msg} offset=${ZEVENT_ZIO_OFFSET}"
[ -n "${ZEVENT_ZIO_PRIORITY}" ] && msg="${msg} priority=${ZEVENT_ZIO_PRIORITY}"
[ -n "${ZEVENT_ZIO_ERR}" ] && msg="${msg} err=${ZEVENT_ZIO_ERR}"
[ -n "${ZEVENT_ZIO_FLAGS}" ] && msg="${msg} flags=$(printf '0x%x' "${ZEVENT_ZIO_FLAGS}")"
# log delays that are >= 10 milisec
[ -n "${ZEVENT_ZIO_DELAY}" ] && [ "$ZEVENT_ZIO_DELAY" -gt 10000000 ] && \
msg="${msg} delay=$((ZEVENT_ZIO_DELAY / 1000000))ms"
# list the bookmark data together
[ -n "${ZEVENT_ZIO_OBJSET}" ] && \
msg="${msg} bookmark=${ZEVENT_ZIO_OBJSET}:${ZEVENT_ZIO_OBJECT}:${ZEVENT_ZIO_LEVEL}:${ZEVENT_ZIO_BLKID}"
zed_log_msg "${msg}"
zed_log_msg "eid=${ZEVENT_EID}" "class=${ZEVENT_SUBCLASS}" \
"${ZEVENT_POOL_GUID:+"pool_guid=${ZEVENT_POOL_GUID}"}" \
"${ZEVENT_VDEV_PATH:+"vdev_path=${ZEVENT_VDEV_PATH}"}" \
"${ZEVENT_VDEV_STATE_STR:+"vdev_state=${ZEVENT_VDEV_STATE_STR}"}"
exit 0
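The shorter zed_log_msg call above relies on the POSIX `${parameter:+word}` expansion, which yields `word` only when the parameter is set and non-empty. The sketch below illustrates that expansion in isolation. The `join_nonempty` helper and the sample values are hypothetical and are not part of ZED; the helper only makes the effect of the expansion visible.

```shell
#!/bin/sh
# Hypothetical helper (not part of ZED): join the non-empty arguments
# with single spaces and print the result.
join_nonempty() {
	out=""
	for arg in "$@"; do
		[ -n "${arg}" ] || continue
		out="${out:+${out} }${arg}"
	done
	printf '%s\n' "${out}"
}

# Sample values for illustration only.
ZEVENT_EID=42
ZEVENT_SUBCLASS="statechange"
ZEVENT_POOL_GUID=""
ZEVENT_VDEV_PATH="/dev/sda1"

# ${VAR:+word} expands to word only if VAR is set and non-empty, so the
# pool_guid pair becomes an empty argument here.
line=$(join_nonempty "eid=${ZEVENT_EID}" "class=${ZEVENT_SUBCLASS}" \
    "${ZEVENT_POOL_GUID:+pool_guid=${ZEVENT_POOL_GUID}}" \
    "${ZEVENT_VDEV_PATH:+vdev_path=${ZEVENT_VDEV_PATH}}")

printf '%s\n' "${line}"
# prints: eid=42 class=statechange vdev_path=/dev/sda1
```

Note that an unset or empty variable still produces an empty argument in this form; the sketch drops it explicitly in the helper.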
@@ -1,85 +0,0 @@
#!/bin/sh
#
# Track changes to enumerated pools for use in early-boot
set -ef
FSLIST_DIR="@sysconfdir@/zfs/zfs-list.cache"
FSLIST_TMP="@runstatedir@/zfs-list.cache.new"
FSLIST="${FSLIST_DIR}/${ZEVENT_POOL}"
# If the pool specific cache file is not writeable, abort
[ -w "${FSLIST}" ] || exit 0
[ -f "${ZED_ZEDLET_DIR}/zed.rc" ] && . "${ZED_ZEDLET_DIR}/zed.rc"
. "${ZED_ZEDLET_DIR}/zed-functions.sh"
zed_exit_if_ignoring_this_event
zed_check_cmd "${ZFS}" sort diff grep
# If we are acting on a snapshot, we have nothing to do
printf '%s' "${ZEVENT_HISTORY_DSNAME}" | grep '@' && exit 0
# We obtain a lock on zfs-list to avoid any simultaneous writes.
# If we run into trouble, log and drop the lock
abort_alter() {
zed_log_msg "Error updating zfs-list.cache!"
zed_unlock zfs-list
}
finished() {
zed_unlock zfs-list
trap - EXIT
exit 0
}
case "${ZEVENT_HISTORY_INTERNAL_NAME}" in
create|"finish receiving"|import|destroy|rename)
;;
export)
zed_lock zfs-list
trap abort_alter EXIT
echo > "${FSLIST}"
finished
;;
set|inherit)
# Only act if one of the tracked properties is altered.
case "${ZEVENT_HISTORY_INTERNAL_STR%%=*}" in
canmount|mountpoint|atime|relatime|devices|exec|readonly| \
setuid|nbmand|encroot|keylocation|org.openzfs.systemd:requires| \
org.openzfs.systemd:requires-mounts-for| \
org.openzfs.systemd:before|org.openzfs.systemd:after| \
org.openzfs.systemd:wanted-by|org.openzfs.systemd:required-by| \
org.openzfs.systemd:nofail|org.openzfs.systemd:ignore \
) ;;
*) exit 0 ;;
esac
;;
*)
# Ignore all other events.
exit 0
;;
esac
zed_lock zfs-list
trap abort_alter EXIT
PROPS="name,mountpoint,canmount,atime,relatime,devices,exec\
,readonly,setuid,nbmand,encroot,keylocation\
,org.openzfs.systemd:requires,org.openzfs.systemd:requires-mounts-for\
,org.openzfs.systemd:before,org.openzfs.systemd:after\
,org.openzfs.systemd:wanted-by,org.openzfs.systemd:required-by\
,org.openzfs.systemd:nofail,org.openzfs.systemd:ignore"
"${ZFS}" list -H -t filesystem -o $PROPS -r "${ZEVENT_POOL}" > "${FSLIST_TMP}"
# Sort the output so that it is stable
sort "${FSLIST_TMP}" -o "${FSLIST_TMP}"
# Don't modify the file if it hasn't changed
diff -q "${FSLIST_TMP}" "${FSLIST}" || mv "${FSLIST_TMP}" "${FSLIST}"
rm -f "${FSLIST_TMP}"
finished
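The tail of the removed zedlet above follows a "sort, compare, replace only on change" pattern so that the cache file is not rewritten when nothing changed. A minimal sketch of that pattern follows, assuming a scratch directory and a hypothetical `update_cache` function; unlike the zedlet, it also creates the cache file when it does not yet exist, and it omits the locking done with zed_lock and zed_unlock.

```shell
#!/bin/sh
# Hypothetical scratch paths for illustration only.
workdir=$(mktemp -d)
cache="${workdir}/pool.cache"
tmp="${workdir}/pool.cache.new"

# Hypothetical function: rebuild the cache from the listing in $1.
# Sets $status to "updated" or "unchanged".
update_cache() {
	# Sort the listing so that the output is stable.
	sort "$1" > "${tmp}"
	# Replace the cache only if the sorted listing differs from it
	# (a missing cache file also counts as a difference).
	if diff -q "${tmp}" "${cache}" >/dev/null 2>&1; then
		status="unchanged"
	else
		mv "${tmp}" "${cache}"
		status="updated"
	fi
	rm -f "${tmp}"
}

printf 'tank/b\ntank/a\n' > "${workdir}/listing"

update_cache "${workdir}/listing"; first="${status}"
update_cache "${workdir}/listing"; second="${status}"
result=$(cat "${cache}")
rm -rf "${workdir}"

printf '%s %s\n' "${first}" "${second}"
# prints: updated unchanged
```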
@@ -5,12 +5,10 @@
# Exit codes:
# 1: Internal error
# 2: Script wasn't enabled in zed.rc
# 3: Scrubs are automatically started for sequential resilvers
[ -f "${ZED_ZEDLET_DIR}/zed.rc" ] && . "${ZED_ZEDLET_DIR}/zed.rc"
. "${ZED_ZEDLET_DIR}/zed-functions.sh"
[ "${ZED_SCRUB_AFTER_RESILVER}" = "1" ] || exit 2
[ "${ZEVENT_RESILVER_TYPE}" != "sequential" ] || exit 3
[ -n "${ZEVENT_POOL}" ] || exit 1
[ -n "${ZEVENT_SUBCLASS}" ] || exit 1
zed_check_cmd "${ZPOOL}" || exit 1
+3 -3
@@ -20,7 +20,7 @@
#
# Exit codes:
# 0: enclosure led successfully set
# 1: enclosure leds not available
# 1: enclosure leds not not available
# 2: enclosure leds administratively disabled
# 3: The led sysfs path passed from ZFS does not exist
# 4: $ZPOOL not set
@@ -68,7 +68,7 @@ check_and_set_led()
# timeout.
for _ in $(seq 1 5); do
# We want to check the current state first, since writing to the
# 'fault' entry always causes a SES command, even if the
# 'fault' entry always always causes a SES command, even if the
# current state is already what you want.
current=$(cat "${file}")
@@ -165,7 +165,7 @@ process_pool()
fi
}
if [ -n "$ZEVENT_VDEV_ENC_SYSFS_PATH" ] && [ -n "$ZEVENT_VDEV_STATE_STR" ] ; then
if [ ! -z "$ZEVENT_VDEV_ENC_SYSFS_PATH" ] && [ ! -z "$ZEVENT_VDEV_STATE_STR" ] ; then
# Got a statechange for an individual VDEV
val=$(state_to_val "$ZEVENT_VDEV_STATE_STR")
vdev=$(basename "$ZEVENT_VDEV_PATH")
-37
@@ -1,37 +0,0 @@
#!/bin/sh
#
# Send notification in response to a TRIM_FINISH. The event
# will be received for each vdev in the pool which was trimmed.
#
# Exit codes:
# 0: notification sent
# 1: notification failed
# 2: notification not configured
# 9: internal error
[ -f "${ZED_ZEDLET_DIR}/zed.rc" ] && . "${ZED_ZEDLET_DIR}/zed.rc"
. "${ZED_ZEDLET_DIR}/zed-functions.sh"
[ -n "${ZEVENT_POOL}" ] || exit 9
[ -n "${ZEVENT_SUBCLASS}" ] || exit 9
zed_check_cmd "${ZPOOL}" || exit 9
umask 077
note_subject="ZFS ${ZEVENT_SUBCLASS} event for ${ZEVENT_POOL} on $(hostname)"
note_pathname="${TMPDIR:="/tmp"}/$(basename -- "$0").${ZEVENT_EID}.$$"
{
echo "ZFS has finished a trim:"
echo
echo " eid: ${ZEVENT_EID}"
echo " class: ${ZEVENT_SUBCLASS}"
echo " host: $(hostname)"
echo " time: ${ZEVENT_TIME_STRING}"
"${ZPOOL}" status -t "${ZEVENT_POOL}"
} > "${note_pathname}"
zed_notify "${note_subject}" "${note_pathname}"; rv=$?
rm -f "${note_pathname}"
exit "${rv}"
+1 -79
@@ -202,10 +202,6 @@ zed_notify()
[ "${rv}" -eq 0 ] && num_success=$((num_success + 1))
[ "${rv}" -eq 1 ] && num_failure=$((num_failure + 1))
zed_notify_slack_webhook "${subject}" "${pathname}"; rv=$?
[ "${rv}" -eq 0 ] && num_success=$((num_success + 1))
[ "${rv}" -eq 1 ] && num_failure=$((num_failure + 1))
[ "${num_success}" -gt 0 ] && return 0
[ "${num_failure}" -gt 0 ] && return 1
return 2
@@ -363,80 +359,6 @@ zed_notify_pushbullet()
}
# zed_notify_slack_webhook (subject, pathname)
#
# Notification via Slack Webhook <https://api.slack.com/incoming-webhooks>.
# The Webhook URL (ZED_SLACK_WEBHOOK_URL) identifies this client to the
# Slack channel.
#
# Requires awk, curl, and sed executables to be installed in the standard PATH.
#
# References
# https://api.slack.com/incoming-webhooks
#
# Arguments
# subject: notification subject
# pathname: pathname containing the notification message (OPTIONAL)
#
# Globals
# ZED_SLACK_WEBHOOK_URL
#
# Return
# 0: notification sent
# 1: notification failed
# 2: not configured
#
zed_notify_slack_webhook()
{
[ -n "${ZED_SLACK_WEBHOOK_URL}" ] || return 2
local subject="$1"
local pathname="${2:-"/dev/null"}"
local msg_body
local msg_tag
local msg_json
local msg_out
local msg_err
local url="${ZED_SLACK_WEBHOOK_URL}"
[ -n "${subject}" ] || return 1
if [ ! -r "${pathname}" ]; then
zed_log_err "slack webhook cannot read \"${pathname}\""
return 1
fi
zed_check_cmd "awk" "curl" "sed" || return 1
# Escape the following characters in the message body for JSON:
# newline, backslash, double quote, horizontal tab, vertical tab,
# and carriage return.
#
msg_body="$(awk '{ ORS="\\n" } { gsub(/\\/, "\\\\"); gsub(/"/, "\\\"");
gsub(/\t/, "\\t"); gsub(/\f/, "\\f"); gsub(/\r/, "\\r"); print }' \
"${pathname}")"
# Construct the JSON message for posting.
#
msg_json="$(printf '{"text": "*%s*\n%s"}' "${subject}" "${msg_body}" )"
# Send the POST request and check for errors.
#
msg_out="$(curl -X POST "${url}" \
--header "Content-Type: application/json" --data-binary "${msg_json}" \
2>/dev/null)"; rv=$?
if [ "${rv}" -ne 0 ]; then
zed_log_err "curl exit=${rv}"
return 1
fi
msg_err="$(echo "${msg_out}" \
| sed -n -e 's/.*"error" *:.*"message" *: *"\([^"]*\)".*/\1/p')"
if [ -n "${msg_err}" ]; then
zed_log_err "slack webhook \"${msg_err}"\"
return 1
fi
return 0
}
# zed_rate_limit (tag, [interval])
#
# Check whether an event of a given type [tag] has already occurred within the
@@ -512,7 +434,7 @@ zed_guid_to_pool()
fi
guid=$(printf "%llu" "$1")
if [ -n "$guid" ] ; then
if [ ! -z "$guid" ] ; then
$ZPOOL get -H -ovalue,name guid | awk '$1=='"$guid"' {print $2}'
fi
}
+4 -18
@@ -52,9 +52,9 @@
##
# Send notifications for 'ereport.fs.zfs.data' events.
# Disabled by default, any non-empty value will enable the feature.
# Disabled by default
#
#ZED_NOTIFY_DATA=
#ZED_NOTIFY_DATA=1
##
# Pushbullet access token.
@@ -74,14 +74,6 @@
#
#ZED_PUSHBULLET_CHANNEL_TAG=""
##
# Slack Webhook URL.
# This allows posting to the given channel and includes an access token.
# <https://api.slack.com/incoming-webhooks>
# Disabled by default; uncomment to enable.
#
#ZED_SLACK_WEBHOOK_URL=""
##
# Default directory for zed state files.
#
@@ -96,8 +88,7 @@ ZED_USE_ENCLOSURE_LEDS=1
##
# Run a scrub after every resilver
# Disabled by default, 1 to enable and 0 to disable.
#ZED_SCRUB_AFTER_RESILVER=0
#ZED_SCRUB_AFTER_RESILVER=1
##
# The syslog priority (e.g., specified as a "facility.level" pair).
@@ -118,10 +109,5 @@ ZED_USE_ENCLOSURE_LEDS=1
# Otherwise, if ZED_SYSLOG_SUBCLASS_EXCLUDE is set, the
# matching subclasses are excluded from logging.
#ZED_SYSLOG_SUBCLASS_INCLUDE="checksum|scrub_*|vdev.*"
ZED_SYSLOG_SUBCLASS_EXCLUDE="history_event"
##
# Use GUIDs instead of names when logging pool and vdevs
# Disabled by default, 1 to enable and 0 to disable.
#ZED_SYSLOG_DISPLAY_GUIDS=1
#ZED_SYSLOG_SUBCLASS_EXCLUDE="statechange|config_*|history_event"
+2 -2
@@ -1,6 +1,6 @@
/*
* This file is part of the ZFS Event Daemon (ZED).
*
* This file is part of the ZFS Event Daemon (ZED)
* for ZFS on Linux (ZoL) <http://zfsonlinux.org/>.
* Developed at Lawrence Livermore National Laboratory (LLNL-CODE-403049).
* Copyright (C) 2013-2014 Lawrence Livermore National Security, LLC.
* Refer to the ZoL git commit log for authoritative copyright attribution.
+3 -8
@@ -1,6 +1,6 @@
/*
* This file is part of the ZFS Event Daemon (ZED).
*
* This file is part of the ZFS Event Daemon (ZED)
* for ZFS on Linux (ZoL) <http://zfsonlinux.org/>.
* Developed at Lawrence Livermore National Laboratory (LLNL-CODE-403049).
* Copyright (C) 2013-2014 Lawrence Livermore National Security, LLC.
* Refer to the ZoL git commit log for authoritative copyright attribution.
@@ -153,8 +153,6 @@ _zed_conf_display_help(const char *prog, int got_err)
"Force daemon to run.");
fprintf(fp, "%*c%*s %s\n", w1, 0x20, -w2, "-F",
"Run daemon in the foreground.");
fprintf(fp, "%*c%*s %s\n", w1, 0x20, -w2, "-I",
"Idle daemon until kernel module is (re)loaded.");
fprintf(fp, "%*c%*s %s\n", w1, 0x20, -w2, "-M",
"Lock all pages in memory.");
fprintf(fp, "%*c%*s %s\n", w1, 0x20, -w2, "-P",
@@ -251,7 +249,7 @@ _zed_conf_parse_path(char **resultp, const char *path)
void
zed_conf_parse_opts(struct zed_conf *zcp, int argc, char **argv)
{
const char * const opts = ":hLVc:d:p:P:s:vfFMZI";
const char * const opts = ":hLVc:d:p:P:s:vfFMZ";
int opt;
if (!zcp || !argv || !argv[0])
@@ -276,9 +274,6 @@ zed_conf_parse_opts(struct zed_conf *zcp, int argc, char **argv)
case 'd':
_zed_conf_parse_path(&zcp->zedlet_dir, optarg);
break;
case 'I':
zcp->do_idle = 1;
break;
case 'p':
_zed_conf_parse_path(&zcp->pid_file, optarg);
break;
+2 -3
@@ -1,6 +1,6 @@
/*
* This file is part of the ZFS Event Daemon (ZED).
*
* This file is part of the ZFS Event Daemon (ZED)
* for ZFS on Linux (ZoL) <http://zfsonlinux.org/>.
* Developed at Lawrence Livermore National Laboratory (LLNL-CODE-403049).
* Copyright (C) 2013-2014 Lawrence Livermore National Security, LLC.
* Refer to the ZoL git commit log for authoritative copyright attribution.
@@ -25,7 +25,6 @@ struct zed_conf {
unsigned do_memlock:1; /* true if locking memory */
unsigned do_verbose:1; /* true if verbosity enabled */
unsigned do_zero:1; /* true if zeroing state */
unsigned do_idle:1; /* true if idle enabled */
int syslog_facility; /* syslog facility value */
int min_events; /* RESERVED FOR FUTURE USE */
int max_events; /* RESERVED FOR FUTURE USE */
+1 -2
@@ -21,7 +21,6 @@
#include <libnvpair.h>
#include <libudev.h>
#include <libzfs.h>
#include <libzutil.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
@@ -38,7 +37,7 @@
* A libudev monitor is established to monitor block device actions and pass
* them on to internal ZED logic modules. Initially, zfs_mod.c is the only
* consumer and is the Linux equivalent for the illumos syseventd ZFS SLM
* module responsible for handling disk events for ZFS.
* module responsible for handeling disk events for ZFS.
*/
pthread_t g_mon_tid;
+9 -22
@@ -1,6 +1,6 @@
/*
* This file is part of the ZFS Event Daemon (ZED).
*
* This file is part of the ZFS Event Daemon (ZED)
* for ZFS on Linux (ZoL) <http://zfsonlinux.org/>.
* Developed at Lawrence Livermore National Laboratory (LLNL-CODE-403049).
* Copyright (C) 2013-2014 Lawrence Livermore National Security, LLC.
* Refer to the ZoL git commit log for authoritative copyright attribution.
@@ -28,7 +28,6 @@
#include "zed.h"
#include "zed_conf.h"
#include "zed_disk_event.h"
#include "zed_event.h"
#include "zed_exec.h"
#include "zed_file.h"
#include "zed_log.h"
@@ -41,36 +40,25 @@
/*
* Open the libzfs interface.
*/
int
void
zed_event_init(struct zed_conf *zcp)
{
if (!zcp)
zed_log_die("Failed zed_event_init: %s", strerror(EINVAL));
zcp->zfs_hdl = libzfs_init();
if (!zcp->zfs_hdl) {
if (zcp->do_idle)
return (-1);
if (!zcp->zfs_hdl)
zed_log_die("Failed to initialize libzfs");
}
zcp->zevent_fd = open(ZFS_DEV, O_RDWR);
if (zcp->zevent_fd < 0) {
if (zcp->do_idle)
return (-1);
if (zcp->zevent_fd < 0)
zed_log_die("Failed to open \"%s\": %s",
ZFS_DEV, strerror(errno));
}
zfs_agent_init(zcp->zfs_hdl);
if (zed_disk_event_init() != 0) {
if (zcp->do_idle)
return (-1);
if (zed_disk_event_init() != 0)
zed_log_die("Failed to initialize disk events");
}
return (0);
}
/*
@@ -884,7 +872,7 @@ _zed_event_add_time_strings(uint64_t eid, zed_strings_t *zsp, int64_t etime[])
/*
* Service the next zevent, blocking until one is available.
*/
int
void
zed_event_service(struct zed_conf *zcp)
{
nvlist_t *nvl;
@@ -902,13 +890,13 @@ zed_event_service(struct zed_conf *zcp)
errno = EINVAL;
zed_log_msg(LOG_ERR, "Failed to service zevent: %s",
strerror(errno));
return (EINVAL);
return;
}
rv = zpool_events_next(zcp->zfs_hdl, &nvl, &n_dropped, ZEVENT_NONE,
zcp->zevent_fd);
if ((rv != 0) || !nvl)
return (errno);
return;
if (n_dropped > 0) {
zed_log_msg(LOG_WARNING, "Missed %d events", n_dropped);
@@ -961,5 +949,4 @@ zed_event_service(struct zed_conf *zcp)
zed_strings_destroy(zsp);
}
nvlist_free(nvl);
return (0);
}
+4 -4
@@ -1,6 +1,6 @@
/*
* This file is part of the ZFS Event Daemon (ZED).
*
* This file is part of the ZFS Event Daemon (ZED)
* for ZFS on Linux (ZoL) <http://zfsonlinux.org/>.
* Developed at Lawrence Livermore National Laboratory (LLNL-CODE-403049).
* Copyright (C) 2013-2014 Lawrence Livermore National Security, LLC.
* Refer to the ZoL git commit log for authoritative copyright attribution.
@@ -17,13 +17,13 @@
#include <stdint.h>
int zed_event_init(struct zed_conf *zcp);
void zed_event_init(struct zed_conf *zcp);
void zed_event_fini(struct zed_conf *zcp);
int zed_event_seek(struct zed_conf *zcp, uint64_t saved_eid,
int64_t saved_etime[]);
int zed_event_service(struct zed_conf *zcp);
void zed_event_service(struct zed_conf *zcp);
#endif /* !ZED_EVENT_H */
+2 -3
@@ -1,6 +1,6 @@
/*
* This file is part of the ZFS Event Daemon (ZED).
*
* This file is part of the ZFS Event Daemon (ZED)
* for ZFS on Linux (ZoL) <http://zfsonlinux.org/>.
* Developed at Lawrence Livermore National Laboratory (LLNL-CODE-403049).
* Copyright (C) 2013-2014 Lawrence Livermore National Security, LLC.
* Refer to the ZoL git commit log for authoritative copyright attribution.
@@ -22,7 +22,6 @@
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>
#include "zed_exec.h"
#include "zed_file.h"
#include "zed_log.h"
#include "zed_strings.h"
+2 -3
@@ -1,6 +1,6 @@
/*
* This file is part of the ZFS Event Daemon (ZED).
*
* This file is part of the ZFS Event Daemon (ZED)
* for ZFS on Linux (ZoL) <http://zfsonlinux.org/>.
* Developed at Lawrence Livermore National Laboratory (LLNL-CODE-403049).
* Copyright (C) 2013-2014 Lawrence Livermore National Security, LLC.
* Refer to the ZoL git commit log for authoritative copyright attribution.
@@ -16,7 +16,6 @@
#define ZED_EXEC_H
#include <stdint.h>
#include "zed_strings.h"
int zed_exec_process(uint64_t eid, const char *class, const char *subclass,
const char *dir, zed_strings_t *zedlets, zed_strings_t *envs,
+2 -3
@@ -1,6 +1,6 @@
/*
* This file is part of the ZFS Event Daemon (ZED).
*
* This file is part of the ZFS Event Daemon (ZED)
* for ZFS on Linux (ZoL) <http://zfsonlinux.org/>.
* Developed at Lawrence Livermore National Laboratory (LLNL-CODE-403049).
* Copyright (C) 2013-2014 Lawrence Livermore National Security, LLC.
* Refer to the ZoL git commit log for authoritative copyright attribution.
@@ -20,7 +20,6 @@
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include "zed_file.h"
#include "zed_log.h"
/*
+2 -2
@@ -1,6 +1,6 @@
/*
* This file is part of the ZFS Event Daemon (ZED).
*
* This file is part of the ZFS Event Daemon (ZED)
* for ZFS on Linux (ZoL) <http://zfsonlinux.org/>.
* Developed at Lawrence Livermore National Laboratory (LLNL-CODE-403049).
* Copyright (C) 2013-2014 Lawrence Livermore National Security, LLC.
* Refer to the ZoL git commit log for authoritative copyright attribution.
+2 -2
@@ -1,6 +1,6 @@
/*
* This file is part of the ZFS Event Daemon (ZED).
*
* This file is part of the ZFS Event Daemon (ZED)
* for ZFS on Linux (ZoL) <http://zfsonlinux.org/>.
* Developed at Lawrence Livermore National Laboratory (LLNL-CODE-403049).
* Copyright (C) 2013-2014 Lawrence Livermore National Security, LLC.
* Refer to the ZoL git commit log for authoritative copyright attribution.
+2 -2
@@ -1,6 +1,6 @@
/*
* This file is part of the ZFS Event Daemon (ZED).
*
* This file is part of the ZFS Event Daemon (ZED)
* for ZFS on Linux (ZoL) <http://zfsonlinux.org/>.
* Developed at Lawrence Livermore National Laboratory (LLNL-CODE-403049).
* Copyright (C) 2013-2014 Lawrence Livermore National Security, LLC.
* Refer to the ZoL git commit log for authoritative copyright attribution.
+3 -3
@@ -1,6 +1,6 @@
/*
* This file is part of the ZFS Event Daemon (ZED).
*
* This file is part of the ZFS Event Daemon (ZED)
* for ZFS on Linux (ZoL) <http://zfsonlinux.org/>.
* Developed at Lawrence Livermore National Laboratory (LLNL-CODE-403049).
* Copyright (C) 2013-2014 Lawrence Livermore National Security, LLC.
* Refer to the ZoL git commit log for authoritative copyright attribution.
@@ -108,7 +108,7 @@ _zed_strings_node_destroy(zed_strings_node_t *np)
* If [key] is specified, it will be used to index the node; otherwise,
* the string [val] will be used.
*/
static zed_strings_node_t *
zed_strings_node_t *
_zed_strings_node_create(const char *key, const char *val)
{
zed_strings_node_t *np;
+2 -2
@@ -1,6 +1,6 @@
/*
* This file is part of the ZFS Event Daemon (ZED).
*
* This file is part of the ZFS Event Daemon (ZED)
* for ZFS on Linux (ZoL) <http://zfsonlinux.org/>.
* Developed at Lawrence Livermore National Laboratory (LLNL-CODE-403049).
* Copyright (C) 2013-2014 Lawrence Livermore National Security, LLC.
* Refer to the ZoL git commit log for authoritative copyright attribution.
+11 -12
@@ -1,23 +1,22 @@
include $(top_srcdir)/config/Rules.am
DEFAULT_INCLUDES += \
-I$(top_srcdir)/include \
-I$(top_srcdir)/lib/libspl/include
sbin_PROGRAMS = zfs
zfs_SOURCES = \
zfs_iter.c \
zfs_iter.h \
zfs_main.c \
zfs_util.h \
zfs_project.c \
zfs_projectutil.h
zfs_util.h
zfs_LDADD = \
$(abs_top_builddir)/lib/libzfs/libzfs.la \
$(abs_top_builddir)/lib/libzfs_core/libzfs_core.la \
$(abs_top_builddir)/lib/libnvpair/libnvpair.la \
$(abs_top_builddir)/lib/libuutil/libuutil.la
$(top_builddir)/lib/libnvpair/libnvpair.la \
$(top_builddir)/lib/libuutil/libuutil.la \
$(top_builddir)/lib/libzpool/libzpool.la \
$(top_builddir)/lib/libzfs/libzfs.la \
$(top_builddir)/lib/libzfs_core/libzfs_core.la
zfs_LDADD += $(LTLIBINTL)
if BUILD_FREEBSD
zfs_LDADD += -lgeom -ljail
endif
zfs_LDFLAGS = -pthread
+6 -22
@@ -31,7 +31,6 @@
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>
#include <libzfs.h>
@@ -134,31 +133,16 @@ zfs_callback(zfs_handle_t *zhp, void *data)
((cb->cb_flags & ZFS_ITER_DEPTH_LIMIT) == 0 ||
cb->cb_depth < cb->cb_depth_limit)) {
cb->cb_depth++;
/*
* If we are not looking for filesystems, we don't need to
* recurse into filesystems when we are at our depth limit.
*/
if ((cb->cb_depth < cb->cb_depth_limit ||
(cb->cb_flags & ZFS_ITER_DEPTH_LIMIT) == 0 ||
(cb->cb_types &
(ZFS_TYPE_FILESYSTEM | ZFS_TYPE_VOLUME))) &&
zfs_get_type(zhp) == ZFS_TYPE_FILESYSTEM) {
if (zfs_get_type(zhp) == ZFS_TYPE_FILESYSTEM)
(void) zfs_iter_filesystems(zhp, zfs_callback, data);
}
if (((zfs_get_type(zhp) & (ZFS_TYPE_SNAPSHOT |
ZFS_TYPE_BOOKMARK)) == 0) && include_snaps) {
ZFS_TYPE_BOOKMARK)) == 0) && include_snaps)
(void) zfs_iter_snapshots(zhp,
(cb->cb_flags & ZFS_ITER_SIMPLE) != 0,
zfs_callback, data, 0, 0);
}
(cb->cb_flags & ZFS_ITER_SIMPLE) != 0, zfs_callback,
data);
if (((zfs_get_type(zhp) & (ZFS_TYPE_SNAPSHOT |
ZFS_TYPE_BOOKMARK)) == 0) && include_bmarks) {
ZFS_TYPE_BOOKMARK)) == 0) && include_bmarks)
(void) zfs_iter_bookmarks(zhp, zfs_callback, data);
}
cb->cb_depth--;
}
@@ -240,7 +224,7 @@ zfs_compare(const void *larg, const void *rarg, void *unused)
*rat = '\0';
ret = strcmp(lname, rname);
if (ret == 0 && (lat != NULL || rat != NULL)) {
if (ret == 0) {
/*
* If we're comparing a dataset to one of its snapshots, we
* always make the full dataset first.
+304 -1799
File diff suppressed because it is too large
-295
@@ -1,295 +0,0 @@
/*
* CDDL HEADER START
*
* The contents of this file are subject to the terms of the
* Common Development and Distribution License (the "License").
* You may not use this file except in compliance with the License.
*
* You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
* or http://www.opensolaris.org/os/licensing.
* See the License for the specific language governing permissions
* and limitations under the License.
*
* When distributing Covered Code, include this CDDL HEADER in each
* file and include the License file at usr/src/OPENSOLARIS.LICENSE.
* If applicable, add the following below this CDDL HEADER, with the
* fields enclosed by brackets "[]" replaced with your own identifying
* information: Portions Copyright [yyyy] [name of copyright owner]
*
* CDDL HEADER END
*/
/*
* Copyright (c) 2017, Intle Corporation. All rights reserved.
*/
#include <errno.h>
#include <getopt.h>
#include <stdio.h>
#include <stdlib.h>
#include <strings.h>
#include <unistd.h>
#include <fcntl.h>
#include <dirent.h>
#include <stddef.h>
#include <libintl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/list.h>
#include <sys/zfs_project.h>
#include "zfs_util.h"
#include "zfs_projectutil.h"
typedef struct zfs_project_item {
list_node_t zpi_list;
char zpi_name[0];
} zfs_project_item_t;
static void
zfs_project_item_alloc(list_t *head, const char *name)
{
zfs_project_item_t *zpi;
zpi = safe_malloc(sizeof (zfs_project_item_t) + strlen(name) + 1);
strcpy(zpi->zpi_name, name);
list_insert_tail(head, zpi);
}
static int
zfs_project_sanity_check(const char *name, zfs_project_control_t *zpc,
struct stat *st)
{
int ret;
ret = stat(name, st);
if (ret) {
(void) fprintf(stderr, gettext("failed to stat %s: %s\n"),
name, strerror(errno));
return (ret);
}
if (!S_ISREG(st->st_mode) && !S_ISDIR(st->st_mode)) {
(void) fprintf(stderr, gettext("only support project quota on "
"regular file or directory\n"));
return (-1);
}
if (!S_ISDIR(st->st_mode)) {
if (zpc->zpc_dironly) {
(void) fprintf(stderr, gettext(
"'-d' option on non-dir target %s\n"), name);
return (-1);
}
if (zpc->zpc_recursive) {
(void) fprintf(stderr, gettext(
"'-r' option on non-dir target %s\n"), name);
return (-1);
}
}
return (0);
}
static int
zfs_project_load_projid(const char *name, zfs_project_control_t *zpc)
{
zfsxattr_t fsx;
int ret, fd;
fd = open(name, O_RDONLY | O_NOCTTY);
if (fd < 0) {
(void) fprintf(stderr, gettext("failed to open %s: %s\n"),
name, strerror(errno));
return (fd);
}
ret = ioctl(fd, ZFS_IOC_FSGETXATTR, &fsx);
if (ret)
(void) fprintf(stderr,
gettext("failed to get xattr for %s: %s\n"),
name, strerror(errno));
else
zpc->zpc_expected_projid = fsx.fsx_projid;
close(fd);
return (ret);
}
static int
zfs_project_handle_one(const char *name, zfs_project_control_t *zpc)
{
zfsxattr_t fsx;
int ret, fd;
fd = open(name, O_RDONLY | O_NOCTTY);
if (fd < 0) {
if (errno == ENOENT && zpc->zpc_ignore_noent)
return (0);
(void) fprintf(stderr, gettext("failed to open %s: %s\n"),
name, strerror(errno));
return (fd);
}
ret = ioctl(fd, ZFS_IOC_FSGETXATTR, &fsx);
if (ret) {
(void) fprintf(stderr,
gettext("failed to get xattr for %s: %s\n"),
name, strerror(errno));
goto out;
}
switch (zpc->zpc_op) {
case ZFS_PROJECT_OP_LIST:
(void) printf("%5u %c %s\n", fsx.fsx_projid,
(fsx.fsx_xflags & ZFS_PROJINHERIT_FL) ? 'P' : '-', name);
goto out;
case ZFS_PROJECT_OP_CHECK:
if (fsx.fsx_projid == zpc->zpc_expected_projid &&
fsx.fsx_xflags & ZFS_PROJINHERIT_FL)
goto out;
if (!zpc->zpc_newline) {
char c = '\0';
(void) printf("%s%c", name, c);
goto out;
}
if (fsx.fsx_projid != zpc->zpc_expected_projid)
(void) printf("%s - project ID is not set properly "
"(%u/%u)\n", name, fsx.fsx_projid,
(uint32_t)zpc->zpc_expected_projid);
if (!(fsx.fsx_xflags & ZFS_PROJINHERIT_FL))
(void) printf("%s - project inherit flag is not set\n",
name);
goto out;
case ZFS_PROJECT_OP_CLEAR:
if (!(fsx.fsx_xflags & ZFS_PROJINHERIT_FL) &&
(zpc->zpc_keep_projid ||
fsx.fsx_projid == ZFS_DEFAULT_PROJID))
goto out;
fsx.fsx_xflags &= ~ZFS_PROJINHERIT_FL;
if (!zpc->zpc_keep_projid)
fsx.fsx_projid = ZFS_DEFAULT_PROJID;
break;
case ZFS_PROJECT_OP_SET:
if (fsx.fsx_projid == zpc->zpc_expected_projid &&
(!zpc->zpc_set_flag || fsx.fsx_xflags & ZFS_PROJINHERIT_FL))
goto out;
fsx.fsx_projid = zpc->zpc_expected_projid;
if (zpc->zpc_set_flag)
fsx.fsx_xflags |= ZFS_PROJINHERIT_FL;
break;
default:
ASSERT(0);
break;
}
ret = ioctl(fd, ZFS_IOC_FSSETXATTR, &fsx);
if (ret)
(void) fprintf(stderr,
gettext("failed to set xattr for %s: %s\n"),
name, strerror(errno));
out:
close(fd);
return (ret);
}
static int
zfs_project_handle_dir(const char *name, zfs_project_control_t *zpc,
list_t *head)
{
char fullname[PATH_MAX];
struct dirent *ent;
DIR *dir;
int ret = 0;
dir = opendir(name);
if (dir == NULL) {
if (errno == ENOENT && zpc->zpc_ignore_noent)
return (0);
ret = -errno;
(void) fprintf(stderr, gettext("failed to opendir %s: %s\n"),
name, strerror(errno));
return (ret);
}
/* Non-top item, ignore the case of being removed or renamed by race. */
zpc->zpc_ignore_noent = B_TRUE;
errno = 0;
while (!ret && (ent = readdir(dir)) != NULL) {
/* skip "." and ".." */
if (strcmp(ent->d_name, ".") == 0 ||
strcmp(ent->d_name, "..") == 0)
continue;
if (strlen(ent->d_name) + strlen(name) + 1 >=
sizeof (fullname)) {
errno = ENAMETOOLONG;
break;
}
sprintf(fullname, "%s/%s", name, ent->d_name);
ret = zfs_project_handle_one(fullname, zpc);
if (!ret && zpc->zpc_recursive && ent->d_type == DT_DIR)
zfs_project_item_alloc(head, fullname);
}
if (errno && !ret) {
ret = -errno;
(void) fprintf(stderr, gettext("failed to readdir %s: %s\n"),
name, strerror(errno));
}
closedir(dir);
return (ret);
}
int
zfs_project_handle(const char *name, zfs_project_control_t *zpc)
{
zfs_project_item_t *zpi;
struct stat st;
list_t head;
int ret;
ret = zfs_project_sanity_check(name, zpc, &st);
if (ret)
return (ret);
if ((zpc->zpc_op == ZFS_PROJECT_OP_SET ||
zpc->zpc_op == ZFS_PROJECT_OP_CHECK) &&
zpc->zpc_expected_projid == ZFS_INVALID_PROJID) {
ret = zfs_project_load_projid(name, zpc);
if (ret)
return (ret);
}
zpc->zpc_ignore_noent = B_FALSE;
ret = zfs_project_handle_one(name, zpc);
if (ret || !S_ISDIR(st.st_mode) || zpc->zpc_dironly ||
(!zpc->zpc_recursive &&
zpc->zpc_op != ZFS_PROJECT_OP_LIST &&
zpc->zpc_op != ZFS_PROJECT_OP_CHECK))
return (ret);
list_create(&head, sizeof (zfs_project_item_t),
offsetof(zfs_project_item_t, zpi_list));
zfs_project_item_alloc(&head, name);
while ((zpi = list_remove_head(&head)) != NULL) {
if (!ret)
ret = zfs_project_handle_dir(zpi->zpi_name, zpc, &head);
free(zpi);
}
return (ret);
}
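Review note: the directory walk above builds each child path with `sprintf` into a fixed `PATH_MAX` buffer after a manual length check. A minimal sketch of the same join using `snprintf`, which reports the needed length so the check and the copy cannot disagree. The helper name `join_path` is hypothetical and not part of this change.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not in the diff): write "dir/entry" into buf.
 * Returns 0 on success, -1 if the result plus its NUL does not fit.
 */
static int
join_path(char *buf, size_t buflen, const char *dir, const char *entry)
{
	/* snprintf returns the length it wanted to write, excluding the NUL */
	int n = snprintf(buf, buflen, "%s/%s", dir, entry);

	if (n < 0 || (size_t)n >= buflen)
		return (-1);
	return (0);
}
```

A caller in the style of `zfs_project_handle_dir()` could set `errno = ENAMETOOLONG` and stop the loop when the helper returns -1.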
-49
@@ -1,49 +0,0 @@
/*
* CDDL HEADER START
*
* The contents of this file are subject to the terms of the
* Common Development and Distribution License (the "License").
* You may not use this file except in compliance with the License.
*
* You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
* or http://www.opensolaris.org/os/licensing.
* See the License for the specific language governing permissions
* and limitations under the License.
*
* When distributing Covered Code, include this CDDL HEADER in each
* file and include the License file at usr/src/OPENSOLARIS.LICENSE.
* If applicable, add the following below this CDDL HEADER, with the
* fields enclosed by brackets "[]" replaced with your own identifying
* information: Portions Copyright [yyyy] [name of copyright owner]
*
* CDDL HEADER END
*/
/*
* Copyright (c) 2017, Intel Corporation. All rights reserved.
*/
#ifndef _ZFS_PROJECTUTIL_H
#define _ZFS_PROJECTUTIL_H
typedef enum {
ZFS_PROJECT_OP_DEFAULT = 0,
ZFS_PROJECT_OP_LIST = 1,
ZFS_PROJECT_OP_CHECK = 2,
ZFS_PROJECT_OP_CLEAR = 3,
ZFS_PROJECT_OP_SET = 4,
} zfs_project_ops_t;
typedef struct zfs_project_control {
uint64_t zpc_expected_projid;
zfs_project_ops_t zpc_op;
boolean_t zpc_dironly;
boolean_t zpc_ignore_noent;
boolean_t zpc_keep_projid;
boolean_t zpc_newline;
boolean_t zpc_recursive;
boolean_t zpc_set_flag;
} zfs_project_control_t;
int zfs_project_handle(const char *name, zfs_project_control_t *zpc);
#endif /* _ZFS_PROJECTUTIL_H */
+1 -1
@@ -33,7 +33,7 @@ extern "C" {
void * safe_malloc(size_t size);
void nomem(void);
extern libzfs_handle_t *g_zfs;
libzfs_handle_t *g_zfs;
#ifdef __cplusplus
}
-1
@@ -1 +0,0 @@
zfs_ids_to_path
-9
@@ -1,9 +0,0 @@
include $(top_srcdir)/config/Rules.am
sbin_PROGRAMS = zfs_ids_to_path
zfs_ids_to_path_SOURCES = \
zfs_ids_to_path.c
zfs_ids_to_path_LDADD = \
$(abs_top_builddir)/lib/libzfs/libzfs.la
-96
@@ -1,96 +0,0 @@
/*
* CDDL HEADER START
*
* The contents of this file are subject to the terms of the
* Common Development and Distribution License (the "License").
* You may not use this file except in compliance with the License.
*
* You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
* or http://www.opensolaris.org/os/licensing.
* See the License for the specific language governing permissions
* and limitations under the License.
*
* When distributing Covered Code, include this CDDL HEADER in each
* file and include the License file at usr/src/OPENSOLARIS.LICENSE.
* If applicable, add the following below this CDDL HEADER, with the
* fields enclosed by brackets "[]" replaced with your own identifying
* information: Portions Copyright [yyyy] [name of copyright owner]
*
* CDDL HEADER END
*/
/*
* Copyright (c) 2019 by Delphix. All rights reserved.
*/
#include <libintl.h>
#include <unistd.h>
#include <sys/types.h>
#include <stdint.h>
#include <libzfs.h>
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
libzfs_handle_t *g_zfs;
static void
usage(int err)
{
fprintf(stderr, "Usage: zfs_ids_to_path [-v] <pool> <objset id> "
"<object id>\n");
exit(err);
}
int
main(int argc, char **argv)
{
boolean_t verbose = B_FALSE;
int c;
while ((c = getopt(argc, argv, "v")) != -1) {
switch (c) {
case 'v':
verbose = B_TRUE;
break;
}
}
argc -= optind;
argv += optind;
if (argc != 3) {
(void) fprintf(stderr, "Incorrect number of arguments: %d\n",
argc);
usage(1);
}
uint64_t objset, object;
if (sscanf(argv[1], "%llu", (u_longlong_t *)&objset) != 1) {
(void) fprintf(stderr, "Invalid objset id: %s\n", argv[1]);
usage(2);
}
if (sscanf(argv[2], "%llu", (u_longlong_t *)&object) != 1) {
(void) fprintf(stderr, "Invalid object id: %s\n", argv[2]);
usage(3);
}
if ((g_zfs = libzfs_init()) == NULL) {
(void) fprintf(stderr, "%s\n", libzfs_error_init(errno));
return (4);
}
zpool_handle_t *pool = zpool_open(g_zfs, argv[0]);
if (pool == NULL) {
fprintf(stderr, "Could not open pool %s\n", argv[0]);
libzfs_fini(g_zfs);
return (5);
}
char pathname[PATH_MAX * 2];
if (verbose) {
zpool_obj_to_path_ds(pool, objset, object, pathname,
sizeof (pathname));
} else {
zpool_obj_to_path(pool, objset, object, pathname,
sizeof (pathname));
}
printf("%s\n", pathname);
zpool_close(pool);
libzfs_fini(g_zfs);
return (0);
}
-1
@@ -1 +0,0 @@
/zgenhostid
+1 -5
@@ -1,5 +1 @@
include $(top_srcdir)/config/Rules.am
bin_PROGRAMS = zgenhostid
zgenhostid_SOURCES = zgenhostid.c
dist_bin_SCRIPTS = zgenhostid
+61
@@ -0,0 +1,61 @@
#!/bin/bash
# Emulate genhostid(1) available on RHEL/CENTOS, for use on distros
# which do not provide that utility.
#
# Usage:
# zgenhostid
# zgenhostid <value>
#
# If /etc/hostid already exists and is size > 0, the script exits immediately
# and changes nothing. Unlike genhostid, this generates an error message.
#
# The first form generates a random hostid and stores it in /etc/hostid.
# The second form checks that the provided value is between 0x1 and 0xFFFFFFFF
# and if so, stores it in /etc/hostid. This form is not supported by
# genhostid(1).
hostid_file=/etc/hostid
function usage {
echo "$0 [value]"
echo "If $hostid_file is not present, store a hostid in it." >&2
echo "The optional value must be an 8-digit hex number between" >&2
echo "1 and 2^32-1. If no value is provided, a random one will" >&2
echo "be generated. The value must be unique among your systems." >&2
}
# hostid(1) ignores contents of /etc/hostid if size < 4 bytes. It would
# be better if this checked size >= 4 bytes but the method must be
# widely portable.
if [ -s $hostid_file ]; then
echo "$hostid_file already exists. No change made." >&2
exit 1
fi
if [ -n "$1" ]; then
host_id=$1
else
# $RANDOM goes from 0..32k-1
number=$((((RANDOM % 4) * 32768 + RANDOM) * 32768 + RANDOM))
host_id=$(printf "%08x" $number)
fi
if egrep -o '^0{8}$' <<< $host_id >/dev/null 2>&1; then
usage
exit 2
fi
if ! egrep -o '^[a-fA-F0-9]{8}$' <<< $host_id >/dev/null 2>&1; then
usage
exit 3
fi
a=${host_id:6:2}
b=${host_id:4:2}
c=${host_id:2:2}
d=${host_id:0:2}
echo -ne \\x$a\\x$b\\x$c\\x$d > $hostid_file
exit 0
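Review note: the script above assembles a 32-bit value from three `$RANDOM` draws (2 + 15 + 15 bits) and writes the hex digits back to front, so the file holds the least-significant byte first. A minimal C sketch of both steps, for illustration only. The helper names `combine_random` and `hostid_to_le_bytes` are hypothetical and do not appear in the tree.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: mirror the shell arithmetic
 * ((RANDOM % 4) * 32768 + RANDOM) * 32768 + RANDOM,
 * where the second and third inputs are assumed to be in 0..32767.
 */
static uint32_t
combine_random(uint32_t r2, uint32_t r15a, uint32_t r15b)
{
	return (((r2 % 4) * 32768u + r15a) * 32768u + r15b);
}

/* Hypothetical helper: store a hostid least-significant byte first. */
static void
hostid_to_le_bytes(uint32_t hostid, unsigned char out[4])
{
	out[0] = (unsigned char)(hostid & 0xff);
	out[1] = (unsigned char)((hostid >> 8) & 0xff);
	out[2] = (unsigned char)((hostid >> 16) & 0xff);
	out[3] = (unsigned char)((hostid >> 24) & 0xff);
}
```

With all three draws at their maximum the combined value is 0xFFFFFFFF, which is the upper bound quoted in the script's header comment.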
-152
@@ -1,152 +0,0 @@
/*
* CDDL HEADER START
*
* The contents of this file are subject to the terms of the
* Common Development and Distribution License (the "License").
* You may not use this file except in compliance with the License.
*
* You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
* or http://www.opensolaris.org/os/licensing.
* See the License for the specific language governing permissions
* and limitations under the License.
*
* When distributing Covered Code, include this CDDL HEADER in each
* file and include the License file at usr/src/OPENSOLARIS.LICENSE.
* If applicable, add the following below this CDDL HEADER, with the
* fields enclosed by brackets "[]" replaced with your own identifying
* information: Portions Copyright [yyyy] [name of copyright owner]
*
* CDDL HEADER END
*/
/*
* Copyright (c) 2020, Georgy Yakovlev. All rights reserved.
*/
#include <errno.h>
#include <fcntl.h>
#include <getopt.h>
#include <inttypes.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>
static void usage(void);
static void
usage(void)
{
(void) fprintf(stderr,
"usage: zgenhostid [-fh] [-o path] [value]\n\n"
" -f\t\t force hostid file write\n"
" -h\t\t print this usage and exit\n"
" -o <filename>\t write hostid to this file\n\n"
"If hostid file is not present, store a hostid in it.\n"
"The optional value should be an 8-digit hex number between"
" 1 and 2^32-1.\n"
"If the value is 0 or no value is provided, a random one"
" will be generated.\n"
"The value must be unique among your systems.\n");
exit(EXIT_FAILURE);
/* NOTREACHED */
}
int
main(int argc, char **argv)
{
/* default file path, can be optionally set by user */
char path[PATH_MAX] = "/etc/hostid";
/* holds converted user input or lrand48() generated value */
unsigned long input_i = 0;
int opt;
int pathlen;
int force_fwrite = 0;
while ((opt = getopt_long(argc, argv, "fo:h?", 0, 0)) != -1) {
switch (opt) {
case 'f':
force_fwrite = 1;
break;
case 'o':
pathlen = snprintf(path, sizeof (path), "%s", optarg);
if (pathlen >= sizeof (path)) {
fprintf(stderr, "%s\n", strerror(EOVERFLOW));
exit(EXIT_FAILURE);
} else if (pathlen < 1) {
fprintf(stderr, "%s\n", strerror(EINVAL));
exit(EXIT_FAILURE);
}
break;
case 'h':
case '?':
usage();
}
}
char *in_s = argv[optind];
if (in_s != NULL) {
/* increment pointer by 2 if string is 0x prefixed */
if (strncasecmp("0x", in_s, 2) == 0) {
in_s += 2;
}
/* need to be exactly 8 characters */
const char *hex = "0123456789abcdefABCDEF";
if (strlen(in_s) != 8 || strspn(in_s, hex) != 8) {
fprintf(stderr, "%s\n", strerror(ERANGE));
usage();
}
input_i = strtoul(in_s, NULL, 16);
if (errno != 0) {
perror("strtoul");
exit(EXIT_FAILURE);
}
if (input_i > UINT32_MAX) {
fprintf(stderr, "%s\n", strerror(ERANGE));
usage();
}
}
struct stat fstat;
if (force_fwrite == 0 && stat(path, &fstat) == 0 &&
S_ISREG(fstat.st_mode)) {
fprintf(stderr, "%s: %s\n", path, strerror(EEXIST));
exit(EXIT_FAILURE);
}
/*
* generate if not provided by user
* also handle unlikely zero return from lrand48()
*/
while (input_i == 0) {
srand48(getpid() ^ time(NULL));
input_i = lrand48();
}
FILE *fp = fopen(path, "wb");
if (!fp) {
perror("fopen");
exit(EXIT_FAILURE);
}
/*
* we need just 4 bytes in native endianness
* not using sethostid() because it may be missing or just a stub
*/
uint32_t hostid = input_i;
int written = fwrite(&hostid, 1, 4, fp);
if (written != 4) {
perror("fwrite");
exit(EXIT_FAILURE);
}
fclose(fp);
exit(EXIT_SUCCESS);
}
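Review note: the value parsing in `main()` above accepts an optional `0x` prefix followed by exactly eight hex digits. A minimal standalone sketch of that validation follows. The helper name `parse_hostid` is hypothetical. The sketch also clears `errno` before calling `strtoul`, which the code above does not do; that is an assumption about what a stricter version might want, since `strtoul` only sets `errno` on failure and never resets it.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>

/*
 * Hypothetical helper: parse exactly 8 hex digits, optionally 0x-prefixed.
 * Returns 0 and stores the value on success, -1 on any malformed input.
 */
static int
parse_hostid(const char *s, uint32_t *out)
{
	const char *hex = "0123456789abcdefABCDEF";
	unsigned long v;

	/* skip an optional 0x / 0X prefix */
	if (strncasecmp("0x", s, 2) == 0)
		s += 2;

	/* the remainder must be exactly 8 hex digits */
	if (strlen(s) != 8 || strspn(s, hex) != 8)
		return (-1);

	errno = 0;	/* so a stale errno is not mistaken for a failure */
	v = strtoul(s, NULL, 16);
	if (errno != 0 || v > UINT32_MAX)
		return (-1);

	*out = (uint32_t)v;
	return (0);
}
```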
+8 -5
@@ -1,7 +1,8 @@
include $(top_srcdir)/config/Rules.am
# Unconditionally enable debugging for zhack
AM_CPPFLAGS += -DDEBUG -UNDEBUG -DZFS_DEBUG
DEFAULT_INCLUDES += \
-I$(top_srcdir)/include \
-I$(top_srcdir)/lib/libspl/include
sbin_PROGRAMS = zhack
@@ -9,6 +10,8 @@ zhack_SOURCES = \
zhack.c
zhack_LDADD = \
$(abs_top_builddir)/lib/libzpool/libzpool.la \
$(abs_top_builddir)/lib/libzfs_core/libzfs_core.la \
$(abs_top_builddir)/lib/libnvpair/libnvpair.la
$(top_builddir)/lib/libnvpair/libnvpair.la \
$(top_builddir)/lib/libuutil/libuutil.la \
$(top_builddir)/lib/libzpool/libzpool.la \
$(top_builddir)/lib/libzfs/libzfs.la \
$(top_builddir)/lib/libzfs_core/libzfs_core.la
+12 -8
@@ -48,11 +48,12 @@
#include <sys/zio_compress.h>
#include <sys/zfeature.h>
#include <sys/dmu_tx.h>
#include <libzutil.h>
#include <libzfs.h>
extern boolean_t zfeature_checks_disable;
const char cmdname[] = "zhack";
libzfs_handle_t *g_zfs;
static importargs_t g_importargs;
static char *g_pool;
static boolean_t g_readonly;
@@ -103,8 +104,8 @@ fatal(spa_t *spa, void *tag, const char *fmt, ...)
/* ARGSUSED */
static int
space_delta_cb(dmu_object_type_t bonustype, const void *data,
zfs_file_info_t *zoi)
space_delta_cb(dmu_object_type_t bonustype, void *data,
uint64_t *userp, uint64_t *groupp)
{
/*
* Is it a valid type of object to track?
@@ -126,19 +127,21 @@ zhack_import(char *target, boolean_t readonly)
nvlist_t *props;
int error;
kernel_init(readonly ? SPA_MODE_READ :
(SPA_MODE_READ | SPA_MODE_WRITE));
kernel_init(readonly ? FREAD : (FREAD | FWRITE));
g_zfs = libzfs_init();
ASSERT(g_zfs != NULL);
dmu_objset_register_type(DMU_OST_ZFS, space_delta_cb);
g_readonly = readonly;
g_importargs.unique = B_TRUE;
g_importargs.can_be_active = readonly;
g_pool = strdup(target);
error = zpool_find_config(NULL, target, &config, &g_importargs,
&libzpool_config_ops);
error = zpool_tryimport(g_zfs, target, &config, &g_importargs);
if (error)
fatal(NULL, FTAG, "cannot import '%s'", target);
fatal(NULL, FTAG, "cannot import '%s': %s", target,
libzfs_error_description(g_zfs));
props = NULL;
if (readonly) {
@@ -526,6 +529,7 @@ main(int argc, char **argv)
"changes may not be committed to disk\n");
}
libzfs_fini(g_zfs);
kernel_fini();
return (rv);
+9 -3
@@ -1,5 +1,9 @@
include $(top_srcdir)/config/Rules.am
DEFAULT_INCLUDES += \
-I$(top_srcdir)/include \
-I$(top_srcdir)/lib/libspl/include
sbin_PROGRAMS = zinject
zinject_SOURCES = \
@@ -8,6 +12,8 @@ zinject_SOURCES = \
zinject.h
zinject_LDADD = \
$(abs_top_builddir)/lib/libzfs/libzfs.la \
$(abs_top_builddir)/lib/libzfs_core/libzfs_core.la \
$(abs_top_builddir)/lib/libnvpair/libnvpair.la
$(top_builddir)/lib/libnvpair/libnvpair.la \
$(top_builddir)/lib/libuutil/libuutil.la \
$(top_builddir)/lib/libzpool/libzpool.la \
$(top_builddir)/lib/libzfs/libzfs.la \
$(top_builddir)/lib/libzfs_core/libzfs_core.la
+144 -20
@@ -20,11 +20,13 @@
*/
/*
* Copyright (c) 2006, 2010, Oracle and/or its affiliates. All rights reserved.
* Copyright (c) 2012, 2020 by Delphix. All rights reserved.
* Copyright (c) 2012 by Delphix. All rights reserved.
*/
#include <libzfs.h>
#include <sys/zfs_context.h>
#include <errno.h>
#include <fcntl.h>
#include <stdarg.h>
@@ -47,6 +49,9 @@
#include "zinject.h"
extern void kernel_init(int);
extern void kernel_fini(void);
static int debug;
static void
@@ -85,6 +90,8 @@ parse_pathname(const char *inpath, char *dataset, char *relpath,
struct stat64 *statbuf)
{
struct extmnttab mp;
FILE *fp;
int match;
const char *rel;
char fullpath[MAXPATHLEN];
@@ -97,7 +104,35 @@ parse_pathname(const char *inpath, char *dataset, char *relpath,
return (-1);
}
if (getextmntent(fullpath, &mp, statbuf) != 0) {
if (strlen(fullpath) >= MAXPATHLEN) {
(void) fprintf(stderr, "invalid object; pathname too long\n");
return (-1);
}
if (stat64(fullpath, statbuf) != 0) {
(void) fprintf(stderr, "cannot open '%s': %s\n",
fullpath, strerror(errno));
return (-1);
}
#ifdef HAVE_SETMNTENT
if ((fp = setmntent(MNTTAB, "r")) == NULL) {
#else
if ((fp = fopen(MNTTAB, "r")) == NULL) {
#endif
(void) fprintf(stderr, "cannot open %s\n", MNTTAB);
return (-1);
}
match = 0;
while (getextmntent(fp, &mp, sizeof (mp)) == 0) {
if (makedev(mp.mnt_major, mp.mnt_minor) == statbuf->st_dev) {
match = 1;
break;
}
}
if (!match) {
(void) fprintf(stderr, "cannot find mountpoint for '%s'\n",
fullpath);
return (-1);
@@ -126,32 +161,51 @@ parse_pathname(const char *inpath, char *dataset, char *relpath,
}
/*
* Convert from a dataset to an objset id. Note that
* we grab the object number from the inode number.
* Convert from a (dataset, path) pair into a (objset, object) pair. Note that
* we grab the object number from the inode number, since looking this up via
* libzpool is a real pain.
*/
/* ARGSUSED */
static int
object_from_path(const char *dataset, uint64_t object, zinject_record_t *record)
object_from_path(const char *dataset, const char *path, struct stat64 *statbuf,
zinject_record_t *record)
{
zfs_handle_t *zhp;
objset_t *os;
int err;
if ((zhp = zfs_open(g_zfs, dataset, ZFS_TYPE_DATASET)) == NULL)
/*
* Before doing any libzpool operations, call sync() to ensure that the
* on-disk state is consistent with the in-core state.
*/
sync();
err = dmu_objset_own(dataset, DMU_OST_ZFS, B_TRUE, FTAG, &os);
if (err != 0) {
(void) fprintf(stderr, "cannot open dataset '%s': %s\n",
dataset, strerror(err));
return (-1);
}
record->zi_objset = zfs_prop_get_int(zhp, ZFS_PROP_OBJSETID);
record->zi_object = object;
record->zi_objset = dmu_objset_id(os);
record->zi_object = statbuf->st_ino;
zfs_close(zhp);
dmu_objset_disown(os, FTAG);
return (0);
}
/*
* Initialize the range based on the type, level, and range given.
* Calculate the real range based on the type, level, and range given.
*/
static int
initialize_range(err_type_t type, int level, char *range,
calculate_range(const char *dataset, err_type_t type, int level, char *range,
zinject_record_t *record)
{
objset_t *os = NULL;
dnode_t *dn = NULL;
int err;
int ret = -1;
/*
* Determine the numeric range from the string.
*/
@@ -179,7 +233,7 @@ initialize_range(err_type_t type, int level, char *range,
(void) fprintf(stderr, "invalid range '%s': must be "
"a numeric range of the form 'start[,end]'\n",
range);
return (-1);
goto out;
}
}
@@ -199,7 +253,7 @@ initialize_range(err_type_t type, int level, char *range,
if (range != NULL) {
(void) fprintf(stderr, "range cannot be specified when "
"type is 'dnode'\n");
return (-1);
goto out;
}
record->zi_start = record->zi_object * sizeof (dnode_phys_t);
@@ -208,9 +262,76 @@ initialize_range(err_type_t type, int level, char *range,
break;
}
record->zi_level = level;
/*
* Get the dnode associated with object, so we can calculate the block
* size.
*/
if ((err = dmu_objset_own(dataset, DMU_OST_ANY,
B_TRUE, FTAG, &os)) != 0) {
(void) fprintf(stderr, "cannot open dataset '%s': %s\n",
dataset, strerror(err));
goto out;
}
return (0);
if (record->zi_object == 0) {
dn = DMU_META_DNODE(os);
} else {
err = dnode_hold(os, record->zi_object, FTAG, &dn);
if (err != 0) {
(void) fprintf(stderr, "failed to hold dnode "
"for object %llu\n",
(u_longlong_t)record->zi_object);
goto out;
}
}
ziprintf("data shift: %d\n", (int)dn->dn_datablkshift);
ziprintf(" ind shift: %d\n", (int)dn->dn_indblkshift);
/*
* Translate range into block IDs.
*/
if (record->zi_start != 0 || record->zi_end != -1ULL) {
record->zi_start >>= dn->dn_datablkshift;
record->zi_end >>= dn->dn_datablkshift;
}
/*
* Check level, and then translate level 0 blkids into ranges
* appropriate for level of indirection.
*/
record->zi_level = level;
if (level > 0) {
ziprintf("level 0 blkid range: [%llu, %llu]\n",
record->zi_start, record->zi_end);
if (level >= dn->dn_nlevels) {
(void) fprintf(stderr, "level %d exceeds max level "
"of object (%d)\n", level, dn->dn_nlevels - 1);
goto out;
}
if (record->zi_start != 0 || record->zi_end != 0) {
int shift = dn->dn_indblkshift - SPA_BLKPTRSHIFT;
for (; level > 0; level--) {
record->zi_start >>= shift;
record->zi_end >>= shift;
}
}
}
ret = 0;
out:
if (dn) {
if (dn != DMU_META_DNODE(os))
dnode_rele(dn, FTAG);
}
if (os)
dmu_objset_disown(os, FTAG);
return (ret);
}
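Review note: `calculate_range()` above converts a byte range into level-0 block IDs by shifting with the data block shift, then into higher-level block IDs by shifting once per level with `dn_indblkshift - SPA_BLKPTRSHIFT`. A minimal sketch of just that arithmetic, without any dnode or objset handling. The function name `range_to_blkids` and the constant `SKETCH_BLKPTRSHIFT` are hypothetical; the value 7 assumes 128-byte block pointers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: block pointers are 128 bytes, so the shift is 7. */
#define	SKETCH_BLKPTRSHIFT	7

/*
 * Hypothetical helper: translate a byte range into block IDs at the
 * requested level, mirroring the shifts in calculate_range().
 */
static void
range_to_blkids(uint64_t *start, uint64_t *end, int datablkshift,
    int indblkshift, int level)
{
	/* the whole-object range (0, UINT64_MAX) is left untouched */
	if (*start != 0 || *end != UINT64_MAX) {
		*start >>= datablkshift;
		*end >>= datablkshift;
	}

	/* each level of indirection covers 2^shift blocks of the level below */
	if (level > 0 && (*start != 0 || *end != 0)) {
		int shift = indblkshift - SKETCH_BLKPTRSHIFT;

		for (; level > 0; level--) {
			*start >>= shift;
			*end >>= shift;
		}
	}
}
```

For example, with 128 KiB data and indirect blocks (shift 17), a byte range ending at 1 GiB ends at level-0 block 8192 and at level-1 block 8.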
int
@@ -222,6 +343,8 @@ translate_record(err_type_t type, const char *object, const char *range,
struct stat64 statbuf;
int ret = -1;
kernel_init(FREAD);
debug = (getenv("ZINJECT_DEBUG") != NULL);
ziprintf("translating: %s\n", object);
@@ -273,16 +396,16 @@ translate_record(err_type_t type, const char *object, const char *range,
/*
* Convert (dataset, file) into (objset, object)
*/
if (object_from_path(dataset, statbuf.st_ino, record) != 0)
if (object_from_path(dataset, path, &statbuf, record) != 0)
goto err;
ziprintf("raw objset: %llu\n", record->zi_objset);
ziprintf("raw object: %llu\n", record->zi_object);
/*
* For the given object, initialize the range in bytes
* For the given object, calculate the real (type, level, range)
*/
if (initialize_range(type, level, (char *)range, record) != 0)
if (calculate_range(dataset, type, level, (char *)range, record) != 0)
goto err;
ziprintf(" objset: %llu\n", record->zi_objset);
@@ -304,6 +427,7 @@ translate_record(err_type_t type, const char *object, const char *range,
ret = 0;
err:
kernel_fini();
return (ret);
}
@@ -388,7 +512,7 @@ translate_device(const char *pool, const char *device, err_type_t label_type,
record->zi_end = record->zi_start + VDEV_PAD_SIZE - 1;
break;
case TYPE_LABEL_PAD2:
record->zi_start = offsetof(vdev_label_t, vl_be);
record->zi_start = offsetof(vdev_label_t, vl_pad2);
record->zi_end = record->zi_start + VDEV_PAD_SIZE - 1;
break;
}
+38 -169
@@ -36,15 +36,12 @@
*
* Errors can be injected into a particular vdev using the '-d' option. This
* option takes a path or vdev GUID to uniquely identify the device within a
* pool. There are four types of errors that can be injected: EIO, ENXIO,
* ECHILD, and EILSEQ. These can be controlled through the '-e' option and the
* default is ENXIO. For EIO failures, any attempt to read data from the device
* will return EIO, but a subsequent attempt to reopen the device will succeed.
* For ENXIO failures, any attempt to read from the device will return EIO, but
* any attempt to reopen the device will also return ENXIO. The EILSEQ failures
* only apply to read operations (-T read) and will flip a bit after the device
* has read the original data.
*
* pool. There are two types of errors that can be injected, EIO and ENXIO,
* that can be controlled through the '-e' option. The default is ENXIO. For
* EIO failures, any attempt to read data from the device will return EIO, but
* a subsequent attempt to reopen the device will succeed. For ENXIO failures,
* any attempt to read from the device will return EIO, but any attempt to
* reopen the device will also return ENXIO.
* For label faults, the -L option must be specified. This allows faults
* to be injected into either the nvlist, uberblock, pad1, or pad2 region
* of all the labels for the specified device.
@@ -116,9 +113,9 @@
* specified.
*
* The '-e' option takes a string describing the errno to simulate. This must
* be one of 'io', 'checksum', 'decompress', or 'decrypt'. In most cases this
* will result in the same behavior, but RAID-Z will produce a different set of
* ereports for this situation.
* be either 'io' or 'checksum'. In most cases this will result in the same
* behavior, but RAID-Z will produce a different set of ereports for this
* situation.
*
* The '-a', '-u', and '-m' flags toggle internal flush behavior. If '-a' is
* specified, then the ARC cache is flushed appropriately. If '-u' is
@@ -159,6 +156,8 @@
libzfs_handle_t *g_zfs;
int zfs_fd;
#define ECKSUM EBADE
static const char *errtable[TYPE_INVAL] = {
"data",
"dnode",
@@ -232,12 +231,11 @@ usage(void)
"\t\tspa_vdev_exit() will trigger a panic.\n"
"\n"
"\tzinject -d device [-e errno] [-L <nvlist|uber|pad1|pad2>] [-F]\n"
"\t\t[-T <read|write|free|claim|all>] [-f frequency] pool\n\n"
"\t [-T <read|write|free|claim|all>] [-f frequency] pool\n"
"\t\tInject a fault into a particular device or the device's\n"
"\t\tlabel. Label injection can either be 'nvlist', 'uber',\n "
"\t\t'pad1', or 'pad2'.\n"
"\t\t'errno' can be 'nxio' (the default), 'io', 'dtl', or\n"
"\t\t'corrupt' (bit flip).\n"
"\t\t'errno' can be 'nxio' (the default), 'io', or 'dtl'.\n"
"\t\t'frequency' is a value between 0.0001 and 100.0 that limits\n"
"\t\tdevice error injection to a percentage of the IOs.\n"
"\n"
@@ -289,19 +287,16 @@ usage(void)
"\t\tspecified by the remaining tuple. Each number is in\n"
"\t\thexadecimal, and only one block can be specified.\n"
"\n"
"\tzinject [-q] <-t type> [-C dvas] [-e errno] [-l level]\n"
"\t\t[-r range] [-a] [-m] [-u] [-f freq] <object>\n"
"\tzinject [-q] <-t type> [-e errno] [-l level] [-r range]\n"
"\t [-a] [-m] [-u] [-f freq] <object>\n"
"\n"
"\t\tInject an error into the object specified by the '-t' option\n"
"\t\tand the object descriptor. The 'object' parameter is\n"
"\t\tinterpreted depending on the '-t' option.\n"
"\n"
"\t\t-q\tQuiet mode. Only print out the handler number added.\n"
"\t\t-e\tInject a specific error. Must be one of 'io',\n"
"\t\t\t'checksum', 'decompress', or 'decrypt'. Default is 'io'.\n"
"\t\t-C\tInject the given error only into specific DVAs. The\n"
"\t\t\tDVAs should be specified as a list of 0-indexed DVAs\n"
"\t\t\tseparated by commas (ex. '0,2').\n"
"\t\t-e\tInject a specific error. Must be either 'io' or\n"
"\t\t\t'checksum'. Default is 'io'.\n"
"\t\t-l\tInject error at a particular block level. Default is "
"0.\n"
"\t\t-m\tAutomatically remount underlying filesystem.\n"
@@ -338,7 +333,7 @@ iter_handlers(int (*func)(int, const char *, zinject_record_t *, void *),
zfs_cmd_t zc = {"\0"};
int ret;
while (zfs_ioctl(g_zfs, ZFS_IOC_INJECT_LIST_NEXT, &zc) == 0)
while (ioctl(zfs_fd, ZFS_IOC_INJECT_LIST_NEXT, &zc) == 0)
if ((ret = func((int)zc.zc_guid, zc.zc_name,
&zc.zc_inject_record, data)) != 0)
return (ret);
@@ -362,20 +357,17 @@ print_data_handler(int id, const char *pool, zinject_record_t *record,
return (0);
if (*count == 0) {
(void) printf("%3s %-15s %-6s %-6s %-8s %3s %-4s "
"%-15s\n", "ID", "POOL", "OBJSET", "OBJECT", "TYPE",
"LVL", "DVAs", "RANGE");
(void) printf("%3s %-15s %-6s %-6s %-8s %3s %-15s\n",
"ID", "POOL", "OBJSET", "OBJECT", "TYPE", "LVL", "RANGE");
(void) printf("--- --------------- ------ "
"------ -------- --- ---- ---------------\n");
"------ -------- --- ---------------\n");
}
*count += 1;
(void) printf("%3d %-15s %-6llu %-6llu %-8s %-3d 0x%02x ",
id, pool, (u_longlong_t)record->zi_objset,
(u_longlong_t)record->zi_object, type_to_name(record->zi_type),
record->zi_level, record->zi_dvas);
(void) printf("%3d %-15s %-6llu %-6llu %-8s %3d ", id, pool,
(u_longlong_t)record->zi_objset, (u_longlong_t)record->zi_object,
type_to_name(record->zi_type), record->zi_level);
if (record->zi_start == 0 &&
record->zi_end == -1ULL)
@@ -506,7 +498,7 @@ cancel_one_handler(int id, const char *pool, zinject_record_t *record,
zc.zc_guid = (uint64_t)id;
if (zfs_ioctl(g_zfs, ZFS_IOC_CLEAR_FAULT, &zc) != 0) {
if (ioctl(zfs_fd, ZFS_IOC_CLEAR_FAULT, &zc) != 0) {
(void) fprintf(stderr, "failed to remove handler %d: %s\n",
id, strerror(errno));
return (1);
@@ -539,7 +531,7 @@ cancel_handler(int id)
zc.zc_guid = (uint64_t)id;
if (zfs_ioctl(g_zfs, ZFS_IOC_CLEAR_FAULT, &zc) != 0) {
if (ioctl(zfs_fd, ZFS_IOC_CLEAR_FAULT, &zc) != 0) {
(void) fprintf(stderr, "failed to remove handler %d: %s\n",
id, strerror(errno));
return (1);
@@ -563,9 +555,8 @@ register_handler(const char *pool, int flags, zinject_record_t *record,
zc.zc_inject_record = *record;
zc.zc_guid = flags;
if (zfs_ioctl(g_zfs, ZFS_IOC_INJECT_FAULT, &zc) != 0) {
if (ioctl(zfs_fd, ZFS_IOC_INJECT_FAULT, &zc) != 0) {
(void) fprintf(stderr, "failed to add handler: %s\n",
errno == EDOM ? "block level exceeds max level of object" :
strerror(errno));
return (1);
}
@@ -606,14 +597,13 @@ register_handler(const char *pool, int flags, zinject_record_t *record,
(void) printf(" range: [%llu, %llu)\n",
(u_longlong_t)record->zi_start,
(u_longlong_t)record->zi_end);
(void) printf(" dvas: 0x%x\n", record->zi_dvas);
}
}
return (0);
}
static int
int
perform_action(const char *pool, zinject_record_t *record, int cmd)
{
zfs_cmd_t zc = {"\0"};
@@ -623,7 +613,7 @@ perform_action(const char *pool, zinject_record_t *record, int cmd)
zc.zc_guid = record->zi_guid;
zc.zc_cookie = cmd;
if (zfs_ioctl(g_zfs, ZFS_IOC_VDEV_SET_STATE, &zc) == 0)
if (ioctl(zfs_fd, ZFS_IOC_VDEV_SET_STATE, &zc) == 0)
return (0);
return (1);
@@ -679,59 +669,6 @@ parse_frequency(const char *str, uint32_t *percent)
return (0);
}
/*
* This function converts a string specifier for DVAs into a bit mask.
* The dva's provided by the user should be 0 indexed and separated by
* a comma. For example:
* "1" -> 0b0010 (0x2)
* "0,1" -> 0b0011 (0x3)
* "0,1,2" -> 0b0111 (0x7)
*/
static int
parse_dvas(const char *str, uint32_t *dvas_out)
{
const char *c = str;
uint32_t mask = 0;
boolean_t need_delim = B_FALSE;
/* max string length is 5 ("0,1,2") */
if (strlen(str) > 5 || strlen(str) == 0)
return (EINVAL);
while (*c != '\0') {
switch (*c) {
case '0':
case '1':
case '2':
/* check for a missing comma between DVAs */
if (need_delim)
return (EINVAL);
/* check if this DVA has been set already */
if (mask & (1 << ((*c) - '0')))
return (EINVAL);
mask |= (1 << ((*c) - '0'));
need_delim = B_TRUE;
break;
case ',':
need_delim = B_FALSE;
break;
default:
/* check for invalid character */
return (EINVAL);
}
c++;
}
/* check for dangling delimiter */
if (!need_delim)
return (EINVAL);
*dvas_out = mask;
return (0);
}
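Review note: the comment on `parse_dvas()` above maps a comma-separated list of 0-indexed DVAs to a bit mask ("1" gives 0x2, "0,1" gives 0x3, "0,1,2" gives 0x7). A minimal standalone sketch of that mapping follows. The function name `dva_list_to_mask` is hypothetical, and this variant is slightly stricter than the code above in that it also rejects a leading or doubled comma.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert "0,2"-style input into a bit mask with one
 * bit per DVA index. Returns 0 on success, -1 on malformed input.
 */
static int
dva_list_to_mask(const char *str, uint32_t *mask_out)
{
	uint32_t mask = 0;
	int expect_digit = 1;	/* the list must start with a DVA index */

	for (; *str != '\0'; str++) {
		if (expect_digit) {
			uint32_t bit;

			/* only DVA indices 0, 1 and 2 are valid */
			if (*str < '0' || *str > '2')
				return (-1);
			bit = 1u << (*str - '0');
			/* reject a DVA that was already listed */
			if (mask & bit)
				return (-1);
			mask |= bit;
			expect_digit = 0;
		} else {
			if (*str != ',')
				return (-1);
			expect_digit = 1;
		}
	}

	/* an empty string or a trailing comma leaves us expecting a digit */
	if (expect_digit)
		return (-1);

	*mask_out = mask;
	return (0);
}
```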
int
main(int argc, char **argv)
{
@@ -758,10 +695,9 @@ main(int argc, char **argv)
int dur_secs = 0;
int ret;
int flags = 0;
uint32_t dvas = 0;
if ((g_zfs = libzfs_init()) == NULL) {
(void) fprintf(stderr, "%s\n", libzfs_error_init(errno));
(void) fprintf(stderr, "%s", libzfs_error_init(errno));
return (1);
}
@@ -789,7 +725,7 @@ main(int argc, char **argv)
}
while ((c = getopt(argc, argv,
":aA:b:C:d:D:f:Fg:qhIc:t:T:l:mr:s:e:uL:p:")) != -1) {
":aA:b:d:D:f:Fg:qhIc:t:T:l:mr:s:e:uL:p:")) != -1) {
switch (c) {
case 'a':
flags |= ZINJECT_FLUSH_ARC;
@@ -813,17 +749,6 @@ main(int argc, char **argv)
case 'c':
cancel = optarg;
break;
case 'C':
ret = parse_dvas(optarg, &dvas);
if (ret != 0) {
(void) fprintf(stderr, "invalid DVA list '%s': "
"DVAs should be 0 indexed and separated by "
"commas.\n", optarg);
usage();
libzfs_fini(g_zfs);
return (1);
}
break;
case 'd':
device = optarg;
break;
@@ -845,16 +770,10 @@ main(int argc, char **argv)
error = EIO;
} else if (strcasecmp(optarg, "checksum") == 0) {
error = ECKSUM;
} else if (strcasecmp(optarg, "decompress") == 0) {
error = EINVAL;
} else if (strcasecmp(optarg, "decrypt") == 0) {
error = EACCES;
} else if (strcasecmp(optarg, "nxio") == 0) {
error = ENXIO;
} else if (strcasecmp(optarg, "dtl") == 0) {
error = ECHILD;
} else if (strcasecmp(optarg, "corrupt") == 0) {
error = EILSEQ;
} else {
(void) fprintf(stderr, "invalid error type "
"'%s': must be 'io', 'checksum' or "
@@ -924,7 +843,6 @@ main(int argc, char **argv)
break;
case 'r':
range = optarg;
flags |= ZINJECT_CALC_RANGE;
break;
case 's':
dur_secs = 1;
@@ -1007,7 +925,7 @@ main(int argc, char **argv)
*/
if (raw != NULL || range != NULL || type != TYPE_INVAL ||
level != 0 || record.zi_cmd != ZINJECT_UNINITIALIZED ||
record.zi_freq > 0 || dvas != 0) {
record.zi_freq > 0) {
(void) fprintf(stderr, "cancel (-c) incompatible with "
"any other options\n");
usage();
@@ -1042,8 +960,7 @@ main(int argc, char **argv)
* for doing injection, so handle it separately here.
*/
if (raw != NULL || range != NULL || type != TYPE_INVAL ||
level != 0 || record.zi_cmd != ZINJECT_UNINITIALIZED ||
dvas != 0) {
level != 0 || record.zi_cmd != ZINJECT_UNINITIALIZED) {
(void) fprintf(stderr, "device (-d) incompatible with "
"data error injection\n");
usage();
@@ -1064,15 +981,7 @@ main(int argc, char **argv)
if (error == ECKSUM) {
(void) fprintf(stderr, "device error type must be "
"'io', 'nxio' or 'corrupt'\n");
libzfs_fini(g_zfs);
return (1);
}
if (error == EILSEQ &&
(record.zi_freq == 0 || io_type != ZIO_TYPE_READ)) {
(void) fprintf(stderr, "device corrupt errors require "
"io type read and a frequency value\n");
"'io' or 'nxio'\n");
libzfs_fini(g_zfs);
return (1);
}
@@ -1091,7 +1000,7 @@ main(int argc, char **argv)
} else if (raw != NULL) {
if (range != NULL || type != TYPE_INVAL || level != 0 ||
record.zi_cmd != ZINJECT_UNINITIALIZED ||
record.zi_freq > 0 || dvas != 0) {
record.zi_freq > 0) {
(void) fprintf(stderr, "raw (-b) format with "
"any other options\n");
usage();
@@ -1126,8 +1035,7 @@ main(int argc, char **argv)
error = EIO;
} else if (record.zi_cmd == ZINJECT_PANIC) {
if (raw != NULL || range != NULL || type != TYPE_INVAL ||
level != 0 || device != NULL || record.zi_freq > 0 ||
dvas != 0) {
level != 0 || device != NULL || record.zi_freq > 0) {
(void) fprintf(stderr, "panic (-p) incompatible with "
"other options\n");
usage();
@@ -1148,15 +1056,6 @@ main(int argc, char **argv)
record.zi_type = atoi(argv[1]);
dataset[0] = '\0';
} else if (record.zi_cmd == ZINJECT_IGNORED_WRITES) {
if (raw != NULL || range != NULL || type != TYPE_INVAL ||
level != 0 || record.zi_freq > 0 || dvas != 0) {
(void) fprintf(stderr, "hardware failure (-I) "
"incompatible with other options\n");
usage();
libzfs_fini(g_zfs);
return (2);
}
if (nowrites == 0) {
(void) fprintf(stderr, "-s or -g meaningless "
"without -I (ignore writes)\n");
@@ -1210,44 +1109,14 @@ main(int argc, char **argv)
return (2);
}
if (error == ENXIO || error == EILSEQ) {
if (error == ENXIO) {
(void) fprintf(stderr, "data error type must be "
"'checksum' or 'io'\n");
libzfs_fini(g_zfs);
return (1);
}
if (dvas != 0) {
if (error == EACCES || error == EINVAL) {
(void) fprintf(stderr, "the '-C' option may "
"not be used with logical data errors "
"'decrypt' and 'decompress'\n");
libzfs_fini(g_zfs);
return (1);
}
record.zi_dvas = dvas;
}
if (error == EACCES) {
if (type != TYPE_DATA) {
(void) fprintf(stderr, "decryption errors "
"may only be injected for 'data' types\n");
libzfs_fini(g_zfs);
return (1);
}
record.zi_cmd = ZINJECT_DECRYPT_FAULT;
/*
* Internally, ZFS actually uses ECKSUM for decryption
* errors since EACCES is used to indicate the key was
* not found.
*/
error = ECKSUM;
} else {
record.zi_cmd = ZINJECT_DATA_FAULT;
}
record.zi_cmd = ZINJECT_DATA_FAULT;
if (translate_record(type, argv[0], range, level, &record, pool,
dataset) != 0) {
libzfs_fini(g_zfs);
+1
@@ -0,0 +1 @@
/zpios
+11
@@ -0,0 +1,11 @@
include $(top_srcdir)/config/Rules.am
DEFAULT_INCLUDES += \
-I$(top_srcdir)/include
sbin_PROGRAMS = zpios
zpios_SOURCES = \
zpios_main.c \
zpios_util.c \
zpios.h

Some files were not shown because too many files have changed in this diff.