Linux ZFS test suite runs with /proc/sys/kernel/stack_tracer_enabled=1,
via zfs.sh script, which has negative performance impact, up to 40%.
Since large stack is a rare issue now, preferred behavior would be:
- making stack tracer an opt-in feature for zfs.sh
- zfs-test.sh enables stack tracer only when requested
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Tony Nguyen <tony.nguyen@delphix.com>
#8173
This patch simply ensures that scn->scn_prefetch_queue is emptied
before the kernel module is unloaded and when scanning completes.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alek Pinchuk <apinchuk@datto.com>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes#8178
When receiving a send stream with forced rollback on a dataset with
snapshots zfs suggests said snapshots must be removed to successfully
receive the stream; however the message is misleading because it
prints the dataset name instead of one of its snapshots.
$ sudo zfs snap pp/recvfs@snap-orig
$ sudo zfs recv -F pp/recvfs < sendstream
cannot receive new filesystem stream: destination has snapshots (eg. pp/recvfs)
must destroy them to overwrite it
This change simply restores the snapshot name in the error message.
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes#8167
This fixes the build when cross-compiling, where the preprocessor might
be prefixed.
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Ben Wolsieffer <benwolsieffer@gmail.com>
Closes#8180
This one line patch moves an assert in the function dump_dir()
below an error check that ensures it ran correctly. This ensures
zdb dumps the error that actually caused the problem, as opposed
to one of its symptoms.
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes#8171
Commit 4c5b89f59 refactored dnode_hold() and in the process
accidentally introduced a slight change in behavior which was
not intended. The required behavior is that once the ZPL,
or other consumer, declares its intent to free a dnode then
dnode_hold() should immediately start failing. This updated
code wouldn't return the failure until after it was freed.
When DNODE_MUST_BE_ALLOCATED is set it must return ENOENT, and
when DNODE_MUST_BE_FREE is set it must return EEXIST;
This issue was uncovered by ztest_remap() which attempted
to remap a freeing object which should have been skipped as
described by the comment in dmu_objset_remap_indirects_impl().
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#8172
The verbose output of 'zpool list' was not correctly aligned due
to differences in the vdev name lengths. Minimally update the
code the correct the alignment using the same strategy employed
by 'zpool status'.
Missing dashes were added for the empty defaults columns, and
the vdev state is now printed for all vdevs.
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#7308Closes#8147
The 'while read line; ...; done' loop is run in a piped subshell
therefore the 'return 0' would not cause a return from the
is_mounted() function. In all cases, this function will
always return 1.
The fix is to 'return 1' from the subshell on a successful match
(no match == return 0), and then negating the final return value.
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: TerraTech <TerraTech@users.noreply.github.com>
Closes#8151
This patch corrects an issue where spa_vdev_remove() would
call spa_history_log_internal() while holding the spa config
lock. This function may decide to block until the next txg if
the current one seems too full. However, since the thread is
holding the config log, the txg sync thread cannot progress
and the system ends up deadlocked. This patch simply moves
all calls to spa_history_log_internal() outside of the config
lock.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes#8162
This patch fixes a small race condition in ztest_zil_remount()
that could result in a deadlock. ztest_device_removal() calls
spa_vdev_remove() which may eventually call spa_reset_logs().
If ztest_zil_remount() attempts to call zil_close() while this
is happening, it may fail when it asserts !zilog_is_dirty(zilog).
This patch simply adds locking to correct the issue.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes#8154
This commit fixes the following ASSERT in zfs_receive_one() when
receiving a send stream from a root dataset with the "-e" option:
$ sudo zfs snap source@snap
$ sudo zfs send source@snap | sudo zfs recv -e destination/recv
chopprefix > drrb->drr_toname
ASSERT at libzfs_sendrecv.c:3804:zfs_receive_one()
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes#8121
* Detect IO errors during device removal
While device removal cannot verify the checksums of individual
blocks during device removal, it can reasonably detect hard IO
errors from the leaf vdevs. Failure to perform this error
checking can result in device removal completing successfully,
but moving no data which will permanently corrupt the pool.
Situation 1: faulted/degraded vdevs
In the configuration shown below, the removal of mirror-0 will
permanently corrupt the pool. Device removal will preferentially
copy data from 'vdev1 -> vdev3' and from 'vdev2 -> vdev4'. Which
in this case will result in nothing being copied since one vdev
in each of those groups in unavailable. However, device removal
will complete successfully since all IO errors are ignored.
tank DEGRADED 0 0 0
mirror-0 DEGRADED 0 0 0
/var/tmp/vdev1 FAULTED 0 0 0 external fault
/var/tmp/vdev2 ONLINE 0 0 0
mirror-1 DEGRADED 0 0 0
/var/tmp/vdev3 ONLINE 0 0 0
/var/tmp/vdev4 FAULTED 0 0 0 external fault
This issue is resolved by updating the source child selection
logic to exclude unreadable leaf vdevs. Additionally, unwritable
destination child vdevs which can never succeed are skipped to
prevent generating a large number of write IO errors.
Situation 2: individual hard IO errors
During removal if an unexpected hard IO error is encountered when
either reading or writing the child vdev the entire removal
operation is cancelled. While it may be possible to reconstruct
the data after removal that cannot be guaranteed. The only
strictly safe thing to do is to cancel the removal.
As a future improvement we may want to instead suspend the removal
process and allow the damaged region to be retried. But that work
is left for another time, hard IO errors during the removal process
are expected to be exceptionally rare.
Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #6900Closes#8161
ztest currently uses the boolean flag ztest_device_removal_active
to protect some tests that may not run successfully if they occur
at the same time as ztest_device_removal(). Unfortunately, in the
event that ztest is in the middle of a device removal when it
decides to issue a SIGKILL, the device removal will be
automatically restarted (without setting the flag) when the pool
is re-imported on the next run. This patch corrects this by
ensuring that any in-progress removals are completed before running
further tests after the re-import.
This patch also makes a few small changes to prevent race conditions
involving the creation and destruction of spa->spa_vdev_removal,
since this field is not protected by any locks. Some checks that
may run concurrently with setting / unsetting this field have been
updated to check spa->spa_removing_phys.sr_state instead. The most
significant change here is that spa_removal_get_stats() no longer
accounts for in-flight work done, since that could result in a NULL
pointer dereference.
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes#8105
This commit reverts to using printk() instead of zfs_dbgmsg() to log
messages in vdev_disk_error(): this is necessary because the latter can
be called from interrupt context where we are not allowed to sleep.
Unfortunately zfs_dbgmsg() performs its allocations calling kmalloc()
with the KM_SLEEP flag which may result in the following oops:
BUG: scheduling while atomic: swapper/4/0/0x10000100
Call Trace:
<IRQ> [<0>] dump_stack+0x19/0x1b
...
[<0>] spl_kmem_alloc+0xdf/0x140 [spl] <-- kmem_alloc(size, KM_SLEEP)
[<0>] __dprintf+0x69/0x150 [zfs]
[<0>] ? kmem_cache_free+0x1e2/0x200
[<0>] vdev_disk_error.part.15+0x5f/0x70 [zfs]
[<0>] vdev_disk_io_flush_completion+0x48/0x70 [zfs]
[<0>] bio_endio+0x67/0xb0
[<0>] blk_update_request+0x90/0x360
...
[<0>] scsi_finish_command+0xdc/0x140
[<0>] scsi_softirq_done+0x132/0x160
[<0>] blk_done_softirq+0x96/0xc0
[<0>] __do_softirq+0xf5/0x280
[<0>] call_softirq+0x1c/0x30
[<0>] do_softirq+0x65/0xa0
[<0>] irq_exit+0x105/0x110
[<0>] do_IRQ+0x56/0xf0
[<0>] common_interrupt+0x162/0x162
<EOI> [<0>] ? cpuidle_enter_state+0x54/0xd0
[<0>] cpuidle_idle_call+0xde/0x230
[<0>] arch_cpu_idle+0xe/0xb0
[<0>] cpu_startup_entry+0x14a/0x1e0
[<0>] start_secondary+0x1f7/0x270
[<0>] start_cpu+0x5/0x14
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes#8137Closes#8150
Currently, several tests in the ZFS Test Suite that attempt to
test scrub and resilver behavior occasionally fail. A big reason
for this is that these tests use a combination of zinject and
zfs_scan_vdev_limit to attempt to slow these operations enough
to verify their test commands. This method works most of the time,
but provides no guarantees and leads to flaky behavior. This patch
adds a new tunable, zfs_scan_suspend_progress, that ensures that
scans make no progress, guaranteeing that tests can be run without
racing.
This patch also changes zfs_remove_max_bytes_pause to match this
new tunable. This provides some consistency between these two
similar tunables and ensures that the tunable will not misbehave
on 32-bit systems.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes#8111
This commit fixes several "not found" errors caused by calling undefined
or incorrect shell functions in the following ZFS Test Suite groups:
* alloc_class
* channel_program/lua_core
* channel_program/synctask_core
* cli_root/zpool_import
* cli_user/misc
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: bunder2015 <omfgbunder@gmail.com>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes#8152
Move strlcat() and strlcpy() from .c source files in to the libspl
string.h header. By changing these compatibility functions to static
inline functions they can included as needed without requiring linking
with the libspl.so library.
Remove strnlen() which is barely used in the source, and has been
provided by glibc since v2.10.
Finally, convert four instances of strncpy() to strlcpy() in
libzfs_input_check.c which were causing build warnings when compiling
with gcc 8.2.1. For example:
libzfs_input_check.c: In function ‘zfs_destroy’:
libzfs_input_check.c:651:9: error: ‘strncpy’ specified bound \
4096 equals destination size [-Werror=stringop-truncation]
(void) strncpy(zc.zc_name, dataset, sizeof (zc.zc_name));
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#8116
This change allows 'zpool split' to work with whole-disk devices and
updates the ZFS Test Suite with a new script to exercise this
functionality.
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes#6643Closes#8133
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Signed-off-by: Christian Schwarz <me@cschwarz.com>
Closes#8134
filetest_001_pos consumes the output using read -r, assigning each
field to a variable. The problem comes when a vdev is marked degraded,
which appends extra fields to the line. This causes the trailing text
to be treated as part of the `cksum` variable. Using awk instead of
read -r allows us to extract the checksum error count from the output
whether the vdev is degraded or not.
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: John Wren Kennedy <john.kennedy@delphix.com>
Closes#8136
This change adds "lscpu" to the list of commands used by the ZFS Test
Suite: this is required by the "checksum" test group to read the CPU
frequency which is used in EdonR, Skein and SHA2 performance tests.
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes#8139
Porting Notes:
* Use thread pools (tpool) API instead of introducing taskq interfaces
to libzfs.
* Use pthread_mutext for locks as mutex_t isn't available.
* Ignore alternative libshare initialization since OpenZFS-7955 is
not present on zfsonlinux.
Authored by: Sebastien Roy <seb@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Brad Lewis <brad.lewis@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Prashanth Sreenivasa <pks@delphix.com>
Authored by: Brian Behlendorf <behlendorf1@llnl.gov>
Approved by: Matt Ahrens <mahrens@delphix.com>
Ported-by: Don Brady <don.brady@delphix.com>
OpenZFS-issue: https://www.illumos.org/issues/8115
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/a3f0e2b569Closes#8092
PR #8114 quoted the ${ENCRYPTIONROOT} parameter to ensure we don't
lose spaces when unlocking root filesystem in the off chance that
it has a space in its name.
Unfortunately, dracut and initramfs-tools do not actually get the
quotes from the cmdline. If we use root=ZFS="root pool/filesystem
name" the script still only sees root=ZFS=root and no quotation
marks.
Because + is a reserved character in ZFS, it's used as a
placeholder for spaces in the kernel cmdline. In this way,
root=ZFS=root+pool/filesystem+name will properly expand by
replacing the character with sed (POSIX compliant method).
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: bunder2015 <omfgbunder@gmail.com>
Signed-off-by: Kash Pande <kash@tripleback.net>
Issue #8114Closes#8117
CID 184285: Read from pointer after free (USE_AFTER_FREE)
This patch fixes an use-after-free in vdev_config_generate_stats()
moving the kmem_free() call at the end of the function.
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes#8120
Ensure that the _unitdir, _presetdir, _modulesloaddir, and
_systemdgeneratordir macros are always defined. If not set
them to the expected default values. Pass all of these options
to ./configure and package the resulting files in those locations.
Additionally, set __brp_mangle_shebangs_exclude_from until the
conversion to Python 3 is complete so they may be built cleanly
under mock.
Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#7567Closes#8119
Changed decrypt_fs zfs command to "load-key"
Plymouth case code based on "contrib/dracut/90zfs/zfs-lib.sh.in"
Systemd case based on "contrib/dracut/90zfs/zfs-load-key.sh.in"
Cleaned up misspelling of "available" throughout
Code style fixes
Single quote for ${ENCRYPTIONROOT}
Changed "${DECRYPT_CMD}" to "eval ${DECRYPT_CMD}"
Reviewed-by: Kash Pande <kash@tripleback.net>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Signed-off-by: Garrett Fields <ghfields@gmail.com>
Closes#8093
This commit adds a new test case to the ZFS Test Suite to verify ZED
can detect when a device is physically removed from a running system:
the device will be offlined if a spare is not available in the pool.
We implement this by using the existing libudev functionality and
without relying solely on the FM kernel module capabilities which have
been observed to be unreliable with some kernels.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Don Brady <don.brady@delphix.com>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes#1537Closes#7926
Add quotations for ${ENCRYPTIONROOT} to avoid breaking systems
with a space in the name.
Reviewed-by: bunder2015 <omfgbunder@gmail.com>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Kash Pande <kash@tripleback.net>
Related-to: #8093Closes#8114
This patch adds a new slow I/Os (-s) column to zpool status to show the
number of VDEV slow I/Os. This is the number of I/Os that didn't
complete in zio_slow_io_ms milliseconds. It also adds a new parsable
(-p) flag to display exact values.
NAME STATE READ WRITE CKSUM SLOW
testpool ONLINE 0 0 0 -
mirror-0 ONLINE 0 0 0 -
loop0 ONLINE 0 0 0 20
loop1 ONLINE 0 0 0 0
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes#7756Closes#6885
It's disabled by default, update code and tests to reflect
the documentation.
Minor cleanup in delegate_common.kshlib.
Reviewed-by: Gregor Kopka <gregor@kopka.net>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes#7835Closes#8045
zfs_rename_006_pos has been flaky in the past because it was
missing a call to block_device_wait to ensure the zvols it creates
are present before running dd. Whenever this this happened,
zfs_rename_009_neg would also fail because the first test would
leak a zvol clone that it did not know how to clean up. This patch
fixes the root cause and reenables the test. It also fixes some
minor grammar errors.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes#5647Closes#5648Closes#8088
For Linux, place a file in the mount point folder so it will be
considered "busy". Fix the while loop so it doesn't rm in
directories above the testdir. Add Linux-specific code to test
overlay on|off.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Zuchowski <pzuchowski@datto.com>
Closes#4990Closes#8081
This adds a BuildRequires for gcc, make, and elfutils-libelf-devel
into our spec files. gcc has been a packaging requirement for
awhile now:
https://fedoraproject.org/wiki/Packaging:C_and_C%2B%2B
These additional BuildRequires allow us to mock build in
Fedora 29.
Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes#8095Closes#8102
ztest occasionally hits an assert that !zilog_is_dirty() during
zil_close(). This is caused by an interaction between 2 threads.
First, ztest_run() waits for each test thread to complete and
closes the associated dataset as soon as the thread joins. At
the same time, the ztest_vdev_add_remove() test may attempt to
remove the slog, which will open, dirty, and reset the logs on
every dataset in the pool (including those of other threads).
This patch simply ensures that we always join all of the test
threads before closing any datasets.
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes#8094
This patch simply ensures that vdev_indirect_splits_damage()
cannot hit a divide by zero exception if a split has no
children with valid data. The normal reconstruction code
path in vdev_indirect_reconstruct_io_done() already has this
check.
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes#8086
This patch simply corrects an issue where vdev_dtl_reassess()
could attempt to dirty the vdev config even when the spa was
not elligable for writing.
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes#8085
This patch ensures that logs are replayed on all datasets prior
to starting ztest workers. This ensures that the call to
vdev_offline() a log device in ztest_fault_inject() will not fail
due to the log device being required for replay.
This patch also fixes a small issue found during testing where
spa_keystore_load_wkey() does not check that the dataset specified
is an encryption root. This check was present in libzfs, however.
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes#8084
This patch fixes a race condition where the end of
vdev_remove_replace_with_indirect(), which holds
svr_lock, would race against spa_vdev_removal_destroy(),
which destroys the same lock and is called asynchronously
via dsl_sync_task_nowait().
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Issue #6900Closes#8083
vdev_clear() can call vdev_set_deferred_resilver() with a
non-leaf vdev to setup a deferred resilver. However, this
function is currently written to only handle leaf vdevs.
This bug was introduced with deferred resilvers in 80a91e74.
This patch makes this function recursive so that it can find
appropriate vdevs to resilver and set vdev_resilver_deferred
on them.
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Issue #7732Closes#8082
ZFS should be able to build without libudev installed. The recent
change for libzutil inadvertently broke that. Make the libudev code
conditional in zutil_import.c to resolve the build failure.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Don Brady <don.brady@delphix.com>
Closes#8097
When provided with an invalid 'dedupditto' value zpool prints
a misleading error message:
$ sudo zpool set dedupditto=99 pp
cannot set property for 'pp': property 'dedupditto'(14) not defined
Fix this by printing a meaningful error description for unsupported
'dedupditto' values.
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes#8079
In order to validate the gang block code ztest is configured to
artificially force a fraction of large blocks to be written as
gang blocks. The default setting chosen for this was to
write 25% of all blocks 32k or larger using gang blocks.
The confluence of an unrealistically large number of gang blocks,
the aggressive fault injection done by ztest, and the split
segment reconstruction logic introduced by device removal has
resulted in the following type of failure:
zdb -bccsv -G -d ... exit code 3
Specifically, zdb was unable to open the pool because it was
unable to reconstruct a damaged block. Manual investigation
of multiple failures clearly showed that the block could be
reconstructed. However, due to the large number of damaged
segments (>35) it could not be done in the allotted time.
Furthermore, the large number of gang blocks was determined
to be the reason for the unrealistically large number of
damaged segments. In order to make this situation less
likely, this change both increases the forced gang block
size to 64k and reduces the frequency to 3% of blocks.
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#8080
Adds a libzutil for utility functions that are common to libzfs and
libzpool consumers (most of what was in libzfs_import.c). This
removes the need for utilities to link against both libzpool and
libzfs.
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Don Brady <don.brady@delphix.com>
Closes#8050
Update zfs-events.5 with info from PSARC 2009/497 regarding ereport fields.
Also updates ZIO_STAGE_* and ZIO_FLAG_* descriptions to match current source.
Reviewed by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Elling <Richard.Elling@RichardElling.com>
Closes#8057
Make sure tests have proper include files. Make sure underlying
"chmod" style permissions don't interfere with ACLs.
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Zuchowski <pzuchowski@datto.com>
Closes#8069
It's better to use ksh/bash built in methods,
rather than spawn new processes every time.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Wren Kennedy <john.kennedy@delphix.com>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes#8071
When we delete a snapshot, we consolidate some bpobj's together because
we no longer need to keep their entries in separate buckets. This is
done in constant time by including the "sub" bpobj by reference in the
parent bpobj.
After many snapshots have been deleted, we may have many sub-bpobj's.
Usually, most sub-bpobj's don't contain many BP's. Compared to this
small payload, the sub-bpobj is relatively heavyweight since it is a
object in the MOS. A common scenario on a long-lived pool is for the
vast majority of MOS objects to be small sub-bpobj's.
To improve this situation, when consolidating bpobj's together,
bpobj_enqueue_subobj() can copy the contents of small bpobj's into the
parent, and then delete the enqueued bpobj, rather than including it by
reference. Since this copying is limited in size (to one block), the
consolidation is still constant time, though with a larger constant due
to reading in the one block of the enqueued bpobj.
This idea and mechanism are similar to how we handle "sub-subobj's".
When including a sub-bpobj by reference, if the sub-bpobj itself has
less than a block of sub-sub-bpobj's, the list of sub-sub-bpobj's is
copied to the parent bpobj's list of sub-bpobj's.
Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Paul Zuchowski <pzuchowski@datto.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes#8053
Issue #7908
Commit torvalds/linux@976516404 removed the current_kernel_time()
function (and several others). All callers are expected to use
current_kernel_time64(). Update the gethrestime_sec() wrapper
accordingly.
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#8074