mirror_zfs

mirror of https://git.proxmox.com/git/mirror_zfs.git synced 2026-05-26 12:12:13 +03:00

Author	SHA1	Message	Date
Serapheim Dimitropoulos	425d3237ee	Get rid of space_map_update() for ms_synced_length Initially, metaslabs and space maps used to be the same thing in ZFS. Later, we started differentiating them by referring to the space map as the on-disk state of the metaslab, making the metaslab a higher-level concept that is metadata that deals with space accounting. Today we've managed to split that code furthermore, with the space map being its own on-disk data structure used in areas of ZFS besides metaslabs (e.g. the vdev-wide space maps used for zpool checkpoint or vdev removal features). This patch refactors the space map code to further split the space map code from the metaslab code. It does so by getting rid of the idea that the space map can have a different in-core and on-disk length (sm_length vs smp_length) which is something that is only used for the metaslab code, and other consumers of space maps just have to deal with. Instead, this patch introduces changes that move the old in-core length of the metaslab's space map to the metaslab structure itself (see ms_synced_length field) while making the space map code only care about the actual space map's length on-disk. The result of this is that space map consumers no longer have to deal with syncing two different lengths for the same structure (e.g. space_map_update() goes away) while metaslab specific behavior stays within the metaslab code. Specifically, the ms_synced_length field keeps track of the amount of data metaslab_load() can read from the metaslab's space map while working concurrently with metaslab_sync() that may be appending to that same space map. As a side note, the patch also adds a few comments around the metaslab code documenting some assumptions and expected behavior. 
Reviewed-by: Matt Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com> Closes #8328	2019-02-12 10:38:11 -08:00
loli10K	d8d418ff0c	ZVOLs should not be allowed to have children zfs create, receive and rename can bypass this hierarchy rule. Update both userland and kernel module to prevent this issue and use pyzfs unit tests to exercise the ioctls directly. Note: this commit slightly changes zfs_ioc_create() ABI. This allows differentiating a generic error (EINVAL) from the specific case where we tried to create a dataset below a ZVOL (ZFS_ERR_WRONG_PARENT). Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Matt Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tom Caputi <tcaputi@datto.com> Signed-off-by: loli10K <ezomori.nozomu@gmail.com>	2019-02-08 15:44:15 -08:00
loli10K	4417096956	Pool allocation classes misplacing small file blocks Due to an off-by-one condition in spa_preferred_class() we are picking the "normal" allocation class instead of the "special" one for file blocks with size equal to the special_small_blocks property value. This change fixes the small code issue and updates the ZFS Test Suite and the zfs(8) man page. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Don Brady <don.brady@delphix.com> Signed-off-by: loli10K <ezomori.nozomu@gmail.com> Closes #8351 Closes #8361	2019-02-08 12:32:12 -08:00
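The boundary condition described in this commit can be illustrated with a minimal sketch. It models only the size comparison; the function names are hypothetical, and the real spa_preferred_class() takes other inputs into account that are not shown here.

```python
# Simplified model of the size check only. Function names are hypothetical
# and do not correspond to the actual ZFS source.

def preferred_class_buggy(block_size, special_small_blocks):
    # Strict comparison: a block exactly at the threshold is misplaced.
    return "special" if block_size < special_small_blocks else "normal"


def preferred_class_fixed(block_size, special_small_blocks):
    # Inclusive comparison: a block equal to the threshold goes to "special".
    return "special" if block_size <= special_small_blocks else "normal"


threshold = 32 * 1024
print(preferred_class_buggy(threshold, threshold))
print(preferred_class_fixed(threshold, threshold))
```

Only the block whose size equals the property value changes class between the two versions; blocks strictly above or below the threshold are classified the same way by both.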
Tim Chase	0902c4577f	Fix ARC stats for embedded blkptrs Re-factor arc_read() to better account for embedded data blkptrs. Previously, reading the payload from an embedded blkptr would cause arcstats such as demand_metadata_misses to be bumped when there was actually no cache "miss" because the data are already available in the blkptr. The following test procedure was used to demonstrate the problem: zpool create tank ... zfs create -o compression=lz4 tank/fs echo blah > /tank/fs/blah stat /tank/fs/blah grep 'meta.*mis' /proc/spl/kstat/zfs/arcstats and repeating the last two steps to watch the metadata miss counter increment. This can also be demonstrated via the zfs_arc_miss DTRACE4 probe in arc_read(). Reviewed-by: loli10K <ezomori.nozomu@gmail.com> Reviewed-by: George Wilson <george.wilson@delphix.com> Reviewed-by: Matt Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Tim Chase <tim@chase2k.com> Closes #8319	2019-02-04 09:33:30 -08:00
Ahmed Ghanem	9634299657	OpenZFS 9185 - Enable testing over NFS in ZFS performance tests This change makes additions to the ZFS test suite that allows the performance tests to run over NFS. The test is run and performance data collected from the server side, while IO is generated on the NFS client. This has been tested with Linux and illumos NFS clients. Authored by: Ahmed Ghanem <ahmedg@delphix.com> Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: John Kennedy <john.kennedy@delphix.com> Reviewed by: Kevin Greene <kevin.greene@delphix.com> Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Ported-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: John Kennedy <john.kennedy@delphix.com> OpenZFS-issue: https://www.illumos.org/issues/9185 Closes #8367	2019-02-04 09:27:37 -08:00
loli10K	1a745ef62e	zstreamdump: -d option is not documented in manpage This change simply documents the missing -d (dump contents) option in zstreamdump(8). Reviewed-by: bunder2015 <omfgbunder@gmail.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Igor Kozhukhov <igor@dilos.org> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: loli10K <ezomori.nozomu@gmail.com> Closes #8369	2019-02-04 09:13:00 -08:00
bunder2015	bf6ca0a631	shellcheck pass note: which is non-standard. Use builtin 'command -v' instead. [SC2230] note: Use -n instead of ! -z. [SC2236] Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Giuseppe Di Natale <guss80@gmail.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: bunder2015 <omfgbunder@gmail.com> Closes #8367	2019-02-04 09:07:19 -08:00
bunder2015	cca14128c9	flake8 pass F632 use ==/!= to compare str, bytes, and int literals Reviewed-by: Håkan Johansson <f96hajo@chalmers.se> Reviewed-by: Giuseppe Di Natale <guss80@gmail.com> Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: bunder2015 <omfgbunder@gmail.com> Closes #8368	2019-02-04 09:02:46 -08:00
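As a hedged illustration of what the F632 warning is about (this is not code from the repository; the function and variable names are made up), comparing against a literal should use value equality rather than identity:

```python
def is_default_mode(mode):
    # Compare by value. Writing `mode is "default"` would test object
    # identity instead, which is what flake8 reports as F632.
    return mode == "default"


# A string assembled at run time equals the literal by value, whether or not
# the interpreter happens to reuse the same object for it.
runtime_mode = "".join(["def", "ault"])
print(is_default_mode(runtime_mode))
```

Identity comparisons against literals can appear to work when the interpreter reuses an object, which is why the check is easy to miss without a linter.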
Tony Hutter	57dc41de96	Fix zpool iostat -w header names The zpool iostat latency histograms (-w) have column names 'sync_queue' and 'async_queue', which do not match the man page, nor the equivalent columns in average latency. Change the column names to be 'syncq_wait' and 'asyncq_wait' to be consistent. Reviewed-by: Olaf Faaland <faaland1@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #8338	2019-01-31 10:51:18 -08:00
Serapheim Dimitropoulos	6c926f426a	Simplify log vdev removal code Get rid of the majority of the metaslab metadata when removing log vdevs in spa_vdev_remove_log() with a call to metaslab_fini() instead of duplicating a lot of that in vdev_remove_empty_log(). Reviewed-by: Matt Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com> Closes #8347	2019-01-31 09:17:52 -08:00
Serapheim Dimitropoulos	7558997d2f	vs_alloc can underflow in L2ARC vdevs The current L2 ARC device code consistently uses psize to increment vs_alloc but varies between psize and lsize when decrementing it. The result of this behavior is that vs_alloc can be decremented more than it is incremented and underflow. This patch changes the code so asize is used everywhere. In addition, it ensures that vs_alloc gets incremented by the L2 ARC device code as buffers are written and not at the end of the l2arc_write_buffers() routine. The latter (and old) way would temporarily underflow vs_alloc as buffers that were just written would be destroyed while l2arc_write_buffers() was still looping. Reviewed-by: Matt Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com> Closes #8298	2019-01-31 09:16:39 -08:00
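The underflow described here can be reproduced with a small model of an unsigned 64-bit counter. This is a sketch under stated assumptions: the class and helper names are hypothetical, and the sizes are arbitrary example values rather than anything taken from the L2ARC code.

```python
U64_MASK = (1 << 64) - 1  # emulate wraparound of an unsigned 64-bit counter


class VdevStat:
    """Hypothetical stand-in for a structure holding vs_alloc."""

    def __init__(self):
        self.vs_alloc = 0

    def adjust(self, delta):
        # Wrap like a uint64_t instead of going negative.
        self.vs_alloc = (self.vs_alloc + delta) & U64_MASK


def write_then_evict(inc_size, dec_size):
    """Increment on write and decrement on eviction, then report vs_alloc."""
    stat = VdevStat()
    stat.adjust(inc_size)
    stat.adjust(-dec_size)
    return stat.vs_alloc


# Mismatched sizes (e.g. a compressed size on write, a larger logical size
# on eviction) wrap the counter around to a huge value.
print(hex(write_then_evict(4096, 8192)))
# Using the same size on both paths returns the counter to zero.
print(write_then_evict(4096, 4096))
```

Using one size consistently for both the increment and the decrement, as the patch does with asize, is what keeps the counter balanced.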
Sara Hartse	2747f599ff	Don't acquire zthr_request_lock in zthr_wakeup Address a deadlock caused by simultaneous wakeup and cancel on a zthr by removing the hold of zthr_request_lock from zthr_wakeup. This allows zthr_wakeup to not block a thread that is in the process of being cancelled. Reviewed-by: Matt Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com> Signed-off-by: Sara Hartse <sara.hartse@delphix.com> Closes #8333	2019-01-30 12:31:16 -08:00
Serapheim Dimitropoulos	21e7cf5da8	zdb -L should skip leak detection altogether Currently the point of -L option in zdb is to disable leak tracing and the loading of space maps because they are expensive, yet still do leak detection in terms of space. Unfortunately, there is a scenario where this is a lie. If we are using zdb -L on a pool where a vdev is being removed, zdb_claim_removing() will open the metaslab space maps of that device. This patch makes it so zdb -L skips leak detection altogether and ensures that no space maps are loaded. Reviewed-by: Matt Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com> Closes #8335	2019-01-30 09:54:27 -08:00
Tony Hutter	466f55334a	Exclude test-runner.py from the rpmbuild shebang check Exclude test-runner.py from the rpmbuild shebang check to allow it to run under Python 2 and 3. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #8331	2019-01-28 10:11:45 -08:00
Tony Hutter	caacc6e4c4	GCC 9.0: Fix ztest "directive argument is not a nul-terminated string" GCC 9.0 is complaining because we're trying to print strings that are defined like this: .zo_pool = { 'z', 't', 'e', 's', 't', '\0' }, Fix them by making them actual strings. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #8330	2019-01-28 10:11:45 -08:00
Brian Behlendorf	26a856594f	Linux 5.0 compat: Fix bio_set_dev() The Linux 5.0 kernel updated the bio_set_dev() macro so it calls the GPL-only bio_associate_blkg() symbol thus inadvertently converting the entire macro. Provide a minimal version which always assigns the request queue's root_blkg to the bio. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #8287	2019-01-28 10:11:45 -08:00
Tony Hutter	0c593296e9	Linux 5.0 compat: Disable vector instructions on 5.0+ kernels The 5.0 kernel no longer exports the functions we need to do vector (SSE/SSE2/SSE3/AVX...) instructions. Disable vector-based checksum algorithms when building against those kernels. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #8259	2019-01-28 10:11:45 -08:00
Tony Hutter	ed158b19b1	Linux 5.0 compat: Fix SUBDIRs SUBDIRs has been deprecated for a long time, and was finally removed in the 5.0 kernel. Use "M=" instead. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #8257	2019-01-28 10:11:45 -08:00
Tony Hutter	05805494dd	Linux 5.0 compat: Convert MS_* macros to SB_* In the 5.0 kernel, only the mount namespace code should use the MS_* macros. Filesystems should use the SB_* ones. https://patchwork.kernel.org/patch/10552493/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #8264	2019-01-28 10:11:39 -08:00
Tony Hutter	031cea17a3	Linux 5.0 compat: Use totalram_pages() totalram_pages() was converted to an atomic variable in 5.0: https://patchwork.kernel.org/patch/10652795/ Its value should now be read through the totalram_pages() helper function. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #8263	2019-01-28 10:11:14 -08:00
Tony Hutter	77e50c3070	Linux 5.0 compat: access_ok() drops 'type' parameter access_ok no longer needs a 'type' parameter in the 5.0 kernel. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #8261	2019-01-28 10:11:10 -08:00
Tony Hutter	5cb46f6a66	Linux 4.18 compat: Use ktime_get_coarse_real_ts64() Newer kernels remove current_kernel_time64(). Use ktime_get_coarse_real_ts64() in its place. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #8258	2019-01-28 10:11:03 -08:00
Serapheim Dimitropoulos	c853f382db	Change target size of metaslabs from 256GB to 16GB = Old behavior For vdev sizes 100GB to 50TB we keep ~200 metaslabs per vdev and the metaslab size grows from 512MB to 256GB. For vdev's bigger than that we start increasing the number of metaslabs until we hit the 128K limit. = New Behavior For vdev sizes 100GB to 3TB we keep ~200 metaslabs per vdev and the metaslab size grows from 512MB to 16GB. For vdev's bigger than that we start increasing the number of metaslabs until we hit the 128K limit. = Reasoning The old behavior makes metaslabs grow in size when the vdev range is between 3TB (ms_size 16GB) and 32PB (ms_size 256GB). Even though keeping the number of metaslabs is good in terms of potential number of I/Os per TXG, these bigger metaslabs take longer to be loaded and after they are loaded they can take up a lot of memory because of their range trees. This change tries to put a boundary in memory and loading time for the specific range of vdev sizes. Reviewed-by: Matt Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Don Brady <don.brady@delphix.com> Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com> Closes #8324	2019-01-25 16:38:27 -08:00
Serapheim Dimitropoulos	df72b8bebe	Rename range_tree_verify to range_tree_verify_not_present The range_tree_verify function looks for a segment in a range tree and panics if the segment is present on the tree. This patch gives the function a more descriptive name. Reviewed-by: Matt Ahrens <mahrens@delphix.com> Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com> Closes #8327	2019-01-25 09:51:24 -08:00
Tim Chase	107dd2b174	Use proper tag for spa config refcounts in mmp_write_uberblock() This allows the spa config refcounts to use tracking in debug builds without triggering the "No such hold %p on refcount" panic. Reviewed-by: Olaf Faaland <faaland1@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tim Chase <tim@chase2k.com> Closes #8326	2019-01-25 09:50:06 -08:00
loli10K	7646af20ad	zfs userspace dumps core when used on ZVOLs If you try to get the userspace, groupspace or projectspace on a ZVOL, the generated error results in passing EINVAL to zfs_standard_error_fmt() when we should return a specific error to inform the user that those properties aren't available on volumes. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed by: Tom Caputi <tcaputi@datto.com> Signed-off-by: loli10K <ezomori.nozomu@gmail.com> Closes #8279	2019-01-25 09:47:52 -08:00
Damian Wojsław	8fccfa8e17	zpool iostat should print headers when terminal fills When `zpool iostat` fills the terminal the headers should be printed again. `zpool iostat -n` can be used to suppress this. If the command is not attached to a tty, headers will not be printed so as to not break existing scripts. Reviewed-by: Joshua M. Clulow <josh@sysmgr.org> Reviewed-by: Giuseppe Di Natale <guss80@gmail.com> Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Damian Wojsław <damian@wojslaw.pl> Closes #8235 Closes #8262	2019-01-23 13:29:49 -08:00
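The header behavior described above amounts to a small decision rule, sketched below. The function name and parameters are hypothetical, and two points are assumptions rather than facts from the commit: that the initial header is always printed, and that a repeat is due once every `terminal_rows` output lines.

```python
def should_print_header(lines_printed, terminal_rows, is_tty, suppress=False):
    """Decide whether to emit the column headers before the next row.

    lines_printed: data rows emitted so far
    terminal_rows: height of the terminal in rows
    is_tty:        whether output is attached to a terminal
    suppress:      models an option that disables repeated headers
    """
    if lines_printed == 0:
        return True  # assumption: the first header is always shown
    if suppress or not is_tty or terminal_rows <= 0:
        return False  # never repeat headers for scripts or when suppressed
    return lines_printed % terminal_rows == 0


# On a 24-row terminal the header repeats after every 24 rows of output.
print([n for n in range(60) if should_print_header(n, 24, is_tty=True)])
```

In a real program the inputs could come from `sys.stdout.isatty()` and `shutil.get_terminal_size().lines`; they are passed in explicitly here so the rule stays easy to test.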
Tom Caputi	b5d693581d	Fix bad kmem_free() in zvol_rename_minors_impl() Currently, zvol_rename_minors_impl() calls kmem_asprintf() to allocate and initialize a string. This function is a thin wrapper around the kernel's kvasprintf() and does not call into the SPL's kmem tracking code when it is enabled. However, this function frees the string with the tracked kmem_free() instead of the untracked strfree(), which causes the SPL kmem tracking code to believe that the function is attempting to free memory it never allocated, triggering an ASSERT. This patch simply corrects this issue. Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tom Caputi <tcaputi@datto.com> Closes #8307	2019-01-23 11:38:05 -08:00
loli10K	0a10863194	ztest: creates partially initialized root dataset Since `d8fdfc2` was integrated dsl_pool_create() does not call dmu_objset_create_impl() for the root dataset when running in userland (ztest): this creates a pool with a partially initialized root dataset. Trying to import and use this pool results in both zpool and zfs executables dumping core. Fix this by adopting an alternative change suggested in OpenZFS 8607 code review. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed by: Tom Caputi <tcaputi@datto.com> Original-patch-by: Robert Mustacchi <rm@joyent.com> Signed-off-by: loli10K <ezomori.nozomu@gmail.com> Closes #8277	2019-01-18 11:14:01 -08:00
Brian Behlendorf	ad63507135	Remove zfs_sync() panicking kernel check This check provides no real additional protection and unnecessarily introduces a dependency on the "oops_in_progress" kernel symbol. Remove the check; if there are special circumstances on other platforms which make this a requirement, it can be reintroduced for all relevant call paths in a more portable, comprehensive manner. Reviewed-by: Matt Ahrens <mahrens@delphix.com> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #8297	2019-01-18 11:11:47 -08:00
Serapheim Dimitropoulos	b194fab0fb	Factor metaslab_load_wait() in metaslab_load() Most callers that need to operate on a loaded metaslab always call metaslab_load_wait() before loading the metaslab just in case someone else is already doing the work. Factoring metaslab_load_wait() within metaslab_load() makes the latter more robust, as callers won't have to do the load-wait check explicitly every time they need to load a metaslab. Reviewed-by: Matt Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com> Closes #8290	2019-01-18 11:10:32 -08:00
Tom Caputi	960347d3a6	Fix 0 byte memory leak in zfs receive Currently, when a DRR_OBJECT record is read into memory in receive_read_record(), memory is allocated for the bonus buffer. However, if the object doesn't have a bonus buffer the code will still "allocate" the zero bytes, but the memory will not be passed to the processing thread for cleanup later. This causes the spl kmem tracking code to report a leak. This patch simply changes the code so that it only allocates this memory if it has a non-zero length. Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tom Caputi <tcaputi@datto.com> Closes #8266	2019-01-18 11:06:48 -08:00
Serapheim Dimitropoulos	1a759200e5	Document guidelines for usage of zfs_dbgmsg Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Matt Ahrens <matt@delphix.com> Reviewed-by: Igor Kozhukhov <igor@dilos.org> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com> Closes #8299	2019-01-18 10:16:56 -08:00
Neal Gompa (ニール・ゴンパ)	e45c1734a6	dkms: Enable debuginfo option to be set with zfs sysconfig file On some Linux distributions, the kernel module build will not default to building with debuginfo symbols, which can make it difficult for debugging and testing. For this case, we provide a flag to override the build to force debuginfo to be produced for the kernel module build. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Co-authored-by: Neal Gompa <ngompa@datto.com> Co-authored-by: Simon Watson <swatson@datto.com> Signed-off-by: Neal Gompa <ngompa@datto.com> Signed-off-by: Simon Watson <swatson@datto.com> Closes #8304	2019-01-18 10:10:24 -08:00
loli10K	60b0a963f5	Off-by-one in zap_leaf_array_create() Trying to set user properties with their length 1 byte shorter than the maximum size triggers an assertion failure in zap_leaf_array_create(): panic[cpu0]/thread=ffffff000a092c40: assertion failed: num_integers * integer_size < (8<<10) (0x2000 < 0x2000), file: ../../common/fs/zfs/zap_leaf.c, line: 233 ffffff000a092500 genunix:process_type+167c35 () ffffff000a0925a0 zfs:zap_leaf_array_create+1d2 () ffffff000a092650 zfs:zap_entry_create+1be () ffffff000a092720 zfs:fzap_update+ed () ffffff000a0927d0 zfs:zap_update+1a5 () ffffff000a0928d0 zfs:dsl_prop_set_sync_impl+5c6 () ffffff000a092970 zfs:dsl_props_set_sync_impl+fc () ffffff000a0929b0 zfs:dsl_props_set_sync+79 () ffffff000a0929f0 zfs:dsl_sync_task_sync+10a () ffffff000a092a80 zfs:dsl_pool_sync+3a3 () ffffff000a092b50 zfs:spa_sync+4e6 () ffffff000a092c20 zfs:txg_sync_thread+297 () ffffff000a092c30 unix:thread_start+8 () This patch simply corrects the assertion. Reviewed-by: Matt Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: loli10K <ezomori.nozomu@gmail.com> Closes #8278	2019-01-18 09:58:46 -08:00
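The failed assertion in the panic above compares 0x2000 against 0x2000 with a strict less-than. A minimal sketch of the boundary follows; the names are hypothetical, and the assumption that the corrected check simply permits equality is an inference from the panic message, not something the commit message spells out.

```python
MAX_VALUE_LEN = 8 << 10  # 0x2000, the limit that appears in the panic message


def length_ok_strict(num_integers, integer_size):
    # Mirrors the failing check: 0x2000 < 0x2000 is false.
    return num_integers * integer_size < MAX_VALUE_LEN


def length_ok_inclusive(num_integers, integer_size):
    # Assumed correction: a value exactly at the limit is accepted.
    return num_integers * integer_size <= MAX_VALUE_LEN


# A value that occupies exactly 0x2000 bytes in total.
print(length_ok_strict(0x2000, 1), length_ok_inclusive(0x2000, 1))
```

Values above the limit are still rejected by the inclusive form, so only the exact-boundary case changes.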
Serapheim Dimitropoulos	8dc2197b7b	Simplify spa_sync by breaking it up to smaller functions The point of this refactoring is to break the high-level conceptual steps of spa_sync() to their own helper functions. In general large functions can enhance readability if structured well, but in this case the amount of conceptual steps taken could use the help of helper functions. Reviewed-by: Matt Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com> Closes #8293	2019-01-18 09:50:16 -08:00
Brian Behlendorf	ce5fb2a7c6	ztest: scrub verification By design ztest will never inject non-repairable damage into the pool. Update the ztest_scrub() test case such that it waits for the scrub to complete and verifies the pool is always repairable. After enabling scrub verification two scenarios were encountered which are the result of how ztest manages failure injection. The first case is straightforward and pertains to detaching a mirror vdev. In this case, the pool must always be scrubbed prior to the detach. Failure to do so can potentially lock in previously repairable data corruption by removing all good copies of a block leaving only damaged ones. The second is a little more subtle. The child/offset selection logic in ztest_fault_inject() depends on the calculated number of leaves always remaining constant between injection passes. This is true within a single execution of ztest, but when using zloop.sh random values are selected for each restart. Therefore, when ztest imports an existing pool it must be scrubbed before failure injection can be safely enabled. Otherwise it is possible that it will inject non-repairable damage. Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed by: Tom Caputi <tcaputi@datto.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #8269	2019-01-18 09:47:55 -08:00
Tom Caputi	305781da4b	Fix error handling in callers of dbuf_hold_level() Currently, the functions dbuf_prefetch_indirect_done() and dmu_assign_arcbuf_by_dnode() assume that dbuf_hold_level() cannot fail. In the event of an error the former will cause a NULL pointer dereference and the latter will trigger a VERIFY. This patch adds error handling to these functions and their callers where necessary. Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tom Caputi <tcaputi@datto.com> Closes #8291	2019-01-17 15:47:08 -08:00
Serapheim Dimitropoulos	75058f3303	Remove unused vdev_t fields The following fields from the vdev_t struct are not used anywhere. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com> Closes #8285	2019-01-17 15:41:12 -08:00
Brian Behlendorf	52b684236d	ztest: scrub ddt repair The ztest_ddt_repair() test is designed to inflict damage to the ddt which can be repaired by a scrub. Unfortunately, this repair logic was broken at some point and it went undetected. This issue is not specific to ztest, but thankfully this extra redundancy is rarely enabled and even more rarely needed. The root cause was identified to be the ddt_bp_create() function called by dsl_scan_ddt_entry() which did not set the dedup bit of the generated block pointer. The consequence of this was that the ZIO_DDT_READ_PIPELINE was never enabled for the block pointer during the scrub, and the dedup ditto repair logic was never run. Note that for demand reads which don't rely on ddt_bp_create() the required pipeline stages would be enabled and the repair performed. This was resolved by unconditionally setting the dedup bit in ddt_bp_create(). This way all code paths which may need to perform a repair from a block pointer generated from the ddt entry will be able to. The only exception is that the dedup bit is cleared in ddt_phys_free() which is required to avoid leaking space. Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed by: Tom Caputi <tcaputi@datto.com> Reviewed by: Serapheim Dimitropoulos <serapheim@delphix.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #8270	2019-01-17 15:25:00 -08:00
Serapheim Dimitropoulos	419ba59145	Update vdev_is_spacemap_addressable() for new spacemap encoding Since the new spacemap encoding was ported to ZoL that's no longer a limitation. This patch updates vdev_is_spacemap_addressable() that was performing that check. It also updates the appropriate test to ensure that the same functionality is tested. The test does so by creating pools that don't have the new spacemap encoding enabled - just the checkpoint feature. This patch also reorganizes those same tests in order to cut their memory consumption in half. Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com> Closes #8286	2019-01-16 15:06:20 -08:00
Brian Behlendorf	64bdf63f5c	ztest: split block reconstruction Increase the default allowed number of reconstruction attempts. There's not an exact right number for this setting. It needs to be set large enough to cover any realistic failure scenarios and small enough to avoid stalling the IO pipeline and invoking the dead man detection. The current value of 256 was empirically determined to be too low based on multi-day runs of ztest. The fault injection code would inject more damage than could be reconstructed given the relatively small number of attempts. However, in all observed cases the block could be reconstructed using a slightly higher limit. Based on local testing increasing the default value to 4096 was determined to strike the best balance. Checking all combinations takes less than 10s in the worst case, and has so far eliminated the vast majority of false positives detected by ztest. This delay is roughly on par with how long retries may be performed to a misbehaving HDD and was deemed to be reasonable. Better to err on the side of a brief delay rather than fail to reconstruct the data. Lastly, the -Y flag has been added to zdb to make it easy to try all possible combinations when performing split block reconstruction. For badly damaged blocks with 18 splits, they can be fully enumerated within a few minutes. This has been done to ensure permanent errors are never incorrectly reported when ztest verifies the pool with zdb. Reviewed by: Tom Caputi <tcaputi@datto.com> Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed by: Serapheim Dimitropoulos <serapheim@delphix.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #8271	2019-01-16 14:10:02 -08:00
Serapheim Dimitropoulos	db587941c5	Make zdb results for checkpoint tests consistent This patch exports and re-imports the pool when these tests are analyzed with zdb to get consistent results. Reviewed by: Igor Kozhukhov <igor@dilos.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com> Closes #8292	2019-01-16 10:41:47 -08:00
Brian Behlendorf	6e91a72fe3	Disable 'zfs remap' command The implementation of 'zfs remap' has proven to be problematic since it modifies the objset (but not its logical contents) by dirtying metadata without owning it. The consequence of which is that dmu_objset_remap_indirects() is vulnerable to certain races. For example, if we are in the middle of receiving into the filesystem while it is being remapped. Then it is possible we could evict the objset when the receive completes (see dsl_dataset_clone_swap_sync_impl, or dmu_recv_end_sync), but dmu_objset_remap_indirects() may be still using the objset. The result of which would be a panic. Extended runs of ztest(8) have exposed other possible races which can occur when using 'zfs remap'. Several of these have been fixed but there may be others which have not yet been encountered and diagnosed. Furthermore, the ability to manually remap a filesystem is no longer particularly useful now that the removal code can map large chunks. Coupled with the fact that explaining what this command does and why it may be useful requires a detailed understanding of the internals of device removal. These are details users should not be bothered with. Therefore, the 'zfs remap' command is being disabled but not entirely removed. It may be removed in the future or potentially reworked to address the issues described above. Since 'zfs remap' has never been part of a tagged release its removal is expected to have minimal impact. The ZTS tests have been updated to continue to exercise the command to prevent atrophy, but it has been removed entirely from ztest(8). Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed by: Tom Caputi <tcaputi@datto.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #8238	2019-01-15 15:46:58 -08:00
Tom Caputi	5e7f3ace58	Fix zio leak in dbuf_read() Currently, dbuf_read() may decide to create a zio_root which is used as a parent for any child zios created in dbuf_read_impl(). However, if there is an error in dbuf_read_impl(), this zio is never executed and ends up leaked. This patch simply ensures that we always execute the root zio, even if it has no real work to do. Reviewed-by: Matt Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tom Caputi <tcaputi@datto.com> Closes #8267	2019-01-15 12:23:40 -08:00
loli10K	7b02fae7a6	Verify .gitignore entries This change adds a make target 'vcscheck' which scans the git workspace for new, untracked files missing from the .gitignore configuration; this is done to help prevent adding unwanted build artifacts to the source tree during development. Reviewed-by: Neal Gompa <ngompa@datto.com> Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: loli10K <ezomori.nozomu@gmail.com> Closes #8281	2019-01-15 11:56:29 -08:00
Brian Behlendorf	9b626c126e	Tag 0.8.0-rc3 Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2019-01-14 12:40:42 -08:00
Brian Behlendorf	d611989fdc	Minor spelling corrections Some minor spelling mistakes and typos. No functional changes. Reviewed-by: Neal Gompa <ngompa@datto.com> Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed-by: Giuseppe Di Natale <guss80@gmail.com> Reviewed-by: bunder2015 <omfgbunder@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #8272	2019-01-13 10:11:52 -08:00
Serapheim Dimitropoulos	61c3391acc	Serialize ZTHR operations to eliminate races Adds a new lock for serializing operations on zthrs. The commit also includes some code cleanup and refactoring. Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed by: Tom Caputi <tcaputi@datto.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com> Closes #8229	2019-01-13 10:09:46 -08:00
Paul Zuchowski	83c796c5e9	zfs filesystem skipped by df -h On full pool when pool root filesystem references very few bytes, the f_blocks returned to statvfs is 0 but should be at least 1. Reviewed by: Tom Caputi <tcaputi@datto.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Paul Zuchowski <pzuchowski@datto.com> Closes #8253 Closes #8254	2019-01-13 10:06:13 -08:00


4947 Commits