mirror_zfs

mirror of https://git.proxmox.com/git/mirror_zfs.git synced 2026-04-17 08:54:52 +03:00

Author	SHA1	Message	Date
LOLi	3cbe89b12a	Fix zfs incremental send remove '-o' properties When receiving an incremental send stream with intermediary snapshots zfs_receive_one() does not correctly identify the top-level dataset: consequently we restore said snapshots as if they were children datasets in the hierarchy, forcing inheritance of any property received with 'zfs send -o' and effectively removing any locally set value. The test case did not correctly verify this situation because it uses adjacent snapshots, basically testing 'zfs send -i' instead of 'zfs send -I': this commit adds an additional intermediary snapshot to the test script. Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: loli10K <ezomori.nozomu@gmail.com> Closes #7478	2018-04-30 20:58:29 -07:00
Antonio Russo	c83ccb3e72	Add test with two kinds of file creation orders Data loss was identified in #7401 when many small files were copied. This adds a reproducer for this bug and other similar ones: randomly generate N files. Then, listing M of them by `ls -U` order, produce those same files in a directory of the same name. This triggers the bug consistently, provided N and M are large enough. Here, N=2^16 and M=2^13. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Antonio Russo <antonio.e.russo@gmail.com> Closes #7411	2018-04-30 12:45:47 -05:00
LOLi	b4555c777a	Fix 'zfs remap <poolname@snapname>' Only filesystems and volumes are valid 'zfs remap' parameters: when passed a snapshot name zfs_remap_indirects() does not handle the EINVAL returned from libzfs_core, which results in failing an assertion and consequently crashing. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov> Signed-off-by: loli10K <ezomori.nozomu@gmail.com> Closes #7454	2018-04-19 09:45:17 -07:00
Chunwei Chen	599b864813	Fix ENOSPC in "Handle zap_add() failures in ..." Commit `cc63068` caused an ENOSPC error when copying a large number of files between two directories. The reason is that the patch limits zap leaf expansion to 2 retries, and returns ENOSPC when that fails. The intent of limiting retries is to prevent pointlessly growing the table to max size when adding a block full of entries with the same name in different case in mixed mode. However, it turns out we cannot use any limit on the retry. When we copy files from one directory in readdir order, we are copying in hash order, one leaf block at a time. This means that if the leaf block in the source directory has expanded 6 times, and you copy the entries in that block, by the time you need to expand the leaf in the destination directory, you need to expand it 6 times in one go. So any limit on the retry will result in an error where there shouldn't be one. Note that while we do use a different salt for different directories, it seems that the salt/hash function doesn't provide enough randomization to the hash distance to prevent this from happening. Since `cc63068` has already been reverted, this patch adds it back and removes the retry limit. Also, as it turns out, failing on zap_add() has a serious side effect for mzap_upgrade(). When upgrading from micro zap to fat zap, it will call zap_add() to transfer entries one at a time. If it hits any error halfway through, the remaining entries will be lost, causing those files to become orphans. This patch adds a VERIFY to catch it. Reviewed-by: Sanjeev Bagewadi <sanjeev.bagewadi@gmail.com> Reviewed-by: Richard Yao <ryao@gentoo.org> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Albert Lee <trisk@forkgnu.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Signed-off-by: Chunwei Chen <david.chen@nutanix.com> Closes #7401 Closes #7421	2018-04-18 14:19:50 -07:00
Tom Caputi	b0ee5946aa	Fix issues with raw sends of spill blocks This patch fixes 2 issues in how spill blocks are processed during raw sends. The first problem is that compressed spill blocks were using the logical length rather than the physical length to determine how much data to dump into the send stream. The second issue is a typo that caused the spill record's object number to be used where the objset's ID number was required. Both issues have been corrected, and the payload_size is now printed in zstreamdump for future debugging. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tom Caputi <tcaputi@datto.com> Closes #7378 Closes #7432	2018-04-17 11:19:03 -07:00
Tom Caputi	e14a32b1c8	Fix object reclaim when using large dnodes Currently, when the receive_object() code wants to reclaim an object, it always assumes that the dnode is the legacy 512 bytes, even when the incoming bonus buffer exceeds this length. This causes a buffer overflow if --enable-debug is not provided and triggers an ASSERT if it is. This patch resolves this issue and adds an ASSERT to ensure this can't happen again. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tom Caputi <tcaputi@datto.com> Closes #7097 Closes #7433	2018-04-17 11:13:57 -07:00
bunder2015	b40d45bc6c	ZTS: fix reservation_013_pos integer overflow When using large disks the integers for calculating sizes can overflow past 2**31. Changing to long integers with typeset should correct this. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: bunder2015 <omfgbunder@gmail.com> Closes #4444 Closes #7451	2018-04-17 10:52:53 -07:00
Matt Ahrens	d830d4795a	OpenZFS 9280 - Assertion failure while running removal_with_ganging test with 4K devices Authored by: Matt Ahrens <Matt.Ahrens@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov> Approved by: Garrett D'Amore <garrett@damore.org> Ported-by: Brian Behlendorf <behlendorf1@llnl.gov> OpenZFS-issue: https://www.illumos.org/issues/9280 OpenZFS-commit: https://github.com/openzfs/openzfs/commit/243952c Closes #7445	2018-04-17 10:44:50 -07:00
bunder2015	3eba666332	ZTS: zpool_create_002 clean up leftover filedisk zpool_create_002_pos did not clean up filedisk files left over from running the test. Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: bunder2015 <omfgbunder@gmail.com> Closes #7435 Closes #7439	2018-04-15 15:17:44 -07:00
Toomas Soome	5e567da987	OpenZFS 9213 - zfs: sytem typo Authored by: Toomas Soome <tsoome@me.com> Reviewed by: C Fraire <cfraire@me.com> Reviewed by: Andy Fiddaman <omnios@citrus-it.co.uk> Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Approved by: Joshua M. Clulow <josh@sysmgr.org> Ported-by: Brian Behlendorf <behlendorf1@llnl.gov> Porting Notes: * The additional instances of this typo addressed in the OpenZFS patch were already resolved. OpenZFS-issue: https://illumos.org/issues/9213 OpenZFS-commit: https://github.com/openzfs/openzfs/commit/edc8ef7d92 Closes #7436	2018-04-15 10:59:13 -07:00
Matthew Ahrens	a1d477c24c	OpenZFS 7614, 9064 - zfs device evacuation/removal OpenZFS 7614 - zfs device evacuation/removal OpenZFS 9064 - remove_mirror should wait for device removal to complete This project allows top-level vdevs to be removed from the storage pool with "zpool remove", reducing the total amount of storage in the pool. This operation copies all allocated regions of the device to be removed onto other devices, recording the mapping from old to new location. After the removal is complete, read and free operations to the removed (now "indirect") vdev must be remapped and performed at the new location on disk. The indirect mapping table is kept in memory whenever the pool is loaded, so there is minimal performance overhead when doing operations on the indirect vdev. The size of the in-memory mapping table will be reduced when its entries become "obsolete" because they are no longer used by any block pointers in the pool. An entry becomes obsolete when all the blocks that use it are freed. An entry can also become obsolete when all the snapshots that reference it are deleted, and the block pointers that reference it have been "remapped" in all filesystems/zvols (and clones). Whenever an indirect block is written, all the block pointers in it will be "remapped" to their new (concrete) locations if possible. This process can be accelerated by using the "zfs remap" command to proactively rewrite all indirect blocks that reference indirect (removed) vdevs. Note that when a device is removed, we do not verify the checksum of the data that is copied. This makes the process much faster, but if it were used on redundant vdevs (i.e. mirror or raidz vdevs), it would be possible to copy the wrong data, when we have the correct data on e.g. the other side of the mirror. At the moment, only mirrors and simple top-level vdevs can be removed and no removal is allowed if any of the top-level vdevs are raidz. 
Porting Notes: * Avoid zero-sized kmem_alloc() in vdev_compact_children(). The device evacuation code adds a dependency that vdev_compact_children() be able to properly empty the vdev_child array by setting it to NULL and zeroing vdev_children. Under Linux, kmem_alloc() and related functions return a sentinel pointer rather than NULL for zero-sized allocations. * Remove comment regarding "mpt" driver where zfs_remove_max_segment is initialized to SPA_MAXBLOCKSIZE. Change zfs_condense_indirect_commit_entry_delay_ticks to zfs_condense_indirect_commit_entry_delay_ms for consistency with most other tunables in which delays are specified in ms. * ZTS changes: Use set_tunable rather than mdb Use zpool sync as appropriate Use sync_pool instead of sync Kill jobs during test_removal_with_operation to allow unmount/export Don't add non-disk names such as "mirror" or "raidz" to $DISKS Use $TEST_BASE_DIR instead of /tmp Increase HZ from 100 to 1000 which is more common on Linux removal_multiple_indirection.ksh Reduce iterations in order to not time out on the code coverage builders. removal_resume_export: Functionally, the test case is correct but there exists a race where the kernel thread hasn't been fully started yet and is not visible. Wait for up to 1 second for the removal thread to be started before giving up on it. Also, increase the amount of data copied in order that the removal not finish before the export has a chance to fail. * MMP compatibility, the concept of concrete versus non-concrete devices has slightly changed the semantics of vdev_writeable(). Update mmp_random_leaf_impl() accordingly. * Updated dbuf_remap() to handle the org.zfsonlinux:large_dnode pool feature which is not supported by OpenZFS. * Added support for new vdev removal tracepoints. * Test cases removal_with_zdb and removal_condense_export have been intentionally disabled. 
When run manually they pass as intended, but when running in the automated test environment they produce unreliable results on the latest Fedora release. They may work better once the upstream pool import refactoring is merged into ZoL at which point they will be re-enabled. Authored by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Alex Reece <alex@delphix.com> Reviewed-by: George Wilson <george.wilson@delphix.com> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Richard Laager <rlaager@wiktel.com> Reviewed by: Tim Chase <tim@chase2k.com> Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Approved by: Garrett D'Amore <garrett@damore.org> Ported-by: Tim Chase <tim@chase2k.com> Signed-off-by: Tim Chase <tim@chase2k.com> OpenZFS-issue: https://www.illumos.org/issues/7614 OpenZFS-commit: https://github.com/openzfs/openzfs/commit/f539f1eb Closes #6900	2018-04-14 12:16:17 -07:00
Tim Chase	4b0f5b2d7b	Wait for resilver after online This test performs a rapid offline/online cycle of each of several mirror vdevs. It can run so quickly that there isn't sufficient pool redundancy to perform an offline. The solution is to wait until the pool is resilvered following the online operation. Also, add a pool sync before the offline operation to help reduce spurious errors. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tim Chase <tim@chase2k.com> Issue #6900	2018-04-13 18:04:10 -07:00
Seth Forshee	93b43af10d	Allow mounting datasets more than once Currently mounting an already mounted zfs dataset results in an error, whereas it is typically allowed with other filesystems. This causes some bad interactions with mount namespaces. Take this sequence for example: - Create a dataset - Create a snapshot of the dataset - Create a clone of the snapshot - Create a new mount namespace - Rename the original dataset The rename results in unmounting and remounting the clone in the original mount namespace, however the remount fails because the dataset is still mounted in the new mount namespace. (Note that this means the mount in the new mount namespace is never being unmounted, so perhaps the unmount/remount of the clone isn't actually necessary.) The problem here is a result of the way mounting is implemented in the kernel module. Since it is not mounting block devices it uses mount_nodev() instead of the usual mount_bdev(). However, mount_nodev() is written for filesystems for which each mount is a new instance (i.e. a new super block), and zfs should be able to detect when a mount request can be satisfied using an existing super block. Change zpl_mount() to call sget() directly with its own test callback. Passing the objset_t object as the fs data allows checking if a superblock already exists for the dataset, and in that case we just need to return a new reference for the sb's root dentry. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tom Caputi <tcaputi@datto.com> Signed-off-by: Alek Pinchuk <apinchuk@datto.com> Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Closes #5796 Closes #7207	2018-04-13 10:44:05 -07:00
bunder2015	1e37dee03f	ZTS: clean up leftover ibackup_trunc files zfs_receive_raw_incremental did not clean up ibackup_trunc.* files left over from running the test. Also changing the path of the ibackup files so they can be placed in the correct directories when /var/tmp is not the temporary directory. Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: bunder2015 <omfgbunder@gmail.com> Closes #7430	2018-04-13 10:35:55 -07:00
LOLi	7fab636188	Add 'zpool split' coverage to the ZFS Test Suite This change adds five new tests to the ZTS: * zpool_split_cliargs: verify command line options and arguments * zpool_split_devices: verify zpool split accepts a device list * zpool_split_encryption: verify zpool can split encrypted pools * zpool_split_props: verify zpool split can set property values * zpool_split_vdevs: verify vdev layout when splitting the pool Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: loli10K <ezomori.nozomu@gmail.com> Closes #7409	2018-04-12 10:57:24 -07:00
Tomohiro Kusumi	8111eb4abc	Fix calloc(3) arguments order calloc(3) takes `nelem` (or `nmemb` in glibc) first, and then size of elements. No difference is expected from having these in reverse order, but the code should follow the standard. http://pubs.opengroup.org/onlinepubs/009695399/functions/calloc.html Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@osnexus.com> Closes #7405	2018-04-12 10:50:39 -07:00
Mike Gerdts	d22f3a8244	OpenZFS 9286 - want refreservation=auto Authored by: Mike Gerdts <mike.gerdts@joyent.com> Reviewed by: Allan Jude <allanjude@freebsd.org> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: John Kennedy <john.kennedy@delphix.com> Reviewed by: Andy Stormont <astormont@racktopsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov> Approved by: Richard Lowe <richlowe@richlowe.net> Ported-by: Don Brady <don.brady@delphix.com> Porting Notes: * Adopted destroy_dataset in ZTS test cleanup * Use ksh shebang instead of bash for new tests OpenZFS-issue: https://www.illumos.org/issues/9286 OpenZFS-commit: https://github.com/openzfs/openzfs/commit/723d0c85 Closes #7387	2018-04-11 14:52:13 -07:00
LOLi	9966754ac5	Fix zpool set feature@<feature>=disabled Commit `e4010f2` accidentally allows zpool to set pool features to "disabled"; this should only be allowed at pool creation. This commit adds additional checks and test coverage to 'zpool set'. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: loli10K <ezomori.nozomu@gmail.com> Closes #7402	2018-04-11 14:45:58 -07:00
Tony Hutter	4f301661df	Revert "Handle zap_add() failures in mixed ... " This reverts commit `cc63068e95`. Under certain circumstances this change can result in an ENOSPC error when adding new files to a directory. See #7401 for full details. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Issue #7401 Closes #7416	2018-04-09 14:24:46 -07:00
Giuseppe Di Natale	7b47628acb	Clean up (k)shlib and cfg file shebangs Most kshlib files are imported by other scripts and do not have a shebang at the top of their files. Make all kshlib follow this convention. Remove shebangs from cfg files as well. Reviewed-by: loli10K <ezomori.nozomu@gmail.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov> Close #7406	2018-04-08 19:37:22 -07:00
Tony Hutter	6c9af9e8f4	Fix "file is executable, but no shebang" warnings Fedora 28's RPM build checks warn when executable files don't have a shebang line. These warnings are caused when we (incorrectly) include data & config files in the _SCRIPTS automake lines. Files in _SCRIPTS are marked executable by automake. This patch fixes the issue by including non-executable scripts in a _DATA line instead. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #7359 Closes #7395	2018-04-06 16:34:21 -07:00
Tom Caputi	1bf9a552bb	Make encrypted "zfs mount -a" failures consistent Currently, "zfs mount -a" will print a warning and fail to mount any encrypted datasets that do not have a key loaded. This patch makes the behavior of this failure consistent with other failure modes ("zfs mount -a" will silently continue, explicit "zfs mount" will print a message and return an error code). Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tom Caputi <tcaputi@datto.com> Closes #7382	2018-04-06 13:28:15 -07:00
Olaf Faaland	533ea0415b	Update mmp_delay on sync or skipped, failed write When an MMP write is skipped, or fails, and time since mts->mmp_last_write is already greater than mts->mmp_delay, increase mts->mmp_delay. The original code only updated mts->mmp_delay when a write succeeded, but this results in the write(s) after delays and failed write(s) reporting an ub_mmp_delay which is too low. Update mmp_last_write and mmp_delay if a txg sync was successful. At least one uberblock was written, thus extending the time we can be sure the pool will not be imported by another host. Do not allow mmp_delay to go below (MSEC2NSEC(zfs_multihost_interval) / vdev_count_leaves()) so that a period of frequent successful MMP writes, e.g. due to frequent txg syncs, does not result in an import activity check so short it is not reliable based on mmp thread writes alone. Remove unnecessary local variable, start. We do not use the start time of the loop iteration. Add a debug message in spa_activity_check() to allow verification of the import_delay value and to prove the activity check occurred. Alter the tests that import pools and attempt to detect an activity check. Calculate the expected duration of spa_activity_check() based on module parameters at the time the import is performed, rather than a fixed time set in mmp.cfg. The fixed time may be wrong. Also, use the default zfs_multihost_interval value so the activity check is longer and easier to recognize. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov> Signed-off-by: Olaf Faaland <faaland1@llnl.gov> Closes #7330	2018-04-04 16:38:44 -07:00
Tony Hutter	21a4f5cc86	Fedora 28: Fix misc bounds check compiler warnings Fix a bunch of (mostly) sprintf/snprintf truncation compiler warnings that show up on Fedora 28 (GCC 8.0.1). Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #7361 Closes #7368	2018-04-04 10:16:47 -07:00
LOLi	f119d00c1f	Fix add_nested_replacing_spare test case Use 'zpool reopen' instead of 'zpool scrub' to kick in the spare device: this is required to avoid spurious failures caused by a race condition in events processing by the ZFS Event Daemon: P1 (zpool scrub) P2 (zed) --- zfs_ioc_pool_scan() -> dsl_scan() -> vdev_reopen() -> vdev_set_state(VDEV_STATE_CANT_OPEN) zfs_ioc_vdev_attach() -> spa_vdev_attach() -> dsl_resilver_restart() -> dsl_sync_task() -> dsl_scan_setup_check() <- dsl_scan_setup_check(): EBUSY Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: loli10K <ezomori.nozomu@gmail.com> Closes #7247 Closes #7342	2018-04-03 17:30:14 -07:00
Andriy Gapon	5e00213e43	OpenZFS 9164 - assert: newds == os->os_dsl_dataset Authored by: Andriy Gapon <avg@FreeBSD.org> Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed by: Don Brady <don.brady@delphix.com> Reviewed-by: loli10K <ezomori.nozomu@gmail.com> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Approved by: Richard Lowe <richlowe@richlowe.net> Ported-by: Giuseppe Di Natale <dinatale2@llnl.gov> Porting Notes: * Re-enabled and tweaked the zpool_upgrade_007_pos test case to successfully run in under 5 minutes. OpenZFS-issue: https://www.illumos.org/issues/9164 OpenZFS-commit: https://github.com/openzfs/openzfs/commit/0e776dc06a Closes #6112 Closes #7336	2018-03-30 12:00:40 -07:00
Brian Behlendorf	b2ab468dde	Fix mmap / libaio deadlock Calling uiomove() in mappedread() under the page lock can result in a deadlock if the user space page needs to be faulted in. Resolve the issue by dropping the page lock before the uiomove(). The inode range lock protects against concurrent updates via zfs_read() and zfs_write(). Reviewed-by: Albert Lee <trisk@forkgnu.org> Reviewed-by: Chunwei Chen <david.chen@nutanix.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #7335 Closes #7339	2018-03-28 10:19:22 -07:00
DeHackEd	668173b576	Remove libattr requirement RHEL/CentOS 6 supports sys/xattr.h eliminating the need for libattr-devel as a dependency. Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: DHE <git@dehacked.net> Closes #7344 Closes #7351	2018-03-27 16:51:32 -07:00
Alek P	272b5d730f	Add JSON output support to channel programs The changes piggyback JSON output support on top of channel programs (#6558). This way the JSON output support is targeted to scripting use cases and is easily maintainable since it really only touches one function (zfs_do_channel_program()). This patch ports Joyent's JSON nvlist library from illumos to enable easy JSON printing of channel program output nvlist. To keep the delta small I also took advantage of the fact that printing in zfs_do_channel_program() was almost always done before exiting the program. Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alek Pinchuk <apinchuk@datto.com> Closes #7281	2018-03-19 12:40:58 -07:00
Stephen Blinick	8a2a9db8df	OpenZFS 9076 - Adjust perf test concurrency settings ZFS Performance test concurrency should be lowered for better latency Work by Stephen Blinick. Nightly performance runs typically consist of two levels of concurrency; and both are fairly high. Since the IO runs are to a ZFS filesystem, within a zpool, which is based on some variable number of vdev's, the amount of IO driven to each device is variable. Additionally, different device types (HDD vs SSD, etc) can generally handle a different amount of concurrent IO before saturating. Nevertheless, in practice, it appears that most tests are well past the concurrency saturation point and therefore both perform with the same throughput, the maximum of the device. Because the queuedepth to the device(s) is so high however, the latency is much higher than the best possible at that throughput, and increases linearly with the increase in concurrency. This means that changes in code that impact latency during normal operation (before saturation) may not be apparent when a large component of the measured latency is from the IO sitting in a queue to be serviced. Therefore, changing the concurrency settings is recommended Authored by: Stephen Blinick <stephen.blinick@delphix.com> Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: John Wren Kennedy <jwk404@gmail.com> Ported-by: John Wren Kennedy <jwk404@gmail.com> OpenZFS-issue: https://www.illumos.org/issues/9076 OpenZFS-commit: https://github.com/openzfs/openzfs/pull/562 Upstream bug: DLPX-45477 Closes #7302	2018-03-15 10:51:00 -07:00
Paul Zuchowski	8e5d14844d	zdb and inuse tests don't pass with real disks Due to zpool create auto-partitioning in Linux (i.e. sdb1), certain utilities need to use the partition (sdb1) while others use the whole disk name (sdb). Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Paul Zuchowski <pzuchowski@datto.com> Closes #6939 Closes #7261	2018-03-07 17:03:33 -08:00
Wolfgang Bumiller	0e85048f53	Take user namespaces into account in policy checks Change file related checks to use user namespaces and make sure involved uids/gids are mappable in the current namespace. Note that checks without file ownership information will still not take user namespaces into account, as some of these should be handled via 'zfs allow' (otherwise root in a user namespace could issue commands such as `zpool export`). This also adds an initial user namespace regression test for the setgid bit loss, with a user_ns_exec helper usable in further tests. Additionally, configure checks for the required user namespace related features are added for: * ns_capable * kuid/kgid_has_mapping() * user_ns in cred_t Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com> Closes #6800 Closes #7270	2018-03-07 15:40:42 -08:00
Brian Behlendorf	434a3375ce	ZTS: fix send-c_stream_size_estimate The test could fail when attempting to write to a newly created volume which was missing its device node. Resolve the issue by calling block_device_wait() which blocks until udev creates the needed entry. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #7276 Closes #7277	2018-03-07 09:55:54 -08:00
Giuseppe Di Natale	a07ad58847	Fix dbufstats_001_pos Implement a new helper within_tolerance to test if a value is within range of a target. Because the dbufstats and dbufs kstat file are being read at slightly different times, it is possible for stats to be slightly off. Use within_tolerance to determine if the value is "close enough" to the target. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov> Closes #7239 Closes #7266	2018-03-07 09:53:04 -08:00
Tony Hutter	639b18944a	Allow to limit zed's syslog chattiness Some usage patterns like send/recv of replication streams can produce a large number of events. In such a case, the current all-syslog.sh zedlet will live up to its name, and flood the logs with mostly redundant information. To mitigate this situation, this changeset introduces two new variables ZED_SYSLOG_SUBCLASS_INCLUDE and ZED_SYSLOG_SUBCLASS_EXCLUDE to zed.rc that give more control over which event classes end up in the syslog. Reviewed-by: loli10K <ezomori.nozomu@gmail.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Daniel Kobras <d.kobras@science-computing.de> Closes #6886 Closes #7260	2018-03-06 15:41:52 -08:00
Olaf Faaland	d2160d0538	Record skipped MMP writes in multihost_history Once per pass through the MMP thread's loop, the vdev tree is walked to find a suitable leaf to write the next MMP block to. If no such leaf is found, the thread sleeps for a while and resumes at the top of the loop. Add an entry to multihost_history when no leaf can be found, and record the reason in the error column. The error code for such entries is a bitfield, displayed in hex: 0x1 At least one vdev (interior or leaf) was not writeable. 0x2 At least one writeable leaf vdev was found, but it had a pending MMP write. timestamp = the time in seconds since the epoch when no leaf could be found originally. duration = the time (in ns) during which no MMP block was written for this reason. This does not include the preceding inter-write period nor the following inter-write period. vdev_guid = the number of sequential cycles of the MMP thread loop when this occurred. Sample output, truncated to fit: For records of skipped MMP writes the right-most column, vdev_path, is reported as "-". id txg timestamp error duration mmp_delay vdev_guid ... 936 11 1520036441 0 146264 891422313 1740883117838 ... 937 11 1520036441 0 163956 888356657 7320395061548 ... 938 11 1520036442 0 130690 885314969 7320395061548 ... 939 11 1520036442 0 2001068577 882296582 1740883117838 ... 940 11 1520036443 0 161806 882296582 7320395061548 ... 941 11 1520036443 0x2 0 998020546 1 ... 942 11 1520036444 0 136585 998020546 7320395061548 ... 943 11 1520036444 0x2 0 998020257 1 ... 944 11 1520036445 5 2002662964 994160219 1740883117838 ... 945 11 1520036445 0x2 998073118 994160219 3 ... 946 11 1520036447 0 247136 994160219 7320395061548 ... Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Olaf Faaland <faaland1@llnl.gov> Closes #7212	2018-03-06 15:15:15 -08:00
Giuseppe Di Natale	c7b55e71b0	Introduce a destroy_dataset helper Datasets can be busy when calling zfs destroy. Introduce a helper function to destroy datasets and use it to destroy datasets in zfs_allow_004_pos, zfs_promote_008_pos, and zfs_destroy_002_pos. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov> Closes #7224 Closes #7246 Closes #7249 Closes #7267	2018-03-06 14:54:57 -08:00
Tony Hutter	80d52c3919	Change checksum & IO delay ratelimit values Change checksum & IO delay ratelimit thresholds from 5/sec to 20/sec. This allows zed to actually trigger if a bunch of these events arrive in a short period of time (zed has a threshold of 10 events in 10 sec). Previously, if you had, say, 100 checksum errors in 1 sec, it would get ratelimited to 5/sec which wouldn't trigger zed to fault the drive. Also, convert the checksum and IO delay thresholds to module params for easy testing. Reviewed-by: loli10K <ezomori.nozomu@gmail.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #7252	2018-03-04 17:34:51 -08:00
chrisrd	d0f6fbaff3	ZTS: fix spurious failures in mv_files The test could fail because of a race condition between the files being generated in the background and attempting to move the files. Wait for all file generation to complete before trying to move the files around. Also, clean up the waiting: the 'wait' command without arguments waits for all child pids. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Chris Dunlop <chris@onthe.net.au> Closes #7220 Closes #7242 Closes #7258	2018-03-02 09:57:29 -08:00
John Wren Kennedy	e086e717c3	Add ZFS perf test for dbuf cache This change adds a test for sequential reads out of the dbuf cache. It's essentially a copy of sequential_reads_cached, using a smaller data set. The sequential read tests are renamed to differentiate them. Authored by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: Paul Dagnelie <pcd@delphix.com> Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: John Wren Kennedy <john.kennedy@delphix.com> Closes #7225	2018-02-28 10:38:37 -08:00
John Eismeier	d699aaef09	Fix some typos Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed by: George Melikov <mail@gmelikov.ru> Signed-off-by: John Eismeier <john.eismeier@gmail.com> Closes #7237	2018-02-28 08:57:10 -08:00
Scot W. Stevenson	19528cf949	Add Python 3 rewrite of arc_summary.py Add new script arc_summary3.py as a complete rewrite of the arc_summary.py tool (see issue #6873) Add new options: -g/--graph - Display crude graphic representation of ARC status and quit -r/--raw - Print all available information as minimally formatted list (for grep) -s/--section - Print a single section. This replaces -p/--page, which is kept for backwards use but marked as deprecated Add new sections with information on ZIL and SPL. Notify user if sections L2ARC and VDEV are skipped instead of failing silently. Add warning that -p/--page option is deprecated. Developed for Python 3.5. Reviewed-by: Richard Laager <rlaager@wiktel.com> Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com> Reviewed by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Scot W. Stevenson <scot.stevenson@gmail.com> Closes #6873 Closes #6892	2018-02-28 08:52:34 -08:00
LOLi	4af6873af6	Fix segfault in zfs_do_bookmark() When invoked with wrong parameters 'zfs bookmark' fails to gracefully validate user input and crashes. This is a regression accidentally introduced in 587e228; this commit adds additional tests to the ZFS Test Suite to exercise this codepath. Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: KireinaHoro <i@jsteward.moe> Signed-off-by: loli10K <ezomori.nozomu@gmail.com> Closes #7228 Closes #7229	2018-02-26 09:55:18 -08:00
Brian Behlendorf	2a0428f16b	ZTS: Fix zfs_share_* test case failures Prevent false positives when running the zfs_share_* test cases due to leftover stale /var/lib/nfs/etab entries. When starting the test group re-synchronize the /var/lib/nfs/etab file with /etc/exports. At this point in the testing there will be no additional `zfs share` entries to add. Reviewed by: George Melikov <mail@gmelikov.ru> Reviewed-by: loli10K <ezomori.nozomu@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #7226	2018-02-24 10:07:12 -08:00
Tony Hutter	bf95a000c4	Add scrub after resilver zed script * Add a zed script to kick off a scrub after a resilver. The script is disabled by default. * Add an optional $PATH (-P) option to zed to allow it to use a custom $PATH for its zedlets. This is needed when you're running zed under the ZTS in a local workspace. * Update test scripts to not copy in all-debug.sh and all-syslog.sh by default. They can be optionally copied in as part of zed_setup(). These scripts slow down zed considerably under heavy event loads and can cause events to be dropped or their delivery delayed. This was causing some sporadic failures in the 'fault' tests. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Richard Laager <rlaager@wiktel.com> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #4662 Closes #7086	2018-02-23 11:38:05 -08:00
LOLi	faa97c1619	Want 'zfs send -b' This change implements 'zfs send -b' which can be used to send only received property values whether or not they are overridden by local settings. This can be very useful during "restore" operations from a backup pool because it allows sending only the property values originally sent from the backup source, even though they were later modified on the destination either by a 'zfs set' operation, explicit 'zfs inherit' or overridden during the receive process via 'zfs receive -o|-x'. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: loli10K <ezomori.nozomu@gmail.com> Closes #7156	2018-02-21 12:32:06 -08:00
Tom Caputi	b0918402dc	Raw receive should change key atomically Currently, raw zfs sends transfer the encrypted master keys and objset_phys_t encryption parameters in the DRR_BEGIN payload of each send file. Both of these are processed as soon as they are read in dmu_recv_stream(), meaning that the new keys are set before the new snapshot is received. In addition to the fact that this changes the user's keys for the dataset earlier than they might expect, the keys were never reset to what they originally were in the event that the receive failed. This patch splits the processing into objset handling and key handling, the latter of which is moved to dmu_recv_end() so that the key change can be done atomically. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tom Caputi <tcaputi@datto.com> Closes #7200	2018-02-21 12:31:03 -08:00
Tom Caputi	b1d217338a	Raw receives must compress metadnode blocks Currently, the DMU relies on ZIO layer compression to free L0 dnode blocks that no longer have objects in them. However, raw receives disable all compression, meaning that these blocks can never be freed. In addition to the obvious space concerns, this could also cause incremental raw receives to fail to mount since the MAC of a hole is different from that of a completely zeroed block. This patch corrects this issue by adding a special case in zio_write_compress() which will attempt to compress these blocks to a hole even if ZIO_FLAG_RAW_ENCRYPT is set. This patch also removes the zfs_mdcomp_disable tunable, since tuning it could cause these same issues. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tom Caputi <tcaputi@datto.com> Closes #7198	2018-02-21 12:28:52 -08:00
Giuseppe Di Natale	f2c0dee23b	Correct count_uberblocks in mmp.kshlib A log_must call was causing count_uberblocks to return more than just the uberblock count. Remove the log_must since it was only logging a sleep. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Olaf Faaland <faaland1@llnl.gov> Reviewed-by: loli10K <ezomori.nozomu@gmail.com> Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov> Closes #7191	2018-02-20 16:28:52 -08:00
Nasf-Fan	9c5167d19f	Project Quota on ZFS Project quota is a new ZFS system space/object usage accounting and enforcement mechanism. Similar to user/group quota, project quota is another dimension of system quota. It is based on a new object attribute, the project ID. Project ID is a numerical value indicating to which project an object belongs. An object can belong to only one project, though you (the object owner or a privileged user) can change the object's project ID explicitly via 'chattr -p' or 'zfs project [-s] -p'. The object can also inherit the project ID from its parent when created if the parent has the project inherit flag (which can be set via 'chattr +P' or 'zfs project -s [-p]'). By accounting for the space/objects that belong to the same project, we can know how much space and how many objects the project uses. And if we set an upper limit, we can control the space/objects consumed by that project. This is useful when multiple groups and users cooperate on the same project, or when a user/group needs to participate in multiple projects. 
Support the following commands and functionalities: zfs set projectquota@project zfs set projectobjquota@project zfs get projectquota@project zfs get projectobjquota@project zfs get projectused@project zfs get projectobjused@project zfs projectspace zfs allow projectquota zfs allow projectobjquota zfs allow projectused zfs allow projectobjused zfs unallow projectquota zfs unallow projectobjquota zfs unallow projectused zfs unallow projectobjused chattr +/-P chattr -p project_id lsattr -p This patch also supports tree quota based on the project quota via "zfs project" commands, as follows: zfs project [-d|-r] <file|directory ...> zfs project -C [-k] [-r] <file|directory ...> zfs project -c [-0] [-d|-r] [-p id] <file|directory ...> zfs project [-p id] [-r] [-s] <file|directory ...> For the "df [-i] $DIR" command, if the INHERIT (project ID) flag is set on $DIR, then the project [obj]quota and [obj]used values for the $DIR's project ID will be shown as the total/free (avail) resource. This matches the behavior of EXT4/XFS. Reviewed-by: Andreas Dilger <andreas.dilger@intel.com> Reviewed-by Ned Bass <bass6@llnl.gov> Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Fan Yong <fan.yong@intel.com> TEST_ZIMPORT_POOLS="zol-0.6.1 zol-0.6.2 master" Change-Id: Ib4f0544602e03fb61fd46a849d7ba51a6005693c Closes #6290	2018-02-13 14:54:54 -08:00

314 Commits