mirror_zfs

mirror of https://git.proxmox.com/git/mirror_zfs.git synced 2026-04-14 23:44:53 +03:00

Author	SHA1	Message	Date
наб	93ef500388	Don't abuse vfork() According to POSIX.1, "vfork() has the same effect as fork(2), except that the behavior is undefined if the process created by vfork() either modifies any data other than a variable of type pid_t used to store the return value from vfork(), [...], or calls any other function before successfully calling _exit(2) or one of the exec(3) family of functions." These do all three, and work by pure chance (or maybe they don't, but we blisfully don't know). Either way: bad idea to call vfork() from C, unless you're the standard library, and POSIX.1-2008 removes it entirely Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12015	2021-05-21 10:16:06 -07:00
Brian Behlendorf	8fb577ae6d	Fix dRAID sequential resilver silent damage handling This change addresses two distinct scenarios which are possible when performing a sequential resilver to a dRAID pool with vdevs that contain silent unknown damage. Which in this circumstance took the form of the devices being intentionally overwritten with zeros. However, it could also result from a device returning incorrect data while a sequential resilver was in progress. Scenario 1) A sequential resilver is performed while all of the dRAID vdevs are ONLINE and there is silent damage present on the vdev being resilvered. In this case, nothing will be repaired by vdev_raidz_io_done_reconstruct_known_missing() because rc->rc_error isn't set on any of the raid columns. To address this vdev_draid_io_start_read() has been updated to always mark the resilvering column as ESTALE for sequential resilver IO. Scenario 2) Multiple columns contain silent damage for the same block and a sequential resilver is performed. In this case it's impossible to generate the correct data from parity unless all of the damaged columns are being sequentially resilvered (and thus only good data is used to generate parity). This is as expected and there's nothing which can be done about it. However, we need to be careful not to make to situation worse. Since we can't verify the data is actually good without a checksum, we must only repair the devices which are being sequentially resilvered. Otherwise, an incorrect repair to a device which previously contained good data could effectively lock in the damage and make reconstruction impossible. A check for this was added to vdev_raidz_io_done_verified() along with a new test case. Lastly, this change updates the redundancy_draid_spare1 and redundancy_draid_spare3 test cases to be more representative of normal dRAID replacement operation. Specifically, what we care about is that the scrub run after a sequential resilver does not find additional blocks which need repair. This would indicate the sequential resilver failed to rebuild a section of one of the devices. Note also the tests were switched to using the verify_pool() function which still checks for checksum errors. Reviewed-by: Mark Maybee <mark.maybee@delphix.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #12061	2021-05-20 15:05:26 -07:00
наб	6fc3099248	Trim excess shellcheck annotations. Widen to all non-Korn scripts Before, make shellcheck checked scripts/{commitcheck,make_gitrev,man-dates,paxcheck,zfs-helpers,zfs, zfs-tests,zimport,zloop}.sh cmd/zed/zed.d/{{all-debug,all-syslog,data-notify,generic-notify, resilver_finish-start-scrub,scrub_finish-notify, statechange-led,statechange-notify,trim_finish-notify, zed-functions}.sh,history_event-zfs-list-cacher.sh.in} cmd/zpool/zpool.d/{dm-deps,iostat,lsblk,media,ses,smart,upath} now it also checks contrib/dracut/{02zfsexpandknowledge/module-setup, 90zfs/{export-zfs,parse-zfs,zfs-needshutdown, zfs-load-key,zfs-lib,module-setup, mount-zfs,zfs-generator}}.sh.in cmd/zed/zed.d/{pool_import-led,vdev_attach-led, resilver_finish-notify,vdev_clear-led}.sh contrib/initramfs/{zfsunlock,hooks/zfs.in,scripts/local-top/zfs} tests/zfs-tests/tests/perf/scripts/prefetch_io.sh scripts/common.sh.in contrib/bpftrace/zfs-trace.sh autogen.sh Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12042	2021-05-20 08:55:23 -07:00
Brian Behlendorf	6a13add559	ZTS: Increase redundancy test timeout The redundancy_draid.ksh and redundancy_raidz.ksh tests were updated by commit `93c8e91fe` to additionally verify self-healing. This additional check increased the run time which can now occasionally exceed the default maximum timeout in the CI environment. To prevent this from causing failures increase the default timeout for the redundancy test cases. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #12043	2021-05-14 09:11:56 -07:00
Brian Behlendorf	6217656da3	Revert "Fix raw sends on encrypted datasets when copying back snapshots" Commit `d1d4769` takes into account the encryption key version to decide if the local_mac could be zeroed out. However, this could lead to failure mounting encrypted datasets created with intermediate versions of ZFS encryption available in master between major releases. In order to prevent this situation revert `d1d4769` pending a more comprehensive fix which addresses the mount failure case. Reviewed-by: George Amanakis <gamanakis@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #11294 Issue #12025 Issue #12300 Closes #12033	2021-05-13 10:00:17 -07:00
наб	37086897b0	libzfs: add keylocation=https://, backed by fetch(3) or libcurl Add support for http and https to the keylocation properly to allow encryption keys to be fetched from the specified URL. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@ixsystems.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Issue #9543 Closes #9947 Closes #11956	2021-05-12 21:21:35 -07:00
Brian Behlendorf	7d07d1be39	ZTS: Add known exceptions The following seven tests been observed to occasionally fail during CI testing. This commit adds them to the list of known somewhat flaky test cases. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #12023	2021-05-11 19:55:12 -07:00
Brian Behlendorf	93c8e91fe7	Fix dRAID self-healing short columns When dRAID performs a normal read operation only the data columns in the raid map are read from disk. This is enough information to calculate the checksum, verify it, and return the needed data to the application. It's only in the event of a checksum failure that the additional parity and any empty columns must be read since they are required for parity reconstruction. Reading these additional columns is handled by vdev_raidz_read_all() which calls vdev_draid_map_alloc_empty() to expand the raid_map_t and submit IOs for the missing columns. This all works correctly, but it fails to account for any "short" columns. These are data columns which are padded with a empty skip sector at the end. Since that empty sector is not needed for a normal read it's not read when columns is first read from disk. However, like the parity and empty columns the skip sector is needed to perform reconstruction. The fix is to mark any "short" columns as never being read by clearing the rc_tried flag when expanding the raid_map_t. This will cause the entire column to re-read from disk in the event of a checksum failure allowing the self-healing functionality to repair the block. Note that this only effects the self-healing feature because when scrubbing a pool the parity, data, and empty columns are all read initially to verify their contents. Furthermore, only blocks which contain "short" columns would be effected, and only when the memory backing the skip sector wasn't already zeroed out. This change extends the existing redundancy_raidz.ksh test case to verify self-healing (as well as resilver and scrub). Then applies the same test case to dRAID with a slightly modified version of the test script called redundancy_draid.ksh. The unused variable combrec was also removed from both test cases. Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Mark Maybee <mark.maybee@delphix.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #12010	2021-05-08 08:57:25 -07:00
наб	1966e959ca	Replace ZoL with OpenZFS where applicable Afterward, git grep ZoL matches: * README.md: * [ZoL Site](https://zfsonlinux.org) - Correct * etc/default/zfs.in:# ZoL userland configuration. - Changing this would induce a needless upgrade-check, if the user has modified the configuration; this can be updated the next time the defaults change * module/zfs/dmu_send.c: * ZoL < 0.7 does not handle [...] - Before 0.7 is ZoL, so fair enough Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Issue #11956	2021-05-07 17:20:37 -07:00
Ryan Moeller	ccb46cab50	ZTS: Fix xattr_002_neg passing too soon Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #11970	2021-04-30 07:37:02 -07:00
наб	6f4e132fec	ZTS: cli_root/zfs_load-key: add separate key files Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Issue: #11956 Closes #11976	2021-04-30 07:31:22 -07:00
Brian Behlendorf	b1f7341203	ZTS: Add known exceptions Both the zpool_initialize_import_export and checkpoint_discard_busy test cases a known to occasionally fail. Add them to the list of known possible failures and reference the appropriate issue on the tracker. Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #11949	2021-04-27 08:27:03 -07:00
Prawn	b0269cd8ce	receive: don't fail inheriting (-x) properties on wrong dataset type Receiving datasets while blanket inheriting properties like zfs receive -x mountpoint can generally be desirable, e.g. to avoid unexpected mounts on backup hosts. Currently this will fail to receive zvols due to the mountpoint property being applicable to filesystems only. This limitation currently requires operators to special-case their minds and tools for zvols. This change gets rid of this limitation for inherit (-x) by Spiting up the dataset type handling: Warnings for inheriting (-x), errors for overriding (-o). Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: InsanePrawn <insane.prawny@gmail.com> Closes #11416 Closes #11840 Closes #11864	2021-04-26 17:23:51 -07:00
Brian Behlendorf	50d9ff93df	ZTS: Improve redundancy test scripts - Add additional logging to provide more information about why the test failed. This including logging more of the individual commands and the contents and differences of the record files on failure. - Updated get_vdevs() to properly exclude all top-level vdevs including raidz3 and draid[1-3]. - Replaced gnudd with dd. This is the only remaining place in the test suite gnudd is used and it shouldn't be needed. - The refill_test_env function expects the pool as the first argument but never sets the pool variable. - Only fill the test pools to 50% of capacity instead of 75% to help speed up the tests. - Fix replace_missing_devs() calculation, MINDEVSIZE should be MINVDEVSIZE. - Fix damage_devs() so it overwrites almost all of the device so we're guaranteed to damage filesystem blocks. - redundancy_stripe.ksh should not use log_mustnot to check if the pool is healthy since the return value may be misinterpreted. Just perform a normal conditional check and log the failure. Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #11906	2021-04-18 21:58:36 -07:00
наб	86418090d7	ZTS: add zed_fd_spill to verify the fds ZEDLETs inherit Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11891	2021-04-15 13:46:05 -07:00
Brian Behlendorf	888700bc6b	ZTS: fix removal_condense_export test case It's been observed in the CI that the required 25% of obsolete bytes in the mapping can be to high a threshold for this test resulting in condensing never being triggered and a test failure. To prevent these failures make the existing zfs_condense_indirect_obsolete_pct tuning available so the obsolete percentage can be reduced from 25% to 5% during this test. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #11869	2021-04-11 21:49:13 -07:00
Brian Behlendorf	ea3cd8e420	ZTS: Add known exceptions The fault/auto_spare_shared, l2arc/persist_l2arc_007_pos, and alloc_class/alloc_class_013_pos test cases are not entirely reliable and may occasionally fail resulting in a false positive in the CI. Add these tests to known list of possible failures until they can be made 100% reliable. Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #11890	2021-04-11 15:55:38 -07:00
pablofsf	099fa7e475	Allow zfs to send replication streams with missing snapshots A tentative implementation and discussion was done in #5285. According to it a send --skip-missing\|-s flag has been added. In a replication stream, when there are snapshots missing in the hierarchy, if -s is provided print a warning and ignore dataset (and its children) instead of throwing an error Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Pablo Correa Gómez <ablocorrea@hotmail.com> Closes #11710	2021-04-11 12:05:35 -07:00
Ryan Moeller	5d508d92d2	ZTS: Improve cleanup in removal_with_export Kill the removal operation on every platform, not just Linux. The test has been fixed and is now stable on FreeBSD. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Igor Kozhukhov <igor@dilos.org> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #11856	2021-04-08 21:10:28 -07:00
Ryan Moeller	383401589e	ZTS: Tests using zhack may fail on FreeBSD As described in #11854, zhack is occasionally segfaulting on FreeBSD. Debugging this is proving to be tricky. To avoid false positives in the CI add entries for the tests that use zhack in zts-report to accept that they may occasionally fail on FreeBSD. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Issue #11854 Closes #11855	2021-04-08 13:21:53 -07:00
Ryan Moeller	e778b0485b	Ratelimit deadman zevents as with delay zevents Just as delay zevents can flood the zevent pipe when a vdev becomes unresponsive, so do the deadman zevents. Ratelimit deadman zevents according to the same tunable as for delay zevents. Enable deadman tests on FreeBSD and add a test for deadman event ratelimiting. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Don Brady <don.brady@delphix.com> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #11786	2021-04-07 16:23:57 -07:00
matt-fidd	a03b288cf0	zfs get -p only outputs 3 columns if "clones" property is empty get_clones_string currently returns an empty string for filesystem snapshots which have no clones. This breaks parsable `zfs get` output as only three columns are output, instead of 4. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Matt Fiddaman <github@m.fiddaman.uk> Co-authored-by: matt <matt@fiddaman.net> Closes #11837	2021-04-06 16:05:54 -07:00
Brian Behlendorf	ec580225d2	ZTS: pool_checkpoint improvements The pool_checkpoint tests may incorrectly fail because several of them invoke zdb for an imported pool. In this scenario it's not unexpected for zdb to fail if the pool is modified. To resolve this these zdb checks are now done after the pool has been exported. Additionally, the default cleanup functions assumed the pool would be imported when they were run. If this was not the case they're exit early and fail to cleanup all of the test state causing subsequent tests to fail. Add a check to only destroy the pool when it is imported. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #11832	2021-04-03 08:33:22 -07:00
Andrea Gelmini	bf169e9f15	Fix various typos Correct an assortment of typos throughout the code base. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net> Closes #11774	2021-04-02 18:52:15 -07:00
наб	73218f41b4	zed: allow limiting concurrent jobs 200ms time-out is relatively long, but if we already hit the cap, then we'll likely be able to spawn multiple new jobs when we wake up Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11807	2021-04-02 16:30:53 -07:00
Ryan Moeller	583e320546	ZTS: inheritance/inherit_001_pos is flaky Add inheritance/inherit_001_pos to the maybe fails on FreeBSD list. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #11830	2021-04-02 11:11:52 -07:00
Ryan Moeller	c05eec32a7	Allow pool names that look like Solaris disk names Nothing bad happens if a prefix of your pool name matches a disk name. This is a bit of a silly restriction at this point. Reviewed-by: Richard Laager <rlaager@wiktel.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org> Closes #11781 Closes #11813	2021-04-01 08:49:41 -07:00
Andrew	66e6d3f128	Fix regression in POSIX mode behavior Commit `235a85657` introduced a regression in evaluation of POSIX modes that require group DENY entries in the internal ZFS ACL. An example of such a POSX mode is 007. When write_implies_delete_child is set, then ACE_WRITE_DATA is added to `wanted_dirperms` in prior to calling zfs_zaccess_common(). This occurs is zfs_zaccess_delete(). Unfortunately, when zfs_zaccess_aces_check hits this particular DENY ACE, zfs_groupmember() is checked to determine whether access should be denied, and since zfs_groupmember() always returns B_TRUE on Linux and so this check is failed, resulting ultimately in EPERM being returned. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Andrew Walker <awalker@ixsystems.com> Closes #11760	2021-03-19 22:50:46 -07:00
Palash Gandhi	c23850759f	ZTS: New test for kernel panic induced by redacted send This change adds a new test that covers a bug fix in the binary search in the redacted send resume logic that causes a kernel panic. The bug was fixed in https://github.com/openzfs/zfs/pull/11297. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Co-authored-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: Palash Gandhi <palash.gandhi@delphix.com> Closes #11764	2021-03-19 22:47:50 -07:00
Ryan Moeller	5638803b6a	ZTS: Add tests for DOS mode attributes Create a new section of tests to run with acltype=off. For now the only test we have is for the DOS mode READONLY attribute on FreeBSD. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #11734	2021-03-16 15:00:14 -07:00
Ryan Moeller	9305ff2edf	ZTS: Fix incorrect use of libtest in user_run by xattr_003_neg You can't use user_run to eval ksh functions defined in libtest unless you include libtest in the user shell. Fix xattr_003_neg by: * include libtest in the user shell * then run get_xattr * assert this fails * use variables for filenames so they don't change in the user's shell * don't log the contents of /etc/passwd * cleanup all byproducts Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #11185	2021-03-12 16:17:30 -08:00
Ryan Moeller	e0b53a5dbb	ZTS: Use ksh and current environment for user_run The current user_run often does not work as expected. Commands are run in a different shell, with a different environment, and all output is discarded. Simplify user_run to retain the current environment, eliminate eval, and feed the command string into ksh. Enhance the logging for user_run so we can see out and err. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #11185	2021-03-12 16:17:01 -08:00
George Wilson	0936981d86	zpool import cachefile improvements Importing a pool using the cachefile is ideal to reduce the time required to import a pool. However, if the devices associated with a pool in the cachefile have changed, then the import would fail. This can easily be corrected by doing a normal import which would then read the pool configuration from the labels. The goal of this change is make importing using a cachefile more resilient and auto-correcting. This is accomplished by having the cachefile import logic automatically fallback to reading the labels of the devices similar to a normal import. The main difference between the fallback logic and a normal import is that the cachefile import logic will only look at the device directories that were originally used when the cachefile was populated. Additionally, the fallback logic will always import by guid to ensure that only the pools in the cachefile would be imported. External-issue: DLPX-71980 Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: George Wilson <gwilson@delphix.com> Closes #11716	2021-03-12 15:42:27 -08:00
Ryan Moeller	35aa9dc6df	FreeBSD: Fix scope of deadman tunables A few deadman tunables ended up in the wrong sysctl node. Move them to vfs.zfs.deadman.* Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #11715	2021-03-11 19:23:24 -08:00
Antonio Russo	b2eebe3ae7	ZTS events_002: Improve speed and reliability events_002 exercises the ZED, ensuring that it neither misses events, nor reporting events twice. On slow test hardware, some of the timeouts are insufficient to allow the ZED to properly settle. Conversely, on fast hardware these same timeouts are too long, unnecessarily slowing the test run. Instead of using a fixed timeout, wait for the expected final event before returning. Additionally, wait with a timeout for unexpected events to avoid missing them if they show up late. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Antonio Russo <aerusso@aerusso.net> Closes #11703	2021-03-08 08:42:45 -08:00
Ryan Moeller	b30cd70599	ZTS: Improve cleanup in zpool tests * Restore original kern.corefile value after the test. * Don't leave behind a frozen pool. * Clean up leftover vdev files. * Make zpool_002_pos and zpool_003_pos consistent in their handling of core files while here. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #11694	2021-03-07 09:41:01 -08:00
nssrikanth	bedbc13daa	Cancel TRIM / initialize on FAULTED non-writeable vdevs When a device which is actively trimming or initializing becomes FAULTED, and therefore no longer writable, cancel the active TRIM or initialization. When the device is merely taken offline with `zpool offline` then stop the operation but do not cancel it. When the device is brought back online the operation will be resumed if possible. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Co-authored-by: Brian Behlendorf <behlendorf1@llnl.gov> Co-authored-by: Vipin Kumar Verma <vipin.verma@hpe.com> Signed-off-by: Srikanth N S <srikanth.nagasubbaraoseetharaman@hpe.com> Closes #11588	2021-03-02 10:27:27 -08:00
Brian Behlendorf	3e73ea0c10	ZTS: zpool_trim_start_and_cancel_pos.ksh Several of the TRIM tests were based of the initialize tests and then adapted for TRIM. The zpool_trim_start_and_cancel_pos.ksh test was intended to be one such test but it was overlooked and actually never adapted. Update it accordingly. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #11649	2021-02-27 17:19:50 -08:00
Cedric Maunoury	b9c07ec71b	send_iterate_snap : doall send without fromsnap The behavior of a NULL fromsnap was inadvertently changed for a doall send when the send/recv logic in libzfs was updated. Restore the previous behavior by correcting send_iterate_snap() to include all the snapshots in the nvlist for this case. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Cedric Maunoury <cedric.maunoury@gmail.com> Closes #11608	2021-02-24 09:48:58 -08:00
Don Brady	03e02e5b56	Checksum errors may not be counted Fix regression seen in issue #11545 where checksum errors where not being counted or showing up in a zpool event. Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Don Brady <don.brady@delphix.com> Closes #11609	2021-02-19 22:33:15 -08:00
Colm	658fb8020f	Add "compatibility" property for zpool feature sets Property to allow sets of features to be specified; for compatibility with specific versions / releases / external systems. Influences the behavior of 'zpool upgrade' and 'zpool create'. Initial man page changes and test cases included. Brief synopsis: zpool create -o compatibility=off\|legacy\|file[,file...] pool vdev... compatibility = off : disable compatibility mode (enable all features) compatibility = legacy : request that no features be enabled compatibility = file[,file...] : read features from specified files. Only features present in all files will be enabled on the resulting pool. Filenames may be absolute, or relative to /etc/zfs/compatibility.d or /usr/share/zfs/compatibility.d (/etc checked first). Only affects zpool create, zpool upgrade and zpool status. ABI changes in libzfs: * New function "zpool_load_compat" to load and parse compat sets. * Add "zpool_compat_status_t" typedef for compatibility parse status. * Add ZPOOL_PROP_COMPATIBILITY to the pool properties enum * Add ZPOOL_STATUS_COMPATIBILITY_ERR to the pool status enum An initial set of base compatibility sets are included in cmd/zpool/compatibility.d, and the Makefile for cmd/zpool is modified to install these in $pkgdatadir/compatibility.d and to create symbolic links to a reasonable set of aliases. Reviewed-by: ericloewe Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Richard Laager <rlaager@wiktel.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Colm Buckley <colm@tuatha.org> Closes #11468	2021-02-17 21:30:45 -08:00
José Luis Salvador Rufo	aef1830f93	Support uClibc for the tests compilations There are two issues that don't allow ZFS to be compiled using uClibc. `backtrace()`, and `program_invocation_short_name` as a `const`. This patch adds uClibc to the conditionals in the same way there are already for Glibc for `backtrace()`; and removes the external param `program_invocation_short_name` because its only used here for the whole project. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: José Luis Salvador Rufo <salvador.joseluis@gmail.com> Closes #11600	2021-02-16 21:51:46 -08:00
George Melikov	9eee7fce3b	zts-report.py: ignore some skipped tests in Github CI Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: George Melikov <mail@gmelikov.ru> Closes #11554	2021-02-02 09:37:23 -08:00
George Melikov	9f8c7e6a76	ZTS: add userspace_send_encrypted.ksh to Makefile All tests need to be included in the Makefiles. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: George Melikov <mail@gmelikov.ru> Closes #11541	2021-01-28 13:39:38 -08:00
Allan Jude	393e69241e	Add zdb -r <dataset> <object-id \| file> <output> While you can use zdb -R poolname vdev:offset:[<lsize>/]<psize>[:flags] to extract individual DVAs from a vdev, it would be handy for be able copy an entire file out of the pool. Given a file or object number, add support to copy the contents to a file. Useful for debugging and recovery. Reviewed-by: Jorgen Lundman <lundman@lundman.net> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Allan Jude <allan@klarasystems.com> Closes #11027	2021-01-27 21:36:01 -08:00
George Melikov	b8e6401b79	ZTS: pool_state test check for pool existence in cleanup If there is no scsi_debug module, then this test must be skipped, in this case cleanup routine should be prepared for absent pool. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: George Melikov <mail@gmelikov.ru> Closes #11534	2021-01-27 17:33:30 -08:00
Matthew Ahrens	62d4287f27	RAIDZ2/3 fails to heal silently corrupted parity w/2+ bad disks When scrubbing, (non-sequential) resilvering, or correcting a checksum error using RAIDZ parity, ZFS should heal any incorrect RAIDZ parity by overwriting it. For example, if P disks are silently corrupted (P being the number of failures tolerated; e.g. RAIDZ2 has P=2), `zpool scrub` should detect and heal all the bad state on these disks, including parity. This way if there is a subsequent failure we are fully protected. With RAIDZ2 or RAIDZ3, a block can have silent damage to a parity sector, and also damage (silent or known) to a data sector. In this case the parity should be healed but it is not. The problem can be noticed by scrubbing the pool twice. Assuming there was no damage concurrent with the scrubs, the first scrub should fix all silent damage, and the second scrub should be "clean" (`zpool status` should not report checksum errors on any disks). If the bug is encountered, then the second scrub will repair the silently-damaged parity that the first scrub failed to repair, and these checksum errors will be reported after the second scrub. Since the first scrub repaired all the damaged data, the bug can not be encountered during the second scrub, so subsequent scrubs (more than two) are not necessary. The root cause of the problem is some code that was inadvertently added to `raidz_parity_verify()` by the DRAID changes. The incorrect code causes the parity healing to be aborted if there is damaged data (`rc_error != 0`) or the data disk is not present (`!rc_tried`). These checks are not necessary, because we only call `raidz_parity_verify()` if we have the correct data (which may have been reconstructed using parity, and which was verified by the checksum). This commit fixes the problem by removing the incorrect checks in `raidz_parity_verify()`. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Matthew Ahrens <mahrens@delphix.com> Closes #11489 Closes #11510	2021-01-26 16:05:05 -08:00
Will Andrews	d7265b3309	ZTS: zpool_export test improvements - refactor cleanup routines into common kshlib zpool_export_cleanup func - don't require physical disks to test, just use files Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Will Andrews <will@firepipe.net> Closes #11518	2021-01-26 13:14:04 -08:00
Will Andrews	35ac0ed1fd	ZTS: improve output clarity of check_prop_source Instead of just failing, indicate the expected and actual value and source as a NOTE. Tests using this failed in an earlier version of the changeset and this information helped find the cause. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Will Andrews <will@firepipe.net> Closes #11517	2021-01-25 14:39:58 -08:00
Will Andrews	a57acbb627	ZTS: remove duplicate check_prop_source from zfs_receive There is an identical definition in zfs_set_common.kshlib already. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Will Andrews <will@firepipe.net> Closes #11516	2021-01-25 14:38:19 -08:00
Will Andrews	de1a6cc387	logapi: cat output file instead of printing This avoids globbing together multiple lines in the log, if you happen to specify LOGAPI_DEBUG because you want to see it. Signed-off-by: Will Andrews <will@firepipe.net> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #11515	2021-01-25 14:33:57 -08:00
Ryan Moeller	a4acc47e4b	ZTS: Use swapctl to list swap devices on FreeBSD Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #11503	2021-01-24 15:56:59 -08:00
Matthew Macy	716408f560	Add basic io_uring test Provide a basic test coverage for io_uring I/O. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Matt Macy <mmacy@FreeBSD.org> Closes #11497	2021-01-23 15:42:42 -08:00
Brian Atkinson	d0cd9a5cc6	Extending FreeBSD UIO Struct In FreeBSD the struct uio was just a typedef to uio_t. In order to extend this struct, outside of the definition for the struct uio, the struct uio has been embedded inside of a uio_t struct. Also renamed all the uio_* interfaces to be zfs_uio_* to make it clear this is a ZFS interface. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Jorgen Lundman <lundman@lundman.net> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Brian Atkinson <batkinson@lanl.gov> Closes #11438	2021-01-20 21:27:30 -08:00
sterlingjensen	03f036cbcc	Re-apply path sanitizer, as mount(8) still mangles it Prior to util-linux 2.36.2, if a file or directory in the current working directory was named 'dataset' then mount(8) would prepend the current working directory to the dataset. Eventually, we should be able to drop this workaround. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Sterling Jensen <sterlingjensen@users.noreply.github.com> Closes #11295 Closes #11462	2021-01-19 11:57:31 -08:00
Antonio Russo	f8c4d63a26	ZTS: avoid piping to special devices As described in #11445, the kernel interface kernel_{read,write} no longer act on special devices. In the ZTS, zfs send and receive are tested by piping to these devices, leading to spurious failures (for positive tests) and may mask errors (for negative tests). Until a more permanent mechanism to address this deficiency is developed, clean up the output from the ZTS by avoiding directly piping to or from /dev/null and /dev/zero. For /dev/zero input, simply use a pipe: `cat </dev/zero \|` . However, for /dev/null output, the shell semantics for pipe failures means that zfs send error codes will be masked by the successful `\| cat >/dev/null` command execution. In that case, use a temporary file under $TEST_BASE_DIR for output in favor. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Attila Fülöp <attila@fueloep.org> Signed-off-by: Antonio Russo <aerusso@aerusso.net> Closes #11478	2021-01-19 11:53:35 -08:00
Antonio Russo	8752f7e320	ZTS: avoid race to unmount in zfs_rollback_001 The zfs_rollback_001 test modifies files in a temporary, test dataset repeatedly. Before each iteration, any preexisting dataset is removed, after unmounted with umount -f, if necessary. Add a short delay after the forced unmount, avoiding a race that can prevent zfs destroy from succeeding, leading to a test failure. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Antonio Russo <aerusso@aerusso.net> Closes #11451	2021-01-12 17:20:02 -08:00
Toomas Soome	4ba8c6b584	zfs_mount_all_mountpoints: cleanup_all should leave pool root mounted if pool root is not mounted, then zpool umount in next test will leave dataset mountpoint directory around and next zfs mount -a will fail with error: cannot mount '/testpool': directory is not empty Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Toomas Soome <tsoome@me.com> Closes #11417	2021-01-02 16:54:53 -08:00
Toomas Soome	40ab927ae8	implicit conversion from 'boolean_t' to 'ds_hold_flags_t' Build error on illumos with gcc 10 did reveal: In function 'dmu_objset_refresh_ownership': ../../common/fs/zfs/dmu_objset.c:857:25: error: implicit conversion from 'boolean_t' to 'ds_hold_flags_t' {aka 'enum ds_hold_flags'} [-Werror=enum-conversion] 857 \| dsl_dataset_disown(ds, decrypt, tag); \| ^~~~~~~ cc1: all warnings being treated as errors libzfs_input_check.c: In function 'zfs_ioc_input_tests': libzfs_input_check.c:754:28: error: implicit conversion from 'enum dmu_objset_type' to 'enum lzc_dataset_type' [-Werror=enum-conversion] 754 \| err = lzc_create(dataset, DMU_OST_ZFS, NULL, NULL, 0); \| ^~~~~~~~~~~ cc1: all warnings being treated as errors The same issue is present in openzfs, and also the same issue about ds_hold_flags_t, which currently defines exactly one valid value. Reviewed-by: Igor Kozhukhov <igor@dilos.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Toomas Soome <tsoome@me.com> Closes #11406	2020-12-27 16:31:02 -08:00
Brian Behlendorf	2844ad60d4	ZTS: Simplify zpool_initialize_verify_initialized Consider the test to be a success as long as the initializing pattern is found at least once per metaslab. This indicates that at least part of the free space was initialized. Ideally we'd check that the pattern was written to all free space but that's much trickier so this check is a reasonable compromise. Using a here-string to feed the loop in this test causes an empty string to still trigger the loop so we miss the `spacemaps=0` case. Pipe into the loop instead. While here, we can use `zpool wait -t initialize $TESTPOOL` to wait for the pool to initialize. Co-authored-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #11365	2020-12-18 08:42:59 -08:00
Matthew Ahrens	71e4ce0e52	special device removal space accounting fixes The space in special devices is not included in spa_dspace (or dsl_pool_adjustedsize(), or the zfs `available` property). Therefore there is always at least as much free space in the normal class, as there is allocated in the special class(es). And therefore, there is always enough free space to remove a special device. However, the checks for free space when removing special devices did not take this into account. This commit corrects that. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Don Brady <don.brady@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Matthew Ahrens <mahrens@delphix.com> Closes #11329	2020-12-17 12:11:56 -08:00
sterlingjensen	fb188409f1	Use the correct return type for getopt Use the correct return type for getopt otherwise clang complains about tautological-constant-out-of-range-compare. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Sterling Jensen <sterlingjensen@users.noreply.github.com> Closes #11359	2020-12-17 10:19:30 -08:00
George Amanakis	c76a40bfda	Fix reporting of CKSUM errors in indirect vdevs When removing and subsequently reattaching a vdev, CKSUM errors may occur as vdev_indirect_read_all() reads from all children of a mirror in case of a resilver. Fix this by checking whether a child is missing the data and setting a flag (ic_error) which is then checked in vdev_indirect_repair() and suppresses incrementing the checksum counter. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: George Amanakis <gamanakis@gmail.com> Closes #11277	2020-12-11 12:15:37 -08:00
Attila Fülöp	b9916b4064	ZTS: three small follow up fixes for #11167 Follow up fix for `0cb40fa3`. Remove unused variables, don't source unused libs and add missed cleanup. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Attila Fülöp <attila@fueloep.org> Closes #11311	2020-12-09 21:27:12 -08:00
sterlingjensen	1e4667af32	Drop path prefix workaround Canonicalization, the source of the trouble, was disabled in `9000a9f`. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Sterling Jensen <sterlingjensen@users.noreply.github.com> Closes #11295	2020-12-09 21:24:26 -08:00
George Melikov	1a735e763a	CI: add new zfs-tests-sanity workflow Run zfs-tests with sanity.run for brief results. Timeouts are rare, so minimize false positives by increasing the default from 60 to 180 seconds. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: George Melikov <mail@gmelikov.ru> Closes #11304	2020-12-08 09:53:45 -08:00
George Melikov	8e8fdce682	ZTS: zpool_trim tests throttle trim process Otherwise trim may finish before progress checks. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: George Melikov <mail@gmelikov.ru> Closes #11296	2020-12-07 10:06:10 -08:00
Brian Behlendorf	0484e8722f	ZTS: adjust zpool_import_012_pos timeout When running in the CI the zpool_import_012_pos test case occasionally takes longer than the maximum 600 seconds. When this happens the test case is considered to have failed but always completes a few minutes latter. Since the logs suggest nothing has actually failed this commit increases timeout and removes the exception. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #11286	2020-12-06 09:48:36 -08:00
Brian Behlendorf	81638c999d	ZTS: Update zfs_share_concurrent_shares.ksh Occasionally an out of memory error is hit by this test case when mounting the filesystems. Try and reduce the likelihood of this occurring by reducing the thread count from 100 to 50. It also has the advantage of slightly speeding up the test. cannot mount 'testpool/testfs3/79': Cannot allocate memory filesystem successfully created, but not mounted Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #11283	2020-12-06 09:47:33 -08:00
George Amanakis	d1d47691c2	Fix raw sends on encrypted datasets when copying back snapshots When sending raw encrypted datasets the user space accounting is present when it's not expected to be. This leads to the subsequent mount failure due a checksum error when verifying the local mac. Fix this by clearing the OBJSET_FLAG_USERACCOUNTING_COMPLETE and reset the local mac. This allows the user accounting to be correctly updated on first mount using the normal upgrade process. Reviewed-By: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-By: Tom Caputi <caputit1@tcnj.edu> Signed-off-by: George Amanakis <gamanakis@gmail.com> Closes #10523 Closes #11221	2020-12-04 14:34:29 -08:00
Attila Fülöp	0cb40fa389	zpool: Dryrun fails to list some devices `zpool create -n` fails to list cache and spare vdevs. `zpool add -n` fails to list spare devices. `zpool split -n` fails to list `special` and `dedup` labels. `zpool add -n` and `zpool split -n` shouldn't list hole devices. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Attila Fülöp <attila@fueloep.org> Closes #11122 Closes #11167	2020-12-04 14:04:39 -08:00
Ryan Moeller	4b6e2a5a33	Add -u option to 'zfs create' Add -u option to 'zfs create' that prevents file system from being automatically mounted. This is similar to the 'zfs receive -u'. Authored by: pjd <pjd@FreeBSD.org> FreeBSD-commit: freebsd/freebsd@35c58230e2 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Allan Jude <allan@klarasystems.com> Ported-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #11254	2020-12-04 14:01:42 -08:00
Brian Behlendorf	8f158ae6ad	Add sanity.run file This run file contains a subset of functional tests which exercise as much functionality as possible while still executing relatively quickly. The included tests should take no more than a few seconds each to run at most. This provides a convenient way to sanity test a change before committing to a full test run which takes several hours. $ ./scripts/zfs-tests.sh -r sanity ... Results Summary PASS 813 Running Time: 00:14:42 Percent passed: 100.0% Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #11271	2020-12-03 10:49:39 -08:00
loli10K	4072f465bc	Fix 'zfs userspace' for received datasets in encrypted root For encrypted receives, where user accounting is initially disabled on creation, both 'zfs userspace' and 'zfs groupspace' fails with EOPNOTSUPP: this is because dmu_objset_id_quota_upgrade_cb() forgets to set OBJSET_FLAG_USERACCOUNTING_COMPLETE on the objset flags after a successful dmu_objset_space_upgrade(). Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Co-authored-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: loli10K <ezomori.nozomu@gmail.com> Closes #9501 Closes #9596	2020-11-16 09:10:29 -08:00
Brian Behlendorf	b2255edcc0	Distributed Spare (dRAID) Feature This patch adds a new top-level vdev type called dRAID, which stands for Distributed parity RAID. This pool configuration allows all dRAID vdevs to participate when rebuilding to a distributed hot spare device. This can substantially reduce the total time required to restore full parity to pool with a failed device. A dRAID pool can be created using the new top-level `draid` type. Like `raidz`, the desired redundancy is specified after the type: `draid[1,2,3]`. No additional information is required to create the pool and reasonable default values will be chosen based on the number of child vdevs in the dRAID vdev. zpool create <pool> draid[1,2,3] <vdevs...> Unlike raidz, additional optional dRAID configuration values can be provided as part of the draid type as colon separated values. This allows administrators to fully specify a layout for either performance or capacity reasons. The supported options include: zpool create <pool> \ draid[<parity>][:<data>d][:<children>c][:<spares>s] \ <vdevs...> - draid[parity] - Parity level (default 1) - draid[:<data>d] - Data devices per group (default 8) - draid[:<children>c] - Expected number of child vdevs - draid[:<spares>s] - Distributed hot spares (default 0) Abbreviated example `zpool status` output for a 68 disk dRAID pool with two distributed spares using special allocation classes. ``` pool: tank state: ONLINE config: NAME STATE READ WRITE CKSUM slag7 ONLINE 0 0 0 draid2:8d:68c:2s-0 ONLINE 0 0 0 L0 ONLINE 0 0 0 L1 ONLINE 0 0 0 ... U25 ONLINE 0 0 0 U26 ONLINE 0 0 0 spare-53 ONLINE 0 0 0 U27 ONLINE 0 0 0 draid2-0-0 ONLINE 0 0 0 U28 ONLINE 0 0 0 U29 ONLINE 0 0 0 ... U42 ONLINE 0 0 0 U43 ONLINE 0 0 0 special mirror-1 ONLINE 0 0 0 L5 ONLINE 0 0 0 U5 ONLINE 0 0 0 mirror-2 ONLINE 0 0 0 L6 ONLINE 0 0 0 U6 ONLINE 0 0 0 spares draid2-0-0 INUSE currently in use draid2-0-1 AVAIL ``` When adding test coverage for the new dRAID vdev type the following options were added to the ztest command. These options are leverages by zloop.sh to test a wide range of dRAID configurations. -K draid\|raidz\|random - kind of RAID to test -D <value> - dRAID data drives per group -S <value> - dRAID distributed hot spares -R <value> - RAID parity (raidz or dRAID) The zpool_create, zpool_import, redundancy, replacement and fault test groups have all been updated provide test coverage for the dRAID feature. Co-authored-by: Isaac Huang <he.huang@intel.com> Co-authored-by: Mark Maybee <mmaybee@cray.com> Co-authored-by: Don Brady <don.brady@delphix.com> Co-authored-by: Matthew Ahrens <mahrens@delphix.com> Co-authored-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Mark Maybee <mmaybee@cray.com> Reviewed-by: Matt Ahrens <matt@delphix.com> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #10102	2020-11-13 13:51:51 -08:00
Brian Behlendorf	c08d442e45	Linux: Fix mount/unmount when dataset name has a space The custom zpl_show_devname() helper should translate spaces in to the octal escape sequence \040. The getmntent(2) function is aware of this convention and properly translates the escape character back to a space when reading the fsname. Without this change the `zfs mount` and `zfs unmount` commands incorrectly detect when a dataset with a name containing spaces is mounted. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #11182 Closes #11187	2020-11-11 17:14:24 -08:00
Tony Perkins	9bd14b8724	Start snapdir_iterate traversals to begin wtih the value of zero. The microzap hash can sometimes be zero for single digit snapnames. The zap cursor can then have a serialized value of two (for . and ..), and skip the first entry in the avl tree for the .zfs/snapshot directory listing, and therefore does not return all snapshots. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Cedric Berger <cedric@precidata.com> Signed-off-by: Tony Perkins <tperkins@datto.com> Closes #11039	2020-11-11 17:06:16 -08:00
sterlingjensen	a4ae4998cb	Fix memleak in cmd/mount_zfs.c Convert dynamic allocation to static buffer, simplify parse_dataset function return path. Add tests specific to the mount helper. Reviewed-by: Mateusz Guzik <mjguzik@gmail.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Sterling Jensen <sterlingjensen@users.noreply.github.com> Closes #11098	2020-11-10 15:50:44 -08:00
Ryan Moeller	52e585a822	ZTS: Add L1 corruption test Add a new test case which corrupts all level 1 block in a file. Then verifies that corruption is detected and repaired. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #11141	2020-11-05 17:16:25 -08:00
Ryan Moeller	cce66dfa8e	ZTS: Output all block copies in list_file_blocks The second part of list_file_blocks transforms the object description output by zdb -ddddd $ds $objnum into a stream of lines of the form "level path offset length" for the indirect blocks in the given file. The current code only works for the first copy of L0 blocks. L1 and L2 indirect blocks have more than one copy on disk. Add one more -d to the zdb command so we get all block copies and rewrite the transformation to match more than L0 and output all DVAs. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #11141	2020-11-05 17:16:21 -08:00
Ryan Moeller	94deb47872	ZTS: Fix list_file_blocks for mirror vdevs, level > 0 The first part of list_file_blocks transforms the pool configuration output by zdb -C $pool into shell code to set up a shell variable, VDEV_MAP, that maps from vdev id to the underlying vdev path. This variable is a simple indexed array. However, the vdev id in a DVA is only the id of the top level vdev. When the pool is mirrored, the top level vdev is a mirror and its children are the mirrored devices. So, what we need is to map from the top level vdev id to a list of the underlying vdev paths. ist_file_blocks does not need to work for raidz vdevs, so we can disregard that case. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #11141	2020-11-05 17:16:16 -08:00
Brian Behlendorf	2d7843401a	ZTS: zdb_block_size_histogram increase variance The expected variance for this test case was originally set at 10% based on local testing. Additional testing via the CI has show it can be as large as 11%. Increase the expected maximum to 12% to prevent this test from incorrectly failing. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #11148	2020-11-03 09:20:34 -08:00
Brian Behlendorf	123d2803bf	ZTS: Wait on all events in events_001_pos.ksh The events_001_pos.ksh test case can fail because it's possible, and correct, for the config_sync event to be posted after the last "expected" event. To accommodate this the run_and_verify() function has been updated to wait for all non-history events, not just the last event. This does not increase the run time of the test as long as all the events do get generated. Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #11147	2020-11-03 09:19:45 -08:00
Ryan Moeller	76d04993a6	Update references to nonexistent man pages in code Refer to the correct section or alternative for FreeBSD and Linux. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #11132	2020-10-30 08:55:59 -07:00
Tony Hutter	f829227f49	ZTS: Fix xattr_004_pos failure, don't use tmpfs Previously, xattr_004_pos would create files with xattrs on both tmpfs and ext2, and then copy them to zfs to verify that their xattrs were preserved. However tmpfs doesn't support xattrs. This was never noticed until Fedora 33. In Fedora 32 and older, /tmp was on the root partition (like ext4), whereas on Fedora 33 /tmp is actually tmpfs. That caused this test to fail on Fedora 33. This fix updates the test to only create the file on ext2, not tmpfs. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #11133	2020-10-30 08:47:42 -07:00
Adam D. Moss	666aa69f32	Non-l2arc pool reads shouldn't be l2arc misses The current l2_misses accounting behavior treats all reads to pools without a configured l2arc as an l2arc miss, IFF there is at least one other pool on the system which does have an l2arc configured. This makes it extremely hard to tune for an improved l2arc hit/miss ratio because this ratio will be modulated by reads from pools which do not (and should not) have l2arc devices; its upper limit will depend on the ratio of reads from l2arc'd pools and non-l2arc'd pools. This PR prevents ARC reads affecting l2arc stats (n.b. l2_misses is the only relevant one) where the target spa doesn't have an l2arc. Includes new test - l2arc_l2miss_pos.ksh Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Amanakis <gamanakis@gmail.com> Signed-off-by: Adam Moss <c@yotes.com> Closes #10921	2020-10-20 11:39:52 -07:00
Don Brady	13d65987a9	zed syslog entries drop important info ZED will log zevents summaries to the syslog, however the log entries tend to drop event details that can be useful for diagnosis. This is especially true for ereport events, like io, checksum, and delay. Update the all-syslog.sh script to log additional event information. Add an optional config option, ZED_SYSLOG_DISPLAY_GUIDS, to zed.rc for choosing GUIDs over names for pool and vdev. Change the default ZED_SYSLOG_SUBCLASS_EXCLUDE to exclude history_event events. These events tend to be frequent, convey no meaningful info, and are already logged in the zpool history. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Don Brady <don.brady@delphix.com> Closes #10967	2020-10-19 11:01:00 -07:00
Christian Schwarz	15a4ca4620	Fix crash caused by invalid snapshot names in redactnvl This is a follow up fix for commit `0fdd6106bb`. The VERIFY is only true when we haven't hit an error code path. See added test case for a reproducer. Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Christian Schwarz <me@cschwarz.com> Closes #11048	2020-10-14 14:04:19 -07:00
Ryan Moeller	485b50bb9e	Cross-platform acltype The acltype property is currently hidden on FreeBSD and does not reflect the NFSv4 style ZFS ACLs used on the platform. This makes it difficult to observe that a pool imported from FreeBSD on Linux has a different type of ACL that is being ignored, and vice versa. Add an nfsv4 acltype and expose the property on FreeBSD. Make the default acltype nfsv4 on FreeBSD. Setting acltype to an unhanded style is treated the same as setting it to off. The ACLs will not be removed, but they will be ignored. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10520	2020-10-13 21:25:48 -07:00
Richard Elling	e9527d44e6	Add zpool_influxdb command A zpool_influxdb command is introduced to ease the collection of zpool statistics into the InfluxDB time-series database. Examples are given on how to integrate with the telegraf statistics aggregator, a companion to influxdb. Finally, a grafana dashboard template is included to show how pool latency distributions can be visualized in a ZFS + telegraf + influxdb + grafana environment. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Richard Elling <Richard.Elling@RichardElling.com> Closes #10786	2020-10-09 09:29:21 -07:00
Ryan Moeller	b7ab7ae241	Linux: Initialize zp in zfs_setattr_dir The value of zp is used without having been initialized under some conditions. Initialize the pointer to NULL. Add a regression test case using chown in acl/posix. However, this is not enough because the setup sets xattr=sa, which means zfs_setattr_dir will not be called. Create a second group of acl tests in acl/posix-sa duplicating the acl/posix tests with symlinks, and remove xattr=sa from the original acl/posix tests. This provides more coverage for the default xattr=on code. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10043 Closes #11025	2020-10-09 09:27:14 -07:00
Brian Behlendorf	d0249a4bd0	Replace ZFS on Linux references with OpenZFS This change updates the documentation to refer to the project as OpenZFS instead ZFS on Linux. Web links have been updated to refer to https://github.com/openzfs/zfs. The extraneous zfsonlinux.org web links in the ZED and SPL sources have been dropped. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Richard Laager <rlaager@wiktel.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #11007	2020-10-08 20:10:13 -07:00
Ryan Moeller	84cfe5d4f8	ZTS: Fix path to /dev/null in nopwrite_recsize Don't direct stdout and stderr of dd to $TEST_BASE_DIR/null, direct it to /dev/null. Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #11026	2020-10-08 16:39:23 -07:00
Ryan Moeller	73989f4b9e	Make dbufstat work on FreeBSD With procfs_list kstats implemented for FreeBSD, dbufs are now exposed as kstat.zfs.misc.dbufs. On FreeBSD, dbufstats can use the sysctl instead of procfs when no input file has been given. Enable the dbufstats tests on FreeBSD. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org> Closes #11008	2020-10-08 09:40:23 -07:00
George Amanakis	a76e4e6761	Make L2ARC tests more robust Instead of relying on arbitrary timers after pool export/import or cache device off/online rely on arcstats. This makes the L2ARC tests more robust. Also cleanup some functions related to persistent L2ARC. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Adam Moss <c@yotes.com> Signed-off-by: George Amanakis <gamanakis@gmail.com> Closes #10983	2020-10-05 15:29:05 -07:00
John Poduska	5b525165e9	Mismatched nvlist names in zfs_keys_send_space This causes "zfs send -vt ..." to fail with: cannot resume send: Unknown error 1030 It turns out that some of the name/value pairs in the verification list for zfs_ioc_send_space(), zfs_keys_send_space, had the wrong name, so the ioctl got kicked out in zfs_check_input_nvpairs(). Update the names accordingly. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: John Poduska <jpoduska@datto.com> Closes #10978	2020-10-02 17:40:46 -07:00
Ryan Moeller	c0bd2e0fe2	Drop references when skipping dmu_send due to EXDEV When an invalid incremental send is requested where the "to" ds is before the "from" ds, make sure to drop the reference to the pool and the dataset before returning the error. Add an assert on FreeBSD to make sure we don't hold any locks after returning from an ioctl. Add some test coverage. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10919	2020-09-30 13:19:49 -07:00
George Wilson	c494aa7f57	vdev_ashift should only be set once == Motivation and Context The new vdev ashift optimization prevents the removal of devices when a zfs configuration is comprised of disks which have different logical and physical block sizes. This is caused because we set 'spa_min_ashift' in vdev_open and then later call 'vdev_ashift_optimize'. This would result in an inconsistency between spa's ashift calculations and that of the top-level vdev. In addition, the optimization logical ignores the overridden ashift value that would be provided by '-o ashift=<val>'. == Description This change reworks the vdev ashift optimization so that it's only set the first time the device is configured. It still allows the physical and logical ahsift values to be set every time the device is opened but those values are only consulted on first open. Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Cedric Berger <cedric@precidata.com> Signed-off-by: George Wilson <gwilson@delphix.com> External-Issue: DLPX-71831 Closes #10932	2020-09-18 12:13:47 -07:00
Ryan Moeller	7ead2be3d2	Rename acltype=posixacl to acltype=posix Prefer acltype=off\|posix, retaining the old names as aliases. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10918	2020-09-16 12:26:06 -07:00
Toomas Soome	1db9e6e4e4	zfs label bootenv should store data as nvlist nvlist does allow us to support different data types and systems. To encapsulate user data to/from nvlist, the libzfsbootenv library is provided. Reviewed-by: Arvind Sankar <nivedita@alum.mit.edu> Reviewed-by: Allan Jude <allan@klarasystems.com> Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Igor Kozhukhov <igor@dilos.org> Signed-off-by: Toomas Soome <tsoome@me.com> Closes #10774	2020-09-15 15:42:27 -07:00
George Amanakis	085321621e	Add L2ARC arcstats for MFU/MRU buffers and buffer content type Currently the ARC state (MFU/MRU) of cached L2ARC buffer and their content type is unknown. Knowing this information may prove beneficial in adjusting the L2ARC caching policy. This commit adds L2ARC arcstats that display the aligned size (in bytes) of L2ARC buffers according to their content type (data/metadata) and according to their ARC state (MRU/MFU or prefetch). It also expands the existing evict_l2_eligible arcstat to differentiate between MFU and MRU buffers. L2ARC caches buffers from the MRU and MFU lists of ARC. Upon caching a buffer, its ARC state (MRU/MFU) is stored in the L2 header (b_arcs_state). The l2_m{f,r}u_asize arcstats reflect the aligned size (in bytes) of L2ARC buffers according to their ARC state (based on b_arcs_state). We also account for the case where an L2ARC and ARC cached MRU or MRU_ghost buffer transitions to MFU. The l2_prefetch_asize reflects the alinged size (in bytes) of L2ARC buffers that were cached while they had the prefetch flag set in ARC. This is dynamically updated as the prefetch flag of L2ARC buffers changes. When buffers are evicted from ARC, if they are determined to be L2ARC eligible then their logical size is recorded in evict_l2_eligible_m{r,f}u arcstats according to their ARC state upon eviction. Persistent L2ARC: When committing an L2ARC buffer to a log block (L2ARC metadata) its b_arcs_state and prefetch flag is also stored. If the buffer changes its arcstate or prefetch flag this is reflected in the above arcstats. However, the L2ARC metadata cannot currently be updated to reflect this change. Example: L2ARC caches an MRU buffer. L2ARC metadata and arcstats count this as an MRU buffer. The buffer transitions to MFU. The arcstats are updated to reflect this. Upon pool re-import or on/offlining the L2ARC device the arcstats are cleared and the buffer will now be counted as an MRU buffer, as the L2ARC metadata were not updated. Bug fix: - If l2arc_noprefetch is set, arc_read_done clears the L2CACHE flag of an ARC buffer. However, prefetches may be issued in a way that arc_read_done() is bypassed. Instead, move the related code in l2arc_write_eligible() to account for those cases too. Also add a test and update manpages for l2arc_mfuonly module parameter, and update the manpages and code comments for l2arc_noprefetch. Move persist_l2arc tests to l2arc. Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org> Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: George Amanakis <gamanakis@gmail.com> Closes #10743	2020-09-14 10:10:44 -07:00
Don Brady	4f07282786	Avoid posting duplicate zpool events Duplicate io and checksum ereport events can misrepresent that things are worse than they seem. Ideally the zpool events and the corresponding vdev stat error counts in a zpool status should be for unique errors -- not the same error being counted over and over. This can be demonstrated in a simple example. With a single bad block in a datafile and just 5 reads of the file we end up with a degraded vdev, even though there is only one unique error in the pool. The proposed solution to the above issue, is to eliminate duplicates when posting events and when updating vdev error stats. We now save recent error events of interest when posting events so that we can easily check for duplicates when posting an error. Reviewed by: Brad Lewis <brad.lewis@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Don Brady <don.brady@delphix.com> Closes #10861	2020-09-04 10:34:28 -07:00
Ryan Moeller	964791acdc	Make spa_stats.c tunables visible on FreeBSD Use ZFS_MODULE_PARAM for cross-platform tunables in spa_stats.c, and add update tunables.cfg in tests for the newly supported ones. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10858	2020-09-01 16:19:19 -07:00
Ryan Moeller	7b4e27232d	Add 'zfs rename -u' to rename without remounting Allow to rename file systems without remounting if it is possible. It is possible for file systems with 'mountpoint' property set to 'legacy' or 'none' - we don't have to change mount directory for them. Currently such file systems are unmounted on rename and not even mounted back. This introduces layering violation, as we need to update 'f_mntfromname' field in statfs structure related to mountpoint (for the dataset we are renaming and all its children). In my opinion it is worth it, as it allow to update FreeBSD in even cleaner way - in ZFS-only configuration root file system is ZFS file system with 'mountpoint' property set to 'legacy'. If root dataset is named system/rootfs, we can snapshot it (system/rootfs@upgrade), clone it (system/oldrootfs), update FreeBSD and if it doesn't boot we can boot back from system/oldrootfs and rename it back to system/rootfs while it is mounted as /. Before it was not possible, because unmounting / was not possible. Authored by: Pawel Jakub Dawidek <pjd@FreeBSD.org> Reviewed-by: Allan Jude <allan@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Ported by: Matt Macy <mmacy@freebsd.org> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10839	2020-09-01 16:14:16 -07:00
Paul Dagnelie	4aa3b3bd47	Always track temporary fses and snapshots for accounting The root cause of the issue is that we only occasionally do as the comments in the code suggest and actually ignore the %recv dataset when it comes to filesystem limit tracking. Specifically, the only time we ignore it is when initializing the filesystem and snapshot limit values; when creating a new %recv dataset or deleting one, we always update the bookkeeping. This causes a problem if you init the fs count on a filesystem that already has a %recv dataset, since the bookmarking will be decremented but not incremented. This is resolved in this patch by simply always tracking the %recv dataset as a child. Reviewed-by: Matt Ahrens <matt@delphix.com> Reviewed by: Jerry Jelinek <jerry.jelinek@joyent.com> Signed-off-by: Paul Dagnelie <pcd@delphix.com> Closes #10791	2020-08-26 21:38:27 -07:00
Ryan Moeller	b0e75805ee	ZTS: Improve block_device_wait on FreeBSD FreeBSD doesn't have an equivalent to udevadm settle, so we have been resorting to a three second sleep to wait for device changes to take effect. This is far from ideal. We are mainly waiting for volmode=geom zvols to appear in /dev, so as a hack, reading the geom config will have the desired effect of quiescing the geom state. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10768	2020-08-24 08:50:15 -07:00
Tony Nguyen	c686c6fe75	ZFS performance tests should clean up NFS mount This change umounts client's NFS mount after each test so we can avoid two sporadic issues: 1) client NFS stale mount and 2) zpool export and zpool destroy failed due to dataset busy Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Nguyen <tony.nguyen@delphix.com> Closes #10767	2020-08-23 15:14:22 -07:00
Don Brady	7bba1d404c	'zfs share -a' should clean noauto exports This is a follow on to PR #10688 where `zfs share -a` allows the sharing of canmount=noauto datasets if they are mounted. However, when a dataset with canmount=noauto is not mounted, the command should also purge any existing entries from the exports file. Otherwise, after a reboot, the nfs server attempts to export the underlying mountpath, not the dataset. This can lead to a hard hang for existing client mounts. Instead of just skipping the adding of an export if not mounted and canmount=noauto, have it also remove an existing export of the dataset so that, after a reboot, we don't export an unmounted dataset. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Wilson <gwilson@delphix.com> Signed-off-by: Don Brady <don.brady@delphix.com> Closes #10747	2020-08-20 13:12:12 -07:00
Michael Niewöhner	10b3c7f5e4	Add zstd support to zfs This PR adds two new compression types, based on ZStandard: - zstd: A basic ZStandard compression algorithm Available compression. Levels for zstd are zstd-1 through zstd-19, where the compression increases with every level, but speed decreases. - zstd-fast: A faster version of the ZStandard compression algorithm zstd-fast is basically a "negative" level of zstd. The compression decreases with every level, but speed increases. Available compression levels for zstd-fast: - zstd-fast-1 through zstd-fast-10 - zstd-fast-20 through zstd-fast-100 (in increments of 10) - zstd-fast-500 and zstd-fast-1000 For more information check the man page. Implementation details: Rather than treat each level of zstd as a different algorithm (as was done historically with gzip), the block pointer `enum zio_compress` value is simply zstd for all levels, including zstd-fast, since they all use the same decompression function. The compress= property (a 64bit unsigned integer) uses the lower 7 bits to store the compression algorithm (matching the number of bits used in a block pointer, as the 8th bit was borrowed for embedded block pointers). The upper bits are used to store the compression level. It is necessary to be able to determine what compression level was used when later reading a block back, so the concept used in LZ4, where the first 32bits of the on-disk value are the size of the compressed data (since the allocation is rounded up to the nearest ashift), was extended, and we store the version of ZSTD and the level as well as the compressed size. This value is returned when decompressing a block, so that if the block needs to be recompressed (L2ARC, nop-write, etc), that the same parameters will be used to result in the matching checksum. All of the internal ZFS code ( `arc_buf_hdr_t`, `objset_t`, `zio_prop_t`, etc.) uses the separated _compress and _complevel variables. Only the properties ZAP contains the combined/bit-shifted value. The combined value is split when the compression_changed_cb() callback is called, and sets both objset members (os_compress and os_complevel). The userspace tools all use the combined/bit-shifted value. Additional notes: zdb can now also decode the ZSTD compression header (flag -Z) and inspect the size, version and compression level saved in that header. For each record, if it is ZSTD compressed, the parameters of the decoded compression header get printed. ZSTD is included with all current tests and new tests are added as-needed. Per-dataset feature flags now get activated when the property is set. If a compression algorithm requires a feature flag, zfs activates the feature when the property is set, rather than waiting for the first block to be born. This is currently only used by zstd but can be extended as needed. Portions-Sponsored-By: The FreeBSD Foundation Co-authored-by: Allan Jude <allanjude@freebsd.org> Co-authored-by: Brian Behlendorf <behlendorf1@llnl.gov> Co-authored-by: Sebastian Gottschall <s.gottschall@dd-wrt.com> Co-authored-by: Kjeld Schouten-Lebbing <kjeld@schouten-lebbing.nl> Co-authored-by: Michael Niewöhner <foss@mniewoehner.de> Signed-off-by: Allan Jude <allan@klarasystems.com> Signed-off-by: Allan Jude <allanjude@freebsd.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Sebastian Gottschall <s.gottschall@dd-wrt.com> Signed-off-by: Kjeld Schouten-Lebbing <kjeld@schouten-lebbing.nl> Signed-off-by: Michael Niewöhner <foss@mniewoehner.de> Closes #6247 Closes #9024 Closes #10277 Closes #10278	2020-08-20 10:30:06 -07:00
Brian Behlendorf	5266a0728a	ZED: Do not offline a missing device if no spare is available Due to commit `d48091d` a removed device is now explicitly offlined by the ZED if no spare is available, rather than the letting ZFS detect it as UNAVAIL. This broke auto-replacing of whole-disk devices, as described in issue #10577. In short, when a new device is reinserted in the same slot, the ZED will try to ONLINE it without letting ZFS recreate the necessary partition table. This change simply avoids setting the device OFFLINE when removed if no spare is available (or if spare_on_remove is false). This change has been left minimal to allow it to be backported to 0.8.x release. The auto_offline_001_pos ZTS test has been updated accordingly. Some follow up work is planned to update the ZED so it transitions the vdev to a REMOVED state. This is a state which has always existed but there is no current interface the ZED can use to accomplish this. Therefore it's being left to a follow up PR. Reviewed-by: Gionatan Danti <g.danti@assyoma.it> Co-authored-by: Gionatan Danti <g.danti@assyoma.it> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #10577 Closes #10730	2020-08-18 22:13:17 -07:00
Brian Behlendorf	d60c0dbdf3	ZTS: ztest may cause mmp tests failures The mmp_exported_import and mmp_inactive_import tests depend on ztest simulating an active pool. If ztest unexpectedly terminates due to an unrelated issue the test case will fail. Since ztest is not yet 100% reliable I've added these tests to the maybe exception list. They can be removed when the issues with ztest are resolved or if the test cases are updated to handle these unexpected failures. Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #10726	2020-08-17 22:31:18 -07:00
Allan Jude	fc34dfba8e	Fix L2ARC reads when compressed ARC disabled When reading compressed blocks from the L2ARC, with compressed ARC disabled, arc_hdr_size() returns LSIZE rather than PSIZE, but the actual read is PSIZE. This causes l2arc_read_done() to compare the checksum against the wrong size, resulting in checksum failure. This manifests as an increase in the kstat l2_cksum_bad and the read being retried from the main pool, making the L2ARC ineffective. Add new L2ARC tests with Compressed ARC enabled/disabled Blocks are handled differently depending on the state of the zfs_compressed_arc_enabled tunable. If a block is compressed on-disk, and compressed_arc is enabled: - the block is read from disk - It is NOT decompressed - It is added to the ARC in its compressed form - l2arc_write_buffers() may write it to the L2ARC (as is) - l2arc_read_done() compares the checksum to the BP (compressed) However, if compressed_arc is disabled: - the block is read from disk - It is decompressed - It is added to the ARC (uncompressed) - l2arc_write_buffers() will use l2arc_apply_transforms() to recompress the block, before writing it to the L2ARC - l2arc_read_done() compares the checksum to the BP (compressed) - l2arc_read_done() will use l2arc_untransform() to uncompress it This test writes out a test file to a pool consisting of one disk and one cache device, then randomly reads from it. Since the arc_max in the tests is low, this will feed the L2ARC, and result in reads from the L2ARC. We compare the value of the kstat l2_cksum_bad before and after to determine if any blocks failed to survive the trip through the L2ARC. Sponsored-by: The FreeBSD Foundation Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Allan Jude <allanjude@freebsd.org> Closes #10693	2020-08-13 23:31:20 -07:00
George Wilson	53c9d1d9b5	'zfs share -a' should handle 'canmount=noauto' The 'zfs share -a' currently skips any filesystems which have 'canmount=noauto' set. This behavior is unexpected since the one would expect 'zfs share -a' to share any mounted filesystem that has the 'sharenfs' property already set. This changes the behavior of 'zfs share -a' to allow the sharing of 'canmount=noauto' datasets if they are mounted. Reviewed-by: Matt Ahrens <matt@delphix.com> Reviewed-by: Don Brady <don.brady@delphix.com> Reviewed-by: Prakash Surya <prakash.surya@delphix.com> Signed-off-by: George Wilson <gwilson@delphix.com> External-issue: DLPX-71313 Closes #10688	2020-08-11 13:55:04 -07:00
Ryan Moeller	d4e6e9597d	ZTS: Remove bashisms from zfs-tests.sh Bring zfs-tests.sh in to compliance with the other scripts by converting it /bin/sh for to avoid a dependency on bash. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10640	2020-08-07 14:10:48 -07:00
Ryan Moeller	b6737193ee	FreeBSD: Fix `zfs jail` and add a test zfs_jail was not using zfs_ioctl so failed to map the IOC number correctly. Use zfs_ioctl to perform the jail ioctl and add a test case for FreeBSD. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10658	2020-08-01 08:44:54 -07:00
Ryan Moeller	af524bf7ff	ZTS: FreeBSD does have a l2arc.trim_ahead tunable Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10633	2020-07-31 21:18:32 -07:00
Ryan Moeller	18c624302d	ZTS: zvol_misc_volmode is flaky on FreeBSD Mark this as a known issue. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10655	2020-07-31 21:05:55 -07:00
Ryan Moeller	721ed01548	ZTS: Use POSIX-compatible space character class FreeBSD recently integrated a change which causes \s in a regex to throw an error instead of silently being misinterpreted as an s. Change the regex in zpool_colors.ksh to use [[:space:]]. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org> Closes #10651	2020-07-31 18:11:21 -07:00
Matthew Ahrens	948423a3d1	zfs promote does not delete livelist of origin When a clone is promoted, its livelist is no longer accurate, so it is discarded. If the clone's origin is also a clone (i.e. we are promoting a clone of a clone), then the origin's livelist is also no longer accurate, so it should be discarded, but the code doesn't actually do that. Consider a pool with: * Filesystem A * Clone B, a clone of A * Clone C, a clone of B If we promote C, it discards C's livelist. It should discard B's livelist, but that is not happening. The impact is that when B is destroyed, we use the livelist to find the blocks to free, but the livelist is no longer correct so we end up freeing blocks that are still in use by C. The incorrectly-freed blocks can be reallocated causing checksum errors. And when C is destroyed it can double-free the incorrectly-freed blocks. The problem is that we remove the livelist of `origin_ds->ds_dir`, but the origin snapshot has already been moved to the promoted dsl_dir. So this is actually trying to remove the livelist of the promoted dsl_dir, which was already removed. As explained in a comment in the beginning of `dsl_dataset_promote_sync()`, we need to use the saved `odd` for the origin's dsl_dir. Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Wilson <gwilson@delphix.com> Reviewed by: Sara Hartse <sara.hartse@delphix.com> Signed-off-by: Matthew Ahrens <mahrens@delphix.com> Closes #10652	2020-07-31 08:59:00 -07:00
Don Brady	a15c6f3310	ZTS: minor improvements to alloc_class_009_pos functional test * Fixed a typo that cause one of the variations to be a no-op * Added additional coverage for adding special vdev after pool create * Added additional coverage for using 4K sector size Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: Don Brady <don.brady@delphix.com> Closes #10641	2020-07-30 09:11:05 -07:00
Matthew Ahrens	3eabed74c0	Fix lua stack overflow on recursive call to gsub() The `zfs program` subcommand invokes a LUA interpreter to run ZFS "channel programs". This interpreter runs in a constrained environment, with defined memory limits. The LUA stack (used for LUA functions that call each other) is allocated in the kernel's heap, and is limited by the `-m MEMORY-LIMIT` flag and the `zfs_lua_max_memlimit` module parameter. The C stack is used by certain LUA features that are implemented in C. The C stack is limited by `LUAI_MAXCCALLS=20`, which limits call depth. Some LUA C calls use more stack space than others, and `gsub()` uses an unusually large amount. With a programming trick, it can be invoked recursively using the C stack (rather than the LUA stack). This overflows the 16KB Linux kernel stack after about 11 iterations, less than the limit of 20. One solution would be to decrease `LUAI_MAXCCALLS`. This could be made to work, but it has a few drawbacks: 1. The existing test suite does not pass with `LUAI_MAXCCALLS=10`. 2. There may be other LUA functions that use a lot of stack space, and the stack space may change depending on compiler version and options. This commit addresses the problem by adding a new limit on the amount of free space (in bytes) remaining on the C stack while running the LUA interpreter: `LUAI_MINCSTACK=4096`. If there is less than this amount of stack space remaining, a LUA runtime error is generated. Reviewed-by: George Wilson <gwilson@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@ixsystems.com> Reviewed-by: Allan Jude <allanjude@freebsd.org> Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com> Signed-off-by: Matthew Ahrens <mahrens@delphix.com> Closes #10611 Closes #10613	2020-07-27 16:11:47 -07:00
tony-zfs	02fced3067	Add support to decode a resume token Adding a new subcommand to zstream called token. This now allows users to decode a resume token to retrieve the toname field. This can be useful for tools that need this information. The syntax works as follows zstream token <resume_token>. Reviewed-by: Matt Ahrens <matt@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Paul Zuchowski <pzuchowski@datto.com> Signed-off-by: Tony Perkins <tperkins@datto.com> Closes #10558	2020-07-23 17:44:03 -07:00
Ryan Moeller	0c79b070a7	ZTS: Fix devname2devid build on FreeBSD with libudev When libudev is installed on FreeBSD, configure finds it and sets WANT_DEVNAME2DEVID, but it isn't found by the linker because we didn't specify where it is. Use LIBUDEV_LIBS so the location of the library gets added to the linker flags for devname2devid. Also use LIBUDEV_CFLAGS here in case some other platform needs it. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Arvind Sankar <nivedita@alum.mit.edu> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10590	2020-07-22 10:49:22 -07:00
Brian Behlendorf	e862b7ecfc	Linux 4.10 compat: has_capability() Stock kernels older than 4.10 do not export the has_capability() function which is required by commit `e59a377`. To avoid breaking the build on older kernels revert to the safe legacy behavior and return EACCES when privileges cannot be checked. Reviewed-by: Ryan Moeller <ryan@ixsystems.com> Reviewed-by: Matt Ahrens <matt@delphix.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #10565 Closes #10573	2020-07-19 09:56:21 -07:00
Brian Behlendorf	f1b8153794	Update zts-report.py with additional tests The following test cases have been observed to fail frequently enough to be a problem when reporting CI results. Until they can be updated to be entirely reliable add them to the zts-report.py script. alloc_class/alloc_class_011_neg cli_root/zpool_import/zpool_import_012_pos mmp/mmp_on_uberblocks rsend/send_partial_dataset Reviewed-by: Ryan Moeller <ryan@ixsystems.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #10578	2020-07-15 21:28:18 -07:00
Ryan Moeller	103bc5b957	ZTS: Fix nonportable use of stat in list_file_blocks FreeBSD stat uses -f to specify the format string rather than -c. list_file_blocks in blkdev.shlib uses stat -c %i to get a file's object ID for zdb. We already have a library function to do this portably. Use get_objnum to get the file's object ID. Take log_must off of the call to list_free_blocks in corrupt_blocks_at_level, which had masked the error. It was not good to pipe the output of log_must into the while-loop, anyway. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alek Pinchuk <apinchuk@datto.com> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10572	2020-07-15 21:26:39 -07:00
Matthew Ahrens	6774931dfa	Extend zdb to print inconsistencies in livelists and metaslabs Livelists and spacemaps are data structures that are logs of allocations and frees. Livelists entries are block pointers (blkptr_t). Spacemaps entries are ranges of numbers, most often used as to track allocated/freed regions of metaslabs/vdevs. These data structures can become self-inconsistent, for example if a block or range can be "double allocated" (two allocation records without an intervening free) or "double freed" (two free records without an intervening allocation). ZDB (as well as zfs running in the kernel) can detect these inconsistencies when loading livelists and metaslab. However, it generally halts processing when the error is detected. When analyzing an on-disk problem, we often want to know the entire set of inconsistencies, which is not possible with the current behavior. This commit adds a new flag, `zdb -y`, which analyzes the livelist and metaslab data structures and displays all of their inconsistencies. Note that this is different from the leak detection performed by `zdb -b`, which checks for inconsistencies between the spacemaps and the tree of block pointers, but assumes the spacemaps are self-consistent. The specific checks added are: Verify livelists by iterating through each sublivelists and: - report leftover FREEs - report double ALLOCs and double FREEs - record leftover ALLOCs together with their TXG [see Cross Check] Verify spacemaps by iterating over each metaslab and: - iterate over spacemap and then the metaslab's entries in the spacemap log, then report any double FREEs and double ALLOCs Verify that livelists are consistenet with spacemaps. The space referenced by livelists (after using the FREE's to cancel out corresponding ALLOCs) should be allocated, according to the spacemaps. Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Co-authored-by: Sara Hartse <sara.hartse@delphix.com> Signed-off-by: Matthew Ahrens <mahrens@delphix.com> External-issue: DLPX-66031 Closes #10515	2020-07-14 17:51:05 -07:00
Arvind Sankar	38e2e9ce83	Centralize variable substitution A bunch of places need to edit files to incorporate the configured paths i.e. bindir, sbindir etc. Move this logic into a common file. Create arc_summary by copying arc_summary[23] as appropriate at build time instead of install time. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Closes #10559	2020-07-14 17:33:44 -07:00
George Wilson	c15d36c674	Remove dependency on sharetab file and refactor sharing logic == Motivation and Context The current implementation of 'sharenfs' and 'sharesmb' relies on the use of the sharetab file. The use of this file is os-specific and not required by linux or freebsd. Currently the code must maintain updates to this file which adds complexity and presents a significant performance impact when sharing many datasets. In addition, concurrently running 'zfs sharenfs' command results in missing entries in the sharetab file leading to unexpected failures. == Description This change removes the sharetab logic from the linux and freebsd implementation of 'sharenfs' and 'sharesmb'. It still preserves an os-specific library which contains the logic required for sharing NFS or SMB. The following entry points exist in the vastly simplified libshare library: - sa_enable_share -- shares a dataset but may not commit the change - sa_disable_share -- unshares a dataset but may not commit the change - sa_is_shared -- determine if a dataset is shared - sa_commit_share -- notify NFS/SMB subsystem to commit the shares - sa_validate_shareopts -- determine if sharing options are valid The sa_commit_share entry point is provided as a performance enhancement and is not required. The sa_enable_share/sa_disable_share may commit the share as part of the implementation. Libshare provides a framework for both NFS and SMB but some operating systems may not fully support these protocols or all features of the protocol. NFS Operation: For linux, libshare updates /etc/exports.d/zfs.exports to add and remove shares and then commits the changes by invoking 'exportfs -r'. This file, is automatically read by the kernel NFS implementation which makes for better integration with the NFS systemd service. For FreeBSD, libshare updates /etc/zfs/exports to add and remove shares and then commits the changes by sending a SIGHUP to mountd. SMB Operation: For linux, libshare adds and removes files in /var/lib/samba/usershares by calling the 'net' command directly. There is no need to commit the changes. FreeBSD does not support SMB. == Performance Results To test sharing performance we created a pool with an increasing number of datasets and invoked various zfs actions that would enable and disable sharing. The performance testing was limited to NFS sharing. The following tests were performed on an 8 vCPU system with 128GB and a pool comprised of 4 50GB SSDs: Scale testing: - Share all filesystems in parallel -- zfs sharenfs=on <dataset> & - Unshare all filesystems in parallel -- zfs sharenfs=off <dataset> & Functional testing: - share each filesystem serially -- zfs share -a - unshare each filesystem serially -- zfs unshare -a - reset sharenfs property and unshare -- zfs inherit -r sharenfs <pool> For 'zfs sharenfs=on' scale testing we saw an average reduction in time of 89.43% and for 'zfs sharenfs=off' we saw an average reduction in time of 83.36%. Functional testing also shows a huge improvement: - zfs share -- 97.97% reduction in time - zfs unshare -- 96.47% reduction in time - zfs inhert -r sharenfs -- 99.01% reduction in time Reviewed-by: Matt Ahrens <matt@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@ixsystems.com> Reviewed-by: Bryant G. Ly <bryangly@gmail.com> Signed-off-by: George Wilson <gwilson@delphix.com> External-Issue: DLPX-68690 Closes #1603 Closes #7692 Closes #7943 Closes #10300	2020-07-13 09:19:18 -07:00
Matthew Ahrens	e59a377a8f	filesystem_limit/snapshot_limit is incorrectly enforced against root The filesystem_limit and snapshot_limit properties limit the number of filesystems or snapshots that can be created below this dataset. According to the manpage, "The limit is not enforced if the user is allowed to change the limit." Two types of users are allowed to change the limit: 1. Those that have been delegated the `filesystem_limit` or `snapshot_limit` permission, e.g. with `zfs allow USER filesystem_limit DATASET`. This works properly. 2. A user with elevated system privileges (e.g. root). This does not work - the root user will incorrectly get an error when trying to create a snapshot/filesystem, if it exceeds the `_limit` property. The problem is that `priv_policy_ns()` does not work if the `cred_t` is not that of the current process. This happens when `dsl_enforce_ds_ss_limits()` is called in syncing context (as part of a sync task's check func) to determine the permissions of the corresponding user process. This commit fixes the issue by passing the `task_struct` (typedef'ed as a `proc_t`) to syncing context, and then using `has_capability()` to determine if that process is privileged. Note that we still need to pass the `cred_t` to syncing context so that we can check if the user was delegated this permission with `zfs allow`. This problem only impacts Linux. Wrappers are added to FreeBSD but it continues to use `priv_check_cred()`, which works on arbitrary `cred_t`. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@ixsystems.com> Signed-off-by: Matthew Ahrens <mahrens@delphix.com> Closes #8226 Closes #10545	2020-07-11 17:18:02 -07:00
Serapheim Dimitropoulos	6f1db5f37e	Unconditionally enable debugging for libzpool We already enable -DDEBUG unconditionally (meaning regardless of this is a debug build or a performance build) for zdb and ztest as they are mostly used for development and debugging. This patch enables -DDEBUG for libzpool extending the debugging checks for zdb, ztest, and a couple of other test utilities. In addition to passing -DDEBUG we also enable -DZFS_DEBUG so all assertion checks work s expected. We do so not only in libzpool but in every utility that links to it, even if the utility doesn't directly use any functionality wrapped in ZFS_DEBUG macro definitions. The reason is that these utilities may still include headers that contain structs that have more fields when ZFS_DEBUG is defined. This can be a problem as enabling that flag for libzpool but not for zdb can lead into random problems (e.g. segmentation faults) as zdb may be have an incorrect view of a struct passed to it by libzpool. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com> Closes #10549	2020-07-10 15:30:31 -07:00
Arvind Sankar	3e597dee11	Use abs_top_builddir when referencing libraries libtool stores absolute paths in the dependency_libs component of the .la files. If the Makefile for a dependent library refers to the libraries by relative path, some libraries end up duplicated on the link command line. As an example, libzfs specifies libzfs_core, libnvpair and libuutil as dependencies to be linked in. The .la file for libzfs_core also specifies libnvpair, but using an absolute path, with the result that libnvpair is present twice in the linker command line for producing libzfs. While the only thing this causes is to slightly slow down the linking, we can avoid it by using absolute paths everywhere, including for convenience libraries just for consistency. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Closes #10538	2020-07-10 14:26:32 -07:00
Arvind Sankar	1537105a8c	Add config.rpath for AM_GNU_GETTEXT Commit `e8864b1b28` ("config: libintl/libiconv for gettext() detection") added an empty config.rpath with a comment that the real one doesn't work with libtool. However, an empty config.rpath doesn't really work: eg. on FreeBSD, where libintl is in /usr/local/lib, configure thinks that gettext doesn't exist and NLS should be disabled, which currently isn't supported in the source, and hence requires manual workaround to directly link -lintl without relying on configure. config.rpath is essential to let it be detected either in --prefix or using --with-libintl-prefix. I also don't see the mentioned issue with libtool flags applied to compilation, it seems to work fine to pass LTLIBINTL to libtool. It's unnecessary to include LTLIBICONV as the configure test will automatically append that to LTLIBINTL if it is necessary to link with libiconv. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Closes #10538	2020-07-10 14:26:12 -07:00
Arvind Sankar	4d61ade1a3	Clean up lib dependencies libzutil is currently statically linked into libzfs, libzfs_core and libzpool. Avoid the unnecessary duplication by removing it from libzfs and libzpool, and adding libzfs_core to libzpool. Remove a few unnecessary dependencies: - libuutil from libzfs_core - libtirpc from libspl - keep only libcrypto in libzfs, as we don't use any functions from libssl - librt is only used for clock_gettime, however on modern systems that's in libc rather than librt. Add a configure check to see if we actually need librt - libdl from raidz_test Add a few missing dependencies: - zlib to libefi and libzfs - libuuid to zpool, and libuuid and libudev to zed - libnvpair uses assertions, so add assert.c to provide aok and libspl_assertf Sort the LDADD for programs so that libraries that satisfy dependencies come at the end rather than the beginning of the linker command line. Revamp the configure tests for libaries to use FIND_SYSTEM_LIBRARY instead. This can take advantage of pkg-config, and it also avoids polluting LIBS. List all the required dependencies in the pkgconfig files, and move the one for libzfs_core into the latter's directory. Install pkgconfig files in $(libdir)/pkgconfig on linux and $(prefix)/libdata/pkgconfig on FreeBSD, instead of /usr/share/pkgconfig, as the more correct location for library .pc files. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Closes #10538	2020-07-10 14:26:00 -07:00
Arvind Sankar	b6437ea41c	Move libspl_assertf into .c file Variadic functions cannot be inlined. libspl_assertf ends up being duplicated in every file that uses it. Fix this by moving the function into a new assert.c. Also move the definition of aok into the new file instead of zone.c. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Closes #10538	2020-07-10 14:25:24 -07:00
Ryan Moeller	a2ec738c75	ZTS: Make bc conditional use compatible with new BSD bc FreeBSD recently replaced the GNU bc and dc in the base system with BSD licensed versions. They are supposed to be compatible with all the features present in the GNU versions, but it turns out they are picky about `if` statements having a corresponding `else`. ZTS uses `echo "if ($x > $y) 1" \| bc` in a few places, which causes tests to fail unexpectedly with the new bc. Change the two expressions in ZTS to `if ($x > $y) 1 else 0` for compatibility with the new BSD bc. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10551	2020-07-09 17:49:02 -07:00
Brian Behlendorf	9a49d3f3d3	Add device rebuild feature The device_rebuild feature enables sequential reconstruction when resilvering. Mirror vdevs can be rebuilt in LBA order which may more quickly restore redundancy depending on the pools average block size, overall fragmentation and the performance characteristics of the devices. However, block checksums cannot be verified as part of the rebuild thus a scrub is automatically started after the sequential resilver completes. The new '-s' option has been added to the `zpool attach` and `zpool replace` command to request sequential reconstruction instead of healing reconstruction when resilvering. zpool attach -s <pool> <existing vdev> <new vdev> zpool replace -s <pool> <old vdev> <new vdev> The `zpool status` output has been updated to report the progress of sequential resilvering in the same way as healing resilvering. The one notable difference is that multiple sequential resilvers may be in progress as long as they're operating on different top-level vdevs. The `zpool wait -t resilver` command was extended to wait on sequential resilvers. From this perspective they are no different than healing resilvers. Sequential resilvers cannot be supported for RAIDZ, but are compatible with the dRAID feature being developed. As part of this change the resilver_restart_* tests were moved in to the functional/replacement directory. Additionally, the replacement tests were renamed and extended to verify both resilvering and rebuilding. Original-patch-by: Isaac Huang <he.huang@intel.com> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: John Poduska <jpoduska@datto.com> Co-authored-by: Mark Maybee <mmaybee@cray.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #10349	2020-07-03 11:05:50 -07:00
Robert Novak	bfcbec6f5d	Add block histogram to zdb The block histogram tracks the changes to psize, lsize and asize both in the count of the number of blocks (by blocksize) and the total length of all of the blocks for that blocksize. It also keeps a running total of the cumulative size of all of the blocks up to each size to help determine the size of caching SSDs to be added to zfs hardware deployments. The block history counts and lengths are summarized in bins which are powers of two. Even rows with counts of zero are printed. This change is accessed by specifying one of two options: zdb -bbb pool zdb -Pbbb pool The first version prints the table in fixed size columns. The second prints in "parseable" output that can be placed into a CSV file. Fixed Column, nicenum output sample: block psize lsize asize size Count Length Cum. Count Length Cum. Count Length Cum. 512: 3.50K 1.75M 1.75M 3.43K 1.71M 1.71M 3.41K 1.71M 1.71M 1K: 3.65K 3.67M 5.43M 3.43K 3.44M 5.15M 3.50K 3.51M 5.22M 2K: 3.45K 6.92M 12.3M 3.41K 6.83M 12.0M 3.59K 7.26M 12.5M 4K: 3.44K 13.8M 26.1M 3.43K 13.7M 25.7M 3.49K 14.1M 26.6M 8K: 3.42K 27.3M 53.5M 3.41K 27.3M 53.0M 3.44K 27.6M 54.2M 16K: 3.43K 54.9M 108M 3.50K 56.1M 109M 3.42K 54.7M 109M 32K: 3.44K 110M 219M 3.41K 109M 218M 3.43K 110M 219M 64K: 3.41K 218M 437M 3.41K 218M 437M 3.44K 221M 439M 128K: 3.41K 437M 874M 3.70K 474M 911M 3.41K 437M 876M 256K: 3.41K 874M 1.71G 3.41K 874M 1.74G 3.41K 874M 1.71G 512K: 3.41K 1.71G 3.41G 3.41K 1.71G 3.45G 3.41K 1.71G 3.42G 1M: 3.41K 3.41G 6.82G 3.41K 3.41G 6.86G 3.41K 3.41G 6.83G 2M: 0 0 6.82G 0 0 6.86G 0 0 6.83G 4M: 0 0 6.82G 0 0 6.86G 0 0 6.83G 8M: 0 0 6.82G 0 0 6.86G 0 0 6.83G 16M: 0 0 6.82G 0 0 6.86G 0 0 6.83G Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Robert E. Novak <novak5@llnl.gov> Closes: #9158 Closes #10315	2020-06-26 15:09:20 -07:00
Arvind Sankar	6b99fc0620	Fixes for make dist Reduce the usage of EXTRA_DIST. If files are conditionally included in _SOURCES, _HEADERS etc, automake is smart enough to dist all files that could possibly be included, but this does not apply to EXTRA_DIST, resulting in make dist depending on the configuration. Add some files that were missing altogether in various Makefile's. The changes to disted files in this commit (excluding deleted files): +./cmd/zed/agents/README.md +./etc/init.d/README.md +./lib/libspl/os/freebsd/getexecname.c +./lib/libspl/os/freebsd/gethostid.c +./lib/libspl/os/freebsd/getmntany.c +./lib/libspl/os/freebsd/mnttab.c -./lib/libzfs/libzfs_core.pc -./lib/libzfs/libzfs.pc +./lib/libzfs/os/freebsd/libzfs_compat.c +./lib/libzfs/os/freebsd/libzfs_fsshare.c +./lib/libzfs/os/freebsd/libzfs_ioctl_compat.c +./lib/libzfs/os/freebsd/libzfs_zmount.c +./lib/libzutil/os/freebsd/zutil_compat.c +./lib/libzutil/os/freebsd/zutil_device_path_os.c +./lib/libzutil/os/freebsd/zutil_import_os.c +./module/lua/README.zfs +./module/os/linux/spl/README.md +./tests/README.md +./tests/zfs-tests/tests/functional/cli_root/zfs_clone/zfs_clone_rm_nested.ksh +./tests/zfs-tests/tests/functional/cli_root/zfs_send/zfs_send_encrypted_unloaded.ksh +./tests/zfs-tests/tests/functional/inheritance/README.config +./tests/zfs-tests/tests/functional/inheritance/README.state +./tests/zfs-tests/tests/functional/rsend/rsend_016_neg.ksh +./tests/zfs-tests/tests/perf/fio/sequential_readwrite.fio Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Closes #10501	2020-06-26 14:20:02 -07:00
felixdoerre	221e67040f	pam: implement a zfs_key pam module Implements a pam module for automatically loading zfs encryption keys for home datasets. The pam module: - loads a zfs key and mounts the dataset when a session opens. - unmounts the dataset and unloads the key when the session closes. - when the user is logged on and changes the password, the module changes the encryption key. Reviewed-by: Richard Laager <rlaager@wiktel.com> Reviewed-by: @jengelh <jengelh@inai.de> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Felix Dörre <felix@dogcraft.de> Closes #9886 Closes #9903	2020-06-24 18:45:44 -07:00
Arvind Sankar	5ca349f95d	Fix check for sed --in-place The test added in commit `4313a5b4c5` ("Detect if sed supports --in-place") doesn't work at least on my system (autoconfig-2.69). The issue is that SED has already been found and cached before this function is evaluated, with the result that the test is completely skipped. ... checking for a sed that does not truncate output... /usr/bin/sed ... checking for sed --in-place... (cached) /usr/bin/sed The first test is executed by libtool.m4. This looks to have been around in libtool for at least 15 years or so, not sure why this was not encountered at the time of the original commit. Fix this by caching the value of the ac_inplace flag rather than the path to SED. Also use $SED and add AC_REQUIRE to ensure that we use the sed that was located by the standard configure test. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Closes #10493	2020-06-24 18:19:59 -07:00
Ryan Moeller	9192f27c1d	Add zfs_multihost_interval tunable handler for FreeBSD This tunable required a handler to be implemented for ZFS_MODULE_PARAM_CALL. Add the handler so the tunable can be declared in common code. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10490	2020-06-23 13:32:42 -07:00
Matthew Ahrens	540493ba4f	Clarify comments in config/.m4, vdev_geom.c, zfs_allow_.ksh Rephrase comments to be more clear. Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Matthew Ahrens <mahrens@delphix.com> Closes #10481	2020-06-22 09:46:37 -07:00
Brian Behlendorf	745ace3f24	Update zts-report.py with additional tests The following test cases may still occasionally fail and are being added to the "maybe" list for Linux until they can be updated to be entirely reliable. cli_root/zfs_rename/zfs_rename_002_pos.ksh cli_root/zpool_reopen/zpool_reopen_003_pos.ksh refreserv/refreserv_raidz These 6 tests consistently fail only on Fedora 31+, the failures are related to the kernel rescanning the partition table on loopback devices which is no longer reliable unless partprobe is used. In order to enable the Fedora bot by default they are also being added to the list until the tests can be updated. Any significant regression in functionality covered by these tests will still be detected by the FreeBSD builders. alloc_class/alloc_class_009_pos alloc_class/alloc_class_010_pos cli_root/zpool_expand/zpool_expand_001_pos cli_root/zpool_expand/zpool_expand_005_pos rsend/rsend_007_pos rsend/rsend_010_pos rsend/rsend_011_pos snapshot/rollback_003_pos Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #10489	2020-06-22 09:44:49 -07:00
Arvind Sankar	c3fe42aabd	Remove dead code Delete unused functions. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Closes #10470	2020-06-18 12:21:18 -07:00
Arvind Sankar	65c7cc49bf	Mark functions as static Mark functions used only in the same translation unit as static. This only includes functions that do not have a prototype in a header file either. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Closes #10470	2020-06-18 12:20:38 -07:00
adilger	f734301d22	linux: add basic fallocate(mode=0/2) compatibility Implement semi-compatible functionality for mode=0 (preallocation) and mode=FALLOC_FL_KEEP_SIZE (preallocation beyond EOF) for ZPL. Since ZFS does COW and snapshots, preallocating blocks for a file cannot guarantee that writes to the file will not run out of space. Even if the first overwrite was guaranteed, it would not handle any later overwrite of blocks due to COW, so strict compliance is futile. Instead, make a best-effort check that at least enough free space is currently available in the pool (with a bit of margin), then create a sparse file of the requested size and continue on with life. This does not handle all cases (e.g. several fallocate() calls before writing into the files when the filesystem is nearly full), which would require a more complex mechanism to be implemented, probably based on a modified version of dmu_prealloc(), but is usable as-is. A new module option zfs_fallocate_reserve_percent is used to control the reserve margin for any single fallocate call. By default, this is 110% of the requested preallocation size, so an additional 10% of available space is reserved for overhead to allow the application a good chance of finishing the write when the fallocate() succeeds. If the heuristics of this basic fallocate implementation are not desirable, the old non-functional behavior of returning EOPNOTSUPP for calls can be restored by setting zfs_fallocate_reserve_percent=0. The parameter of zfs_statvfs() is changed to take an inode instead of a dentry, since no dentry is available in zfs_fallocate_common(). A few tests from @behlendorf cover basic fallocate functionality. Reviewed-by: Richard Laager <rlaager@wiktel.com> Reviewed-by: Arshad Hussain <arshad.super@gmail.com> Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Co-authored-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Andreas Dilger <adilger@dilger.ca> Issue #326 Closes #10408	2020-06-18 11:22:11 -07:00
Matthew Ahrens	f66434268c	Remove unnecessary references to slavery The horrible effects of human slavery continue to impact society. The casual use of the term "slave" in computer software is an unnecessary reference to a painful human experience. This commit removes all possible references to the term "slave". Implementation notes: The zpool.d/slaves script is renamed to dm-deps, which uses the same terminology as `dmsetup deps`. References to the `/sys/class/block/$dev/slaves` directory remain. This directory name is determined by the Linux kernel. Although `dmsetup deps` provides the same information, it unfortunately requires elevated privileges, whereas the `/sys/...` directory is world-readable. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Matthew Ahrens <mahrens@delphix.com> Closes #10435	2020-06-10 17:07:59 -07:00
Andrea Gelmini	dd4bc569b9	Fix typos Correct various typos in the comments and tests. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net> Closes #10423	2020-06-09 21:24:09 -07:00
Matthew Ahrens	7bcb7f0840	File incorrectly zeroed when receiving incremental stream that toggles -L Background: By increasing the recordsize property above the default of 128KB, a filesystem may have "large" blocks. By default, a send stream of such a filesystem does not contain large WRITE records, instead it decreases objects' block sizes to 128KB and splits the large blocks into 128KB blocks, allowing the large-block filesystem to be received by a system that does not support the `large_blocks` feature. A send stream generated by `zfs send -L` (or `--large-block`) preserves the large block size on the receiving system, by using large WRITE records. When receiving an incremental send stream for a filesystem with large blocks, if the send stream's -L flag was toggled, a bug is encountered in which the file's contents are incorrectly zeroed out. The contents of any blocks that were not modified by this send stream will be lost. "Toggled" means that the previous send used `-L`, but this incremental does not use `-L` (-L to no-L); or that the previous send did not use `-L`, but this incremental does use `-L` (no-L to -L). Changes: This commit addresses the problem with several changes to the semantics of zfs send/receive: 1. "-L to no-L" incrementals are rejected. If the previous send used `-L`, but this incremental does not use `-L`, the `zfs receive` will fail with this error message: incremental send stream requires -L (--large-block), to match previous receive. 2. "no-L to -L" incrementals are handled correctly, preserving the smaller (128KB) block size of any already-received files that used large blocks on the sending system but were split by `zfs send` without the `-L` flag. 3. A new send stream format flag is added, `SWITCH_TO_LARGE_BLOCKS`. This feature indicates that we can correctly handle "no-L to -L" incrementals. This flag is currently not set on any send streams. In the future, we intend for incremental send streams of snapshots that have large blocks to use `-L` by default, and these streams will also have the `SWITCH_TO_LARGE_BLOCKS` feature set. This ensures that streams from the default use of `zfs send` won't encounter the bug mentioned above, because they can't be received by software with the bug. Implementation notes: To facilitate accessing the ZPL's generation number, `zfs_space_delta_cb()` has been renamed to `zpl_get_file_info()` and restructured to fill in a struct with ZPL-specific info including owner and generation. In the "no-L to -L" case, if this is a compressed send stream (from `zfs send -cL`), large WRITE records that are being written to small (128KB) blocksize files need to be decompressed so that they can be written split up into multiple blocks. The zio pipeline will recompress each smaller block individually. A new test case, `send-L_toggle`, is added, which tests the "no-L to -L" case and verifies that we get an error for the "-L to no-L" case. Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Matthew Ahrens <mahrens@delphix.com> Closes #6224 Closes #10383	2020-06-09 10:41:01 -07:00
Igor K	6722be2823	ZTS: Fix add-o_ashift.ksh Use option '-o' after action for compatibility Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Igor Kozhukhov <igor@dilos.org> Closes #10426	2020-06-09 10:31:16 -07:00
George Amanakis	b7654bd794	Trim L2ARC The l2arc_evict() function is responsible for evicting buffers which reference the next bytes of the L2ARC device to be overwritten. Teach this function to additionally TRIM that vdev space before it is overwritten if the device has been filled with data. This is done by vdev_trim_simple() which trims by issuing a new type of TRIM, TRIM_TYPE_SIMPLE. We also implement a "Trim Ahead" feature. It is a zfs module parameter, expressed in % of the current write size. This trims ahead of the current write size. A minimum of 64MB will be trimmed. The default is 0 which disables TRIM on L2ARC as it can put significant stress to underlying storage devices. To enable TRIM on L2ARC we set l2arc_trim_ahead > 0. We also implement TRIM of the whole cache device upon addition to a pool, pool creation or when the header of the device is invalid upon importing a pool or onlining a cache device. This is dependent on l2arc_trim_ahead > 0. TRIM of the whole device is done with TRIM_TYPE_MANUAL so that its status can be monitored by zpool status -t. We save the TRIM state for the whole device and the time of completion on-disk in the header, and restore these upon L2ARC rebuild so that zpool status -t can correctly report them. Whole device TRIM is done asynchronously so that the user can export of the pool or remove the cache device while it is trimming (ie if it is too slow). We do not TRIM the whole device if persistent L2ARC has been disabled by l2arc_rebuild_enabled = 0 because we may not want to lose all cached buffers (eg we may want to import the pool with l2arc_rebuild_enabled = 0 only once because of memory pressure). If persistent L2ARC has been disabled by setting the module parameter l2arc_rebuild_blocks_min_l2size to a value greater than the size of the cache device then the whole device is trimmed upon creation or import of a pool if l2arc_trim_ahead > 0. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Adam D. Moss <c@yotes.com> Signed-off-by: George Amanakis <gamanakis@gmail.com> Closes #9713 Closes #9789 Closes #10224	2020-06-09 10:15:08 -07:00
alaviss	13dd63ff81	mkfile: include missing headers Without these headers, compilation fails on musl libc with offset_t being undeclared and MIN being implictly declared. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Hiếu Lê <leorize+oss@disroot.org> Closes #10406	2020-06-05 17:22:10 -07:00
Ryan Moeller	4f8b2356cc	ZTS: Retry export/destroy when busy in zpool_import_012 It can take a moment for the NFS server to give up the mountpoint after unsharing a filesystem. Use log_must_busy to retry export/destroy a few times after switching off sharenfs. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Igor Kozhukhov <igor@dilos.org> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10380	2020-05-27 17:18:06 -07:00
Brian Behlendorf	c946d5a913	ZTS: Fix zfs_mount.kshlib cleanup Update cleanup_filesystem to use destroy_dataset when performing cleanup. This ensures the destroy is retried if the pool is busy preventing occasional failures. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Giuseppe Di Natale <guss80@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #10358	2020-05-23 17:13:42 -07:00
felixdoerre	501a1511ae	mount: use the mount syscall directly Allow zfs datasets to be mounted on Linux without relying on the invocation of an external processes. This is the same behavior which is implemented for FreeBSD. Use of the libmount library was originally considered because it provides functionality to properly lock and update the /etc/mtab file. However, these days /etc/mtab is typically a symlink to /proc/self/mounts so there's nothing to updated. Therefore, we call mount(2) directly and avoid any additional dependencies. If required the legacy behavior can be enabled by setting the ZFS_MOUNT_HELPER environment variable. This may be needed in environments where SELinux in enabled and the zfs binary does not have mount permission. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Felix Dörre <felix@dogcraft.de> #10294	2020-05-20 18:02:41 -07:00
Paul Dagnelie	de4f06c275	Small program that converts a dataset id and an object id to a path Small program that converts a dataset id and an object id to a path Reviewed-by: Prakash Surya <prakash.surya@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Signed-off-by: Paul Dagnelie <pcd@delphix.com> Closes #10204	2020-05-20 10:05:33 -07:00
Brian Behlendorf	c87f958668	flake8 E741 variable name warning Update the zts-report.py script to conform to the flake8 E741 rule. "Variables named I, O, and l can be very hard to read. This is because the letter I and the letter l are easily confused, and the letter O and the number 0 can be easily confused." - https://www.flake8rules.com/rules/E741.html Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #10323	2020-05-14 09:41:29 -07:00
John Wren Kennedy	e1fcd940e7	ZTS: zpool_split_indirect deletes zfstest log file The cleanup routine for this test attempts to remove some temporary files with `rm -f $VDEV_*`, but VDEV_ is undefined. As a result, all files in the current working directory (/var/tmp/test_results/current) get removed instead. This includes the complete log file of all tests. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: George Amanakis <gamanakis@gmail.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: John Kennedy <john.kennedy@delphix.com> Closes #10324	2020-05-14 09:39:47 -07:00
John Poduska	41035a0496	Resilver restarts unnecessarily when it encounters errors When a resilver finishes, vdev_dtl_reassess is called to hopefully excise DTL_MISSING (amongst other things). If there are errors during the resilver, they are tracked in DTL_SCRUB, as spelled out in the block comment in vdev.c. DTL_SCRUB is in-core only, so it can only be used if the pool was online for the whole resilver. This state is tracked with the spa_scrub_started flag, which only gets set when the scan is initialized. Unfortunately, this flag gets cleared right before vdev_dtl_reassess gets called, so if there are any errors during the scan, DTL_MISSING will never get excised and the resilver will just continually restart. This fix simply moves clearing that flag until after the call to vdev_dtl_reasses. In addition, if a pool is imported and already has scn_errors > 0, this change will restart the resilver immediately instead of doing the rest of the scan and then restarting it from the beginning. On the other hand, if scn_errors == 0 at import, then no errors have been encountered so far, so the spa_scrub_started flag can be safely set. A test has been added to verify that resilver does not restart when relevant DTL's are available. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Paul Zuchowski <pzuchowski@datto.com> Signed-off-by: John Poduska <jpoduska@datto.com> Closes #10291	2020-05-13 10:54:27 -07:00
Petros Koutoupis	bd95f00d4b	Fixed LDADD library links in Makefiles for cross compilation builds When building on native dev system, there are no issues but when cross-compiling for target system, some linker errors are observed. The only way to avoid these errors is by adjusting the Makefile.am of those various components to add the library dependencies. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Petros Koutoupis <petros@petroskoutoupis.com> Closes #10304	2020-05-09 10:17:08 -07:00
Brian Behlendorf	d775c86dd4	ZTS: refreserv_005_pos.ksh When recursively destroying the dataset it's possible for the dataset volume to be open by an unrelated process, like blkid. Use the destroy_dataset() which will retry when this occurs. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #10305	2020-05-08 13:50:02 -07:00
George Amanakis	657fd33bcf	Improvements on persistent L2ARC Functional changes: We implement refcounts of log blocks and their aligned size on the cache device along with two corresponding arcstats. The refcounts are reflected in the header of the device and provide valuable information as to whether log blocks are accounted for correctly. These are dynamically adjusted as log blocks are committed/evicted. zdb also uses this information in the device header and compares it to the corresponding values as reported by dump_l2arc_log_blocks() which emulates l2arc_rebuild(). If the refcounts saved in the device header report higher values, zdb exits with an error. For this feature to work correctly there should be no active writes on the device. This is also employed in the tests of persistent L2ARC. We extend the structure of the cache device header by adding the two new variables mirroring the refcounts after the existing variables to preserve backward compatibility in terms of persistent L2ARC. 1) a new arcstat "l2_log_blk_asize" and refcount "l2ad_lb_asize" which reflect the total aligned size of log blocks on the device. This is also reflected in the header of the cache device as "dh_lb_asize". 2) a new arcstat "l2arc_log_blk_count" and refcount "l2ad_lb_count" which reflect the total number of L2ARC log blocks present on cache devices. It is also reflected in the header of the cache device as "dh_lb_count". In l2arc_rebuild_vdev() if the amount of committed log entries in a log block is 0 and the device header is valid we update the device header. This will facilitate trimming of the whole device in this case when TRIM for L2ARC is implemented. Improve loop protection in l2arc_rebuild() by using the starting offset of the payload of each log block instead of the starting offset of the log block. If the zio in l2arc_write_buffers() fails, restore the lbps array in the header of the device to its previous state in l2arc_write_done(). If l2arc_rebuild() ends the rebuild process without restoring any L2ARC log blocks in ARC and without any other error, this means that the lbps array in the header is pointing to non-existent or invalid log blocks. Reset the device header in this case. In l2arc_rebuild() change the zfs_dbgmsg messages to spa_history_log_internal() making them user visible with zpool history command. Non-functional changes: Make the first test in persistent L2ARC use `zdb -lll` to increase coverage in `zdb.c`. Rename psize with asize when referring to log blocks, since L2ARC_SET_PSIZE stores the vdev aligned size for log blocks. Also rename dh_log_blk_entries to dh_log_entries to make it clear that it is a mirror of l2ad_log_entries. Added comments for both changes. Fix inaccurate comments for example in l2arc_log_blk_restore(). Add asserts at the end in l2arc_evict() and l2arc_write_buffers(). Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: George Amanakis <gamanakis@gmail.com> Closes #10228	2020-05-07 16:34:03 -07:00
Paul Dagnelie	108a454a46	Add support for boot environment data to be stored in the label Modern bootloaders leverage data stored in the root filesystem to enable some of their powerful features. GRUB specifically has a grubenv file which can store large amounts of configuration data that can be read and written at boot time and during normal operation. This allows sysadmins to configure useful features like automated failover after failed boot attempts. Unfortunately, due to the Copy-on-Write nature of ZFS, the standard behavior of these tools cannot handle writing to ZFS files safely at boot time. We need an alternative way to store data that allows the bootloader to make changes to the data. This work is very similar to work that was done on Illumos to enable similar functionality in the FreeBSD bootloader. This patch is different in that the data being stored is a raw grubenv file; this file can store arbitrary variables and values, and the scripting provided by grub is powerful enough that special structures are not required to implement advanced behavior. We repurpose the second padding area in each label to store the grubenv file, protected by an embedded checksum. We add two ioctls to get and set this data, and libzfs_core and libzfs functions to access them more easily. There are no direct command line interfaces to these functions; these will be added directly to the bootloader utilities. Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Paul Dagnelie <pcd@delphix.com> Closes #10009	2020-05-07 09:36:33 -07:00
George Amanakis	1b664952ae	Enable splitting mirrors with indirect vdevs When a top-level vdev is removed from a pool it is converted to an indirect vdev. Until now splitting such mirrored pools was not possible with zpool split. This patch enables handling of indirect vdevs and splitting of those pools with zpool split. Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: George Amanakis <gamanakis@gmail.com> Closes #10283	2020-05-06 10:32:28 -07:00
Ryan Moeller	6ed4391da9	ZTS: Count CKSUM for all vdevs in verify_pool The verify_pool function should detect checksum errors on any vdev, but it was only checking at the root of the pool. Accumulate the errors for all vdevs to obtain the correct count. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10271	2020-04-30 17:50:16 -07:00
Sara Hartse	89a6610ed0	Add more sanity testing for zdb input args Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: sara hartse <sara.hartse@delphix.com> Closes #10243	2020-04-28 09:56:31 -07:00
alex	47c9299fcc	zfs_create: round up volume size to multiple of bs Round up the volume size requested in `zfs create -V size` to the next higher multiple of the volblocksize. Updates the man page and adds a test to verify the new behavior. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reported-by: puffi <puffi@users.noreply.github.com> Signed-off-by: Alex John <alex@stty.io> Closes #8541 Closes #10196	2020-04-24 19:04:34 -07:00
Matthew Ahrens	196bee4cfd	Remove deduplicated send/receive code Deduplicated send streams (i.e. `zfs send -D` and `zfs receive` of such streams) are deprecated. Deduplicated send streams can be received by first converting them to non-deduplicated with the `zstream redup` command. This commit removes the code for sending and receiving deduplicated send streams. `zfs send -D` will now print a warning, ignore the `-D` flag, and generate a regular (non-deduplicated) send stream. `zfs receive` of a deduplicated send stream will print an error message and fail. The resulting code simplification (especially in the kernel's support for receiving dedup streams) should help enable future performance enhancements. Several new tests are added which leverage `zstream redup`. Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Matthew Ahrens <mahrens@delphix.com> Issue #7887 Issue #10117 Issue #10156 Closes #10212	2020-04-23 10:06:57 -07:00
George Amanakis	9249f1272e	Persistent L2ARC minor fixes Minor fixes on persistent L2ARC improving code readability and fixing a typo in zdb.c when byte-swapping a log block. It also improves the pesist_l2arc_007_pos.ksh test by giving it more time to retrieve log blocks on the cache device. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Adam D. Moss <c@yotes.com> Signed-off-by: George Amanakis <gamanakis@gmail.com> Closes #10210	2020-04-17 09:27:40 -07:00
Ryan Moeller	a7929f3137	Update FreeBSD tunables Remove some obsolete legacy compat, rename some misnamed, and add some missing tunables for FreeBSD. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10203	2020-04-15 11:14:47 -07:00
Ryan Moeller	af99094dee	Don't delete freebsd.run in distclean Add a comment so the file is not empty. The comment can be removed when FreeBSD-specific tests are added. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Sean Eric Fagan <sef@ixsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10206	2020-04-15 09:21:40 -07:00
Matthew Macy	9f0a21e641	Add FreeBSD support to OpenZFS Add the FreeBSD platform code to the OpenZFS repository. As of this commit the source can be compiled and tested on FreeBSD 11 and 12. Subsequent commits are now required to compile on FreeBSD and Linux. Additionally, they must pass the ZFS Test Suite on FreeBSD which is being run by the CI. As of this commit 1230 tests pass on FreeBSD and there are no unexpected failures. Reviewed-by: Sean Eric Fagan <sef@ixsystems.com> Reviewed-by: Jorgen Lundman <lundman@lundman.net> Reviewed-by: Richard Laager <rlaager@wiktel.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Co-authored-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Matt Macy <mmacy@FreeBSD.org> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #898 Closes #8987	2020-04-14 11:36:28 -07:00
alex	c602b35cf7	ZTS: Fix and change testcase cache_010_neg Commit `379ca9c` removed the requirement on aux devices to be block devices only but the test case cache_010_neg was not updated, making it fail consistently. This change changes the test to check that cache devices _can_ be anything that presents a block interface. The testcase is renamed to cache_010_pos and the exceptions for known failure removed from the test runner. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reported-by: Richard Elling <Richard.Elling@RichardElling.com> Signed-off-by: Alex John <alex@stty.io> Closes #10172	2020-04-13 10:50:41 -07:00
Matthew Ahrens	c618f87cd2	Add `zstream redup` command to convert deduplicated send streams Deduplicated send and receive is deprecated. To ease migration to the new dedup-send-less world, the commit adds a `zstream redup` utility to convert deduplicated send streams to normal streams, so that they can continue to be received indefinitely. The new `zstream` command also replaces the functionality of `zstreamdump`, by way of the `zstream dump` subcommand. The `zstreamdump` command is replaced by a shell script which invokes `zstream dump`. The way that `zstream redup` works under the hood is that as we read the send stream, we build up a hash table which maps from `<GUID, object, offset> -> <file_offset>`. Whenever we see a WRITE record, we add a new entry to the hash table, which indicates where in the stream file to find the WRITE record for this block. (The key is `drr_toguid, drr_object, drr_offset`.) For entries other than WRITE_BYREF, we pass them through unchanged (except for the running checksum, which is recalculated). For WRITE_BYREF records, we change them to WRITE records. We find the referenced WRITE record by looking in the hash table (for the record with key `drr_refguid, drr_refobject, drr_refoffset`), and then reading the record header and payload from the specified offset in the stream file. This is why the stream can not be a pipe. The found WRITE record replaces the WRITE_BYREF record, with its `drr_toguid`, `drr_object`, and `drr_offset` fields changed to be the same as the WRITE_BYREF's (i.e. we are writing the same logical block, but with the data supplied by the previous WRITE record). This algorithm requires memory proportional to the number of WRITE records (same as `zfs send -D`), but the size per WRITE record is relatively low (40 bytes, vs. 72 for `zfs send -D`). A 1TB send stream with 8KB blocks (`recordsize=8k`) would use around 5GB of RAM to "redup". Reviewed-by: Jorgen Lundman <lundman@lundman.net> Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Matthew Ahrens <mahrens@delphix.com> Closes #10124 Closes #10156	2020-04-10 10:39:55 -07:00
George Amanakis	77f6826b83	Persistent L2ARC This commit makes the L2ARC persistent across reboots. We implement a light-weight persistent L2ARC metadata structure that allows L2ARC contents to be recovered after a reboot. This significantly eases the impact a reboot has on read performance on systems with large caches. Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: George Wilson <gwilson@delphix.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Co-authored-by: Saso Kiselkov <skiselkov@gmail.com> Co-authored-by: Jorgen Lundman <lundman@lundman.net> Co-authored-by: George Amanakis <gamanakis@gmail.com> Ported-by: Yuxuan Shui <yshuiv7@gmail.com> Signed-off-by: George Amanakis <gamanakis@gmail.com> Closes #925 Closes #1823 Closes #2672 Closes #3744 Closes #9582	2020-04-10 10:33:35 -07:00
Ryan Moeller	4a21ec0560	ZTS: Fix non-portable date format The delegate tests use `date(1)` to generate snapshot names, using the format '%F-%T-%N' to get nanosecond resolution (since multiple snapshots may be taken in the same second). '%N' is not portable, and causes tests to fail on FreeBSD. Since the only purpose these timestamps serve is to create a unique name, simply use $RANDOM instead. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10170	2020-04-06 16:07:35 -07:00
Paul Dagnelie	5a42ef04fd	Add 'zfs wait' command Add a mechanism to wait for delete queue to drain. When doing redacted send/recv, many workflows involve deleting files that contain sensitive data. Because of the way zfs handles file deletions, snapshots taken quickly after a rm operation can sometimes still contain the file in question, especially if the file is very large. This can result in issues for redacted send/recv users who expect the deleted files to be redacted in the send streams, and not appear in their clones. This change duplicates much of the zpool wait related logic into a zfs wait command, which can be used to wait until the internal deleteq has been drained. Additional wait activities may be added in the future. Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Gallagher <john.gallagher@delphix.com> Signed-off-by: Paul Dagnelie <pcd@delphix.com> Closes #9707	2020-04-01 10:02:06 -07:00
George Amanakis	37c22948e5	Reset l2ad_hand and l2ad_first in l2arc_evict Increasing l2arc_write_size or l2arc_write_boost can result in l2arc_write_buffers() not having enough space to perform its writes and panic zio_write_phys(). Instead of resetting l2ad_hand to l2ad_start at the end of l2arc_write_buffers() and not taking into account a possible user-mediated increase of l2arc_write_max, we do this in l2arc_evict(), right after l2arc_write_size() has run. If there is not enough space to evict (ie we will exceed l2ad_end) we evict to the end of the device, reset l2ad_hand to l2ad_start, set l2ad_first to 0 and iterate l2arc_evict(). We avoid infinite iteration of l2arc_evict() by making sure in l2arc_write_size() that l2ad_start + size does not exceed l2ad_end. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: George Amanakis <gamanakis@gmail.com> Closes #10154	2020-03-31 10:46:48 -07:00
Ryan Moeller	c96a32e1a3	ZTS: Skip udev actions in zvol_misc when not Linux udev is only used on Linux. Skip udev_wait and udev_cleanup when not on Linux. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10165	2020-03-31 10:35:14 -07:00
Ryan Moeller	ef3331e703	ZTS: Wait for free space between quota tests And in removal tests, sync the specific pool we are waiting on. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10146	2020-03-26 10:48:19 -07:00
Ryan Moeller	22df2457a7	Avoid core dump on invalid redaction bookmark libzfs aborts and dumps core on EINVAL from the kernel when trying to do a redacted send with a bookmark that is not a redaction bookmark. Move redacted bookmark validation into libzfs. Check if the bookmark given for redactions is actually a redaction bookmark. Print an error message and exit gracefully if it is not. Don't abort on EINVAL in zfs_send_one. Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10138	2020-03-18 12:54:12 -07:00
Brian Behlendorf	6b7028ec51	Fix cstyle warnings Fix minor cstyle warnings accidentally introduced by `7145123b`. Reviewed-by: Paul Dagnelie <pcd@delphix.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #10143	2020-03-17 15:42:27 -07:00
Paul Dagnelie	7145123b0a	Separate warning for incomplete and corrupt streams This change adds a separate return code to zfs_ioc_recv that is used for incomplete streams, in addition to the existing return code for streams that contain corruption. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Signed-off-by: Paul Dagnelie <pcd@delphix.com> Closes #10122	2020-03-17 10:30:33 -07:00
Mariusz Zaborski	a57d3d45d6	Add option for forcible unmounting dataset while receiving snapshot. Currently when the dataset is in use we can't receive snapshots. zfs send test/1@asd \| zfs recv -FM test/2 cannot unmount '/test/2': Device busy This commits add option 'M' which attempts to forcibly unmount the dataset. Thanks to this we can enforce receiving snapshots in a single step. Note that this functionality is not supported on Linux because the VFS will prevent active mounted filesystems from being unmounted, even with the force option. This is the intended VFS behavior. Test cases were added to verify the expected behavior based on the platform. Discussed-with: Pawel Jakub Dawidek <pjd@FreeBSD.org> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Allan Jude <allanjude@freebsd.org> External-issue: https://reviews.freebsd.org/D22306 Closes #9904	2020-03-17 10:08:32 -07:00
Ryan Moeller	80d98a8f3a	ZTS: Use default_cleanup_noexit where needed And add log_pass appropriately. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10136	2020-03-17 09:55:18 -07:00
Ryan Moeller	e0d3284bc9	Exit status 256+signum is actually baked in to ksh While #10121 did fix the signal numbers for FreeBSD/Darwin, it incorrectly changed the expected encoding of exit status for commands that exited on a signal. The encoding 256+signum is a feature of the shell. Only the signal numbers themselves are platform-dependent. Always use the encoding 256+signum when checking exit status for signal exits. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10137	2020-03-17 09:49:58 -07:00
Ryan Moeller	d3fe62cb35	ZTS: Update flaky tests in zts-report Some tests which pass on FreeBSD but fail on Linux had been put in the "maybe" set. Move these back to "known" under an "if Linux" check so the expected outcome is clear. Add some tests that have been found to be flaky on FreeBSD stable/12 to the "maybe" set. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10120	2020-03-13 09:29:10 -07:00
Ryan Moeller	94eb65b4c1	ZTS: Use correct signal numbers for status checks Different operating systems encode exit status in different ways. The logapi shell library assumes the Solaris meaning of exit codes, which is not correct on other platforms. Define the needed constants according to the platform we are running on and use those to decode process exit status. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10121	2020-03-12 10:50:51 -07:00
Ryan Moeller	cdbc34fc2b	ZTS: Test boundary conditions in alloc_class_012 Issue #9142 describes an error in the checks for device removal that can prevent removal of special allocation class vdevs in some situations. Enhance alloc_class/alloc_class_012_pos to check situations where this bug occurs. Update zts-report with knowledge of issue #9142. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10116 Issue #9142	2020-03-12 10:50:01 -07:00
Ryan Moeller	e70b127e05	ZTS: Wait for free space between write_dirs tests Cleanup for write_dirs involves destroying a dataset filling a pool and then recreating the dataset for the next test. Due to the asynchronous nature of free space accounting, recreating the dataset can fail for lack of space, causing problems for the next test. Add wait_freeing $TESTPOOL to wait for the space to be freed and then sync_pool $TESTPOOL to update the space accounting before attempting to recreate the test filesystem. Only use a single disk to create the pool. Make it a small file so it does not take too long to fill. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10112	2020-03-12 10:48:46 -07:00
Ryan Moeller	ddd9ef3a4f	ZTS: Add a failsafe callback to run after each test Tests that get killed do not have an opportunity to clean up. There are many bad states this can leave the system in, but of particular gravity is when zinject has been used to induce bad behavior for one or more of the test disks. Create a failsafe mechanism in test-runner.py that runs a callback script after every test. The script is common to all tests so all tests benefit from the protection. Add an obligatory `zinject -c all` to clear all zinject state after every test case is run. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10096	2020-03-10 11:00:56 -07:00
Ryan Moeller	9be70c3784	ZTS: Simplify some libtest functions Don't echo the results of arithmetic expressions, it's not necessary. Use hw.clockrate sysctl to get CPU freq instead of parsing dmesg.boot for a line that might not even be there anymore. Reduce bookkeeping in fill_fs, making it easier to follow. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10113	2020-03-10 10:44:14 -07:00
Ryan Moeller	2b95e91132	ZTS: Another round of changes for FreeBSD Highlights: * is_linux -> is_illumos swaps * make block_device_wait more clever when paths are given * slightly optimize default_cleanup_noexit * remove platform differences in user_run * temporarily expect non-libfetch behavior for keylocation=/foo/bar * fix sharenfs exceptions * don't test multihost property * fix misc broken platform checks * clear zinjected faults in removal_resume_export callback Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10092	2020-03-06 09:31:32 -08:00
Ryan Moeller	f5f6fb03b7	Change default to overlay=on Filesystems allow overlay mounts by default on FreeBSD and Linux. Respect the native convention by switching the default to overlay=on, while retaining the option to turn the property off for compatibility with other operating systems' conventions. Update documentation and tests accordingly. Reviewed-by: Richard Laager <rlaager@wiktel.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10030	2020-03-06 09:28:19 -08:00
Ryan Moeller	788398c562	ZTS: Update zts-report exceptions for FreeBSD The new zfs_sync_trim_* tests are skipped on FreeBSD. Both of the previously failing tests are now passing. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10105	2020-03-06 09:26:38 -08:00
Brian Behlendorf	5a1abc4b5b	ZTS: Speed up write_dirs cleanup The write_dirs tests fill a filesystem with a bunch of files until it is full. In cleanup the files are truncated and removed individually. These tests already take a while to run. It is quicker and easier to destroy the whole dataset and create a new one to replace it in the cleanup functions. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10098	2020-03-04 15:12:12 -08:00
Brian Behlendorf	fa23c5be88	ZTS: Add missing quotes `default_setup` takes a disk list as the first argument and has optional additional arguments that control secondary functionality. A couple of test setups mistakenly call `default_setup $DISKS`. Add quotes so the second and subsequent disks are correctly included in the pool as vdevs rather than triggering unwanted behavior from `default_setup`. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10097	2020-03-04 15:10:45 -08:00
Brian Behlendorf	4b06d05298	ZTS: Add zts-report exceptions for FreeBSD There are three tests we expect to fail only on FreeBSD. * link_count never exits and eventually times out: - @amotin tells me this test is probably not applicable to us - Skip on FreeBSD * userobj feature does not activate immediately after pool upgrade - low impact; we are aware of this issue * removal does not appear to condense on export - low impact; we are aware of this issue Additionally removal_with_zdb passes on FreeBSD, so it is moved to "maybe". Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10093	2020-03-04 15:09:40 -08:00
Brian Behlendorf	d16c029f99	ZTS: Test the correct filesystem_limits behavior See issue #8226: Property filesystem_limit does not work as documented There have been previous attempts to fix the behavior on Linux, but so far the issue is still open. See PRs #8228, #8280. The existing tests pass for the incorrect behavior. This is a problem on FreeBSD; we are failing the tests because we implement the feature correctly. I have adapted the tests based on the work by @loli10k in #8280 and extended the changes to fix the snapshot_limit test as well. Linux now fails these tests, so entries linking to the issue have been added to the "maybe" group in zts-report.py. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10082	2020-03-04 15:07:52 -08:00
Brian Behlendorf	2288d41968	Add trim support to zpool wait Manual trims fall into the category of long-running pool activities which people might want to wait synchronously for. This change adds support to 'zpool wait' for waiting for manual trim operations to complete. It also adds a '-w' flag to 'zpool trim' which can be used to turn 'zpool trim' into a synchronous operation. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com> Signed-off-by: John Gallagher <john.gallagher@delphix.com> Closes #10071	2020-03-04 15:07:11 -08:00
Ryan Moeller	0a0f9a7dc6	ZTS: Provide for nested cleanup routines Shared test library functions lack a simple way to ensure proper cleanup in the event of a failure. The `log_onexit` cleanup pattern cannot be used in library functions because it uses one global variable to store the cleanup command. An example of where this is a serious issue is when a tunable that artifically stalls kernel progress gets activated and then some check fails. Unless the caller knows about the tunable and sets it back, the system will be left in a bad state. To solve this problem, turn the global cleanup variable into a stack. Provide push and pop functions to add additional cleanup steps and remove them after it is safe again. The first use of this new functionality is in attempt_during_removal, which sets REMOVAL_SUSPEND_PROGRESS. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10080	2020-03-03 10:28:09 -08:00
Ryan Moeller	9bb907bc3f	Make spa_history_zone platform-dependent in kernel This function should only return "linux" on Linux. Move the kernel part of the function out of common code. Fix the tests for FreeBSD. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10079	2020-03-02 09:43:30 -08:00
Ryan Moeller	1289fbb557	ZTS: Change issue URL template to OpenZFS org Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10081	2020-03-02 09:42:22 -08:00
Ryan Moeller	6c0abcfddd	ZTS: Fixup shebang in rsend_016, add to common.run All other ksh scripts use /bin/ksh in the shebang. Make rsend_016_neg consistent with the rest of the suite. The test also was absent from any runfiles. Add it to common.run. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Igor Kozhukhov <igor@dilos.org> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10051	2020-02-28 09:48:29 -08:00
Ryan Moeller	f0410e9806	ZTS: Eliminate partitioning from zpool_add Use file vdevs if we are short on $DISKS. Also fixed vol recursion for FreeBSD in 004. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10060	2020-02-28 09:46:51 -08:00
Ryan Moeller	3d5ba1cf29	ZTS: Misc fixes for FreeBSD * Set geom debug flags in corrupt_blocks_at_level * Use the right time zone for history tests * Add missing commands.cfg entry for diskinfo * Rewrite get_last_txg_synced to use zdb * Don't check ulimits for sparse files * Suspend removal before removing a vdev, not after Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10054	2020-02-27 09:38:34 -08:00
Matthew Ahrens	ab9646166d	ZTS: Fix zfs_receive_004_neg `zfs recv` of an incremental stream that already exists is ignored, with a message like: receiving incremental stream of pool/fs@incsnap into pool/fs@incsnap snap testpool/testfs@incsnap already exists; ignoring And the command exits successfully (exit code 0). The zfs_receive_004_neg test is expecting that a this case will fail, with nonzero exit code. The fix is to remove this specific command from the test case. This lets us check that the remaining commands do in fact fail. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Matthew Ahrens <mahrens@delphix.com> Closes #10055	2020-02-27 09:37:34 -08:00
Ryan Moeller	647ff8e975	ZTS: Fix zfs_copies_002_pos The function `get_used_prop` does not exist. Use `get_prop used` instead. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10059	2020-02-26 14:29:12 -08:00
Ryan Moeller	abef699866	ZTS: Adapt casenorm tests for FreeBSD Several casenorm tests pass on FreeBSD but are expected to fail on Linux. Move the passing tests from "fail" to "maybe" so that passing on FreeBSD is not unexpected. Invert platform logic so FreeBSD doesn't use illumos-only zlook. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Igor Kozhukhov <igor@dilos.org> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10050	2020-02-26 08:41:30 -08:00
Ryan Moeller	3a192f7d89	ZTS: Misc fixes for FreeBSD * Check for mountd in is_shared to avoid timeout when not running * Enhance robustness of some cleanup functions * Simplify atime lookup * Skip sharenfs validation for now * Don't add mountpoint property to inheritance validation on FreeBSD Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10047	2020-02-25 16:23:27 -08:00
Olaf Faaland	034313908a	ZTS: zed_start should not fail if zed is already running zed_start may be called in places where zed is not typically already running, but this is not a requirement of the tests. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Olaf Faaland <faaland1@llnl.gov> Closes #9974	2020-02-25 16:02:10 -08:00
Ryan Moeller	2757010166	ZTS: Move atime_003 to linux.run This test verifies relatime behavior, which is only present on Linux. Move the test to linux.run Reviewed-by: Igor Kozhukhov <igor@dilos.org> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10046	2020-02-25 15:27:41 -08:00
Ryan Moeller	92bd4cabd6	ZTS: Misc fixes for FreeBSD * Force UFS sync before snap in vol rollback tests * rw is not a valid share option on FreeBSD, use ro instead * zfs_unmount_nested: mountpoint is in the pool, rmdir before export * Fix some more platform checks * Fix disappearing group in delegate tests * Don't try delegating for jailed, only root can set it Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10038	2020-02-24 10:17:55 -08:00
Ryan Moeller	24fcd9fc5c	ZTS: Eliminate partitioning from zpool_destroy The zpool destroy tests partition a single disk to create two pools. This can be done using two disks and no partitioning instead. And temporarily allow vol recursion for FreeBSD while in here. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10036	2020-02-21 16:00:22 -08:00
Ryan Moeller	b7dbbf6aa7	ZTS: Refactor is_shared, fix impl on FreeBSD FreeBSD doesn't have a `share` command. It does have showmount. Split the separate platform impls out of is_shared_impl. Dispatch to the correct platform impl function from is_shared. Eliminate the use of is_shared_impl from tests. is_shared works. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10037	2020-02-21 15:59:20 -08:00
Ryan Moeller	f5f438194d	ZTS: Move privilege tests to sunos.run These tests are unspported on FreeBSD and Linux for lack of pfexec. Move the privilege tests to sunos.run and remove the platform checks. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10035	2020-02-21 08:52:44 -08:00
Ryan Moeller	6a60841631	ZTS: Don't use lsblk on FreeBSD These tests use lsblk to find the sector size of a disk. FreeBSD doesn't have lsblk. Use diskinfo -v to get sector size on FreeBSD. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Igor Kozhukhov <igor@dilos.org>\ Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10033	2020-02-21 08:38:34 -08:00
Ryan Moeller	ca7ea23f8a	ZTS: Fix userquota_006_pos on FreeBSD FreeBSD uses `pw` for account management. `userquota_006_pos` erroneously invokes the non-existent `groupdel` command on FreeBSD. Use `pw groupdel -n` instead of `groupdel` on FreeBSD. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10032	2020-02-20 08:14:24 -08:00
Ryan Moeller	b11375d74a	ZTS: Check the right mount options on FreeBSD FreeBSD does not support the "devices" and "nodevices" mount options. Do not check these options on FreeBSD. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10028	2020-02-20 08:12:23 -08:00
Ryan Moeller	325a551232	ZTS: Fix faulty slog_replay_fs_001 test This test is supposed to verify zil operations. For TX_WRITE, writes must be synchronous in order to be entered in the zil. Linux seems to be doing sync writes even when they are not asked for, but on FreeBSD the test does not do what is intended. Use dd oflag=sync for the parts of this test that are supposed to result in TX_WRITE zil entries. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10022	2020-02-20 08:11:51 -08:00
Ryan Moeller	8136956716	ZTS: Eliminate partitioning from zpool_create etc These tests can be made to work without a bunch of complex partitioning of physical disks. Use the 3 disks directly, creating a few file disks if needed for a compelling reason. Reduce the use of shared variables that don't have a clear utility. Catch the fallout in tests that include cfg/shlib from zpool_create. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10002	2020-02-20 08:10:13 -08:00
Ryan Moeller	873cd182de	ZTS: Fix zpool_create/create-o_ashift on FreeBSD For some unknown reason, egrep was misbehaving with this pattern on FreeBSD. The command works fine run interactively from a shell, but in the test the output of egrep is empty. Work around the issue by using a filter in the awk script instead. While here, add a bit of diagnostic output and other simplifications to the awk script as well. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10023	2020-02-19 10:27:23 -08:00
Ryan Moeller	392556f0ef	ZTS: Avoid nonportable cmp flag FreeBSD doesn't have the -n flag for cmp. Read the area for the first four labels from the disk to a separate file to compare instead of using the special flag to limit the size. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10024	2020-02-19 09:03:31 -08:00
Ryan Moeller	43849fdf3f	ZTS: Move free to Linux commands list FreeBSD does not have the free command. This command is only used by Linux in a perf hostinfo function. Move free from the list of common commands to the list of Linux commands. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10011	2020-02-18 11:23:41 -08:00
Ryan Moeller	5f087dda78	Enable zpool events tunables and tests on FreeBSD We have have made the necessary changes in our module code to expose zevents through both devd and the zpool events ioctl. Now the tunables can be exposed and zpool events tests can be enabled on both platforms. A few minor tweaks to the tests were needed to accommodate the way wc formats output on FreeBSD. zed remains to be ported. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10008	2020-02-18 11:22:56 -08:00
Richard Laager	f244846462	Prefer org.openzfs for features and properties Moving forward, we wish to use org.openzfs (no dash) rather than org.open-zfs or org.zfsonlinux for feature GUIDs and property names. The existing feature GUIDs cannot be changed. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Richard Laager <rlaager@wiktel.com> Closes #10003	2020-02-18 09:36:50 -08:00
Ryan Moeller	fb63fc0c03	ZTS: Move cksum to common system commands The cksum command is used by delegate tests. We have it on FreeBSD, so it should not have been moved to the Linux commands list. Move it back to the common commands list. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #10007	2020-02-16 12:49:49 -08:00
Jason King	13b5a4d5c0	Support setting user properties in a channel program This adds support for setting user properties in a zfs channel program by adding 'zfs.sync.set_prop' and 'zfs.check.set_prop' to the ZFS LUA API. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Matt Ahrens <matt@delphix.com> Co-authored-by: Sara Hartse <sara.hartse@delphix.com> Contributions-by: Jason King <jason.king@joyent.com> Signed-off-by: Sara Hartse <sara.hartse@delphix.com> Signed-off-by: Jason King <jason.king@joyent.com> Closes #9950	2020-02-14 13:41:42 -08:00
Ryan Moeller	4f4ddf98ee	ZTS: Misc test fixes for FreeBSD Add missing logic for FreeBSD to a few test scripts. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #9994	2020-02-13 13:52:34 -08:00
Ryan Moeller	71439163bb	ZTS: Don't include zpool_create.shlib in zpool_add The zpool_add tests include zpool_create.shlib for a few silly variables. Don't use those variables for the file names. Include zpool_add.kshlib for whatever variables we still need. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #9997	2020-02-13 12:11:25 -08:00
Ryan Moeller	3bdc4f6314	ZTS: Eliminate partitioning from zpool_remove These tests do not need to use partitions. Get rid of the partitioning and just use the disks directly. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #9996	2020-02-13 12:10:35 -08:00
Ryan Moeller	3e725f0ad2	ZTS: Eliminate partitioning from write_dirs These tests do not need to use partitions. Get rid of the partitioning and just use the disks directly. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #9995	2020-02-13 12:08:59 -08:00
Ryan Moeller	5ceccda5cb	ZTS: Cleanup some cleanup functions Cleanup functions should make a best effort to clean up as much as possible. Do a consistency pass in a bunch of tests to make the cleanup functions less prone to failure and fix a few typos here and there. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #9993	2020-02-13 12:05:32 -08:00
Ryan Moeller	523bc0d548	Fix a typo/whitespace in tests README Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #9991	2020-02-13 12:04:47 -08:00
Ryan Moeller	b90b01cbc8	ZTS: Use ECKSUM instead of EBADE in libzfs test Linux defines ECKSUM as EBADE, FreeBSD defines it as EINTEGRITY. Test for ECKSUM instead of EBADE so we don't have to define EBADE for this test on FreeBSD. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #9992	2020-02-13 12:03:01 -08:00
Ryan Moeller	610eec452d	ZTS: Move user_namespace test to linux.run Namespaces is a Linux feature not available on other platforms. Move the user_namespace test out of common.run to linux.run. Reviewed-by: Igor Kozhukhov <igor@dilos.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #9982	2020-02-12 13:06:00 -08:00
Ryan Moeller	834f274fbf	ZTS: Interpret env vars in faketty on FreeBSD This was missed in review. On FreeBSD, script does not understand environment variables being passed as a command. Use env to make faketty handle env vars on FreeBSD. Reviewed-by: Igor Kozhukhov <igor@dilos.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #9981	2020-02-12 13:04:51 -08:00
Ryan Moeller	9c536b9a78	ZTS: Fix zdb_display_block on FreeBSD Missed this in the review, but wc output on FreeBSD is indented, so string comparisons mismatch when comparing to an unindented number. Compare counts as integers instead of strings. Reviewed-by: Igor Kozhukhov <igor@dilos.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Paul Zuchowski <pzuchowski@datto.com> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #9980	2020-02-12 13:01:55 -08:00
Ryan Moeller	1bbeb6d755	ZTS: Move zpool_split_wholedisks to linux.run This test uses the scsi_debug Linux kernel module. Move the test to linux.run until we have an alternative to scsi_debug worked out on FreeBSD. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #9984	2020-02-12 12:59:28 -08:00
Christian Schwarz	948f0c4419	zcp: add zfs.sync.bookmark Add support for bookmark creation and cloning. Reviewed-by: Matt Ahrens <matt@delphix.com> Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Christian Schwarz <me@cschwarz.com> Closes #9571	2020-02-11 13:19:17 -08:00
Christian Schwarz	a73f361fdb	Implement bookmark copying This feature allows copying existing bookmarks using zfs bookmark fs#target fs#newbookmark There are some niche use cases for such functionality, e.g. when using bookmarks as markers for replication progress. Copying redaction bookmarks produces a normal bookmark that cannot be used for redacted send (we are not duplicating the redaction object). ZCP support for bookmarking (both creation and copying) will be implemented in a separate patch based on this work. Overview: - Terminology: - source = existing snapshot or bookmark - new/bmark = new bookmark - Implement bookmark copying in `dsl_bookmark.c` - create new bookmark node - copy source's `zbn_phys` to new's `zbn_phys` - zero-out redaction object id in copy - Extend existing bookmark ioctl nvlist schema to accept bookmarks as sources - => `dsl_bookmark_create_nvl_validate` is authoritative - use `dsl_dataset_is_before` check for both snapshot and bookmark sources - Adjust CLI - refactor shortname expansion logic in `zfs_do_bookmark` - Update man pages - warn about redaction bookmark handling - Add test cases - CLI - pyyzfs libzfs_core bindings Reviewed-by: Matt Ahrens <matt@delphix.com> Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Christian Schwarz <me@cschwarz.com> Closes #9571	2020-02-11 13:19:12 -08:00
Paul Zuchowski	bc67cba7c0	Fix zdb -R with 'b' flag zdb -R :b fails due to the indirect block being compressed, and the 'b' and 'd' flag not working in tandem when specified. Fix the flag parsing code and create a zfs test for zdb -R block display. Also fix the zio flags where the dotted notation for the vdev portion of DVA (i.e. 0.0:offset:length) fails. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Paul Zuchowski <pzuchowski@datto.com> Closes #9640 Closes #9729	2020-02-10 14:00:05 -08:00
Graham Christensen	dda702fd16	bash scripts: use /usr/bin/env for bash shebangs Not all systems / distros have a `/bin/bash`, and these scripts are more difficult to run at development time. For example, my system is NixOS which doesn't have a /bin/bash. This is not a problem for NixOS building ZFS as a package: the build environment automatically replaces these shebangs with corrected paths. The problem is much more annoying at development time: either the scripts don't run, or I correct them for my local machine and deal with a perpetually dirty work tree. Before committing this patch I confirmed there are existing scripts which use `/usr/bin/env` to locate bash, so I am thinking this is a safe transformation. There are a handful of other shebangs in this repository which don't work on my system. This patch is useful on its own specifically for `commitcheck.sh`, otherwise I can't validate my commits before submission. Here are the remaining shebangs which NixOS systems won't have: 1274 #!/bin/ksh -p 91 #!/bin/ksh 89 #! /bin/ksh -p 2 #!/bin/sed -f 1 #!/usr/bin/perl -w 1 #!/usr/bin/ksh 1 #!/bin/nawk -f plus this which will create an invalid shebang in `tests/zfs-tests/tests/functional/mv_files/mv_files_common.kshlib`: echo "#!/bin/ksh" > $TEST_BASE_DIR/exitsZero.ksh I chose to leave those alone for now, and gauge the interest in this much smaller patch first. The fixes for these are easy enough by simply using `/usr/bin/env ksh`: 91 #!/bin/ksh 1 #!/usr/bin/ksh The fix for the other set is much trickier. Quoting the GNU coreutils manual: Most operating systems (e.g. GNU/Linux, BSDs) treat all text after the first space as a single argument. When using env in a script it is thus not possible to specify multiple arguments. and not all `env`'s support arguments. Mine (GNU Coreutils 8.31) does, though this feature is new since April 2018, GNU Coreutils 8.30: https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=668306ed86c8c79b0af0db8b9c882654ebb66db2 and worse, requires the -S argument: -S, --split-string=S process and split S into separate arguments; used to pass multiple arguments on shebang lines Example: $ seq 1 2 \| $(nix-build '<nixpkgs>' -A coreutils)/bin/env "sort -nr" /nix/[...]-coreutils-8.31/bin/env: ‘sort -nr’: No such file or directory /nix/[...]-coreutils-8.31/bin/env: use -[v]S to pass options in shebang lines $ seq 1 2 \| $(nix-build '<nixpkgs>' -A coreutils)/bin/env "-S sort -nr" 2 1 GNU Coreutils says FreeBSD's `env` does, though I wonder if FreeBSD's would be unhappy with the `-S`: https://www.gnu.org/software/coreutils/manual/html_node/env-invocation.html#env-invocation BusyBox v1.30.1 does not, and does not have a `-S`-like option: $ seq 1 2 \| $(nix-build '<nixpkgs>' -A busybox)/bin/env "sort -nr" env: can't execute 'sort -nr': No such file or directory Toybox 0.8.1 also does not, and also does not have a `-S` option: $ seq 1 2 \| $(nix-build '<nixpkgs>' -A toybox)/bin/env "sort -nr" env: exec sort -nr: No such file or directory --- At any rate, if this patch merges and the remaining ~1,500 are updated, the much larger patch should probably include a checkstyle-like test asserting all new shebangs use `/usr/bin/env`. I also don't mind dealing with NixOS weirdness if the project would prefer that. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Graham Christensen <graham@grahamc.com> Closes #9893	2020-02-10 13:13:46 -08:00
Ryan Moeller	572b5b302a	ZTS: Test zvol I/O in different volmodes We found that our zvol code had some issues with volmode=dev that were not revealed by ZTS. Add some basic I/O operations to exercise more code paths in the volmode test. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #9953	2020-02-10 13:10:25 -08:00
Attila Fülöp	31b160f0a6	ICP: Improve AES-GCM performance Currently SIMD accelerated AES-GCM performance is limited by two factors: a. The need to disable preemption and interrupts and save the FPU state before using it and to do the reverse when done. Due to the way the code is organized (see (b) below) we have to pay this price twice for each 16 byte GCM block processed. b. Most processing is done in C, operating on single GCM blocks. The use of SIMD instructions is limited to the AES encryption of the counter block (AES-NI) and the Galois multiplication (PCLMULQDQ). This leads to the FPU not being fully utilized for crypto operations. To solve (a) we do crypto processing in larger chunks while owning the FPU. An `icp_gcm_avx_chunk_size` module parameter was introduced to make this chunk size tweakable. It defaults to 32 KiB. This step alone roughly doubles performance. (b) is tackled by porting and using the highly optimized openssl AES-GCM assembler routines, which do all the processing (CTR, AES, GMULT) in a single routine. Both steps together result in up to 32x reduction of the time spend in the en/decryption routines, leading up to approximately 12x throughput increase for large (128 KiB) blocks. Lastly, this commit changes the default encryption algorithm from AES-CCM to AES-GCM when setting the `encryption=on` property. Reviewed-By: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-By: Jason King <jason.king@joyent.com> Reviewed-By: Tom Caputi <tcaputi@datto.com> Reviewed-By: Richard Laager <rlaager@wiktel.com> Signed-off-by: Attila Fülöp <attila@fueloep.org> Closes #9749	2020-02-10 12:59:50 -08:00
Igor K	818d4a87fd	ZTS: Add an is_dilos function for future ZTS updates Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Igor Kozhukhov <igor@dilos.org> Closes #9960	2020-02-07 12:32:52 -08:00
Ryan Moeller	9825e7ad1d	ZTS: Use wc -c instead of --bytes for portability FreeBSD does not have the long opts for wc. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Paul Dagnelie <pcd@delphix.com> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #9963	2020-02-07 12:31:38 -08:00
Ryan Moeller	591505dc27	ZTS: Only use ext4 on Linux And while here, add a workaround for FreeBSD to ensure dirty data is synced before taking a snapshot, by remounting read-only, then remount again read-write after the snapshot. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #9914	2020-01-31 09:00:47 -08:00
Ryan Moeller	12430796d6	Use the correct spelling of 'jailed' in tests FreeBSD has jails, not zones. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #9899	2020-01-31 08:52:29 -08:00

... 3 4 5 6 7 ...

1134 Commits