mirror_zfs

mirror of https://git.proxmox.com/git/mirror_zfs.git synced 2026-03-22 08:51:30 +03:00

Author	SHA1	Message	Date
Tino Reichardt	1d3ba0bf01	Replace dead opensolaris.org license link The commit replaces all findings of the link: http://www.opensolaris.org/os/licensing with this one: https://opensource.org/licenses/CDDL-1.0 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de> Closes #13619	2022-07-11 14:16:13 -07:00
Brian Behlendorf	ad8b9f940c	Scrub mirror children without BPs When scrubbing a raidz/draid pool, which contains a replacing or sparing mirror with multiple online children, only one child will be read. This is not normally a serious concern because the DTL records are used to determine where a good copy of the data is. As long as the data can be read from one child the mirror vdev will use it to repair gaps in any of its children. Furthermore, even if the data which was read is corrupt the raidz code will detect this and issue its own repair I/O to correct the damage in the mirror vdev. However, in the scenario where the DTL is wrong due to silent data corruption (say due to overwriting one child) and the scrub happens to read from a child with good data, then the other damaged mirror child will not be detected nor repaired. While this is possible for both raidz and draid vdevs, it's most pronounced when using draid. This is because by default the zed will sequentially rebuild a draid pool to a distributed spare, and the distributed spare half of the mirror is always preferred since it delivers better performance. This means the damaged half of the mirror will go undetected even after scrubbing. For system administrations this behavior is non-intuitive and in a worst case scenario could result in the only good copy of the data being unknowingly detached from the mirror. This change resolves the issue by reading all replacing/sparing mirror children when scrubbing. When the BP isn't available for verification, then compare the data buffers from each child. They must all be identical, if not there's silent damage and an error is returned to prompt the top-level vdev to issue a repair I/O to rewrite the data on all of the mirror children. Since we can't tell which child was wrong a checksum error is logged against the replacing or sparing mirror vdev. Reviewed-by: Mark Maybee <mark.maybee@delphix.com> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13555	2022-06-23 10:36:28 -07:00
наб	88d5580e51	tests: add zfs_unshare_008_pos checking whitespace escaping Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13165	2022-05-12 09:27:00 -07:00
Brian Atkinson	f567d67fda	Adding ZTS test for O_APPEND Commit `63b18e4` fixed an issue in zpl_aio_write() to make sure that kiocb->ki_pos was updated correctly when opening a file with O_APPEND. Adding a test to verify O_APPEND functionality with lseek can make sure that all other distros/kernel versions also have the correct behavior. Also moved the threadappends_001_pos test into this append test directory in functional ZTS directory. This way the two append tests are together for organization purposes. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Brian Atkinson <batkinson@lanl.gov> Closes #13424	2022-05-11 08:38:16 -07:00
наб	0425d58852	autoconf: use include directives instead of recursing down tests (mostly) Only down to tests/zfs-tests/tests, but pull out C programs into the main Makefile ‒ this means we get correct dependency tracking for all programs (and parallelise across them) dist diff: -zfs-2.1.99/tests/zfs-tests/tests/stress/ -zfs-2.1.99/tests/zfs-tests/tests/stress/Makefile.am -zfs-2.1.99/tests/zfs-tests/tests/stress/Makefile.in Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13316	2022-05-10 10:20:09 -07:00
наб	93a8c85e7b	Move test-runner.1 into top-level man/ dist delta: +zfs-2.1.99/man/man1/test-runner.1 -zfs-2.1.99/tests/test-runner/man/ -zfs-2.1.99/tests/test-runner/man/test-runner.1 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13316	2022-05-10 10:19:18 -07:00
наб	0f6c4fd00e	autoconf: use include directives instead of recursing down man Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13316	2022-05-10 10:19:10 -07:00
наб	5cdca5b1da	autoconf: use include directives instead of recursing down cmd No installation diff, dist lost -zfs-2.1.99/cmd/fsck_zfs/fsck.zfs which was distributed erroneously, since it's generated Also clean gitrev on clean Also add -e 'any possible bashisms' to default checkbashisms flags, and fully parallelise it and shellcheck, and it works out-of-tree, too Also align the Release in the dist META file correctly Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13316	2022-05-10 10:18:38 -07:00
Shaan Nobee	411f4a018d	Speed up WB_SYNC_NONE when a WB_SYNC_ALL occurs simultaneously Page writebacks with WB_SYNC_NONE can take several seconds to complete since they wait for the transaction group to close before being committed. This is usually not a problem since the caller does not need to wait. However, if we're simultaneously doing a writeback with WB_SYNC_ALL (e.g via msync), the latter can block for several seconds (up to zfs_txg_timeout) due to the active WB_SYNC_NONE writeback since it needs to wait for the transaction to complete and the PG_writeback bit to be cleared. This commit deals with 2 cases: - No page writeback is active. A WB_SYNC_ALL page writeback starts and even completes. But when it's about to check if the PG_writeback bit has been cleared, another writeback with WB_SYNC_NONE starts. The sync page writeback ends up waiting for the non-sync page writeback to complete. - A page writeback with WB_SYNC_NONE is already active when a WB_SYNC_ALL writeback starts. The WB_SYNC_ALL writeback ends up waiting for the WB_SYNC_NONE writeback. The fix works by carefully keeping track of active sync/non-sync writebacks and committing when beneficial. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Shaan Nobee <sniper111@gmail.com> Closes #12662 Closes #12790	2022-05-03 13:23:26 -07:00
Brian Behlendorf	3d149db1f3	ZTS: Retry auto_spare_multiple.ksh The auto_spare_multiple.ksh test may incorrectly fail for a similar reason as the auto_spare_shared.ksh test. Add it to known list of exceptions which should be retried to prevent failures in the CI. Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13318	2022-04-13 09:44:54 -07:00
Brian Behlendorf	5846f71182	ZTS: Retry redundancy_draid_spare[1,3].ksh The redundancy_draid_spare1.ksh and redundancy_draid_spare3.ksh test cases are a little to strict for the sequential resilver case. While unlikely it is possible that a handful of correctable checksum errors will be reported resulting in a test failure. Update the zts-report.py script to allow this the test case to be retried if requested. Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13318	2022-04-13 09:44:33 -07:00
наб	533aef2b3c	tests: zts-report: add #13215 to known list for FreeBSD Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13259	2022-04-01 18:03:32 -07:00
наб	e78b7a73fe	tests: zts-report: issue numbers are numbers Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13259	2022-04-01 18:03:27 -07:00
наб	fbe811b054	tests: remove unused functions As found by git -C tests/ grep ^function \| grep -vFe '.lua:' -e '.zcp:' \| while IFS=":$IFS" read -r _ _ fn _; do [ $(git -C tests/ grep -wF $fn \| head -2 \| wc -l) -eq 1 ] && echo $fn; done after all rounds this comes out to, sorted: check_slog_state chgusr_exec cksum_files cleanup_pools compare_modes count_ACE dataset_set_defaultproperties ds_is_snapshot get_ACE get_group get_min get_mode get_owner get_rand_checksum get_rand_checksum_any get_rand_large_recsize get_rand_recsize get_user_group getitem indirect_vdev_mapping_size is_dilos log_noresult log_notinuse log_other log_timed_out log_uninitiated log_warning num_jobs_by_cpu plus_sign_check_l plus_sign_check_v record_cksum rwx_node seconds_mmp_waits_for_activity set_cur_usr setup_mirrors setup_raidzs showshares_smb zfs_zones_setup This, of course, doesn't catch recursive ones, or ones that log with their own function name as a prefix, but Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13259	2022-04-01 18:03:05 -07:00
наб	7b6c4cbb1f	tests: zts-report: simplify Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13259	2022-04-01 18:02:48 -07:00
наб	abccb4b584	tests: zts-report: prune unused reasons Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13259	2022-04-01 18:02:43 -07:00
наб	6b2616435b	tests: zts-report: only known-SKIP zfs_unshare_002_pos on Linux Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13259	2022-04-01 18:02:36 -07:00
наб	caccfc870f	tests: clean out unused/single-use/useless commands from the list Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13259	2022-04-01 17:59:18 -07:00
наб	50a33b8c69	tests: logapi: don't cat excessively This also fixes line welding in test error output Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13259	2022-04-01 17:58:37 -07:00
наб	f63c9dc70a	egrep -> grep -E Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13259	2022-04-01 17:58:07 -07:00
наб	3886e7081a	tests: zfs_unshare_006: log_unsupported iff usershares are actually off Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13259	2022-04-01 17:55:30 -07:00
наб	5b0e75caef	tests: don't use share/unshare exportfs aliases, support FreeBSD NFS Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #13259	2022-04-01 17:55:14 -07:00
Tony Hutter	b73505c7e0	ZTS: Log test name to /dev/kmsg on Linux Add a -K option to the test suite to log each test name to /dev/kmsg (on Linux), so if there's a kernel warning we'll be able to match it up to a particular test. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #13227	2022-03-23 09:15:02 -06:00
наб	26bbce8173	tests: replace explicit $? \|\| log_fail with log_must Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12996	2022-03-15 15:13:33 -07:00
Brian Behlendorf	19229e5f1e	ZTS: Modify receive-o-x_props_override.ksh exception As previously noted in #12272 the receive-o-x_props_override.ksh test reliably fails on FreeBSD. Since we don't expect this test to pass move the exception from the "maybe" to "known" section. This way we don't retry the FAILED test when it is not expected to pass. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13167	2022-03-01 08:47:30 -08:00
Damian Szuberski	78fad47cb3	Add Linux kmemleak support to ZTS - Kmemleak `clear` is invoked right before every test case run. - Kmemleak `scan` is requested right after each test case is finished. - Kmemleak instrumentation is not used for setup/cleanup/pretest/posttest/failsafe stages to shorten the test case execution time. - Kmemleak periodic scan is disabled (`scan=0`) before the test suite run to avoid interfering with the on-demand scan results. - There are unavoidable potential false positives coming from kernel areas other than OpenZFS module. - The ZTS with kmemleak enabled duration is increased by ~50%. Example run ``` Running Time: 07:12:13 Percent passed: 98.3% unreferenced object 0xffff9da82aea5410 (size 80): comm "kworker/u32:10", pid 942206, jiffies 4296749716 (age 2615.516s) hex dump (first 32 bytes): 00 30 30 00 00 00 00 00 ff 8f 30 00 00 00 00 00 .00.......0..... 51 e6 77 05 a8 9d ff ff 00 00 00 00 00 00 00 00 Q.w............. backtrace: [<000000005cf1fea2>] alloc_extent_state+0x1d/0xb0 [btrfs] [<0000000083f78ae5>] set_extent_bit+0x2ff/0x670 [btrfs] [<00000000de29249e>] lock_extent_bits+0x6b/0xa0 [btrfs] [<00000000b241f424>] lock_and_cleanup_extent_if_need+0xaf/0x1c0 [btrfs] [<0000000093ca72b5>] btrfs_buffered_write+0x297/0x7d0 [btrfs] [<000000002c2938c8>] btrfs_file_write_iter+0x127/0x390 [btrfs] [<00000000b888f720>] do_iter_readv_writev+0x152/0x1b0 [<00000000320f0bcc>] do_iter_write+0x7c/0x1c0 [<000000000b5a8fe0>] lo_write_bvec+0x62/0x150 [loop] [<000000009aa03c73>] loop_process_work+0x250/0xbd0 [loop] [<00000000c7487d8a>] process_one_work+0x1f1/0x390 [<000000000b236831>] worker_thread+0x53/0x3e0 [<0000000023cb3e57>] kthread+0x127/0x150 [<000000002d48676a>] ret_from_fork+0x22/0x30 ``` Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: szubersk <szuberskidamian@gmail.com> Closes #13084	2022-02-24 10:21:13 -08:00
Brian Behlendorf	a5b3fab341	ZTS: Retry in import_rewind_config_changed.ksh As explained by the disclaimer in the test case, "This test can fail since nothing guarantees that old MOS blocks aren't overwritten." This behavior is expected and correct, but results in a flaky test case which is problematic for the CI. The best we can do to resolve this is to retry the sub-test which failed when the MOS blocks have clearly been overwritten. When testing failures were rare enough that a single retry should normally be sufficient. However, we allow up to five for good measure. Reviewed by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13119	2022-02-20 19:21:31 -08:00
Brian Behlendorf	7901b62685	ZTS: Fix vdev_zaps_004_pos.ksh When attaching a vdev to a mirror wait for the resilver to complete before invoking `zdb` to inspect the pool. This ensures the pool is essentially idle which allows `zdb` to open the imported pool reliably. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13112 Closes #6935	2022-02-17 12:09:06 -08:00
Brian Behlendorf	b7baf49bd3	ZTS: Fix zpool_expand_001_pos The dRAID section of the zpool_expand_001_pos test would reliably fail because the calculated expansion size assumed the dRAID top-level vdev was created with a distributed spare. Create the vdev as expected to resolve the test failure. This test case flaw was accidentally caused by changing the default number of dRAID distributed spares from one to zero while dRAID was being developed. Additionally, remove zpool_expand_005_pos from the list of possible faulty tests. It appears to be passing consistently in my testing. Reviewed by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13091	2022-02-13 14:22:00 -08:00
Brian Behlendorf	100b8950f4	ZTS: Update enospc_002_pos test case The on-disk cost of creating a snapshot or bookmark is sufficiently low that it is difficult to make it reliably fail even when the pool is "full". In order to avoid false positives remove these two checks from the test case. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #13060	2022-02-04 09:36:46 -08:00
наб	17b2ae0b24	Fix test-runner on FreeBSD CLOCK_MONOTONIC_RAW is only a thing on Linux and macOS. I'm not actually sure why the previous hardcoding of a constant didn't error out, but when we removed it, it sure does now. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Co-authored-by: Rich Ercolani <rincebrain@gmail.com> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes #12995	2022-01-21 15:37:46 -08:00
Damian Szuberski	8a7c4efd3c	Removed Python 2 and Python 3.5- support Deprecation of Python versions below 3.6 gives opportunity to unify the build and install requirements for OpenZFS packages. The minimal supported Python version is 3.6 as this is the most recent Python package CentOS/RHEL 7 users can get. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rich Ercolani <rincebrain@gmail.com> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: szubersk <szuberskidamian@gmail.com> Closes #12925	2022-01-13 09:51:12 -07:00
Brian Behlendorf	d2f374c3f2	ZTS: Fix rollback_003_pos.ksh Under Linux when rolling back a mounted filesystem negative dentries may not be dropped from the cache. This can result in an ENOENT being incorrectly returned on first access. Issuing a `df` before the unmount results in the negative dentries being invalidated and side steps the issue. This is solely a workaround for the test case on Linux and not correct behavior. The core issue of invalidating negative dentries needs to be handled with a kernel side change. This is being tracked as issue #6143. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #12898 Issue #6143	2021-12-22 11:05:07 -08:00
Brian Behlendorf	9ba5d8d204	ZTS: Fix refreserv_raidz.ksh The rerefreserv_raidz test was failing on Linux because the sync being issued doesn't guarantee a pool sync. Switch to using the sync_pool function and remove the ZTS exception for Linux. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #12897	2021-12-22 09:37:27 -08:00
Brian Behlendorf	7b5d783a46	ZTS: rsend_007_pos failures The rsend_007_pos test reliably fails on Linux in the cleanup function. This is caused by an unmount error when attempting to recursively destroy the newly received datasets. Invoking `df` prior to the `zfs destroy` interestingly avoids the unmont error. Why this should matter is unclear and should be investigated. However, this minor tweak may allow us to remove the ZTS rsend exceptions. The subsequent rsend_010_pos and rsend_011_pos failures were a result of this initial failure. The other "maybe" failures I was unable to reproduce and have not been recently observed in the master branch. Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #5665 Closes #6086 Closes #6087 Closes #6446 Closes #12876	2021-12-21 11:11:07 -08:00
Brian Behlendorf	eecd3f1a21	ZTS: alloc_class.ksh must wait for the process to exit The alloc_class_* tests may fail on Linux with an EBUSY error if `zfs destroy` is run before the `dd` process has had a chance to terminate. Wait on the pid after the `kill -9` to make sure. When testing I didn't observe any failures for the alloc_class tests. Remove them from the exceptions list, the CI was used to verify the tests pass on all platforms. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Rich Ercolani <rincebrain@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #12873	2021-12-17 12:40:34 -08:00
Brian Behlendorf	14ba514af6	ZTS: import_rewind_device_replaced reliably fails The import_rewind_device_replaced.ksh test was never entirely reliable because it depends on MOS data not being overwritten. The MOS data is not protected by the snapshot so occasional failures were always expected. However, this test is now failing reliably on all platforms indicating something has changed in the code since the test was marked "maybe". Convert the test to a "known" failure until the root cause is identified and resolved. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #12821	2021-12-06 09:45:17 -08:00
Brian Behlendorf	77e2756de0	Linux 5.13 compat: retry zvol_open() when contended Due to a possible lock inversion the zvol open call path on Linux needs to be able to retry in the case where the spa_namespace_lock cannot be acquired. For Linux 5.12 an older kernel this was accomplished by returning -ERESTARTSYS from zvol_open() to request that blkdev_get() drop the bdev->bd_mutex lock, reaquire it, then call the open callback again. However, as of the 5.13 kernel this behavior was removed. Therefore, for 5.12 and older kernels we preserved the existing retry logic, but for 5.13 and newer kernels we retry internally in zvol_open(). This should always succeed except in the case where a pool's vdev are layed on zvols, in which case it may fail. To handle this case vdev_disk_open() has been updated to retry when opening a device when -ERESTARTSYS is returned. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #12301 Closes #12759	2021-12-01 17:07:12 -07:00
Paul Dagnelie	2320e6eb43	Add zfs-test facility to automatically rerun failing tests This was a project proposed as part of the Quality theme for the hackthon for the 2021 OpenZFS Developer Summit. The idea is to improve the usability of the automated tests that get run when a PR is created by having failing tests automatically rerun in order to make flaky tests less impactful. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Signed-off-by: Paul Dagnelie <pcd@delphix.com> Closes #12740	2021-12-01 10:38:53 -07:00
Brian Behlendorf	371e0f7754	Exclude zfs_copies_003_pos on Linux This test case may fail on 5.13 and newer Linux kernels if the /dev/zvol/ device is not created by udev. Reviewed-by: Rich Ercolani <rincebrain@gmail.com> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #12301 Closes #12738	2021-11-10 13:56:01 -07:00
Rich Ercolani	380b072403	Exclude zvol_misc_volmode for now It keeps failing, on changes which aren't related at all. So until someone runs down why, I'd like it to stop being the sole reason for CI failures. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes #12733	2021-11-08 19:01:19 -07:00
youzhongyang	ec64fdb93d	Skip snapshot in zfs_iter_mounted() The intention of the zfs_iter_mounted() is to traverse the dataset and its descendants, not the snapshots. The current code can cause a mounted snapshot to be included and thus zfs_open() on the snapshot with ZFS_TYPE_FILESYSTEM would print confusing message such as "cannot open 'rpool/fs@snap': snapshot delimiter '@' is not expected here". Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Youzhong Yang <yyang@mathworks.com> Closes #12447 Closes #12448	2021-10-20 16:07:19 -07:00
Brian Behlendorf	648445e007	ZTS: Add known exceptions Add the following test failures to the exception list for FreeBSD to ensure we notice new unexpected failures. pool_checkpoint/checkpoint_big_rewind pool_checkpoint/checkpoint_indirect And the following for Linux. zvol/zvol_misc/zvol_misc_snapdev Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #12621 Issue #12622 Issue #12623 Closes #12624	2021-10-11 10:52:32 -07:00
Ryan Moeller	96ad227a9d	ZTS: Minimize udev_wait in zvol_misc tests The zvol_misc tests, in particular zvol_misc_volmode, make use of a common udev_wait function to wait for zvol devices in /dev to quiesce on Linux. On other platforms this function currently only sleeps for one second before returning. This is insufficient, and zvol_misc_volmode has been flaky on FreeBSD as a result. Replace udev_wait with block_device_wait, passing through the optional device parameter where possible. Rearrange a few checks to strengthen the verifications we are making and avoid unnecessarily sleeping. We must keep udev_wait in a couple places to pass in Github CI workflows. Remove zvol_misc_volmode from the maybe failing tests on FreeBSD in zts-report.py. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #12583	2021-10-01 09:36:02 -06:00
Ryan Moeller	c27c124a88	ZTS: Remove exceptions for flaky zhack on FreeBSD Issue #11854 has been resolved, so we can remove the exceptions for it. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #12527	2021-09-01 13:20:00 -07:00
Ka Ho Ng	c3cb57ae47	ZTS: Enable punch-hole tests on FreeBSD Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Ka Ho Ng <khng@FreeBSD.org> Sponsored-by: The FreeBSD Foundation Closes #12458	2021-08-30 13:33:32 -07:00
Ryan Moeller	8ae86e2edc	ZTS: Add tests for creation time Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Reviewed-by: Allan Jude <allan@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #12432	2021-08-17 10:25:58 -07:00
George Amanakis	ab8a8f0745	Fixes in persistent L2ARC In l2arc_add_vdev() first decide whether the device is eligible for L2ARC rebuild or whole device trim and then add it to the list of cache devices. Otherwise l2arc_feed_thread() might already start writing on the device invalidating previous content as l2ad_hand = l2ad_start. However l2arc_rebuild_vdev() needs the device present in the cache device list to figure out its l2arc_dev_t. Fix this by moving most of l2arc_rebuild_vdev() in a new function l2arc_rebuild_dev() which does not need to search in the cache device list. In contrast to l2arc_add_vdev() we do not have to worry about l2arc_feed_thread() invalidating previous content when onlining a cache device. The device parameters (l2ad*) are not cleared when offlining the device and writing new buffers will not invalidate all previous content. In worst case only buffers that have not had their log block written to the device will be lost. Retire persist_l2arc_00{4,5,8} tests since they cover code already covered by the remaining ones. Test persist_l2arc_006 is renamed to persist_l2arc_004 and persist_l2arc_007 is renamed to persist_l2arc_005. Fix a typo in persist_l2arc_004, and remove an assertion that is not always true from l2arc_arcstats_pos. Also update an assertion in persist_l2arc_005 and explain why in a comment. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: George Amanakis <gamanakis@gmail.com> Closes #12365	2021-07-26 12:30:24 -07:00
Ryan Moeller	cfc564f9b1	ZED: Match added disk by pool/vdev GUID if found (#12217 ) This enables ZED to auto-online vdevs that are not wholedisk managed by ZFS. Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Don Brady <don.brady@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov>	2021-06-30 07:37:20 -07:00
Brian Behlendorf	63f4b959a6	ZTS: Add known exceptions The receive-o-x_props_override test case reliably fails on the FreeBSD main builders (but not on Linux), until the root cause is understood add this test to the FreeBSD exception list. On Linux the alloc_class_012_pos test case may occasionally fail. This is a known false positive which has also been added to the Linux exception list until the test can be made entirely reliable. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #12272	2021-06-23 15:53:13 -07:00

1 2 3

148 Commits