mirror_zfs

mirror of https://git.proxmox.com/git/mirror_zfs.git synced 2026-01-09 14:52:05 +03:00

Author	SHA1	Message	Date
Rob Norris	cbd95a950a	ZTS: handle FreeBSD version numbers correctly (#16340 ) FreeBSD patchlevel versions are optional and, if present, in a different location in the version string. Sponsored-by: https://despairlabs.com/sponsor/ Signed-off-by: Rob Norris <robn@despairlabs.com> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Reviewed-by: Tony Hutter <hutter2@llnl.gov>	2024-07-12 10:58:03 -07:00
Rob Norris	10de12e9ed	icp: reorganise SHA2 digest mechanisms sha2_mech_type_t serves double-duty, as the list of MAC providers and also the algo type for direct callers to SHA2Init. Until we disentangle that, reorganise it to make the separation more clear. While we're there, remove the digest mechs we don't use. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #16209	2024-05-31 15:13:23 -07:00
Tony Hutter	a301dc364c	ZTS: Fix redacted_send failures on FreeBSD We're seeing failures for redacted_deleted and redacted_mount on FreeBSD 13-15: 09:58:34.74 diff: /dev/fd/3: No such file or directory 09:58:34.74 ERROR: diff /dev/fd/3 /dev/fd/4 exited 2 The test was trying to diff the file listings between two directories to see if they are the same. The workaround is to do a string comparison of the directory listings instead of using `diff`. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #16224	2024-05-31 15:11:00 -07:00
Brian Behlendorf	6b95031f56	zed: Add deadman-slot_off.sh zedlet Optionally turn off disk's enclosure slot if an I/O is hung triggering the deadman. It's possible for outstanding I/O to a misbehaving SCSI disk to neither promptly complete or return an error. This can occur due to retry and recovery actions taken by the SCSI layer, driver, or disk. When it occurs the pool will be unresponsive even though there may be sufficient redundancy configured to proceeded without this single disk. When a hung I/O is detected by the kmods it will be posted as a deadman event. By default an I/O is considered to be hung after 5 minutes. This value can be changed with the zfs_deadman_ziotime_ms module parameter. If ZED_POWER_OFF_ENCLOSURE_SLOT_ON_DEADMAN is set the disk's enclosure slot will be powered off causing the outstanding I/O to fail. The ZED will then handle this like a normal disk failure. By default ZED_POWER_OFF_ENCLOSURE_SLOT_ON_DEADMAN is not set. As part of this change `zfs_deadman_events_per_second` is added to control the ratelimitting of deadman events independantly of delay events. In practice, a single deadman event is sufficient and more aren't particularly useful. Alphabetize the zfs_deadman_* entries in zfs.4. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #16226	2024-05-29 10:46:41 -07:00
Rob N	d0aa9dbccf	Use memset to zero stack allocations containing unions C99 6.7.8.17 says that when an undesignated initialiser is used, only the first element of a union is initialised. If the first element is not the largest within the union, how the remaining space is initialised is up to the compiler. GCC extends the initialiser to the entire union, while Clang treats the remainder as padding, and so initialises according to whatever automatic/implicit initialisation rules are currently active. When Linux is compiled with CONFIG_INIT_STACK_ALL_PATTERN, -ftrivial-auto-var-init=pattern is added to the kernel CFLAGS. This flag sets the policy for automatic/implicit initialisation of variables on the stack. Taken together, this means that when compiling under CONFIG_INIT_STACK_ALL_PATTERN on Clang, the "zero" initialiser will only zero the first element in a union, and the rest will be filled with a pattern. This is significant for aes_ctx_t, which in aes_encrypt_atomic() and aes_decrypt_atomic() is initialised to zero, but then used as a gcm_ctx_t, which is the fifth element in the union, and thus gets pattern initialisation. Later, it's assumed to be zero, resulting in a hang. As confusing and undiscoverable as it is, by the spec, we are at fault when we initialise a structure containing a union with the zero initializer. As such, this commit replaces these uses with an explicit memset(0). Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #16135 Closes #16206	2024-05-24 19:00:29 -07:00
Don Brady	975a13259b	Add support for parallel pool exports Changed spa_export_common() such that it no longer holds the spa_namespace_lock for the entire duration and instead sets spa_export_thread to indicate an import is in progress on the spa. This allows for an export to a diffent pool to proceed in parallel while an export is still processing potentially long operations like spa_unload_log_sm_flush_all(). Calls like spa_lookup() and spa_vdev_enter() that rely on the spa_namespace_lock to serialize them against a concurrent export, now wait for any in-progress export thread to complete before proceeding. The 'zpool import -a' sub-command also provides multi-threaded support, using a thread pool to submit the exports in parallel. Sponsored-By: Klara Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Wilson <gwilson@delphix.com> Signed-off-by: Don Brady <don.brady@klarasystems.com> Closes #16153	2024-05-14 08:57:41 -07:00
Don Brady	c3f2f1aa2d	vdev probe to slow disk can stall mmp write checker Simplify vdev probes in the zio_vdev_io_done context to avoid holding the spa config lock for a long duration. Also allow zpool clear if no evidence of another host is using the pool. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Olaf Faaland <faaland1@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Don Brady <don.brady@klarasystems.com> Closes #15839	2024-04-29 14:35:53 -07:00
Allan Jude	5044c4e3ff	Fast Dedup: ZAP Shrinking This allows ZAPs to shrink. When there are two empty sibling leafs, one of them is collapsed and its storage space is reused. This improved performance on directories that at one time contained a large number of files, but many or all of those files have since been deleted. This also applies to all other types of ZAPs as well. Sponsored-by: iXsystems, Inc. Sponsored-by: Klara, Inc. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Alexander Stetsenko <alex.stetsenko@klarasystems.com> Closes #15888	2024-04-24 14:51:21 -07:00
Seth Troisi	cdae59e153	ZTS: user_namespace_004.ksh avoid error in cleanup if unsupported Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Seth Troisi <sethtroisi@google.com> Closes #16114	2024-04-22 10:47:44 -07:00
George Wilson	c183d164aa	Parallel pool import This commit allow spa_load() to drop the spa_namespace_lock so that imports can happen concurrently. Prior to dropping the spa_namespace_lock, the import logic will set the spa_load_thread value to track the thread which is doing the import. Consumers of spa_lookup() retain the same behavior by blocking when either a thread is holding the spa_namespace_lock or the spa_load_thread value is set. This will ensure that critical concurrent operations cannot take place while a pool is being imported. The zpool command is also enhanced to provide multi-threaded support when invoking zpool import -a. Lastly, zinject provides a mechanism to insert artificial delays when importing a pool and new zfs tests are added to verify parallel import functionality. Contributions-by: Don Brady <don.brady@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: George Wilson <gwilson@delphix.com> Closes #16093	2024-04-22 09:42:38 -07:00
Rob Norris	26d49fec5f	tests/quota: consistently clear quota property between tests When run in isolation, quota_005_pos would fail in cleanup because it would attempt restore the previous quota, which was 0, and so get an error (because you can't set quota to '0', you have to use 'none'). It worked as part of the quota tag set because the previous tests did not clean up their quota, so there was always a non-zero quota to return to. This adds a simple quota reset function, and has all quota tests run it at cleanup. For the ones that weren't cleaning up, they now do, and for quota_005_pos, which was trying to do the right thing, it now just resets it. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #16097	2024-04-19 16:06:56 -07:00
Rob Norris	f75574cbaa	tests/quota_005_pos: use a long int for doubling the quota size When run in isolation, quota_005_pos would see an empty ~300G dataset. Doubling it's space overflows a int32, which meant it was trying to then set the quota to a negative value, and would fail. When run as part of the quota tests, the filesystem appears to have stuff in it, and so a lower available space, which doesn't overflow, and so succeeds. The bare minimum fix seems to be to use a int64 for the available space, so it can be comfortably doubled. Here it is. (Also a typo fix and a tiny bit of cleanup). Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #16097	2024-04-19 16:06:37 -07:00
Rob N	4725e543be	zinject: "no-op" error injection When injected, this causes the matching IO to appear to succeed, but the actual work is never submitted to the physical device. This can be used to simulate a write-back cache servicing a write, but the backing device has failed and the cache cannot complete the operation in the background. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #16085	2024-04-15 13:52:20 -07:00
Umer Saleem	a100a195fa	Add support for zfs mount -R <filesystem> This commit adds support for mounting a dataset along with all of it's children with '-R' flag for zfs mount. There can be scenarios where we want to mount all datasets under one hierarchy instead of mounting all datasets present on system with '-a' flag. '-R' flag should work on all root and non-root datasets. Usage information and man page has been updated for zfs mount. A test for verifying the behavior for '-R' flag is also added. Reviewed-by: Ameer Hamza <ahamza@ixsystems.com> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Umer Saleem <usaleem@ixsystems.com> Closes #16015	2024-04-11 15:10:24 -07:00
Rob Norris	1bf649cb0a	vdev_disk: fix alignment check when buffer has non-zero starting offset If a linear buffer spans multiple pages, and the first page has a non-zero starting offset, the checker would not include the offset, and so would think there was an alignment gap at the end of the first page, rather than at the start. That is, for a 16K buffer spread across five pages with an initial 512B offset: [.XXXXXXX][XXXXXXXX][XXXXXXXX][XXXXXXXX][XXXXXXX.] It would be interpreted as: [XXXXXXX.][XXXXXXXX]... And be rejected as misaligned. Since it's already a linear ABD, the "linearising" copy would just reuse the buffer as-is, and the second check would failing, tripping the VERIFY in vdev_disk_io_rw(). This commit fixes all this by including the offset in the check for end-of-page alignment. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #16076	2024-04-11 14:43:27 -07:00
Rob Norris	bc27c49404	tests: add test for vdev_disk page alignment check This provides a test driver and a set of test vectors for the page alignment check callback function vdev_disk_check_pages_cb(). Because there's no good facility for exposing this function to a userspace test right now, for now I'm just duplicating the function and adding commentary to remind people to keep them in sync. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #16076	2024-04-11 14:42:46 -07:00
Rob Norris	756e10b0a1	tests: simple zinject disk fault arg check Just making sure the valid values for disk faults are accepted. Obviously we can do a lot more, but this will do to get us started. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #15953	2024-04-03 16:06:43 -07:00
George Wilson	b1e46f869e	Add ashift validation when adding devices to a pool Currently, zpool add allows users to add top-level vdevs that have different ashifts but doing so prevents users from being able to perform a top-level vdev removal. Often times consumers may not realize that they have mismatched ashifts until the top-level removal fails. This feature adds ashift validation to the zpool add command and will fail the operation if the sector size of the specified vdev does not match the existing pool. This behavior can be disabled by using the -f flag. In addition, new flags have been added to provide fine-grained control to disable specific checks. These flags are: --allow-in-use --allow-ashift-mismatch --allow-replicaton-mismatch The force flag will disable all of these checks. Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Mark Maybee <mmaybee@delphix.com> Signed-off-by: George Wilson <gwilson@delphix.com> Closes #15509	2024-03-29 13:15:56 -06:00
Robert Evans	e39e20b6dc	ZTS: fix flakiness in cp_files_002_pos Fix RANDOM to not return zero. Overwriting with `dd ... count=0` does not test anything. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Allan Jude <allan@klarasystems.com> Signed-off-by: Robert Evans <evansr@google.com> Closes #16029	2024-03-27 14:59:16 -07:00
Alexander Motin	0c8eb974ff	BRT: Check pool clone stats in more tests This should allow to catch some leaks, if those happen. While there fix some cosmetic issues. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #16007	2024-03-27 14:47:06 -07:00
Alexander Motin	b403427624	BRT: Fix tests to work on non-empty pools It should not normally happen, but if it does, better to not fail everything for no good reason, or it may be hard to debug. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #16007	2024-03-27 14:46:48 -07:00
Tony Hutter	d6a3d3f12a	ZTS: Skip cross-fs bclone tests if FreeBSD < 14.0 Skip cross filesystem block cloning tests on FreeBSD if running less than version 14.0. Cross filesystem copy_file_range() was added in FreeBSD 14. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #15901	2024-02-16 08:59:56 -08:00
Kevin Greene	79c6dffa6b	Allowing PERFPOOL to be defined by zfs-test users Reviewed-by: John Wren Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Kevin Greene <kevin.greene@delphix.com> Closes #15868	2024-02-09 10:02:46 -08:00
Don Brady	cbe882298e	Add slow disk diagnosis to ZED Slow disk response times can be indicative of a failing drive. ZFS currently tracks slow I/Os (slower than zio_slow_io_ms) and generates events (ereport.fs.zfs.delay). However, no action is taken by ZED, like is done for checksum or I/O errors. This change adds slow disk diagnosis to ZED which is opt-in using new VDEV properties: VDEV_PROP_SLOW_IO_N VDEV_PROP_SLOW_IO_T If multiple VDEVs in a pool are undergoing slow I/Os, then it skips the zpool_vdev_degrade(). Sponsored-By: OpenDrives Inc. Sponsored-By: Klara Inc. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Allan Jude <allan@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Co-authored-by: Rob Wing <rob.wing@klarasystems.com> Signed-off-by: Don Brady <don.brady@klarasystems.com> Closes #15469	2024-02-08 09:19:52 -08:00
Cameron Harr	0823388752	Add 'zpool status -e' flag to see unhealthy vdevs When very large pools are present, it can be laborious to find reasons for why a pool is degraded and/or where an unhealthy vdev is. This option filters out vdevs that are ONLINE and with no errors to make it easier to see where the issues are. Root and parents of unhealthy vdevs will always be printed. Testing: ZFS errors and drive failures for multiple vdevs were simulated with zinject. Sample vdev listings with '-e' option - All vdevs healthy NAME STATE READ WRITE CKSUM iron5 ONLINE 0 0 0 - ZFS errors NAME STATE READ WRITE CKSUM iron5 ONLINE 0 0 0 raidz2-5 ONLINE 1 0 0 L23 ONLINE 1 0 0 L24 ONLINE 1 0 0 L37 ONLINE 1 0 0 - Vdev faulted NAME STATE READ WRITE CKSUM iron5 DEGRADED 0 0 0 raidz2-6 DEGRADED 0 0 0 L67 FAULTED 0 0 0 too many errors - Vdev faults and data errors NAME STATE READ WRITE CKSUM iron5 DEGRADED 0 0 0 raidz2-1 DEGRADED 0 0 0 L2 FAULTED 0 0 0 too many errors raidz2-5 ONLINE 1 0 0 L23 ONLINE 1 0 0 L24 ONLINE 1 0 0 L37 ONLINE 1 0 0 raidz2-6 DEGRADED 0 0 0 L67 FAULTED 0 0 0 too many errors - Vdev missing NAME STATE READ WRITE CKSUM iron5 DEGRADED 0 0 0 raidz2-6 DEGRADED 0 0 0 L67 UNAVAIL 3 1 0 - Slow devices when -s provided with -e NAME STATE READ WRITE CKSUM SLOW iron5 DEGRADED 0 0 0 - raidz2-5 DEGRADED 0 0 0 - L10 FAULTED 0 0 0 0 external device fault L51 ONLINE 0 0 0 14 Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Cameron Harr <harr1@llnl.gov> Closes #15769	2024-02-07 09:12:12 -08:00
Brian Behlendorf	6dccdf501e	BRT: Fix FICLONE/FICLONERANGE shortened copy On Linux the ioctl_ficlonerange() and ioctl_ficlone() system calls are expected to either fully clone the specified range or return an error. The range may be for an entire file. While internally ZFS supports cloning partial ranges there's no way to return the length cloned to the caller so we need to make this all or nothing. As part of this change support for the REMAP_FILE_CAN_SHORTEN flag has been added. When REMAP_FILE_CAN_SHORTEN is set zfs_clone_range() will return a shortened range when encountering pending dirty records. When it's clear zfs_clone_range() will block and wait for the records to be written out allowing the blocks to be cloned. Furthermore, the file range lock is held over the region being cloned to prevent it from being modified while cloning. This doesn't quite provide an atomic semantics since if an error is encountered only a portion of the range may be cloned. This will be converted to an error if REMAP_FILE_CAN_SHORTEN was not provided and returned to the caller. However, the destination file range is left in an undefined state. A test case has been added which exercises this functionality by verifying that `cp --reflink=never\|auto\|always` works correctly. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #15728 Closes #15842	2024-02-05 16:44:45 -08:00
Brian Behlendorf	435b173fd9	ZTS: Apply zfs_bclone_enabled to bclone tests If block cloning is disabled by default then enable it when running the bclone tests. Follow up to #15529. Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #15796	2024-01-22 16:14:08 -08:00
Tino Reichardt	a0b2a93c41	fix: variable type with zfs-tests/cmd/clonefile.c Compiling on arm64 freebsd-13.2 and arm64 almalinux-8 brings currently this error: ``` CC tests/zfs-tests/cmd/clonefile.o tests/zfs-tests/cmd/clonefile.c:166:43: error: result of comparison of \ constant -1 with expression of type 'char' is always true \ [-Werror,-Wtautological-constant-out-of-range-compare] while ((c = getopt(argc, argv, "crfdq")) != -1) { ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ ~~ 1 error generated. gmake[2]: *** [Makefile:8675: tests/zfs-tests/cmd/clonefile.o] Error 1 ``` Fix: use correct variable type `int`. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rob Norris <robn@despairlabs.com> Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de> Closes #15783	2024-01-17 09:06:14 -08:00
Pawel Jakub Dawidek	f45dd90f34	Fix cloning into mmaped and cached file. If the destination file is mmaped and the mmaped region was already read, so it is cached, we need to update mmaped pages after successful clone using update_pages(). Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Pointed out by: Ka Ho Ng <khng@freebsd.org> Signed-off-by: Pawel Jakub Dawidek <pawel@dawidek.net> Closes #15772	2024-01-17 08:51:07 -08:00
Rob N	f0bf7a247d	Linux 6.7 compat: zfs_setattr fix atime update In `db4fc559c` I messed up and changed this bit of code to set the inode atime to an uninitialised value, when actually it was just supposed to loading the atime from the inode to be stored in the SA. This changes it to what it should have been. Ensure times change by the right amount Previously, we only checked if the times changed at all, which missed a bug where the atime was being set to an undefined value. Now ensure the times change by two seconds (or thereabouts), ensuring we catch cases where we set the time to something bonkers Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <robn@despairlabs.com> Sponsored-by: https://despairlabs.com/sponsor/ Closes #15762 Closes #15773	2024-01-16 14:01:17 -08:00
Umer Saleem	995734ed12	ZTS: Test for clone, mmap and write for block cloning For block cloning, if we mmap the cloned file and write from the map into the file, it triggers a panic in dbuf_redirty() on Linux. The same scenario causes data corruption on FreeBSD. Both these issues are fixed under PR#15656 and PR#15665. It would be good to add a test for this scenario in ZTS. The test program and issue was produced by @robn. Reviewed-by: Pawel Jakub Dawidek <pawel@dawidek.net> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Ameer Hamza <ahamza@ixsystems.com> Signed-off-by: Umer Saleem <usaleem@ixsystems.com> Closes #15717	2024-01-16 13:15:10 -08:00
Stefan Lendl	66670ba9f0	fix(mount): do not truncate shares not zfs mount When running zfs share -a resetting the exports.d/zfs.exports makes sense the get a clean state. Truncating was also called with zfs mount which would not populate the file again. Add test to verify shares persist after mount -a. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Stefan Lendl <s.lendl@proxmox.com> Closes #15607 Closes #15660	2024-01-12 12:05:11 -08:00
Brian Behlendorf	c4fa674367	Enable block_cloning tests on FreeBSD Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Pawel Jakub Dawidek <pawel@dawidek.net> Closes #15749	2024-01-12 11:57:13 -08:00
Pawel Jakub Dawidek	4cf4bc7334	Block cloning tests. The test mostly focus on testing various corner cases. The tests take a long time to run, so for the common.run runfile we randomly select a hundred tests. To run all the bclone tests, bclone.run runfile should be used. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Pawel Jakub Dawidek <pawel@dawidek.net> Closes #15631	2023-12-26 12:01:53 -08:00
Umer Saleem	dbda45160f	Test LWB buffer overflow for block cloning PR#15634 removes 128K into 2x68K LWB split optimization, since it was found to cause LWB buffer overflow while trying to write 128KB TX_CLONE_RANGE record with 1022 block pointers into 68KB buffer, with multiple VDEVs ZIL. This commit adds a test for this particular scenario by writing maximum sizes TX_CLONE_RANE record with 1022 block pointers into 68KB buffer, with two SLOG devices. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Ameer Hamza <ahamza@ixsystems.com> Signed-off-by: Umer Saleem <usaleem@ixsystems.com> Closes #15672	2023-12-15 14:18:27 -08:00
Brian Behlendorf	f4b97c1e00	ZTS: Update raidz_expand_005_pos.ksh Align the raidz_expand_005_pos test with the raidz_expand_004_pos test and only verify no errors were reported. Allow scrub repair IO. Reviewed-by: Don Brady <don.brady@klarasystems.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #15663	2023-12-12 09:56:19 -08:00
Tony Hutter	b1748eaee0	ZTS: Add dirty dnode stress test Add a test for the dirty dnode SEEK_HOLE/SEEK_DATA bug described in https://github.com/openzfs/zfs/issues/15526 The bug was fixed in https://github.com/openzfs/zfs/pull/15571 and was backported to 2.2.2 and 2.1.14. This test case is just to make sure it does not come back. seekflood.c originally written by Rob Norris. Reviewed-by: Graham Perrin <grahamperrin@freebsd.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rob Norris <robn@despairlabs.com> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #15608	2023-12-11 09:59:59 -08:00
Brian Behlendorf	9f6ad4dcb6	ZTS: Add zpool_import_status.ksh to Makefile.am The zpool_import_status.ksh test case was not being run because it was not included in the Makefile.am. Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #15655	2023-12-11 09:40:32 -08:00
Brian Behlendorf	8ad73bf449	ZTS: Disable io_uring test on CentOS 9 The io_uring test fails on CentOS 9 with the following fio error. Disable the test for the benefit of the CI until this can be fully investigated. This basic test passes as expected on newer kernels. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #15636	2023-12-08 17:31:31 -08:00
Ameer Hamza	2ebb9a4811	ZTS: Add test cases for block cloning replay Reviewed-by: Kay Pedersen <mail@mkwg.de> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Ameer Hamza <ahamza@ixsystems.com> Closes #15614	2023-12-08 16:39:14 -08:00
Ameer Hamza	5ff43969e7	ZTS: block_cloning: Use numeric sort for get_same_blocks Reviewed-by: Kay Pedersen <mail@mkwg.de> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Ameer Hamza <ahamza@ixsystems.com> Closes #15614	2023-12-08 16:38:45 -08:00
Don Brady	687e4d7f9c	Extend import_progress kstat with a notes field Detail the import progress of log spacemaps as they can take a very long time. Also grab the spa_note() messages to, as they provide insight into what is happening Sponsored-By: OpenDrives Inc. Sponsored-By: Klara Inc. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Don Brady <don.brady@klarasystems.com> Co-authored-by: Allan Jude <allan@klarasystems.com> Closes #15539	2023-12-05 14:27:56 -08:00
oromenahar	c7b6119268	Allow block cloning across encrypted datasets When two datasets share the same master encryption key, it is safe to clone encrypted blocks. Currently only snapshots and clones of a dataset share with it the same encryption key. Added a test for: - Clone from encrypted sibling to encrypted sibling with non encrypted parent - Clone from encrypted parent to inherited encrypted child - Clone from child to sibling with encrypted parent - Clone from snapshot to the original datasets - Clone from foreign snapshot to a foreign dataset - Cloning from non-encrypted to encrypted datasets - Cloning from encrypted to non-encrypted datasets Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Original-patch-by: Pawel Jakub Dawidek <pawel@dawidek.net> Signed-off-by: Kay Pedersen <mail@mkwg.de> Closes #15544	2023-12-05 11:03:48 -08:00
VaibhavB	7d68900af3	ZTS: get_persistent_disk_name can return truncated names Instead of using only the 3rd element return the entire string after the split to handle device names with dashes. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Vaibhav Bhanawat <vaibhav.bhanawat@delphix.com> Closes #15567	2023-11-29 09:34:29 -08:00
Rob Norris	213d682967	zdb: show BRT statistics and dump its contents Same idea as the dedup stats, but for block cloning. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Kay Pedersen <mail@mkwg.de> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #15541	2023-11-27 13:35:07 -08:00
Tony Hutter	3551a32e5e	ZTS: Fix zfs_load-key failures on F39 The zfs_load-key tests were failing on F39 due to their use of the deprecated ssl.wrap_socket function. This commit updates the test to instead use ssl.SSLContext() as described in https://stackoverflow.com/a/65194957. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #15534 Closes #15550	2023-11-27 13:24:37 -08:00
Tony Hutter	a94860a6de	ZTS: Fix 'could not unmount datasets' on Alma 9 (#15542 ) Many tests are failing on AlmaLinux 9 because ZTS could not destroy the pool in cleanup. This was due to $PWD being set to '.' instead of the expected full path. This patch sets $PWD to the full path. Signed-off-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Don Brady <don.brady@delphix.com>	2023-11-20 16:07:32 -08:00
Rich Ercolani	03e9caaec0	Add a tunable to disable BRT support. Copy the disable parameter that FreeBSD implemented, and extend it to work on Linux as well, until we're sure this is stable. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes #15529	2023-11-16 11:35:22 -08:00
Alexander Motin	35da345160	L2ARC: Restrict write size to 1/4 of the device PR #15457 exposed weird logic in L2ARC write sizing. If it appeared bigger than device size, instead of liming write it reset all the system-wide tunables to their default. Aside of being excessive, it did not actually help with the problem, still allowing infinite loop to happen. This patch removes the tunables reverting logic, but instead limits L2ARC writes (or at least eviction/trim) to 1/4 of the capacity. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Amanakis <gamanakis@gmail.com> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #15519	2023-11-14 13:47:57 -08:00
Umer Saleem	40fccc423a	ZTS: Test for all known zpool feature sets zpool_create_features_007_pos only tested for compat-2020 feature set. It would be useful to test for all known features sets. If any additional feature is found enabled that is not present in compatibility list or feature set, it should be caught and reported earlier. This commit also removes encryption from openzfsonosx-1.8.1 compatibility list. Encryption enables bookmark_v2, since it is a dependency of encryption, but not listed in openzfsonoxx-1.8.1 compatibility list. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Umer Saleem <usaleem@ixsystems.com> Closes #15505	2023-11-09 10:58:23 -08:00
Don Brady	5caeef02fa	RAID-Z expansion feature This feature allows disks to be added one at a time to a RAID-Z group, expanding its capacity incrementally. This feature is especially useful for small pools (typically with only one RAID-Z group), where there isn't sufficient hardware to add capacity by adding a whole new RAID-Z group (typically doubling the number of disks). == Initiating expansion == A new device (disk) can be attached to an existing RAIDZ vdev, by running `zpool attach POOL raidzP-N NEW_DEVICE`, e.g. `zpool attach tank raidz2-0 sda`. The new device will become part of the RAIDZ group. A "raidz expansion" will be initiated, and the new device will contribute additional space to the RAIDZ group once the expansion completes. The `feature@raidz_expansion` on-disk feature flag must be `enabled` to initiate an expansion, and it remains `active` for the life of the pool. In other words, pools with expanded RAIDZ vdevs can not be imported by older releases of the ZFS software. == During expansion == The expansion entails reading all allocated space from existing disks in the RAIDZ group, and rewriting it to the new disks in the RAIDZ group (including the newly added device). The expansion progress can be monitored with `zpool status`. Data redundancy is maintained during (and after) the expansion. If a disk fails while the expansion is in progress, the expansion pauses until the health of the RAIDZ vdev is restored (e.g. by replacing the failed disk and waiting for reconstruction to complete). The pool remains accessible during expansion. Following a reboot or export/import, the expansion resumes where it left off. == After expansion == When the expansion completes, the additional space is available for use, and is reflected in the `available` zfs property (as seen in `zfs list`, `df`, etc). Expansion does not change the number of failures that can be tolerated without data loss (e.g. a RAIDZ2 is still a RAIDZ2 even after expansion). A RAIDZ vdev can be expanded multiple times. After the expansion completes, old blocks remain with their old data-to-parity ratio (e.g. 5-wide RAIDZ2, has 3 data to 2 parity), but distributed among the larger set of disks. New blocks will be written with the new data-to-parity ratio (e.g. a 5-wide RAIDZ2 which has been expanded once to 6-wide, has 4 data to 2 parity). However, the RAIDZ vdev's "assumed parity ratio" does not change, so slightly less space than is expected may be reported for newly-written blocks, according to `zfs list`, `df`, `ls -s`, and similar tools. Sponsored-by: The FreeBSD Foundation Sponsored-by: iXsystems, Inc. Sponsored-by: vStack Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Mark Maybee <mark.maybee@delphix.com> Authored-by: Matthew Ahrens <mahrens@delphix.com> Contributions-by: Fedor Uporov <fuporov.vstack@gmail.com> Contributions-by: Stuart Maybee <stuart.maybee@comcast.net> Contributions-by: Thorsten Behrens <tbehrens@outlook.com> Contributions-by: Fmstrat <nospam@nowsci.com> Contributions-by: Don Brady <dev.fs.zfs@gmail.com> Signed-off-by: Don Brady <dev.fs.zfs@gmail.com> Closes #15022	2023-11-08 10:19:41 -08:00
Tony Hutter	05c4710e89	Revert "zvol: Temporally disable blk-mq" This reverts commit `aefb6a2bd6`. `aefb6a2bd` temporally disabled blk-mq until we could fix a fix for Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #15439	2023-10-24 14:41:25 -07:00
VaibhavB	de7b1ae30a	run-zts test procfs/pool_state failed with uncorrectable I/O failure Once we trigger the zpool scrub, all zpool/zfs command gets stuck for 180 seconds. Post 180 seconds zpool/zfs commands gets start executing however few more seconds(10s) it take to update the status. hence sleeping for 200 seconds so that we get the correct status. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: vaibhav.bhanawat <vaibhav.bhanawat@delphix.com> Closes #15364	2023-10-20 11:57:39 -07:00
dennisfriedrichsen	0d6cec418e	Fix typo in tests/zfs-tests/tests/functional/cli_user/misc/misc.cfg Reviewed-by: Rob N <robn@despairlabs.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Dennis R. Friedrichsen <dennis.r.friedrichsen@gmail.com> Closes #15417	2023-10-20 11:52:13 -07:00
Don Brady	f0f330e121	Fix ZED auto-replace for VDEVs using by-id paths The change is simple -- restore the original code so that the VDEV path is updated when using by-id paths. The more challenging part was to devise a second ZTS test, that would test auto-replace for 'by-id' and help prevent a future regression. With that new test, we can now do an A\|B test with , and without, the fix to confirm that auto-replace for by-id paths works. The existing auto-replace test, functional/fault/auto_replace_001_pos, will confirm that we didn't break auto-replace for 'by-vdev' paths. In the original functional/fault/auto_replace_001_pos test, the disk wipe (using dd) was not effective in removing the partitioning since the kernel was never informed of the wipe. Added a call to wipefs(8) so that the kernel is informed and ZED will re-partition the device. Added a validation step that the re-partitioning occurred by confirming that the GPT partition UUID changes. Sponsored-By: OpenDrives Inc. Sponsored-By: Klara Inc. Reviewed-by: Rob Norris <rob.norris@klarasystems.com> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Don Brady <don.brady@klarasystems.com> Closes #15363	2023-10-20 09:29:02 -07:00
Brian Behlendorf	9facf2d1ad	ZTS: Debug zfs_share_concurrent_shares failure Update zfs_share_concurrent_shares test case to wait a few seconds and recheck that the filesystem isn't shared. The intent here is determine the nature of the error and if it may be a race. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Umer Saleem <usaleem@ixsystems.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #15379	2023-10-10 13:32:33 -07:00
Tony Hutter	aefb6a2bd6	zvol: Temporally disable blk-mq There was a report of zvol data loss (#15351) after enabling blk-mq on a zvol backed with 16k physical block sized disks. Out of an abundance of caution, do not allow the user to enable blk-mq until we can look into the issue. Note that blk-mq was not enabled by default on zvols. It was always opt-in via the zvol_use_blk_mq module parameter. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Addresses: #15351 Closes #15378	2023-10-10 08:57:48 -07:00
Umer Saleem	250349ffff	ZTS: Fix verify_fs_mount in delegate_common.kshlib verify_fs_mount expects the dataset to remain unmounted after updating the mountpoint property in delegate_common.kshlib. This commit updates verify_fs_mount and uses nomount parameter for zfs set to update the mountpoint property without mounting the dataset. This fixes the zfs_allow_010_pos test case, which was failing on FreeBSD after the behavior update in setting the mountpoint property. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Umer Saleem <usaleem@ixsystems.com> Closes #15376	2023-10-09 17:24:24 -07:00
Alexander Motin	008baa091f	FreeBSD: Reduce divergence from in-tree sources This includes random small tweaks, primarily a build fixes, required when ZFS is built as part of FreeBSD base. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #15368	2023-10-09 13:27:18 -07:00
Rob Norris	54b1b1d893	import: require force when cachefile hostid doesn't match on-disk Previously, if a cachefile is passed to zpool import, the cached config is mostly offered as-is to ZFS_IOC_POOL_TRYIMPORT->spa_tryimport(), and the results are taken as the canonical pool config and handed back to ZFS_IOC_POOL_IMPORT. In the course of its operation, spa_load() will inspect the pool and build a new config from what it finds on disk. However, it then regenerates a new config ready to import, and so rightly sets the hostid and hostname for the local host in the config it returns. Because of this, the "require force" checks always decide the pool is exported and last touched by the local host, even if this is not true, which is possible in a HA environment when MMP is not enabled. The pool may be imported on another head, but the import checks still pass here, so the pool ends up imported on both. (This doesn't happen when a cachefile isn't used, because the pool config is discovered in userspace in zpool_find_import(), and that does find the on-disk hostid and hostname correctly). Since the systemd zfs-import-cache.service unit uses cachefile imports, this can lead to a system returning after a crash with a "valid" cachefile on disk and automatically, quietly, importing a pool that has already been taken up by a secondary head. This commit causes the on-disk hostid and hostname to be included in the ZPOOL_CONFIG_LOAD_INFO item in the returned config, and then changes the "force" checks for zpool import to use them if present. This method should give no change in behaviour for old userspace on new kernels (they won't know to look for the new config items) and for new userspace on old kernels (the won't find the new config items). Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Closes #15290	2023-10-06 09:24:44 -07:00
Rob Norris	8f5aa8cb00	tests: add tests for zpool import behaviour when hostid changes Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Closes #15290	2023-10-06 09:23:39 -07:00
Rob N	4848a0898e	tests/block_cloning: sync before write in fallback test We're still seeing this test fail intermittently (that is, the clone happens), which must mean the write and the clone can still be happening on different txgs. It might be that there's still activity after the pool is created. So here we force a sync before starting the write. Sponsored-By: Klara Inc. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #15359	2023-10-06 08:39:20 -07:00
Umer Saleem	4e16964e1c	Add '-u' - nomount flag for zfs set This commit adds '-u' flag for zfs set operation. With this flag, mountpoint, sharenfs and sharesmb properties can be updated without actually mounting or sharing the dataset. Previously, if dataset was unmounted, and mountpoint property was updated, dataset was not mounted after the update. This behavior is changed in #15240. We mount the dataset whenever mountpoint property is updated, regardless if it's mounted or not. To provide the user with option to keep the dataset unmounted and still update the mountpoint without mounting the dataset, '-u' flag can be used. If any of mountpoint, sharenfs or sharesmb properties are updated with '-u' flag, the property is set to desired value but the operation to (re/un)mount and/or (re/un)share the dataset is not performed and dataset remains as it was before. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Umer Saleem <usaleem@ixsystems.com> Closes #15322	2023-10-02 16:58:54 -07:00
Akash B	ba769ea351	Fix ENOSPC for extended quota When unlinking multiple files from a pool at 100% capacity, it was possible for ENOSPC to be returned after the first few unlinks. This issue was fixed previously by PR #13172 but then this was again introduced by PR #13839. This is resolved using the existing mechanism of returning ERESTART when over quota as long as we know enough space will shortly be available after processing the pending deferred frees. Also, updated the existing testcase which reliably reproduced the issue without this patch. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Dipak Ghosh <dipak.ghosh@hpe.com> Signed-off-by: Akash B <akash-b@hpe.com> Closes #15312	2023-09-28 14:10:07 -07:00
Paul Dagnelie	ec994486b1	Reduce trim min size even lower for tests to reduce flakiness Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Paul Dagnelie <pcd@delphix.com> Closes #15315	2023-09-27 12:06:24 -07:00
Paul Dagnelie	a59e294e21	ZTS: Fix introduced test bug in block_cloning_copyfilerange Reviewed-by: John Wren Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Paul Dagnelie <pcd@delphix.com> Closes #15316	2023-09-26 14:37:28 -07:00
Paul Dagnelie	8a9ba769ee	Set timeout before creating pool in test Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rob Norris <rob.norris@klarasystems.com> Signed-off-by: Paul Dagnelie <pcd@delphix.com> Closes #15309	2023-09-25 11:14:00 -07:00
Rob N	2dc89b922b	tests/block_cloning: try harder to stay on same txg in fallback test We've observed this test failing intermittently. When it does, the "same block" check shows that both files have the same content, that is, the file was cloned. The only way this could have happened is if the open txg moved between the dd and clonefile calls. That's possible because although we set zfs_txg_timeout to be large, that only affects the wait time in the sync thread at the start of a new txg; it doesn't change anything if its currently waiting or working. So here we just force the txgs to move immediately before, which should get both operations onto the same txg as intented. Sponsored-By: OpenDrives Inc. Sponsored-By: Klara Inc. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris Rob Norris <rob.norris@klarasystems.com> Closes #15303	2023-09-21 17:54:15 -07:00
Umer Saleem	bbac1d2977	Update the behavior of mountpoint property There are some inconsistencies in the handling of mountpoint property. This commit updates the behavior and makes it consistent. If mountpoint property is set when dataset is unmounted, this would update the mountpoint property. The mountpoint could be valid or invalid in this case. Setting the mountpoint property would result in success in this case. Dataset would still be unmounted here. On the other hand, if dataset is mounted and mountpoint property is updated to something invalid where mount cannot be successful, for example, setting the mountpoint inside a readonly directory. This would unmount the dataset, set the mountpoint property to requested value and tries to mount the dataset. The mount operation returns error and this error is treated as overall failure of setting the property while the property is actually set. To make the behavior consistent in case dataset is mounted or unmounted, we should try to mount the dataset whenever mountpoint property is updated. This would result in mounting the datasets if canmount property is set to on, regardless if the dataset was previously unmounted. The failure in mount operation while setting the mountpoint property should not be treated as failure, since the property is actually set now to user requested value. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Ameer Hamza <ahamza@ixsystems.com> Signed-off-by: Umer Saleem <usaleem@ixsystems.com> Closes #15240	2023-09-19 17:15:24 -07:00
Rob N	7228ba1114	cmd: add 'help' subcommand to zpool and zfs 'program help subcommand' is a reasonably common pattern for multifunction command-line programs. This commit adds support for that style to the zpool and zfs commands. When run as 'zpool help [<topic>]' or 'zfs help [<topic>]', executes the 'man' program on the PATH with the most likely manpage name for the requested topic: "zpool-<topic>" or "zfs-<topic>" for subcommands, or "zpool<topic>" or "zfs<topic>" for the "concepts" and "props" topics. If no topic is supplied, uses the top "zpool" or "zfs" pages. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Kay Pedersen <mail@mkwg.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #15288	2023-09-19 09:06:47 -07:00
Rob N	6cb933c56e	tests: install missing PAM tests 'pam_change_unmounted' and 'pam_recursive' both exist and are referenced by the test run config, but weren't being installed and so are excluded. This gets them installed so they will run as expected. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Kay Pedersen <mail@mkwg.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #15291	2023-09-19 08:48:02 -07:00
ednadolski-ix	0ee9b02390	update max_variance limit in zdb_block_size_histogram test for CI Commit `2d7843401a` had previously updated this hardcoded limit to allow for CI testing. As there is no deterministic pass/fail value, the need has arisen for one more small increase. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Edmund Nadolski <edmund.nadolski@ixsystems.com> Closes #15252	2023-09-09 10:23:29 -07:00
Paul Dagnelie	bee9cfb813	Increase limit of redaction list by using spill block Currently redaction bookmarks and their associated redaction lists have a relatively low limit of 36 redaction snapshots. This is imposed by the number of snapshot GUIDs that fit in the bonus buffer of the redaction list object. While this is more than enough for most use cases, there are some limited cases where larger numbers would be useful to support. We tweak the redaction list creation code to use a spill block if the number of redaction snapshots is above the amount that would fit in the bonus buffer. We also make a small change to allow spill blocks to be use for types of data besides SA. In order to fully leverage this logic, we also change the redaction code to use vmem_alloc, to handle extremely large allocations if needed. Finally, small tweaks were made to the zfs commands and the test suite. Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Signed-off-by: Paul Dagnelie <pcd@delphix.com> Closes #15018	2023-08-26 11:34:43 -07:00
Rob N	804414aad2	tests/block_cloning: rename and document get_same_blocks helper `get_same_blocks` is a helper to compare two files and return a list of the blocks that are clones of each other. Its very necessary for block cloning tests. Previously it was incorrectly called `unique_blocks`, which is the _inverse_ of what it does (an early version did list unique blocks; it was changed but the name was not). So if nothing else, it should be called `duplicate_blocks`. But, keeping the details of a clone operation in your head is actually quite difficult, without the additional overhead of wondering how the tools work. So I've renamed it to better describe what it does, added a usage note, and changed it to return block indexes from 0 instead of 1, to match how L0 blocks are normally counted. Reviewed-by: Umer Saleem <usaleem@ixsystems.com> Reviewed-by: Kay Pedersen <mail@mkwg.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #15181	2023-08-25 10:31:29 -07:00
Rob N	cae502c175	copy_file_range: fix fallback when source create on same txg In `019dea0a5` we removed the conversion from EAGAIN->EXDEV inside zfs_clone_range(), but forgot to add a test for EAGAIN to the copy_file_range() entry points to trigger fallback to a content copy. This commit fixes that. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Kay Pedersen <mail@mkwg.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #15170 Closes #15172	2023-08-14 17:34:14 -07:00
Rob N	ead3eea3e0	linux/copy_file_range: properly request a fallback copy on Linux <5.3 Before Linux 5.3, the filesystem's copy_file_range handler had to signal back to the kernel that we can't fulfill the request and it should fallback to a content copy. This is done by returning -EOPNOTSUPP. This commit converts the EXDEV return from zfs_clone_range to EOPNOTSUPP, to force the kernel to fallback for all the valid reasons it might be unable to clone. Without it the copy_file_range() syscall will return EXDEV to userspace, breaking its semantics. Add test for copy_file_range fallbacks. copy_file_range should always fallback to a content copy whenever ZFS can't service the request with cloning. Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Kay Pedersen <mail@mkwg.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #15131	2023-08-01 11:31:11 -07:00
Zach Dykstra	fcd61d937f	readmmap.c: fix building with MUSL libc glibc includes sys/types.h from stdlib.h. This is not the case for MUSL, so explicitly include it. Fixes usage of uint_t. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Zach Dykstra <dykstra.zachary@gmail.com> Closes #15130	2023-08-01 09:01:32 -07:00
oromenahar	e9c59310f7	Check the return value in clonefile test Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rob Norris <rob.norris@klarasystems.com> Signed-off-by: Kay Pedersen <mail@mkwg.de> Closes #15128	2023-08-01 08:26:12 -07:00
Rob Norris	48d0e9465d	zts: block cloning tests Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Kay Pedersen <mail@mkwg.de> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Sponsored-By: OpenDrives Inc. Sponsored-By: Klara Inc. Closes #15050 Closes #405 Closes #13349	2023-07-24 16:37:58 -07:00
Ameer Hamza	4d2dad04aa	Ignore pool ashift property during vdev attachment Ashift can be set for a vdev only during its creation, and the top-level vdev does not change when a vdev is attached or replaced. The ashift property should not be used during attachment, as it does not allow attaching/replacing a vdev if the pool's ashift property is increased after the existing vdev was created. Instead, we should be able to attach the vdev if the attached vdev can satisfy the ashift requirement with its parent. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Ameer Hamza <ahamza@ixsystems.com> Closes #15061	2023-07-20 09:57:16 -07:00
George Amanakis	10e36e1761	Shorten arcstat_quiescence sleep time With the latest L2ARC fixes, 2 seconds is too long to wait for quiescence of arcstats like l2_size. Shorten this interval to avoid having the persistent L2ARC tests in ZTS prematurely terminated. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: George Amanakis <gamanakis@gmail.com> Closes #14981	2023-06-15 12:45:36 -07:00
Brian Behlendorf	e32e326c5b	ZTS: Skip send_raw_ashift on FreeBSD On FreeBSD 14 this test runs slowly in the CI environment and is killed by the 10 minute timeout. Skip the test on FreeBSD until the slow down is resolved. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #14961	2023-06-14 08:04:05 -07:00
George Amanakis	8af1104f83	Store the L2ARC device ashift in the vdev label If this is not done, and the pool has an ashift other than the default (at the moment 9) then the following happens: 1) vdev_alloc() assigns the ashift of the pool to L2ARC device, but upon export it is not stored anywhere 2) at the first import, vdev_open() sees an vdev_ashift() of 0 and assigns the logical_ashift, which is 9 3) reading the contents of L2ARC, including the header fails 4) L2ARC buffers are not restored in ARC. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: George Amanakis <gamanakis@gmail.com> Closes #14313 Closes #14963	2023-06-14 08:01:17 -07:00
Brian Behlendorf	6db4ed51d6	ZTS: Skip checkpoint_discard_busy Until the ASSERT which is occasionally hit while running checkpoint_discard_busy is resolved skip this test case. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #12053 Closes #14952	2023-06-09 11:10:01 -07:00
Rob Norris	8653f1de48	zdb: add -B option to generate backup stream This is more-or-less like `zfs send`, but specifying the snapshot by its objset id for situations where it can't be referenced any other way. Sponsored-By: Klara, Inc. Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Reviewed-by: WHR <msl0000023508@gmail.com> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #14642	2023-06-05 11:54:42 -07:00
Val Packett	db994458bb	PAM: support password changes even when not mounted There's usually no requirement that a user be logged in for changing their password, so let's not be surprising here. We need to use the fetch_lazy mechanism for the old password to avoid a double prompt for it, so that mechanism is now generalized a bit. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Felix Dörre <felix@dogcraft.de> Signed-off-by: Val Packett <val@packett.cool> Closes #14834	2023-05-31 17:01:11 -07:00
Val Packett	850bccd3bc	PAM: add 'recursive_homes' flag to use with 'prop_mountpoint' It's not always desirable to have a fixed flat homes directory. With the 'recursive_homes' flag, 'prop_mountpoint' search would traverse the whole tree starting at 'homes' (which can now be '*' to mean all pools) to find a dataset with a mountpoint matching the home directory. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Felix Dörre <felix@dogcraft.de> Signed-off-by: Val Packett <val@packett.cool> Closes #14834	2023-05-31 17:00:58 -07:00
Brian Behlendorf	e085e98d54	ZTS: zvol_misc_trim disable blk mq Disable the zvol_misc_fua.ksh and zvol_misc_trim.ksh test cases on impacted kernels. This issue is being actively worked in #14872 and as part of that fix this commit will be reverted. VERIFY(zh->zh_claim_txg == 0) failed PANIC at zil.c:904:zil_create() Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #14872 Closes #14870	2023-05-29 12:55:35 -07:00
Richard Yao	677c6f8457	btree: Implement faster binary search algorithm This implements a binary search algorithm for B-Trees that reduces branching to the absolute minimum necessary for a binary search algorithm. It also enables the compiler to inline the comparator to ensure that the only slowdown when doing binary search is from waiting for memory accesses. Additionally, it instructs the compiler to unroll the loop, which gives an additional 40% improve with Clang and 8% improvement with GCC. Consumers must opt into using the faster algorithm. At present, only B-Trees used inside kernel code have been modified to use the faster algorithm. Micro-benchmarks suggest that this can improve binary search performance by up to 3.5 times when compiling with Clang 16 and up to 1.9 times when compiling with GCC 12.2. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Closes #14866	2023-05-26 10:03:12 -07:00
Alexander Motin	b6fbe61fa6	zil: Add some more statistics. In addition to a number of actual log bytes written, account also a total written bytes including padding and total allocated bytes (bytes <= write <= alloc). It should allow to monitor zil traffic and space efficiency. Add dtrace probe for zil block size selection. Make zilstat report more information and fit it into less width. Reviewed-by: Ameer Hamza <ahamza@ixsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #14863	2023-05-25 13:51:53 -07:00
Akash B	9d618615d1	Fix concurrent resilvers initiated at same time For draid vdevs it was possible to initiate both the sequential and healing resilver at same time. This fixes the following two scenarios. 1) There's a window where a sequential rebuild can be started via ZED even if a healing resilver has been scheduled. - This is fixed by adding additional check in spa_vdev_attach() for any scheduled resilver and return appropriate error code when a resilver is already in progress. 2) It was possible for zpool clear to start a healing resilver when it wasn't needed at all. This occurs because during a vdev_open() the device is presumed to be healthy not until the device is validated by vdev_validate() and it's set unavailable. However, by this point an async resilver will have already been requested if the DTL isn't empty. - This is fixed by cancelling the SPA_ASYNC_RESILVER request immediately at the end of vdev_reopen() when a resilver is unneeded. Finally, added a testcase in ZTS for verification. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Dipak Ghosh <dipak.ghosh@hpe.com> Signed-off-by: Akash B <akash-b@hpe.com> Closes #14881 Closes #14892	2023-05-24 12:28:09 -07:00
George Amanakis	482eeef804	Teach zpool scrub to scrub only blocks in error log Added a flag '-e' in zpool scrub to scrub only blocks in error log. A user can pause, resume and cancel the error scrub by passing additional command line arguments -p -s just like a regular scrub. This involves adding a new flag, creating new libzfs interfaces, a new ioctl, and the actual iteration and read-issuing logic. Error scrubbing is executed in multiple txg to make sure pool performance is not affected. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Co-authored-by: TulsiJain tulsi.jain@delphix.com Signed-off-by: George Amanakis <gamanakis@gmail.com> Closes #8995 Closes #12355	2023-05-18 11:59:42 -07:00
Brian Behlendorf	e34e15ed6d	Add the ability to uninitialize zpool initialize functions well for touching every free byte...once. But if we want to do it again, we're currently out of luck. So let's add zpool initialize -u to clear it. Co-authored-by: Rich Ercolani <rincebrain@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes #12451 Closes #14873	2023-05-18 10:02:20 -07:00
Don Brady	da211a4a33	Refine special_small_blocks property validation When the special_small_blocks property is being set during a pool create it enforces a limit of 128KiB even if the pool's record size is larger. If the recordsize property is being set during a pool create, then use that value instead of the default SPA_OLD_MAXBLOCKSIZE value. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Don Brady <dev.fs.zfs@gmail.com> Closes #13815 Closes #14811	2023-05-12 09:12:28 -07:00
Brian Behlendorf	c8b3dda186	Debug auto_replace_001_pos failures Reduced the timeout to 60 seconds which should be more than sufficient and allow the test to be marked as FAILED rather than KILLED. Also dump the pool status on cleanup. Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #14829	2023-05-09 08:57:02 -07:00
George Amanakis	d38c815fe2	Remove duplicate code in l2arc_evict() l2arc_evict() performs the adjustment of the size of buffers to be written on L2ARC unnecessarily. l2arc_write_size() is called right before l2arc_evict() and performs those adjustments. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Signed-off-by: George Amanakis <gamanakis@gmail.com> Closes #14828	2023-05-09 08:54:41 -07:00
George Amanakis	6839ec6f10	Enable the head_errlog feature to remove errors In case check_filesystem() does not error out and does not report an error, remove that error block from error lists and logs without requiring a scrub. This can happen when the original file and all snapshots/clones referencing it have been removed. Otherwise zpool status will still report that "Permanent errors have been detected..." without actually reporting any of them. To implement this change the functions introduced in corrective receive were modified to take into account the head_errlog feature. Before this change: ============================= pool: test state: ONLINE status: One or more devices has experienced an error resulting in data corruption. Applications may be affected. action: Restore the file in question if possible. Otherwise restore the entire pool from backup. see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A config: NAME STATE READ WRITE CKSUM test ONLINE 0 0 0 /home/user/vdev_a ONLINE 0 0 2 errors: Permanent errors have been detected in the following files: ============================= After this change: ============================= pool: test state: ONLINE status: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected. action: Determine if the device needs to be replaced, and clear the errors using 'zpool clear' or replace the device with 'zpool replace'. see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P config: NAME STATE READ WRITE CKSUM test ONLINE 0 0 0 /home/user/vdev_a ONLINE 0 0 2 errors: No known data errors ============================= Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Signed-off-by: George Amanakis <gamanakis@gmail.com> Closes #14813	2023-05-09 08:53:27 -07:00
George Amanakis	4eca03faaf	Fixes in head_errlog feature with encryption For the head_errlog feature use dsl_dataset_hold_obj_flags() instead of dsl_dataset_hold_obj() in order to enable access to the encryption keys (if loaded). This enables reporting of errors in encrypted filesystems which are not mounted but have their keys loaded. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: George Amanakis <gamanakis@gmail.com> Closes #14837	2023-05-08 13:35:03 -07:00
Pawel Jakub Dawidek	6fa6bb051c	Simplify and optimize random_int_between(). Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Pawel Jakub Dawidek <pawel@dawidek.net> Closes #14805	2023-05-05 09:09:12 -07:00
Ameer Hamza	82ac409acc	zpool import -m also removing spare and cache when log device is missing spa_import() relies on a pool config fetched by spa_try_import() for spare/cache devices. Import flags are not passed to spa_tryimport(), which makes it return early due to a missing log device and missing retrieving the cache device and spare eventually. Passing ZFS_IMPORT_MISSING_LOG to spa_tryimport() makes it fetch the correct configuration regardless of the missing log device. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ameer Hamza <ahamza@ixsystems.com> Closes #14794	2023-05-03 15:10:32 -07:00

1 2 3 4 5 ...

1262 Commits