mirror_zfs

mirror of https://git.proxmox.com/git/mirror_zfs.git synced 2026-05-21 18:26:47 +03:00

Author	SHA1	Message	Date
shuppy	6eef5cdc94	ZTS: add regression test for #17180 In #17180, we fixed an interesting bug that i believe i hit in one of my pools, but as far as i can tell, there was no test for it. this patch adds a regression test for #17180, minimised from my attempts to reproduce the bug in a way that resembled the history of my pool. Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Adam Moss <c@yotes.com> Signed-off-by: delan azabani <dazabani@igalia.com> Closes #18109	2026-01-06 09:33:03 -08:00
Dimitry Andric	2dbd6af5e4	Rename several printf attributes declarations to __printf__ For kernel builds on FreeBSD, we redefine `__printf__` to `__freebsd_kprintf__`, to support FreeBSD kernel printf(9) extensions with clang. In OpenZFS various printf related functions are declared with `__attribute__((format(printf, X, Y)))`, so these won't work with the above redefinition. With clang 21 and higher, this leads to errors similar to: sys/contrib/openzfs/module/zfs/spa_misc.c:414:38: error: passing 'printf' format string where 'freebsd_kprintf' format string is expected [-Werror,-Wformat] 414 \| (void) vsnprintf(buf, sizeof (buf), fmt, adx); \| ^ Since attribute names can always be spelled with leading and trailing double underscores, rename these instances. Note that in the FreeBSD base system we usually use `__printflike` from `<sys/cdefs.h>`, but that does not apply to OpenZFS. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com> Signed-off-by: Dimitry Andric <dimitry@andric.com> Closes #18095	2026-01-05 14:15:22 -08:00
Andrew Walker	312bdab0f5	Add handling for STATX_CHANGE_COOKIE This commit adds handling for the STATX_CHANGE_COOKIE so that we can properly surface the ZFS znode sequence to NFS clients via knfsd. If knfsd does not have STATX_CHANGE_COOKIE in statx result then it will synthesize the NFS change_info4 structure and related change4id values algorithmically based on the ctime value of the file. Since internally ZFS is using ktime_get_coarse_real_ts64() for the timestamp calculation here it introduces the possibility that the change will not increment the change4id of directories / files causing a failure in the client to invalidate its attr cache (among other things). See RFC 8881 Section 10.8 for discussion of how clients may implement name and directory caching. Notable in this commit is that we are not initializing the inode->i_version to the znode->z_seq number. The reason for this is that we're intentionally not setting `SB_I_VERSION`. This indicates that the filesystem manages its own i_version and so it is not populated in the generic_fillattr. The following compares tight loop of setattr over NFSv4 protocol while tracing nfsd4_change_attribute. Before change: inode, change_attribute 4723, 7590032215978780890 4723, 7590032215978780890 4723, 7590032215978780890 4723, 7590032215982780865 4723, 7590032215982780865 After change: inode, change_attribute 7602, 7590032992517123951 7602, 7590032992517123952 7602, 7590032992517123953 7602, 7590032992517123954 7602, 7590032992517123955 Reviewed-by: Ameer Hamza <ahamza@ixsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com> Signed-off-by: Andrew Walker <andrew.walker@truenas.com> Closes #18097	2026-01-05 14:06:28 -08:00
Rob Norris	a1319bf654	kmem: don't add __GFP_RECLAIMABLE for KM_VMEM allocations vmalloc()'d memory is not movable/reclaimable, so __GFP_RECLAIMABLE is not a valid flag, and since 6.19 the kernel warns if you use it. Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #18107	2026-01-05 13:35:13 -08:00
Ivan Shapovalov	dbb3f247ed	cmd/zfs: clone: accept `-u` to not mount newly created datasets Signed-off-by: Ivan Shapovalov <intelfx@intelfx.name> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com> Closes #18080	2026-01-05 12:21:56 -05:00
Alexander Moch	b9b84445ea	CI: Add Alpine Linux 3.23 runner to the pipeline (#18087 ) Add an Alpine Linux 3.23 runner to the CI chain to run OpenZFS builds and tests against musl libc. Currently, zfs_send_sparse is killed after 10 minutes on Alpine, causing cascading EBUSY failures in the test suite. With zfs_send_sparse disabled, the ZFS test suite reaches a pass rate of 94.62%. This commit introduces the required Alpine-specific setup and a small set of shell and cloud-init compatibility fixes that also apply to existing Linux runners. The Alpine runner is not enabled by default and is not executed for new pull requests. Sponsored-by: ERNW Research GmbH - https://ernw-research.de/ Signed-off-by: Alexander Moch <amoch@ernw.de> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>	2025-12-30 09:29:48 -08:00
Alexander Moch	e72f3054e3	cmd/ztest: avoid `PATH_MAX` stack allocation in `ztest_get_zdb_bin()` (#18085 ) Calling realpath(path, buf) can trigger fortified header wrappers that allocate a PATH_MAX-sized temporary buffer on the stack, exceeding the 4 KiB frame limit on some systems. Use the heap-allocating realpath(path, NULL) form instead. Sponsored-by: ERNW Research GmbH - https://ernw-research.de/ Signed-off-by: Alexander Moch <amoch@ernw.de> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov>	2025-12-29 11:16:34 -08:00
Rob Norris	f041375b52	kmem: don't add __GFP_COMP for KM_VMEM allocations It hasn't been necessary since Linux 3.13 (torvalds/linux@a57a49887e), and since 6.19 the kernel warns if you use it. Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #18053	2025-12-23 12:54:34 -08:00
Rob Norris	f95e306266	kmem: don't pass __GFP_HIGHMEM to __vmalloc Since Linux 4.12 (torvalds/linux@19809c2da2) __GFP_HIGHMEM has been automatically added to calls to __vmalloc() internally, so we don't need it anymore. This is good, because since 6.19 the kernel warns if you use __GFP_HIGHMEM. Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #18053	2025-12-23 12:54:11 -08:00
Rob Norris	3c8665cb5d	Linux 6.19: replace i_state access with inode_state_read_once() Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #18053	2025-12-23 12:53:32 -08:00
Ivan Shapovalov	3c4193333b	zed.d, contrib: fix shellcheck errors in scripts Not sure why this was not caught by CI; perhaps my shellcheck is new enough to catch more things. Signed-off-by: Ivan Shapovalov <intelfx@intelfx.name>	2025-12-23 11:12:21 -08:00
Ivan Shapovalov	e28d980d68	man: cosmetic: fix typos; use consistent spelling for "non-existing" Signed-off-by: Ivan Shapovalov <intelfx@intelfx.name>	2025-12-23 11:12:21 -08:00
Ivan Shapovalov	1e7280cece	zfs_main: cosmetic: add missing flag to the comment for create Signed-off-by: Ivan Shapovalov <intelfx@intelfx.name>	2025-12-23 11:12:21 -08:00
Ivan Shapovalov	9880ac3080	zvol: cosmetic: fix up `volthreading` property short name Signed-off-by: Ivan Shapovalov <intelfx@intelfx.name>	2025-12-23 11:12:21 -08:00
Rob Norris	654e7628d6	u8_textprep: move into module/zfs Now that it's built into the main zfs module in all cases, there's no reason to put it in its own dir. Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #18071	2025-12-22 14:58:36 -08:00
Rob Norris	309006a0c6	libunicode: merge into libzpool It's a single source file that is not used anywhere else, so there's no reason to keep it separate. Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #18071	2025-12-22 14:58:20 -08:00
Tony Hutter	648a9a2938	CI: Test 2.4.x in qemu-test-repo-vm.sh, quick mode The qemu-test-repo-vm.sh script tests installs ZFS from different repos. Have it test from the new 2.4.x repos as well. Also add a checkbox to run in "lookup mode". This just does a quick lookup to see what version is installed in each repo. It does not do a test install and module load. It only takes 3min to run vs over an hour for the full version. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #18070	2025-12-19 19:57:19 -08:00
Rob Norris	0d44b58d7f	libshare: fold into libzfs and reorg headers a little libzfs is the only user of libshare, and only internally, so there's no particular reason to build it separately, nor to export its symbols. So, pull it into libzfs proper, remove its "public" header, and hide its symbols. The bare minimum "public" API is just to count and enumerate the supported share types. These are moved to libzfs.h with the other share API. Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #18072	2025-12-19 19:52:33 -08:00
Alexander Motin	962e68865e	Use reduced precision for scan times Scan time limits do not need precision beyond 1ms. Switching scn_sync_start_time and spa_sync_starttime from gethrtime() to getlrtime() saves ~3% of CPU time during resilver scan stage. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com> Closes #18061	2025-12-18 10:22:11 -08:00
Alexander Motin	a83bb15fcd	Reduce minimal scrub/resilver times With higher throughput and lower latency of modern devices ZFS can happily live with pretty short (fractions of a second) TXGs. But the two decade old multi-second minimal time limits can almost stop payload writes by extending TXGs beyond dirty data limits of ARC ability to amortize it. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com> Closes #18060	2025-12-18 10:21:45 -08:00
Allan Jude	1d43387dd8	zdb: Add -O option for -r to specify object-id "zdb -r -O pool/dataset obj-id destination" will copy the file with object-id obj-id to the named destination; without -O it'll still be interpreted as a pathname. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Akash B <akash-b@hpe.com> Signed-off-by: Sean Eric Fagan <sean.fagan@klarasystems.com> Closes #16307	2025-12-18 09:25:09 -08:00
Mark Maybee	7ff329ac2e	Fix rangelock test for growing block size If the file already has more than one block, then the current block size cannot change. But if the file block size is less than the maximum block size supported by the file system, and there are multiple blocks in the file, the current code will almost always extend the rangelock to its maximum size. This means that all writes become serialized and even reads are slowed as they will more often contend with writes. This commit adjusts the test so that we will not lock the entire range if there is more than one block in the file already. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com> Signed-off-by: Mark Maybee <mark.maybee@perforce.com> Closes #18046 Closes #18064	2025-12-18 09:23:38 -08:00
Alexander Motin	051a8c7494	Bypass snprintf() in quota checks if no quotas set This improves synthetic 1 byte write speed by ~2.5%. Reviewed-by: Ameer Hamza <ahamza@ixsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com> Closes #18063	2025-12-17 21:59:47 -05:00
Alexander Motin	0550abd4b8	RAIDZ: Remove some excessive logging There were some per I/O logging into dbgmsg in RAIDZ code, that increased CPU load and wiped useful content out of dbgmsg, for example during routine disk replacement process. I don't think we need it to be that verbose. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com> Closes #18059	2025-12-17 14:00:01 -08:00
Turbo Fredriksson	0ba3403323	Change shellcheck and checkbashism triggers. Newer versions of `shellcheck` and `checkbashism` finds more than previous, so fix those. Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rob Norris <robn@despairlabs.com> Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Closes #18000	2025-12-16 09:15:51 -08:00
Turbo Fredriksson	6c6a469bea	Replace bashisms in ZFS shell function stub. The `type` command is an optional feature in POSIX, so shouldn't be used. Instead, use `command -v`, which commit `e865e7809e` did, but it missed this file. Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rob Norris <robn@despairlabs.com> Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Closes #18000	2025-12-16 09:15:51 -08:00
Turbo Fredriksson	1842d6b3cb	Make lines stay within 80 char limit. Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rob Norris <robn@despairlabs.com> Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Closes #18000	2025-12-16 09:15:51 -08:00
Turbo Fredriksson	ead77e952e	Add some comments to clarify the mounting of filesystems. There's no real documentation (which should probably be written!), so instead document in the code, as best we can, what's going on with the mounting of file systems, to make future updates easier. Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rob Norris <robn@despairlabs.com> Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Closes #18000	2025-12-16 09:15:51 -08:00
Turbo Fredriksson	01cb64510d	Standardise if/then/else and for/do/done lines. More code standard changes, where if/then is on different lines. To have it on the same, or on different lines, can be argued, but we need to pick one, and try not to mix how to do things. Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rob Norris <robn@despairlabs.com> Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Closes #18000	2025-12-16 09:15:51 -08:00
Turbo Fredriksson	29819a0177	Add missing initrd config variables. The `ZFS_INITRD_ADDITIONAL_DATASETS` variable is used in the initrd script to boot additional OS file systems besides the root file system. But it wasn't included as an example in the config files. The `ZFS_POOL_EXCEPTIONS` was included in the example defaults file, but it was not exported, so not available in the initrd. Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rob Norris <robn@despairlabs.com> Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Closes #18000	2025-12-16 09:15:51 -08:00
Turbo Fredriksson	4af8e28a59	Remove unnecessary sourcing of variables. The file `/etc/default/zfs` is already sourced by the `/etc/zfs/zfs-functions`, so no need to source it again. Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rob Norris <robn@despairlabs.com> Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Closes #18000	2025-12-16 09:15:51 -08:00
Turbo Fredriksson	94975ff79b	Fix issue with finding degraded pool(s). When a pool is degraded, or needs special action, the `zpool import` (without pool to import) line will report: ``` pool: rpool id: 01234567890123456789 state: ONLINE action: The pool can be imported using its name or numeric identifier. config: [..] ``` If the import with the pool name fails, it is supposed to try importing using the pool ID. However, the script is also getting the `action` line (and probably `scrub:` if/when that's available): pool; The pool can be imported using its name or numeric identifier.;config:; which causes issues on consequent import attempts. Cleanup the information by rewriting the `sed` command line. Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rob Norris <robn@despairlabs.com> Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Closes #18000	2025-12-16 09:15:51 -08:00
Turbo Fredriksson	33dd57e1b4	Prefix all variables that are local with underscore. This just to make them easier to see. Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rob Norris <robn@despairlabs.com> Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Closes #18000	2025-12-16 09:15:51 -08:00
Turbo Fredriksson	d3b447de4e	Shell script good practices changes. It's considered good practice to: 1) Wrap the variable name in `{}`. As in `${variable}` instead of `$variable`. 2) Put variables in `"`. Also some minor error message tuning. Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rob Norris <robn@despairlabs.com> Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Closes #18000	2025-12-16 09:15:51 -08:00
Turbo Fredriksson	61ab032ae0	Fix potential global variable overwrite. In a previous commit (`e865e7809e`), the `local` keyword was removed in functions because of bashism. Removing bashisms is correct, however this could cause variable overwrites, since several functions use the same variable name. So this commit make function variables unique in the (now) global name space. The problem from the original bug report (see #17963) could not be duplicated, but it is still sane to make sure that variables stay unique. Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rob Norris <robn@despairlabs.com> Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Closes #18000	2025-12-16 09:15:51 -08:00
Tony Hutter	32faecb0c2	CI: Use Ubuntu mirrors instead of azure (#18057 ) Use the official Ubuntu apt mirrors instead of azure.archive.ubuntu.com, since that mirror can be slow: https://github.com/actions/runner-images/issues/7048 This can help speed up the 'Setup QEMU' stage. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #18057	2025-12-16 09:15:18 -08:00
Alan Somers	a69a90b49e	Remove the obsolete FreeBSD 14.2-RELEASE from CI Sponsored by: ConnectWise Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com> Signed-off-by: Alan Somers <asomers@gmail.com> Closes #18013	2025-12-15 15:13:04 -08:00
Tony Hutter	842fb1c135	CI: Change timeout values The 'Setup QEMU' CI step updates and installs all packages necessary to startup QEMU. Typically the step takes a little over a minute, but we've seen cases where it can legitimately take more than 45 minutes. Change the timeout to 60 minutes. In addition, change the 'Install dependencies' timeout to 60min since we've also seen timeouts there. Lastly, remove all timeouts from the zfs-qemu-packages workflow. We do this so that we can always build packages from a branch, even if the time it takes to do a CI step changes over time. It's ok to eliminate the timeouts from the zfs-qemu-packages completely since that workflow is only run manually. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #18056	2025-12-15 14:58:01 -08:00
Alexander Motin	22e89aca88	DDT: Fix compressed entry buffer size The first byte of the entry after compression is used for algorithm and byte order flag. We should decrement when calling compression/ decompression algorithm. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com> Closes #18055	2025-12-15 14:52:44 -08:00
Alexander Motin	3b1ff816bd	DDT: Add/use zap_lookup_length_uint64_by_dnode() Unlike other ZAP consumers due to compression DDT does not know how big entry it is reading from ZAP. Due to this it called zap_length_uint64_by_dnode() and zap_lookup_uint64_by_dnode(), each of which does full ZAP entry lookup. Introduction of the combined ZAP method dramatically reduces the CPU overhead and locks contention at DBUF layer. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com> Closes #18048	2025-12-15 14:38:34 -08:00
Alexander Motin	ff5414406f	DDT: Switch to using ZAP _by_dnode() interfaces As was previously done for BRT, avoid holding/releasing DDT ZAP dnodes for every access. Instead hold the dnodes during all their life time, never releasing. While at this, add _by_dnode() interfaces for zap_length_uint64() and zap_count(), actively used by DDT code. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com> Closes #18047	2025-12-15 09:49:14 -08:00
Alexander Motin	46d6f1fe56	DDT: Move logs searches out of the lock Postponing entry removal from the DDT log in case of hit till later single-threaded sync stage allows to make ddl_tree stable during multi-threaded ZIO processing stage. It allows to drop the DDT lock before the search instead of after, reducing the contention a lot. Actually ddt_log_update_entry() was already handling the case of entry present in the active log, so we only need to remove it from flushing log, if the entry happen to be there. My tests with parallel 4KB block writes show throughput increase from 480MB/s (122K blocks/s) to 827MB/s (212K blocks/s), even though still limited by the global DDT lock contention. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com> Closes #18044	2025-12-15 09:17:04 -08:00
Alexander Motin	3d76ba2737	Improve async destroy processing timing Previous code effectively enforced that all async free ZIOs were _issued_ within the TXG timeout. But they could take forever to complete, especially if the required metadata were not in ARC. This patch introduces periodic waits every 2000 ZIOs, which should give at least somewhat reasonable TXG timings even for single HDD pools with empty ARC. And makes them complete within half of the TXG timeout, since we might still need time to sync DDT and BRT. While there, change zfs_max_async_dedup_frees semantics to include also clone and gang blocks, which are similar. Bump the default value from set long ago to be more forgiving to block cloning (still not having logs and benefiting from large TXGs), now that we have better working time limits. The limit now is a possible amount of dirty data produced by BRT updates. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Allan Jude <allan@klarasystems.com> Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com> Closes #18043	2025-12-11 18:46:08 -08:00
Alexander Motin	f72fd378c8	Defer async destroys on pool import We've observed a number of cases when pool import stuck for many minutes due to large async destroy trying to load DDT or BRT from HDD pool. While proper destroy dosage is a separate problem, lets give import process a chance to complete before that at all. It may be not enough if there is a lot of ZIL to replay, but that is harder to cover, since those are in separate syscalls. Code investigation shown that we already have this mechanism used for scrub/resilver, so this patch converts SCAN_IMPORT_WAIT_TXGS into a tunable and applies it to async destroys also. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com> Closes #18033	2025-12-11 18:44:46 -08:00
Alexander Motin	d976587a35	ZTS: Fix zvol_misc_fua SLOG writes check Instead of comparing number of SLOG writes to number of normal writes we should just make sure SLOG got the required number of writes. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com> Closes #18033	2025-12-11 18:43:59 -08:00
Alexander Motin	20f09eae42	ZIO: ZIO_STAGE_DDT_WRITE is a blocking stage ddt_lookup() in zio_ddt_write() might require synchronous DDT ZAP read. Running it from interrupt taskq might lead to deadlock. Inclusion of ZIO_STAGE_DDT_WRITE into ZIO_BLOCKING_STAGES should hopefully fix that, even though I am not sure how I got there. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com> Closes #17981	2025-12-10 19:51:53 -05:00
Alexander Motin	d393166c54	ARC: Increase parallel eviction batching Before parallel eviction implementation zfs_arc_evict_batch_limit caused loop exits after evicting 10 headers. The cost of it is not big and well motivated. Now though taskq task exit after the same 10 headers is much more expensive. To cover the context switch overhead of taskq introduce another level of batching, controlled by zfs_arc_evict_batches_limit tunable, used only for parallel eviction. My tests including 36 parallel reads with 4KB recordsize that shown 1.4GB/s (~460K blocks/s) before with heavy arc_evict_lock contention, now show 6.5GB/s (~1.6M blocks/s) without arc_evict_lock contention. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com> Closes #17970	2025-12-10 13:03:01 -08:00
Rob Norris	9fdb854109	Linux: work around use of GPL-only symbol `kasan_flag_enabled` We may not be able to avoid our code referencing the symbol, but we can ensure that a symbol of that name is available to the linker during build, and so not require linking the GPL-exported version. Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #18009 Closes #18040	2025-12-10 10:04:57 -08:00
Chunwei Chen	0c194352b5	Fix ddtprune causing space leak In zio_ddt_free, if a pruned dde is still in ddt, it would do nothing and cause space leak. Reviewed-by: Rob Norris <robn@despairlabs.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Allan Jude <allan@klarasystems.com> Signed-off-by: Chunwei Chen <david.chen@nutanix.com> Closes #17982 Closes #17983	2025-12-10 10:02:14 -08:00
Alexander Moch	ff47dd35e2	Ensure 64-bit `off_t` is used in user space instead of `loff_t` Use 64-bit POSIX off_t in user space instead of the Linux kernel type loff_t. This is enforced at configure time via AC_SYS_LARGEFILE and AC_CHECK_SIZEOF([off_t]). loff_t remains in shared headers where they mirror Linux VFS interfaces, and on FreeBSD we typedef loff_t to off_t in those headers since libc does not provide it. Reviewed-by: Rob Norris <robn@despairlabs.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Moch <mail@alexmoch.com> Closes #18020	2025-12-10 09:45:39 -08:00

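Commit 2dbd6af5e4 in the log above renames the format archetype in `__attribute__((format(printf, X, Y)))` declarations to `__printf__`. The sketch below shows what such a declaration looks like on a variadic wrapper; `format_msg()` is a hypothetical function chosen for illustration, not one of the OpenZFS functions touched by that commit.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical wrapper, not an OpenZFS function. The archetype is spelled
 * __printf__ rather than printf; argument 3 is the format string and the
 * variadic arguments start at position 4.
 */
static int format_msg(char *buf, size_t len, const char *fmt, ...)
    __attribute__((format(__printf__, 3, 4)));

static int
format_msg(char *buf, size_t len, const char *fmt, ...)
{
	va_list adx;

	va_start(adx, fmt);
	/* vsnprintf() returns the length the full output would have had. */
	int n = vsnprintf(buf, len, fmt, adx);
	va_end(adx);
	return (n);
}
```

With GCC or clang, a call such as `format_msg(buf, sizeof (buf), "%s-%d", "txg", 7)` is then checked against the format string at compile time, the same as with the plain `printf` spelling.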

10453 Commits