mirror_zfs

mirror of https://git.proxmox.com/git/mirror_zfs.git synced 2026-04-17 08:54:52 +03:00

Author	SHA1	Message	Date
oromenahar	c7b6119268	Allow block cloning across encrypted datasets When two datasets share the same master encryption key, it is safe to clone encrypted blocks. Currently only snapshots and clones of a dataset share with it the same encryption key. Added a test for: - Clone from encrypted sibling to encrypted sibling with non encrypted parent - Clone from encrypted parent to inherited encrypted child - Clone from child to sibling with encrypted parent - Clone from snapshot to the original datasets - Clone from foreign snapshot to a foreign dataset - Cloning from non-encrypted to encrypted datasets - Cloning from encrypted to non-encrypted datasets Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Original-patch-by: Pawel Jakub Dawidek <pawel@dawidek.net> Signed-off-by: Kay Pedersen <mail@mkwg.de> Closes #15544	2023-12-05 11:03:48 -08:00
Alexander Motin	55b764e062	ZIL: Do not clone blocks from the future ZIL claim can not handle block pointers cloned from the future, since they are not yet allocated at that point. It may happen either if the block was just written when it was cloned, or if the pool was frozen or somehow else rewound on import. Handle it from two sides: prevent cloning of blocks with physical birth time from not yet synced or frozen TXG, and abort ZIL claim if we still detect such blocks due to rewind or something else. While there, assert that any cloned blocks we claim are really allocated by calling metaslab_check_free(). Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #15617	2023-12-05 10:58:11 -08:00
Dex Wood	014265f4e6	Add Ntfy notification support to ZED This commit adds the zed_notify_ntfy() function and hooks it into zed_notify(). This will allow ZED to send notifications to ntfy.sh or a self-hosted Ntfy service, which can be received on a desktop or mobile device. It is configured with ZED_NTFY_TOPIC, ZED_NTFY_URL, and ZED_NTFY_ACCESS_TOKEN variables in zed.rc. Reviewed-by: @classabbyamp Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Dex Wood <slash2314@gmail.com> Closes #15584	2023-12-01 15:25:17 -08:00
Alexander Motin	bcd83ccd25	ZIL: Remove TX_CLONE_RANGE replay for ZVOLs. zil_claim_clone_range() takes references on cloned blocks before ZIL replay. Later zil_free_clone_range() drops them after replay or on dataset destroy. The total balance is neutral. It means we do not need to do anything (drop the references) for not implemented yet TX_CLONE_RANGE replay for ZVOLs. This is a logical follow up to #15603. Reviewed-by: Kay Pedersen <mail@mkwg.de> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #15612	2023-12-01 15:23:20 -08:00
Alexander Motin	adcea23cb0	ZIO: Add overflow checks for linear buffers Since we use a limited set of kmem caches, quite often we have unused memory after the end of the buffer. Put there up to a 512-byte canary when built with debug to detect buffer overflows at the free time. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #15553	2023-12-01 11:50:10 -08:00
Brooks Davis	3e4bef52b0	Only provide execvpe(3) when needed Check for the existence of execvpe(3) and only provide the FreeBSD compat version if required. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Brooks Davis <brooks.davis@sri.com> Closes #15609	2023-11-30 16:04:18 -08:00
Yuri Pankov	735ba3a7b7	Use uint64_t instead of u_int64_t Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Yuri Pankov <ypankov@tintri.com> Closes #15610	2023-11-30 10:36:33 -08:00
Alexander Motin	a03ebd9bee	ZIL: Call brt_pending_add() replaying TX_CLONE_RANGE zil_claim_clone_range() takes references on cloned blocks before ZIL replay. Later zil_free_clone_range() drops them after replay or on dataset destroy. The total balance is neutral. It means on actual replay we must take additional references, which would stay in BRT. Without this blocks could be freed prematurely when either original file or its clone are destroyed. I've observed BRT being emptied and the feature being deactivated after ZIL replay completion, which should not have happened. With the patch I see expected stats. Reviewed-by: Kay Pedersen <mail@mkwg.de> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rob Norris <robn@despairlabs.com> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #15603	2023-11-29 10:51:34 -08:00
Wraithh	8adf2e3066	Fix zoneid when USER_NS is disabled getzoneid() should return GLOBAL_ZONEID instead of 0 when USER_NS is disabled. Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ilkka Sovanto <github@ilkka.kapsi.fi> Closes #15560	2023-11-29 09:55:17 -08:00
VaibhavB	7d68900af3	ZTS: get_persistent_disk_name can return truncated names Instead of using only the 3rd element return the entire string after the split to handle device names with dashes. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Vaibhav Bhanawat <vaibhav.bhanawat@delphix.com> Closes #15567	2023-11-29 09:34:29 -08:00
Martin Matuška	1c38cdfe98	zdb: fix printf() length for uint64_t devid Bug introduced in `213d682967`. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Warner Losh <imp@FreeBSD.org> Signed-off-by: Martin Matuska <mm@FreeBSD.org> Closes #15606	2023-11-29 09:18:30 -08:00
Alexander Motin	2a27fd4111	ZIL: Assert record sizes in different places This should make sure we have log written without overflows. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #15517	2023-11-28 13:35:14 -08:00
Shengqi Chen	b94ce4e17d	module/icp/asm-arm/sha2: fix compiling on armv5/6 The `adr` insn in neon kernel generates an compiling error on armv5/6 target. Fix that by using `ldr`. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Shengqi Chen <harry-chen@outlook.com> Closes #15557	2023-11-28 13:26:12 -08:00
Shengqi Chen	4340f69be1	module/icp/asm-arm/sha2: auto detect __ARM_ARCH This patch uses __ARM_ARCH set by compiler (both GCC and Clang have this) whenever possible instead of hardcoding it to 7. This change allows code to compile on earlier ARM architectures such as armv5te. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Shengqi Chen <harry-chen@outlook.com> Closes #15557	2023-11-28 13:25:44 -08:00
Jaron Kent-Dobias	b3a985fa42	Linux 6.6 compat: fix configure error with clang (#15558 ) With Linux v6.6.x and clang 16, a configure step fails on a warning that later results in an error while building, due to 'ts' being uninitialized. Add a trivial initialization to silence the warning. Signed-off-by: Jaron Kent-Dobias <jaron@kent-dobias.com> Reviewed-by: Tony Hutter <hutter2@llnl.gov>	2023-11-28 11:34:40 -08:00
Rob N	688514e470	dmu_buf_will_clone: fix race in transition back to NOFILL Previously, dmu_buf_will_clone() would roll back any dirty record, but would not clean out the modified data nor reset the state before releasing the lock. That leaves the last-written data in db_data, but the dbuf in the wrong state. This is eventually corrected when the dbuf state is made NOFILL, and dbuf_noread() called (which clears out the old data), but at this point its too late, because the lock was already dropped with that invalid state. Any caller acquiring the lock before the call into dmu_buf_will_not_fill() can find what appears to be a clean, readable buffer, and would take the wrong state from it: it should be getting the data from the cloned block, not from earlier (unwritten) dirty data. Even after the state was switched to NOFILL, the old data was still not cleaned out until dbuf_noread(), which is another gap for a caller to take the lock and read the wrong data. This commit fixes all this by properly cleaning up the previous state and then setting the new state before dropping the lock. The DBUF_VERIFY() calls confirm that the dbuf is in a valid state when the lock is down. Sponsored-by: Klara, Inc. Sponsored-By: OpenDrives Inc. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pawel Jakub Dawidek <pawel@dawidek.net> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #15566 Closes #15526	2023-11-28 09:53:04 -08:00
Matthew Ahrens	67894a597f	unnecessary alloc/free in dsl_scan_visitbp() Clean up code in dsl_scan_visitbp() by removing an unnecessary alloc/free and `goto`. This has the side benefit of reducing CPU usage, which is only really noticeable if we are not doing i/o for the leaf blocks, like when `zfs_no_scrub_io` is set. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Mark Maybee <mark.maybee@delphix.com> Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Signed-off-by: Matthew Ahrens <mahrens@delphix.com> Closes #15549	2023-11-28 09:20:48 -08:00
Rob N	30d581121b	dnode_is_dirty: check dnode and its data for dirtiness Over its history this the dirty dnode test has been changed between checking for a dnodes being on `os_dirty_dnodes` (`dn_dirty_link`) and `dn_dirty_record`. `de198f2d9` Fix lseek(SEEK_DATA/SEEK_HOLE) mmap consistency `2531ce372` Revert "Report holes when there are only metadata changes" `ec4f9b8f3` Report holes when there are only metadata changes `454365bba` Fix dirty check in dmu_offset_next() `66aca2473` SEEK_HOLE should not block on txg_wait_synced() Also illumos/illumos-gate@c543ec060d illumos/illumos-gate@2bcf0248e9 It turns out both are actually required. In the case of appending data to a newly created file, the dnode proper is dirtied (at least to change the blocksize) and dirty records are added. Thus, a single logical operation is represented by separate dirty indicators, and must not be separated. The incorrect dirty check becomes a problem when the first block of a file is being appended to while another process is calling lseek to skip holes. There is a small window where the dnode part is undirtied while there are still dirty records. In this case, `lseek(fd, 0, SEEK_DATA)` would not know that the file is dirty, and would go to `dnode_next_offset()`. Since the object has no data blocks yet, it returns `ESRCH`, indicating no data found, which results in `ENXIO` being returned to `lseek()`'s caller. Since coreutils 9.2, `cp` performs sparse copies by default, that is, it uses `SEEK_DATA` and `SEEK_HOLE` against the source file and attempts to replicate the holes in the target. When it hits the bug, its initial search for data fails, and it goes on to call `fallocate()` to create a hole over the entire destination file. This has come up more recently as users upgrade their systems, getting OpenZFS 2.2 as well as a newer coreutils. However, this problem has been reproduced against 2.1, as well as on FreeBSD 13 and 14. This change simply updates the dirty check to check both types of dirty. If there's anything dirty at all, we immediately go to the "wait for sync" stage, It doesn't really matter after that; both changes are on disk, so the dirty fields should be correct. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Rich Ercolani <rincebrain@gmail.com> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #15571 Closes #15526	2023-11-28 09:07:57 -08:00
rmacklem	acb33ee1c1	FreeBSD: Fix ZFS so that snapshots under .zfs/snapshot are NFS visible Call vfs_exjail_clone() for mounts created under .zfs/snapshot to fill in the mnt_exjail field for the mount. If this is not done, the snapshots under .zfs/snapshot with not be accessible over NFS. This version has the argument name in vfs.h fixed to match that of the name in spl_vfs.c, although it really does not matter. External-issue: https://reviews.freebsd.org/D42672 Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rick Macklem <rmacklem@uoguelph.ca> Closes #15563	2023-11-27 16:31:03 -08:00
Akash B	c1a47de86f	zdb: Fix zdb '-O\|-r' options with -e/exported zpool zdb with '-e' or exported zpool doesn't work along with '-O' and '-r' options as we process them before '-e' has been processed. Below errors are seen: ~> zdb -e pool-mds65/mdt65 -O oi.9/0x200000009:0x0:0x0 failed to hold dataset 'pool-mds65/mdt65': No such file or directory ~> zdb -e pool-oss0/ost0 -r file1 /tmp/filecopy1 -p. failed to hold dataset 'pool-oss0/ost0': No such file or directory zdb: internal error: No such file or directory We need to make sure to process '-O\|-r' options after the '-e' option has been processed, which imports the pool to the namespace if it's not in the cachefile. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Akash B <akash-b@hpe.com> Closes #15532	2023-11-27 13:41:58 -08:00
Rob Norris	213d682967	zdb: show BRT statistics and dump its contents Same idea as the dedup stats, but for block cloning. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Kay Pedersen <mail@mkwg.de> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #15541	2023-11-27 13:35:07 -08:00
Rob Norris	803a9c12c9	brt: lift internal definitions into _impl header So that zdb (and others!) can get at the BRT on-disk structures. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Kay Pedersen <mail@mkwg.de> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #15541	2023-11-27 13:34:43 -08:00
Tony Hutter	3551a32e5e	ZTS: Fix zfs_load-key failures on F39 The zfs_load-key tests were failing on F39 due to their use of the deprecated ssl.wrap_socket function. This commit updates the test to instead use ssl.SSLContext() as described in https://stackoverflow.com/a/65194957. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #15534 Closes #15550	2023-11-27 13:24:37 -08:00
AllKind	95b68eb693	zfs-dkms: fix shell-init error message If all zfs dkms modules have been removed, a shell-init error message may appear, because /var/lib/dkms/zfs does no longer exist. Resolve this by leaving the directory earlier on. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Mart Frauenlob <AllKind@fastest.cc> Closes #15576	2023-11-27 13:17:48 -08:00
Alexander Motin	cf33166336	ZVOL: Minor code cleanup - Remove zsda_tx field, it is used only once. - Remove unneeded string lengths checks, all names are terminated. - Replace few explicit MAXNAMELEN usages with sizeof(). - Change dsname from MAXNAMELEN to ZFS_MAX_DATASET_NAME_LEN, as expected by dsl_dataset_name(). Both are 256 bytes now, but it is better to be safe. This should have no functional difference. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #15535	2023-11-27 13:16:59 -08:00
Alan Somers	126efb5889	FreeBSD: Fix the build on FreeBSD 12 It was broken for several reasons: * VOP_UNLOCK lost an argument in 13.0. So OpenZFS should be using VOP_UNLOCK1, but a few direct calls to VOP_UNLOCK snuck in. * The location of the zlib header moved in 13.0 and 12.1. We can drop support for building on 12.0, which is EoL. * knlist_init lost an argument in 13.0. OpenZFS change `9d0887402b` assumed 13.0 or later. * FreeBSD 13.0 added copy_file_range, and OpenZFS change `67a1b03791` assumed 13.0 or later. Sponsored-by: Axcient Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Alan Somers <asomers@gmail.com> Closes #15551	2023-11-27 12:58:03 -08:00
Alexander Motin	a490875103	ZIL: Refactor TX_WRITE encryption similar to TX_CLONE_RANGE It should be purely textual change to make the code more readable. Should cause no functional difference. Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Reviewed-by: Tom Caputi <caputit1@tcnj.edu> Reviewed-by: Sean Eric Fagan <sef@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Edmund Nadolski <edmund.nadolski@ixsystems.com> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #15543 Closes #15513	2023-11-27 09:56:30 -08:00
Alexander Motin	27d8c23c58	ZIL: Do not encrypt block pointers in lr_clone_range_t In case of crash cloned blocks need to be claimed on pool import. It is only possible if they (lr_bps) and their count (lr_nbps) are not encrypted but only authenticated, similar to block pointer in lr_write_t. Few other fields can be and are still encrypted. This should fix panic on ZIL claim after crash when block cloning is actively used. Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Reviewed-by: Tom Caputi <caputit1@tcnj.edu> Reviewed-by: Sean Eric Fagan <sef@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Edmund Nadolski <edmund.nadolski@ixsystems.com> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #15543 Closes #15513	2023-11-27 09:53:32 -08:00
Don Brady	7bbd42ef49	Don't allow attach to a raidz child vdev Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Don Brady <don.brady@klarasystems.com> Closes #15536 Closes #15564	2023-11-27 09:46:38 -08:00
Tony Hutter	a94860a6de	ZTS: Fix 'could not unmount datasets' on Alma 9 (#15542 ) Many tests are failing on AlmaLinux 9 because ZTS could not destroy the pool in cleanup. This was due to $PWD being set to '.' instead of the expected full path. This patch sets $PWD to the full path. Signed-off-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Don Brady <don.brady@delphix.com>	2023-11-20 16:07:32 -08:00
Brooks Davis	cd67bc0ae4	freebsd: remove __FBSDID macro use With FreeBSD's switch to git the $FreeBSD$ string is no longer expanded and they have mostly been removed upstream. Stop using __FBSDID and remove the no-longer needed sys/cdefs.h includes. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Brooks Davis <brooks.davis@sri.com> Closes #15527	2023-11-17 14:02:09 -08:00
Alexander Motin	5a3bffab10	ZIO: Optimize zio_flush() - Generalize vdev_nowritecache handling by traversing through the VDEV tree and skipping children ZIOs where not supported. - Remove intermediate zio_null() in case of several VDEV children. - Remove children handling from zio_ioctl(). There are no other use cases for this code beside DKIOCFLUSHWRITECACHED, and would there be, I doubt they would so straightforward apply to all VDEV children. Comparing to removed previous optimization this should improve cases of redundant ZILs/SLOGs. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Wilson <george.wilson@delphix.com> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #15515	2023-11-17 14:00:59 -08:00
Alexander Motin	22c8c33a58	Use abd_zero_off() where applicable In several places abd_zero() cleaned ABD filled at the next line. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #15514	2023-11-17 13:28:32 -08:00
Rob N	92dc4ad83d	Consider `dnode_t` allocations in dbuf cache size accounting Entries in the dbuf cache contribute only the size of the dbuf data to the cache size. Attached "user" data is not counted. This can lead to the data currently "owned" by the cache consuming more memory accounting appears to show. In some cases (eg a metadnode data block with all child dnode_t slots allocated), the actual size can be as much as 3x as what the cache believes it to be. This is arguably correct behaviour, as the cache is only tracking the size of the dbuf data, not even the overhead of the dbuf_t. On the other hand, in the above case of dnodes, evicting cached metadnode dbufs is the only current way to reclaim the dnode objects, and can lead to the situation where the dbuf cache appears to be comfortably within its target memory window and yet is holding enormous amounts of slab memory that cannot be reclaimed. This commit adds a facility for a dbuf user to artificially inflate the apparent size of the dbuf for caching purposes. This at least allows for cache tuning to be adjusted to match something closer to the real memory overhead. metadnode dbufs carry a >1KiB allocation per dnode in their user data. This informs the dbuf cache machinery of that fact, allowing it to make better decisions when evicting dbufs. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #15511	2023-11-17 13:25:53 -08:00
Paul Dagnelie	6c6fae6fae	Fix memory leak in zfs_setprocinit code Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rich Ercolani <rincebrain@gmail.com> Signed-off-by: Paul Dagnelie <pcd@delphix.com> Closes #15508	2023-11-17 13:21:04 -08:00
Rich Ercolani	03e9caaec0	Add a tunable to disable BRT support. Copy the disable parameter that FreeBSD implemented, and extend it to work on Linux as well, until we're sure this is stable. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes #15529	2023-11-16 11:35:22 -08:00
Umer Saleem	5796e3a742	Packaging: Auto-generate changelog during configure (#15528 ) Auto-generate changelog based off on @VERSION@ during configure, so that it is not needed to be update with new releases / version updates. Signed-off-by: Umer Saleem <usaleem@ixsystems.com> Reviewed-by: Tony Hutter <hutter2@llnl.gov>	2023-11-16 08:58:47 -08:00
Alexander Motin	35da345160	L2ARC: Restrict write size to 1/4 of the device PR #15457 exposed weird logic in L2ARC write sizing. If it appeared bigger than device size, instead of liming write it reset all the system-wide tunables to their default. Aside of being excessive, it did not actually help with the problem, still allowing infinite loop to happen. This patch removes the tunables reverting logic, but instead limits L2ARC writes (or at least eviction/trim) to 1/4 of the capacity. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Amanakis <gamanakis@gmail.com> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #15519	2023-11-14 13:47:57 -08:00
Chunwei Chen	da51bd17e5	Fix snap_obj_array memory leak in check_filesystem() Use goto out instead of return for early exit to make sure snap_obj_array is freed. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Chunwei Chen <david.chen@nutanix.com> Closes #15516	2023-11-14 12:59:02 -08:00
Tony Hutter	2319656802	Linux 6.6 compat: META Update the META file to reflect compatibility with the 6.6 kernel. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Umer Saleem <usaleem@ixsystems.com> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #15520	2023-11-14 09:55:28 -08:00
Tony Hutter	786641dcf9	Workaround UBSAN errors for variable arrays This gets around UBSAN errors when using arrays at the end of structs. It converts some zero-length arrays to variable length arrays and disables UBSAN checking on certain modules. It is based off of the patch from #15460. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Tested-by: Thomas Lamprecht <t.lamprecht@proxmox.com> Co-authored-by: Thomas Lamprecht <t.lamprecht@proxmox.com> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Issue #15145 Closes #15510	2023-11-12 16:26:07 -08:00
Alexander Motin	3a8d9b8487	Linux: Reclaim unused spl_kmem_cache_reclaim It is unused for 3 years since #10576. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #15507	2023-11-10 10:34:46 -08:00
Umer Saleem	40fccc423a	ZTS: Test for all known zpool feature sets zpool_create_features_007_pos only tested for compat-2020 feature set. It would be useful to test for all known features sets. If any additional feature is found enabled that is not present in compatibility list or feature set, it should be caught and reported earlier. This commit also removes encryption from openzfsonosx-1.8.1 compatibility list. Encryption enables bookmark_v2, since it is a dependency of encryption, but not listed in openzfsonoxx-1.8.1 compatibility list. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Umer Saleem <usaleem@ixsystems.com> Closes #15505	2023-11-09 10:58:23 -08:00
Umer Saleem	15a8fa76b2	Update zpool-features.7 for grub2 compatibility list updates This commit updates zpool-features.7 man page to add newly added zpool features to grub2 compatibility list. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Umer Saleem <usaleem@ixsystems.com> Closes #15505	2023-11-09 10:58:09 -08:00
shodanshok	887a3c533b	Increase L2ARC write rate and headroom Current L2ARC write rate and headroom parameters are very conservative: l2arc_write_max=8M and l2arc_headroom=2 (ie: a full L2ARC writes at 8 MB/s, scanning 16/32 MB of ARC tail each time; a warming L2ARC runs at 2x these rates). These values were selected 15+ years ago based on then-current SSDs size, performance and endurance. Today we have multi-TB, fast and cheap SSDs which can sustain much higher read/write rates. For this reason, this patch increases l2arc_write_max to 32M and l2arc_headroom to 8 (4x increase for both). Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Gionatan Danti <g.danti@assyoma.it> Closes #15457	2023-11-08 16:30:47 -08:00
Martin Matuška	1c1be60fa2	Unbreak FreeBSD world build after `3bd4df384` Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Martin Matuska <mm@FreeBSD.org> Closes #15504	2023-11-08 16:29:34 -08:00
Low-power	a160c153e2	Linux: reject read/write mapping to immutable file only on VM_SHARED Private read/write mapping can't be used to modify the mapped files, so they will remain be immutable. Private read/write mappings are usually used to load the data segment of executable files, rejecting them will rendering immutable executable files to stop working. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: WHR <msl0000023508@gmail.com> Closes #15344	2023-11-08 12:19:38 -08:00
AllKind	3a81bf4ad2	Workaround to allow openzfs-zfs-dkms install on Ubuntu As shown in #15404#issuecomment-1765002181, Ubuntu kernel has 'Provides: zfs-dkms', which will cause uninstall of the kernel, when attempting to install openzfs-zfs-dkms. As a workaround remove the 'Conflicts: zfs-dkms' definition from the debian control file. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Mart Frauenlob <AllKind@fastest.cc> Closes #15503	2023-11-08 10:30:46 -08:00
Don Brady	5caeef02fa	RAID-Z expansion feature This feature allows disks to be added one at a time to a RAID-Z group, expanding its capacity incrementally. This feature is especially useful for small pools (typically with only one RAID-Z group), where there isn't sufficient hardware to add capacity by adding a whole new RAID-Z group (typically doubling the number of disks). == Initiating expansion == A new device (disk) can be attached to an existing RAIDZ vdev, by running `zpool attach POOL raidzP-N NEW_DEVICE`, e.g. `zpool attach tank raidz2-0 sda`. The new device will become part of the RAIDZ group. A "raidz expansion" will be initiated, and the new device will contribute additional space to the RAIDZ group once the expansion completes. The `feature@raidz_expansion` on-disk feature flag must be `enabled` to initiate an expansion, and it remains `active` for the life of the pool. In other words, pools with expanded RAIDZ vdevs can not be imported by older releases of the ZFS software. == During expansion == The expansion entails reading all allocated space from existing disks in the RAIDZ group, and rewriting it to the new disks in the RAIDZ group (including the newly added device). The expansion progress can be monitored with `zpool status`. Data redundancy is maintained during (and after) the expansion. If a disk fails while the expansion is in progress, the expansion pauses until the health of the RAIDZ vdev is restored (e.g. by replacing the failed disk and waiting for reconstruction to complete). The pool remains accessible during expansion. Following a reboot or export/import, the expansion resumes where it left off. == After expansion == When the expansion completes, the additional space is available for use, and is reflected in the `available` zfs property (as seen in `zfs list`, `df`, etc). Expansion does not change the number of failures that can be tolerated without data loss (e.g. a RAIDZ2 is still a RAIDZ2 even after expansion). A RAIDZ vdev can be expanded multiple times. After the expansion completes, old blocks remain with their old data-to-parity ratio (e.g. 5-wide RAIDZ2, has 3 data to 2 parity), but distributed among the larger set of disks. New blocks will be written with the new data-to-parity ratio (e.g. a 5-wide RAIDZ2 which has been expanded once to 6-wide, has 4 data to 2 parity). However, the RAIDZ vdev's "assumed parity ratio" does not change, so slightly less space than is expected may be reported for newly-written blocks, according to `zfs list`, `df`, `ls -s`, and similar tools. Sponsored-by: The FreeBSD Foundation Sponsored-by: iXsystems, Inc. Sponsored-by: vStack Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Mark Maybee <mark.maybee@delphix.com> Authored-by: Matthew Ahrens <mahrens@delphix.com> Contributions-by: Fedor Uporov <fuporov.vstack@gmail.com> Contributions-by: Stuart Maybee <stuart.maybee@comcast.net> Contributions-by: Thorsten Behrens <tbehrens@outlook.com> Contributions-by: Fmstrat <nospam@nowsci.com> Contributions-by: Don Brady <dev.fs.zfs@gmail.com> Signed-off-by: Don Brady <dev.fs.zfs@gmail.com> Closes #15022	2023-11-08 10:19:41 -08:00
Umer Saleem	9198de8f10	Linux 6.6 compat: fix implicit conversion error with debug build With Linux v6.6.0 and GCC 12, when debug build is configured, implicit conversion error is raised while converting 'enum <anonymous>' to 'boolean_t'. Use 'B_TRUE' instead of 'true' to fix the issue. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pavel Snajdr <snajpa@snajpa.net> Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Signed-off-by: Umer Saleem <usaleem@ixsystems.com> Closes #15489	2023-11-07 13:24:16 -08:00

1 2 3 4 5 ...

8915 Commits