mirror_zfs

mirror of https://git.proxmox.com/git/mirror_zfs.git synced 2026-04-17 08:54:52 +03:00

Author	SHA1	Message	Date
наб	23b6f17abb	linux/spl: proc: use global table_{min,max} values instead of local ones Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11879	2021-04-15 14:55:50 -07:00
наб	7de4c88b39	linux/spl: base proc_dohostid() on proc_dostring() This fixes /proc/sys/kernel/spl/hostid on kernels with mainline commit 32927393dc1ccd60fb2bdc05b9e8e88753761469 ("sysctl: pass kernel pointers to ->proc_handler") ‒ 5.7-rc1 and up The access_ok() check in copy_to_user() in proc_copyout_string() would always fail, so all userspace reads and writes would fail with EINVAL proc_dostring() strips only the final new-line, but simple_strtoul() doesn't actually need a back-trimmed string ‒ writing "012345678 \n" is still allowed, as is "012345678zupsko", &c. This alters what happens when an invalid value is written ‒ previously it'd get set to what-ever simple_strtoul() returned (probably 0, thereby resetting it to default), now it does nothing Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11878 Closes #11879	2021-04-15 14:55:43 -07:00
Paul Dagnelie	414f7249dc	Add SIGSTOP and SIGTSTP handling to issig This change adds SIGSTOP and SIGTSTP handling to the issig function; this mirrors its behavior on Solaris. This way, long running kernel tasks can be stopped with the appropriate signals. Note that doing so with ctrl-z on the command line doesn't return control of the tty to the shell, because tty handling is done separately from stopping the process. That can be future work, if people feel that it is a necessary addition. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Signed-off-by: Paul Dagnelie <pcd@delphix.com> Issue #810 Issue #10843 Closes #11801	2021-04-15 13:34:35 -07:00
Mateusz Guzik	93f81eb721	FreeBSD: use vnlru_free_vfsops if available Fixes issues when zfs is used along with other filesystems. External-issue: https://cgit.freebsd.org/src/commit/?id=e9272225e6bed840b00eef1c817b188c172338ee Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Closes #11881	2021-04-12 11:01:46 -07:00
Mateusz Guzik	5ad86e973c	FreeBSD: add missing seqc write begin/end around zfs_acl_chown_setattr It happens to trip over an assert but does not matter for correctness at this time. Done for future proofing. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Closes #11884	2021-04-12 10:59:57 -07:00
Mateusz Guzik	d8c09f3fcc	FreeBSD: add support for lockless symlink lookup Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Closes #11883	2021-04-12 10:59:22 -07:00
Ryan Moeller	a631283b74	Move zfsdev_state_{init,destroy} to common code Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org> Closes #11833	2021-04-08 21:17:43 -07:00
Ryan Moeller	1dff545278	Eliminate zfsdev_get_state_impl After `3937ab20f` zfsdev_get_state_impl can become zfsdev_get_state. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org> Closes #11833	2021-04-08 21:17:18 -07:00
TerraTech	161ed825ca	zpl_inode.c: Fix SMACK interoperability SMACK needs to have the ZFS dentry security field set up before SMACK's d_instantiate() hook is called as it requires functioning '__vfs_getxattr()' calls to properly set the labels. Fixes: 1) file instantiation properly setting the object label to the subject's label 2) proper file labeling in a transmutable directory Functions Updated: 1) zpl_create() 2) zpl_mknod() 3) zpl_mkdir() 4) zpl_symlink() External-issue: https://github.com/cschaufler/smack-next/issues/1 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: TerraTech <TerraTech@users.noreply.github.com> Closes #11646 Closes #11839	2021-04-08 21:15:29 -07:00
Matthew Ahrens	bbcec73783	kmem_alloc(KM_SLEEP) should use kvmalloc() `kmem_alloc(size>PAGESIZE, KM_SLEEP)` is backed by `kmalloc()`, which finds contiguous physical memory. If there isn't enough contiguous physical memory available (e.g. due to physical page fragmentation), the OOM killer will be invoked to make more memory available. This is not ideal because processes may be killed when there is still plenty of free memory (it just happens to be in individual pages, not contiguous runs of pages). We have observed this when allocating the ~13KB `zfs_cmd_t`, for example in `zfsdev_ioctl()`. This commit changes the behavior of `kmem_alloc(size>PAGESIZE, KM_SLEEP)` when there are insufficient contiguous free pages. In this case we will find individual pages and stitch them together using virtual memory. This is accomplished by using `kvmalloc()`, which implements the described behavior by trying `kmalloc(__GFP_NORETRY)` and falling back on `vmalloc()`. The behavior of `kmem_alloc(KM_NOSLEEP)` is not changed; it continues to use `kmalloc(GFP_ATOMIC | __GFP_NORETRY)`. This is because `vmalloc()` may sleep. Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Wilson <gwilson@delphix.com> Signed-off-by: Matthew Ahrens <mahrens@delphix.com> Closes #11461	2021-04-06 12:44:54 -07:00
Andrea Gelmini	bf169e9f15	Fix various typos Correct an assortment of typos throughout the code base. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net> Closes #11774	2021-04-02 18:52:15 -07:00
Ryan Moeller	dce3176349	Avoid taking global lock to destroy zfsdev state We have exclusive access to our zfsdev state object in this section until it is invalidated by setting zs_minor to -1, so we can destroy the state without taking a lock if we do the invalidation last, after a memory barrier to ensure correct ordering. While here, strengthen the assertions that zs_minor is valid when we enter. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org> Closes #11751	2021-04-02 11:09:05 -07:00
Ryan Moeller	02aaf11fc7	FreeBSD: Fix stable/12 after AT_BENEATH removal Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #11827	2021-04-02 11:06:44 -07:00
Luis Henriques	2037edbdaa	Fix error code on __zpl_ioctl_setflags() Other (all?) Linux filesystems seem to return -EPERM instead of -EACCES when trying to set FS_APPEND_FL or FS_IMMUTABLE_FL without the CAP_LINUX_IMMUTABLE capability. This was detected by the generic/545 test in the fstests suite. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Luis Henriques <henrix@camandro.org> Closes #11791	2021-03-26 10:46:45 -07:00
Andrea Gelmini	8a915ba1f6	Removed duplicated includes Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net> Closes #11775	2021-03-22 12:34:58 -07:00
Brian Atkinson	f52124dce8	Removing old code for k(un)map_atomic It used to be required to pass an enum km_type to kmap_atomic() and kunmap_atomic(), however this is no longer necessary and the wrappers zfs_k(un)map_atomic removed these. This is confusing in the ABD code as the struct abd_iter member iter_km no longer exists and the wrapper macros simply compile them out. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Adam Moss <c@yotes.com> Signed-off-by: Brian Atkinson <batkinson@lanl.gov> Closes #11768	2021-03-19 22:38:44 -07:00
Coleman Kane	ffd6978ef5	Linux 5.12 update: bio_max_segs() replaces BIO_MAX_PAGES The BIO_MAX_PAGES macro is being retired in favor of a bio_max_segs() function that implements the typical MIN(x,y) logic used throughout the kernel for bounding the allocation, and also the new implementation is intended to be signed-safe (which the former was not). Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Coleman Kane <ckane@colemankane.org> Closes #11765	2021-03-19 22:33:42 -07:00
Coleman Kane	e2a8296131	Linux 5.12 compat: idmapped mounts In Linux 5.12, the filesystem API was modified to support idmapped mounts by adding a "struct user_namespace *" parameter to a number of functions and VFS handlers. This change adds the needed autoconf macros to detect the new interfaces and updates the code appropriately. This change does not add support for idmapped mounts, instead it preserves the existing behavior by passing the initial user namespace where needed. A subsequent commit will be required to add support for idmapped mounts. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Co-authored-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Coleman Kane <ckane@colemankane.org> Closes #11712	2021-03-19 21:00:59 -07:00
Mateusz Guzik	2f385c913f	FreeBSD: make seqc asserts conditional on replay Avoids tripping on asserts when doing pool recovery. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Closes #11739	2021-03-17 22:09:45 -07:00
Ryan Moeller	ec3e4c6784	FreeBSD: Fix memory leaks in kstats Don't handle (incorrectly) kmem_zalloc() failure. With KM_SLEEP, it will never return NULL. Free the data allocated for non-virtual kstats when deleting the object. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #11767	2021-03-17 21:55:18 -07:00
Adam D. Moss	1daad98176	Linux: always check or verify return of igrab() zhold() wraps igrab() on Linux, and igrab() may fail when the inode is in the process of being deleted. This means zhold() must only be called when a reference exists and therefore it cannot be deleted. This is the case for all existing consumers so add a VERIFY and a comment explaining this requirement. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Adam Moss <c@yotes.com> Closes #11704	2021-03-16 16:33:34 -07:00
Don Brady	dd0b5c8559	Reference_tracking_enable should be a module param To make use of zfs_refcount_held tunable it should be a module parameter in open-zfs. Also, since the macros will auto-generate OS specific tunables, removed the existing zfs_refcount_held reference in module/os/freebsd/zfs/sysctl_os.c. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Allan Jude <allan@klarasystems.com> Signed-off-by: Don Brady <don.brady@delphix.com> Closes #11753	2021-03-16 14:56:17 -07:00
Mariusz Zaborski	e464f7c7cc	FreeBSD: bring back possibility to rewind the checkpoint from bootloader Add parsing of the rewind options. When I was upstreaming the change [1], I omitted the part where we detect that the pool should be rewound. When the FreeBSD repo synced with OpenZFS, this part of the code was removed. [1] FreeBSD repo: 277f38abffc6a8160b5044128b5b2c620fbb970c [2] OpenZFS repo: `f2c027bd6a` External-issue: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=254152 Originally reviewed by: tsoome, allanjude Originally reviewed by: kevans (ok from high-level overview) Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Mariusz Zaborski <oshogbo@vexillium.org> Closes #11730	2021-03-12 16:12:14 -08:00
Ryan Moeller	f845b2dd1c	FreeBSD: Clean up zfsdev_close to match Linux Resolve some oddities in zfsdev_close() which could result in a panic and were not present in the equivalent function for Linux. - Remove unused definition ZFS_MIN_MINOR - FreeBSD: Simplify zfsdev state destruction - Assert zs_minor is valid in zfsdev_close - Make locking around zfsdev state match Linux Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #11720	2021-03-12 16:09:15 -08:00
Mateusz Guzik	5ebe425a5b	Macroify teardown lock handling This will allow platforms to implement it as they see fit, in particular in a different manner than rrm locks. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Matt Macy <mmacy@FreeBSD.org> Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Closes #11153	2021-03-12 15:51:39 -08:00
Mateusz Guzik	9847f77f01	FreeBSD: rename teardown inactive macros to mimic rrm convention Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Matt Macy <mmacy@FreeBSD.org> Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Closes #11153	2021-03-12 15:51:31 -08:00
Mateusz Guzik	f9acd578f0	FreeBSD: remove 2 assertions that teardown lock is not held They are not very useful and hard to implement in the rms routine the code is about to start using. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Matt Macy <mmacy@FreeBSD.org> Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Closes #11153	2021-03-12 15:51:20 -08:00
Mateusz Guzik	300f68e017	FreeBSD: rework asserts in zfs_dd_lookup 1. even up ifdefs 2. drop the arguably useless teardown lock asserts -- nothing else checks for it Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Matt Macy <mmacy@FreeBSD.org> Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Closes #11153	2021-03-12 15:51:07 -08:00
Christian Schwarz	93e3658035	zvol: call zil_replaying() during replay zil_replaying(zil, tx) has the side-effect of informing the ZIL that an entry has been replayed in the (still open) tx. The ZIL uses that information to record the replay progress in the ZIL header when that tx's txg syncs. ZPL log entries are not idempotent and logically dependent and thus calling zil_replaying() is necessary for correctness. For ZVOLs the question of correctness is more nuanced: ZVOL logs only TX_WRITE and TX_TRUNCATE, both of which are idempotent. Logical dependencies between two records exist only if the write or discard request had sync semantics or if the ranges affected by the records overlap. Thus, at a first glance, it would be correct to restart replay from the beginning if we crash before replay completes. But this does not address the following scenario: Assume one log record per LWB. The chain on disk is HDR -> 1:W(1, "A") -> 2:W(1, "B") -> 3:W(2, "X") -> 4:W(3, "Z") where N:W(O, C) represents log entry number N which is a TX_WRITE of C to offset O. We replay 1, 2 and 3 in one txg, sync that txg, then crash. Bit flips corrupt 2, 3, and 4. We come up again and restart replay from the beginning because we did not call zil_replaying() during replay. We replay 1 again, then interpret 2's invalid checksum as the end of the ZIL chain and call replay done. The replayed zvol content is "AX". If we had called zil_replaying() the HDR would have pointed to 3 and our resumed replay would not have replayed anything because 3 was corrupted, resulting in zvol content "BX". If 3 logically depends on 2 then the replay corrupted the ZVOL_OBJ's contents. This patch adds the zil_replaying() calls to the replay functions. Since the callbacks in the replay function need the zilog_t* pointer so that they can call zil_replaying() we open the ZIL while replaying in zvol_create_minor(). We also verify that replay has been done when on-demand-opening the ZIL on the first modifying bio. Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Christian Schwarz <me@cschwarz.com> Closes #11667	2021-03-07 09:49:58 -08:00
Ryan Moeller	4b2e20824b	Intentionally allow ZFS_READONLY in zfs_write ZFS_READONLY represents the "DOS R/O" attribute. When that flag is set, we should behave as if write access were not granted by anything in the ACL. In particular: We _must_ allow writes after opening the file r/w, then setting the DOS R/O attribute, and writing some more. (Similar to how you can write after fchmod(fd, 0444).) Restore these semantics which were lost on FreeBSD when refactoring zfs_write. To my knowledge Linux does not actually expose this flag, but we'll need it to eventually so I've added the supporting checks. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #11693	2021-03-07 09:31:52 -08:00
Brian Behlendorf	6bbb44e157	Initialize ZIL buffers When populating a ZIL destination buffer ensure it is always zeroed before its contents are constructed. Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Tom Caputi <caputit1@tcnj.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #11687	2021-03-05 14:45:13 -08:00
Christian Schwarz	e439ee83c1	linux: zvol: avoid heap allocation for zvol_request_sync=1 The spl_kmem_alloc showed up in some flamegraphs in a single-threaded 4k sync write workload at 85k IOPS on an Intel(R) Xeon(R) Silver 4215 CPU @ 2.50GHz. Certainly not a huge win but I believe the change is clean and easy to maintain down the road. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Signed-off-by: Christian Schwarz <me@cschwarz.com> Closes #11666	2021-03-03 08:15:28 -08:00
Andriy Gapon	2e160dee97	Fix assert in FreeBSD-specific dmu_read_pages The function has three similar pieces of code: for read-behind pages, requested pages and read-ahead pages. All three pieces had an assert to ensure that the page is not mapped. Later the assert was relaxed to require that the page is not mapped for writing. But that was done in two places out of three. This change fixes the third piece, read-ahead. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Andriy Gapon <avg@FreeBSD.org> Closes #11654	2021-02-27 17:23:09 -08:00
Coleman Kane	d939930fcc	Linux 5.12 compat: bio->bi_disk member moved The struct bio member bi_disk was moved underneath a new member named bi_bdev. So all attempts to reference bio->bi_disk need to now become bio->bi_bdev->bd_disk. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Coleman Kane <ckane@colemankane.org> Closes #11639	2021-02-24 10:04:34 -08:00
Brian Behlendorf	1dfc82a14e	Linux: increase max nvlist_src size On Linux increase the maximum allowed size of the src nvlist which can be passed to the /dev/zfs ioctl. Originally, this was set to a maximum of KMALLOC_MAX_SIZE (4M) because it was kmalloc'd. Since that time it's been converted to a vmalloc so that's no longer a hard limit, and it's desirable for `zfs send/recv` to allow larger nvlists so more snapshots can be sent at once. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #6572 Closes #11638	2021-02-24 09:57:18 -08:00
Brian Atkinson	c0801bf35a	Cleaning up uio headers Making uio_impl.h the common header interface between Linux and FreeBSD so both OS's can share a common header file. This also helps reduce code duplication for zfs_uio_t for each OS. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Brian Atkinson <batkinson@lanl.gov> Closes #11622	2021-02-20 20:16:50 -08:00
Ryan Moeller	64e0fe14ff	Restore FreeBSD resource usage accounting Add zfs_racct_* interfaces for platform-dependent read/write accounting. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #11613	2021-02-19 22:34:33 -08:00
Mark Johnston	e7adccf7f5	FreeBSD: disable the use of hardware crypto offload drivers for now First, the crypto request completion handler contains a bug in that it fails to reset fs_done correctly after the request is completed. This is only a problem for asynchronous drivers. Second, some hardware drivers have input constraints which ZFS does not satisfy. For instance, ccp(4) apparently requires the AAD length for AES-GCM to be a multiple of the cipher block size, and with qat(4) the AES-GCM AAD length may not be longer than 240 bytes. FreeBSD's generic crypto framework doesn't have a mechanism to automatically fall back to a software implementation if a hardware driver cannot process a request, and ZFS does not tolerate such errors. The plan is to implement such a fallback mechanism, but with FreeBSD 13.0 approaching we should simply disable the use of hardware drivers for now. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Mark Johnston <markj@FreeBSD.org> Closes #11612	2021-02-18 15:51:20 -08:00
Ryan Libby	bf156c966b	Remove unused abd_alloc_scatter_offset_chunkcnt Remove function that became unused after refactoring in `e2af2acce3`. Reviewed-by: Ryan Moeller <ryan@ixsystems.com> Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Libby <rlibby@FreeBSD.org> Closes #11614	2021-02-17 21:39:13 -08:00
khng300	fc273894d2	Rename zfs_inode_update to zfs_znode_update_vfs zfs_znode_update_vfs is a more platform-agnostic name than zfs_inode_update. Besides that, the function's prototype is moved to include/sys/zfs_znode.h as the function is also used in common code. Reviewed-by: Ryan Moeller <ryan@ixsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ka Ho Ng <khng300@gmail.com> Sponsored by: The FreeBSD Foundation Closes #11580	2021-02-09 11:17:29 -08:00
Kleber Tarcísio	4f22619ae3	Add an assert to clarify code The first time through the loop prevdb and prevhdl are NULL. They are then both set, but only prevdb is checked. Add an ASSERT to make it clear that prevhdl must be set when prevdb is. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Kleber <klebertarcisio@yahoo.com.br> Closes #10754 Closes #11575	2021-02-09 11:14:59 -08:00
Matthew Ahrens	f8c0d7e1f6	fix abd_nr_pages_off for gang abd `__vdev_disk_physio()` uses `abd_nr_pages_off()` to allocate a bio with a sufficient number of iovec's to process this zio (i.e. `nr_iovecs`/`bi_max_vecs`). If there are not enough iovec's in the bio, then additional bio's will be allocated. However, this is a sub-optimal code path. In particular, it requires several abd calls (to `abd_nr_pages_off()` and `abd_bio_map_off()`) which will have to walk the constituents of the ABD (the pages or the gang children) because they are looking for offsets > 0. For gang ABD's, `abd_nr_pages_off()` returns the number of iovec's needed for the first constituent, rather than the sum of all constituents (within the requested range). This always under-estimates the required number of iovec's, which causes us to always need several bio's. The end result is that `__vdev_disk_physio()` is usually O(n^2) for gang ABD's (and occasionally O(n^3), when more than 16 bio's are needed). This commit fixes `abd_nr_pages_off()`'s handling of gang ABD's, to correctly determine how many iovec's are needed, by adding up the number of iovec's for each of the gang children in the requested range. Reviewed-by: Mark Maybee <mark.maybee@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Signed-off-by: Matthew Ahrens <mahrens@delphix.com> Closes #11536	2021-01-28 09:28:20 -08:00
Paul Dagnelie	2921ad6cba	Fix zrele race in zrele_async that can cause hang There is a race condition in zfs_zrele_async when we are checking if we would be the one to evict an inode. This can lead to a txg sync deadlock. Instead of calling into iput directly, we attempt to perform the atomic decrement ourselves, unless that would set the i_count value to zero. In that case, we dispatch a call to iput to run later, to prevent a deadlock from occurring. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Signed-off-by: Paul Dagnelie <pcd@delphix.com> Closes #11527 Closes #11530	2021-01-27 21:29:58 -08:00
Brian Behlendorf	0e6c493fec	cppcheck: integrate cppcheck In order for cppcheck to perform a proper analysis it needs to be aware of how the sources are compiled (source files, include paths/files, extra defines, etc). All the needed information is available from the Makefiles and can be leveraged with a generic cppcheck Makefile target. So let's add one. Additional minor changes: * Removing the cppcheck-suppressions.txt file. With cppcheck 2.3 and these changes it appears to no longer be needed. Some inline suppressions were also removed since they appear not to be needed. We can add them back if it turns out they're needed for older versions of cppcheck. * Added the ax_count_cpus m4 macro to detect at configure time how many processors are available in order to run multiple cppcheck jobs. This value is also now used as a replacement for nproc when executing the kernel interface checks. * "PHONY =" line moved into the Rules.am file which is included at the top of all Makefile.am's. This is just convenient because it allows us to use the += syntax to add phony targets. * One upside of this integration worth mentioning is it now allows `make cppcheck` to be run in any directory to check that subtree. * For the moment, cppcheck is not run against the FreeBSD specific kernel sources. The cppcheck-FreeBSD target will need to be implemented and tested on FreeBSD to support this. Reviewed-by: Ryan Moeller <ryan@ixsystems.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #11508	2021-01-26 16:12:26 -08:00
Brian Behlendorf	a06ba74a1e	cppcheck: return value always 0 Identical condition and return expression 'rc', return value is always 0. Reviewed-by: Ryan Moeller <ryan@ixsystems.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #11508	2021-01-26 16:12:18 -08:00
Brian Behlendorf	2cdd75bed7	cppcheck: remove redundant ASSERTs The ASSERT that the passed pointer isn't NULL appears after the pointer has already been dereferenced. Remove the redundant check. Reviewed-by: Ryan Moeller <ryan@ixsystems.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #11508	2021-01-26 16:12:10 -08:00
Matthew Macy	a4134da2b2	spl-taskq: Make sure thread tsd hash entry is cleared Like any other thread created by thread_create() we need to call thread_exit() to properly clean it up. In particular, this ensures the tsd hash for the thread is cleared. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Matt Macy <mmacy@FreeBSD.org> Closes #11512	2021-01-25 11:18:28 -08:00
Ryan Moeller	1c94345103	FreeBSD: upstream changes to VFS interface Set VIRF_MOUNTPOINT flag on snapshot mountpoint. Authored-by: Mateusz Guzik <mjg@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #11458	2021-01-23 15:40:43 -08:00
Brian Behlendorf	83b91ae1a4	Linux 5.10 compat: restore custom uio_prefaultpages() As part of commit `1c2358c1` the custom uio_prefaultpages() code was removed in favor of using the generic kernel provided iov_iter_fault_in_readable() interface. Unfortunately, it turns out that up until the Linux 4.7 kernel the function would only ever fault in the first iovec of the iov_iter. The result being uiomove_iov() may hang waiting for the page. This commit effectively restores the custom uio_prefaultpages() code for Linux 4.9 and earlier kernels which contain the troublesome version of iov_iter_fault_in_readable(). Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #11463 Closes #11484	2021-01-21 10:43:39 -08:00
Brian Atkinson	d0cd9a5cc6	Extending FreeBSD UIO Struct In FreeBSD the struct uio was just a typedef to uio_t. In order to extend this struct, outside of the definition for the struct uio, the struct uio has been embedded inside of a uio_t struct. Also renamed all the uio_* interfaces to be zfs_uio_* to make it clear this is a ZFS interface. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Jorgen Lundman <lundman@lundman.net> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Brian Atkinson <batkinson@lanl.gov> Closes #11438	2021-01-20 21:27:30 -08:00


334 Commits