mirror_zfs

mirror of https://git.proxmox.com/git/mirror_zfs.git synced 2026-04-14 15:34:53 +03:00

Author	SHA1	Message	Date
наб	0c2eb3f540	fsck.zfs: implement 4/8 exit codes as suggested in manpage Update the fsck.zfs helper to bubble up some already-known-about errors if they are detected in the pool. health=degraded => 4/"Filesystem errors left uncorrected" health=faulted && dataset in /etc/fstab => 8/"Operational error" pool not found => 8/"Operational error" everything else => 0 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11806	2021-03-31 10:49:56 -07:00
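The exit-code mapping listed in the fsck.zfs commit above can be sketched as a small decision function. This is an illustrative Python model of the four rules in the commit message, not the shell helper itself; the name `fsck_exit_code` and its parameters are hypothetical.

```python
from typing import Optional

# Exit codes quoted in the commit message.
FSCK_OK = 0
FSCK_UNCORRECTED = 4   # "Filesystem errors left uncorrected"
FSCK_OPERATIONAL = 8   # "Operational error"


def fsck_exit_code(health: Optional[str], in_fstab: bool) -> int:
    """Map a pool's health to an fsck exit code.

    health   -- pool health string, or None if the pool was not found
                (hypothetical convention for this sketch)
    in_fstab -- whether the dataset is listed in /etc/fstab
    """
    if health is None:
        # pool not found => 8
        return FSCK_OPERATIONAL
    state = health.lower()
    if state == "degraded":
        # health=degraded => 4
        return FSCK_UNCORRECTED
    if state == "faulted" and in_fstab:
        # health=faulted && dataset in /etc/fstab => 8
        return FSCK_OPERATIONAL
    # everything else => 0
    return FSCK_OK
```

Note that under these rules a faulted pool whose dataset is not in /etc/fstab still yields 0, since only the combination of both conditions is listed.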
Mike Swanson	67859aedd1	Add compatibility file sets (ZoL 0.6.1, 0.6.4, OpenZFS 2.1) ZoL 0.6.1 introduced feature flags with the three features that all implementations at the time were guaranteed to have. 0.6.4 introduced a few more until 0.6.5 added two after that. OpenZFS 2.1 added the dRAID feature. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Mike Swanson <mikeonthecomputer@gmail.com> Closes #11818	2021-03-31 09:40:25 -07:00
Brian Behlendorf	9ac82cab2d	Update META Increase the version to 2.1.99 to indicate the master branch is newer than the 2.1.x release. This ensures packages built from master branch are considered to be newer than the last release. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2021-03-30 10:50:55 -07:00
Brian Behlendorf	3522f57b6a	Tag 2.1.0-rc1 New features: - Distributed Spare (dRAID) Feature - Added "compatibility" property for zpool feature sets - Added zpool_influxdb command to collect zpool statistics Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2021-03-29 16:39:52 -07:00
наб	38280c3526	zed: reap child after killing on time-out When a child process is killed waitpid() must be called on the pid to reap the zombie process. Update BUGS section to reflect reality by replacing "zedlets aren't time limited" with "zedlets can be interrupted". Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11769 Closes #11798	2021-03-26 14:21:00 -07:00
Matthew Ahrens	2b56a63457	Use a helper function to clarify gang block size For gang blocks, `DVA_GET_ASIZE()` is the total space allocated for the gang DVA including its children BP's. The space allocated at each DVA's vdev/offset is `vdev_psize_to_asize(vd, SPA_GANGBLOCKSIZE)`. This commit makes this relationship more clear by using a helper function, `vdev_gang_header_asize()`, for the space allocated at the gang block's vdev/offset. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Matthew Ahrens <mahrens@delphix.com> Closes #11744	2021-03-26 11:19:35 -07:00
Matthew Ahrens	b85f47efd0	When specifying raidz vdev name, parity count should match When specifying the name of a RAIDZ vdev on the command line, it can be specified as raidz-<vdevID> or raidzP-<vdevID>. e.g. `zpool clear poolname raidz-0` or `zpool clear poolname raidz2-0` If the parity is specified in the vdev name, it should match the actual parity of that RAIDZ vdev, otherwise the command should fail. This commit makes it so. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Co-authored-by: Stuart Maybee <stuart.maybee@comcast.net> Signed-off-by: Matthew Ahrens <mahrens@delphix.com> Closes #11742	2021-03-26 11:12:22 -07:00
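The naming rule in the commit above (a RAIDZ vdev may be given as raidz-<vdevID> or raidzP-<vdevID>, and a stated parity must match) can be sketched as follows. This is a hypothetical Python model, not the libzfs code; the function name is invented and the sketch assumes parity values 1 through 3.

```python
import re

# raidz-<id> or raidz<P>-<id>; the parity digit is optional.
_RAIDZ_NAME = re.compile(r"raidz([1-3])?-(\d+)")


def raidz_name_matches(name: str, vdev_id: int, parity: int) -> bool:
    """Return True if `name` refers to the RAIDZ vdev with this id and parity."""
    m = _RAIDZ_NAME.fullmatch(name)
    if m is None:
        return False
    named_parity, named_id = m.group(1), int(m.group(2))
    if named_id != vdev_id:
        return False
    # With no parity in the name, any parity is accepted; otherwise it must match.
    return named_parity is None or int(named_parity) == parity
```

For a RAIDZ2 vdev with ID 0, both `raidz-0` and `raidz2-0` match in this sketch, while `raidz1-0` does not.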
Luis Henriques	2037edbdaa	Fix error code on __zpl_ioctl_setflags() Other (all?) Linux filesystems seem to return -EPERM instead of -EACCES when trying to set FS_APPEND_FL or FS_IMMUTABLE_FL without the CAP_LINUX_IMMUTABLE capability. This was detected by generic/545 test in the fstest suite. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Luis Henriques <henrix@camandro.org> Closes #11791	2021-03-26 10:46:45 -07:00
Jessica Clarke	ef977fce66	Support running FreeBSD buildworld on Arm-based macOS hosts Arm-based Macs are like FreeBSD and provide a full 64-bit stat from the start, so have no stat64 variants. Thus, define stat64 and fstat64 as aliases for the normal versions. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Jessica Clarke <jrtc27@jrtc27.com> Closes #11771	2021-03-26 10:45:12 -07:00
Andrea Gelmini	8a915ba1f6	Removed duplicated includes Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net> Closes #11775	2021-03-22 12:34:58 -07:00
Andrea Gelmini	be1e69f31c	Fix typo in Python method name Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net> Closes #11776	2021-03-22 12:32:38 -07:00
Alexander Motin	891568c990	Split dmu_zfetch() speculation and execution parts To make better predictions on parallel workloads dmu_zfetch() should be called as early as possible to reduce possible request reordering. In particular, it should be called before dmu_buf_hold_array_by_dnode() calls dbuf_hold(), which may sleep waiting for indirect blocks, waking up multiple threads at the same time on completion, which can significantly reorder the requests, making the stream look random. But we should not issue prefetch requests before the on-demand ones, since they may get to the disks first despite the I/O scheduler, increasing on-demand request latency. This patch splits dmu_zfetch() into two functions: dmu_zfetch_prepare() and dmu_zfetch_run(). The first can be executed as early as needed. It only updates statistics and makes predictions without issuing any I/Os. The I/O issuance is handled by dmu_zfetch_run(), which can be called later when all on-demand I/Os are already issued. It even tracks the activity of other concurrent threads, issuing the prefetch only when _all_ on-demand requests are issued. For many years it was a big problem for storage servers, handling deeper request queues from their clients, having to either serialize consequential reads to make ZFS prefetcher usable, or execute the incoming requests as-is and get almost no prefetch from ZFS, relying only on deep enough prefetch by the clients. Benefits of those ways varied, but neither was perfect. With this patch deeper queue sequential read benchmarks with CrystalDiskMark from Windows via iSCSI to FreeBSD target show me much better throughput with almost 100% prefetcher hit rate, compared to almost zero before. While there, I also removed per-stream zs_lock as useless, completely covered by parent zf_lock. Also I reused zs_blocks refcount to track zf_stream linkage of the stream, since I believe previous zs_fetch == NULL check in dmu_zfetch_stream_done() was racy. 
Delete prefetch streams when they reach ends of files. It saves up to 1KB of RAM per file, plus reduces searches through the stream list. Block data prefetch (speculation and indirect block prefetch is still done since they are cheaper) if all dbufs of the stream are already in DMU cache. First cache miss immediately fires all the prefetch that would be done for the stream by that time. It saves some CPU time if same files within DMU cache capacity are read over and over. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Adam Moss <c@yotes.com> Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored-By: iXsystems, Inc. Closes #11652	2021-03-19 22:56:11 -07:00
Chunwei Chen	296a4a369b	Fix zfs_get_data access to files with wrong generation If TX_WRITE is created on a file, and the file is later deleted and a new directory is created on the same object id, it is possible that when zil_commit happens, zfs_get_data will be called on the new directory. This may result in panic as it tries to do range lock. This patch fixes this issue by recording the generation number during zfs_log_write, so zfs_get_data can check if the object is valid. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Chunwei Chen <david.chen@nutanix.com> Closes #10593 Closes #11682	2021-03-19 22:53:31 -07:00
Andrew	66e6d3f128	Fix regression in POSIX mode behavior Commit `235a85657` introduced a regression in evaluation of POSIX modes that require group DENY entries in the internal ZFS ACL. An example of such a POSIX mode is 007. When write_implies_delete_child is set, then ACE_WRITE_DATA is added to `wanted_dirperms` prior to calling zfs_zaccess_common(). This occurs in zfs_zaccess_delete(). Unfortunately, when zfs_zaccess_aces_check hits this particular DENY ACE, zfs_groupmember() is checked to determine whether access should be denied, and since zfs_groupmember() always returns B_TRUE on Linux, this check fails, resulting ultimately in EPERM being returned. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Andrew Walker <awalker@ixsystems.com> Closes #11760	2021-03-19 22:50:46 -07:00
Palash Gandhi	c23850759f	ZTS: New test for kernel panic induced by redacted send This change adds a new test that covers a bug fix in the binary search in the redacted send resume logic that causes a kernel panic. The bug was fixed in https://github.com/openzfs/zfs/pull/11297. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Co-authored-by: John Kennedy <john.kennedy@delphix.com> Signed-off-by: Palash Gandhi <palash.gandhi@delphix.com> Closes #11764	2021-03-19 22:47:50 -07:00
Martin Matuška	cd5b812818	Allow setting bootfs property on pools with indirect vdevs The FreeBSD boot loader relies on the bootfs property and is capable of booting from removed (indirect) vdevs. Reviewed-by Eric van Gyzen Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Martin Matuska <mm@FreeBSD.org> Closes #11763	2021-03-19 22:46:43 -07:00
Ryan Moeller	0ab84bff55	Fix typo in zgenhostid.8 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #11770	2021-03-19 22:39:42 -07:00
Brian Atkinson	f52124dce8	Removing old code for k(un)map_atomic It used to be required to pass a enum km_type to kmap_atomic() and kunmap_atomic(), however this is no longer necessary and the wrappers zfs_k(un)map_atomic removed these. This is confusing in the ABD code as the struct abd_iter member iter_km no longer exists and the wrapper macros simply compile them out. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Adam Moss <c@yotes.com> Signed-off-by: Brian Atkinson <batkinson@lanl.gov> Closes #11768	2021-03-19 22:38:44 -07:00
Serapheim Dimitropoulos	793c958f6f	Initialize metaslab range trees in metaslab_init = Motivation We've noticed several zloop crashes within Delphix generated due to the following sequence of events: - A device gets expanded and new metaslabs are allocated for it. These metaslabs go through `metaslab_init()` but haven't gone through `metaslab_sync_done()` yet. This means that the only range tree that's actually set is the `ms_allocatable`. All the others are NULL. - A vdev_initialization is issued and `vdev_initialize_thread` starts processing one of these new metaslabs of the expanded vdev. - As part of `vdev_initialize_calculate_progress()` we call into `metaslab_load()` and `metaslab_load_impl()` which in turn tries to dereference the metaslabs trees that are still NULL and therefore we crash. The same failure can come up from the `vdev_trim` code paths. = This Patch We considered the following solutions to deal with this issue: [A] Add logic to `vdev_initialize/trim` to skip those new metaslabs. We decided against this as it would be good to avoid exposing this lower-level detail to higher-level operations. [B] Have `metaslab_load_impl()` return early for new metaslabs and thus never touch those range_trees that are NULL at that time. This seemed more of a work-around for the bug and not a clear-cut solution. [C] Refactor our logic so all metaslabs have their range_trees created at the time of their creation in `metaslab_init()`. In this patch we decided to go with [C] because: (1) It doesn't expose more metaslab details to higher level operations such as vdev initialize and trim. (2) The current behavior of creating the range trees lazily in `metaslab_sync_done()` is unnecessarily complicated. (3) Always initializing the metaslab range_trees makes other parts of the codebase cleaner. For example, we used to use `ms_freed` as the reference value for knowing whether all the range_trees have been initialized. 
Now we no longer need to do that check in most places (and in the few that we do we use the `ms_new` boolean field now which is more readable). = Side Changes Probably due to a mismerge we set `ms_loaded` to `B_TRUE` twice in `metaslab_load_impl()`. In this patch we remove the extraneous assignment. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com> Closes #11737	2021-03-19 22:36:02 -07:00
Coleman Kane	ffd6978ef5	Linux 5.12 update: bio_max_segs() replaces BIO_MAX_PAGES The BIO_MAX_PAGES macro is being retired in favor of a bio_max_segs() function that implements the typical MIN(x,y) logic used throughout the kernel for bounding the allocation, and also the new implementation is intended to be signed-safe (which the former was not). Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Coleman Kane <ckane@colemankane.org> Closes #11765	2021-03-19 22:33:42 -07:00
Coleman Kane	e2a8296131	Linux 5.12 compat: idmapped mounts In Linux 5.12, the filesystem API was modified to support idmapped mounts by adding a "struct user_namespace *" parameter to a number of functions and VFS handlers. This change adds the needed autoconf macros to detect the new interfaces and updates the code appropriately. This change does not add support for idmapped mounts, instead it preserves the existing behavior by passing the initial user namespace where needed. A subsequent commit will be required to add support for idmapped mounts. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Co-authored-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Coleman Kane <ckane@colemankane.org> Closes #11712	2021-03-19 21:00:59 -07:00
Matthew Ahrens	330c6c0523	Clean up RAIDZ/DRAID ereport code The RAIDZ and DRAID code is responsible for reporting checksum errors on their child vdevs. Checksum errors represent events where a disk returned data or parity that should have been correct, but was not. In other words, these are instances of silent data corruption. The checksum errors show up in the vdev stats (and thus `zpool status`'s CKSUM column), and in the event log (`zpool events`). Note, this is in contrast with the more common "noisy" errors where a disk goes offline, in which case ZFS knows that the disk is bad and doesn't try to read it, or the device returns an error on the requested read or write operation. RAIDZ/DRAID generate checksum errors via three code paths: 1. When RAIDZ/DRAID reconstructs a damaged block, checksum errors are reported on any children whose data was not used during the reconstruction. This is handled in `raidz_reconstruct()`. This is the most common type of RAIDZ/DRAID checksum error. 2. When RAIDZ/DRAID is not able to reconstruct a damaged block, that means that the data has been lost. The zio fails and an error is returned to the consumer (e.g. the read(2) system call). This would happen if, for example, three different disks in a RAIDZ2 group are silently damaged. Since the damage is silent, it isn't possible to know which three disks are damaged, so a checksum error is reported against every child that returned data or parity for this read. (For DRAID, typically only one "group" of children is involved in each io.) This case is handled in `vdev_raidz_cksum_finish()`. This is the next most common type of RAIDZ/DRAID checksum error. 3. If RAIDZ/DRAID is not able to reconstruct a damaged block (like in case 2), but there happens to be additional copies of this block due to "ditto blocks" (i.e. 
multiple DVA's in this blkptr_t), and one of those copies is good, then RAIDZ/DRAID compares each sector of the data or parity that it retrieved with the good data from the other DVA, and if they differ then it reports a checksum error on this child. This differs from case 2 in that the checksum error is reported on only the subset of children that actually have bad data or parity. This case happens very rarely, since normally only metadata has ditto blocks. If the silent damage is extensive, there will be many instances of case 2, and the pool will likely be unrecoverable. The code for handling case 3 is considerably more complicated than the other cases, for two reasons: 1. It needs to run after the main raidz read logic has completed. The data RAIDZ read needs to be preserved until after the alternate DVA has been read, which necessitates refcounts and callbacks managed by the non-raidz-specific zio layer. 2. It's nontrivial to map the sections of data read by RAIDZ to the correct data. For example, the correct data does not include the parity information, so the parity must be recalculated based on the correct data, and then compared to the parity that was read from the RAIDZ children. Due to the complexity of case 3, the rareness of hitting it, and the minimal benefit it provides above case 2, this commit removes the code for case 3. These types of errors will now be handled the same as case 2, i.e. the checksum error will be reported against all children that returned data or parity. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Matthew Ahrens <mahrens@delphix.com> Closes #11735	2021-03-19 16:22:10 -07:00
Mateusz Guzik	2f385c913f	FreeBSD: make seqc asserts conditional on replay Avoids tripping on asserts when doing pool recovery. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Closes #11739	2021-03-17 22:09:45 -07:00
Matthew Ahrens	46df6e98aa	Remove unused rr_code The `rr_code` field in `raidz_row_t` is unused. This commit removes the field, as well as the code that's used to set it. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Matthew Ahrens <mahrens@delphix.com> Closes #11736	2021-03-17 21:57:09 -07:00
Ryan Moeller	ec3e4c6784	FreeBSD: Fix memory leaks in kstats Don't handle (incorrectly) kmem_zalloc() failure. With KM_SLEEP, it will never return NULL. Free the data allocated for non-virtual kstats when deleting the object. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #11767	2021-03-17 21:55:18 -07:00
Adam D. Moss	1daad98176	Linux: always check or verify return of igrab() zhold() wraps igrab() on Linux, and igrab() may fail when the inode is in the process of being deleted. This means zhold() must only be called when a reference exists and therefore it cannot be deleted. This is the case for all existing consumers so add a VERIFY and a comment explaining this requirement. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Adam Moss <c@yotes.com> Closes #11704	2021-03-16 16:33:34 -07:00
Dries Michiels	5f9d61d06b	Update FreeBSD versions Update supported FreeBSD versions in documentation. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Dries Michiels <driesm.michiels@gmail.com> Closes #11718	2021-03-16 15:03:28 -07:00
gldisater	07dff5cffe	Hold and release permissions exist The man page was missing these two permissions. Add the missing permissions to the man page. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Jeremy Faulkner <gldisater@gldis.ca> Closes #11727	2021-03-16 15:01:21 -07:00
Ryan Moeller	5638803b6a	ZTS: Add tests for DOS mode attributes Create a new section of tests to run with acltype=off. For now the only test we have is for the DOS mode READONLY attribute on FreeBSD. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #11734	2021-03-16 15:00:14 -07:00
Don Brady	dd0b5c8559	Reference_tracking_enable should be a module param To make use of zfs_refcount_held tunable it should be a module parameter in open-zfs. Also, since the macros will auto-generate OS specific tunables, removed the existing zfs_refcount_held reference in module/os/freebsd/zfs/sysctl_os.c. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Allan Jude <allan@klarasystems.com> Signed-off-by: Don Brady <don.brady@delphix.com> Closes #11753	2021-03-16 14:56:17 -07:00
Ryan Moeller	9305ff2edf	ZTS: Fix incorrect use of libtest in user_run by xattr_003_neg You can't use user_run to eval ksh functions defined in libtest unless you include libtest in the user shell. Fix xattr_003_neg by: * include libtest in the user shell * then run get_xattr * assert this fails * use variables for filenames so they don't change in the user's shell * don't log the contents of /etc/passwd * cleanup all byproducts Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #11185	2021-03-12 16:17:30 -08:00
Ryan Moeller	e0b53a5dbb	ZTS: Use ksh and current environment for user_run The current user_run often does not work as expected. Commands are run in a different shell, with a different environment, and all output is discarded. Simplify user_run to retain the current environment, eliminate eval, and feed the command string into ksh. Enhance the logging for user_run so we can see out and err. Reviewed-by: John Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #11185	2021-03-12 16:17:01 -08:00
Mariusz Zaborski	e464f7c7cc	FreeBSD: bring back possibility to rewind the checkpoint from bootloader Add parsing of the rewind options. When I was upstreaming the change [1], I omitted the part where we detect that the pool should be rewound. When the FreeBSD repo has synced with the OpenZFS, this part of the code was removed. [1] FreeBSD repo: 277f38abffc6a8160b5044128b5b2c620fbb970c [2] OpenZFS repo: `f2c027bd6a` External-issue: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=254152 Originally reviewed by: tsoome, allanjude Originally reviewed by: kevans (ok from high-level overview) Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Mariusz Zaborski <oshogbo@vexillium.org> Closes #11730	2021-03-12 16:12:14 -08:00
Ryan Moeller	f845b2dd1c	FreeBSD: Clean up zfsdev_close to match Linux Resolve some oddities in zfsdev_close() which could result in a panic and were not present in the equivalent function for Linux. - Remove unused definition ZFS_MIN_MINOR - FreeBSD: Simplify zfsdev state destruction - Assert zs_minor is valid in zfsdev_close - Make locking around zfsdev state match Linux Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #11720	2021-03-12 16:09:15 -08:00
Mateusz Guzik	e3e82dcc51	FreeBSD: switch teardown lock to rms This deserializes otherwise non-contending operations. The previous scheme of using 17 locks hashed by curthread runs into conflicts very quickly. Check the pull request for sample results. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Matt Macy <mmacy@FreeBSD.org> Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Closes #11153	2021-03-12 15:51:48 -08:00
Mateusz Guzik	5ebe425a5b	Macroify teardown lock handling This will allow platforms to implement it as they see fit, in particular in a different manner than rrm locks. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Matt Macy <mmacy@FreeBSD.org> Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Closes #11153	2021-03-12 15:51:39 -08:00
Mateusz Guzik	9847f77f01	FreeBSD: rename teardown inactive macros to mimic rrm convention Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Matt Macy <mmacy@FreeBSD.org> Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Closes #11153	2021-03-12 15:51:31 -08:00
Mateusz Guzik	f9acd578f0	FreeBSD: remove 2 assertions that teardown lock is not held They are not very useful and hard to implement in the rms routine the code is about to start using. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Matt Macy <mmacy@FreeBSD.org> Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Closes #11153	2021-03-12 15:51:20 -08:00
Mateusz Guzik	300f68e017	FreeBSD: rework asserts in zfs_dd_lookup 1. even up ifdefs 2. drop the arguably useless teardown lock asserts -- nothing else checks for it Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Matt Macy <mmacy@FreeBSD.org> Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Closes #11153	2021-03-12 15:51:07 -08:00
Mateusz Guzik	446400346d	Add branch prediction to ZFS_ENTER and ZFS_VERIFY_ZP macros They are expected to fail only in corner cases. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Matt Macy <mmacy@FreeBSD.org> Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Closes #11153	2021-03-12 15:51:03 -08:00
George Wilson	0936981d86	zpool import cachefile improvements Importing a pool using the cachefile is ideal to reduce the time required to import a pool. However, if the devices associated with a pool in the cachefile have changed, then the import would fail. This can easily be corrected by doing a normal import which would then read the pool configuration from the labels. The goal of this change is to make importing using a cachefile more resilient and auto-correcting. This is accomplished by having the cachefile import logic automatically fall back to reading the labels of the devices similar to a normal import. The main difference between the fallback logic and a normal import is that the cachefile import logic will only look at the device directories that were originally used when the cachefile was populated. Additionally, the fallback logic will always import by guid to ensure that only the pools in the cachefile would be imported. External-issue: DLPX-71980 Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: George Wilson <gwilson@delphix.com> Closes #11716	2021-03-12 15:42:27 -08:00
Martin Matuška	b8fa03efbc	Fix whitespace introduced in `ecc277cff` The manual page change in `ecc277c` has introduced whitespace on line ends. Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Martin Matuska <mm@FreeBSD.org> Closes #11722	2021-03-11 19:42:04 -08:00
Ryan Moeller	35aa9dc6df	FreeBSD: Fix scope of deadman tunables A few deadman tunables ended up in the wrong sysctl node. Move them to vfs.zfs.deadman.* Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Ryan Moeller <ryan@iXsystems.com> Closes #11715	2021-03-11 19:23:24 -08:00
Adam D. Moss	c94d648b1c	Microoptimizations for VERIFY() and friends Add branch hints and constify the intermediate evaluations of left/right params in VERIFY3*(). Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Adam Moss <c@yotes.com> Closes #11708	2021-03-11 17:16:09 -08:00
Allan Jude	92e8fb6395	Add missing files to Makefile Some .h files that were added were missed in this Makefile. Since they are .h files, their being missing only resulted in them disappearing from the dist archive. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Allan Jude <allan@klarasystems.com> Closes #11705	2021-03-11 17:13:34 -08:00
George Melikov	8d534c37ac	CI checkstyle: pin ubuntu version Our checkstyle doesn't work well on Ubuntu 20.04, temporarily pin it to 18.04. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: George Melikov <mail@gmelikov.ru> Closes #11713	2021-03-11 17:11:31 -08:00
Don Brady	f5ada6538d	Return finer grain errors in libzfs unmount_one Added errno mappings to unmount_one() in libzfs. Changed do_unmount() implementation to return errno errors directly like is done for do_mount() and others. Reviewed-by: Mark Maybee <mark.maybee@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Don Brady <don.brady@delphix.com> Closes #11681	2021-03-08 08:46:45 -08:00
Tony Hutter	4fdbd43450	vdev_id: Create symlinks even if no /dev/mapper/ vdev_id uses the /dev/mapper/ symlinks to resolve a UUID to a dm name (like dm-1). However on some multipath setups, there is no /dev/mapper/ entry for the UUID at the time vdev_id is called by udev. However, this isn't necessarily needed, as we may be able to resolve the dm name from the $DEVNAME that udev passes us (like DEVNAME="/dev/dm-1"). This patch tries to resolve the dm name from $DEVNAME first, before falling back to looking in /dev/mapper/. This fixed an issue where the by-vdev names weren't reliably showing up on one of our nodes. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #11698	2021-03-08 08:43:30 -08:00
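The lookup order described in the vdev_id commit above (try $DEVNAME first, then fall back to /dev/mapper/) can be sketched like this. It is a hypothetical Python model of the ordering only; vdev_id itself is a shell script, and `resolve_dm_name` and the `mapper_lookup` callback are invented names for illustration.

```python
import posixpath
from typing import Callable, Optional


def resolve_dm_name(
    devname: Optional[str],
    mapper_lookup: Callable[[], Optional[str]],
) -> Optional[str]:
    """Return a dm name such as "dm-1", or None if it cannot be resolved.

    devname       -- the DEVNAME value passed by udev, e.g. "/dev/dm-1"
    mapper_lookup -- hypothetical fallback that consults /dev/mapper/
    """
    if devname:
        base = posixpath.basename(devname)
        # Accept only names of the form dm-<number>.
        if base.startswith("dm-") and base[3:].isdigit():
            return base
    # DEVNAME did not yield a dm name; fall back to the /dev/mapper/ lookup.
    return mapper_lookup()
```

Passing the fallback as a callable keeps the sketch free of filesystem access, so the ordering can be exercised without a real multipath setup.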
Antonio Russo	b2eebe3ae7	ZTS events_002: Improve speed and reliability events_002 exercises the ZED, ensuring that it neither misses events, nor reports events twice. On slow test hardware, some of the timeouts are insufficient to allow the ZED to properly settle. Conversely, on fast hardware these same timeouts are too long, unnecessarily slowing the test run. Instead of using a fixed timeout, wait for the expected final event before returning. Additionally, wait with a timeout for unexpected events to avoid missing them if they show up late. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Antonio Russo <aerusso@aerusso.net> Closes #11703	2021-03-08 08:42:45 -08:00
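The change of strategy in the events_002 commit above, waiting for an expected condition instead of sleeping for a fixed time, can be sketched generically. This is a Python illustration of the polling pattern only; the test suite itself is written in ksh, and `wait_for` is a hypothetical helper name.

```python
import time
from typing import Callable


def wait_for(condition: Callable[[], bool], timeout: float,
             interval: float = 0.05) -> bool:
    """Poll `condition` until it is true or `timeout` seconds elapse.

    Returns True as soon as the condition holds, so fast hardware does not
    pay for the full timeout; returns False once the deadline has passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

The same helper covers both uses named in the commit: waiting for the expected final event returns early on success, while waiting for an unexpected event is expected to run to the timeout and return False.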
Christian Schwarz	93e3658035	zvol: call zil_replaying() during replay zil_replaying(zil, tx) has the side-effect of informing the ZIL that an entry has been replayed in the (still open) tx. The ZIL uses that information to record the replay progress in the ZIL header when that tx's txg syncs. ZPL log entries are not idempotent and logically dependent and thus calling zil_replaying() is necessary for correctness. For ZVOLs the question of correctness is more nuanced: ZVOL logs only TX_WRITE and TX_TRUNCATE, both of which are idempotent. Logical dependencies between two records exist only if the write or discard request had sync semantics or if the ranges affected by the records overlap. Thus, at first glance, it would be correct to restart replay from the beginning if we crash before replay completes. But this does not address the following scenario: Assume one log record per LWB. The chain on disk is HDR -> 1:W(1, "A") -> 2:W(1, "B") -> 3:W(2, "X") -> 4:W(3, "Z") where N:W(O, C) represents log entry number N which is a TX_WRITE of C to offset O. We replay 1, 2 and 3 in one txg, sync that txg, then crash. Bit flips corrupt 2, 3, and 4. We come up again and restart replay from the beginning because we did not call zil_replaying() during replay. We replay 1 again, then interpret 2's invalid checksum as the end of the ZIL chain and call replay done. The replayed zvol content is "AX". If we had called zil_replaying() the HDR would have pointed to 3 and our resumed replay would not have replayed anything because 3 was corrupted, resulting in zvol content "BX". If 3 logically depends on 2 then the replay corrupted the ZVOL_OBJ's contents. This patch adds the zil_replaying() calls to the replay functions. Since the callbacks in the replay function need the zilog_t* pointer so that they can call zil_replaying() we open the ZIL while replaying in zvol_create_minor(). We also verify that replay has been done when on-demand-opening the ZIL on the first modifying bio. 
Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Christian Schwarz <me@cschwarz.com> Closes #11667	2021-03-07 09:49:58 -08:00
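The replay scenario in the zvol commit above can be traced with a toy model. This is a simplified Python sketch under stated assumptions, not the ZIL implementation: the `LogEntry` type, the `replay` function, and the `replay_seq` parameter (standing in for the replay progress recorded in the ZIL header) are hypothetical, and replay is modeled as walking the chain, stopping at the first entry with a bad checksum, and skipping entries already recorded as replayed.

```python
from dataclasses import dataclass, replace
from typing import Dict, List


@dataclass(frozen=True)
class LogEntry:
    seq: int               # N in N:W(O, C)
    offset: int            # O
    data: str              # C
    corrupt: bool = False  # models an invalid checksum


def replay(chain: List[LogEntry], volume: Dict[int, str],
           replay_seq: int = 0) -> Dict[int, str]:
    """Apply log entries to `volume`, skipping those with seq <= replay_seq."""
    for entry in chain:
        if entry.corrupt:
            break  # a bad checksum is treated as the end of the chain
        if entry.seq <= replay_seq:
            continue  # already replayed according to the recorded progress
        volume[entry.offset] = entry.data
    return volume


def contents(volume: Dict[int, str]) -> str:
    return "".join(volume[o] for o in sorted(volume))


chain = [LogEntry(1, 1, "A"), LogEntry(2, 1, "B"),
         LogEntry(3, 2, "X"), LogEntry(4, 3, "Z")]

# Entries 1, 2 and 3 are replayed in one txg and synced, then we crash.
synced = replay(chain[:3], {})

# Bit flips corrupt entries 2, 3 and 4.
damaged = [e if e.seq == 1 else replace(e, corrupt=True) for e in chain]

# Without recorded progress, replay restarts from the beginning.
without_progress = contents(replay(damaged, dict(synced), replay_seq=0))
# With progress recorded up to entry 3, nothing is replayed again.
with_progress = contents(replay(damaged, dict(synced), replay_seq=3))
```

In this model `without_progress` is "AX" and `with_progress` is "BX", matching the two outcomes described in the commit message.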


7103 Commits