Compare commits

...

180 Commits

Author SHA1 Message Date
Tony Hutter 6a6bd49398 Tag zfs-2.1.6
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2022-09-28 17:25:10 -07:00
Richard Yao 566e908fa0 Fix bad free in skein code
Clang's static analyzer found a bad free caused by skein_mac_atomic().
It will allocate a context on the stack and then pass it to
skein_final(), which attempts to free it. Upon inspection,
skein_digest_atomic() also has the same problem.

These functions were created to match the OpenSolaris ICP API, so I was
curious how we avoided this in other providers and looked at the SHA2
code. It appears that SHA2 has a SHA2Final() helper function that is
called by the exported sha2_mac_final()/sha2_digest_final() as well as
the sha2_mac_atomic() and sha2_digest_atomic() functions. The real work
is done in SHA2Final() while some checks and the free are done in
sha2_mac_final()/sha2_digest_final().

We fix the use after free in the skein code by taking inspiration from
the SHA2 code. We introduce a skein_final_nofree() that does most of the
work, and make skein_final() into a function that calls it and then
frees the memory.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13954
2022-09-28 17:25:10 -07:00
Tony Hutter a2705b1dd5 zpool: Don't print "repairing" on force faulted drives
If you force fault a drive that's resilvering, it's scan stats can get
frozen in time, giving the false impression that it's being resilvered.
This commit checks the vdev state to see if the vdev is healthy before
reporting "resilvering" or "repairing" in zpool status.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #13927
Closes #13930
2022-09-28 12:41:23 -07:00
Mateusz Guzik 63d4838b4a FreeBSD: handle V_PCATCH
See https://cgit.FreeBSD.org/src/commit/?id=a75d1ddd74312f5dd79bc1e965f7077679659f2e

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #13910
2022-09-28 10:35:13 -07:00
Mateusz Guzik eec942cc54 FreeBSD: catch up to 1400068
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #13909
2022-09-28 10:35:13 -07:00
Mateusz Guzik 2c8e3e4b28 FreeBSD: stop passing LK_INTERLOCK to VOP_LOCK
There is an ongoing effort to eliminate this feature.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #13908
2022-09-28 10:35:13 -07:00
Richard Yao 55816c64da FreeBSD: Fix integer conversion for vnlru_free{,_vfsops}()
When reviewing #13875, I noticed that our FreeBSD code has an issue
where it converts from `int64_t` to `int` when calling
`vnlru_free{,_vfsops}()`. The result is that if the int64_t is `1 <<
36`, the int will be 0, since the low bits are 0. Even when some low
bits are set, a value such as `((1 << 36) + 1)` would truncate to 1,
which is wrong.

There is protection against this on 32-bit platforms, but on 64-bit
platforms, there is no check to protect us, so we add a check.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13882
2022-09-28 10:35:13 -07:00
Ryan Moeller 8dcd6af623 FreeBSD: Ignore symlink to i386 includes
A symlink to i386 includes is created in the build dir on amd64 since
freebsd/freebsd-src@d07600c563

Tell git to ignore it like the other include links.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #13719
2022-09-28 10:35:13 -07:00
Richard Yao c973929b29 LUA: Fix CVE-2014-5461
Apply the fix from upstream.

http://www.lua.org/bugs.html#5.2.2-1
https://www.opencve.io/cve/CVE-2014-5461

It should be noted that exploiting this requires the `SYS_CONFIG`
privilege, and anyone with that privilege likely has other opportunities
to do exploits, so it is unlikely that bad actors could exploit this
unless system administrators are executing untrusted ZFS Channel
Programs.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13949
2022-09-27 16:49:02 -07:00
Richard Yao 835e03682c Linux: Fix uninitialized variable usage in zio_do_crypt_data()
Coverity complained about this. An error from `hkdf_sha512()` before uio
initialization will cause pointers to uninitialized memory to be passed
to `zio_crypt_destroy_uio()`. This is a regression that was introduced
by cf63739191. Interestingly, this never
affected FreeBSD, since the FreeBSD version never had that patch ported.
Since moving uio initialization to the top of this function would slow
down the qat_crypt() path, we only move the `memset()` calls to the top
of the function. This is sufficient to fix this problem.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13944
2022-09-27 15:43:26 -07:00
Alexander Motin 33223cbc3c Refactor Log Size Limit
Original Log Size Limit implementation blocked all writes in case of
limit reached until the TXG is committed and the log is freed.  It
caused huge delays and following speed spikes in application writes.

This implementation instead smoothly throttles writes, using exactly
the same mechanism as used for dirty data.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: jxdking <lostking2008@hotmail.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Issue #12284
Closes #13476
2022-09-26 14:55:27 -07:00
Brian Behlendorf 91e02156dd Revert "Reduce dbuf_find() lock contention"
This reverts commit 34dbc618f5.  While this
change resolved the lock contention observed for certain workloads, it
inadventantly reduced the maximum hash inserts/removes per second.  This
appears to be due to the slightly higher acquisition cost of a rwlock vs
a mutex.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
2022-09-21 13:15:51 -07:00
Richard Yao b66f8d3c2b Add zfs_btree_verify_intensity kernel module parameter
I see a few issues in the issue tracker that might be aided by being
able to turn this on. We have no module parameter for it, so I would
like to add one.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13874
2022-09-21 13:15:51 -07:00
Richard Yao 5096ed31c8 Fix incorrect size given to bqueue_enqueue() call in dmu_redact.c
We pass sizeof (struct redact_record *) rather than sizeof (struct
redact_record). Passing the pointer size is wrong.

Coverity caught this in two places.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13885
2022-09-21 13:15:51 -07:00
Ameer Hamza 035e52f591 Delay ZFS_PROP_SHARESMB property to handle it for encrypted raw receive
For encrypted raw receive, objset creation is delayed until a call to
dmu_recv_stream(). ZFS_PROP_SHARESMB property requires objset to be
populated when calling zpl_earlier_version(). To correctly handle the
ZFS_PROP_SHARESMB property for encrypted raw receive, this change
delays setting the property.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #13878
2022-09-21 13:15:26 -07:00
Ameer Hamza d5105f068f zfs recv hangs if max recordsize is less than received recordsize
- Some optimizations for bqueue enqueue/dequeue.
- Added a fix to prevent deadlock when both bqueue_enqueue_impl()
and bqueue_dequeue() waits for signal to be triggered.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #13855
2022-09-21 13:15:26 -07:00
наб faa1e4082d include: move SPA_MINBLOCKSHIFT and zio_encrypt to sys/fs/zfs.h
These are used by userspace, so should live in a public header

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12116
2022-09-21 13:15:26 -07:00
Alexander Motin 44cec45f72 Improve too large physical ashift handling
When iterating through children physical ashifts for vdev, prefer
ones above the maximum logical ashift, that we can actually use,
but within the administrator defined maximum.

When selecting top-level vdev ashift, do not set it to the defined
maximum in case physical ashift is even higher, but just ignore one.
Using the maximum does not prevent misaligned writes, but reduces
space efficiency.  Since ZFS tries to write data sequentially and
aggregates the writes, in many cases large misanigned writes may be
not as bad as the space penalty otherwise.

Allow internal physical ashifts for vdevs higher than SHIFT_MAX.
May be one day allocator or aggregation could benefit from that.

Reduce zfs_vdev_max_auto_ashift default from 16 (64KB) to 14 (16KB),
so that ZFS may still use bigger ashifts up to SHIFT_MAX (64KB),
but only if it really has to or explicitly told to, but not as an
"optimization".

There are some read-intensive NVMe SSDs that report Preferred Write
Alignment of 64KB, and attempt to build RAIDZ2 of those leads to a
space inefficiency that can't be justified.  Instead these changes
make ZFS fall back to logical ashift of 12 (4KB) by default and
only warn user that it may be suboptimal for performance.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes #13798
2022-09-21 13:15:15 -07:00
Rich Ercolani ebbbe01e31 Ask libtool to stop hiding some errors
For #13083, curiously, it did not print the actual error, just
that the compile failed with "Error 1".

In theory, this flag should cause it to report errors twice sometimes.
In practice, I'm pretty okay with reporting some twice if it avoids
reporting some never.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Damian Szuberski <szuberskidamian@gmail.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #13086
2022-09-21 16:12:14 -07:00
Kevin Jin d05f3039f7 Add Module Parameter Regarding Log Size Limit
zfs_wrlog_data_max
The upper limit of TX_WRITE log data. Once it is reached,
write operation is blocked, until log data is cleared out
after txg sync. It only counts TX_WRITE log with WR_COPIED
or WR_NEED_COPY.

Reviewed-by: Prakash Surya <prakash.surya@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: jxdking <lostking2008@hotmail.com>
Closes #12284
2022-09-21 16:12:14 -07:00
Kevin Jin 999830a021 Optimize txg_kick() process (#12274)
Use dp_dirty_pertxg[] for txg_kick(), instead of dp_dirty_total in
original code. Extra parameter "txg" is added for txg_kick(), thus it
knows which txg to kick. Also txg_kick() call is moved from
dsl_pool_need_dirty_delay() to dsl_pool_dirty_space() so that we can
know the txg number assigned for txg_kick().

Some unnecessary code regarding dp_dirty_total in txg_sync_thread() is
also cleaned up.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: jxdking <lostking2008@hotmail.com>
Closes #12274
2022-09-21 16:12:14 -07:00
Ameer Hamza a5b0d42540 zfs recv hangs if max recordsize is less than received recordsize
- Some optimizations for bqueue enqueue/dequeue.
- Added a fix to prevent deadlock when both bqueue_enqueue_impl()
and bqueue_dequeue() waits for signal to be triggered.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #13855
2022-09-19 09:39:07 -07:00
Christian Schwarz cde04badd1 make DMU_OT_IS_METADATA and DMU_OT_IS_ENCRYPTED return B_TRUE or B_FALSE
Without this patch, the

    ASSERT3U(dbuf_is_metadata(db), ==, arc_is_metadata(buf));

at the beginning of dbuf_assign_arcbuf can panic
if the object type is a DMU_OT_NEWTYPE that has
DMU_OT_METADATA set.

While we're at it, fix DMU_OT_IS_ENCRYPTED as well.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Christian Schwarz <christian.schwarz@nutanix.com>
Closes #13842
2022-09-15 16:58:35 -07:00
Richard Yao 3f7c174b50 vdev_draid_lookup_map() should not iterate outside draid_maps
Coverity reported this as an out-of-bounds read.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13865
2022-09-15 16:58:35 -07:00
Akash B 03fa3ef264 Add physical device size to SIZE column in 'zpool list -v'
Add physical device size/capacity only for physical devices in
'zpool list -v' instead of displaying "-" in the SIZE column.
This would make it easier to see the individual device capacity and
to determine which spares are large enough to replace which devices.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Dipak Ghosh <dipak.ghosh@hpe.com>
Signed-off-by: Akash B <akash-b@hpe.com>
Closes #12561
Closes #13106
2022-09-15 10:23:01 -07:00
George Amanakis 8bd3dca9bf Introduce a tunable to exclude special class buffers from L2ARC
Special allocation class or dedup vdevs may have roughly the same
performance as L2ARC vdevs. Introduce a new tunable to exclude those
buffers from being cacheable on L2ARC.

Reviewed-by: Don Brady <don.brady@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #11761
Closes #12285
2022-09-14 11:27:00 -07:00
наб c8f795ba53 config: check for parallel(1), use it for cstyle
Before:
$ time make cstyle
real    0m23.118s
user    0m23.002s
sys     0m0.114s

After:
$ time make cstyle
real    0m4.577s
user    0m31.487s
sys     0m0.699s

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Issue #12899
2022-09-14 11:23:25 -07:00
Tony Hutter 7bbfac9d04 zed: Fix config_sync autoexpand flood
Users were seeing floods of `config_sync` events when autoexpand was
enabled.  This happened because all "disk status change" udev events
invoke the autoexpand codepath, which calls zpool_relabel_disk(),
which in turn cause another "disk status change" event to happen,
in a feedback loop.  Note that "disk status change" happens every time
a user calls close() on a block device.

This commit breaks the feedback loop by only allowing an autoexpand
to happen if the disk actually changed size.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes: #7132
Closes: #7366
Closes #13729
2022-09-14 09:57:44 -07:00
Walter Huf 2010c183bc Add xattr_handler support for Android kernels
Some ARM BSPs run the Android kernel, which has
a modified xattr_handler->get() function signature.
This adds support to compile against these kernels.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Walter Huf <hufman@gmail.com>
Closes #13824
2022-09-14 09:57:37 -07:00
Samuel aa9e887d2a Fix column width in 'zpool iostat -v' and 'zpool list -v'
This commit fixes a minor spacing issue caused when
enumerating vdev names, which originated from #13031

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Akash B <akash-b@hpe.com>
Signed-off-by: Samuel Wycliffe <samuelwycliffe@gmail.com>
Closes #13811
2022-09-14 09:57:05 -07:00
Ryan Moeller 78206a2e44 FreeBSD: Mark ZFS_MODULE_PARAM_CALL as MPSAFE
ZFS_MODULE_PARAM_CALL handlers implement their own locking if needed
and do not require Giant.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #13756
2022-09-13 17:59:15 -07:00
Alexander Motin b6ebf270eb Apply arc_shrink_shift to ARC above arc_c_min
It makes sense to free memory in smaller chunks when approaching
arc_c_min to let other kernel subsystems to free more, since after
that point we can't free anything.  This also matches behavior on
Linux, where to shrinker reported only the size above arc_c_min.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #13794
2022-09-13 17:59:10 -07:00
George Wilson 15b64fbc94 Importing from cachefile can trip assertion
When importing from cachefile, it is possible that the builtin retry
logic will trip an assertion because it also fails to find the pool.
This fix addresses that case and returns the correct error message to
the user.

Reviewed-by: Richard Yao <ryao@gentoo.org>
Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Wilson <gwilson@delphix.com>
Closes #13781
2022-09-13 17:59:04 -07:00
Tony Hutter b1be0a5c15 ZTS: Fix zpool_expand_001_pos
`zpool_expand_001_pos` was often failing due to not seeing autoexpand
commands in the `zpool history`.  During testing, I found this to be
unreliable (sometimes the "online" wouldn't appear in `zpool history`)
and unnecessary, as we could simply check that the pool increased in
size.

This commit revamps the test to check for the expanded pool size
and corresponding new free space.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #13743
2022-09-13 17:58:03 -07:00
Tony Hutter 65f8f92d12 zed: Look for NVMe DEVPATH if no ID_BUS
We tried replacing an NVMe drive using autoreplace, only
to see zed reject it with:

zed[27955]: zed_udev_monitor: /dev/nvme5n1 no devid source

This happened because ZED saw that ID_BUS was not set by udev
for the NVMe drive, and thus didn't think it was "real drive".
This commit allows NVMe drives to be autoreplaced even if
ID_BUS is not set.

Reviewed-by: Don Brady <don.brady@intel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #13512
Closes #13646
2022-09-13 17:51:11 -07:00
Tony Hutter acd7464639 zed: Ignore false 'atari' partitions in autoreplace
libudev will sometimes falsely identify an 'atari' partition on a
blank disk, preventing it from being used in an autoreplace.  This
seems to be a known issue.  The workaround is to just ignore the
fake partition and continue with the autoreplace.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #13497
Closes #13632
2022-09-13 17:51:06 -07:00
Tony Hutter f48d9b4269 rpm: Silence "unversioned Obsoletes" warnings on EL 9
Get rid of RPM warnings on AlmaLinux 9:

"It's not recommended to have unversioned Obsoletes"

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #13584
Closes #13638
2022-09-13 17:50:59 -07:00
Neal Gompa (ニール・ゴンパ) e1b49e3f1d rpm: Use the correct version-release information in dependencies
This tightly links the subpackages together and ensures that everything
is upgraded together.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Neal Gompa <ngompa@datto.com>
Closes #13489
2022-09-13 17:50:42 -07:00
Richard Yao 8131a96544 Fix use-after-free in btree code
Coverty static analysis found these.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #10989
Closes #13861
2022-09-13 16:15:38 -07:00
gregory-lee-bartholomew 979fd5a434 contrib: dracut: zfs-snapshot-bootfs: exit status fix
When the zfs-snapshot-bootfs service attempts to create a snapshot
that already exists, the exit status of the command is non-zero and
the service reports failed to the systemd service manager. This is a
common occurrence if bootfs.snapshot is left set on the kernel command
line and it should not be considered a failure.

This service was originally set to ignore this error by prefixing
the command with - on the ExecStart line, but the leading - appears
to have been dropped in #13359.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gregory Bartholomew <gregory.lee.bartholomew@gmail.com>
Closes #13769
2022-08-12 14:31:51 -07:00
r-ricci 533779f5f2 arcstat: fix -p option
When the -p option is used, a list of floats is passed to sep.join(),
which expects strings. Fix this by converting each value to a string.

Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Roberto Ricci <ricci@disroot.org>
Closes #12916 
Closes #13767
2022-08-12 14:29:24 -07:00
Paul Zuchowski db5fd16f0b Fix problem with zdb_objset_id test.
Use large numbers for datasets with
numeric names to avoid name and id
collisions.

Signed-off-by: Paul Zuchowski <pzuchowski@datto.com>
2022-08-09 11:46:12 -07:00
Coleman Kane e0dbab1a14 Linux 6.0 compat: register_shrinker() now var-arg
The 6.0 kernel added a printf-style var-arg for args > 0 to the
register_shrinker function, in order to add names to shrinkers, in
commit e33c267ab70de4249d22d7eab1cc7d68a889bac2. This enables the
shrinkers to have friendly names exposed in /sys/kernel/debug/shrinker/.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #13748
2022-08-09 09:41:06 -07:00
Brian Behlendorf 4063d7b6b4 Linux 5.20 compat: blk_cleanup_disk()
As of the Linux 5.20 kernel blk_cleanup_disk() has been removed,
all callers should use put_disk().

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13728
2022-08-09 09:41:06 -07:00
Brian Behlendorf 58571ba447 Linux 5.20 compat: bdevname()
As of the Linux 5.20 kernel bdevname() has been removed, all
callers should use snprintf() and the "%pg" format specifier.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13728
2022-08-09 09:41:06 -07:00
Brian Behlendorf 57e1052d33 Linux 5.19 compat: META
Update the META file to reflect compatibility with the 5.19 kernel.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13715
2022-08-09 09:41:06 -07:00
Paul Zuchowski fcbddc7f7c Fix problem with zdb -d
zdb -d <pool>/<objset ID> does not work when
other command line arguments are included i.e.
zdb -U <cachefile> -d <pool>/<objset ID>
This change fixes the command line parsing
to handle this situation.  Also fix issue
where zdb -r <dataset> <file> does not handle
the root <dataset> of the pool. Introduce -N
option to force <objset ID> to be interpreted
as a numeric objsetID.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Paul Zuchowski <pzuchowski@datto.com>
Closes #12845
Closes #12944
2022-08-08 16:56:38 -07:00
Tino Reichardt b06aff105c Fix checkstyle warning: E275 missing whitespace after keyword
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #13710
2022-08-02 10:05:14 -07:00
Rich Ercolani 035ee628cf Revert behavior of 59eab109 on not-Linux
It turns out that short-circuiting the EFAULT behavior on a short read
breaks things on FreeBSD. So until there's a nicer solution, let's
just revert the behavior for not-Linux.

Reference:
https://reviews.freebsd.org/R10:70f51f0e474ffe1fb74cb427423a2fba3637544d

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12698
2022-08-02 10:05:14 -07:00
Rich Ercolani 5c56591b57 Handle partial reads in zfs_read
Currently, dmu_read_uio_dnode can read 64K of a requested 1M in one
loop, get EFAULT back from zfs_uiomove() (because the iovec only holds
64k), and return EFAULT, which turns into EAGAIN on the way out. EAGAIN
gets interpreted as "I didn't read anything", the caller tries again
without consuming the 64k we already read, and we're stuck.

This apparently works on newer kernels because the caller which breaks
on older Linux kernels by happily passing along a 1M read request and a
64k iovec just requests 64k at a time.

With this, we now won't return EFAULT if we got a partial read.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12370
Closes #12509
Closes #12516
2022-08-02 10:05:14 -07:00
наб 17512aba0c module: lua: ldo: fix pragma name
/home/nabijaczleweli/store/code/zfs/module/lua/ldo.c:175:32: warning:
unknown option after ‘#pragma GCC diagnostic’ kind [-Wpragmas]
  175 | #pragma GCC diagnostic ignored "-Winfinite-recursion"a
      |                                ^~~~~~~~~~~~~~~~~~~~~~

Fixes: a6e8113fed ("Silence
-Winfinite-recursion warning in luaD_throw()")

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13348
2022-07-28 14:17:38 -07:00
Brian Behlendorf 98315be036 ZTS: Fix io_uring support check
Not all Linux distribution kernels enable io_uring support by
default.  Update the run time check to verify that the booted
kernel was built with CONFIG_IO_URING=y.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Co-authored-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13648
Closes #13685
2022-07-27 13:38:56 -07:00
Brian Behlendorf 69ad0bd769 Fix objtool: missing int3 after ret warning
Resolve straight-line speculation warnings reported by objtool
for x86_64 assembly on Linux when CONFIG_SLS is set.  See the
following LWN article for the complete details.

https://lwn.net/Articles/877845/

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13528
Closes #13575
2022-07-27 13:38:56 -07:00
Attila Fülöp b9d862f2db ICP: Add missing stack frame info to SHA asm files
Since the assembly routines calculating SHA checksums don't use
a standard stack layout, CFI directives are needed to unroll the
stack.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes #11733
2022-07-27 13:38:56 -07:00
Brian Behlendorf d2ff2196a5 Fix -Wformat-overflow warning in zfs_project_handle_dir()
Switch to using asprintf() to satisfy the compiler and resolve the
potential format-overflow warning.  Not the conditional before the
sprintf() would have prevented this regardless.

    cmd/zfs/zfs_project.c: In function ‘zfs_project_handle_dir’:
    cmd/zfs/zfs_project.c:241:38: error: ‘/’ directive writing
    1 byte into a region of size between 0 and 4352
    [-Werror=format-overflow=]
    cmd/zfs/zfs_project.c:241:17: note: ‘sprintf’ output between
    2 and 4609 bytes into a destination of size 4352

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13528
Closes #13575
2022-07-27 13:38:56 -07:00
Brian Behlendorf d483ef3744 Fix -Wformat-truncation warning in upgrade_set_callback()
Extend the buffer slightly resolve the warning.

    cmd/zfs/zfs_main.c: In function ‘upgrade_set_callback’:
    cmd/zfs/zfs_main.c:2446:22: error: ‘%llu’ directive output
    may be truncated writing between 1 and 20 bytes into a
    region of size 16 [-Werror=format-truncation=]
    cmd/zfs/zfs_main.c:2445:24: note: ‘snprintf’ output between
    2 and 21 bytes into a destination of size 16

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13528
Closes #13575
2022-07-27 13:38:56 -07:00
Brian Behlendorf 60f2cfd24f Fix -Wuse-after-free warning in dbuf_destroy()
Move the use of the db pointer after it is freed.  It's only used as
a tag so a dereference would never occur, but there's no reason we
can't invert the order to resolve the warning.

    module/zfs/dbuf.c: In function 'dbuf_destroy':
    module/zfs/dbuf.c:2953:17: error:
    pointer 'db' may be used after 'free' [-Werror=use-after-free]

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13528
Closes #13575
2022-07-27 13:38:56 -07:00
Brian Behlendorf 6a81173026 Fix -Wuse-after-free warning in dbuf_issue_final_prefetch_done()
Move the use of the private pointer after it is freed.  It's only
used as a tag so a dereference would never occur, but there's no
harm in inverting the order to resolve the warning.

    module/zfs/dbuf.c: In function 'dbuf_issue_final_prefetch_done':
    module/zfs/dbuf.c:3204:17: error:
    pointer 'private' may be used after 'free' [-Werror=use-after-free]

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13528
Closes #13575
2022-07-27 13:38:56 -07:00
Brian Behlendorf 087f5dedd5 Fix -Wattribute-warning in dsl layer
The memcpy(), memmove(), and memset() functions have been annotated
to perform bounds checking when using FORTIFY_SOURCE.  A warning is
now generted when writing beyond the end of the specified field.

Alternately, the new struct_group() macro could be used to create
an anonymous union member for use by memcpy().  However, since this
is the only place the macro would be helpful it's preferable to
restructure the code slights to avoid the need for additional
compatibility code when the macro does not exist.

https://lore.kernel.org/lkml/20211118183807.1283332-1-keescook@chromium.org/T/

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13528
Closes #13575
2022-07-27 13:38:56 -07:00
Brian Behlendorf c771583f23 Fix -Wattribute-warning in edonr
The wrong union memory was being accessed in EdonRInit resulting in
a write beyond size of field compiler warning.  Reference the correct
member to resolve the warning.  The warning was correct and this in
case the mistake was harmless.

    In function ‘fortify_memcpy_chk’,
    inlined from ‘EdonRInit’ at zfs/module/icp/algs/edonr/edonr.c:494:3:
    ./include/linux/fortify-string.h:344:25: error: call to
    ‘__write_overflow_field’ declared with attribute warning:
    detected write beyond size of field (1st parameter);
    maybe use struct_group()? [-Werror=attribute-warning]

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13528
Closes #13575
2022-07-27 13:38:56 -07:00
Brian Behlendorf ef0e506f46 Fix -Wattribute-warning in zfs_log_xvattr()
Restructure the code in zfs_log_xvattr() to use a lr_attr_end
structure when accessing lr_attr_t elements located after the
variable sized array.  This makes the code more understandable
and resolves the accessing beyond the end of the field warnings.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13528
Closes #13575
2022-07-27 13:38:56 -07:00
Brian Behlendorf d7a8c573cf Silence -Winfinite-recursion warning in luaD_throw()
This code should be kept inline with the upstream lua version as much
as possible.  Therefore, we simply want to silence the warning.  This
check was enabled by default as part of -Wall in gcc 12.1.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13528
Closes #13575
2022-07-27 13:38:56 -07:00
наб 2d235d58f8 config: prune unused -Wno-bool-compare checks
Reviewed-by: Alejandro Colomar <alx.manpages@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13110
2022-07-27 13:38:56 -07:00
наб 37430e8211 libtpool: -Wno-clobbered
Also remove -Wno-unused-but-set-variable

Upstream-bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61118
Reviewed-by: Alejandro Colomar <alx.manpages@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13110
2022-07-27 13:38:56 -07:00
Tino Reichardt 4b0977027b Remove sha1 hashing from OpenZFS, it's not used anywhere.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Attila Fülöp <attila@fueloep.org>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12895
Closes #12902
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
2022-07-26 10:12:44 -07:00
Alexander Motin 15868d3ecb Fix scrub resume from newly created hole.
It may happen that scan bookmark points to a block that was turned
into a part of a big hole.  In such case dsl_scan_visitbp() may skip
it and dsl_scan_check_resume() will not be called for it.  As result
new scan suspend won't be possible until the end of the object, that
may take hours if the object is a multi-terabyte ZVOL on a slow HDD
pool, stretching TXG to all that time, creating all sorts of problems.

This patch changes the resume condition to any greater or equal block,
so even if we miss the bookmarked block, the next one we find will
delete the bookmark, allowing new suspend.

Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
2022-07-26 10:10:37 -07:00
Alexander Motin bbb50e6129 Avoid memory copy when verifying raidz/draid parity
Before this change for every valid parity column raidz_parity_verify()
allocated new buffer and copied there existing data, then recalculated
the parity and compared the result with the copy.  This patch removes
the memory copy, simply swapping original buffer pointers with newly
allocated empty ones for parity recalculation and comparison. Original
buffers with potentially incorrect parity data are then just freed,
while new recalculated ones are used for repair.

On a pool of 12 4-wide raidz vdevs, storing 1.5TB of 16MB blocks, this
change reduces memory traffic during scrub by 17% and total unhalted
CPU time by 25%.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13613
2022-07-26 10:10:37 -07:00
Alexander Motin 03e33b2bb8 Avoid memory copies during mirror scrub
Issuing several scrub reads for a block we may use the parent ZIO
buffer for one of child ZIOs.  If that read complete successfully,
then we won't need to copy the data explicitly.  If block has only
one copy (typical for root vdev, which is also a mirror inside),
then we never need to copy -- succeed or fail as-is.  Previous
code also copied data from buffer of every successfully completed
child ZIO, but that just does not make any sense.

On healthy N-wide mirror this saves all N+1 (or even more in case
of ditto blocks) memory copies for each scrubbed block, allowing
CPU to focus mostly on check-summing.  For other vdev types it
should save one memory copy per block copy at root vdev.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13606
2022-07-26 10:10:37 -07:00
Alexander Motin 4b8f16072d Fix and disable blocks statistics during scrub
Block statistics calculation during scrub I/O issue in case of sorted
scrub accounted ditto blocks several times.  Embedded blocks on other
side were not accounted at all.  This change moves the accounting from
issue to scan stage, that fixes both problems and also allows to avoid
pool-wide locking and the lock contention it created.

Since this statistics is quite specific and is not even exposed now
anywhere, disable its calculation by default to not waste CPU time.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13579
2022-07-26 10:10:37 -07:00
Alexander Motin 5e06805d8e Avoid two 64-bit divisions per scanned block
Change math to make it like the ARC, using multiplications instead.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13591
2022-07-26 10:10:37 -07:00
Alexander Motin dc91a6a660 Several B-tree optimizations
- Introduce first element offset within a leaf.  It allows to reduce
by ~50% average memmove() size when adding/removing elements.  If the
added/removed element is in the first half of the leaf, we may shift
elements before it and adjust the bth_first instead of moving more
elements after it.
 - Use memcpy() instead of memmove() when we know there is no overlap.
 - Switch from uint64_t to uint32_t.  It does not limit anything,
but 32-bit arches should appreciate it greatly in hot paths.
 - Store leaf capacity in struct btree to avoid 64-bit divisions.
 - Adjust zfs_btree_insert_into_leaf() to always result in balanced
leaves after splitting, no matter where the new element was inserted.
Not that we care about it much, but it should also allow B-trees with
as little as two elements per leaf instead of 4 previously.

When scrubbing pool of 12 SSDs, storing 1.5TB of 4KB zvol blocks this
reduces amount of time spent in memmove() inside the scan thread from
13.7% to 5.7% and total scrub time by ~15 seconds out of 9 minutes.
It should also reduce spacemaps load time, but I haven't measured it.

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13582
2022-07-26 10:10:37 -07:00
Alexander Motin a861aa2b9e Several sorted scrub optimizations
- Reduce size and comparison complexity of q_exts_by_size B-tree.
Previous code used two 64-bit divisions and many other operations to
compare two B-tree elements.  It created enormous overhead.  This
implementation moves the math to the upper level and stores the score
in the B-tree elements themselves.  Since all that we need to store in
that B-tree is the extent score and offset, those can fit into single
8 byte value instead of 24 bytes of q_exts_by_addr element and can be
compared with single operation.
 - Better decouple secondary tree logic from main range_tree by moving
rt_btree_ops and related functions into dsl_scan.c as ext_size_ops.
Those functions are very small to worry about the code duplication and
range_tree does not need to know details such as rt_btree_compare.
 - Instead of accounting number of pending bytes per pool, that needs
atomic on global variable per block, account the number of non-empty
per-vdev queues, that change much more rarely.
 - When extent scan is interrupted by TXG end, continue it in the next
TXG instead of selecting next best extent.  It allows to avoid leaving
one truncated (and so likely not the best any more) extent each TXG.

On top of some other optimizations this saves about 1.5 minutes out of
10 to scrub pool of 12 SSDs, storing 1.5TB of 4KB zvol blocks.

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tom Caputi <caputit1@tcnj.edu>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13576
2022-07-26 10:10:37 -07:00
Alexander Motin 881249de6f FreeBSD: Improve crypto_dispatch() handling
Handle crypto_dispatch() return values same as crp->crp_etype errors.
On FreeBSD 12 many drivers returned same errors both ways, and lack
of proper handling for the first ended up in assertion panic later.
It was changed in FreeBSD 13, but there is no reason to not be safe.

While there, skip waiting for completion, including locking and
wakeup() call, for sessions on synchronous crypto drivers, such as
typical aesni and software.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13563
2022-07-26 10:10:37 -07:00
Alexander Motin 916d9de158 Reduce ZIO io_lock contention on sorted scrub
During sorted scrub multiple threads (one per vdev) are issuing many
ZIOs same time, all using the same scn->scn_zio_root ZIO as parent.
It causes huge lock contention on the single global lock on that ZIO.
Improve it by introducing per-queue null ZIOs, children to that one,
and using them instead as proxy.

For 12 SSD pool storing 1.5TB of 4KB blocks on 80-core system this
dramatically reduces lock contention and reduces scrub time from 21
minutes down to 12.5, while actual read stages (not scan) are about
3x faster, reaching 100K blocks per second per vdev.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13553
2022-07-26 10:10:37 -07:00
Alexander Motin 813e15f28c AVL: Remove obsolete branching optimizations
Modern Clang and GCC can successfully implement simple conditions
without branching with math and flag operations.  Use of arrays for
translation no longer helps as much as it was 14+ years ago.

Disassemble of the code generated by Clang 13.0.0 on FreeBSD 13.1,
Clang 14.0.4 on FreeBSD 14 and GCC 10.2.1 on Debian 11 with this
change still shows no branching instructions.

Profiling of CPU-bound scan stage of sorted scrub shows reproducible
reduction of time spent inside avl_find() from 6.52% to 4.58%.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13540
2022-07-26 10:10:37 -07:00
Alexander Motin 884364ea85 More speculative prefetcher improvements
- Make prefetch distance adaptive: up to 4MB prefetch doubles for
every, hit same as before, but after that it grows by 1/8 every time
the prefetch read does not complete in time to satisfy the demand.
My tests show that 4MB is sufficient for wide NVMe pool to saturate
single reader thread at 2.5GB/s, while new 64MB maximum allows the
same thread to reach 1.5GB/s on wide HDD pool.  Further distance
increase may increase speed even more, but less dramatic and with
higher latency.

 - Allow early reuse of inactive prefetch streams: streams that never
saw hits can be reused immediately if there is a demand, while others
can be reused after 1s of inactivity, starting with the oldest.  After
2s of inactivity streams are deleted to free resources same as before.
This allows by several times increase strided read performance on HDD
pool in presence of simultaneous random reads, previously filling the
zfetch_max_streams limit for seconds and so blocking most of prefetch.

 - Always issue intermediate indirect block reads with SYNC priority.
Each of those reads if delayed for longer may delay up to 1024 other
block prefetches, that may be not good for wide pools.

Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13452
2022-07-26 10:10:37 -07:00
Alexander Motin 6e1e90d64c Improve mg_aliquot math
When calculating mg_aliquot alike to #12046 use number of unique data
disks in the vdev, not the total number of children vdev.  Increase
default value of the tunable from 512KB to 1MB to compensate.

Before this change each disk in striped pool was getting 512KB of
sequential data, in 2-wide mirror -- 1MB, in 3-wide RAIDZ1 -- 768KB.
After this change in all the cases each disk should get 1MB.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13388
2022-07-26 10:10:37 -07:00
Alexander Motin dd9c110ab5 Improve log spacemap load time
Previous flushing algorithm limited only total number of log blocks to
the minimum of 256K and 4x number of metaslabs in the pool.  As result,
system with 1500 disks with 1000 metaslabs each, touching several new
metaslabs each TXG could grow spacemap log to huge size without much
benefits.  We've observed one of such systems importing pool for about
45 minutes.

This patch improves the situation from five sides:
 - By limiting maximum period for each metaslab to be flushed to 1000
TXGs, that effectively limits maximum number of per-TXG spacemap logs
to load to the same number.
 - By making flushing more smooth via accounting number of metaslabs
that were touched after the last flush and actually need another flush,
not just ms_unflushed_txg bump.
 - By applying zfs_unflushed_log_block_pct to the number of metaslabs
that were touched after the last flush, not all metaslabs in the pool.
 - By aggressively prefetching per-TXG spacemap logs up to 16 TXGs in
advance, making log spacemap load process for wide HDD pool CPU-bound,
accelerating it by many times.
 - By reducing zfs_unflushed_log_block_max from 256K to 128K, reducing
single-threaded by nature log processing time from ~10 to ~5 minutes.

As further optimization we could skip bumping ms_unflushed_txg for
metaslabs not touched since the last flush, but that would be an
incompatible change, requiring new pool feature.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #12789
2022-07-26 10:10:37 -07:00
Alexander Motin fdb80a2301 Add more control/visibility to spa_load_verify().
Use error thresholds from policy to control whether to scrub data
and/or metadata.  If threshold is set to UINT64_MAX, then caller
probably does not care about result and we may skip that part.

By default import neither set the data error threshold nor read
the error counter, so skip the data scrub for faster import.
Metadata are still scrubbed and fail if even single error found.

While there just for symmetry return number of metadata errors in
case threshold is not set to zero and we haven't reached it.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #13022
2022-07-26 10:10:37 -07:00
Allan Jude 72a4709a59 spa.c: Replace VERIFY(nvlist_*(...) == 0) with fnvlist_* (#12678)
The fnvlist versions of the functions are fatal if they fail,
saving each call from having to include checking the result.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Signed-off-by: Allan Jude <allan@klarasystems.com>
2022-07-26 10:10:37 -07:00
Alexander Motin 415882d228 Avoid small buffer copying on write
It is wrong for arc_write_ready() to use zfs_abd_scatter_enabled to
decide whether to reallocate/copy the buffer, because the answer is
OS-specific and depends on the buffer size.  Instead of that use
abd_size_alloc_linear(), moved into public header.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12425
2022-07-26 10:10:37 -07:00
Alexander Motin 5b860ae1fb Remove refcount from spa_config_*()
The only reason for spa_config_*() to use refcount instead of simple
non-atomic (thanks to scl_lock) variable for scl_count is tracking,
hard disabled for the last 8 years.  Switch to simple int scl_count
reduces the lock hold time by avoiding atomic, plus makes structure
fit into single cache line, reducing the locks contention.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #12287
2022-07-26 10:10:37 -07:00
Brian Behlendorf 3920d7f325 Scrub mirror children without BPs
When scrubbing a raidz/draid pool, which contains a replacing or
sparing mirror with multiple online children, only one child will
be read.  This is not normally a serious concern because the DTL
records are used to determine where a good copy of the data is.
As long as the data can be read from one child the mirror vdev
will use it to repair gaps in any of its children.  Furthermore,
even if the data which was read is corrupt the raidz code will
detect this and issue its own repair I/O to correct the damage
in the mirror vdev.

However, in the scenario where the DTL is wrong due to silent
data corruption (say due to overwriting one child) and the scrub
happens to read from a child with good data, then the other damaged
mirror child will not be detected nor repaired.

While this is possible for both raidz and draid vdevs, it's most
pronounced when using draid.  This is because by default the zed
will sequentially rebuild a draid pool to a distributed spare,
and the distributed spare half of the mirror is always preferred
since it delivers better performance.  This means the damaged
half of the mirror will go undetected even after scrubbing.

For system administrations this behavior is non-intuitive and in
a worst case scenario could result in the only good copy of the
data being unknowingly detached from the mirror.

This change resolves the issue by reading all replacing/sparing
mirror children when scrubbing.  When the BP isn't available for
verification, then compare the data buffers from each child.  They
must all be identical, if not there's silent damage and an error
is returned to prompt the top-level vdev to issue a repair I/O to
rewrite the data on all of the mirror children.  Since we can't
tell which child was wrong a checksum error is logged against the
replacing or sparing mirror vdev.

Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13555
2022-07-14 10:21:29 -07:00
Tony Hutter 6c3c5fcfbe Tag zfs-2.1.5
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2022-06-21 17:00:34 -07:00
Matthew Thode 6e954130d4 Remove install of zfs-load-module.service for dracut
The zfs-load-module.service service is not currently provided by
the OpenZFS repository so we cannot safely assume it exists.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Thode <mthode@mthode.org>
Closes #13574
2022-06-21 10:53:46 -07:00
Ryan Moeller 403d4bc66e FreeBSD: Silence clang unused-but-set-variable
Quick and dirty build fix for warnings being treated as errors.

Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
2022-06-15 11:27:28 -07:00
Alexander Motin 6ff89fe126 Improve sorted scan memory accounting
Since we use two B-trees q_exts_by_size and q_exts_by_addr, we should
count 2x sizeof (range_seg_gap_t) per node.  And since average B-tree
memory efficiency is about 75%, we should increase it to 3x.

Previous code under-counted up to 30% of the memory usage.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13537
2022-06-15 11:23:49 -07:00
Rich Ercolani cc565f557b Corrected edge case in uncompressed ARC->L2ARC handling
I genuinely don't know why this didn't come up before,
but adding the LZ4 early abort pointed out this flaw,
in which we're allocating a buffer of one size, and
then telling the compressor that we're handing it buffers
of a different size, which may be Very Different - say,
allocating 512b and then telling it the inputs are 128k.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Amanakis <gamanakis@gmail.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #13375
2022-06-14 18:10:21 -07:00
Alexander Motin 338188562b Remove wrong assertion in log spacemap
It is typical, but not generally true that if log summary has more
blocks it must also have unflushed metaslabs.  Normally with metaslabs
flushed in order it works, but there are known exceptions, such as
device removal or metaslab being loaded during its flush attempt.

Before 600a02b884 if spa_flush_metaslabs() hit loading metaslab it
usually stopped (unless memlimit is also exceeded), but now it may
flush more metaslabs, just skipping that particular one.  This
increased chances of assertion to fire when the skipped metaslab is
flushed on next iteration if all other metaslabs in that summary
entry are already flushed out of order.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13486 
Closes #13513
2022-06-06 16:57:56 -07:00
Ryan Moeller 1fdd768d7f libzfs: Fail making a dataset handle gracefully
When a dataset is in the process of being received it gets marked as
inconsistent and should not be used.  We should check for this when
opening a dataset handle in libzfs and return with an appropriate error
set, rather than hitting an abort because of the incomplete data.

zfs_open() passes errno to zfs_standard_error() after observing
make_dataset_handle() fail, which ends up aborting if errno is 0.
Set errno before returning where we know it has not been set already.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #13077
2022-06-06 16:57:51 -07:00
наб 56eed508d4 libzfs: mount: don't leak mnt_param_t if mnt_func fails
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12968
2022-06-06 16:57:46 -07:00
Rich Ercolani 271241187b Reject zfs send -RI with nonexistent fromsnap
Right now, zfs send -I dataset@nonexistent dataset@existent fails, but
zfs send -RI dataset@nonexistent dataset@existent does not.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12574
Closes #12575
2022-06-06 16:57:41 -07:00
Brian Behlendorf fc18fa92c8 Linux 5.18 compat: META
Update the META file to reflect compatibility with the 5.18 kernel.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13527
2022-06-01 14:24:49 -07:00
Brian Behlendorf db530f6aa0 autoconf: AC_MSG_CHECKING consistency
Make the wording more consistent for the kernel AC_MSG_CHECKING
output (e.g. "checking whether ...".).  Additionally, group some
of the VFS interface checks with the others.  No functional change.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Attila Fülöp <attila@fueloep.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13529
2022-06-01 14:24:49 -07:00
Brian Behlendorf 1c4e6a312c Linux 5.19 compat: asm/fpu/internal.h
As of the Linux 5.19 kernel the asm/fpu/internal.h header was
entirely removed.  It has been effectively empty since the 5.16
kernel and provides no required functionality.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Attila Fülöp <attila@fueloep.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13529
2022-06-01 14:24:49 -07:00
Brian Behlendorf 69430e39e3 Linux 5.19 compat: zap_flags_t conflict
As of the Linux 5.19 kernel an identically named zap_flags_t typedef
is declared in the include/linux/mm_types.h linux header.  Sadly,
the inclusion of this header cannot be easily avoided.  To resolve
the conflict a #define is used to remap the name in the OpenZFS
sources when building against the Linux kernel.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13515
2022-06-01 14:24:49 -07:00
Brian Behlendorf ee84970d4f Linux 5.19 compat: bdev_start_io_acct() / bdev_end_io_acct()
As of the Linux 5.19 kernel the disk_*_io_acct() helper functions
have been replaced by the bdev_*_io_acct() functions.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13515
2022-06-01 14:24:49 -07:00
Brian Behlendorf fec407fb69 Linux 5.19 compat: aops->read_folio()
As of the Linux 5.19 kernel the readpage() address space operation
has been replaced by read_folio().

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13515
2022-06-01 14:24:49 -07:00
Brian Behlendorf 7ae5ea8864 Linux 5.19 compat: blkdev_issue_secure_erase()
Linux 5.19 commit torvalds/linux@44abff2c0 splits the secure
erase functionality from the blkdev_issue_discard() function.
The blkdev_issue_secure_erase() must now be issued to issue
a secure erase.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13515
2022-06-01 14:24:49 -07:00
Brian Behlendorf 048301b6dc Linux 5.19 compat: bdev_max_secure_erase_sectors()
Linux 5.19 commit torvalds/linux@44abff2c0 removed the
blk_queue_secure_erase() helper function.  The preferred
interface is to now use the bdev_max_secure_erase_sectors()
function to check for discard support.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13515
2022-06-01 14:24:49 -07:00
Brian Behlendorf 9ce5eb18ef Linux 5.19 compat: bdev_max_discard_sectors()
Linux 5.19 commit torvalds/linux@70200574cc removed the
blk_queue_discard() helper function.  The preferred interface
is to now use the bdev_max_discard_sectors() function to check
for discard support.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13515
2022-06-01 14:24:49 -07:00
Brian Behlendorf 5a639f0802 Linux 5.18 compat: bio_alloc()
As for the Linux 5.18 kernel bio_alloc() expects a block_device struct
as an argument.  This removes the need for the bio_set_dev() compatibility
code for 5.18 and newer kernels.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13515
2022-06-01 14:24:49 -07:00
Ryan Moeller 090bda59e3 Silence unused-but-set-variable warning
This was breaking the kmod port build on FreeBSD with Clang 13.

Use the same trick as we do for ASSERT() to make DNODE_VERIFY() use
its parameter at compile time without actually using it at run time
in non-debug builds.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #13507
2022-05-27 09:19:37 -07:00
heeplr 2458c7e63a zed: support subject as header in zed_notify_email()
Some minimal MUAs don't support passing the subjects as cmdline option.
This commit checks if "@SUBJECT@" is missing in ZED_EMAIL_OPTS and then
prepends a subject header to the notification message.
Also set a default for ${subject}.

Reviewed-by: Ahelenia Ziemia<C5><84>ska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Daniel Hiepler <d-git@coderdu.de>
Closes #13440
2022-05-27 09:19:37 -07:00
Umer Saleem 0a688b2345 rpm: Keep debug symbols if configured with '--enable-debuginfo'
Do not strip debug information from packages if '--enable-debuginfo' is
configured.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
Closes #13500
2022-05-27 09:19:37 -07:00
Ryan Moeller fde66e583d FreeBSD: libspl: Add locking around statfs globals
Makes getmntent and getmntany thread-safe for external consumers of
libzfs zpool_disable_datasets, zfs_iter_mounted, libzfs_mnttab_update,
libzfs_mnttab_find.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #13484
2022-05-27 09:19:37 -07:00
Brian Behlendorf ed16dd7635 Standardize RHEL version check in packages
This is a follow up to 3c35662299 which standardizes how the RHEL
version check is done.  This simpler "0%{?rhel}" check is used
elsewhere in the packages so we do the same here.

Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13501
2022-05-27 09:19:37 -07:00
Rich Ercolani 5d9c527536 Modified ncompress requirement in RPM to exclude RHEL9
The bug this was working around is no longer present.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #13480
Closes #13490
2022-05-27 09:19:37 -07:00
Brian Behlendorf 5d534f1371 zed: Take no action on scrub/resilver checksum errors
When scrubbing/resilvering a pool it can be counter productive to
cancel the scan and kick of a replace operation to a hot spare
when encountering checksum errors.  In this case, the best course
of action is to allow the scrub/resilver to complete as quickly
as possible and to keep the vdevs fully online if possible.

Realistically, this is less of an issue for a RAIDZ since a
traditional resilver must be used and checksums will be verified.
However, this is not the case for a mirror or dRAID pool which is
sequentially resilvered and checksum verification is deferred
until after the replace operation completes.

Regardless, we apply this policy to all pool types since it's
a good idea for all vdevs.  Degrading additional vdevs has the
potential to make a bad situation worse.  Note the checksum
errors will still be reported as both an event and by
`zpool status`.  This change only prevents the ZED from
proactively taking any action.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13499
2022-05-27 09:19:37 -07:00
Mark Johnston 4184b78be1 zdb: Fix handling of nul termination in symlink targets
The SA attribute containing the symlink target does not include a nul
terminator, so when printing the target zdb would sometimes include
garbage at the end of the string.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #13482
2022-05-27 09:19:37 -07:00
наб 96c7c63994 automake: don't install /e/d/zfs or /e/z/zfs-functions +x
Closes #13496
Backport-of: #13503
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
2022-05-25 14:57:09 -07:00
Savyasachee Jha 115e059818 Multiple dracut module install script cleanups
- Replaced intances of `dracut_install` with `inst_simple`
- Removed calls to `test -x mark_hostonly` because the function is an
inbuilt dracut function
- Removed redundant installation of `systemd-ask-password` and
`systemd-tty-ask-password-agent` because they are already installed by
the systemd module. There is no need to install them again
- Removed multiple calls to the `mark_hostonly` function because the
`inst_simple` has a command-line switch for it
- Cleaned up the installation of the `zpool.cache`, `vdev_id.conf` and
`hostid` files to make the logic easier to follow
- Cleaned up and simplified the systemd service installation logic by
invoking systemctl instead of creating symlinks manually
- Replaced various hard-coded paths with dracut equivalents to better
conform with expected dracut behaviour
- Removed redundant call to `mkdir` (`inst_simple` creates the parent
directory if it does not exist on the destination initrd)

Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: Andrew J. Hesford <ajh@sideband.org>
Signed-off-by: Savyasachee Jha <hi@savyasacheejha.com>
Closes #13010
2022-05-25 11:09:23 -07:00
Savyasachee Jha 4252517f5f Remove absolute paths to udev rules and binaries for dracut
Since dracut functions can locate both udev rules and binaries, there is
no point in keeping absolute paths in the module setup script. It also
breaks the --sysroot option in dracut. This commit removes mentions to
absolute paths for binaries and udev rules.

Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: Andrew J. Hesford <ajh@sideband.org>
Signed-off-by: Savyasachee Jha <hi@savyasacheejha.com>
Closes #13010
2022-05-25 11:09:23 -07:00
Savyasachee Jha ebbfc6cb85 Make dracut fail if essential files cannot be installed
Dracut will now fail in initramfs generation if essential files cannot
be installed.

Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: Andrew J. Hesford <ajh@sideband.org>
Signed-off-by: Savyasachee Jha <hi@savyasacheejha.com>
Closes #13010
2022-05-25 11:09:23 -07:00
Savyasachee Jha 0671f72706 Make better use of dracut functions when building initramfs
Setting up the module involves multiple redundant calls to a bunch of
dracut functions wheich can be combined into one. Additionally, the mass
of code required to load libgcc_s.so* can be replaced with one dracut
function. This has the additional effect of removing errors involving
the non-installation of libgcc_s.so* which are seen on debian bullseye
when using version 2.1.2-1~bpo11+1 from the backports repository.

The systemd binaries are separated out into their own `dracut_install`
function call so they do not get pulled in when dracut does not load the
systemd module.

Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: Andrew J. Hesford <ajh@sideband.org>
Signed-off-by: Savyasachee Jha <hi@savyasacheejha.com>
Closes #13010
2022-05-25 11:09:23 -07:00
Coleman Kane 05147319b0 Fix compiler warnings about zero-length arrays in inline bitops
The compiler appears to be expanding the unused NULL pointer into a
zero-length array via the inline bitops code. When -Werror=array-bounds
is used, this causes a build failure. Recommended solution is allocate
temporary structures, fill with zeros (to avoid uninitialized data use
warnings), and pass the pointer to those to the inline calls.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #13463 
Closes #13465
2022-05-20 10:33:24 -07:00
Brian Behlendorf 0112bc2312 Add missing AC_MSG_RESULT(no) to configure
When the HAVE_IOPS_MKDIR_USERNS check fails output result
as required.
 
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13454
2022-05-20 10:33:24 -07:00
hping b28c0c4bf8 abd_os: remove redundant refcount creation for abd_children
Refcount creation for abd_zero_scatter->abd_children is redundant in
abd_alloc_zero_scatter, as it has been done in abd_init_struct.

In addition, abd_children is undefined when ZFS_DEBUG is disabled, the
reference of abd_children in abd_alloc_zero_scatter breaks build of
libzpool when ZFS_DEBUG is disabled.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Ping Huang <huangping@smartx.com>
Closes #13429
2022-05-20 10:33:24 -07:00
Aidan Harris eee389ba2e Fix functions without a prototype
clang-15 emits the following error message for functions without
a prototype:

fs/zfs/os/linux/spl/spl-kmem-cache.c:1423:27: error:
  a function declaration without a prototype is deprecated
  in all versions of C [-Werror,-Wstrict-prototypes]

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Aidan Harris <me@aidanharr.is>
Closes #13421
2022-05-20 10:33:24 -07:00
Mateusz Guzik 2c5c8bb0a6 FreeBSD: use zero_region instead of allocating a dedicated page
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #13406
2022-05-20 10:33:24 -07:00
szubersk 756c3e085b autoconf: Fail when __copy_from_user_inatomic is a non-GPL symbol
A followup to 849c14e048
Fix https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1009242

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes #13389
2022-05-20 10:33:24 -07:00
Damian Szuberski 13b1f336d3 PPC get_user workaround
Linux 5.12 PPC 5.12 get_user() and __copy_from_user_inatomic()
inline helpers very indirectly include a reference to the GPL'd
array mmu_feature_keys[] and fails to build. Workaround this by
using copy_from_user() and throwing EFAULT for any calls to
__copy_from_user_inatomic(). This is a workaround until a fix
for Linux commit 7613f5a66becfd0e43a0f34de8518695888f5458
"powerpc/64s/kuap: Use mmu_has_feature()" is fully addressed.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Authored-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes #11958
Closes #12590
Closes #13367
2022-05-20 10:33:24 -07:00
Brian Atkinson 60fc173251 Adding ZERO_PAGE detection
On some architectures ZERO_PAGE is unavailable because it references
a GPL exported symbol of empty_zero_page. Originally e08b993 removed
the call to PAGE_ZERO(0) for assignment to the abd_zero_page. However,
a simple check can be done to avoid a kernel allocation and free for
the abd_zero_page if ZERO_PAGE is available.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Brian Atkinson <batkinson@lanl.gov>
Closes #13199
2022-05-20 10:33:24 -07:00
Damian Szuberski 3bb068d4d5 autoconf: Pretend CONFIG_MODULES is always on
- Unconditionally inject `CONFIG_MODULES` make variable
  and `#define CONFIG_MODULES` to Kbuild in `ZFS_LINUX_COMPILE`
  autoconf function to emulate loadable kernel modules support.
  This allows OpenZFS to perform Linux checks despite
  `CONFIG_MODULES=n` in the actual Linux config.

- Add `ZFS_AC_KERNEL_CONFIG_MODULES` check which encompasses
  the logic from `ZFS_AC_KERNEL_TEST_MODULE` with additional
  diagnostic messages to the user

- Removed `ZFS_AC_KERNEL_TEST_MODULE` as it merely duplicates
  every check in `ZFS_AC_KERNEL_CONFIG_DEFINED`

- Moved `ZFS_AC_MODULE_SYMVERS` after `ZFS_AC_KERNEL_CONFIG_DEFINED`
  so the user has a chance to see the proper diagnostic from the
  steps before.

A workaround for Linux's

```
commit 3e3005df73b535cb849cf4ec8075d6aa3c460f68
Author: Masahiro Yamada <masahiroy@kernel.org>
Date:   Wed Mar 31 22:38:03 2021 +0900

kbuild: unify modules(_install) for in-tree and external modules

If you attempt to build or install modules ('make modules(_install)'
with CONFIG_MODULES disabled, you will get a clear error message, but
nothing for external module builds.

Factor out the modules and modules_install rules into the common part,
so you will get the same error message when you try to build external
modules with CONFIG_MODULES=n.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
```

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes #10832
Closes #13361
2022-05-20 10:33:24 -07:00
Damian Szuberski 210b33109d Strengthen Linux kernel capabilities detection
- Add `CONFIG_BLOCK` Linux config requirement to
  `ZFS_AC_KERNEL_CONFIG_DEFINED`. OpenZFS won't compile without
  that block device support due to large amount of functional
  dependencies on it.

- Remove dependency on `groups_alloc()` in
  `ZFS_AC_KERNEL_SRC_GROUP_INFO_GID` to circumvent the missing stub
  in Linux 4.X kernel headers.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes #13351
2022-05-20 10:33:24 -07:00
Richard Laager 1d54deb42f zvol_wait: Ignore locked zvols
"When an encrypted zvol is locked the zfs-volume-wait service does not
start.  The /sbin/zvol_wait should not wait for links when the volume
has property keystatus=unavailable."
-- https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1888405

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Damian Szuberski <szuberskidamian@gmail.com>
Thanks: James Dingwall <james-launchpad@dingwall.me.uk>
Signed-off-by: Richard Laager <rlaager@wiktel.com>
Closes #10662
2022-05-20 10:33:24 -07:00
Ka Ho Ng 1f31889046 FreeBSD: Implement hole-punching support
This adds supports for hole-punching facilities in the FreeBSD kernel
starting from __FreeBSD_version 1400032.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ka Ho Ng <khng@FreeBSD.org>
Sponsored-by: The FreeBSD Foundation
Closes #12458
2022-05-17 11:15:29 -07:00
наб 1467a1bb33 module: zstd: check we don't leak symbols; regenerate symbol map
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12988
Closes #13209
(cherry picked from commit 6ef00196db)
2022-05-16 15:48:21 -07:00
наб 2a64eeb6c7 man: zpool-import.8: -d -or -c
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13437
2022-05-10 13:36:37 -07:00
Brian Behlendorf bb29f1eb38 Reduce dbuf_find() lock contention
Holding a dbuf is a common operation which can become highly contended
in dbuf_find() when acquiring the dbuf hash mutex.  This is particularly
true on Linux when reading/writing volumes since by default up to 32
threads from the zvol_taskq may be taking a hold of the same dbuf.
This should also be observable on FreeBSD as long as there are enough
processes accessing the volume concurrently.

This is further aggregrated by the fact that only the block id will
be unique when calculating the dbuf hash for a single volume.  The
objset id, object id, and level will be the same for data blocks.
This has been observed to result in a somehwat less than uniform hash
distribution and a longer than expected max hash chain depth (~20)
on a large memory system (256 GB) using volumes.

This commit improves the siutation by switching the hash mutex to
an rwlock to allow concurrent lookups, and increasing DBUF_RWLOCKS
from 2048 to 8192 to further reduce the odds of a hash collision.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13405
2022-05-06 12:02:45 -07:00
наб 1184df6b93 contrib: dracut: remove getargbool polyfill
It was originally released in dracut 008 in February 2011;
we can probably drop it now

Upstream-commit: 47a02e3972
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13291
2022-05-06 12:01:48 -07:00
наб 1781ee703b Add dracut.zfs.7
Thorough documentation with a dracut.bootup(7)-style flowchart,
dracut.cmdline(7)-style cmdline listing,
and per-file docs like the old README

Upstream-commit: e3fc330d6c
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13291
2022-05-06 12:01:48 -07:00
наб 0947096044 contrib: dracut: zfs-needshutdown: don't list
Upstream-commit: 1cc9cc2f89
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13291
2022-05-06 12:01:48 -07:00
наб a0e81a4074 contrib: dracut: zfs-{rollback,snapshot}-bootfs: order after key loading
This fixes at least one race I got with an encrypted root

Upstream-commit: 6ebdb0b20d
Upstream-commit: b8d9679f36
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13291
2022-05-06 12:01:48 -07:00
наб fc41be5a8d contrib: dracut: don't require essentials to be under the same encroot
Upstream-commit: 30c6dce7f7
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13291
2022-05-06 12:01:48 -07:00
наб 059a563810 contrib: dracut: inline single-use import_pool, move single-use ask_for_password
Also don't set ROOTFS_MOUNTED; the final mention was removed in dracut
011 from July 2011

Upstream-commit: eaf1e06045
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13291
2022-05-06 12:01:48 -07:00
наб 5c97f76f5a contrib: dracut: zfs-lib: remove find_bootfs
Upstream-commit: dac0b0785a
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13291
2022-05-06 12:01:48 -07:00
наб 5c0aa409ed contrib: dracut: zfs-lib: simplify ask_for_password
The only user is mount-zfs.sh (non-systemd systems),
so reduce it to what it needs

Upstream-commit: 5d31169d7c
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13291
2022-05-06 12:01:48 -07:00
наб 71a1d8e5dc contrib; dracut: flatten zfs-load-key, simplify zfs-env-bootfs
Upstream-commit: fec2c613a4
Upstream-change: drop 90zfs/module-setup.sh.in cleanups that don't apply
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13291
2022-05-06 12:01:48 -07:00
наб 0864c29e7c contrib; dracut: centralise root= parsing, actually support root=s
So far, everything parsed root= manually, which meant that while
zfs-parse.sh was updated, and supposedly supported + -> ' ' conversion,
it meant nothing

Instead, centralise parsing, and allow:
  root=
  root=zfs
  root=zfs:
  root=zfs:AUTO

  root=ZFS=data/set
  root=zfs:data/set
  root=zfs:ZFS=data/set (as a side-effect; allowed but undocumented)

  rootfstype=zfs AND root=data/set <=> root=data/set
  rootfstype=zfs AND root=         <=> root=zfs:AUTO

So rootfstype=zfs /also/ behaves as expected, and + decoding works

Upstream-commit: 245529d85f
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13291
2022-05-06 12:01:48 -07:00
наб b551725df4 contrib: dracut: parse-zfs: stop pretending we support FILESYSTEM=
It was added in the original ae26d0465a ("Add dracut support") commit
in 2011, and was then broken a bit later with the advent of
dracut-zfs-generator, or maybe earlier as part of other churn

Either way, it's broken, and has been in 2.0+ as well, and no-one
complained. Stop pretending we support it at all

Upstream-commit: 2c74617bcf
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13291
2022-05-06 12:01:48 -07:00
наб ae054e690e contrib: dracut: parse-zfs: drop initqueue-finished for i/f
The switch was released in dracut 009 in March 2011,
we can safely get rid of the compatibility hook

Upstream-commit: 47636f5661
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13291
2022-05-06 12:01:48 -07:00
наб 0657247548 contrib/dracut: zfs-lib: export_all: replace with inline zpool export -a
07a3312f17, which introduced this in
October of 2014, didn't have zpool export -a available; we do

Upstream-commit: 6a41310c70
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13093
2022-05-06 12:01:48 -07:00
jokersus bc03fee94d Remove REMAKE_INITRD
The option has been deprecated in dkms and will break packaging in
future versions. See https://github.com/dell/dkms/commit/7114c62

Upstream-commit: b5c16861e9
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Amanakis <gamanakis@gmail.com>
Signed-off-by: jokersus <jokersus.cava@gmail.com>
Closes #12781
2022-05-06 11:32:45 -07:00
Rich Ercolani ce8ae064d2 Python 3.10 fixes, part 2
There was a fallback case I overlooked in the initial patch, with
a similarly imperfect version extractor.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12045 
Closes #12673
2022-05-04 11:37:49 -07:00
Brian Behlendorf 4c9c96aba4 Silence unused-but-set-variable warnings
Clang 13.0.0 added support for `Wunused-but-set-parameter` and
`-Wunused-but-set-variable` which correctly detects two unused
variables in zstd resulting in a build failure.  This commit
annotates these instances accordingly.

  https://releases.llvm.org/13.0.1/tools/clang/docs/ReleaseNotes.html#id6

In FSE_createCTable(), malloc() is intentionally defined as NULL when
compiled in the kernel so the variable is unused.

  zstd/lib/compress/fse_compress.c:307:12: error: variable 'size'
  set but not used [-Werror,-Wunused-but-set-variable]

Additionally, in ZSTD_seqDecompressedSize() the assert is compiled
out similarly resulting in an unused variable.

  zstd/lib/compress/zstd_compress_superblock.c:412:12: error: variable
  'litLengthSum' set but not used [-Werror,-Wunused-but-set-variable]

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2022-05-02 15:42:58 -07:00
наб ecec151c14 module: zfs: freebsd: fix unused, remove argsused
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12844
2022-05-02 15:42:58 -07:00
наб a4f582f0b6 FreeBSD: remove unused variable
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Issue #12899
2022-05-02 15:42:58 -07:00
наб 9e68b734b3 zvol: remove unused variable
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12917
2022-05-02 15:42:58 -07:00
наб a175fe82e6 fm: remove unused variables
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12917
2022-05-02 15:42:58 -07:00
наб b8e1366ee6 zvol: remove unused variable
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12917
2022-05-02 15:42:58 -07:00
наб 7536ad35ca module/zfs: vdev_removal: spa_vdev_remove_thread: remove unused variable
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12187
2022-05-02 15:42:58 -07:00
наб 986d64ccca module/zfs: vdev_indirect: vdev_indirect_repair: remove unused variable
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12187
2022-05-02 15:42:58 -07:00
наб 18e9268087 module/zfs: dbuf: dbuf_read_impl: remove unused variable
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12187
2022-05-02 15:42:58 -07:00
наб 4149e19dfc module/zfs: arc: arc_hdr_realloc_crypt: remove unused variables
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12187
2022-05-02 15:42:58 -07:00
наб 116d447fb5 libzfs: zfs_send: remove unused variable
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12187
2022-05-02 15:42:58 -07:00
наб a1a54b3e47 libzutil: zpool_find_config: remove unused variable
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12187
2022-05-02 15:42:58 -07:00
Brian Behlendorf ce8d41ef75 Skip spacemaps reading in case of pool readonly import
The only zdb utility require to read metaslab-related data during
read-only pool import because of spacemaps validation. Add global
variable which will allow zdb read spacemaps in case of readonly
import mode.

Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Fedor Uporov <fuporov.vstack@gmail.com>
Closes #9095
Closes #12687
2022-04-28 16:47:12 -07:00
наб c0ff5f1560 zfs: holds: dequadratify
Before:
  15  0m0.177s
  30  0m0.653s
  45  0m1.289s
  60  0m2.129s
  75  0m3.264s
  90  0m4.397s
  100 0m5.996s
  117 0m8.552s

After:
  30  0m0.053s
  117 0m0.125s

Upstream-commit: 2a70a09072
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13372
Closes #13373
2022-04-28 15:18:47 -07:00
Brian Behlendorf 49c1346c10 Linux 5.18 compat: replace __set_page_dirty_nobuffers
Replace __set_page_dirty_nobuffers with filemap_dirty_folio.

Upstream-commit: 6b1f86f8e9c7f9de7ca1cb987b2cf25e99b1ae3a
("Merge tag 'folio-5.18b' of
git://git.infradead.org/users/willy/pagecache ")

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Authored-by: Satadru Pramanik <satadru@gmail.com>
Signed-off-by: Satadru Pramanik <satadru@gmail.com>
Closes #13325
Closes #13380
2022-04-28 15:17:38 -07:00
Brian Behlendorf 71cd3726c0 Fix O_APPEND for Linux 3.15 and older kernels
When using a Linux kernel which predates the iov_iter interface the
O_APPEND flag should be applied in zpl_aio_write() via the call to
generic_write_checks().  The updated pos variable  was incorrectly
ignored resulting in the current offset being used.

This issue should only realistically impact the RHEL/CentOS 7.x
kernels which are based on Linux 3.10.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13370 
Closes #13377
2022-04-28 15:15:28 -07:00
наб 642426095a Linux 5.18 compat: kobj_type.default_attrs replaced with default_groups
Upstream-commit: cdb4f26a63c391317e335e6e683a614358e70aeb ("kobject:
 kobj_type: remove default_attrs")
Upstream-commit: 0cdda2edb3
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13357
2022-04-25 10:00:09 -07:00
Alexander Motin 972637dc06 FreeBSD: Fix translation from ABD to physical pages.
In hypothetical case of non-linear ABD with single segment, multiple
to page size but not aligned to it, vdev_geom_fill_unmap_cb() could
fill one page less into bio_ma array.

I am not sure it is expoitable, but better to be safe than sorry.

Reported-by: Mark Johnston <markj@FreeBSD.org>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
(cherry picked from commit 5352f85cddce44e82fb1c4caec3b333e3666d7fd)
2022-04-21 16:59:09 -07:00
Rich Ercolani c220771a47 Corrected oversight in ZERO_RANGE behavior
It turns out, no, in fact, ZERO_RANGE and PUNCH_HOLE do
have differing semantics in some ways - in particular,
one requires KEEP_SIZE, and the other does not.

Also added a zero-range test to catch this, corrected a flaw
that made the punch-hole test succeed vacuously, and a typo
in file_write.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #13329 
Closes #13338
2022-04-21 16:58:07 -07:00
наб 361dc138b1 Document zfs inherit -S's interaction with noninheritable properties
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Damian Szuberski <szuberskidamian@gmail.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Upstream-commit: 92295af800
Closes #11894
Closes #13335
2022-04-21 11:09:35 -07:00
Brian Behlendorf aa1c3c1d1d Linux 5.17 compat: GENHD_FL_EXT_DEVT / GENHD_FL_NO_PART_SCAN
As of the 5.17 kernel the GENHD_FL_EXT_DEVT flag has been removed
and the GENHD_FL_NO_PART_SCAN flag renamed GENHD_FL_NO_PART. Update
zvol_alloc() to set GENHD_FL_NO_PART for the newer kernels which
is sufficient.  The behavior for prior kernels remains unchanged.

1ebe2e5f ("block: remove GENHD_FL_EXT_DEVT")
46e7eac6 ("block: rename GENHD_FL_NO_PART_SCAN to GENHD_FL_NO_PART")

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13294
Closes #13297
2022-04-20 13:44:19 -07:00
Mark Johnston b7546f92ea FreeBSD: Return Mach error codes from VOP_(GET|PUT)PAGES
FreeBSD's memory management system uses its own error numbers and gets
confused when these VOPs return EIO.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reported-by: Peter Holm <pho@FreeBSD.org>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #13311
2022-04-19 10:42:54 -07:00
Mark Johnston e9cd90f6e5 FreeBSD: Parameterize ZFS_ENTER/ZFS_VERIFY_VP with an error code
For legacy reasons, a couple of VOPs have to return error numbers that
don't come from the usual errno namespace.  To handle the cases where
ZFS_ENTER or ZFS_VERIFY_ZP fail, we need to be able to override the
default error return value of EIO.  Extend the macros to permit this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #13311
2022-04-19 10:42:54 -07:00
наб ff23ef0c99 libzfs: import: zpool_clear_label: actually fail if clearing l2arc header fails
Found with -Wunused-but-set-variable on Clang trunk

Upstream-commit: a4e0cee178
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13304
2022-04-15 14:16:59 -07:00
наб 1f4c79b1ce libzfs: sendrecv: always cancel progress thread in zfs_send_one()
This is in line with all the other uses of the progress thread

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11560
Closes #13284
2022-04-11 15:48:46 -07:00
Riccardo Schirone 35ddd8ee2e Linux 5.18 compat: use address_space_operations->readahead
->readpages was removed and replaced by ->readahead. Define
zpl_readahead for kernels that don't have ->readpages.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Riccardo Schirone <rschirone91@gmail.com>
Closes #13278
2022-04-06 13:15:27 -07:00
Riccardo Schirone 10a9f5fc47 Linux 5.18 compat: blkg_tryget is moved to private headers
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Riccardo Schirone <rschirone91@gmail.com>
Closes #13278
2022-04-06 13:15:27 -07:00
наб 9f7f704507 Linux 5.18 compat: replace genhd.h with blkdev.h includes
blkdev.h includes genhd.h since dawn of upstream git, so this is
globally safe

Upstream-commit: 322cbb50de711814c42fb088f6d31901502c711a ("block:
 remove genhd.h")

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13251
2022-04-06 13:15:27 -07:00
наб 215a8255a9 Linux 5.18 compat: 4-argument bio_alloc()
bio_alloc(gfp_t gfp_mask, unsigned short nr_iovecs)

became

  bio_alloc(struct block_device *bdev, unsigned short nr_vecs,
            unsigned int opf, gfp_t gfp_mask)
passing NULL/0 continues previous behaviour

Upstream-commit: 07888c665b405b1cd3577ddebfeb74f4717a84c4 ("block:
 pass a block_device and opf to bio_alloc")

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13251
2022-04-06 13:15:27 -07:00
Ryan Moeller a5a28723bd FreeBSD: Use NDFREE_PNBUF if available
NDF_ONLY_PNBUF has been removed from FreeBSD in favor of NDFREE_PNBUF.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #13277
2022-04-06 10:29:53 -07:00
Brian Behlendorf 5a9994f5ae Export minimal zfs_refcount interfaces
Lustre makes light use of the zfs_refcount interfaces which
isn't a problem when using a non-debug build of OpenZFS. However,
when debugging is enabled the required symbols are not exported.

Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12613
2022-04-06 10:29:00 -07:00
Brian Behlendorf 9f6943504a Default to zfs_dmu_offset_next_sync=1
Strict hole reporting was previously disabled by default as a
performance optimization.  However, this has lead to confusion
over the expected behavior and a variety of workarounds being
adopted by consumers of ZFS.  Change the default behavior to
always report holes and force the TXG sync.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Upstream-commit: 05b3eb6d23
Ref: #13261
Closes #12746
2022-04-01 09:59:47 -07:00
наб fe6f2651f5 etc/systemd/zfs-mount-generator: serialise, handle keylocation=http[s]://
* etc/systemd/zfs-mount-generator: serialise

The wins for a relatively normal workload are rather slim:
	real	0.02119s/0.00985s=2.15029x
	user	0.02130s/0.00346s=6.15560x
	sys	0.03858s/0.00643s=6.00062x

	wall-total	0.014518s/0.005925s=2.45009x
	wall-init	0.014518s/0.002457s=5.90684x
	wall-real	0.014518s/0.003467s=4.18668x

But this is a big win on machines with a lot of datasets and expensive
forks.

For example, the gain on a VM on my work laptop with 900+ legacy-mount
Docker datasets, the original gains from the C rewrite were
only five-fold:
	real    0.516s/0.102s=5.05882x
	user    0.237s/0.143s=1.65734x
	sys     0.287s/0.100s=2.87x

And this serial variant gains this back there as well:
	real    0.102s/0.008s=12.75x
	user    0.143s/0.007s=20.42857
	sys     0.100s/0.001s=100x

	wall-total	0.09717s/0.00319s=30.40255x
	wall-init	0.00203s/0.00200s=1.015941x
	wall-real	0.09513s/0.00118s=80.02043x

For a total of
	real    0.516s/0.008s=64.5x
	user    0.237s/0.007s=33.85714x
	sys     0.287s/0.001s=287x

Suggested-by: Richard Laager <rlaager@wiktel.com>

* etc/systemd/zfs-mount-generator: pull in network for keylocation=https

Also simplify RequiresMountsFor= handling
Ref: #11956

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Upstream-commit: 4325de09cd
Closes #12138
2022-04-01 09:58:45 -07:00
наб 7fbb90feea libzfs: diff: stream_bytes: use fputc, %hho formats chars
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Upstream-commit: a72129edcb
Closes #12829
2022-04-01 09:58:45 -07:00
наб 5a21214be8 zfs, libzfs: diff: accept -h/ZFS_DIFF_NO_MANGLE, disabling path escaping
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Upstream-commit: 344bbc82e7
Closes #12829
2022-04-01 09:58:45 -07:00
228 changed files with 5275 additions and 6759 deletions
+2 -2
View File
@@ -1,10 +1,10 @@
Meta: 1
Name: zfs
Branch: 1.0
Version: 2.1.4
Version: 2.1.6
Release: 1
Release-Tags: relext
License: CDDL
Author: OpenZFS
Linux-Maximum: 5.17
Linux-Maximum: 5.19
Linux-Minimum: 3.10
+12 -2
View File
@@ -103,7 +103,7 @@ endif
endif
PHONY += codecheck
codecheck: cstyle shellcheck checkbashisms flake8 mancheck testscheck vcscheck
codecheck: cstyle shellcheck checkbashisms flake8 mancheck testscheck vcscheck zstdcheck
PHONY += checkstyle
checkstyle: codecheck commitcheck
@@ -114,14 +114,20 @@ commitcheck:
${top_srcdir}/scripts/commitcheck.sh; \
fi
if HAVE_PARALLEL
cstyle_line = -print0 | parallel -X0 ${top_srcdir}/scripts/cstyle.pl -cpP {}
else
cstyle_line = -exec ${top_srcdir}/scripts/cstyle.pl -cpP {} +
endif
PHONY += cstyle
cstyle:
@find ${top_srcdir} -name build -prune \
-o -type f -name '*.[hc]' \
! -name 'zfs_config.*' ! -name '*.mod.c' \
! -name 'opt_global.h' ! -name '*_if*.h' \
! -name 'zstd_compat_wrapper.h' \
! -path './module/zstd/lib/*' \
-exec ${top_srcdir}/scripts/cstyle.pl -cpP {} \+
$(cstyle_line)
filter_executable = -exec test -x '{}' \; -print
@@ -173,6 +179,10 @@ vcscheck:
awk '{c++; print} END {if(c>0) exit 1}' ; \
fi
PHONY += zstdcheck
zstdcheck:
@$(MAKE) -C module/zstd checksymbols
PHONY += lint
lint: cppcheck paxcheck
+1 -1
View File
@@ -271,7 +271,7 @@ def print_values():
if pretty_print:
fmt = lambda col: prettynum(cols[col][0], cols[col][1], v[col])
else:
fmt = lambda col: v[col]
fmt = lambda col: str(v[col])
sys.stdout.write(sep.join(fmt(col) for col in hdr))
sys.stdout.write("\n")
+95 -31
View File
@@ -110,8 +110,9 @@ extern int zfs_recover;
extern unsigned long zfs_arc_meta_min, zfs_arc_meta_limit;
extern int zfs_vdev_async_read_max_active;
extern boolean_t spa_load_verify_dryrun;
extern boolean_t spa_mode_readable_spacemaps;
extern int zfs_reconstruct_indirect_combinations_max;
extern int zfs_btree_verify_intensity;
extern uint_t zfs_btree_verify_intensity;
static const char cmdname[] = "zdb";
uint8_t dump_opt[256];
@@ -3124,13 +3125,18 @@ dump_znode_symlink(sa_handle_t *hdl)
{
int sa_symlink_size = 0;
char linktarget[MAXPATHLEN];
linktarget[0] = '\0';
int error;
error = sa_size(hdl, sa_attr_table[ZPL_SYMLINK], &sa_symlink_size);
if (error || sa_symlink_size == 0) {
return;
}
if (sa_symlink_size >= sizeof (linktarget)) {
(void) printf("symlink size %d is too large\n",
sa_symlink_size);
return;
}
linktarget[sa_symlink_size] = '\0';
if (sa_lookup(hdl, sa_attr_table[ZPL_SYMLINK],
&linktarget, sa_symlink_size) == 0)
(void) printf("\ttarget %s\n", linktarget);
@@ -8266,6 +8272,23 @@ zdb_embedded_block(char *thing)
free(buf);
}
/* check for valid hex or decimal numeric string */
static boolean_t
zdb_numeric(char *str)
{
int i = 0;
if (strlen(str) == 0)
return (B_FALSE);
if (strncmp(str, "0x", 2) == 0 || strncmp(str, "0X", 2) == 0)
i = 2;
for (; i < strlen(str); i++) {
if (!isxdigit(str[i]))
return (B_FALSE);
}
return (B_TRUE);
}
int
main(int argc, char **argv)
{
@@ -8311,7 +8334,7 @@ main(int argc, char **argv)
zfs_btree_verify_intensity = 3;
while ((c = getopt(argc, argv,
"AbcCdDeEFGhiI:klLmMo:Op:PqrRsSt:uU:vVx:XYyZ")) != -1) {
"AbcCdDeEFGhiI:klLmMNo:Op:PqrRsSt:uU:vVx:XYyZ")) != -1) {
switch (c) {
case 'b':
case 'c':
@@ -8325,6 +8348,7 @@ main(int argc, char **argv)
case 'l':
case 'm':
case 'M':
case 'N':
case 'O':
case 'r':
case 'R':
@@ -8416,31 +8440,6 @@ main(int argc, char **argv)
(void) fprintf(stderr, "-p option requires use of -e\n");
usage();
}
if (dump_opt['d'] || dump_opt['r']) {
/* <pool>[/<dataset | objset id> is accepted */
if (argv[2] && (objset_str = strchr(argv[2], '/')) != NULL &&
objset_str++ != NULL) {
char *endptr;
errno = 0;
objset_id = strtoull(objset_str, &endptr, 0);
/* dataset 0 is the same as opening the pool */
if (errno == 0 && endptr != objset_str &&
objset_id != 0) {
target_is_spa = B_FALSE;
dataset_lookup = B_TRUE;
} else if (objset_id != 0) {
printf("failed to open objset %s "
"%llu %s", objset_str,
(u_longlong_t)objset_id,
strerror(errno));
exit(1);
}
/* normal dataset name not an objset ID */
if (endptr == objset_str) {
objset_id = -1;
}
}
}
#if defined(_LP64)
/*
@@ -8469,13 +8468,18 @@ main(int argc, char **argv)
*/
spa_load_verify_dryrun = B_TRUE;
/*
* ZDB should have ability to read spacemaps.
*/
spa_mode_readable_spacemaps = B_TRUE;
kernel_init(SPA_MODE_READ);
if (dump_all)
verbose = MAX(verbose, 1);
for (c = 0; c < 256; c++) {
if (dump_all && strchr("AeEFklLOPrRSXy", c) == NULL)
if (dump_all && strchr("AeEFklLNOPrRSXy", c) == NULL)
dump_opt[c] = 1;
if (dump_opt[c])
dump_opt[c] += verbose;
@@ -8514,6 +8518,7 @@ main(int argc, char **argv)
return (dump_path(argv[0], argv[1], NULL));
}
if (dump_opt['r']) {
target_is_spa = B_FALSE;
if (argc != 3)
usage();
dump_opt['v'] = verbose;
@@ -8524,6 +8529,10 @@ main(int argc, char **argv)
rewind = ZPOOL_DO_REWIND |
(dump_opt['X'] ? ZPOOL_EXTREME_REWIND : 0);
/* -N implies -d */
if (dump_opt['N'] && dump_opt['d'] == 0)
dump_opt['d'] = dump_opt['N'];
if (nvlist_alloc(&policy, NV_UNIQUE_NAME_TYPE, 0) != 0 ||
nvlist_add_uint64(policy, ZPOOL_LOAD_REQUEST_TXG, max_txg) != 0 ||
nvlist_add_uint32(policy, ZPOOL_LOAD_REWIND_POLICY, rewind) != 0)
@@ -8542,6 +8551,34 @@ main(int argc, char **argv)
targetlen = strlen(target);
if (targetlen && target[targetlen - 1] == '/')
target[targetlen - 1] = '\0';
/*
* See if an objset ID was supplied (-d <pool>/<objset ID>).
* To disambiguate tank/100, consider the 100 as objsetID
* if -N was given, otherwise 100 is an objsetID iff
* tank/100 as a named dataset fails on lookup.
*/
objset_str = strchr(target, '/');
if (objset_str && strlen(objset_str) > 1 &&
zdb_numeric(objset_str + 1)) {
char *endptr;
errno = 0;
objset_str++;
objset_id = strtoull(objset_str, &endptr, 0);
/* dataset 0 is the same as opening the pool */
if (errno == 0 && endptr != objset_str &&
objset_id != 0) {
if (dump_opt['N'])
dataset_lookup = B_TRUE;
}
/* normal dataset name not an objset ID */
if (endptr == objset_str) {
objset_id = -1;
}
} else if (objset_str && !zdb_numeric(objset_str + 1) &&
dump_opt['N']) {
printf("Supply a numeric objset ID with -N\n");
exit(1);
}
} else {
target_pool = target;
}
@@ -8659,13 +8696,27 @@ main(int argc, char **argv)
}
return (error);
} else {
target_pool = strdup(target);
if (strpbrk(target, "/@") != NULL)
*strpbrk(target_pool, "/@") = '\0';
zdb_set_skip_mmp(target);
/*
* If -N was supplied, the user has indicated that
* zdb -d <pool>/<objsetID> is in effect. Otherwise
* we first assume that the dataset string is the
* dataset name. If dmu_objset_hold fails with the
* dataset string, and we have an objset_id, retry the
* lookup with the objsetID.
*/
boolean_t retry = B_TRUE;
retry_lookup:
if (dataset_lookup == B_TRUE) {
/*
* Use the supplied id to get the name
* for open_objset.
*/
error = spa_open(target, &spa, FTAG);
error = spa_open(target_pool, &spa, FTAG);
if (error == 0) {
error = name_from_objset_id(spa,
objset_id, dsname);
@@ -8674,10 +8725,23 @@ main(int argc, char **argv)
target = dsname;
}
}
if (error == 0)
if (error == 0) {
if (objset_id > 0 && retry) {
int err = dmu_objset_hold(target, FTAG,
&os);
if (err) {
dataset_lookup = B_TRUE;
retry = B_FALSE;
goto retry_lookup;
} else {
dmu_objset_rele(os, FTAG);
}
}
error = open_objset(target, FTAG, &os);
}
if (error == 0)
spa = dmu_objset_spa(os);
free(target_pool);
}
}
nvlist_free(policy);
+20
View File
@@ -35,6 +35,7 @@
#include <sys/fs/zfs.h>
#include <sys/fm/protocol.h>
#include <sys/fm/fs/zfs.h>
#include <sys/zio.h>
#include "zfs_agents.h"
#include "fmd_api.h"
@@ -773,6 +774,8 @@ zfs_fm_recv(fmd_hdl_t *hdl, fmd_event_t *ep, nvlist_t *nvl, const char *class)
ZFS_MAKE_EREPORT(FM_EREPORT_ZFS_PROBE_FAILURE))) {
char *failmode = NULL;
boolean_t checkremove = B_FALSE;
uint32_t pri = 0;
int32_t flags = 0;
/*
* If this is a checksum or I/O error, then toss it into the
@@ -795,6 +798,23 @@ zfs_fm_recv(fmd_hdl_t *hdl, fmd_event_t *ep, nvlist_t *nvl, const char *class)
checkremove = B_TRUE;
} else if (fmd_nvl_class_match(hdl, nvl,
ZFS_MAKE_EREPORT(FM_EREPORT_ZFS_CHECKSUM))) {
/*
* We ignore ereports for checksum errors generated by
* scrub/resilver I/O to avoid potentially further
* degrading the pool while it's being repaired.
*/
if (((nvlist_lookup_uint32(nvl,
FM_EREPORT_PAYLOAD_ZFS_ZIO_PRIORITY, &pri) == 0) &&
(pri == ZIO_PRIORITY_SCRUB ||
pri == ZIO_PRIORITY_REBUILD)) ||
((nvlist_lookup_int32(nvl,
FM_EREPORT_PAYLOAD_ZFS_ZIO_FLAGS, &flags) == 0) &&
(flags & (ZIO_FLAG_SCRUB | ZIO_FLAG_RESILVER)))) {
fmd_hdl_debug(hdl, "ignoring '%s' for "
"scrub/resilver I/O", class);
return;
}
if (zcp->zc_data.zc_serd_checksum[0] == '\0') {
zfs_serd_name(zcp->zc_data.zc_serd_checksum,
pool_guid, vdev_guid, "checksum");
+147 -8
View File
@@ -894,14 +894,90 @@ zfs_deliver_check(nvlist_t *nvl)
return (0);
}
/*
* Given a path to a vdev, lookup the vdev's physical size from its
* config nvlist.
*
* Returns the vdev's physical size in bytes on success, 0 on error.
*/
static uint64_t
vdev_size_from_config(zpool_handle_t *zhp, const char *vdev_path)
{
nvlist_t *nvl = NULL;
boolean_t avail_spare, l2cache, log;
vdev_stat_t *vs = NULL;
uint_t c;
nvl = zpool_find_vdev(zhp, vdev_path, &avail_spare, &l2cache, &log);
if (!nvl)
return (0);
verify(nvlist_lookup_uint64_array(nvl, ZPOOL_CONFIG_VDEV_STATS,
(uint64_t **)&vs, &c) == 0);
if (!vs) {
zed_log_msg(LOG_INFO, "%s: no nvlist for '%s'", __func__,
vdev_path);
return (0);
}
return (vs->vs_pspace);
}
/*
* Given a path to a vdev, lookup if the vdev is a "whole disk" in the
* config nvlist. "whole disk" means that ZFS was passed a whole disk
* at pool creation time, which it partitioned up and has full control over.
* Thus a partition with wholedisk=1 set tells us that zfs created the
* partition at creation time. A partition without whole disk set would have
* been created by externally (like with fdisk) and passed to ZFS.
*
* Returns the whole disk value (either 0 or 1).
*/
static uint64_t
vdev_whole_disk_from_config(zpool_handle_t *zhp, const char *vdev_path)
{
nvlist_t *nvl = NULL;
boolean_t avail_spare, l2cache, log;
uint64_t wholedisk;
nvl = zpool_find_vdev(zhp, vdev_path, &avail_spare, &l2cache, &log);
if (!nvl)
return (0);
verify(nvlist_lookup_uint64(nvl, ZPOOL_CONFIG_WHOLE_DISK,
&wholedisk) == 0);
return (wholedisk);
}
/*
* If the device size grew more than 1% then return true.
*/
#define DEVICE_GREW(oldsize, newsize) \
((newsize > oldsize) && \
((newsize / (newsize - oldsize)) <= 100))
static int
zfsdle_vdev_online(zpool_handle_t *zhp, void *data)
{
char *devname = data;
boolean_t avail_spare, l2cache;
nvlist_t *udev_nvl = data;
nvlist_t *tgt;
int error;
char *tmp_devname, devname[MAXPATHLEN];
uint64_t guid;
if (nvlist_lookup_uint64(udev_nvl, ZFS_EV_VDEV_GUID, &guid) == 0) {
sprintf(devname, "%llu", (u_longlong_t)guid);
} else if (nvlist_lookup_string(udev_nvl, DEV_PHYS_PATH,
&tmp_devname) == 0) {
strlcpy(devname, tmp_devname, MAXPATHLEN);
zfs_append_partition(devname, MAXPATHLEN);
} else {
zed_log_msg(LOG_INFO, "%s: no guid or physpath", __func__);
}
zed_log_msg(LOG_INFO, "zfsdle_vdev_online: searching for '%s' in '%s'",
devname, zpool_get_name(zhp));
@@ -953,12 +1029,75 @@ zfsdle_vdev_online(zpool_handle_t *zhp, void *data)
vdev_state_t newstate;
if (zpool_get_state(zhp) != POOL_STATE_UNAVAIL) {
error = zpool_vdev_online(zhp, fullpath, 0,
&newstate);
zed_log_msg(LOG_INFO, "zfsdle_vdev_online: "
"setting device '%s' to ONLINE state "
"in pool '%s': %d", fullpath,
zpool_get_name(zhp), error);
/*
* If this disk size has not changed, then
* there's no need to do an autoexpand. To
* check we look at the disk's size in its
* config, and compare it to the disk size
* that udev is reporting.
*/
uint64_t udev_size = 0, conf_size = 0,
wholedisk = 0, udev_parent_size = 0;
/*
* Get the size of our disk that udev is
* reporting.
*/
if (nvlist_lookup_uint64(udev_nvl, DEV_SIZE,
&udev_size) != 0) {
udev_size = 0;
}
/*
* Get the size of our disk's parent device
* from udev (where sda1's parent is sda).
*/
if (nvlist_lookup_uint64(udev_nvl,
DEV_PARENT_SIZE, &udev_parent_size) != 0) {
udev_parent_size = 0;
}
conf_size = vdev_size_from_config(zhp,
fullpath);
wholedisk = vdev_whole_disk_from_config(zhp,
fullpath);
/*
* Only attempt an autoexpand if the vdev size
* changed. There are two different cases
* to consider.
*
* 1. wholedisk=1
* If you do a 'zpool create' on a whole disk
* (like /dev/sda), then zfs will create
* partitions on the disk (like /dev/sda1). In
* that case, wholedisk=1 will be set in the
* partition's nvlist config. So zed will need
* to see if your parent device (/dev/sda)
* expanded in size, and if so, then attempt
* the autoexpand.
*
* 2. wholedisk=0
* If you do a 'zpool create' on an existing
* partition, or a device that doesn't allow
* partitions, then wholedisk=0, and you will
* simply need to check if the device itself
* expanded in size.
*/
if (DEVICE_GREW(conf_size, udev_size) ||
(wholedisk && DEVICE_GREW(conf_size,
udev_parent_size))) {
error = zpool_vdev_online(zhp, fullpath,
0, &newstate);
zed_log_msg(LOG_INFO,
"%s: autoexpanding '%s' from %llu"
" to %llu bytes in pool '%s': %d",
__func__, fullpath, conf_size,
MAX(udev_size, udev_parent_size),
zpool_get_name(zhp), error);
}
}
}
zpool_close(zhp);
@@ -989,7 +1128,7 @@ zfs_deliver_dle(nvlist_t *nvl)
zed_log_msg(LOG_INFO, "zfs_deliver_dle: no guid or physpath");
}
if (zpool_iter(g_zfshdl, zfsdle_vdev_online, name) != 1) {
if (zpool_iter(g_zfshdl, zfsdle_vdev_online, nvl) != 1) {
zed_log_msg(LOG_INFO, "zfs_deliver_dle: device '%s' not "
"found", name);
return (1);
+17 -4
View File
@@ -224,6 +224,8 @@ zed_notify()
# ZED_EMAIL_OPTS. This undergoes the following keyword substitutions:
# - @ADDRESS@ is replaced with the space-delimited recipient email address(es)
# - @SUBJECT@ is replaced with the notification subject
# If @SUBJECT@ was omited here, a "Subject: ..." header will be added to notification
#
#
# Arguments
# subject: notification subject
@@ -241,7 +243,7 @@ zed_notify()
#
zed_notify_email()
{
local subject="$1"
local subject="${1:-"ZED notification"}"
local pathname="${2:-"/dev/null"}"
: "${ZED_EMAIL_PROG:="mail"}"
@@ -262,12 +264,23 @@ zed_notify_email()
return 1
fi
ZED_EMAIL_OPTS="$(echo "${ZED_EMAIL_OPTS}" \
# construct cmdline options
ZED_EMAIL_OPTS_PARSED="$(echo "${ZED_EMAIL_OPTS}" \
| sed -e "s/@ADDRESS@/${ZED_EMAIL_ADDR}/g" \
-e "s/@SUBJECT@/${subject}/g")"
# shellcheck disable=SC2086
eval ${ZED_EMAIL_PROG} ${ZED_EMAIL_OPTS} < "${pathname}" >/dev/null 2>&1
# pipe message to email prog
# shellcheck disable=SC2086,SC2248
{
# no subject passed as option?
if [ "${ZED_EMAIL_OPTS%@SUBJECT@*}" = "${ZED_EMAIL_OPTS}" ] ; then
# inject subject header
printf "Subject: %s\n" "${subject}"
fi
# output message
cat "${pathname}"
} |
eval ${ZED_EMAIL_PROG} ${ZED_EMAIL_OPTS_PARSED} >/dev/null 2>&1
rv=$?
if [ "${rv}" -ne 0 ]; then
zed_log_err "${ZED_EMAIL_PROG##*/} exit=${rv}"
+1
View File
@@ -30,6 +30,7 @@ ZED_EMAIL_ADDR="root"
# The string @SUBJECT@ will be replaced with the notification subject;
# this should be protected with quotes to prevent word-splitting.
# Email will only be sent if ZED_EMAIL_ADDR is defined.
# If @SUBJECT@ was omited here, a "Subject: ..." header will be added to notification
#
#ZED_EMAIL_OPTS="-s '@SUBJECT@' @ADDRESS@"
+50 -10
View File
@@ -78,6 +78,8 @@ zed_udev_event(const char *class, const char *subclass, nvlist_t *nvl)
zed_log_msg(LOG_INFO, "\t%s: %s", DEV_PHYS_PATH, strval);
if (nvlist_lookup_uint64(nvl, DEV_SIZE, &numval) == 0)
zed_log_msg(LOG_INFO, "\t%s: %llu", DEV_SIZE, numval);
if (nvlist_lookup_uint64(nvl, DEV_PARENT_SIZE, &numval) == 0)
zed_log_msg(LOG_INFO, "\t%s: %llu", DEV_PARENT_SIZE, numval);
if (nvlist_lookup_uint64(nvl, ZFS_EV_POOL_GUID, &numval) == 0)
zed_log_msg(LOG_INFO, "\t%s: %llu", ZFS_EV_POOL_GUID, numval);
if (nvlist_lookup_uint64(nvl, ZFS_EV_VDEV_GUID, &numval) == 0)
@@ -130,6 +132,20 @@ dev_event_nvlist(struct udev_device *dev)
numval *= strtoull(value, NULL, 10);
(void) nvlist_add_uint64(nvl, DEV_SIZE, numval);
/*
* If the device has a parent, then get the parent block
* device's size as well. For example, /dev/sda1's parent
* is /dev/sda.
*/
struct udev_device *parent_dev = udev_device_get_parent(dev);
if ((value = udev_device_get_sysattr_value(parent_dev, "size"))
!= NULL) {
uint64_t numval = DEV_BSIZE;
numval *= strtoull(value, NULL, 10);
(void) nvlist_add_uint64(nvl, DEV_PARENT_SIZE, numval);
}
}
/*
@@ -169,7 +185,7 @@ zed_udev_monitor(void *arg)
while (1) {
struct udev_device *dev;
const char *action, *type, *part, *sectors;
const char *bus, *uuid;
const char *bus, *uuid, *devpath;
const char *class, *subclass;
nvlist_t *nvl;
boolean_t is_zfs = B_FALSE;
@@ -208,6 +224,12 @@ zed_udev_monitor(void *arg)
* if this is a disk and it is partitioned, then the
* zfs label will reside in a DEVTYPE=partition and
* we can skip passing this event
*
* Special case: Blank disks are sometimes reported with
* an erroneous 'atari' partition, and should not be
* excluded from being used as an autoreplace disk:
*
* https://github.com/openzfs/zfs/issues/13497
*/
type = udev_device_get_property_value(dev, "DEVTYPE");
part = udev_device_get_property_value(dev,
@@ -215,14 +237,23 @@ zed_udev_monitor(void *arg)
if (type != NULL && type[0] != '\0' &&
strcmp(type, "disk") == 0 &&
part != NULL && part[0] != '\0') {
zed_log_msg(LOG_INFO,
"%s: skip %s since it has a %s partition already",
__func__,
udev_device_get_property_value(dev, "DEVNAME"),
part);
/* skip and wait for partition event */
udev_device_unref(dev);
continue;
const char *devname =
udev_device_get_property_value(dev, "DEVNAME");
if (strcmp(part, "atari") == 0) {
zed_log_msg(LOG_INFO,
"%s: %s is reporting an atari partition, "
"but we're going to assume it's a false "
"positive and still use it (issue #13497)",
__func__, devname);
} else {
zed_log_msg(LOG_INFO,
"%s: skip %s since it has a %s partition "
"already", __func__, devname, part);
/* skip and wait for partition event */
udev_device_unref(dev);
continue;
}
}
/*
@@ -248,10 +279,19 @@ zed_udev_monitor(void *arg)
* device id string is required in the message schema
* for matching with vdevs. Preflight here for expected
* udev information.
*
* Special case:
* NVMe devices don't have ID_BUS set (at least on RHEL 7-8),
* but they are valid for autoreplace. Add a special case for
* them by searching for "/nvme/" in the udev DEVPATH:
*
* DEVPATH=/devices/pci0000:00/0000:00:1e.0/nvme/nvme2/nvme2n1
*/
bus = udev_device_get_property_value(dev, "ID_BUS");
uuid = udev_device_get_property_value(dev, "DM_UUID");
if (!is_zfs && (bus == NULL && uuid == NULL)) {
devpath = udev_device_get_devpath(dev);
if (!is_zfs && (bus == NULL && uuid == NULL &&
strstr(devpath, "/nvme/") == NULL)) {
zed_log_msg(LOG_INFO, "zed_udev_monitor: %s no devid "
"source", udev_device_get_devnode(dev));
udev_device_unref(dev);
+6 -3
View File
@@ -2480,7 +2480,7 @@ upgrade_set_callback(zfs_handle_t *zhp, void *data)
/* upgrade */
if (version < cb->cb_version) {
char verstr[16];
char verstr[24];
(void) snprintf(verstr, sizeof (verstr),
"%llu", (u_longlong_t)cb->cb_version);
if (cb->cb_lastfs[0] && !same_pool(zhp, cb->cb_lastfs)) {
@@ -6593,7 +6593,7 @@ zfs_do_holds(int argc, char **argv)
/*
* 1. collect holds data, set format options
*/
ret = zfs_for_each(argc, argv, flags, types, NULL, NULL, limit,
ret = zfs_for_each(1, argv + i, flags, types, NULL, NULL, limit,
holds_callback, &cb);
if (ret != 0)
++errors;
@@ -7672,7 +7672,7 @@ zfs_do_diff(int argc, char **argv)
int c;
struct sigaction sa;
while ((c = getopt(argc, argv, "FHt")) != -1) {
while ((c = getopt(argc, argv, "FHth")) != -1) {
switch (c) {
case 'F':
flags |= ZFS_DIFF_CLASSIFY;
@@ -7683,6 +7683,9 @@ zfs_do_diff(int argc, char **argv)
case 't':
flags |= ZFS_DIFF_TIMESTAMP;
break;
case 'h':
flags |= ZFS_DIFF_NO_MANGLE;
break;
default:
(void) fprintf(stderr,
gettext("invalid option '%c'\n"), optopt);
+10 -4
View File
@@ -207,7 +207,6 @@ static int
zfs_project_handle_dir(const char *name, zfs_project_control_t *zpc,
list_t *head)
{
char fullname[PATH_MAX];
struct dirent *ent;
DIR *dir;
int ret = 0;
@@ -227,21 +226,28 @@ zfs_project_handle_dir(const char *name, zfs_project_control_t *zpc,
zpc->zpc_ignore_noent = B_TRUE;
errno = 0;
while (!ret && (ent = readdir(dir)) != NULL) {
char *fullname;
/* skip "." and ".." */
if (strcmp(ent->d_name, ".") == 0 ||
strcmp(ent->d_name, "..") == 0)
continue;
if (strlen(ent->d_name) + strlen(name) >=
sizeof (fullname) + 1) {
if (strlen(ent->d_name) + strlen(name) + 1 >= PATH_MAX) {
errno = ENAMETOOLONG;
break;
}
sprintf(fullname, "%s/%s", name, ent->d_name);
if (asprintf(&fullname, "%s/%s", name, ent->d_name) == -1) {
errno = ENOMEM;
break;
}
ret = zfs_project_handle_one(fullname, zpc);
if (!ret && zpc->zpc_recursive && ent->d_type == DT_DIR)
zfs_project_item_alloc(head, fullname);
free(fullname);
}
if (errno && !ret) {
+20 -8
View File
@@ -2438,7 +2438,14 @@ print_status_config(zpool_handle_t *zhp, status_cbdata_t *cb, const char *name,
(void) nvlist_lookup_uint64_array(root, ZPOOL_CONFIG_SCAN_STATS,
(uint64_t **)&ps, &c);
if (ps != NULL && ps->pss_state == DSS_SCANNING && children == 0) {
/*
* If you force fault a drive that's resilvering, its scan stats can
* get frozen in time, giving the false impression that it's
* being resilvered. That's why we check the state to see if the vdev
* is healthy before reporting "resilvering" or "repairing".
*/
if (ps != NULL && ps->pss_state == DSS_SCANNING && children == 0 &&
vs->vs_state == VDEV_STATE_HEALTHY) {
if (vs->vs_scan_processed != 0) {
(void) printf(gettext(" (%s)"),
(ps->pss_func == POOL_SCAN_RESILVER) ?
@@ -2450,7 +2457,7 @@ print_status_config(zpool_handle_t *zhp, status_cbdata_t *cb, const char *name,
/* The top-level vdevs have the rebuild stats */
if (vrs != NULL && vrs->vrs_state == VDEV_REBUILD_ACTIVE &&
children == 0) {
children == 0 && vs->vs_state == VDEV_STATE_HEALTHY) {
if (vs->vs_rebuild_processed != 0) {
(void) printf(gettext(" (resilvering)"));
}
@@ -5458,8 +5465,8 @@ get_namewidth_iostat(zpool_handle_t *zhp, void *data)
* get_namewidth() returns the maximum width of any name in that column
* for any pool/vdev/device line that will be output.
*/
width = get_namewidth(zhp, cb->cb_namewidth, cb->cb_name_flags,
cb->cb_verbose);
width = get_namewidth(zhp, cb->cb_namewidth,
cb->cb_name_flags | VDEV_NAME_TYPE_ID, cb->cb_verbose);
/*
* The width we are calculating is the width of the header and also the
@@ -6035,6 +6042,7 @@ print_one_column(zpool_prop_t prop, uint64_t value, const char *str,
size_t width = zprop_width(prop, &fixed, ZFS_TYPE_POOL);
switch (prop) {
case ZPOOL_PROP_SIZE:
case ZPOOL_PROP_EXPANDSZ:
case ZPOOL_PROP_CHECKPOINT:
case ZPOOL_PROP_DEDUPRATIO:
@@ -6130,8 +6138,12 @@ print_list_stats(zpool_handle_t *zhp, const char *name, nvlist_t *nv,
* 'toplevel' boolean value is passed to the print_one_column()
* to indicate that the value is valid.
*/
print_one_column(ZPOOL_PROP_SIZE, vs->vs_space, NULL, scripted,
toplevel, format);
if (vs->vs_pspace)
print_one_column(ZPOOL_PROP_SIZE, vs->vs_pspace, NULL,
scripted, B_TRUE, format);
else
print_one_column(ZPOOL_PROP_SIZE, vs->vs_space, NULL,
scripted, toplevel, format);
print_one_column(ZPOOL_PROP_ALLOCATED, vs->vs_alloc, NULL,
scripted, toplevel, format);
print_one_column(ZPOOL_PROP_FREE, vs->vs_space - vs->vs_alloc,
@@ -6282,8 +6294,8 @@ get_namewidth_list(zpool_handle_t *zhp, void *data)
list_cbdata_t *cb = data;
int width;
width = get_namewidth(zhp, cb->cb_namewidth, cb->cb_name_flags,
cb->cb_verbose);
width = get_namewidth(zhp, cb->cb_namewidth,
cb->cb_name_flags | VDEV_NAME_TYPE_ID, cb->cb_verbose);
if (width < 9)
width = 9;
+6 -4
View File
@@ -28,15 +28,17 @@ filter_out_deleted_zvols() {
list_zvols() {
read -r default_volmode < /sys/module/zfs/parameters/zvol_volmode
zfs list -t volume -H -o \
name,volmode,receive_resume_token,redact_snaps |
while IFS=" " read -r name volmode token redacted; do # IFS=\t here!
name,volmode,receive_resume_token,redact_snaps,keystatus |
while IFS=" " read -r name volmode token redacted keystatus; do # IFS=\t here!
# /dev links are not created for zvols with volmode = "none"
# or for redacted zvols.
# /dev links are not created for zvols with volmode = "none",
# redacted zvols, or encrypted zvols for which the key has not
# been loaded.
[ "$volmode" = "none" ] && continue
[ "$volmode" = "default" ] && [ "$default_volmode" = "3" ] &&
continue
[ "$redacted" = "-" ] || continue
[ "$keystatus" = "unavailable" ] && continue
# We also ignore partially received zvols if it is
# not an incremental receive, as those won't even have a block
+32 -36
View File
@@ -88,7 +88,7 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_CC_NO_FORMAT_TRUNCATION], [
])
dnl #
dnl # Check if gcc supports -Wno-format-truncation option.
dnl # Check if gcc supports -Wno-format-zero-length option.
dnl #
AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_CC_NO_FORMAT_ZERO_LENGTH], [
AC_MSG_CHECKING([whether $CC supports -Wno-format-zero-length])
@@ -108,57 +108,30 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_CC_NO_FORMAT_ZERO_LENGTH], [
AC_SUBST([NO_FORMAT_ZERO_LENGTH])
])
dnl #
dnl # Check if gcc supports -Wno-bool-compare option.
dnl # Check if gcc supports -Wno-clobbered option.
dnl #
dnl # We actually invoke gcc with the -Wbool-compare option
dnl # We actually invoke gcc with the -Wclobbered option
dnl # and infer the 'no-' version does or doesn't exist based upon
dnl # the results. This is required because when checking any of
dnl # no- prefixed options gcc always returns success.
dnl #
AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_CC_NO_BOOL_COMPARE], [
AC_MSG_CHECKING([whether $CC supports -Wno-bool-compare])
AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_CC_NO_CLOBBERED], [
AC_MSG_CHECKING([whether $CC supports -Wno-clobbered])
saved_flags="$CFLAGS"
CFLAGS="$CFLAGS -Werror -Wbool-compare"
CFLAGS="$CFLAGS -Werror -Wclobbered"
AC_COMPILE_IFELSE([AC_LANG_PROGRAM([], [])], [
NO_BOOL_COMPARE=-Wno-bool-compare
NO_CLOBBERED=-Wno-clobbered
AC_MSG_RESULT([yes])
], [
NO_BOOL_COMPARE=
NO_CLOBBERED=
AC_MSG_RESULT([no])
])
CFLAGS="$saved_flags"
AC_SUBST([NO_BOOL_COMPARE])
])
dnl #
dnl # Check if gcc supports -Wno-unused-but-set-variable option.
dnl #
dnl # We actually invoke gcc with the -Wunused-but-set-variable option
dnl # and infer the 'no-' version does or doesn't exist based upon
dnl # the results. This is required because when checking any of
dnl # no- prefixed options gcc always returns success.
dnl #
AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_CC_NO_UNUSED_BUT_SET_VARIABLE], [
AC_MSG_CHECKING([whether $CC supports -Wno-unused-but-set-variable])
saved_flags="$CFLAGS"
CFLAGS="$CFLAGS -Werror -Wunused-but-set-variable"
AC_COMPILE_IFELSE([AC_LANG_PROGRAM([], [])], [
NO_UNUSED_BUT_SET_VARIABLE=-Wno-unused-but-set-variable
AC_MSG_RESULT([yes])
], [
NO_UNUSED_BUT_SET_VARIABLE=
AC_MSG_RESULT([no])
])
CFLAGS="$saved_flags"
AC_SUBST([NO_UNUSED_BUT_SET_VARIABLE])
AC_SUBST([NO_CLOBBERED])
])
dnl #
@@ -184,6 +157,29 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_CC_IMPLICIT_FALLTHROUGH], [
AC_SUBST([IMPLICIT_FALLTHROUGH])
])
dnl #
dnl # Check if cc supports -Winfinite-recursion option.
dnl #
AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_CC_INFINITE_RECURSION], [
AC_MSG_CHECKING([whether $CC supports -Winfinite-recursion])
saved_flags="$CFLAGS"
CFLAGS="$CFLAGS -Werror -Winfinite-recursion"
AC_COMPILE_IFELSE([AC_LANG_PROGRAM([], [])], [
INFINITE_RECURSION=-Winfinite-recursion
AC_DEFINE([HAVE_INFINITE_RECURSION], 1,
[Define if compiler supports -Winfinite-recursion])
AC_MSG_RESULT([yes])
], [
INFINITE_RECURSION=
AC_MSG_RESULT([no])
])
CFLAGS="$saved_flags"
AC_SUBST([INFINITE_RECURSION])
])
dnl #
dnl # Check if gcc supports -fno-omit-frame-pointer option.
dnl #
+8
View File
@@ -0,0 +1,8 @@
dnl #
dnl # Check if GNU parallel is available.
dnl #
AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_PARALLEL], [
AC_CHECK_PROG([PARALLEL], [parallel], [yes])
AM_CONDITIONAL([HAVE_PARALLEL], [test "x$PARALLEL" = "xyes"])
])
+1 -1
View File
@@ -224,7 +224,7 @@ EOD`
ac_python_version=$PYTHON_VERSION
else
ac_python_version=`$PYTHON -c "import sys; \
print (sys.version[[:3]])"`
print ('.'.join(sys.version.split('.')[[:2]]))"`
fi
fi
+2 -3
View File
@@ -3,16 +3,15 @@ dnl # 5.16 API change
dnl # add_disk grew a must-check return code
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_ADD_DISK], [
ZFS_LINUX_TEST_SRC([add_disk_ret], [
#include <linux/genhd.h>
#include <linux/blkdev.h>
], [
struct gendisk *disk = NULL;
int err = add_disk(disk);
err = err;
])
])
AC_DEFUN([ZFS_AC_KERNEL_ADD_DISK], [
AC_MSG_CHECKING([whether add_disk() returns int])
ZFS_LINUX_TEST_RESULT([add_disk_ret],
+38 -1
View File
@@ -464,7 +464,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLK_CGROUP_HEADER], [
])
AC_DEFUN([ZFS_AC_KERNEL_BLK_CGROUP_HEADER], [
AC_MSG_CHECKING([for existence of linux/blk-cgroup.h])
AC_MSG_CHECKING([whether linux/blk-cgroup.h exists])
ZFS_LINUX_TEST_RESULT([blk_cgroup_header],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_LINUX_BLK_CGROUP_HEADER, 1,
@@ -474,6 +474,41 @@ AC_DEFUN([ZFS_AC_KERNEL_BLK_CGROUP_HEADER], [
])
])
dnl #
dnl # Linux 5.18 API
dnl #
dnl # In 07888c665b405b1cd3577ddebfeb74f4717a84c4 ("block: pass a block_device and opf to bio_alloc")
dnl # bio_alloc(gfp_t gfp_mask, unsigned short nr_iovecs)
dnl # became
dnl # bio_alloc(struct block_device *bdev, unsigned short nr_vecs, unsigned int opf, gfp_t gfp_mask)
dnl # however
dnl # > NULL/0 can be passed, both for the
dnl # > passthrough case on a raw request_queue and to temporarily avoid
dnl # > refactoring some nasty code.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BIO_ALLOC_4ARG], [
ZFS_LINUX_TEST_SRC([bio_alloc_4arg], [
#include <linux/bio.h>
],[
gfp_t gfp_mask = 0;
unsigned short nr_iovecs = 0;
struct block_device *bdev = NULL;
unsigned int opf = 0;
struct bio *__attribute__((unused)) allocated = bio_alloc(bdev, nr_iovecs, opf, gfp_mask);
])
])
AC_DEFUN([ZFS_AC_KERNEL_BIO_ALLOC_4ARG], [
AC_MSG_CHECKING([whether bio_alloc() wants 4 args])
ZFS_LINUX_TEST_RESULT([bio_alloc_4arg],[
AC_MSG_RESULT(yes)
AC_DEFINE([HAVE_BIO_ALLOC_4ARG], 1, [bio_alloc() takes 4 arguments])
],[
AC_MSG_RESULT(no)
])
])
AC_DEFUN([ZFS_AC_KERNEL_SRC_BIO], [
ZFS_AC_KERNEL_SRC_REQ
ZFS_AC_KERNEL_SRC_BIO_OPS
@@ -488,6 +523,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BIO], [
ZFS_AC_KERNEL_SRC_BDEV_SUBMIT_BIO_RETURNS_VOID
ZFS_AC_KERNEL_SRC_BIO_SET_DEV_MACRO
ZFS_AC_KERNEL_SRC_BLK_CGROUP_HEADER
ZFS_AC_KERNEL_SRC_BIO_ALLOC_4ARG
])
AC_DEFUN([ZFS_AC_KERNEL_BIO], [
@@ -512,4 +548,5 @@ AC_DEFUN([ZFS_AC_KERNEL_BIO], [
ZFS_AC_KERNEL_BIO_BDEV_DISK
ZFS_AC_KERNEL_BDEV_SUBMIT_BIO_RETURNS_VOID
ZFS_AC_KERNEL_BLK_CGROUP_HEADER
ZFS_AC_KERNEL_BIO_ALLOC_4ARG
])
+74 -30
View File
@@ -74,6 +74,8 @@ AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_UPDATE_READAHEAD], [
AC_DEFINE(HAVE_BLK_QUEUE_UPDATE_READAHEAD, 1,
[blk_queue_update_readahead() exists])
],[
AC_MSG_RESULT(no)
AC_MSG_CHECKING([whether disk_update_readahead() exists])
ZFS_LINUX_TEST_RESULT([disk_update_readahead], [
AC_MSG_RESULT(yes)
@@ -86,69 +88,111 @@ AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_UPDATE_READAHEAD], [
])
dnl #
dnl # 2.6.32 API,
dnl # blk_queue_discard()
dnl # 5.19: bdev_max_discard_sectors() available
dnl # 2.6.32: blk_queue_discard() available
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLK_QUEUE_DISCARD], [
ZFS_LINUX_TEST_SRC([bdev_max_discard_sectors], [
#include <linux/blkdev.h>
],[
struct block_device *bdev __attribute__ ((unused)) = NULL;
unsigned int error __attribute__ ((unused));
error = bdev_max_discard_sectors(bdev);
])
ZFS_LINUX_TEST_SRC([blk_queue_discard], [
#include <linux/blkdev.h>
],[
struct request_queue *q __attribute__ ((unused)) = NULL;
struct request_queue r;
struct request_queue *q = &r;
int value __attribute__ ((unused));
memset(q, 0, sizeof(r));
value = blk_queue_discard(q);
])
])
AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_DISCARD], [
AC_MSG_CHECKING([whether blk_queue_discard() is available])
ZFS_LINUX_TEST_RESULT([blk_queue_discard], [
AC_MSG_CHECKING([whether bdev_max_discard_sectors() is available])
ZFS_LINUX_TEST_RESULT([bdev_max_discard_sectors], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BDEV_MAX_DISCARD_SECTORS, 1,
[bdev_max_discard_sectors() is available])
],[
ZFS_LINUX_TEST_ERROR([blk_queue_discard])
AC_MSG_RESULT(no)
AC_MSG_CHECKING([whether blk_queue_discard() is available])
ZFS_LINUX_TEST_RESULT([blk_queue_discard], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BLK_QUEUE_DISCARD, 1,
[blk_queue_discard() is available])
],[
ZFS_LINUX_TEST_ERROR([blk_queue_discard])
])
])
])
dnl #
dnl # 4.8 API,
dnl # blk_queue_secure_erase()
dnl #
dnl # 2.6.36 - 4.7 API,
dnl # blk_queue_secdiscard()
dnl # 5.19: bdev_max_secure_erase_sectors() available
dnl # 4.8: blk_queue_secure_erase() available
dnl # 2.6.36: blk_queue_secdiscard() available
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLK_QUEUE_SECURE_ERASE], [
ZFS_LINUX_TEST_SRC([bdev_max_secure_erase_sectors], [
#include <linux/blkdev.h>
],[
struct block_device *bdev __attribute__ ((unused)) = NULL;
unsigned int error __attribute__ ((unused));
error = bdev_max_secure_erase_sectors(bdev);
])
ZFS_LINUX_TEST_SRC([blk_queue_secure_erase], [
#include <linux/blkdev.h>
],[
struct request_queue *q __attribute__ ((unused)) = NULL;
struct request_queue r;
struct request_queue *q = &r;
int value __attribute__ ((unused));
memset(q, 0, sizeof(r));
value = blk_queue_secure_erase(q);
])
ZFS_LINUX_TEST_SRC([blk_queue_secdiscard], [
#include <linux/blkdev.h>
],[
struct request_queue *q __attribute__ ((unused)) = NULL;
struct request_queue r;
struct request_queue *q = &r;
int value __attribute__ ((unused));
memset(q, 0, sizeof(r));
value = blk_queue_secdiscard(q);
])
])
AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_SECURE_ERASE], [
AC_MSG_CHECKING([whether blk_queue_secure_erase() is available])
ZFS_LINUX_TEST_RESULT([blk_queue_secure_erase], [
AC_MSG_CHECKING([whether bdev_max_secure_erase_sectors() is available])
ZFS_LINUX_TEST_RESULT([bdev_max_secure_erase_sectors], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BLK_QUEUE_SECURE_ERASE, 1,
[blk_queue_secure_erase() is available])
AC_DEFINE(HAVE_BDEV_MAX_SECURE_ERASE_SECTORS, 1,
[bdev_max_secure_erase_sectors() is available])
],[
AC_MSG_RESULT(no)
AC_MSG_CHECKING([whether blk_queue_secdiscard() is available])
ZFS_LINUX_TEST_RESULT([blk_queue_secdiscard], [
AC_MSG_CHECKING([whether blk_queue_secure_erase() is available])
ZFS_LINUX_TEST_RESULT([blk_queue_secure_erase], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BLK_QUEUE_SECDISCARD, 1,
[blk_queue_secdiscard() is available])
AC_DEFINE(HAVE_BLK_QUEUE_SECURE_ERASE, 1,
[blk_queue_secure_erase() is available])
],[
ZFS_LINUX_TEST_ERROR([blk_queue_secure_erase])
AC_MSG_RESULT(no)
AC_MSG_CHECKING([whether blk_queue_secdiscard() is available])
ZFS_LINUX_TEST_RESULT([blk_queue_secdiscard], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BLK_QUEUE_SECDISCARD, 1,
[blk_queue_secdiscard() is available])
],[
ZFS_LINUX_TEST_ERROR([blk_queue_secure_erase])
])
])
])
])
@@ -215,17 +259,17 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLK_QUEUE_FLUSH], [
ZFS_LINUX_TEST_SRC([blk_queue_flush], [
#include <linux/blkdev.h>
], [
struct request_queue *q = NULL;
struct request_queue *q __attribute__ ((unused)) = NULL;
(void) blk_queue_flush(q, REQ_FLUSH);
], [$NO_UNUSED_BUT_SET_VARIABLE], [ZFS_META_LICENSE])
], [], [ZFS_META_LICENSE])
ZFS_LINUX_TEST_SRC([blk_queue_write_cache], [
#include <linux/kernel.h>
#include <linux/blkdev.h>
], [
struct request_queue *q = NULL;
struct request_queue *q __attribute__ ((unused)) = NULL;
blk_queue_write_cache(q, true, true);
], [$NO_UNUSED_BUT_SET_VARIABLE], [ZFS_META_LICENSE])
], [], [ZFS_META_LICENSE])
])
AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_FLUSH], [
@@ -278,9 +322,9 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLK_QUEUE_MAX_HW_SECTORS], [
ZFS_LINUX_TEST_SRC([blk_queue_max_hw_sectors], [
#include <linux/blkdev.h>
], [
struct request_queue *q = NULL;
struct request_queue *q __attribute__ ((unused)) = NULL;
(void) blk_queue_max_hw_sectors(q, BLK_SAFE_MAX_SECTORS);
], [$NO_UNUSED_BUT_SET_VARIABLE])
], [])
])
AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_MAX_HW_SECTORS], [
@@ -301,9 +345,9 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLK_QUEUE_MAX_SEGMENTS], [
ZFS_LINUX_TEST_SRC([blk_queue_max_segments], [
#include <linux/blkdev.h>
], [
struct request_queue *q = NULL;
struct request_queue *q __attribute__ ((unused)) = NULL;
(void) blk_queue_max_segments(q, BLK_MAX_SEGMENTS);
], [$NO_UNUSED_BUT_SET_VARIABLE])
], [])
])
AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_MAX_SEGMENTS], [
+81
View File
@@ -294,6 +294,83 @@ AC_DEFUN([ZFS_AC_KERNEL_BLKDEV_BDEV_WHOLE], [
])
])
dnl #
dnl # 5.20 API change,
dnl # Removed bdevname(), snprintf(.., %pg) should be used.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLKDEV_BDEVNAME], [
ZFS_LINUX_TEST_SRC([bdevname], [
#include <linux/fs.h>
#include <linux/blkdev.h>
], [
struct block_device *bdev __attribute__ ((unused)) = NULL;
char path[BDEVNAME_SIZE];
(void) bdevname(bdev, path);
])
])
AC_DEFUN([ZFS_AC_KERNEL_BLKDEV_BDEVNAME], [
AC_MSG_CHECKING([whether bdevname() exists])
ZFS_LINUX_TEST_RESULT([bdevname], [
AC_DEFINE(HAVE_BDEVNAME, 1, [bdevname() is available])
AC_MSG_RESULT(yes)
], [
AC_MSG_RESULT(no)
])
])
dnl #
dnl # 5.19 API: blkdev_issue_secure_erase()
dnl # 3.10 API: blkdev_issue_discard(..., BLKDEV_DISCARD_SECURE)
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLKDEV_ISSUE_SECURE_ERASE], [
ZFS_LINUX_TEST_SRC([blkdev_issue_secure_erase], [
#include <linux/blkdev.h>
],[
struct block_device *bdev = NULL;
sector_t sector = 0;
sector_t nr_sects = 0;
int error __attribute__ ((unused));
error = blkdev_issue_secure_erase(bdev,
sector, nr_sects, GFP_KERNEL);
])
ZFS_LINUX_TEST_SRC([blkdev_issue_discard_flags], [
#include <linux/blkdev.h>
],[
struct block_device *bdev = NULL;
sector_t sector = 0;
sector_t nr_sects = 0;
unsigned long flags = 0;
int error __attribute__ ((unused));
error = blkdev_issue_discard(bdev,
sector, nr_sects, GFP_KERNEL, flags);
])
])
AC_DEFUN([ZFS_AC_KERNEL_BLKDEV_ISSUE_SECURE_ERASE], [
AC_MSG_CHECKING([whether blkdev_issue_secure_erase() is available])
ZFS_LINUX_TEST_RESULT([blkdev_issue_secure_erase], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BLKDEV_ISSUE_SECURE_ERASE, 1,
[blkdev_issue_secure_erase() is available])
],[
AC_MSG_RESULT(no)
AC_MSG_CHECKING([whether blkdev_issue_discard() is available])
ZFS_LINUX_TEST_RESULT([blkdev_issue_discard_flags], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BLKDEV_ISSUE_DISCARD, 1,
[blkdev_issue_discard() is available])
],[
ZFS_LINUX_TEST_ERROR([blkdev_issue_discard()])
])
])
])
dnl #
dnl # 5.13 API change
dnl # blkdev_get_by_path() no longer handles ERESTARTSYS
@@ -326,6 +403,8 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLKDEV], [
ZFS_AC_KERNEL_SRC_BLKDEV_CHECK_DISK_CHANGE
ZFS_AC_KERNEL_SRC_BLKDEV_BDEV_CHECK_MEDIA_CHANGE
ZFS_AC_KERNEL_SRC_BLKDEV_BDEV_WHOLE
ZFS_AC_KERNEL_SRC_BLKDEV_BDEVNAME
ZFS_AC_KERNEL_SRC_BLKDEV_ISSUE_SECURE_ERASE
])
AC_DEFUN([ZFS_AC_KERNEL_BLKDEV], [
@@ -339,5 +418,7 @@ AC_DEFUN([ZFS_AC_KERNEL_BLKDEV], [
ZFS_AC_KERNEL_BLKDEV_CHECK_DISK_CHANGE
ZFS_AC_KERNEL_BLKDEV_BDEV_CHECK_MEDIA_CHANGE
ZFS_AC_KERNEL_BLKDEV_BDEV_WHOLE
ZFS_AC_KERNEL_BLKDEV_BDEVNAME
ZFS_AC_KERNEL_BLKDEV_GET_ERESTARTSYS
ZFS_AC_KERNEL_BLKDEV_ISSUE_SECURE_ERASE
])
+12 -5
View File
@@ -6,13 +6,16 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS_CHECK_EVENTS], [
#include <linux/blkdev.h>
unsigned int blk_check_events(struct gendisk *disk,
unsigned int clearing) { return (0); }
unsigned int clearing) {
(void) disk, (void) clearing;
return (0);
}
static const struct block_device_operations
bops __attribute__ ((unused)) = {
.check_events = blk_check_events,
};
], [], [$NO_UNUSED_BUT_SET_VARIABLE])
], [], [])
])
AC_DEFUN([ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_CHECK_EVENTS], [
@@ -31,7 +34,10 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS_RELEASE_VOID], [
ZFS_LINUX_TEST_SRC([block_device_operations_release_void], [
#include <linux/blkdev.h>
void blk_release(struct gendisk *g, fmode_t mode) { return; }
void blk_release(struct gendisk *g, fmode_t mode) {
(void) g, (void) mode;
return;
}
static const struct block_device_operations
bops __attribute__ ((unused)) = {
@@ -40,7 +46,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS_RELEASE_VOID], [
.ioctl = NULL,
.compat_ioctl = NULL,
};
], [], [$NO_UNUSED_BUT_SET_VARIABLE])
], [], [])
])
AC_DEFUN([ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_RELEASE_VOID], [
@@ -61,6 +67,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS_REVALIDATE_DISK], [
#include <linux/blkdev.h>
int blk_revalidate_disk(struct gendisk *disk) {
(void) disk;
return(0);
}
@@ -68,7 +75,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS_REVALIDATE_DISK], [
bops __attribute__ ((unused)) = {
.revalidate_disk = blk_revalidate_disk,
};
], [], [$NO_UNUSED_BUT_SET_VARIABLE])
], [], [])
])
AC_DEFUN([ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_REVALIDATE_DISK], [
+86 -2
View File
@@ -19,19 +19,48 @@ AC_DEFUN([ZFS_AC_KERNEL_CONFIG_DEFINED], [
])
])
ZFS_AC_KERNEL_SRC_CONFIG_MODULES
ZFS_AC_KERNEL_SRC_CONFIG_BLOCK
ZFS_AC_KERNEL_SRC_CONFIG_DEBUG_LOCK_ALLOC
ZFS_AC_KERNEL_SRC_CONFIG_TRIM_UNUSED_KSYMS
ZFS_AC_KERNEL_SRC_CONFIG_ZLIB_INFLATE
ZFS_AC_KERNEL_SRC_CONFIG_ZLIB_DEFLATE
ZFS_AC_KERNEL_SRC_CONFIG_ZLIB_INFLATE
AC_MSG_CHECKING([for kernel config option compatibility])
ZFS_LINUX_TEST_COMPILE_ALL([config])
AC_MSG_RESULT([done])
ZFS_AC_KERNEL_CONFIG_MODULES
ZFS_AC_KERNEL_CONFIG_BLOCK
ZFS_AC_KERNEL_CONFIG_DEBUG_LOCK_ALLOC
ZFS_AC_KERNEL_CONFIG_TRIM_UNUSED_KSYMS
ZFS_AC_KERNEL_CONFIG_ZLIB_INFLATE
ZFS_AC_KERNEL_CONFIG_ZLIB_DEFLATE
ZFS_AC_KERNEL_CONFIG_ZLIB_INFLATE
])
dnl #
dnl # Check CONFIG_BLOCK
dnl #
dnl # Verify the kernel has CONFIG_BLOCK support enabled.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_CONFIG_BLOCK], [
ZFS_LINUX_TEST_SRC([config_block], [
#if !defined(CONFIG_BLOCK)
#error CONFIG_BLOCK not defined
#endif
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_CONFIG_BLOCK], [
AC_MSG_CHECKING([whether CONFIG_BLOCK is defined])
ZFS_LINUX_TEST_RESULT([config_block], [
AC_MSG_RESULT([yes])
],[
AC_MSG_RESULT([no])
AC_MSG_ERROR([
*** This kernel does not include the required block device support.
*** Rebuild the kernel with CONFIG_BLOCK=y set.])
])
])
dnl #
@@ -72,6 +101,61 @@ AC_DEFUN([ZFS_AC_KERNEL_CONFIG_DEBUG_LOCK_ALLOC], [
])
])
dnl #
dnl # Check CONFIG_MODULES
dnl #
dnl # Verify the kernel has CONFIG_MODULES support enabled.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_CONFIG_MODULES], [
ZFS_LINUX_TEST_SRC([config_modules], [
#if !defined(CONFIG_MODULES)
#error CONFIG_MODULES not defined
#endif
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_CONFIG_MODULES], [
AC_MSG_CHECKING([whether CONFIG_MODULES is defined])
AS_IF([test "x$enable_linux_builtin" != xyes], [
ZFS_LINUX_TEST_RESULT([config_modules], [
AC_MSG_RESULT([yes])
],[
AC_MSG_RESULT([no])
AC_MSG_ERROR([
*** This kernel does not include the required loadable module
*** support!
***
*** To build OpenZFS as a loadable Linux kernel module
*** enable loadable module support by setting
*** `CONFIG_MODULES=y` in the kernel configuration and run
*** `make modules_prepare` in the Linux source tree.
***
*** If you don't intend to enable loadable kernel module
*** support, please compile OpenZFS as a Linux kernel built-in.
***
*** Prepare the Linux source tree by running `make prepare`,
*** use the OpenZFS `--enable-linux-builtin` configure option,
*** copy the OpenZFS sources into the Linux source tree using
*** `./copy-builtin <linux source directory>`,
*** set `CONFIG_ZFS=y` in the kernel configuration and compile
*** kernel as usual.
])
])
], [
ZFS_LINUX_TRY_COMPILE([], [], [
AC_MSG_RESULT([not needed])
],[
AC_MSG_RESULT([error])
AC_MSG_ERROR([
*** This kernel is unable to compile object files.
***
*** Please make sure you prepared the Linux source tree
*** by running `make prepare` there.
])
])
])
])
dnl #
dnl # Check CONFIG_TRIM_UNUSED_KSYMS
dnl #
+29
View File
@@ -0,0 +1,29 @@
dnl #
dnl # On certain architectures `__copy_from_user_inatomic`
dnl # is a GPL exported variable and cannot be used by OpenZFS.
dnl #
dnl #
dnl # Checking if `__copy_from_user_inatomic` is available.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC___COPY_FROM_USER_INATOMIC], [
ZFS_LINUX_TEST_SRC([__copy_from_user_inatomic], [
#include <linux/uaccess.h>
], [
int result __attribute__ ((unused)) = __copy_from_user_inatomic(NULL, NULL, 0);
], [], [ZFS_META_LICENSE])
])
AC_DEFUN([ZFS_AC_KERNEL___COPY_FROM_USER_INATOMIC], [
AC_MSG_CHECKING([whether __copy_from_user_inatomic is available])
ZFS_LINUX_TEST_RESULT([__copy_from_user_inatomic_license], [
AC_MSG_RESULT(yes)
], [
AC_MSG_RESULT(no)
AC_MSG_ERROR([
*** The `__copy_from_user_inatomic()` Linux kernel function is
*** incompatible with the CDDL license and will prevent the module
*** linking stage from succeeding. OpenZFS cannot be compiled.
])
])
])
+24 -7
View File
@@ -2,6 +2,9 @@ dnl #
dnl # Handle differences in kernel FPU code.
dnl #
dnl # Kernel
dnl # 5.19: The asm/fpu/internal.h header was removed, it has been
dnl # effectively empty since the 5.16 kernel.
dnl #
dnl # 5.16: XCR code put into asm/fpu/xcr.h
dnl # HAVE_KERNEL_FPU_XCR_HEADER
dnl #
@@ -33,21 +36,31 @@ AC_DEFUN([ZFS_AC_KERNEL_FPU_HEADER], [
],[
AC_DEFINE(HAVE_KERNEL_FPU_API_HEADER, 1,
[kernel has asm/fpu/api.h])
AC_MSG_RESULT(asm/fpu/api.h)
AC_MSG_CHECKING([whether fpu/xcr header is available])
fpu_headers="asm/fpu/api.h"
ZFS_LINUX_TRY_COMPILE([
#include <linux/module.h>
#include <asm/fpu/xcr.h>
],[
],[
AC_DEFINE(HAVE_KERNEL_FPU_XCR_HEADER, 1,
[kernel has asm/fpu/xcr.h])
AC_MSG_RESULT(asm/fpu/xcr.h)
],[
AC_MSG_RESULT(no asm/fpu/xcr.h)
[kernel has asm/fpu/xcr.h])
fpu_headers="$fpu_headers asm/fpu/xcr.h"
])
ZFS_LINUX_TRY_COMPILE([
#include <linux/module.h>
#include <asm/fpu/internal.h>
],[
],[
AC_DEFINE(HAVE_KERNEL_FPU_INTERNAL_HEADER, 1,
[kernel has asm/fpu/internal.h])
fpu_headers="$fpu_headers asm/fpu/internal.h"
])
AC_MSG_RESULT([$fpu_headers])
],[
AC_MSG_RESULT(i387.h & xcr.h)
AC_MSG_RESULT([i387.h & xcr.h])
])
])
@@ -93,7 +106,9 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_FPU], [
#include <linux/types.h>
#ifdef HAVE_KERNEL_FPU_API_HEADER
#include <asm/fpu/api.h>
#ifdef HAVE_KERNEL_FPU_INTERNAL_HEADER
#include <asm/fpu/internal.h>
#endif
#else
#include <asm/i387.h>
#include <asm/xcr.h>
@@ -130,7 +145,9 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_FPU], [
#include <linux/types.h>
#ifdef HAVE_KERNEL_FPU_API_HEADER
#include <asm/fpu/api.h>
#ifdef HAVE_KERNEL_FPU_INTERNAL_HEADER
#include <asm/fpu/internal.h>
#endif
#else
#include <asm/i387.h>
#include <asm/xcr.h>
+55 -28
View File
@@ -2,6 +2,19 @@ dnl #
dnl # Check for generic io accounting interface.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_GENERIC_IO_ACCT], [
ZFS_LINUX_TEST_SRC([bdev_io_acct], [
#include <linux/blkdev.h>
], [
struct block_device *bdev = NULL;
struct bio *bio = NULL;
unsigned long passed_time = 0;
unsigned long start_time;
start_time = bdev_start_io_acct(bdev, bio_sectors(bio),
bio_op(bio), passed_time);
bdev_end_io_acct(bdev, bio_op(bio), start_time);
])
ZFS_LINUX_TEST_SRC([disk_io_acct], [
#include <linux/blkdev.h>
], [
@@ -50,61 +63,75 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_GENERIC_IO_ACCT], [
AC_DEFUN([ZFS_AC_KERNEL_GENERIC_IO_ACCT], [
dnl #
dnl # 5.12 API,
dnl # 5.19 API,
dnl #
dnl # bio_start_io_acct() and bio_end_io_acct() became GPL-exported
dnl # so use disk_start_io_acct() and disk_end_io_acct() instead
dnl # disk_start_io_acct() and disk_end_io_acct() have been replaced by
dnl # bdev_start_io_acct() and bdev_end_io_acct().
dnl #
AC_MSG_CHECKING([whether generic disk_*_io_acct() are available])
ZFS_LINUX_TEST_RESULT([disk_io_acct], [
AC_MSG_CHECKING([whether generic bdev_*_io_acct() are available])
ZFS_LINUX_TEST_RESULT([bdev_io_acct], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_DISK_IO_ACCT, 1, [disk_*_io_acct() available])
AC_DEFINE(HAVE_BDEV_IO_ACCT, 1, [bdev_*_io_acct() available])
], [
AC_MSG_RESULT(no)
dnl #
dnl # 5.7 API,
dnl # 5.12 API,
dnl #
dnl # Added bio_start_io_acct() and bio_end_io_acct() helpers.
dnl # bio_start_io_acct() and bio_end_io_acct() became GPL-exported
dnl # so use disk_start_io_acct() and disk_end_io_acct() instead
dnl #
AC_MSG_CHECKING([whether generic bio_*_io_acct() are available])
ZFS_LINUX_TEST_RESULT([bio_io_acct], [
AC_MSG_CHECKING([whether generic disk_*_io_acct() are available])
ZFS_LINUX_TEST_RESULT([disk_io_acct], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BIO_IO_ACCT, 1, [bio_*_io_acct() available])
AC_DEFINE(HAVE_DISK_IO_ACCT, 1, [disk_*_io_acct() available])
], [
AC_MSG_RESULT(no)
dnl #
dnl # 4.14 API,
dnl # 5.7 API,
dnl #
dnl # generic_start_io_acct/generic_end_io_acct now require
dnl # request_queue to be provided. No functional changes,
dnl # but preparation for inflight accounting.
dnl # Added bio_start_io_acct() and bio_end_io_acct() helpers.
dnl #
AC_MSG_CHECKING([whether generic_*_io_acct wants 4 args])
ZFS_LINUX_TEST_RESULT_SYMBOL([generic_acct_4args],
[generic_start_io_acct], [block/bio.c], [
AC_MSG_CHECKING([whether generic bio_*_io_acct() are available])
ZFS_LINUX_TEST_RESULT([bio_io_acct], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_GENERIC_IO_ACCT_4ARG, 1,
[generic_*_io_acct() 4 arg available])
AC_DEFINE(HAVE_BIO_IO_ACCT, 1, [bio_*_io_acct() available])
], [
AC_MSG_RESULT(no)
dnl #
dnl # 3.19 API addition
dnl # 4.14 API,
dnl #
dnl # torvalds/linux@394ffa50 allows us to increment
dnl # iostat counters without generic_make_request().
dnl # generic_start_io_acct/generic_end_io_acct now require
dnl # request_queue to be provided. No functional changes,
dnl # but preparation for inflight accounting.
dnl #
AC_MSG_CHECKING(
[whether generic_*_io_acct wants 3 args])
ZFS_LINUX_TEST_RESULT_SYMBOL([generic_acct_3args],
AC_MSG_CHECKING([whether generic_*_io_acct wants 4 args])
ZFS_LINUX_TEST_RESULT_SYMBOL([generic_acct_4args],
[generic_start_io_acct], [block/bio.c], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_GENERIC_IO_ACCT_3ARG, 1,
[generic_*_io_acct() 3 arg available])
AC_DEFINE(HAVE_GENERIC_IO_ACCT_4ARG, 1,
[generic_*_io_acct() 4 arg available])
], [
AC_MSG_RESULT(no)
dnl #
dnl # 3.19 API addition
dnl #
dnl # torvalds/linux@394ffa50 allows us to increment
dnl # iostat counters without generic_make_request().
dnl #
AC_MSG_CHECKING(
[whether generic_*_io_acct wants 3 args])
ZFS_LINUX_TEST_RESULT_SYMBOL([generic_acct_3args],
[generic_start_io_acct], [block/bio.c], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_GENERIC_IO_ACCT_3ARG, 1,
[generic_*_io_acct() 3 arg available])
], [
AC_MSG_RESULT(no)
])
])
])
])
+58
View File
@@ -0,0 +1,58 @@
dnl #
dnl # 5.17 API change,
dnl #
dnl # GENHD_FL_EXT_DEVT flag removed
dnl # GENHD_FL_NO_PART_SCAN renamed GENHD_FL_NO_PART
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_GENHD_FLAGS], [
ZFS_LINUX_TEST_SRC([genhd_fl_ext_devt], [
#include <linux/blkdev.h>
], [
int flags __attribute__ ((unused)) = GENHD_FL_EXT_DEVT;
])
ZFS_LINUX_TEST_SRC([genhd_fl_no_part], [
#include <linux/blkdev.h>
], [
int flags __attribute__ ((unused)) = GENHD_FL_NO_PART;
])
ZFS_LINUX_TEST_SRC([genhd_fl_no_part_scan], [
#include <linux/blkdev.h>
], [
int flags __attribute__ ((unused)) = GENHD_FL_NO_PART_SCAN;
])
])
AC_DEFUN([ZFS_AC_KERNEL_GENHD_FLAGS], [
AC_MSG_CHECKING([whether GENHD_FL_EXT_DEVT flag is available])
ZFS_LINUX_TEST_RESULT([genhd_fl_ext_devt], [
AC_MSG_RESULT(yes)
AC_DEFINE(ZFS_GENHD_FL_EXT_DEVT, GENHD_FL_EXT_DEVT,
[GENHD_FL_EXT_DEVT flag is available])
], [
AC_MSG_RESULT(no)
AC_DEFINE(ZFS_GENHD_FL_EXT_DEVT, 0,
[GENHD_FL_EXT_DEVT flag is not available])
])
AC_MSG_CHECKING([whether GENHD_FL_NO_PART flag is available])
ZFS_LINUX_TEST_RESULT([genhd_fl_no_part], [
AC_MSG_RESULT(yes)
AC_DEFINE(ZFS_GENHD_FL_NO_PART, GENHD_FL_NO_PART,
[GENHD_FL_NO_PART flag is available])
], [
AC_MSG_RESULT(no)
AC_MSG_CHECKING([whether GENHD_FL_NO_PART_SCAN flag is available])
ZFS_LINUX_TEST_RESULT([genhd_fl_no_part_scan], [
AC_MSG_RESULT(yes)
AC_DEFINE(ZFS_GENHD_FL_NO_PART, GENHD_FL_NO_PART_SCAN,
[GENHD_FL_NO_PART_SCAN flag is available])
], [
ZFS_LINUX_TEST_ERROR([GENHD_FL_NO_PART|GENHD_FL_NO_PART_SCAN])
])
])
])
+2 -2
View File
@@ -5,9 +5,9 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_GET_DISK_RO], [
ZFS_LINUX_TEST_SRC([get_disk_ro], [
#include <linux/blkdev.h>
],[
struct gendisk *disk = NULL;
struct gendisk *disk __attribute__ ((unused)) = NULL;
(void) get_disk_ro(disk);
], [$NO_UNUSED_BUT_SET_VARIABLE])
], [])
])
AC_DEFUN([ZFS_AC_KERNEL_GET_DISK_RO], [
+2 -2
View File
@@ -6,8 +6,8 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_GROUP_INFO_GID], [
ZFS_LINUX_TEST_SRC([group_info_gid], [
#include <linux/cred.h>
],[
struct group_info *gi = groups_alloc(1);
gi->gid[0] = KGIDT_INIT(0);
struct group_info gi __attribute__ ((unused)) = {};
gi.gid[0] = KGIDT_INIT(0);
])
])
+20
View File
@@ -49,6 +49,13 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_MAKE_REQUEST_FN], [
struct gendisk *disk __attribute__ ((unused));
disk = blk_alloc_disk(NUMA_NO_NODE);
])
ZFS_LINUX_TEST_SRC([blk_cleanup_disk], [
#include <linux/blkdev.h>
],[
struct gendisk *disk __attribute__ ((unused));
blk_cleanup_disk(disk);
])
])
AC_DEFUN([ZFS_AC_KERNEL_MAKE_REQUEST_FN], [
@@ -73,6 +80,19 @@ AC_DEFUN([ZFS_AC_KERNEL_MAKE_REQUEST_FN], [
ZFS_LINUX_TEST_RESULT([blk_alloc_disk], [
AC_MSG_RESULT(yes)
AC_DEFINE([HAVE_BLK_ALLOC_DISK], 1, [blk_alloc_disk() exists])
dnl #
dnl # 5.20 API change,
dnl # Removed blk_cleanup_disk(), put_disk() should be used.
dnl #
AC_MSG_CHECKING([whether blk_cleanup_disk() exists])
ZFS_LINUX_TEST_RESULT([blk_cleanup_disk], [
AC_MSG_RESULT(yes)
AC_DEFINE([HAVE_BLK_CLEANUP_DISK], 1,
[blk_cleanup_disk() exists])
], [
AC_MSG_RESULT(no)
])
], [
AC_MSG_RESULT(no)
])
+2
View File
@@ -53,6 +53,8 @@ AC_DEFUN([ZFS_AC_KERNEL_MKDIR], [
AC_DEFINE(HAVE_IOPS_MKDIR_USERNS, 1,
[iops->mkdir() takes struct user_namespace*])
],[
AC_MSG_RESULT(no)
AC_MSG_CHECKING([whether iops->mkdir() takes umode_t])
ZFS_LINUX_TEST_RESULT([inode_operations_mkdir], [
AC_MSG_RESULT(yes)
+1 -1
View File
@@ -15,7 +15,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_PAGEMAP_FOLIO_WAIT_BIT], [
])
AC_DEFUN([ZFS_AC_KERNEL_PAGEMAP_FOLIO_WAIT_BIT], [
AC_MSG_CHECKING([folio_wait_bit() exists])
AC_MSG_CHECKING([whether folio_wait_bit() exists])
ZFS_LINUX_TEST_RESULT([pagemap_has_folio_wait_bit], [
AC_MSG_RESULT([yes])
AC_DEFINE(HAVE_PAGEMAP_FOLIO_WAIT_BIT, 1,
+25
View File
@@ -0,0 +1,25 @@
dnl #
dnl # Linux 5.18 removes address_space_operations ->readpages in favour of
dnl # ->readahead
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_VFS_READPAGES], [
ZFS_LINUX_TEST_SRC([vfs_has_readpages], [
#include <linux/fs.h>
static const struct address_space_operations
aops __attribute__ ((unused)) = {
.readpages = NULL,
};
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_VFS_READPAGES], [
AC_MSG_CHECKING([whether aops->readpages exists])
ZFS_LINUX_TEST_RESULT([vfs_has_readpages], [
AC_MSG_RESULT([yes])
AC_DEFINE(HAVE_VFS_READPAGES, 1,
[address_space_operations->readpages exists])
],[
AC_MSG_RESULT([no])
])
])
+2 -2
View File
@@ -8,14 +8,14 @@ dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_REVALIDATE_DISK], [
ZFS_LINUX_TEST_SRC([revalidate_disk_size], [
#include <linux/genhd.h>
#include <linux/blkdev.h>
], [
struct gendisk *disk = NULL;
(void) revalidate_disk_size(disk, false);
])
ZFS_LINUX_TEST_SRC([revalidate_disk], [
#include <linux/genhd.h>
#include <linux/blkdev.h>
], [
struct gendisk *disk = NULL;
(void) revalidate_disk(disk);
+52 -15
View File
@@ -54,6 +54,21 @@ AC_DEFUN([ZFS_AC_KERNEL_SHRINK_CONTROL_HAS_NID], [
])
])
AC_DEFUN([ZFS_AC_KERNEL_SRC_REGISTER_SHRINKER_VARARG], [
ZFS_LINUX_TEST_SRC([register_shrinker_vararg], [
#include <linux/mm.h>
unsigned long shrinker_cb(struct shrinker *shrink,
struct shrink_control *sc) { return 0; }
],[
struct shrinker cache_shrinker = {
.count_objects = shrinker_cb,
.scan_objects = shrinker_cb,
.seeks = DEFAULT_SEEKS,
};
register_shrinker(&cache_shrinker, "vararg-reg-shrink-test");
])
])
AC_DEFUN([ZFS_AC_KERNEL_SRC_SHRINKER_CALLBACK], [
ZFS_LINUX_TEST_SRC([shrinker_cb_shrink_control], [
#include <linux/mm.h>
@@ -83,29 +98,50 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_SHRINKER_CALLBACK], [
AC_DEFUN([ZFS_AC_KERNEL_SHRINKER_CALLBACK],[
dnl #
dnl # 3.0 - 3.11 API change
dnl # ->shrink(struct shrinker *, struct shrink_control *sc)
dnl # 6.0 API change
dnl # register_shrinker() becomes a var-arg function that takes
dnl # a printf-style format string as args > 0
dnl #
AC_MSG_CHECKING([whether new 2-argument shrinker exists])
ZFS_LINUX_TEST_RESULT([shrinker_cb_shrink_control], [
AC_MSG_CHECKING([whether new var-arg register_shrinker() exists])
ZFS_LINUX_TEST_RESULT([register_shrinker_vararg], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_SINGLE_SHRINKER_CALLBACK, 1,
[new shrinker callback wants 2 args])
AC_DEFINE(HAVE_REGISTER_SHRINKER_VARARG, 1,
[register_shrinker is vararg])
dnl # We assume that the split shrinker callback exists if the
dnl # vararg register_shrinker() exists, because the latter is
dnl # a much more recent addition, and the macro test for the
dnl # var-arg version only works if the callback is split
AC_DEFINE(HAVE_SPLIT_SHRINKER_CALLBACK, 1,
[cs->count_objects exists])
],[
AC_MSG_RESULT(no)
dnl #
dnl # 3.12 API change,
dnl # ->shrink() is logically split in to
dnl # ->count_objects() and ->scan_objects()
dnl # 3.0 - 3.11 API change
dnl # cs->shrink(struct shrinker *, struct shrink_control *sc)
dnl #
AC_MSG_CHECKING([whether ->count_objects callback exists])
ZFS_LINUX_TEST_RESULT([shrinker_cb_shrink_control_split], [
AC_MSG_CHECKING([whether new 2-argument shrinker exists])
ZFS_LINUX_TEST_RESULT([shrinker_cb_shrink_control], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_SPLIT_SHRINKER_CALLBACK, 1,
[->count_objects exists])
AC_DEFINE(HAVE_SINGLE_SHRINKER_CALLBACK, 1,
[new shrinker callback wants 2 args])
],[
ZFS_LINUX_TEST_ERROR([shrinker])
AC_MSG_RESULT(no)
dnl #
dnl # 3.12 API change,
dnl # cs->shrink() is logically split in to
dnl # cs->count_objects() and cs->scan_objects()
dnl #
AC_MSG_CHECKING([if cs->count_objects callback exists])
ZFS_LINUX_TEST_RESULT(
[shrinker_cb_shrink_control_split],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_SPLIT_SHRINKER_CALLBACK, 1,
[cs->count_objects exists])
],[
ZFS_LINUX_TEST_ERROR([shrinker])
])
])
])
])
@@ -141,6 +177,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_SHRINKER], [
ZFS_AC_KERNEL_SRC_SHRINK_CONTROL_HAS_NID
ZFS_AC_KERNEL_SRC_SHRINKER_CALLBACK
ZFS_AC_KERNEL_SRC_SHRINK_CONTROL_STRUCT
ZFS_AC_KERNEL_SRC_REGISTER_SHRINKER_VARARG
])
AC_DEFUN([ZFS_AC_KERNEL_SHRINKER], [
+37
View File
@@ -0,0 +1,37 @@
dnl #
dnl # Linux 5.2/5.18 API
dnl #
dnl # In cdb4f26a63c391317e335e6e683a614358e70aeb ("kobject: kobj_type: remove default_attrs")
dnl # struct kobj_type.default_attrs
dnl # was finally removed in favour of
dnl # struct kobj_type.default_groups
dnl #
dnl # This was added in aa30f47cf666111f6bbfd15f290a27e8a7b9d854 ("kobject: Add support for default attribute groups to kobj_type"),
dnl # if both are present (5.2-5.17), we prefer default_groups; they're otherwise equivalent
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_SYSFS_DEFAULT_GROUPS], [
ZFS_LINUX_TEST_SRC([sysfs_default_groups], [
#include <linux/kobject.h>
],[
struct kobj_type __attribute__ ((unused)) kt = {
.default_groups = (const struct attribute_group **)NULL };
])
])
AC_DEFUN([ZFS_AC_KERNEL_SYSFS_DEFAULT_GROUPS], [
AC_MSG_CHECKING([whether struct kobj_type.default_groups exists])
ZFS_LINUX_TEST_RESULT([sysfs_default_groups],[
AC_MSG_RESULT(yes)
AC_DEFINE([HAVE_SYSFS_DEFAULT_GROUPS], 1, [struct kobj_type has default_groups])
],[
AC_MSG_RESULT(no)
])
])
AC_DEFUN([ZFS_AC_KERNEL_SRC_SYSFS], [
ZFS_AC_KERNEL_SRC_SYSFS_DEFAULT_GROUPS
])
AC_DEFUN([ZFS_AC_KERNEL_SYSFS], [
ZFS_AC_KERNEL_SYSFS_DEFAULT_GROUPS
])
+30
View File
@@ -0,0 +1,30 @@
dnl #
dnl # Linux 5.18 uses filemap_dirty_folio in lieu of
dnl # ___set_page_dirty_nobuffers
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_VFS_FILEMAP_DIRTY_FOLIO], [
ZFS_LINUX_TEST_SRC([vfs_has_filemap_dirty_folio], [
#include <linux/pagemap.h>
#include <linux/writeback.h>
static const struct address_space_operations
aops __attribute__ ((unused)) = {
.dirty_folio = filemap_dirty_folio,
};
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_VFS_FILEMAP_DIRTY_FOLIO], [
dnl #
dnl # Linux 5.18 uses filemap_dirty_folio in lieu of
dnl # ___set_page_dirty_nobuffers
dnl #
AC_MSG_CHECKING([whether filemap_dirty_folio exists])
ZFS_LINUX_TEST_RESULT([vfs_has_filemap_dirty_folio], [
AC_MSG_RESULT([yes])
AC_DEFINE(HAVE_VFS_FILEMAP_DIRTY_FOLIO, 1,
[filemap_dirty_folio exists])
],[
AC_MSG_RESULT([no])
])
])
+2
View File
@@ -134,6 +134,8 @@ AC_DEFUN([ZFS_AC_KERNEL_VFS_IOV_ITER], [
AC_DEFINE(HAVE_IOV_ITER_FAULT_IN_READABLE, 1,
[iov_iter_fault_in_readable() is available])
],[
AC_MSG_RESULT(no)
AC_MSG_CHECKING([whether fault_in_iov_iter_readable() is available])
ZFS_LINUX_TEST_RESULT([fault_in_iov_iter_readable], [
AC_MSG_RESULT(yes)
+32
View File
@@ -0,0 +1,32 @@
dnl #
dnl # Linux 5.19 uses read_folio in lieu of readpage
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_VFS_READ_FOLIO], [
ZFS_LINUX_TEST_SRC([vfs_has_read_folio], [
#include <linux/fs.h>
static int
test_read_folio(struct file *file, struct folio *folio) {
(void) file; (void) folio;
return (0);
}
static const struct address_space_operations
aops __attribute__ ((unused)) = {
.read_folio = test_read_folio,
};
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_VFS_READ_FOLIO], [
dnl #
dnl # Linux 5.19 uses read_folio in lieu of readpage
dnl #
AC_MSG_CHECKING([whether read_folio exists])
ZFS_LINUX_TEST_RESULT([vfs_has_read_folio], [
AC_MSG_RESULT([yes])
AC_DEFINE(HAVE_VFS_READ_FOLIO, 1, [read_folio exists])
],[
AC_MSG_RESULT([no])
])
])
+1 -1
View File
@@ -23,7 +23,7 @@ AC_DEFUN([ZFS_AC_KERNEL_VFS_SET_PAGE_DIRTY_NOBUFFERS], [
dnl # Linux 5.14 change requires set_page_dirty() to be assigned
dnl # in address_space_operations()
dnl #
AC_MSG_CHECKING([__set_page_dirty_nobuffers exists])
AC_MSG_CHECKING([whether __set_page_dirty_nobuffers exists])
ZFS_LINUX_TEST_RESULT([vfs_has_set_page_dirty_nobuffers], [
AC_MSG_RESULT([yes])
AC_DEFINE(HAVE_VFS_SET_PAGE_DIRTY_NOBUFFERS, 1,
+28 -1
View File
@@ -100,6 +100,19 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_XATTR_HANDLER_GET], [
.get = get,
};
],[])
ZFS_LINUX_TEST_SRC([xattr_handler_get_dentry_inode_flags], [
#include <linux/xattr.h>
int get(const struct xattr_handler *handler,
struct dentry *dentry, struct inode *inode,
const char *name, void *buffer,
size_t size, int flags) { return 0; }
static const struct xattr_handler
xops __attribute__ ((unused)) = {
.get = get,
};
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_XATTR_HANDLER_GET], [
@@ -142,7 +155,21 @@ AC_DEFUN([ZFS_AC_KERNEL_XATTR_HANDLER_GET], [
AC_DEFINE(HAVE_XATTR_GET_DENTRY, 1,
[xattr_handler->get() wants dentry])
],[
ZFS_LINUX_TEST_ERROR([xattr get()])
dnl #
dnl # Android API change,
dnl # The xattr_handler->get() callback was
dnl # changed to take dentry, inode and flags.
dnl #
AC_MSG_RESULT(no)
AC_MSG_CHECKING(
[whether xattr_handler->get() wants dentry and inode and flags])
ZFS_LINUX_TEST_RESULT([xattr_handler_get_dentry_inode_flags], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_XATTR_GET_DENTRY_INODE_FLAGS, 1,
[xattr_handler->get() wants dentry and inode and flags])
],[
ZFS_LINUX_TEST_ERROR([xattr get()])
])
])
])
])
+27
View File
@@ -0,0 +1,27 @@
dnl #
dnl # ZERO_PAGE() is an alias for emtpy_zero_page. On certain architectures
dnl # this is a GPL exported variable.
dnl #
dnl #
dnl # Checking if ZERO_PAGE is exported GPL-only
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_ZERO_PAGE], [
ZFS_LINUX_TEST_SRC([zero_page], [
#include <asm/pgtable.h>
], [
struct page *p __attribute__ ((unused));
p = ZERO_PAGE(0);
], [], [ZFS_META_LICENSE])
])
AC_DEFUN([ZFS_AC_KERNEL_ZERO_PAGE], [
AC_MSG_CHECKING([whether ZERO_PAGE() is GPL-only])
ZFS_LINUX_TEST_RESULT([zero_page_license], [
AC_MSG_RESULT(no)
], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_ZERO_PAGE_GPL_ONLY, 1,
[ZERO_PAGE() is GPL-only])
])
])
+21 -28
View File
@@ -8,8 +8,8 @@ AC_DEFUN([ZFS_AC_CONFIG_KERNEL], [
ZFS_AC_QAT
dnl # Sanity checks for module building and CONFIG_* defines
ZFS_AC_KERNEL_TEST_MODULE
ZFS_AC_KERNEL_CONFIG_DEFINED
ZFS_AC_MODULE_SYMVERS
dnl # Sequential ZFS_LINUX_TRY_COMPILE tests
ZFS_AC_KERNEL_FPU_HEADER
@@ -61,6 +61,7 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_SRC], [
ZFS_AC_KERNEL_SRC_BIO
ZFS_AC_KERNEL_SRC_BLKDEV
ZFS_AC_KERNEL_SRC_BLK_QUEUE
ZFS_AC_KERNEL_SRC_GENHD_FLAGS
ZFS_AC_KERNEL_SRC_REVALIDATE_DISK
ZFS_AC_KERNEL_SRC_GET_DISK_RO
ZFS_AC_KERNEL_SRC_GENERIC_READLINK_GLOBAL
@@ -99,10 +100,14 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_SRC], [
ZFS_AC_KERNEL_SRC_SET_NLINK
ZFS_AC_KERNEL_SRC_SGET
ZFS_AC_KERNEL_SRC_LSEEK_EXECUTE
ZFS_AC_KERNEL_SRC_VFS_FILEMAP_DIRTY_FOLIO
ZFS_AC_KERNEL_SRC_VFS_READ_FOLIO
ZFS_AC_KERNEL_SRC_VFS_GETATTR
ZFS_AC_KERNEL_SRC_VFS_FSYNC_2ARGS
ZFS_AC_KERNEL_SRC_VFS_ITERATE
ZFS_AC_KERNEL_SRC_VFS_DIRECT_IO
ZFS_AC_KERNEL_SRC_VFS_READPAGES
ZFS_AC_KERNEL_SRC_VFS_SET_PAGE_DIRTY_NOBUFFERS
ZFS_AC_KERNEL_SRC_VFS_RW_ITERATE
ZFS_AC_KERNEL_SRC_VFS_GENERIC_WRITE_CHECKS
ZFS_AC_KERNEL_SRC_VFS_IOV_ITER
@@ -131,12 +136,14 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_SRC], [
ZFS_AC_KERNEL_SRC_BIO_MAX_SEGS
ZFS_AC_KERNEL_SRC_SIGNAL_STOP
ZFS_AC_KERNEL_SRC_SIGINFO
ZFS_AC_KERNEL_SRC_SYSFS
ZFS_AC_KERNEL_SRC_SET_SPECIAL_STATE
ZFS_AC_KERNEL_SRC_VFS_SET_PAGE_DIRTY_NOBUFFERS
ZFS_AC_KERNEL_SRC_STANDALONE_LINUX_STDARG
ZFS_AC_KERNEL_SRC_PAGEMAP_FOLIO_WAIT_BIT
ZFS_AC_KERNEL_SRC_ADD_DISK
ZFS_AC_KERNEL_SRC_KTHREAD
ZFS_AC_KERNEL_SRC_ZERO_PAGE
ZFS_AC_KERNEL_SRC___COPY_FROM_USER_INATOMIC
AC_MSG_CHECKING([for available kernel interfaces])
ZFS_LINUX_TEST_COMPILE_ALL([kabi])
@@ -171,6 +178,7 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_RESULT], [
ZFS_AC_KERNEL_BIO
ZFS_AC_KERNEL_BLKDEV
ZFS_AC_KERNEL_BLK_QUEUE
ZFS_AC_KERNEL_GENHD_FLAGS
ZFS_AC_KERNEL_REVALIDATE_DISK
ZFS_AC_KERNEL_GET_DISK_RO
ZFS_AC_KERNEL_GENERIC_READLINK_GLOBAL
@@ -209,10 +217,14 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_RESULT], [
ZFS_AC_KERNEL_SET_NLINK
ZFS_AC_KERNEL_SGET
ZFS_AC_KERNEL_LSEEK_EXECUTE
ZFS_AC_KERNEL_VFS_FILEMAP_DIRTY_FOLIO
ZFS_AC_KERNEL_VFS_READ_FOLIO
ZFS_AC_KERNEL_VFS_GETATTR
ZFS_AC_KERNEL_VFS_FSYNC_2ARGS
ZFS_AC_KERNEL_VFS_ITERATE
ZFS_AC_KERNEL_VFS_DIRECT_IO
ZFS_AC_KERNEL_VFS_READPAGES
ZFS_AC_KERNEL_VFS_SET_PAGE_DIRTY_NOBUFFERS
ZFS_AC_KERNEL_VFS_RW_ITERATE
ZFS_AC_KERNEL_VFS_GENERIC_WRITE_CHECKS
ZFS_AC_KERNEL_VFS_IOV_ITER
@@ -241,12 +253,14 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_RESULT], [
ZFS_AC_KERNEL_BIO_MAX_SEGS
ZFS_AC_KERNEL_SIGNAL_STOP
ZFS_AC_KERNEL_SIGINFO
ZFS_AC_KERNEL_SYSFS
ZFS_AC_KERNEL_SET_SPECIAL_STATE
ZFS_AC_KERNEL_VFS_SET_PAGE_DIRTY_NOBUFFERS
ZFS_AC_KERNEL_STANDALONE_LINUX_STDARG
ZFS_AC_KERNEL_PAGEMAP_FOLIO_WAIT_BIT
ZFS_AC_KERNEL_ADD_DISK
ZFS_AC_KERNEL_KTHREAD
ZFS_AC_KERNEL_ZERO_PAGE
ZFS_AC_KERNEL___COPY_FROM_USER_INATOMIC
])
dnl #
@@ -435,8 +449,6 @@ AC_DEFUN([ZFS_AC_KERNEL], [
AC_SUBST(LINUX)
AC_SUBST(LINUX_OBJ)
AC_SUBST(LINUX_VERSION)
ZFS_AC_MODULE_SYMVERS
])
dnl #
@@ -531,27 +543,6 @@ AC_DEFUN([ZFS_AC_QAT], [
])
])
dnl #
dnl # Basic toolchain sanity check.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_TEST_MODULE], [
AC_MSG_CHECKING([whether modules can be built])
ZFS_LINUX_TRY_COMPILE([], [], [
AC_MSG_RESULT([yes])
],[
AC_MSG_RESULT([no])
if test "x$enable_linux_builtin" != xyes; then
AC_MSG_ERROR([
*** Unable to build an empty module.
])
else
AC_MSG_ERROR([
*** Unable to build an empty module.
*** Please run 'make scripts' inside the kernel source tree.])
fi
])
])
dnl #
dnl # ZFS_LINUX_CONFTEST_H
dnl #
@@ -654,8 +645,10 @@ AC_DEFUN([ZFS_LINUX_COMPILE], [
build kernel modules with LLVM/CLANG toolchain])
AC_TRY_COMMAND([
KBUILD_MODPOST_NOFINAL="$5" KBUILD_MODPOST_WARN="$6"
make modules -k -j$TEST_JOBS ${KERNEL_CC:+CC=$KERNEL_CC} ${KERNEL_LD:+LD=$KERNEL_LD} ${KERNEL_LLVM:+LLVM=$KERNEL_LLVM} -C $LINUX_OBJ $ARCH_UM
M=$PWD/$1 >$1/build.log 2>&1])
make modules -k -j$TEST_JOBS ${KERNEL_CC:+CC=$KERNEL_CC}
${KERNEL_LD:+LD=$KERNEL_LD} ${KERNEL_LLVM:+LLVM=$KERNEL_LLVM}
CONFIG_MODULES=y CFLAGS_MODULE=-DCONFIG_MODULES
-C $LINUX_OBJ $ARCH_UM M=$PWD/$1 >$1/build.log 2>&1])
AS_IF([AC_TRY_COMMAND([$2])], [$3], [$4])
])
+7 -2
View File
@@ -209,8 +209,8 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS], [
AX_COUNT_CPUS([])
AC_SUBST(CPU_COUNT)
ZFS_AC_CONFIG_ALWAYS_CC_NO_UNUSED_BUT_SET_VARIABLE
ZFS_AC_CONFIG_ALWAYS_CC_NO_BOOL_COMPARE
ZFS_AC_CONFIG_ALWAYS_CC_NO_CLOBBERED
ZFS_AC_CONFIG_ALWAYS_CC_INFINITE_RECURSION
ZFS_AC_CONFIG_ALWAYS_CC_IMPLICIT_FALLTHROUGH
ZFS_AC_CONFIG_ALWAYS_CC_FRAME_LARGER_THAN
ZFS_AC_CONFIG_ALWAYS_CC_NO_FORMAT_TRUNCATION
@@ -226,6 +226,7 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS], [
ZFS_AC_CONFIG_ALWAYS_SED
ZFS_AC_CONFIG_ALWAYS_CPPCHECK
ZFS_AC_CONFIG_ALWAYS_SHELLCHECK
ZFS_AC_CONFIG_ALWAYS_PARALLEL
])
AC_DEFUN([ZFS_AC_CONFIG], [
@@ -323,6 +324,10 @@ AC_DEFUN([ZFS_AC_RPM], [
RPM_DEFINE_COMMON=${RPM_DEFINE_COMMON}' --define "$(DEBUG_KMEM_TRACKING_ZFS) 1"'
RPM_DEFINE_COMMON=${RPM_DEFINE_COMMON}' --define "$(ASAN_ZFS) 1"'
AS_IF([test "x$enable_debuginfo" = xyes], [
RPM_DEFINE_COMMON=${RPM_DEFINE_COMMON}' --define "__strip /bin/true"'
])
RPM_DEFINE_UTIL=' --define "_initconfdir $(initconfdir)"'
dnl # Make the next three RPM_DEFINE_UTIL additions conditional, since
+1 -3
View File
@@ -1,14 +1,12 @@
#!/bin/sh
. /lib/dracut-zfs-lib.sh
_do_zpool_export() {
ret=0
errs=""
final="${1}"
info "ZFS: Exporting ZFS storage pools..."
errs=$(export_all -F 2>&1)
errs=$(zpool export -aF 2>&1)
ret=$?
[ -z "${errs}" ] || echo "${errs}" | vwarn
if [ "x${ret}" != "x0" ]; then
+61 -101
View File
@@ -6,8 +6,8 @@ check() {
[ "${1}" = "-d" ] && return 0
# Verify the zfs tool chain
for tool in "@sbindir@/zgenhostid" "@sbindir@/zpool" "@sbindir@/zfs" "@mounthelperdir@/mount.zfs" ; do
test -x "$tool" || return 1
for tool in "zgenhostid" "zpool" "zfs" "mount.zfs"; do
command -v "${tool}" >/dev/null || return 1
done
return 0
@@ -19,125 +19,85 @@ depends() {
}
installkernel() {
instmods zfs
instmods zcommon
instmods znvpair
instmods zavl
instmods zunicode
instmods zlua
instmods icp
instmods spl
instmods zlib_deflate
instmods zlib_inflate
instmods -c zfs
}
install() {
inst_rules @udevruledir@/90-zfs.rules
inst_rules @udevruledir@/69-vdev.rules
inst_rules @udevruledir@/60-zvol.rules
dracut_install hostid
dracut_install grep
dracut_install @sbindir@/zgenhostid
dracut_install @sbindir@/zfs
dracut_install @sbindir@/zpool
# Workaround for https://github.com/openzfs/zfs/issues/4749 by
# ensuring libgcc_s.so(.1) is included
if ldd @sbindir@/zpool | grep -qF 'libgcc_s.so'; then
# Dracut will have already tracked and included it
:;
elif command -v gcc-config >/dev/null 2>&1; then
# On systems with gcc-config (Gentoo, Funtoo, etc.):
# Use the current profile to resolve the appropriate path
s="$(gcc-config -c)"
dracut_install "/usr/lib/gcc/${s%-*}/${s##*-}/libgcc_s.so"*
elif [ "$(echo /usr/lib/libgcc_s.so*)" != "/usr/lib/libgcc_s.so*" ]; then
# Try a simple path first
dracut_install /usr/lib/libgcc_s.so*
elif [ "$(echo /lib*/libgcc_s.so*)" != "/lib*/libgcc_s.so*" ]; then
# SUSE
dracut_install /lib*/libgcc_s.so*
else
# Fallback: Guess the path and include all matches
dracut_install /usr/lib*/gcc/**/libgcc_s.so*
inst_rules 90-zfs.rules 69-vdev.rules 60-zvol.rules
inst_multiple \
zgenhostid \
zfs \
zpool \
mount.zfs \
hostid \
grep \
awk \
tr \
cut \
head ||
{ dfatal "Failed to install essential binaries"; exit 1; }
# Adapted from https://github.com/zbm-dev/zfsbootmenu
if ! ldd "$(command -v zpool)" | grep -qF 'libgcc_s.so'; then
# On systems with gcc-config (Gentoo, Funtoo, etc.), use it to find libgcc_s
if command -v gcc-config >/dev/null; then
inst_simple "/usr/lib/gcc/$(s=$(gcc-config -c); echo "${s%-*}/${s##*-}")/libgcc_s.so.1" ||
{ dfatal "Unable to install libgcc_s.so"; exit 1; }
# Otherwise, use dracut's library installation function to find the right one
elif ! inst_libdir_file "libgcc_s.so*"; then
# If all else fails, just try looking for some gcc arch directory
inst_simple /usr/lib/gcc/*/*/libgcc_s.so* ||
{ dfatal "Unable to install libgcc_s.so"; exit 1; }
fi
fi
# shellcheck disable=SC2050
if [ @LIBFETCH_DYNAMIC@ -gt 0 ]; then
for d in $libdirs; do
[ -e "$d/@LIBFETCH_SONAME@" ] && dracut_install "$d/@LIBFETCH_SONAME@"
done
fi
dracut_install @mounthelperdir@/mount.zfs
dracut_install @udevdir@/vdev_id
dracut_install awk
dracut_install cut
dracut_install tr
dracut_install head
dracut_install @udevdir@/zvol_id
inst_hook cmdline 95 "${moddir}/parse-zfs.sh"
if [ -n "$systemdutildir" ] ; then
inst_script "${moddir}/zfs-generator.sh" "$systemdutildir"/system-generators/dracut-zfs-generator
if [ -n "${systemdutildir}" ]; then
inst_script "${moddir}/zfs-generator.sh" "${systemdutildir}/system-generators/dracut-zfs-generator"
fi
inst_hook pre-mount 90 "${moddir}/zfs-load-key.sh"
inst_hook mount 98 "${moddir}/mount-zfs.sh"
inst_hook cleanup 99 "${moddir}/zfs-needshutdown.sh"
inst_hook shutdown 20 "${moddir}/export-zfs.sh"
inst_simple "${moddir}/zfs-lib.sh" "/lib/dracut-zfs-lib.sh"
if [ -e @sysconfdir@/zfs/zpool.cache ]; then
inst @sysconfdir@/zfs/zpool.cache
type mark_hostonly >/dev/null 2>&1 && mark_hostonly @sysconfdir@/zfs/zpool.cache
fi
inst_script "${moddir}/zfs-lib.sh" "/lib/dracut-zfs-lib.sh"
if [ -e @sysconfdir@/zfs/vdev_id.conf ]; then
inst @sysconfdir@/zfs/vdev_id.conf
type mark_hostonly >/dev/null 2>&1 && mark_hostonly @sysconfdir@/zfs/vdev_id.conf
fi
# -H ensures they are marked host-only
# -o ensures there is no error upon absence of these files
inst_multiple -o -H \
"@sysconfdir@/zfs/zpool.cache" \
"@sysconfdir@/zfs/vdev_id.conf"
# Synchronize initramfs and system hostid
if [ -f @sysconfdir@/hostid ]; then
inst @sysconfdir@/hostid
type mark_hostonly >/dev/null 2>&1 && mark_hostonly @sysconfdir@/hostid
elif HOSTID="$(hostid 2>/dev/null)" && [ "${HOSTID}" != "00000000" ]; then
zgenhostid -o "${initdir}@sysconfdir@/hostid" "${HOSTID}"
type mark_hostonly >/dev/null 2>&1 && mark_hostonly @sysconfdir@/hostid
if ! inst_simple -H @sysconfdir@/hostid; then
if HOSTID="$(hostid 2>/dev/null)" && [ "${HOSTID}" != "00000000" ]; then
zgenhostid -o "${initdir}@sysconfdir@/hostid" "${HOSTID}"
mark_hostonly @sysconfdir@/hostid
fi
fi
if dracut_module_included "systemd"; then
mkdir -p "${initdir}/$systemdsystemunitdir/zfs-import.target.wants"
for _service in "zfs-import-scan.service" "zfs-import-cache.service" ; do
dracut_install "@systemdunitdir@/$_service"
if ! [ -L "${initdir}/$systemdsystemunitdir/zfs-import.target.wants/$_service" ]; then
ln -sf ../$_service "${initdir}/$systemdsystemunitdir/zfs-import.target.wants/$_service"
type mark_hostonly >/dev/null 2>&1 && mark_hostonly "@systemdunitdir@/$_service"
fi
inst_simple "${systemdsystemunitdir}/zfs-import.target"
systemctl -q --root "${initdir}" add-wants initrd.target zfs-import.target
inst_simple "${moddir}/zfs-env-bootfs.service" "${systemdsystemunitdir}/zfs-env-bootfs.service"
systemctl -q --root "${initdir}" add-wants zfs-import.target zfs-env-bootfs.service
for _service in \
"zfs-import-scan.service" \
"zfs-import-cache.service"; do
inst_simple "${systemdsystemunitdir}/${_service}"
systemctl -q --root "${initdir}" add-wants zfs-import.target "${_service}"
done
inst "${moddir}"/zfs-env-bootfs.service "${systemdsystemunitdir}"/zfs-env-bootfs.service
ln -s ../zfs-env-bootfs.service "${initdir}/${systemdsystemunitdir}/zfs-import.target.wants"/zfs-env-bootfs.service
type mark_hostonly >/dev/null 2>&1 && mark_hostonly @systemdunitdir@/zfs-env-bootfs.service
dracut_install systemd-ask-password
dracut_install systemd-tty-ask-password-agent
mkdir -p "${initdir}/$systemdsystemunitdir/initrd.target.wants"
dracut_install @systemdunitdir@/zfs-import.target
if ! [ -L "${initdir}/$systemdsystemunitdir/initrd.target.wants"/zfs-import.target ]; then
ln -s ../zfs-import.target "${initdir}/$systemdsystemunitdir/initrd.target.wants"/zfs-import.target
type mark_hostonly >/dev/null 2>&1 && mark_hostonly @systemdunitdir@/zfs-import.target
fi
for _service in zfs-snapshot-bootfs.service zfs-rollback-bootfs.service ; do
inst "${moddir}/$_service" "${systemdsystemunitdir}/$_service"
if ! [ -L "${initdir}/$systemdsystemunitdir/initrd.target.wants/$_service" ]; then
ln -s "../$_service" "${initdir}/$systemdsystemunitdir/initrd.target.wants/$_service"
fi
for _service in \
"zfs-snapshot-bootfs.service" \
"zfs-rollback-bootfs.service"; do
inst_simple "${moddir}/${_service}" "${systemdsystemunitdir}/${_service}"
systemctl -q --root "${initdir}" add-wants initrd.target "${_service}"
done
# There isn't a pkg-config variable for this,
# and dracut doesn't automatically resolve anything this'd be next to
local systemdsystemenvironmentgeneratordir
systemdsystemenvironmentgeneratordir="$(pkg-config --variable=prefix systemd || echo "/usr")/lib/systemd/system-environment-generators"
mkdir -p "${initdir}/${systemdsystemenvironmentgeneratordir}"
inst "${moddir}"/import-opts-generator.sh "${systemdsystemenvironmentgeneratordir}"/zfs-import-opts.sh
inst_simple "${moddir}/import-opts-generator.sh" "${systemdutildir}/system-environment-generators/zfs-import-opts.sh"
fi
}
+83 -49
View File
@@ -3,48 +3,73 @@
. /lib/dracut-zfs-lib.sh
ZFS_DATASET=""
ZFS_POOL=""
case "${root}" in
zfs:*) ;;
*) return ;;
esac
decode_root_args || return 0
GENERATOR_FILE=/run/systemd/generator/sysroot.mount
GENERATOR_EXTENSION=/run/systemd/generator/sysroot.mount.d/zfs-enhancement.conf
if [ -e "$GENERATOR_FILE" ] && [ -e "$GENERATOR_EXTENSION" ] ; then
# If the ZFS sysroot.mount flag exists, the initial RAM disk configured
# it to mount ZFS on root. In that case, we bail early. This flag
# file gets created by the zfs-generator program upon successful run.
info "ZFS: There is a sysroot.mount and zfs-generator has extended it."
info "ZFS: Delegating root mount to sysroot.mount."
# Let us tell the initrd to run on shutdown.
# We have a shutdown hook to run
# because we imported the pool.
if [ -e "$GENERATOR_FILE" ] && [ -e "$GENERATOR_EXTENSION" ]; then
# We're under systemd and dracut-zfs-generator ran to completion.
info "ZFS: Delegating root mount to sysroot.mount at al."
# We now prevent Dracut from running this thing again.
for zfsmounthook in "$hookdir"/mount/*zfs* ; do
if [ -f "$zfsmounthook" ] ; then
rm -f "$zfsmounthook"
fi
done
rm -f "$hookdir"/mount/*zfs*
return
fi
info "ZFS: No sysroot.mount exists or zfs-generator did not extend it."
info "ZFS: Mounting root with the traditional mount-zfs.sh instead."
# ask_for_password tries prompt cmd
#
# Wraps around plymouth ask-for-password and adds fallback to tty password ask
# if plymouth is not present.
ask_for_password() {
tries="$1"
prompt="$2"
cmd="$3"
{
flock -s 9
# Prompt for password with plymouth, if installed and running.
if plymouth --ping 2>/dev/null; then
plymouth ask-for-password \
--prompt "$prompt" --number-of-tries="$tries" | \
eval "$cmd"
ret=$?
else
i=1
while [ "$i" -le "$tries" ]; do
printf "%s [%i/%i]:" "$prompt" "$i" "$tries" >&2
eval "$cmd" && ret=0 && break
ret=$?
i=$((i+1))
printf '\n' >&2
done
unset i
fi
} 9>/.console_lock
[ "$ret" -ne 0 ] && echo "Wrong password" >&2
return "$ret"
}
# Delay until all required block devices are present.
modprobe zfs 2>/dev/null
udevadm settle
ZFS_DATASET=
ZFS_POOL=
if [ "${root}" = "zfs:AUTO" ] ; then
if ! ZFS_DATASET="$(find_bootfs)" ; then
if ! ZFS_DATASET="$(zpool get -Ho value bootfs | grep -m1 -vFx -)"; then
# shellcheck disable=SC2086
zpool import -N -a ${ZPOOL_IMPORT_OPTS}
if ! ZFS_DATASET="$(find_bootfs)" ; then
if ! ZFS_DATASET="$(zpool get -Ho value bootfs | grep -m1 -vFx -)"; then
warn "ZFS: No bootfs attribute found in importable pools."
export_all -F
zpool export -aF
rootok=0
return 1
@@ -53,34 +78,43 @@ if [ "${root}" = "zfs:AUTO" ] ; then
info "ZFS: Using ${ZFS_DATASET} as root."
fi
ZFS_DATASET="${ZFS_DATASET:-${root#zfs:}}"
ZFS_DATASET="${ZFS_DATASET:-${root}}"
ZFS_POOL="${ZFS_DATASET%%/*}"
if import_pool "${ZFS_POOL}" ; then
# Load keys if we can or if we need to
if [ "$(zpool list -H -o feature@encryption "${ZFS_POOL}")" = 'active' ]; then
# if the root dataset has encryption enabled
ENCRYPTIONROOT="$(zfs get -H -o value encryptionroot "${ZFS_DATASET}")"
if ! [ "${ENCRYPTIONROOT}" = "-" ]; then
KEYSTATUS="$(zfs get -H -o value keystatus "${ENCRYPTIONROOT}")"
# if the key needs to be loaded
if [ "$KEYSTATUS" = "unavailable" ]; then
# decrypt them
ask_for_password \
--tries 5 \
--prompt "Encrypted ZFS password for ${ENCRYPTIONROOT}: " \
--cmd "zfs load-key '${ENCRYPTIONROOT}'"
fi
if ! zpool get -Ho name "${ZFS_POOL}" > /dev/null 2>&1; then
info "ZFS: Importing pool ${ZFS_POOL}..."
# shellcheck disable=SC2086
if ! zpool import -N ${ZPOOL_IMPORT_OPTS} "${ZFS_POOL}"; then
warn "ZFS: Unable to import pool ${ZFS_POOL}"
rootok=0
return 1
fi
fi
# Load keys if we can or if we need to
# TODO: for_relevant_root_children like in zfs-load-key.sh.in
if [ "$(zpool get -Ho value feature@encryption "${ZFS_POOL}")" = 'active' ]; then
# if the root dataset has encryption enabled
ENCRYPTIONROOT="$(zfs get -Ho value encryptionroot "${ZFS_DATASET}")"
if ! [ "${ENCRYPTIONROOT}" = "-" ]; then
KEYSTATUS="$(zfs get -Ho value keystatus "${ENCRYPTIONROOT}")"
# if the key needs to be loaded
if [ "$KEYSTATUS" = "unavailable" ]; then
# decrypt them
ask_for_password \
5 \
"Encrypted ZFS password for ${ENCRYPTIONROOT}: " \
"zfs load-key '${ENCRYPTIONROOT}'"
fi
fi
# Let us tell the initrd to run on shutdown.
# We have a shutdown hook to run
# because we imported the pool.
info "ZFS: Mounting dataset ${ZFS_DATASET}..."
if mount_dataset "${ZFS_DATASET}" ; then
ROOTFS_MOUNTED=yes
return 0
fi
fi
rootok=0
# Let us tell the initrd to run on shutdown.
# We have a shutdown hook to run
# because we imported the pool.
info "ZFS: Mounting dataset ${ZFS_DATASET}..."
if ! mount_dataset "${ZFS_DATASET}"; then
rootok=0
return 1
fi
+17 -45
View File
@@ -1,7 +1,8 @@
#!/bin/sh
# shellcheck disable=SC2034,SC2154
. /lib/dracut-lib.sh
# shellcheck source=zfs-lib.sh.in
. /lib/dracut-zfs-lib.sh
# Let the command line override our host id.
spl_hostid=$(getarg spl_hostid=)
@@ -15,49 +16,20 @@ else
warn "ZFS: Pools may not import correctly."
fi
wait_for_zfs=0
case "${root}" in
""|zfs|zfs:)
# We'll take root unset, root=zfs, or root=zfs:
# No root set, so we want to read the bootfs attribute. We
# can't do that until udev settles so we'll set dummy values
# and hope for the best later on.
root="zfs:AUTO"
rootok=1
wait_for_zfs=1
if decode_root_args; then
if [ "$root" = "zfs:AUTO" ]; then
info "ZFS: Boot dataset autodetected from bootfs=."
else
info "ZFS: Boot dataset is ${root}."
fi
info "ZFS: Enabling autodetection of bootfs after udev settles."
;;
ZFS=*|zfs:*|FILESYSTEM=*)
# root is explicit ZFS root. Parse it now. We can handle
# a root=... param in any of the following formats:
# root=ZFS=rpool/ROOT
# root=zfs:rpool/ROOT
# root=zfs:FILESYSTEM=rpool/ROOT
# root=FILESYSTEM=rpool/ROOT
# root=ZFS=pool+with+space/ROOT+WITH+SPACE (translates to root=ZFS=pool with space/ROOT WITH SPACE)
# Strip down to just the pool/fs
root="${root#zfs:}"
root="${root#FILESYSTEM=}"
root="zfs:${root#ZFS=}"
# switch + with spaces because kernel cmdline does not allow us to quote parameters
root=$(echo "$root" | tr '+' ' ')
rootok=1
wait_for_zfs=1
info "ZFS: Set ${root} as bootfs."
;;
esac
# Make sure Dracut is happy that we have a root and will wait for ZFS
# modules to settle before mounting.
if [ ${wait_for_zfs} -eq 1 ]; then
ln -s /dev/null /dev/root 2>/dev/null
initqueuedir="${hookdir}/initqueue/finished"
test -d "${initqueuedir}" || {
initqueuedir="${hookdir}/initqueue-finished"
}
echo '[ -e /dev/zfs ]' > "${initqueuedir}/zfs.sh"
rootok=1
# Make sure Dracut is happy that we have a root and will wait for ZFS
# modules to settle before mounting.
if [ -n "${wait_for_zfs}" ]; then
ln -s null /dev/root
echo '[ -e /dev/zfs ]' > "${hookdir}/initqueue/finished/zfs.sh"
fi
else
info "ZFS: no ZFS-on-root."
fi
@@ -8,7 +8,7 @@ Before=zfs-import.target
[Service]
Type=oneshot
ExecStart=/bin/sh -c "exec systemctl set-environment BOOTFS=$(@sbindir@/zpool list -H -o bootfs | grep -m1 -v '^-$')"
ExecStart=/bin/sh -c "exec systemctl set-environment BOOTFS=$(@sbindir@/zpool list -H -o bootfs | grep -m1 -vFx -)"
[Install]
WantedBy=zfs-import.target
+5 -25
View File
@@ -1,5 +1,5 @@
#!/bin/sh
# shellcheck disable=SC2016,SC1004
# shellcheck disable=SC2016,SC1004,SC2154
grep -wq debug /proc/cmdline && debug=1
[ -n "$debug" ] && echo "zfs-generator: starting" >> /dev/kmsg
@@ -10,37 +10,17 @@ GENERATOR_DIR="$1"
exit 1
}
[ -f /lib/dracut-lib.sh ] && dracutlib=/lib/dracut-lib.sh
[ -f /usr/lib/dracut/modules.d/99base/dracut-lib.sh ] && dracutlib=/usr/lib/dracut/modules.d/99base/dracut-lib.sh
command -v getarg >/dev/null 2>&1 || {
[ -n "$debug" ] && echo "zfs-generator: loading Dracut library from $dracutlib" >> /dev/kmsg
. "$dracutlib"
}
# shellcheck source=zfs-lib.sh.in
. /lib/dracut-zfs-lib.sh
decode_root_args || exit 0
[ -z "$root" ] && root=$(getarg root=)
[ -z "$rootfstype" ] && rootfstype=$(getarg rootfstype=)
[ -z "$rootflags" ] && rootflags=$(getarg rootflags=)
# If root is not ZFS= or zfs: or rootfstype is not zfs
# then we are not supposed to handle it.
[ "${root##zfs:}" = "${root}" ] &&
[ "${root##ZFS=}" = "${root}" ] &&
[ "$rootfstype" != "zfs" ] &&
exit 0
[ -z "${rootflags}" ] && rootflags=$(getarg rootflags=)
case ",${rootflags}," in
*,zfsutil,*) ;;
,,) rootflags=zfsutil ;;
*) rootflags="zfsutil,${rootflags}" ;;
esac
if [ "${root}" != "zfs:AUTO" ]; then
root="${root##zfs:}"
root="${root##ZFS=}"
fi
[ -n "$debug" ] && echo "zfs-generator: writing extension for sysroot.mount to $GENERATOR_DIR/sysroot.mount.d/zfs-enhancement.conf" >> /dev/kmsg
@@ -89,7 +69,7 @@ else
_zfs_generator_cb() {
dset="${1}"
mpnt="${2}"
unit="sysroot$(echo "$mpnt" | tr '/' '-').mount"
unit="$(systemd-escape --suffix=mount -p "/sysroot${mpnt}")"
{
echo "[Unit]"
+54 -142
View File
@@ -1,74 +1,16 @@
#!/bin/sh
# shellcheck disable=SC2034
command -v getarg >/dev/null || . /lib/dracut-lib.sh
command -v getargbool >/dev/null || {
# Compatibility with older Dracut versions.
# With apologies to the Dracut developers.
getargbool() {
_default="$1"; shift
! _b=$(getarg "$@") && [ -z "$_b" ] && _b="$_default"
if [ -n "$_b" ]; then
[ "$_b" = "0" ] && return 1
[ "$_b" = "no" ] && return 1
[ "$_b" = "off" ] && return 1
fi
return 0
}
}
command -v getarg >/dev/null || . /lib/dracut-lib.sh || . /usr/lib/dracut/modules.d/99base/dracut-lib.sh
OLDIFS="${IFS}"
NEWLINE="
"
TAB=" "
ZPOOL_IMPORT_OPTS=""
if getargbool 0 zfs_force -y zfs.force -y zfsforce ; then
ZPOOL_IMPORT_OPTS=
if getargbool 0 zfs_force -y zfs.force -y zfsforce; then
warn "ZFS: Will force-import pools if necessary."
ZPOOL_IMPORT_OPTS="${ZPOOL_IMPORT_OPTS} -f"
ZPOOL_IMPORT_OPTS=-f
fi
# find_bootfs
# returns the first dataset with the bootfs attribute.
find_bootfs() {
IFS="${NEWLINE}"
for dataset in $(zpool list -H -o bootfs); do
case "${dataset}" in
"" | "-")
continue
;;
"no pools available")
IFS="${OLDIFS}"
return 1
;;
*)
IFS="${OLDIFS}"
echo "${dataset}"
return 0
;;
esac
done
IFS="${OLDIFS}"
return 1
}
# import_pool POOL
# imports the given zfs pool if it isn't imported already.
import_pool() {
pool="${1}"
if ! zpool list -H "${pool}" > /dev/null 2>&1; then
info "ZFS: Importing pool ${pool}..."
# shellcheck disable=SC2086
if ! zpool import -N ${ZPOOL_IMPORT_OPTS} "${pool}" ; then
warn "ZFS: Unable to import pool ${pool}"
return 1
fi
fi
return 0
}
_mount_dataset_cb() {
mount -o zfsutil -t zfs "${1}" "${NEWROOT}${2}"
}
@@ -121,87 +63,57 @@ for_relevant_root_children() {
)
}
# export_all OPTS
# exports all imported zfs pools.
export_all() {
ret=0
IFS="${NEWLINE}"
for pool in $(zpool list -H -o name) ; do
if zpool list -H "${pool}" > /dev/null 2>&1; then
zpool export "${pool}" "$@" || ret=$?
fi
done
IFS="${OLDIFS}"
return ${ret}
}
# ask_for_password
# Parse root=, rootfstype=, return them decoded and normalised to zfs:AUTO for auto, plain dset for explicit
#
# Wraps around plymouth ask-for-password and adds fallback to tty password ask
# if plymouth is not present.
# True if ZFS-on-root, false if we shouldn't
#
# --cmd command
# Command to execute. Required.
# --prompt prompt
# Password prompt. Note that function already adds ':' at the end.
# Recommended.
# --tries n
# How many times repeat command on its failure. Default is 3.
# --ply-[cmd|prompt|tries]
# Command/prompt/tries specific for plymouth password ask only.
# --tty-[cmd|prompt|tries]
# Command/prompt/tries specific for tty password ask only.
# --tty-echo-off
# Turn off input echo before tty command is executed and turn on after.
# It's useful when password is read from stdin.
ask_for_password() {
ply_tries=3
tty_tries=3
while [ "$#" -gt 0 ]; do
case "$1" in
--cmd) ply_cmd="$2"; tty_cmd="$2"; shift;;
--ply-cmd) ply_cmd="$2"; shift;;
--tty-cmd) tty_cmd="$2"; shift;;
--prompt) ply_prompt="$2"; tty_prompt="$2"; shift;;
--ply-prompt) ply_prompt="$2"; shift;;
--tty-prompt) tty_prompt="$2"; shift;;
--tries) ply_tries="$2"; tty_tries="$2"; shift;;
--ply-tries) ply_tries="$2"; shift;;
--tty-tries) tty_tries="$2"; shift;;
--tty-echo-off) tty_echo_off=yes;;
# Supported values:
# root=
# root=zfs
# root=zfs:
# root=zfs:AUTO
#
# root=ZFS=data/set
# root=zfs:data/set
# root=zfs:ZFS=data/set (as a side-effect; allowed but undocumented)
#
# rootfstype=zfs AND root=data/set <=> root=data/set
# rootfstype=zfs AND root= <=> root=zfs:AUTO
#
# '+'es in explicit dataset decoded to ' 's.
decode_root_args() {
if [ -n "$rootfstype" ]; then
[ "$rootfstype" = zfs ]
return
fi
root=$(getarg root=)
rootfstype=$(getarg rootfstype=)
# shellcheck disable=SC2249
case "$root" in
""|zfs|zfs:|zfs:AUTO)
root=zfs:AUTO
rootfstype=zfs
return 0
;;
ZFS=*|zfs:*)
root="${root#zfs:}"
root="${root#ZFS=}"
root=$(echo "$root" | tr '+' ' ')
rootfstype=zfs
return 0
;;
esac
if [ "$rootfstype" = "zfs" ]; then
case "$root" in
"") root=zfs:AUTO ;;
*) root=$(echo "$root" | tr '+' ' ') ;;
esac
shift
done
return 0
fi
{ flock -s 9;
# Prompt for password with plymouth, if installed and running.
if plymouth --ping 2>/dev/null; then
plymouth ask-for-password \
--prompt "$ply_prompt" --number-of-tries="$ply_tries" | \
eval "$ply_cmd"
ret=$?
else
if [ "$tty_echo_off" = yes ]; then
stty_orig="$(stty -g)"
stty -echo
fi
i=1
while [ "$i" -le "$tty_tries" ]; do
[ -n "$tty_prompt" ] && \
printf "%s [%i/%i]:" "$tty_prompt" "$i" "$tty_tries" >&2
eval "$tty_cmd" && ret=0 && break
ret=$?
i=$((i+1))
[ -n "$tty_prompt" ] && printf '\n' >&2
done
unset i
[ "$tty_echo_off" = yes ] && stty "$stty_orig"
fi
} 9>/.console_lock
[ $ret -ne 0 ] && echo "Wrong password" >&2
return $ret
return 1
}
+48 -57
View File
@@ -4,70 +4,61 @@
# only run this on systemd systems, we handle the decrypt in mount-zfs.sh in the mount hook otherwise
[ -e /bin/systemctl ] || [ -e /usr/bin/systemctl ] || return 0
# This script only gets executed on systemd systems, see mount-zfs.sh for non-systemd systems
# shellcheck source=zfs-lib.sh.in
. /lib/dracut-zfs-lib.sh
# import the libs now that we know the pool imported
[ -f /lib/dracut-lib.sh ] && dracutlib=/lib/dracut-lib.sh
[ -f /usr/lib/dracut/modules.d/99base/dracut-lib.sh ] && dracutlib=/usr/lib/dracut/modules.d/99base/dracut-lib.sh
# shellcheck source=./lib-zfs.sh.in
. "$dracutlib"
# load the kernel command line vars
[ -z "$root" ] && root="$(getarg root=)"
# If root is not ZFS= or zfs: or rootfstype is not zfs then we are not supposed to handle it.
[ "${root##zfs:}" = "${root}" ] && [ "${root##ZFS=}" = "${root}" ] && [ "$rootfstype" != "zfs" ] && exit 0
decode_root_args || return 0
# There is a race between the zpool import and the pre-mount hooks, so we wait for a pool to be imported
while [ "$(zpool list -H)" = "" ]; do
systemctl is-failed --quiet zfs-import-cache.service zfs-import-scan.service && exit 1
while ! systemctl is-active --quiet zfs-import.target; do
systemctl is-failed --quiet zfs-import-cache.service zfs-import-scan.service && return 1
sleep 0.1s
done
# run this after import as zfs-import-cache/scan service is confirmed good
# we do not overwrite the ${root} variable, but create a new one, BOOTFS, to hold the dataset
if [ "${root}" = "zfs:AUTO" ] ; then
BOOTFS="$(zpool list -H -o bootfs | awk '$1 != "-" {print; exit}')"
else
BOOTFS="${root##zfs:}"
BOOTFS="${BOOTFS##ZFS=}"
BOOTFS="$root"
if [ "$BOOTFS" = "zfs:AUTO" ]; then
BOOTFS="$(zpool get -Ho value bootfs | grep -m1 -vFx -)"
fi
# if pool encryption is active and the zfs command understands '-o encryption'
if [ "$(zpool list -H -o feature@encryption "${BOOTFS%%/*}")" = 'active' ]; then
# if the root dataset has encryption enabled
ENCRYPTIONROOT="$(zfs get -H -o value encryptionroot "${BOOTFS}")"
if ! [ "${ENCRYPTIONROOT}" = "-" ]; then
KEYSTATUS="$(zfs get -H -o value keystatus "${ENCRYPTIONROOT}")"
# continue only if the key needs to be loaded
[ "$KEYSTATUS" = "unavailable" ] || exit 0
[ "$(zpool get -Ho value feature@encryption "${BOOTFS%%/*}")" = 'active' ] || return 0
KEYLOCATION="$(zfs get -H -o value keylocation "${ENCRYPTIONROOT}")"
case "${KEYLOCATION%%://*}" in
prompt)
for _ in 1 2 3; do
systemd-ask-password --no-tty "Encrypted ZFS password for ${BOOTFS}" | zfs load-key "${ENCRYPTIONROOT}" && break
_load_key_cb() {
dataset="$1"
ENCRYPTIONROOT="$(zfs get -Ho value encryptionroot "${dataset}")"
[ "${ENCRYPTIONROOT}" = "-" ] && return 0
[ "$(zfs get -Ho value keystatus "${ENCRYPTIONROOT}")" = "unavailable" ] || return 0
KEYLOCATION="$(zfs get -Ho value keylocation "${ENCRYPTIONROOT}")"
case "${KEYLOCATION%%://*}" in
prompt)
for _ in 1 2 3; do
systemd-ask-password --no-tty "Encrypted ZFS password for ${dataset}" | zfs load-key "${ENCRYPTIONROOT}" && break
done
;;
http*)
systemctl start network-online.target
zfs load-key "${ENCRYPTIONROOT}"
;;
file)
KEYFILE="${KEYLOCATION#file://}"
[ -r "${KEYFILE}" ] || udevadm settle
[ -r "${KEYFILE}" ] || {
info "ZFS: Waiting for key ${KEYFILE} for ${ENCRYPTIONROOT}..."
for _ in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20; do
sleep 0.5s
[ -r "${KEYFILE}" ] && break
done
;;
http*)
systemctl start network-online.target
zfs load-key "${ENCRYPTIONROOT}"
;;
file)
KEYFILE="${KEYLOCATION#file://}"
[ -r "${KEYFILE}" ] || udevadm settle
[ -r "${KEYFILE}" ] || {
info "Waiting for key ${KEYFILE} for ${ENCRYPTIONROOT}..."
for _ in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20; do
sleep 0.5s
[ -r "${KEYFILE}" ] && break
done
}
[ -r "${KEYFILE}" ] || warn "Key ${KEYFILE} for ${ENCRYPTIONROOT} hasn't appeared. Trying anyway."
zfs load-key "${ENCRYPTIONROOT}"
;;
*)
zfs load-key "${ENCRYPTIONROOT}"
;;
esac
fi
fi
}
[ -r "${KEYFILE}" ] || warn "ZFS: Key ${KEYFILE} for ${ENCRYPTIONROOT} hasn't appeared. Trying anyway."
zfs load-key "${ENCRYPTIONROOT}"
;;
*)
zfs load-key "${ENCRYPTIONROOT}"
;;
esac
}
_load_key_cb "$BOOTFS"
for_relevant_root_children "$BOOTFS" _load_key_cb
+1 -1
View File
@@ -2,7 +2,7 @@
command -v getarg >/dev/null 2>&1 || . /lib/dracut-lib.sh
if zpool list 2>&1 | grep -q 'no pools available' ; then
if [ -z "$(zpool get -Ho value name)" ]; then
info "ZFS: No active pools, no need to export anything."
else
info "ZFS: There is an active pool, will export it."
@@ -1,14 +1,12 @@
[Unit]
Description=Rollback bootfs just before it is mounted
Requisite=zfs-import.target
After=zfs-import.target zfs-snapshot-bootfs.service
After=zfs-import.target dracut-pre-mount.service zfs-snapshot-bootfs.service
Before=dracut-mount.service
DefaultDependencies=no
ConditionKernelCommandLine=bootfs.rollback
[Service]
# ${BOOTFS} should have been set by zfs-env-bootfs.service
Type=oneshot
ExecStartPre=/bin/sh -c 'test -n "${BOOTFS}"'
ExecStart=/bin/sh -c '. /lib/dracut-lib.sh; SNAPNAME="$(getarg bootfs.rollback)"; exec @sbindir@/zfs rollback -Rf "${BOOTFS}@${SNAPNAME:-%v}"'
ExecStart=/bin/sh -c '. /lib/dracut-zfs-lib.sh; decode_root_args || exit; [ "$root" = "zfs:AUTO" ] && root="$BOOTFS" SNAPNAME="$(getarg bootfs.rollback)"; exec @sbindir@/zfs rollback -Rf "$root@${SNAPNAME:-%v}"'
RemainAfterExit=yes
@@ -1,14 +1,12 @@
[Unit]
Description=Snapshot bootfs just before it is mounted
Requisite=zfs-import.target
After=zfs-import.target
After=zfs-import.target dracut-pre-mount.service
Before=dracut-mount.service
DefaultDependencies=no
ConditionKernelCommandLine=bootfs.snapshot
[Service]
# ${BOOTFS} should have been set by zfs-env-bootfs.service
Type=oneshot
ExecStartPre=/bin/sh -c 'test -n "${BOOTFS}"'
ExecStart=-/bin/sh -c '. /lib/dracut-lib.sh; SNAPNAME="$(getarg bootfs.snapshot)"; exec @sbindir@/zfs snapshot "${BOOTFS}@${SNAPNAME:-%v}"'
-ExecStart=/bin/sh -c '. /lib/dracut-zfs-lib.sh; decode_root_args || exit; [ "$root" = "zfs:AUTO" ] && root="$BOOTFS" SNAPNAME="$(getarg bootfs.snapshot)"; exec @sbindir@/zfs snapshot "$root@${SNAPNAME:-%v}"'
RemainAfterExit=yes
+38 -213
View File
@@ -1,225 +1,50 @@
How to setup a zfs root filesystem using dracut
-----------------------------------------------
## Basic setup
1. Install `zfs-dracut`
2. Set `mountpoint=/` for your root dataset (for compatibility, `legacy` also works, but is not recommended for new installations):
```sh
zfs set mountpoint=/ pool/dataset
```
3. Either (a) set `bootfs=` on the pool to the dataset:
```sh
zpool set bootfs=pool/dataset pool
```
4. Or (b) append `root=zfs:pool/dataset` to your kernel cmdline.
5. Re-generate your initrd and update it in your boot bundle
1) Install the zfs-dracut package. This package adds a zfs dracut module
to the /usr/share/dracut/modules.d/ directory which allows dracut to
create an initramfs which is zfs aware.
Encrypted datasets have keys loaded automatically or prompted for.
2) Set the bootfs property for the bootable dataset in the pool. Then set
the dataset mountpoint property to '/'.
If the root dataset contains children with `mountpoint=`s of `/etc`, `/bin`, `/lib*`, or `/usr`, they're mounted too.
$ zpool set bootfs=pool/dataset pool
$ zfs set mountpoint=/ pool/dataset
For complete documentation, see `dracut.zfs(7)`.
Alternately, legacy mountpoints can be used by setting the 'root=' option
on the kernel line of your grub.conf/menu.lst configuration file. Then
set the dataset mountpoint property to 'legacy'.
## cmdline
1. `root=` | Root dataset is… |
---------------------------|----------------------------------------------------------|
*(empty)* | the first `bootfs=` after `zpool import -aN` |
`zfs:AUTO`, `zfs:`, `zfs` | *(as above, but overriding other autoselection methods)* |
`ZFS=pool/dataset` | `pool/dataset` |
`zfs:pool/dataset` | *(as above)* |
$ grub.conf/menu.lst: kernel ... root=ZFS=pool/dataset
$ zfs set mountpoint=legacy pool/dataset
All `+`es are replaced with spaces (i.e. to boot from `root pool/data set`, pass `root=zfs:root+pool/data+set`).
3) To set zfs module options put them in /etc/modprobe.d/zfs.conf file.
The complete list of zfs module options is available by running the
_modinfo zfs_ command. Commonly set options include: zfs_arc_min,
zfs_arc_max, zfs_prefetch_disable, and zfs_vdev_max_pending.
The dataset can be at any depth, including being the pool's root dataset (i.e. `root=zfs:pool`).
4) Finally, create your new initramfs by running dracut.
`rootfstype=zfs` is equivalent to `root=zfs:AUTO`, `rootfstype=zfs root=pool/dataset` is equivalent to `root=zfs:pool/dataset`.
$ dracut --force /path/to/initramfs kernel_version
2. `spl_hostid`: passed to `zgenhostid -f`, useful to override the `/etc/hostid` file baked into the initrd.
Kernel Command Line
-------------------
3. `bootfs.snapshot`, `bootfs.snapshot=snapshot-name`: enables `zfs-snapshot-bootfs.service`,
which creates a snapshot `$root_dataset@$(uname -r)` (or, in the second form, `$root_dataset@snapshot-name`)
after pool import but before the rootfs is mounted.
Failure to create the snapshot is noted, but booting continues.
The initramfs' behavior is influenced by the following kernel command line
parameters passed in from the boot loader:
4. `bootfs.rollback`, `bootfs.rollback=snapshot-name`: enables `zfs-snapshot-bootfs.service`,
which `-Rf` rolls back to `$root_dataset@$(uname -r)` (or, in the second form, `$root_dataset@snapshot-name`)
after pool import but before the rootfs is mounted.
Failure to roll back will fall down to the rescue shell.
This has obvious potential for data loss: make sure your persistent data is not below the rootfs and you don't care about any intermediate snapshots.
* `root=...`: If not set, importable pools are searched for a bootfs
attribute. If an explicitly set root is desired, you may use
`root=ZFS:pool/dataset`
5. If both `bootfs.snapshot` and `bootfs.rollback` are set, `bootfs.rollback` is ordered *after* `bootfs.snapshot`.
* `zfs_force=0`: If set to 1, the initramfs will run `zpool import -f` when
attempting to import pools if the required pool isn't automatically imported
by the zfs module. This can save you a trip to a bootcd if hostid has
changed, but is dangerous and can lead to zpool corruption, particularly in
cases where storage is on a shared fabric such as iSCSI where multiple hosts
can access storage devices concurrently. _Please understand the implications
of force-importing a pool before enabling this option!_
* `spl_hostid`: By default, the hostid used by the SPL module is read from
/etc/hostid inside the initramfs. This file is placed there from the host
system when the initramfs is built which effectively ties the ramdisk to the
host which builds it. If a different hostid is desired, one may be set in
this attribute and will override any file present in the ramdisk. The
format should be hex exactly as found in the `/etc/hostid` file, IE
`spl_hostid=0x00bab10c`.
Note that changing the hostid between boots will most likely lead to an
un-importable pool since the last importing hostid won't match. In order
to recover from this, you may use the `zfs_force` option or boot from a
different filesystem and `zpool import -f` then `zpool export` the pool
before rebooting with the new hostid.
* `bootfs.snapshot`: If listed, enables the zfs-snapshot-bootfs service on a Dracut system. The zfs-snapshot-bootfs service simply runs `zfs snapshot $BOOTFS@%v` after the pool has been imported but before the bootfs is mounted. `$BOOTFS` is substituted with the value of the bootfs setting on the pool. `%v` is substituted with the version string of the kernel currently being booted (e.g. 5.6.6-200.fc31.x86\_64). Failure to create the snapshot (e.g. because one with the same name already exists) will be logged, but will not otherwise interrupt the boot process.
It is safe to leave the bootfs.snapshot flag set persistently on your kernel command line so that a new snapshot of your bootfs will be created on every kernel update. If you leave bootfs.snapshot set persistently on your kernel command line, you may find the below script helpful for automatically removing old snapshots of the bootfs along with their associated kernel.
#!/usr/bin/sh
if [[ "$1" == "remove" ]] && grep -q "\bbootfs.snapshot\b" /proc/cmdline; then
zfs destroy $(findmnt -n -o source /)@$2 &> /dev/null
fi
exit 0
To use the above script place it in a plain text file named /etc/kernel/install.d/99-zfs-cleanup.install and mark it executable with the following command:
$ chmod +x /etc/kernel/install.d/99-zfs-cleanup.install
On Red Hat based systems, you can change the value of `installonly_limit` in /etc/dnf/dnf.conf to adjust the number of kernels and their associated snapshots that are kept.
* `bootfs.snapshot=<snapname>`: Is identical to the bootfs.snapshot parameter explained above except that the value substituted for \<snapname\> will be used when creating the snapshot instead of the version string of the kernel currently being booted.
* `bootfs.rollback`: If listed, enables the zfs-rollback-bootfs service on a Dracut system. The zfs-rollback-bootfs service simply runs `zfs rollback -Rf $BOOTFS@%v` after the pool has been imported but before the bootfs is mounted. If the rollback operation fails, the boot process will be interrupted with a Dracut rescue shell. __Use this parameter with caution. Intermediate snapshots of the bootfs will be destroyed!__ TIP: Keep your user data (e.g. /home) on separate file systems (it can be in the same pool though).
* `bootfs.rollback=<snapname>`: Is identical to the bootfs.rollback parameter explained above except that the value substituted for \<snapname\> will be used when rolling back the bootfs instead of the version string of the kernel currently being booted. If you use this form, choose a snapshot that is new enough to contain the needed kernel modules under /lib/modules or use a kernel that has all the needed modules built-in.
How it Works
============
The Dracut module consists of the following files (less Makefile's):
* `module-setup.sh`: Script run by the initramfs builder to create the
ramdisk. Contains instructions on which files are required by the modules
and z* programs. Also triggers inclusion of `/etc/hostid` and the zpool
cache. This file is not included in the initramfs.
* `90-zfs.rules`: udev rules which trigger loading of the ZFS modules at boot.
* `zfs-lib.sh`: Utility functions used by the other files.
* `parse-zfs.sh`: Run early in the initramfs boot process to parse kernel
command line and determine if ZFS is the active root filesystem.
* `mount-zfs.sh`: Run later in initramfs boot process after udev has settled
to mount the root dataset.
* `export-zfs.sh`: Run on shutdown after dracut has restored the initramfs
and pivoted to it, allowing for a clean unmount and export of the ZFS root.
`zfs-lib.sh`
------------
This file provides a few handy functions for working with ZFS. Those
functions are used by the `mount-zfs.sh` and `export-zfs.sh` files.
However, they could be used by any other file as well, as long as the file
sources `/lib/dracut-zfs-lib.sh`.
`module-setup.sh`
-----------------
This file is run by the Dracut script within the live system, not at boot
time. It's not included in the final initramfs. Functions in this script
describe which files are needed by ZFS at boot time.
Currently all the various z* and spl modules are included, a dependency is
asserted on udev-rules, and the various zfs, zpool, etc. helpers are included.
Dracut provides library functions which automatically gather the shared libs
necessary to run each of these binaries, so statically built binaries are
not required.
The zpool and zvol udev rules files are copied from where they are
installed by the ZFS build. __PACKAGERS TAKE NOTE__: If you move
`/etc/udev/rules/60-z*.rules`, you'll need to update this file to match.
Currently this file also includes `/etc/hostid` and `/etc/zfs/zpool.cache`
which means the generated ramdisk is specific to the host system which built
it. If a generic initramfs is required, it may be preferable to omit these
files and specify the `spl_hostid` from the boot loader instead.
`parse-zfs.sh`
--------------
Run during the cmdline phase of the initramfs boot process, this script
performs some basic sanity checks on kernel command line parameters to
determine if booting from ZFS is likely to be what is desired. Dracut
requires this script to adjust the `root` variable if required and to set
`rootok=1` if a mountable root filesystem is available. Unfortunately this
script must run before udev is settled and kernel modules are known to be
loaded, so accessing the zpool and zfs commands is unsafe.
If the root=ZFS... parameter is set on the command line, then it's at least
certain that ZFS is what is desired, though this script is unable to
determine if ZFS is in fact available. This script will alter the `root`
parameter to replace several historical forms of specifying the pool and
dataset name with the canonical form of `zfs:pool/dataset`.
If no root= parameter is set, the best this script can do is guess that
ZFS is desired. At present, no other known filesystems will work with no
root= parameter, though this might possibly interfere with using the
compiled-in default root in the kernel image. It's considered unlikely
that would ever be the case when an initramfs is in use, so this script
sets `root=zfs:AUTO` and hopes for the best.
Once the root=... (or lack thereof) parameter is parsed, a dummy symlink
is created from `/dev/root` -> `/dev/null` to satisfy parts of the Dracut
process which check for presence of a single root device node.
Finally, an initqueue/finished hook is registered which causes the initqueue
phase of Dracut to wait for `/dev/zfs` to become available before attempting
to mount anything.
`mount-zfs.sh`
--------------
This script is run after udev has settled and all tasks in the initqueue
have succeeded. This ensures that `/dev/zfs` is available and that the
various ZFS modules are successfully loaded. As it is now safe to call
zpool and friends, we can proceed to find the bootfs attribute if necessary.
If the root parameter was explicitly set on the command line, no parsing is
necessary. The list of imported pools is checked to see if the desired pool
is already imported. If it's not, and attempt is made to import the pool
explicitly, though no force is attempted. Finally the specified dataset
is mounted on `$NEWROOT`, first using the `-o zfsutil` option to handle
non-legacy mounts, then if that fails, without zfsutil to handle legacy
mount points.
If no root parameter was specified, this script attempts to find a pool with
its bootfs attribute set. First, already-imported pools are scanned and if
an appropriate pool is found, no additional pools are imported. If no pool
with bootfs is found, any additional pools in the system are imported with
`zpool import -N -a`, and the scan for bootfs is tried again. If no bootfs
is found with all pools imported, all pools are re-exported, and boot fails.
Assuming a bootfs is found, an attempt is made to mount it to `$NEWROOT`,
first with, then without the zfsutil option as above.
Ordinarily pools are imported _without_ the force option which may cause
boot to fail if the hostid has changed or a pool has been physically moved
between servers. The `zfs_force` kernel parameter is provided which when
set to `1` causes `zpool import` to be run with the `-f` flag. Forcing pool
import can lead to serious data corruption and loss of pools, so this option
should be used with extreme caution. Note that even with this flag set, if
the required zpool was auto-imported by the kernel module, no additional
`zpool import` commands are run, so nothing is forced.
`export-zfs.sh`
---------------
Normally the zpool containing the root dataset cannot be exported on
shutdown as it is still in use by the init process. To work around this,
Dracut is able to restore the initramfs on shutdown and pivot to it.
All remaining process are then running from a ramdisk, allowing for a
clean unmount and export of the ZFS root. The theory of operation is
described in detail in the [Dracut manual](https://www.kernel.org/pub/linux/utils/boot/dracut/dracut.html#_dracut_on_shutdown).
This script will try to export all remaining zpools after Dracut has
pivoted to the initramfs. If an initial regular export is not successful,
Dracut will call this script once more with the `final` option,
in which case a forceful export is attempted.
Other Dracut modules include similar shutdown scripts and Dracut
invokes these scripts round-robin until they succeed. In particular,
the `90dm` module installs a script which tries to close and remove
all device mapper targets. Thus, if there are ZVOLs containing
dm-crypt volumes or if the zpool itself is backed by a dm-crypt
volume, the shutdown scripts will try to untangle this.
6. `zfs_force`, `zfs.force`, `zfsforce`: add `-f` to all `zpool import` invocations.
May be useful. Use with caution.
+1 -1
View File
@@ -104,7 +104,7 @@ zfs_errno = enum_with_offset(1024, [
)
# compat before we used the enum helper for these values
ZFS_ERR_CHECKPOINT_EXISTS = zfs_errno.ZFS_ERR_CHECKPOINT_EXISTS
assert(ZFS_ERR_CHECKPOINT_EXISTS == 1024)
assert (ZFS_ERR_CHECKPOINT_EXISTS == 1024)
ZFS_ERR_DISCARDING_CHECKPOINT = zfs_errno.ZFS_ERR_DISCARDING_CHECKPOINT
ZFS_ERR_NO_CHECKPOINT = zfs_errno.ZFS_ERR_NO_CHECKPOINT
ZFS_ERR_DEVRM_IN_PROGRESS = zfs_errno.ZFS_ERR_DEVRM_IN_PROGRESS
+3 -2
View File
@@ -1,9 +1,10 @@
include $(top_srcdir)/config/Substfiles.am
include $(top_srcdir)/config/Shellcheck.am
initconf_SCRIPTS = zfs
initconf_DATA = zfs
SUBSTFILES += $(initconf_SCRIPTS)
SUBSTFILES += $(initconf_DATA)
SHELLCHECKSCRIPTS = $(initconf_DATA)
SHELLCHECK_SHELL = sh
SHELLCHECK_IGNORE = ,SC2034
@@ -27,9 +27,6 @@
#include <sys/types.h>
#include <sys/time.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <sys/mman.h>
#include <semaphore.h>
#include <stdbool.h>
#include <unistd.h>
#include <fcntl.h>
@@ -44,25 +41,16 @@
#include <errno.h>
#include <libzfs.h>
/*
* For debugging only.
*
* Free statics with trivial life-times,
* but saved line filenames are replaced with a static string.
*/
#define FREE_STATICS false
#define nitems(arr) (sizeof (arr) / sizeof (*arr))
#define STRCMP ((int(*)(const void *, const void *))&strcmp)
#define PID_T_CMP ((int(*)(const void *, const void *))&pid_t_cmp)
static int
pid_t_cmp(const pid_t *lhs, const pid_t *rhs)
{
/*
* This is always valid, quoth sys_types.h(7posix):
* > blksize_t, pid_t, and ssize_t shall be signed integer types.
*/
return (*lhs - *rhs);
}
#define EXIT_ENOMEM() \
do { \
fprintf(stderr, PROGNAME "[%d]: " \
"not enough memory (L%d)!\n", getpid(), __LINE__); \
_exit(1); \
} while (0)
#define PROGNAME "zfs-mount-generator"
@@ -80,20 +68,11 @@ pid_t_cmp(const pid_t *lhs, const pid_t *rhs)
#define URI_REGEX_S "^\\([A-Za-z][A-Za-z0-9+.\\-]*\\):\\/\\/\\(.*\\)$"
static regex_t uri_regex;
static char *argv0;
static const char *destdir = "/tmp";
static int destdir_fd = -1;
static void *known_pools = NULL; /* tsearch() of C strings */
static struct {
sem_t noauto_not_on_sem;
sem_t noauto_names_sem;
size_t noauto_names_len;
size_t noauto_names_max;
char noauto_names[][NAME_MAX];
} *noauto_files;
static void *noauto_files = NULL; /* tsearch() of C strings */
static char *
@@ -103,8 +82,12 @@ systemd_escape(const char *input, const char *prepend, const char *append)
size_t applen = strlen(append);
size_t prelen = strlen(prepend);
char *ret = malloc(4 * len + prelen + applen + 1);
if (!ret)
EXIT_ENOMEM();
if (!ret) {
fprintf(stderr, PROGNAME "[%d]: "
"out of memory to escape \"%s%s%s\"!\n",
getpid(), prepend, input, append);
return (NULL);
}
memcpy(ret, prepend, prelen);
char *out = ret + prelen;
@@ -166,8 +149,12 @@ systemd_escape_path(char *input, const char *prepend, const char *append)
{
if (strcmp(input, "/") == 0) {
char *ret;
if (asprintf(&ret, "%s-%s", prepend, append) == -1)
EXIT_ENOMEM();
if (asprintf(&ret, "%s-%s", prepend, append) == -1) {
fprintf(stderr, PROGNAME "[%d]: "
"out of memory to escape \"%s%s%s\"!\n",
getpid(), prepend, input, append);
ret = NULL;
}
return (ret);
} else {
/*
@@ -209,6 +196,10 @@ fopenat(int dirfd, const char *pathname, int flags,
static int
line_worker(char *line, const char *cachefile)
{
int ret = 0;
void *tofree_all[8];
void **tofree = tofree_all;
char *toktmp;
/* BEGIN CSTYLED */
const char *dataset = strtok_r(line, "\t", &toktmp);
@@ -240,11 +231,9 @@ line_worker(char *line, const char *cachefile)
if (p_nbmand == NULL) {
fprintf(stderr, PROGNAME "[%d]: %s: not enough tokens!\n",
getpid(), dataset);
return (1);
goto err;
}
strncpy(argv0, dataset, strlen(argv0));
/* Minimal pre-requisites to mount a ZFS dataset */
const char *after = "zfs-import.target";
const char *wants = "zfs-import.target";
@@ -280,28 +269,31 @@ line_worker(char *line, const char *cachefile)
if (strcmp(p_encroot, "-") != 0) {
char *keyloadunit =
char *keyloadunit = *(tofree++) =
systemd_escape(p_encroot, "zfs-load-key@", ".service");
if (keyloadunit == NULL)
goto err;
if (strcmp(dataset, p_encroot) == 0) {
const char *keymountdep = NULL;
bool is_prompt = false;
bool need_network = false;
regmatch_t uri_matches[3];
if (regexec(&uri_regex, p_keyloc,
sizeof (uri_matches) / sizeof (*uri_matches),
uri_matches, 0) == 0) {
nitems(uri_matches), uri_matches, 0) == 0) {
p_keyloc[uri_matches[1].rm_eo] = '\0';
p_keyloc[uri_matches[2].rm_eo] = '\0';
const char *scheme =
&p_keyloc[uri_matches[1].rm_so];
const char *path =
&p_keyloc[uri_matches[2].rm_so];
/*
* Assumes all URI keylocations need
* the mount for their path;
* http://, for example, wouldn't
* (but it'd need network-online.target et al.)
*/
keymountdep = path;
if (strcmp(scheme, "https") == 0 ||
strcmp(scheme, "http") == 0)
need_network = true;
else
keymountdep = path;
} else {
if (strcmp(p_keyloc, "prompt") != 0)
fprintf(stderr, PROGNAME "[%d]: %s: "
@@ -321,7 +313,7 @@ line_worker(char *line, const char *cachefile)
"couldn't open %s under %s: %s\n",
getpid(), dataset, keyloadunit, destdir,
strerror(errno));
return (1);
goto err;
}
fprintf(keyloadunit_f,
@@ -335,20 +327,22 @@ line_worker(char *line, const char *cachefile)
"After=%s\n",
dataset, cachefile, wants, after);
if (need_network)
fprintf(keyloadunit_f,
"Wants=network-online.target\n"
"After=network-online.target\n");
if (p_systemd_requires)
fprintf(keyloadunit_f,
"Requires=%s\n", p_systemd_requires);
if (p_systemd_requiresmountsfor || keymountdep) {
fprintf(keyloadunit_f, "RequiresMountsFor=");
if (p_systemd_requiresmountsfor)
fprintf(keyloadunit_f,
"%s ", p_systemd_requiresmountsfor);
if (keymountdep)
fprintf(keyloadunit_f,
"'%s'", keymountdep);
fprintf(keyloadunit_f, "\n");
}
if (p_systemd_requiresmountsfor)
fprintf(keyloadunit_f,
"RequiresMountsFor=%s\n",
p_systemd_requiresmountsfor);
if (keymountdep)
fprintf(keyloadunit_f,
"RequiresMountsFor='%s'\n", keymountdep);
/* BEGIN CSTYLED */
fprintf(keyloadunit_f,
@@ -393,9 +387,13 @@ line_worker(char *line, const char *cachefile)
if (after[0] == '\0')
after = keyloadunit;
else if (asprintf(&toktmp, "%s %s", after, keyloadunit) != -1)
after = toktmp;
else
EXIT_ENOMEM();
after = *(tofree++) = toktmp;
else {
fprintf(stderr, PROGNAME "[%d]: %s: "
"out of memory to generate after=\"%s %s\"!\n",
getpid(), dataset, after, keyloadunit);
goto err;
}
}
@@ -404,12 +402,12 @@ line_worker(char *line, const char *cachefile)
strcmp(p_systemd_ignore, "off") == 0) {
/* ok */
} else if (strcmp(p_systemd_ignore, "on") == 0)
return (0);
goto end;
else {
fprintf(stderr, PROGNAME "[%d]: %s: "
"invalid org.openzfs.systemd:ignore=%s\n",
getpid(), dataset, p_systemd_ignore);
return (1);
goto err;
}
/* Check for canmount */
@@ -418,21 +416,21 @@ line_worker(char *line, const char *cachefile)
} else if (strcmp(p_canmount, "noauto") == 0)
noauto = true;
else if (strcmp(p_canmount, "off") == 0)
return (0);
goto end;
else {
fprintf(stderr, PROGNAME "[%d]: %s: invalid canmount=%s\n",
getpid(), dataset, p_canmount);
return (1);
goto err;
}
/* Check for legacy and blank mountpoints */
if (strcmp(p_mountpoint, "legacy") == 0 ||
strcmp(p_mountpoint, "none") == 0)
return (0);
goto end;
else if (p_mountpoint[0] != '/') {
fprintf(stderr, PROGNAME "[%d]: %s: invalid mountpoint=%s\n",
getpid(), dataset, p_mountpoint);
return (1);
goto err;
}
/* Escape the mountpoint per systemd policy */
@@ -442,7 +440,7 @@ line_worker(char *line, const char *cachefile)
fprintf(stderr,
PROGNAME "[%d]: %s: abnormal simplified mountpoint: %s\n",
getpid(), dataset, p_mountpoint);
return (1);
goto err;
}
@@ -552,8 +550,7 @@ line_worker(char *line, const char *cachefile)
* files if we're sure they were created by us. (see 5.)
* 2. We handle files differently based on canmount.
* Units with canmount=on always have precedence over noauto.
* This is enforced by the noauto_not_on_sem semaphore,
* which is only unlocked when the last canmount=on process exits.
* This is enforced by processing these units before all others.
* It is important to use p_canmount and not noauto here,
* since we categorise by canmount while other properties,
* e.g. org.openzfs.systemd:wanted-by, also modify noauto.
@@ -561,7 +558,7 @@ line_worker(char *line, const char *cachefile)
* Additionally, we use noauto_files to track the unit file names
* (which are the systemd-escaped mountpoints) of all (exclusively)
* noauto datasets that had a file created.
* 4. If the file to be created is found in the tracking array,
* 4. If the file to be created is found in the tracking tree,
* we do NOT create it.
* 5. If a file exists for a noauto dataset,
* we check whether the file name is in the array.
@@ -571,29 +568,14 @@ line_worker(char *line, const char *cachefile)
* further noauto datasets creating a file for this path again.
*/
{
sem_t *our_sem = (strcmp(p_canmount, "on") == 0) ?
&noauto_files->noauto_names_sem :
&noauto_files->noauto_not_on_sem;
while (sem_wait(our_sem) == -1 && errno == EINTR)
;
}
struct stat stbuf;
bool already_exists = fstatat(destdir_fd, mountfile, &stbuf, 0) == 0;
bool is_known = tfind(mountfile, &noauto_files, STRCMP) != NULL;
bool is_known = false;
for (size_t i = 0; i < noauto_files->noauto_names_len; ++i) {
if (strncmp(
noauto_files->noauto_names[i], mountfile, NAME_MAX) == 0) {
is_known = true;
break;
}
}
*(tofree++) = (void *)mountfile;
if (already_exists) {
if (is_known) {
/* If it's in $noauto_files, we must be noauto too */
/* If it's in noauto_files, we must be noauto too */
/* See 5 */
errno = 0;
@@ -614,43 +596,31 @@ line_worker(char *line, const char *cachefile)
}
/* File exists: skip current dataset */
if (strcmp(p_canmount, "on") == 0)
sem_post(&noauto_files->noauto_names_sem);
return (0);
goto end;
} else {
if (is_known) {
/* See 4 */
if (strcmp(p_canmount, "on") == 0)
sem_post(&noauto_files->noauto_names_sem);
return (0);
goto end;
} else if (strcmp(p_canmount, "noauto") == 0) {
if (noauto_files->noauto_names_len ==
noauto_files->noauto_names_max)
if (tsearch(mountfile, &noauto_files, STRCMP) == NULL)
fprintf(stderr, PROGNAME "[%d]: %s: "
"noauto dataset limit (%zu) reached! "
"Not tracking %s. Please report this to "
"https://github.com/openzfs/zfs\n",
getpid(), dataset,
noauto_files->noauto_names_max, mountfile);
else {
strncpy(noauto_files->noauto_names[
noauto_files->noauto_names_len],
mountfile, NAME_MAX);
++noauto_files->noauto_names_len;
}
"out of memory for noauto datasets! "
"Not tracking %s.\n",
getpid(), dataset, mountfile);
else
/* mountfile escaped to noauto_files */
*(--tofree) = NULL;
}
}
FILE *mountfile_f = fopenat(destdir_fd, mountfile,
O_WRONLY | O_CREAT | O_TRUNC | O_CLOEXEC, "w", 0644);
if (strcmp(p_canmount, "on") == 0)
sem_post(&noauto_files->noauto_names_sem);
if (!mountfile_f) {
fprintf(stderr,
PROGNAME "[%d]: %s: couldn't open %s under %s: %s\n",
getpid(), dataset, mountfile, destdir, strerror(errno));
return (1);
goto err;
}
fprintf(mountfile_f,
@@ -699,12 +669,17 @@ line_worker(char *line, const char *cachefile)
(void) fclose(mountfile_f);
if (!requiredby && !wantedby)
return (0);
goto end;
/* Finally, create the appropriate dependencies */
char *linktgt;
if (asprintf(&linktgt, "../%s", mountfile) == -1)
EXIT_ENOMEM();
if (asprintf(&linktgt, "../%s", mountfile) == -1) {
fprintf(stderr, PROGNAME "[%d]: %s: "
"out of memory for dependents of %s!\n",
getpid(), dataset, mountfile);
goto err;
}
*(tofree++) = linktgt;
char *dependencies[][2] = {
{"wants", wantedby},
@@ -719,8 +694,14 @@ line_worker(char *line, const char *cachefile)
reqby;
reqby = strtok_r(NULL, " ", &toktmp)) {
char *depdir;
if (asprintf(&depdir, "%s.%s", reqby, (*dep)[0]) == -1)
EXIT_ENOMEM();
if (asprintf(
&depdir, "%s.%s", reqby, (*dep)[0]) == -1) {
fprintf(stderr, PROGNAME "[%d]: %s: "
"out of memory for dependent dir name "
"\"%s.%s\"!\n",
getpid(), dataset, reqby, (*dep)[0]);
continue;
}
(void) mkdirat(destdir_fd, depdir, 0755);
int depdir_fd = openat(destdir_fd, depdir,
@@ -746,7 +727,24 @@ line_worker(char *line, const char *cachefile)
}
}
return (0);
end:
if (tofree >= tofree_all + nitems(tofree_all)) {
/*
* This won't happen as-is:
* we've got 8 slots and allocate 4 things at most.
*/
fprintf(stderr,
PROGNAME "[%d]: %s: need to free %zu > %zu!\n",
getpid(), dataset, tofree - tofree_all, nitems(tofree_all));
ret = tofree - tofree_all;
}
while (tofree-- != tofree_all)
free(*tofree);
return (ret);
err:
ret = 1;
goto end;
}
@@ -780,12 +778,11 @@ main(int argc, char **argv)
if (kmfd >= 0) {
(void) dup2(kmfd, STDERR_FILENO);
(void) close(kmfd);
setlinebuf(stderr);
}
}
uint8_t debug = 0;
argv0 = argv[0];
switch (argc) {
case 1:
/* Use default */
@@ -844,33 +841,9 @@ main(int argc, char **argv)
}
}
{
/*
* We could just get a gigabyte here and Not Care,
* but if vm.overcommit_memory=2, then MAP_NORESERVE is ignored
* and we'd try (and likely fail) to rip it out of swap
*/
noauto_files = mmap(NULL, 4 * 1024 * 1024,
PROT_READ | PROT_WRITE,
MAP_SHARED | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
if (noauto_files == MAP_FAILED) {
fprintf(stderr,
PROGNAME "[%d]: couldn't allocate IPC region: %s\n",
getpid(), strerror(errno));
_exit(1);
}
sem_init(&noauto_files->noauto_not_on_sem, true, 0);
sem_init(&noauto_files->noauto_names_sem, true, 1);
noauto_files->noauto_names_len = 0;
/* Works out to 16447ish, *well* enough */
noauto_files->noauto_names_max =
(4 * 1024 * 1024 - sizeof (*noauto_files)) / NAME_MAX;
}
bool debug = false;
char *line = NULL;
size_t linelen = 0;
struct timespec time_start = {};
{
const char *dbgenv = getenv("ZFS_DEBUG");
if (dbgenv)
@@ -879,7 +852,7 @@ main(int argc, char **argv)
FILE *cmdline = fopen("/proc/cmdline", "re");
if (cmdline != NULL) {
if (getline(&line, &linelen, cmdline) >= 0)
debug = strstr(line, "debug") ? 2 : 0;
debug = strstr(line, "debug");
(void) fclose(cmdline);
}
}
@@ -888,19 +861,17 @@ main(int argc, char **argv)
dup2(STDERR_FILENO, STDOUT_FILENO);
}
size_t forked_canmount_on = 0;
size_t forked_canmount_not_on = 0;
size_t canmount_on_pids_len = 128;
pid_t *canmount_on_pids =
malloc(canmount_on_pids_len * sizeof (*canmount_on_pids));
if (canmount_on_pids == NULL)
canmount_on_pids_len = 0;
struct timespec time_start = {};
if (debug)
clock_gettime(CLOCK_MONOTONIC_RAW, &time_start);
ssize_t read;
pid_t pid;
struct line {
char *line;
const char *fname;
struct line *next;
} *lines_canmount_not_on = NULL;
int ret = 0;
struct dirent *cachent;
while ((cachent = readdir(fslist_dir)) != NULL) {
if (strcmp(cachent->d_name, ".") == 0 ||
@@ -916,129 +887,67 @@ main(int argc, char **argv)
continue;
}
const char *filename = FREE_STATICS ? "(elided)" : NULL;
ssize_t read;
while ((read = getline(&line, &linelen, cachefile)) >= 0) {
line[read - 1] = '\0'; /* newline */
switch (pid = fork()) {
case -1:
fprintf(stderr,
PROGNAME "[%d]: couldn't fork for %s: %s\n",
getpid(), line, strerror(errno));
break;
case 0: /* child */
_exit(line_worker(line, cachent->d_name));
default: { /* parent */
char *tmp;
char *dset = strtok_r(line, "\t", &tmp);
strtok_r(NULL, "\t", &tmp);
char *canmount = strtok_r(NULL, "\t", &tmp);
bool canmount_on =
canmount && strncmp(canmount, "on", 2) == 0;
char *canmount = line;
canmount += strcspn(canmount, "\t");
canmount += strspn(canmount, "\t");
canmount += strcspn(canmount, "\t");
canmount += strspn(canmount, "\t");
bool canmount_on = strncmp(canmount, "on", 2) == 0;
if (debug >= 2)
printf(PROGNAME ": forked %d, "
"canmount_on=%d, dataset=%s\n",
(int)pid, canmount_on, dset);
if (canmount_on)
ret |= line_worker(line, cachent->d_name);
else {
if (filename == NULL)
filename =
strdup(cachent->d_name) ?: "(?)";
if (canmount_on &&
forked_canmount_on ==
canmount_on_pids_len) {
size_t new_len =
(canmount_on_pids_len ?: 16) * 2;
void *new_pidlist =
realloc(canmount_on_pids,
new_len *
sizeof (*canmount_on_pids));
if (!new_pidlist) {
fprintf(stderr,
PROGNAME "[%d]: "
"out of memory! "
"Mount ordering may be "
"affected.\n", getpid());
continue;
}
canmount_on_pids = new_pidlist;
canmount_on_pids_len = new_len;
struct line *l = calloc(1, sizeof (*l));
char *nl = strdup(line);
if (l == NULL || nl == NULL) {
fprintf(stderr, PROGNAME "[%d]: "
"out of memory for \"%s\" in %s\n",
getpid(), line, cachent->d_name);
free(l);
free(nl);
continue;
}
if (canmount_on) {
canmount_on_pids[forked_canmount_on] =
pid;
++forked_canmount_on;
} else
++forked_canmount_not_on;
break;
}
l->line = nl;
l->fname = filename;
l->next = lines_canmount_not_on;
lines_canmount_not_on = l;
}
}
(void) fclose(cachefile);
fclose(cachefile);
}
free(line);
if (forked_canmount_on == 0) {
/* No canmount=on processes to finish, so don't deadlock here */
for (size_t i = 0; i < forked_canmount_not_on; ++i)
sem_post(&noauto_files->noauto_not_on_sem);
} else {
/* Likely a no-op, since we got these from a narrow fork loop */
qsort(canmount_on_pids, forked_canmount_on,
sizeof (*canmount_on_pids), PID_T_CMP);
}
while (lines_canmount_not_on) {
struct line *l = lines_canmount_not_on;
lines_canmount_not_on = l->next;
int status, ret = 0;
struct rusage usage;
size_t forked_canmount_on_max = forked_canmount_on;
while ((pid = wait4(-1, &status, 0, &usage)) != -1) {
ret |= WEXITSTATUS(status) | WTERMSIG(status);
if (forked_canmount_on != 0) {
if (bsearch(&pid, canmount_on_pids,
forked_canmount_on_max, sizeof (*canmount_on_pids),
PID_T_CMP))
--forked_canmount_on;
if (forked_canmount_on == 0) {
/*
* All canmount=on processes have finished,
* let all the lower-priority ones finish now
*/
for (size_t i = 0;
i < forked_canmount_not_on; ++i)
sem_post(
&noauto_files->noauto_not_on_sem);
}
ret |= line_worker(l->line, l->fname);
if (FREE_STATICS) {
free(l->line);
free(l);
}
if (debug >= 2)
printf(PROGNAME ": %d done, user=%llu.%06us, "
"system=%llu.%06us, maxrss=%ldB, ex=0x%x\n",
(int)pid,
(unsigned long long) usage.ru_utime.tv_sec,
(unsigned int) usage.ru_utime.tv_usec,
(unsigned long long) usage.ru_stime.tv_sec,
(unsigned int) usage.ru_stime.tv_usec,
usage.ru_maxrss * 1024, status);
}
if (debug) {
struct timespec time_end = {};
clock_gettime(CLOCK_MONOTONIC_RAW, &time_end);
struct rusage usage;
getrusage(RUSAGE_SELF, &usage);
printf(
"\n"
PROGNAME ": self : "
"user=%llu.%06us, system=%llu.%06us, maxrss=%ldB\n",
(unsigned long long) usage.ru_utime.tv_sec,
(unsigned int) usage.ru_utime.tv_usec,
(unsigned long long) usage.ru_stime.tv_sec,
(unsigned int) usage.ru_stime.tv_usec,
usage.ru_maxrss * 1024);
getrusage(RUSAGE_CHILDREN, &usage);
printf(PROGNAME ": children: "
PROGNAME ": "
"user=%llu.%06us, system=%llu.%06us, maxrss=%ldB\n",
(unsigned long long) usage.ru_utime.tv_sec,
(unsigned int) usage.ru_utime.tv_usec,
@@ -1068,7 +977,7 @@ main(int argc, char **argv)
time_init.tv_nsec / 1000000000;
time_init.tv_nsec %= 1000000000;
printf(PROGNAME ": wall : "
printf(PROGNAME ": "
"total=%llu.%09llus = "
"init=%llu.%09llus + real=%llu.%09llus\n",
(unsigned long long) time_init.tv_sec,
@@ -1077,7 +986,15 @@ main(int argc, char **argv)
(unsigned long long) time_start.tv_nsec,
(unsigned long long) time_end.tv_sec,
(unsigned long long) time_end.tv_nsec);
fflush(stdout);
}
if (FREE_STATICS) {
closedir(fslist_dir);
tdestroy(noauto_files, free);
tdestroy(known_pools, free);
regfree(&uri_regex);
}
_exit(ret);
}
+3 -2
View File
@@ -10,9 +10,10 @@ dist_pkgsysconf_DATA = \
vdev_id.conf.multipath.example \
vdev_id.conf.scsi.example
pkgsysconf_SCRIPTS = \
pkgsysconf_DATA = \
zfs-functions
SUBSTFILES += $(pkgsysconf_SCRIPTS)
SUBSTFILES += $(pkgsysconf_DATA)
SHELLCHECKSCRIPTS = $(pkgsysconf_DATA)
SHELLCHECK_SHELL = dash # local variables
+4 -3
View File
@@ -795,9 +795,10 @@ extern int zfs_receive(libzfs_handle_t *, const char *, nvlist_t *,
recvflags_t *, int, avl_tree_t *);
typedef enum diff_flags {
ZFS_DIFF_PARSEABLE = 0x1,
ZFS_DIFF_TIMESTAMP = 0x2,
ZFS_DIFF_CLASSIFY = 0x4
ZFS_DIFF_PARSEABLE = 1 << 0,
ZFS_DIFF_TIMESTAMP = 1 << 1,
ZFS_DIFF_CLASSIFY = 1 << 2,
ZFS_DIFF_NO_MANGLE = 1 << 3
} diff_flags_t;
extern int zfs_show_diffs(zfs_handle_t *, int, const char *, const char *,
+1
View File
@@ -234,6 +234,7 @@ typedef struct differ_info {
boolean_t scripted;
boolean_t classify;
boolean_t timestamped;
boolean_t no_mangle;
uint64_t shares;
int zerr;
int cleanupfd;
+1 -1
View File
@@ -52,7 +52,7 @@
#define ZFS_MODULE_PARAM_CALL_IMPL(parent, name, perm, args, desc) \
SYSCTL_DECL(parent); \
SYSCTL_PROC(parent, OID_AUTO, name, perm | args, desc)
SYSCTL_PROC(parent, OID_AUTO, name, CTLFLAG_MPSAFE | perm | args, desc)
#define ZFS_MODULE_PARAM_CALL(scope_prefix, name_prefix, name, func, _, perm, desc) \
ZFS_MODULE_PARAM_CALL_IMPL(_vfs_ ## scope_prefix, name, perm, func ## _args(name_prefix ## name), desc)
+18 -14
View File
@@ -123,25 +123,29 @@ extern minor_t zfsdev_minor_alloc(void);
#define zn_rlimit_fsize(zp, uio) \
vn_rlimit_fsize(ZTOV(zp), GET_UIO_STRUCT(uio), zfs_uio_td(uio))
#define ZFS_ENTER_ERROR(zfsvfs, error) do { \
ZFS_TEARDOWN_ENTER_READ((zfsvfs), FTAG); \
if (__predict_false((zfsvfs)->z_unmounted)) { \
ZFS_TEARDOWN_EXIT_READ(zfsvfs, FTAG); \
return (error); \
} \
} while (0)
/* Called on entry to each ZFS vnode and vfs operation */
#define ZFS_ENTER(zfsvfs) \
{ \
ZFS_TEARDOWN_ENTER_READ((zfsvfs), FTAG); \
if (__predict_false((zfsvfs)->z_unmounted)) { \
ZFS_TEARDOWN_EXIT_READ(zfsvfs, FTAG); \
return (EIO); \
} \
}
#define ZFS_ENTER(zfsvfs) ZFS_ENTER_ERROR(zfsvfs, EIO)
/* Must be called before exiting the vop */
#define ZFS_EXIT(zfsvfs) ZFS_TEARDOWN_EXIT_READ(zfsvfs, FTAG)
#define ZFS_EXIT(zfsvfs) ZFS_TEARDOWN_EXIT_READ(zfsvfs, FTAG)
#define ZFS_VERIFY_ZP_ERROR(zp, error) do { \
if (__predict_false((zp)->z_sa_hdl == NULL)) { \
ZFS_EXIT((zp)->z_zfsvfs); \
return (error); \
} \
} while (0)
/* Verifies the znode is valid */
#define ZFS_VERIFY_ZP(zp) \
if (__predict_false((zp)->z_sa_hdl == NULL)) { \
ZFS_EXIT((zp)->z_zfsvfs); \
return (EIO); \
} \
#define ZFS_VERIFY_ZP(zp) ZFS_VERIFY_ZP_ERROR(zp, EIO)
/*
* Macros for dealing with dmu_buf_hold
+37 -8
View File
@@ -495,21 +495,45 @@ blk_queue_discard_granularity(struct request_queue *q, unsigned int dg)
}
/*
* 5.19 API,
* bdev_max_discard_sectors()
*
* 2.6.32 API,
* blk_queue_discard()
*/
static inline boolean_t
bdev_discard_supported(struct block_device *bdev)
{
#if defined(HAVE_BDEV_MAX_DISCARD_SECTORS)
return (!!bdev_max_discard_sectors(bdev));
#elif defined(HAVE_BLK_QUEUE_DISCARD)
return (!!blk_queue_discard(bdev_get_queue(bdev)));
#else
#error "Unsupported kernel"
#endif
}
/*
* 5.19 API,
* bdev_max_secure_erase_sectors()
*
* 4.8 API,
* blk_queue_secure_erase()
*
* 2.6.36 - 4.7 API,
* blk_queue_secdiscard()
*/
static inline int
blk_queue_discard_secure(struct request_queue *q)
static inline boolean_t
bdev_secure_discard_supported(struct block_device *bdev)
{
#if defined(HAVE_BLK_QUEUE_SECURE_ERASE)
return (blk_queue_secure_erase(q));
#if defined(HAVE_BDEV_MAX_SECURE_ERASE_SECTORS)
return (!!bdev_max_secure_erase_sectors(bdev));
#elif defined(HAVE_BLK_QUEUE_SECURE_ERASE)
return (!!blk_queue_secure_erase(bdev_get_queue(bdev)));
#elif defined(HAVE_BLK_QUEUE_SECDISCARD)
return (blk_queue_secdiscard(q));
return (!!blk_queue_secdiscard(bdev_get_queue(bdev)));
#else
return (0);
#error "Unsupported kernel"
#endif
}
@@ -527,7 +551,10 @@ blk_generic_start_io_acct(struct request_queue *q __attribute__((unused)),
struct gendisk *disk __attribute__((unused)),
int rw __attribute__((unused)), struct bio *bio)
{
#if defined(HAVE_DISK_IO_ACCT)
#if defined(HAVE_BDEV_IO_ACCT)
return (bdev_start_io_acct(bio->bi_bdev, bio_sectors(bio),
bio_op(bio), jiffies));
#elif defined(HAVE_DISK_IO_ACCT)
return (disk_start_io_acct(disk, bio_sectors(bio), bio_op(bio)));
#elif defined(HAVE_BIO_IO_ACCT)
return (bio_start_io_acct(bio));
@@ -550,7 +577,9 @@ blk_generic_end_io_acct(struct request_queue *q __attribute__((unused)),
struct gendisk *disk __attribute__((unused)),
int rw __attribute__((unused)), struct bio *bio, unsigned long start_time)
{
#if defined(HAVE_DISK_IO_ACCT)
#if defined(HAVE_BDEV_IO_ACCT)
bdev_end_io_acct(bio->bi_bdev, bio_op(bio), start_time);
#elif defined(HAVE_DISK_IO_ACCT)
disk_end_io_acct(disk, bio_op(bio), start_time);
#elif defined(HAVE_BIO_IO_ACCT)
bio_end_io_acct(bio, start_time);
+2
View File
@@ -87,7 +87,9 @@
#if defined(HAVE_KERNEL_FPU_API_HEADER)
#include <asm/fpu/api.h>
#if defined(HAVE_KERNEL_FPU_INTERNAL_HEADER)
#include <asm/fpu/internal.h>
#endif
#if defined(HAVE_KERNEL_FPU_XCR_HEADER)
#include <asm/fpu/xcr.h>
#endif
@@ -115,6 +115,20 @@ fn(struct dentry *dentry, const char *name, void *buffer, size_t size, \
{ \
return (__ ## fn(dentry->d_inode, name, buffer, size)); \
}
/*
* Android API change,
* The xattr_handler->get() callback was changed to take a dentry and inode
* and flags, because the dentry might not be attached to an inode yet.
*/
#elif defined(HAVE_XATTR_GET_DENTRY_INODE_FLAGS)
#define ZPL_XATTR_GET_WRAPPER(fn) \
static int \
fn(const struct xattr_handler *handler, struct dentry *dentry, \
struct inode *inode, const char *name, void *buffer, \
size_t size, int flags) \
{ \
return (__ ## fn(inode, name, buffer, size)); \
}
#else
#error "Unsupported kernel"
#endif
+4
View File
@@ -64,7 +64,11 @@
* }
*/
#ifdef HAVE_REGISTER_SHRINKER_VARARG
#define spl_register_shrinker(x) register_shrinker(x, "zfs-arc-shrinker")
#else
#define spl_register_shrinker(x) register_shrinker(x)
#endif
#define spl_unregister_shrinker(x) unregister_shrinker(x)
/*
+1
View File
@@ -91,6 +91,7 @@ abd_t *abd_alloc_linear(size_t, boolean_t);
abd_t *abd_alloc_gang(void);
abd_t *abd_alloc_for_io(size_t, boolean_t);
abd_t *abd_alloc_sametype(abd_t *, size_t);
boolean_t abd_size_alloc_linear(size_t);
void abd_gang_add(abd_t *, abd_t *, boolean_t);
void abd_free(abd_t *);
abd_t *abd_get_offset(abd_t *, size_t);
-1
View File
@@ -68,7 +68,6 @@ abd_t *abd_get_offset_scatter(abd_t *, abd_t *, size_t, size_t);
void abd_free_struct_impl(abd_t *);
void abd_alloc_chunks(abd_t *, size_t);
void abd_free_chunks(abd_t *);
boolean_t abd_size_alloc_linear(size_t);
void abd_update_scatter_stats(abd_t *, abd_stats_op_t);
void abd_update_linear_stats(abd_t *, abd_stats_op_t);
void abd_verify_scatter(abd_t *);
+1
View File
@@ -85,6 +85,7 @@ typedef void arc_prune_func_t(int64_t bytes, void *priv);
/* Shared module parameters */
extern int zfs_arc_average_blocksize;
extern int l2arc_exclude_special;
/* generic arc_done_func_t's which you can use */
arc_read_done_func_t arc_bcopy_func;
+7 -7
View File
@@ -30,22 +30,22 @@ typedef struct bqueue {
kmutex_t bq_lock;
kcondvar_t bq_add_cv;
kcondvar_t bq_pop_cv;
uint64_t bq_size;
uint64_t bq_maxsize;
uint64_t bq_fill_fraction;
size_t bq_size;
size_t bq_maxsize;
uint_t bq_fill_fraction;
size_t bq_node_offset;
} bqueue_t;
typedef struct bqueue_node {
list_node_t bqn_node;
uint64_t bqn_size;
size_t bqn_size;
} bqueue_node_t;
int bqueue_init(bqueue_t *, uint64_t, uint64_t, size_t);
int bqueue_init(bqueue_t *, uint_t, size_t, size_t);
void bqueue_destroy(bqueue_t *);
void bqueue_enqueue(bqueue_t *, void *, uint64_t);
void bqueue_enqueue_flush(bqueue_t *, void *, uint64_t);
void bqueue_enqueue(bqueue_t *, void *, size_t);
void bqueue_enqueue_flush(bqueue_t *, void *, size_t);
void *bqueue_dequeue(bqueue_t *);
boolean_t bqueue_empty(bqueue_t *);
+10 -2
View File
@@ -72,7 +72,11 @@ extern kmem_cache_t *zfs_btree_leaf_cache;
typedef struct zfs_btree_hdr {
struct zfs_btree_core *bth_parent;
boolean_t bth_core;
/*
* Set to -1 to indicate core nodes. Other values represent first
* valid element offset for leaf nodes.
*/
uint32_t bth_first;
/*
* For both leaf and core nodes, represents the number of elements in
* the node. For core nodes, they will have bth_count + 1 children.
@@ -91,9 +95,12 @@ typedef struct zfs_btree_leaf {
uint8_t btl_elems[];
} zfs_btree_leaf_t;
#define BTREE_LEAF_ESIZE (BTREE_LEAF_SIZE - \
offsetof(zfs_btree_leaf_t, btl_elems))
typedef struct zfs_btree_index {
zfs_btree_hdr_t *bti_node;
uint64_t bti_offset;
uint32_t bti_offset;
/*
* True if the location is before the list offset, false if it's at
* the listed offset.
@@ -105,6 +112,7 @@ typedef struct btree {
zfs_btree_hdr_t *bt_root;
int64_t bt_height;
size_t bt_elem_size;
uint32_t bt_leaf_cap;
uint64_t bt_num_elems;
uint64_t bt_num_nodes;
zfs_btree_leaf_t *bt_bulk; // non-null if bulk loading
-3
View File
@@ -32,9 +32,6 @@ int aes_mod_fini(void);
int edonr_mod_init(void);
int edonr_mod_fini(void);
int sha1_mod_init(void);
int sha1_mod_fini(void);
int sha2_mod_init(void);
int sha2_mod_fini(void);
+2 -11
View File
@@ -330,7 +330,7 @@ typedef struct dbuf_hash_table {
kmutex_t hash_mutexes[DBUF_MUTEXES] ____cacheline_aligned;
} dbuf_hash_table_t;
typedef void (*dbuf_prefetch_fn)(void *, boolean_t);
typedef void (*dbuf_prefetch_fn)(void *, uint64_t, uint64_t, boolean_t);
uint64_t dbuf_whichblock(const struct dnode *di, const int64_t level,
const uint64_t offset);
@@ -442,16 +442,7 @@ dbuf_find_dirty_eq(dmu_buf_impl_t *db, uint64_t txg)
(dbuf_is_metadata(_db) && \
((_db)->db_objset->os_primary_cache == ZFS_CACHE_METADATA)))
#define DBUF_IS_L2CACHEABLE(_db) \
((_db)->db_objset->os_secondary_cache == ZFS_CACHE_ALL || \
(dbuf_is_metadata(_db) && \
((_db)->db_objset->os_secondary_cache == ZFS_CACHE_METADATA)))
#define DNODE_LEVEL_IS_L2CACHEABLE(_dn, _level) \
((_dn)->dn_objset->os_secondary_cache == ZFS_CACHE_ALL || \
(((_level) > 0 || \
DMU_OT_IS_METADATA((_dn)->dn_handle->dnh_dnode->dn_type)) && \
((_dn)->dn_objset->os_secondary_cache == ZFS_CACHE_METADATA)))
boolean_t dbuf_is_l2cacheable(dmu_buf_impl_t *db);
#ifdef ZFS_DEBUG
+4 -2
View File
@@ -136,7 +136,7 @@ typedef enum dmu_object_byteswap {
#endif
#define DMU_OT_IS_METADATA(ot) (((ot) & DMU_OT_NEWTYPE) ? \
((ot) & DMU_OT_METADATA) : \
(((ot) & DMU_OT_METADATA) != 0) : \
DMU_OT_IS_METADATA_IMPL(ot))
#define DMU_OT_IS_DDT(ot) \
@@ -147,7 +147,7 @@ typedef enum dmu_object_byteswap {
((ot) == DMU_OT_PLAIN_FILE_CONTENTS || (ot) == DMU_OT_UINT64_OTHER)
#define DMU_OT_IS_ENCRYPTED(ot) (((ot) & DMU_OT_NEWTYPE) ? \
((ot) & DMU_OT_ENCRYPTED) : \
(((ot) & DMU_OT_ENCRYPTED) != 0) : \
DMU_OT_IS_ENCRYPTED_IMPL(ot))
/*
@@ -1067,6 +1067,8 @@ int dmu_diff(const char *tosnap_name, const char *fromsnap_name,
#define ZFS_CRC64_POLY 0xC96C5795D7870F42ULL /* ECMA-182, reflected form */
extern uint64_t zfs_crc64_table[256];
extern int dmu_prefetch_max;
#ifdef __cplusplus
}
#endif
-4
View File
@@ -200,10 +200,6 @@ struct objset {
#define DMU_GROUPUSED_DNODE(os) ((os)->os_groupused_dnode.dnh_dnode)
#define DMU_PROJECTUSED_DNODE(os) ((os)->os_projectused_dnode.dnh_dnode)
#define DMU_OS_IS_L2CACHEABLE(os) \
((os)->os_secondary_cache == ZFS_CACHE_ALL || \
(os)->os_secondary_cache == ZFS_CACHE_METADATA)
/* called from zpl */
int dmu_objset_hold(const char *name, void *tag, objset_t **osp);
int dmu_objset_hold_flags(const char *name, boolean_t decrypt, void *tag,
+1
View File
@@ -125,6 +125,7 @@ typedef struct dmu_tx_stats {
kstat_named_t dmu_tx_dirty_delay;
kstat_named_t dmu_tx_dirty_over_max;
kstat_named_t dmu_tx_dirty_frees_delay;
kstat_named_t dmu_tx_wrlog_delay;
kstat_named_t dmu_tx_quota;
} dmu_tx_stats_t;
+7 -9
View File
@@ -49,20 +49,18 @@ typedef struct zfetch {
typedef struct zstream {
uint64_t zs_blkid; /* expect next access at this blkid */
uint64_t zs_pf_blkid1; /* first block to prefetch */
uint64_t zs_pf_blkid; /* block to prefetch up to */
/*
* We will next prefetch the L1 indirect block of this level-0
* block id.
*/
uint64_t zs_ipf_blkid1; /* first block to prefetch */
uint64_t zs_ipf_blkid; /* block to prefetch up to */
unsigned int zs_pf_dist; /* data prefetch distance in bytes */
unsigned int zs_ipf_dist; /* L1 prefetch distance in bytes */
uint64_t zs_pf_start; /* first data block to prefetch */
uint64_t zs_pf_end; /* data block to prefetch up to */
uint64_t zs_ipf_start; /* first data block to prefetch L1 */
uint64_t zs_ipf_end; /* data block to prefetch L1 up to */
list_node_t zs_node; /* link for zf_stream */
hrtime_t zs_atime; /* time last prefetch issued */
zfetch_t *zs_fetch; /* parent fetch */
boolean_t zs_missed; /* stream saw cache misses */
boolean_t zs_more; /* need more distant prefetch */
zfs_refcount_t zs_callers; /* number of pending callers */
/*
* Number of stream references: dnode, callers and pending blocks.
+1 -1
View File
@@ -616,7 +616,7 @@ _NOTE(CONSTCOND) } while (0)
#else
#define dprintf_dnode(db, fmt, ...)
#define DNODE_VERIFY(dn)
#define DNODE_VERIFY(dn) ((void) sizeof ((uintptr_t)(dn)))
#define FREE_VERIFY(db, start, end, tx)
#endif
+7 -1
View File
@@ -40,6 +40,7 @@
#include <sys/rrwlock.h>
#include <sys/dsl_synctask.h>
#include <sys/mmp.h>
#include <sys/aggsum.h>
#ifdef __cplusplus
extern "C" {
@@ -58,6 +59,7 @@ struct dsl_deadlist;
extern unsigned long zfs_dirty_data_max;
extern unsigned long zfs_dirty_data_max_max;
extern unsigned long zfs_wrlog_data_max;
extern int zfs_dirty_data_sync_percent;
extern int zfs_dirty_data_max_percent;
extern int zfs_dirty_data_max_max_percent;
@@ -82,7 +84,6 @@ typedef struct zfs_blkstat {
typedef struct zfs_all_blkstats {
zfs_blkstat_t zab_type[DN_MAX_LEVELS + 1][DMU_OT_TOTAL + 1];
kmutex_t zab_lock;
} zfs_all_blkstats_t;
@@ -119,6 +120,9 @@ typedef struct dsl_pool {
uint64_t dp_mos_compressed_delta;
uint64_t dp_mos_uncompressed_delta;
aggsum_t dp_wrlog_pertxg[TXG_SIZE];
aggsum_t dp_wrlog_total;
/*
* Time of most recently scheduled (furthest in the future)
* wakeup for delayed transactions.
@@ -159,6 +163,8 @@ uint64_t dsl_pool_adjustedsize(dsl_pool_t *dp, zfs_space_check_t slop_policy);
uint64_t dsl_pool_unreserved_space(dsl_pool_t *dp,
zfs_space_check_t slop_policy);
uint64_t dsl_pool_deferred_space(dsl_pool_t *dp);
void dsl_pool_wrlog_count(dsl_pool_t *dp, int64_t size, uint64_t txg);
boolean_t dsl_pool_need_wrlog_delay(dsl_pool_t *dp);
void dsl_pool_dirty_space(dsl_pool_t *dp, int64_t space, dmu_tx_t *tx);
void dsl_pool_undirty_space(dsl_pool_t *dp, int64_t space, uint64_t txg);
void dsl_free(dsl_pool_t *dp, uint64_t txg, const blkptr_t *bpp);
+1 -1
View File
@@ -155,7 +155,7 @@ typedef struct dsl_scan {
dsl_scan_phys_t scn_phys; /* on disk representation of scan */
dsl_scan_phys_t scn_phys_cached;
avl_tree_t scn_queue; /* queue of datasets to scan */
uint64_t scn_bytes_pending; /* outstanding data to issue */
uint64_t scn_queues_pending; /* outstanding data to issue */
} dsl_scan_t;
typedef struct dsl_scan_io_queue dsl_scan_io_queue_t;
+43
View File
@@ -757,6 +757,7 @@ typedef struct zpool_load_policy {
/* Rewind data discovered */
#define ZPOOL_CONFIG_LOAD_TIME "rewind_txg_ts"
#define ZPOOL_CONFIG_LOAD_META_ERRORS "verify_meta_errors"
#define ZPOOL_CONFIG_LOAD_DATA_ERRORS "verify_data_errors"
#define ZPOOL_CONFIG_REWIND_TIME "seconds_of_rewind"
@@ -1101,6 +1102,7 @@ typedef struct vdev_stat {
uint64_t vs_configured_ashift; /* TLV vdev_ashift */
uint64_t vs_logical_ashift; /* vdev_logical_ashift */
uint64_t vs_physical_ashift; /* vdev_physical_ashift */
uint64_t vs_pspace; /* physical capacity */
} vdev_stat_t;
/* BEGIN CSTYLED */
@@ -1613,6 +1615,47 @@ typedef enum {
#define ZFS_EV_HIST_DSID "history_dsid"
#define ZFS_EV_RESILVER_TYPE "resilver_type"
/*
* We currently support block sizes from 512 bytes to 16MB.
* The benefits of larger blocks, and thus larger IO, need to be weighed
* against the cost of COWing a giant block to modify one byte, and the
* large latency of reading or writing a large block.
*
* The recordsize property can not be set larger than zfs_max_recordsize
* (default 16MB on 64-bit and 1MB on 32-bit). See the comment near
* zfs_max_recordsize in dsl_dataset.c for details.
*
* Note that although the LSIZE field of the blkptr_t can store sizes up
* to 32MB, the dnode's dn_datablkszsec can only store sizes up to
* 32MB - 512 bytes. Therefore, we limit SPA_MAXBLOCKSIZE to 16MB.
*/
#define SPA_MINBLOCKSHIFT 9
#define SPA_OLD_MAXBLOCKSHIFT 17
#define SPA_MAXBLOCKSHIFT 24
#define SPA_MINBLOCKSIZE (1ULL << SPA_MINBLOCKSHIFT)
#define SPA_OLD_MAXBLOCKSIZE (1ULL << SPA_OLD_MAXBLOCKSHIFT)
#define SPA_MAXBLOCKSIZE (1ULL << SPA_MAXBLOCKSHIFT)
/* supported encryption algorithms */
enum zio_encrypt {
ZIO_CRYPT_INHERIT = 0,
ZIO_CRYPT_ON,
ZIO_CRYPT_OFF,
ZIO_CRYPT_AES_128_CCM,
ZIO_CRYPT_AES_192_CCM,
ZIO_CRYPT_AES_256_CCM,
ZIO_CRYPT_AES_128_GCM,
ZIO_CRYPT_AES_192_GCM,
ZIO_CRYPT_AES_256_GCM,
ZIO_CRYPT_FUNCTIONS
};
#define ZIO_CRYPT_ON_VALUE ZIO_CRYPT_AES_256_GCM
#define ZIO_CRYPT_DEFAULT ZIO_CRYPT_OFF
#ifdef __cplusplus
}
#endif
+3
View File
@@ -49,11 +49,14 @@ int metaslab_init(metaslab_group_t *, uint64_t, uint64_t, uint64_t,
metaslab_t **);
void metaslab_fini(metaslab_t *);
void metaslab_set_unflushed_dirty(metaslab_t *, boolean_t);
void metaslab_set_unflushed_txg(metaslab_t *, uint64_t, dmu_tx_t *);
void metaslab_set_estimated_condensed_size(metaslab_t *, uint64_t, dmu_tx_t *);
boolean_t metaslab_unflushed_dirty(metaslab_t *);
uint64_t metaslab_unflushed_txg(metaslab_t *);
uint64_t metaslab_estimated_condensed_size(metaslab_t *);
int metaslab_sort_by_flushed(const void *, const void *);
void metaslab_unflushed_bump(metaslab_t *, dmu_tx_t *, boolean_t);
uint64_t metaslab_unflushed_changes_memused(metaslab_t *);
int metaslab_load(metaslab_t *);
+1
View File
@@ -553,6 +553,7 @@ struct metaslab {
* log space maps.
*/
uint64_t ms_unflushed_txg;
boolean_t ms_unflushed_dirty;
/* updated every time we are done syncing the metaslab's space map */
uint64_t ms_synced_length;
+5 -16
View File
@@ -63,12 +63,8 @@ typedef struct range_tree {
*/
uint8_t rt_shift;
uint64_t rt_start;
range_tree_ops_t *rt_ops;
/* rt_btree_compare should only be set if rt_arg is a b-tree */
const range_tree_ops_t *rt_ops;
void *rt_arg;
int (*rt_btree_compare)(const void *, const void *);
uint64_t rt_gap; /* allowable inter-segment gap */
/*
@@ -278,11 +274,11 @@ rs_set_fill(range_seg_t *rs, range_tree_t *rt, uint64_t fill)
typedef void range_tree_func_t(void *arg, uint64_t start, uint64_t size);
range_tree_t *range_tree_create_impl(range_tree_ops_t *ops,
range_tree_t *range_tree_create_gap(const range_tree_ops_t *ops,
range_seg_type_t type, void *arg, uint64_t start, uint64_t shift,
int (*zfs_btree_compare) (const void *, const void *), uint64_t gap);
range_tree_t *range_tree_create(range_tree_ops_t *ops, range_seg_type_t type,
void *arg, uint64_t start, uint64_t shift);
uint64_t gap);
range_tree_t *range_tree_create(const range_tree_ops_t *ops,
range_seg_type_t type, void *arg, uint64_t start, uint64_t shift);
void range_tree_destroy(range_tree_t *rt);
boolean_t range_tree_contains(range_tree_t *rt, uint64_t start, uint64_t size);
range_seg_t *range_tree_find(range_tree_t *rt, uint64_t start, uint64_t size);
@@ -316,13 +312,6 @@ void range_tree_remove_xor_add_segment(uint64_t start, uint64_t end,
void range_tree_remove_xor_add(range_tree_t *rt, range_tree_t *removefrom,
range_tree_t *addto);
void rt_btree_create(range_tree_t *rt, void *arg);
void rt_btree_destroy(range_tree_t *rt, void *arg);
void rt_btree_add(range_tree_t *rt, range_seg_t *rs, void *arg);
void rt_btree_remove(range_tree_t *rt, range_seg_t *rs, void *arg);
void rt_btree_vacate(range_tree_t *rt, void *arg);
extern range_tree_ops_t rt_btree_ops;
#ifdef __cplusplus
}
#endif
-21
View File
@@ -72,27 +72,6 @@ struct dsl_pool;
struct dsl_dataset;
struct dsl_crypto_params;
/*
* We currently support block sizes from 512 bytes to 16MB.
* The benefits of larger blocks, and thus larger IO, need to be weighed
* against the cost of COWing a giant block to modify one byte, and the
* large latency of reading or writing a large block.
*
* Note that although blocks up to 16MB are supported, the recordsize
* property can not be set larger than zfs_max_recordsize (default 1MB).
* See the comment near zfs_max_recordsize in dsl_dataset.c for details.
*
* Note that although the LSIZE field of the blkptr_t can store sizes up
* to 32MB, the dnode's dn_datablkszsec can only store sizes up to
* 32MB - 512 bytes. Therefore, we limit SPA_MAXBLOCKSIZE to 16MB.
*/
#define SPA_MINBLOCKSHIFT 9
#define SPA_OLD_MAXBLOCKSHIFT 17
#define SPA_MAXBLOCKSHIFT 24
#define SPA_MINBLOCKSIZE (1ULL << SPA_MINBLOCKSHIFT)
#define SPA_OLD_MAXBLOCKSIZE (1ULL << SPA_OLD_MAXBLOCKSHIFT)
#define SPA_MAXBLOCKSIZE (1ULL << SPA_MAXBLOCKSHIFT)
/*
* Alignment Shift (ashift) is an immutable, internal top-level vdev property
* which can only be set at vdev creation time. Physical writes are always done
+3 -2
View File
@@ -146,9 +146,9 @@ typedef struct spa_config_lock {
kmutex_t scl_lock;
kthread_t *scl_writer;
int scl_write_wanted;
int scl_count;
kcondvar_t scl_cv;
zfs_refcount_t scl_count;
} spa_config_lock_t;
} ____cacheline_aligned spa_config_lock_t;
typedef struct spa_config_dirent {
list_node_t scd_link;
@@ -370,6 +370,7 @@ struct spa {
boolean_t spa_is_root; /* pool is root */
int spa_minref; /* num refs when first opened */
spa_mode_t spa_mode; /* SPA_MODE_{READ|WRITE} */
boolean_t spa_read_spacemaps; /* spacemaps available if ro */
spa_log_state_t spa_log_state; /* log state */
uint64_t spa_autoexpand; /* lun expansion on/off */
ddt_t *spa_ddt[ZIO_CHECKSUM_FUNCTIONS]; /* in-core DDTs */
+7 -2
View File
@@ -30,7 +30,10 @@
typedef struct log_summary_entry {
uint64_t lse_start; /* start TXG */
uint64_t lse_end; /* last TXG */
uint64_t lse_txgcount; /* # of TXGs */
uint64_t lse_mscount; /* # of metaslabs needed to be flushed */
uint64_t lse_msdcount; /* # of dirty metaslabs needed to be flushed */
uint64_t lse_blkcount; /* blocks held by this entry */
list_node_t lse_node;
} log_summary_entry_t;
@@ -50,6 +53,7 @@ typedef struct spa_log_sm {
uint64_t sls_nblocks; /* number of blocks in this log */
uint64_t sls_mscount; /* # of metaslabs flushed in the log's txg */
avl_node_t sls_node; /* node in spa_sm_logs_by_txg */
space_map_t *sls_sm; /* space map pointer, if open */
} spa_log_sm_t;
int spa_ld_log_spacemaps(spa_t *);
@@ -68,8 +72,9 @@ uint64_t spa_log_sm_memused(spa_t *);
void spa_log_sm_decrement_mscount(spa_t *, uint64_t);
void spa_log_sm_increment_current_mscount(spa_t *);
void spa_log_summary_add_flushed_metaslab(spa_t *);
void spa_log_summary_decrement_mscount(spa_t *, uint64_t);
void spa_log_summary_add_flushed_metaslab(spa_t *, boolean_t);
void spa_log_summary_dirty_flushed_metaslab(spa_t *, uint64_t);
void spa_log_summary_decrement_mscount(spa_t *, uint64_t, boolean_t);
void spa_log_summary_decrement_blkcount(spa_t *, uint64_t);
boolean_t spa_flush_all_logs_requested(spa_t *);
+3
View File
@@ -244,6 +244,9 @@ extern "C" {
#define DEV_PATH "path"
#define DEV_IS_PART "is_slice"
#define DEV_SIZE "dev_size"
/* Size of the whole parent block device (if dev is a partition) */
#define DEV_PARENT_SIZE "dev_parent_size"
#endif /* __linux__ */
#define EV_V1 1
+1 -1
View File
@@ -78,7 +78,7 @@ extern void txg_register_callbacks(txg_handle_t *txghp, list_t *tx_callbacks);
extern void txg_delay(struct dsl_pool *dp, uint64_t txg, hrtime_t delta,
hrtime_t resolution);
extern void txg_kick(struct dsl_pool *dp);
extern void txg_kick(struct dsl_pool *dp, uint64_t txg);
/*
* Wait until the given transaction group has finished syncing.
+1
View File
@@ -642,6 +642,7 @@ extern int vdev_obsolete_counts_are_precise(vdev_t *vd, boolean_t *are_precise);
*/
int vdev_checkpoint_sm_object(vdev_t *vd, uint64_t *sm_obj);
void vdev_metaslab_group_create(vdev_t *vd);
uint64_t vdev_best_ashift(uint64_t logical, uint64_t a, uint64_t b);
/*
* Vdev ashift optimization tunables
+5
View File
@@ -110,7 +110,12 @@ typedef enum zap_flags {
* already randomly distributed.
*/
ZAP_FLAG_PRE_HASHED_KEY = 1 << 2,
#if defined(__linux__) && defined(_KERNEL)
} zfs_zap_flags_t;
#define zap_flags_t zfs_zap_flags_t
#else
} zap_flags_t;
#endif
/*
* Create a new zapobj with no attributes and return its object number.
+10 -1
View File
@@ -221,6 +221,15 @@ typedef struct {
uint64_t lr_foid; /* object id */
} lr_ooo_t;
/*
* Additional lr_attr_t fields.
*/
typedef struct {
uint64_t lr_attr_attrs; /* all of the attributes */
uint64_t lr_attr_crtime[2]; /* create time */
uint8_t lr_attr_scanstamp[32];
} lr_attr_end_t;
/*
* Handle option extended vattr attributes.
*
@@ -231,7 +240,7 @@ typedef struct {
typedef struct {
uint32_t lr_attr_masksize; /* number of elements in array */
uint32_t lr_attr_bitmap; /* First entry of array */
/* remainder of array and any additional fields */
/* remainder of array and additional lr_attr_end_t fields */
} lr_attr_t;
/*
+2 -17
View File
@@ -108,23 +108,6 @@ enum zio_checksum {
#define ZIO_DEDUPCHECKSUM ZIO_CHECKSUM_SHA256
/* supported encryption algorithms */
enum zio_encrypt {
ZIO_CRYPT_INHERIT = 0,
ZIO_CRYPT_ON,
ZIO_CRYPT_OFF,
ZIO_CRYPT_AES_128_CCM,
ZIO_CRYPT_AES_192_CCM,
ZIO_CRYPT_AES_256_CCM,
ZIO_CRYPT_AES_128_GCM,
ZIO_CRYPT_AES_192_GCM,
ZIO_CRYPT_AES_256_GCM,
ZIO_CRYPT_FUNCTIONS
};
#define ZIO_CRYPT_ON_VALUE ZIO_CRYPT_AES_256_GCM
#define ZIO_CRYPT_DEFAULT ZIO_CRYPT_OFF
/* macros defining encryption lengths */
#define ZIO_OBJSET_MAC_LEN 32
#define ZIO_DATA_IV_LEN 12
@@ -699,6 +682,8 @@ extern void spa_handle_ignored_writes(spa_t *spa);
/* zbookmark_phys functions */
boolean_t zbookmark_subtree_completed(const struct dnode_phys *dnp,
const zbookmark_phys_t *subtree_root, const zbookmark_phys_t *last_block);
boolean_t zbookmark_subtree_tbd(const struct dnode_phys *dnp,
const zbookmark_phys_t *subtree_root, const zbookmark_phys_t *last_block);
int zbookmark_compare(uint16_t dbss1, uint8_t ibs1, uint16_t dbss2,
uint8_t ibs2, const zbookmark_phys_t *zb1, const zbookmark_phys_t *zb2);
+3
View File
@@ -5,6 +5,9 @@ VPATH = $(top_srcdir)/module/avl/
# Includes kernel code, generate warnings for large stack frames
AM_CFLAGS += $(FRAME_LARGER_THAN)
# See https://debbugs.gnu.org/cgi/bugreport.cgi?bug=54020
AM_CFLAGS += -no-suppress
noinst_LTLIBRARIES = libavl.la
KERNEL_C = \
+3
View File
@@ -2,6 +2,9 @@ include $(top_srcdir)/config/Rules.am
AM_CFLAGS += $(LIBUUID_CFLAGS) $(ZLIB_CFLAGS)
# See https://debbugs.gnu.org/cgi/bugreport.cgi?bug=54020
AM_CFLAGS += -no-suppress
noinst_LTLIBRARIES = libefi.la
USER_C = \
+2 -3
View File
@@ -6,6 +6,8 @@ VPATH = \
# Includes kernel code, generate warnings for large stack frames
AM_CFLAGS += $(FRAME_LARGER_THAN)
# See https://debbugs.gnu.org/cgi/bugreport.cgi?bug=54020
AM_CFLAGS += -no-suppress
noinst_LTLIBRARIES = libicp.la
@@ -17,7 +19,6 @@ ASM_SOURCES_AS = \
asm-x86_64/modes/gcm_pclmulqdq.S \
asm-x86_64/modes/aesni-gcm-x86_64.S \
asm-x86_64/modes/ghash-x86_64.S \
asm-x86_64/sha1/sha1-x86_64.S \
asm-x86_64/sha2/sha256_impl.S \
asm-x86_64/sha2/sha512_impl.S
else
@@ -46,7 +47,6 @@ KERNEL_C = \
algs/modes/ctr.c \
algs/modes/ccm.c \
algs/modes/ecb.c \
algs/sha1/sha1.c \
algs/sha2/sha2.c \
algs/skein/skein.c \
algs/skein/skein_block.c \
@@ -54,7 +54,6 @@ KERNEL_C = \
illumos-crypto.c \
io/aes.c \
io/edonr_mod.c \
io/sha1_mod.c \
io/sha2_mod.c \
io/skein_mod.c \
os/modhash.c \
+3
View File
@@ -8,6 +8,9 @@ VPATH = \
# and required CFLAGS for libtirpc
AM_CFLAGS += $(FRAME_LARGER_THAN) $(LIBTIRPC_CFLAGS)
# See https://debbugs.gnu.org/cgi/bugreport.cgi?bug=54020
AM_CFLAGS += -no-suppress
lib_LTLIBRARIES = libnvpair.la
include $(top_srcdir)/config/Abigail.am
+3
View File
@@ -2,6 +2,9 @@ include $(top_srcdir)/config/Rules.am
DEFAULT_INCLUDES += -I$(srcdir)
# See https://debbugs.gnu.org/cgi/bugreport.cgi?bug=54020
AM_CFLAGS += -no-suppress
noinst_LTLIBRARIES = libshare.la
USER_C = \
+3
View File
@@ -2,6 +2,9 @@ include $(top_srcdir)/config/Rules.am
SUBDIRS = include
# See https://debbugs.gnu.org/cgi/bugreport.cgi?bug=54020
AM_CFLAGS += -no-suppress
noinst_LTLIBRARIES = libspl_assert.la libspl.la
libspl_assert_la_SOURCES = \

Some files were not shown because too many files have changed in this diff Show More