Compare commits

...

83 Commits

Author SHA1 Message Date
Tony Hutter 6a6bd49398 Tag zfs-2.1.6
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2022-09-28 17:25:10 -07:00
Richard Yao 566e908fa0 Fix bad free in skein code
Clang's static analyzer found a bad free caused by skein_mac_atomic().
It will allocate a context on the stack and then pass it to
skein_final(), which attempts to free it. Upon inspection,
skein_digest_atomic() also has the same problem.

These functions were created to match the OpenSolaris ICP API, so I was
curious how we avoided this in other providers and looked at the SHA2
code. It appears that SHA2 has a SHA2Final() helper function that is
called by the exported sha2_mac_final()/sha2_digest_final() as well as
the sha2_mac_atomic() and sha2_digest_atomic() functions. The real work
is done in SHA2Final() while some checks and the free are done in
sha2_mac_final()/sha2_digest_final().

We fix the use after free in the skein code by taking inspiration from
the SHA2 code. We introduce a skein_final_nofree() that does most of the
work, and make skein_final() into a function that calls it and then
frees the memory.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13954
2022-09-28 17:25:10 -07:00
Tony Hutter a2705b1dd5 zpool: Don't print "repairing" on force faulted drives
If you force fault a drive that's resilvering, it's scan stats can get
frozen in time, giving the false impression that it's being resilvered.
This commit checks the vdev state to see if the vdev is healthy before
reporting "resilvering" or "repairing" in zpool status.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #13927
Closes #13930
2022-09-28 12:41:23 -07:00
Mateusz Guzik 63d4838b4a FreeBSD: handle V_PCATCH
See https://cgit.FreeBSD.org/src/commit/?id=a75d1ddd74312f5dd79bc1e965f7077679659f2e

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #13910
2022-09-28 10:35:13 -07:00
Mateusz Guzik eec942cc54 FreeBSD: catch up to 1400068
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #13909
2022-09-28 10:35:13 -07:00
Mateusz Guzik 2c8e3e4b28 FreeBSD: stop passing LK_INTERLOCK to VOP_LOCK
There is an ongoing effort to eliminate this feature.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #13908
2022-09-28 10:35:13 -07:00
Richard Yao 55816c64da FreeBSD: Fix integer conversion for vnlru_free{,_vfsops}()
When reviewing #13875, I noticed that our FreeBSD code has an issue
where it converts from `int64_t` to `int` when calling
`vnlru_free{,_vfsops}()`. The result is that if the int64_t is `1 <<
36`, the int will be 0, since the low bits are 0. Even when some low
bits are set, a value such as `((1 << 36) + 1)` would truncate to 1,
which is wrong.

There is protection against this on 32-bit platforms, but on 64-bit
platforms, there is no check to protect us, so we add a check.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13882
2022-09-28 10:35:13 -07:00
Ryan Moeller 8dcd6af623 FreeBSD: Ignore symlink to i386 includes
A symlink to i386 includes is created in the build dir on amd64 since
freebsd/freebsd-src@d07600c563

Tell git to ignore it like the other include links.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #13719
2022-09-28 10:35:13 -07:00
Richard Yao c973929b29 LUA: Fix CVE-2014-5461
Apply the fix from upstream.

http://www.lua.org/bugs.html#5.2.2-1
https://www.opencve.io/cve/CVE-2014-5461

It should be noted that exploiting this requires the `SYS_CONFIG`
privilege, and anyone with that privilege likely has other opportunities
to do exploits, so it is unlikely that bad actors could exploit this
unless system administrators are executing untrusted ZFS Channel
Programs.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13949
2022-09-27 16:49:02 -07:00
Richard Yao 835e03682c Linux: Fix uninitialized variable usage in zio_do_crypt_data()
Coverity complained about this. An error from `hkdf_sha512()` before uio
initialization will cause pointers to uninitialized memory to be passed
to `zio_crypt_destroy_uio()`. This is a regression that was introduced
by cf63739191. Interestingly, this never
affected FreeBSD, since the FreeBSD version never had that patch ported.
Since moving uio initialization to the top of this function would slow
down the qat_crypt() path, we only move the `memset()` calls to the top
of the function. This is sufficient to fix this problem.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13944
2022-09-27 15:43:26 -07:00
Alexander Motin 33223cbc3c Refactor Log Size Limit
Original Log Size Limit implementation blocked all writes in case of
limit reached until the TXG is committed and the log is freed.  It
caused huge delays and following speed spikes in application writes.

This implementation instead smoothly throttles writes, using exactly
the same mechanism as used for dirty data.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: jxdking <lostking2008@hotmail.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Issue #12284
Closes #13476
2022-09-26 14:55:27 -07:00
Brian Behlendorf 91e02156dd Revert "Reduce dbuf_find() lock contention"
This reverts commit 34dbc618f5.  While this
change resolved the lock contention observed for certain workloads, it
inadventantly reduced the maximum hash inserts/removes per second.  This
appears to be due to the slightly higher acquisition cost of a rwlock vs
a mutex.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
2022-09-21 13:15:51 -07:00
Richard Yao b66f8d3c2b Add zfs_btree_verify_intensity kernel module parameter
I see a few issues in the issue tracker that might be aided by being
able to turn this on. We have no module parameter for it, so I would
like to add one.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13874
2022-09-21 13:15:51 -07:00
Richard Yao 5096ed31c8 Fix incorrect size given to bqueue_enqueue() call in dmu_redact.c
We pass sizeof (struct redact_record *) rather than sizeof (struct
redact_record). Passing the pointer size is wrong.

Coverity caught this in two places.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13885
2022-09-21 13:15:51 -07:00
Ameer Hamza 035e52f591 Delay ZFS_PROP_SHARESMB property to handle it for encrypted raw receive
For encrypted raw receive, objset creation is delayed until a call to
dmu_recv_stream(). ZFS_PROP_SHARESMB property requires objset to be
populated when calling zpl_earlier_version(). To correctly handle the
ZFS_PROP_SHARESMB property for encrypted raw receive, this change
delays setting the property.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #13878
2022-09-21 13:15:26 -07:00
Ameer Hamza d5105f068f zfs recv hangs if max recordsize is less than received recordsize
- Some optimizations for bqueue enqueue/dequeue.
- Added a fix to prevent deadlock when both bqueue_enqueue_impl()
and bqueue_dequeue() waits for signal to be triggered.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #13855
2022-09-21 13:15:26 -07:00
наб faa1e4082d include: move SPA_MINBLOCKSHIFT and zio_encrypt to sys/fs/zfs.h
These are used by userspace, so should live in a public header

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12116
2022-09-21 13:15:26 -07:00
Alexander Motin 44cec45f72 Improve too large physical ashift handling
When iterating through children physical ashifts for vdev, prefer
ones above the maximum logical ashift, that we can actually use,
but within the administrator defined maximum.

When selecting top-level vdev ashift, do not set it to the defined
maximum in case physical ashift is even higher, but just ignore one.
Using the maximum does not prevent misaligned writes, but reduces
space efficiency.  Since ZFS tries to write data sequentially and
aggregates the writes, in many cases large misanigned writes may be
not as bad as the space penalty otherwise.

Allow internal physical ashifts for vdevs higher than SHIFT_MAX.
May be one day allocator or aggregation could benefit from that.

Reduce zfs_vdev_max_auto_ashift default from 16 (64KB) to 14 (16KB),
so that ZFS may still use bigger ashifts up to SHIFT_MAX (64KB),
but only if it really has to or explicitly told to, but not as an
"optimization".

There are some read-intensive NVMe SSDs that report Preferred Write
Alignment of 64KB, and attempt to build RAIDZ2 of those leads to a
space inefficiency that can't be justified.  Instead these changes
make ZFS fall back to logical ashift of 12 (4KB) by default and
only warn user that it may be suboptimal for performance.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes #13798
2022-09-21 13:15:15 -07:00
Rich Ercolani ebbbe01e31 Ask libtool to stop hiding some errors
For #13083, curiously, it did not print the actual error, just
that the compile failed with "Error 1".

In theory, this flag should cause it to report errors twice sometimes.
In practice, I'm pretty okay with reporting some twice if it avoids
reporting some never.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Damian Szuberski <szuberskidamian@gmail.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #13086
2022-09-21 16:12:14 -07:00
Kevin Jin d05f3039f7 Add Module Parameter Regarding Log Size Limit
zfs_wrlog_data_max
The upper limit of TX_WRITE log data. Once it is reached,
write operation is blocked, until log data is cleared out
after txg sync. It only counts TX_WRITE log with WR_COPIED
or WR_NEED_COPY.

Reviewed-by: Prakash Surya <prakash.surya@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: jxdking <lostking2008@hotmail.com>
Closes #12284
2022-09-21 16:12:14 -07:00
Kevin Jin 999830a021 Optimize txg_kick() process (#12274)
Use dp_dirty_pertxg[] for txg_kick(), instead of dp_dirty_total in
original code. Extra parameter "txg" is added for txg_kick(), thus it
knows which txg to kick. Also txg_kick() call is moved from
dsl_pool_need_dirty_delay() to dsl_pool_dirty_space() so that we can
know the txg number assigned for txg_kick().

Some unnecessary code regarding dp_dirty_total in txg_sync_thread() is
also cleaned up.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: jxdking <lostking2008@hotmail.com>
Closes #12274
2022-09-21 16:12:14 -07:00
Ameer Hamza a5b0d42540 zfs recv hangs if max recordsize is less than received recordsize
- Some optimizations for bqueue enqueue/dequeue.
- Added a fix to prevent deadlock when both bqueue_enqueue_impl()
and bqueue_dequeue() waits for signal to be triggered.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #13855
2022-09-19 09:39:07 -07:00
Christian Schwarz cde04badd1 make DMU_OT_IS_METADATA and DMU_OT_IS_ENCRYPTED return B_TRUE or B_FALSE
Without this patch, the

    ASSERT3U(dbuf_is_metadata(db), ==, arc_is_metadata(buf));

at the beginning of dbuf_assign_arcbuf can panic
if the object type is a DMU_OT_NEWTYPE that has
DMU_OT_METADATA set.

While we're at it, fix DMU_OT_IS_ENCRYPTED as well.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Christian Schwarz <christian.schwarz@nutanix.com>
Closes #13842
2022-09-15 16:58:35 -07:00
Richard Yao 3f7c174b50 vdev_draid_lookup_map() should not iterate outside draid_maps
Coverity reported this as an out-of-bounds read.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13865
2022-09-15 16:58:35 -07:00
Akash B 03fa3ef264 Add physical device size to SIZE column in 'zpool list -v'
Add physical device size/capacity only for physical devices in
'zpool list -v' instead of displaying "-" in the SIZE column.
This would make it easier to see the individual device capacity and
to determine which spares are large enough to replace which devices.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Dipak Ghosh <dipak.ghosh@hpe.com>
Signed-off-by: Akash B <akash-b@hpe.com>
Closes #12561
Closes #13106
2022-09-15 10:23:01 -07:00
George Amanakis 8bd3dca9bf Introduce a tunable to exclude special class buffers from L2ARC
Special allocation class or dedup vdevs may have roughly the same
performance as L2ARC vdevs. Introduce a new tunable to exclude those
buffers from being cacheable on L2ARC.

Reviewed-by: Don Brady <don.brady@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #11761
Closes #12285
2022-09-14 11:27:00 -07:00
наб c8f795ba53 config: check for parallel(1), use it for cstyle
Before:
$ time make cstyle
real    0m23.118s
user    0m23.002s
sys     0m0.114s

After:
$ time make cstyle
real    0m4.577s
user    0m31.487s
sys     0m0.699s

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Issue #12899
2022-09-14 11:23:25 -07:00
Tony Hutter 7bbfac9d04 zed: Fix config_sync autoexpand flood
Users were seeing floods of `config_sync` events when autoexpand was
enabled.  This happened because all "disk status change" udev events
invoke the autoexpand codepath, which calls zpool_relabel_disk(),
which in turn cause another "disk status change" event to happen,
in a feedback loop.  Note that "disk status change" happens every time
a user calls close() on a block device.

This commit breaks the feedback loop by only allowing an autoexpand
to happen if the disk actually changed size.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes: #7132
Closes: #7366
Closes #13729
2022-09-14 09:57:44 -07:00
Walter Huf 2010c183bc Add xattr_handler support for Android kernels
Some ARM BSPs run the Android kernel, which has
a modified xattr_handler->get() function signature.
This adds support to compile against these kernels.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Walter Huf <hufman@gmail.com>
Closes #13824
2022-09-14 09:57:37 -07:00
Samuel aa9e887d2a Fix column width in 'zpool iostat -v' and 'zpool list -v'
This commit fixes a minor spacing issue caused when
enumerating vdev names, which originated from #13031

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Akash B <akash-b@hpe.com>
Signed-off-by: Samuel Wycliffe <samuelwycliffe@gmail.com>
Closes #13811
2022-09-14 09:57:05 -07:00
Ryan Moeller 78206a2e44 FreeBSD: Mark ZFS_MODULE_PARAM_CALL as MPSAFE
ZFS_MODULE_PARAM_CALL handlers implement their own locking if needed
and do not require Giant.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #13756
2022-09-13 17:59:15 -07:00
Alexander Motin b6ebf270eb Apply arc_shrink_shift to ARC above arc_c_min
It makes sense to free memory in smaller chunks when approaching
arc_c_min to let other kernel subsystems to free more, since after
that point we can't free anything.  This also matches behavior on
Linux, where to shrinker reported only the size above arc_c_min.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #13794
2022-09-13 17:59:10 -07:00
George Wilson 15b64fbc94 Importing from cachefile can trip assertion
When importing from cachefile, it is possible that the builtin retry
logic will trip an assertion because it also fails to find the pool.
This fix addresses that case and returns the correct error message to
the user.

Reviewed-by: Richard Yao <ryao@gentoo.org>
Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Wilson <gwilson@delphix.com>
Closes #13781
2022-09-13 17:59:04 -07:00
Tony Hutter b1be0a5c15 ZTS: Fix zpool_expand_001_pos
`zpool_expand_001_pos` was often failing due to not seeing autoexpand
commands in the `zpool history`.  During testing, I found this to be
unreliable (sometimes the "online" wouldn't appear in `zpool history`)
and unnecessary, as we could simply check that the pool increased in
size.

This commit revamps the test to check for the expanded pool size
and corresponding new free space.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #13743
2022-09-13 17:58:03 -07:00
Tony Hutter 65f8f92d12 zed: Look for NVMe DEVPATH if no ID_BUS
We tried replacing an NVMe drive using autoreplace, only
to see zed reject it with:

zed[27955]: zed_udev_monitor: /dev/nvme5n1 no devid source

This happened because ZED saw that ID_BUS was not set by udev
for the NVMe drive, and thus didn't think it was "real drive".
This commit allows NVMe drives to be autoreplaced even if
ID_BUS is not set.

Reviewed-by: Don Brady <don.brady@intel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #13512
Closes #13646
2022-09-13 17:51:11 -07:00
Tony Hutter acd7464639 zed: Ignore false 'atari' partitions in autoreplace
libudev will sometimes falsely identify an 'atari' partition on a
blank disk, preventing it from being used in an autoreplace.  This
seems to be a known issue.  The workaround is to just ignore the
fake partition and continue with the autoreplace.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #13497
Closes #13632
2022-09-13 17:51:06 -07:00
Tony Hutter f48d9b4269 rpm: Silence "unversioned Obsoletes" warnings on EL 9
Get rid of RPM warnings on AlmaLinux 9:

"It's not recommended to have unversioned Obsoletes"

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #13584
Closes #13638
2022-09-13 17:50:59 -07:00
Neal Gompa (ニール・ゴンパ) e1b49e3f1d rpm: Use the correct version-release information in dependencies
This tightly links the subpackages together and ensures that everything
is upgraded together.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Neal Gompa <ngompa@datto.com>
Closes #13489
2022-09-13 17:50:42 -07:00
Richard Yao 8131a96544 Fix use-after-free in btree code
Coverty static analysis found these.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #10989
Closes #13861
2022-09-13 16:15:38 -07:00
gregory-lee-bartholomew 979fd5a434 contrib: dracut: zfs-snapshot-bootfs: exit status fix
When the zfs-snapshot-bootfs service attempts to create a snapshot
that already exists, the exit status of the command is non-zero and
the service reports failed to the systemd service manager. This is a
common occurrence if bootfs.snapshot is left set on the kernel command
line and it should not be considered a failure.

This service was originally set to ignore this error by prefixing
the command with - on the ExecStart line, but the leading - appears
to have been dropped in #13359.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gregory Bartholomew <gregory.lee.bartholomew@gmail.com>
Closes #13769
2022-08-12 14:31:51 -07:00
r-ricci 533779f5f2 arcstat: fix -p option
When the -p option is used, a list of floats is passed to sep.join(),
which expects strings. Fix this by converting each value to a string.

Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Roberto Ricci <ricci@disroot.org>
Closes #12916 
Closes #13767
2022-08-12 14:29:24 -07:00
Paul Zuchowski db5fd16f0b Fix problem with zdb_objset_id test.
Use large numbers for datasets with
numeric names to avoid name and id
collisions.

Signed-off-by: Paul Zuchowski <pzuchowski@datto.com>
2022-08-09 11:46:12 -07:00
Coleman Kane e0dbab1a14 Linux 6.0 compat: register_shrinker() now var-arg
The 6.0 kernel added a printf-style var-arg for args > 0 to the
register_shrinker function, in order to add names to shrinkers, in
commit e33c267ab70de4249d22d7eab1cc7d68a889bac2. This enables the
shrinkers to have friendly names exposed in /sys/kernel/debug/shrinker/.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #13748
2022-08-09 09:41:06 -07:00
Brian Behlendorf 4063d7b6b4 Linux 5.20 compat: blk_cleanup_disk()
As of the Linux 5.20 kernel blk_cleanup_disk() has been removed,
all callers should use put_disk().

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13728
2022-08-09 09:41:06 -07:00
Brian Behlendorf 58571ba447 Linux 5.20 compat: bdevname()
As of the Linux 5.20 kernel bdevname() has been removed, all
callers should use snprintf() and the "%pg" format specifier.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13728
2022-08-09 09:41:06 -07:00
Brian Behlendorf 57e1052d33 Linux 5.19 compat: META
Update the META file to reflect compatibility with the 5.19 kernel.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13715
2022-08-09 09:41:06 -07:00
Paul Zuchowski fcbddc7f7c Fix problem with zdb -d
zdb -d <pool>/<objset ID> does not work when
other command line arguments are included i.e.
zdb -U <cachefile> -d <pool>/<objset ID>
This change fixes the command line parsing
to handle this situation.  Also fix issue
where zdb -r <dataset> <file> does not handle
the root <dataset> of the pool. Introduce -N
option to force <objset ID> to be interpreted
as a numeric objsetID.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Paul Zuchowski <pzuchowski@datto.com>
Closes #12845
Closes #12944
2022-08-08 16:56:38 -07:00
Tino Reichardt b06aff105c Fix checkstyle warning: E275 missing whitespace after keyword
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #13710
2022-08-02 10:05:14 -07:00
Rich Ercolani 035ee628cf Revert behavior of 59eab109 on not-Linux
It turns out that short-circuiting the EFAULT behavior on a short read
breaks things on FreeBSD. So until there's a nicer solution, let's
just revert the behavior for not-Linux.

Reference:
https://reviews.freebsd.org/R10:70f51f0e474ffe1fb74cb427423a2fba3637544d

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12698
2022-08-02 10:05:14 -07:00
Rich Ercolani 5c56591b57 Handle partial reads in zfs_read
Currently, dmu_read_uio_dnode can read 64K of a requested 1M in one
loop, get EFAULT back from zfs_uiomove() (because the iovec only holds
64k), and return EFAULT, which turns into EAGAIN on the way out. EAGAIN
gets interpreted as "I didn't read anything", the caller tries again
without consuming the 64k we already read, and we're stuck.

This apparently works on newer kernels because the caller which breaks
on older Linux kernels by happily passing along a 1M read request and a
64k iovec just requests 64k at a time.

With this, we now won't return EFAULT if we got a partial read.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12370
Closes #12509
Closes #12516
2022-08-02 10:05:14 -07:00
наб 17512aba0c module: lua: ldo: fix pragma name
/home/nabijaczleweli/store/code/zfs/module/lua/ldo.c:175:32: warning:
unknown option after ‘#pragma GCC diagnostic’ kind [-Wpragmas]
  175 | #pragma GCC diagnostic ignored "-Winfinite-recursion"a
      |                                ^~~~~~~~~~~~~~~~~~~~~~

Fixes: a6e8113fed ("Silence
-Winfinite-recursion warning in luaD_throw()")

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13348
2022-07-28 14:17:38 -07:00
Brian Behlendorf 98315be036 ZTS: Fix io_uring support check
Not all Linux distribution kernels enable io_uring support by
default.  Update the run time check to verify that the booted
kernel was built with CONFIG_IO_URING=y.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Co-authored-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13648
Closes #13685
2022-07-27 13:38:56 -07:00
Brian Behlendorf 69ad0bd769 Fix objtool: missing int3 after ret warning
Resolve straight-line speculation warnings reported by objtool
for x86_64 assembly on Linux when CONFIG_SLS is set.  See the
following LWN article for the complete details.

https://lwn.net/Articles/877845/

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13528
Closes #13575
2022-07-27 13:38:56 -07:00
Attila Fülöp b9d862f2db ICP: Add missing stack frame info to SHA asm files
Since the assembly routines calculating SHA checksums don't use
a standard stack layout, CFI directives are needed to unroll the
stack.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes #11733
2022-07-27 13:38:56 -07:00
Brian Behlendorf d2ff2196a5 Fix -Wformat-overflow warning in zfs_project_handle_dir()
Switch to using asprintf() to satisfy the compiler and resolve the
potential format-overflow warning.  Not the conditional before the
sprintf() would have prevented this regardless.

    cmd/zfs/zfs_project.c: In function ‘zfs_project_handle_dir’:
    cmd/zfs/zfs_project.c:241:38: error: ‘/’ directive writing
    1 byte into a region of size between 0 and 4352
    [-Werror=format-overflow=]
    cmd/zfs/zfs_project.c:241:17: note: ‘sprintf’ output between
    2 and 4609 bytes into a destination of size 4352

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13528
Closes #13575
2022-07-27 13:38:56 -07:00
Brian Behlendorf d483ef3744 Fix -Wformat-truncation warning in upgrade_set_callback()
Extend the buffer slightly resolve the warning.

    cmd/zfs/zfs_main.c: In function ‘upgrade_set_callback’:
    cmd/zfs/zfs_main.c:2446:22: error: ‘%llu’ directive output
    may be truncated writing between 1 and 20 bytes into a
    region of size 16 [-Werror=format-truncation=]
    cmd/zfs/zfs_main.c:2445:24: note: ‘snprintf’ output between
    2 and 21 bytes into a destination of size 16

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13528
Closes #13575
2022-07-27 13:38:56 -07:00
Brian Behlendorf 60f2cfd24f Fix -Wuse-after-free warning in dbuf_destroy()
Move the use of the db pointer after it is freed.  It's only used as
a tag so a dereference would never occur, but there's no reason we
can't invert the order to resolve the warning.

    module/zfs/dbuf.c: In function 'dbuf_destroy':
    module/zfs/dbuf.c:2953:17: error:
    pointer 'db' may be used after 'free' [-Werror=use-after-free]

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13528
Closes #13575
2022-07-27 13:38:56 -07:00
Brian Behlendorf 6a81173026 Fix -Wuse-after-free warning in dbuf_issue_final_prefetch_done()
Move the use of the private pointer after it is freed.  It's only
used as a tag so a dereference would never occur, but there's no
harm in inverting the order to resolve the warning.

    module/zfs/dbuf.c: In function 'dbuf_issue_final_prefetch_done':
    module/zfs/dbuf.c:3204:17: error:
    pointer 'private' may be used after 'free' [-Werror=use-after-free]

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13528
Closes #13575
2022-07-27 13:38:56 -07:00
Brian Behlendorf 087f5dedd5 Fix -Wattribute-warning in dsl layer
The memcpy(), memmove(), and memset() functions have been annotated
to perform bounds checking when using FORTIFY_SOURCE.  A warning is
now generted when writing beyond the end of the specified field.

Alternately, the new struct_group() macro could be used to create
an anonymous union member for use by memcpy().  However, since this
is the only place the macro would be helpful it's preferable to
restructure the code slights to avoid the need for additional
compatibility code when the macro does not exist.

https://lore.kernel.org/lkml/20211118183807.1283332-1-keescook@chromium.org/T/

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13528
Closes #13575
2022-07-27 13:38:56 -07:00
Brian Behlendorf c771583f23 Fix -Wattribute-warning in edonr
The wrong union memory was being accessed in EdonRInit resulting in
a write beyond size of field compiler warning.  Reference the correct
member to resolve the warning.  The warning was correct and this in
case the mistake was harmless.

    In function ‘fortify_memcpy_chk’,
    inlined from ‘EdonRInit’ at zfs/module/icp/algs/edonr/edonr.c:494:3:
    ./include/linux/fortify-string.h:344:25: error: call to
    ‘__write_overflow_field’ declared with attribute warning:
    detected write beyond size of field (1st parameter);
    maybe use struct_group()? [-Werror=attribute-warning]

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13528
Closes #13575
2022-07-27 13:38:56 -07:00
Brian Behlendorf ef0e506f46 Fix -Wattribute-warning in zfs_log_xvattr()
Restructure the code in zfs_log_xvattr() to use a lr_attr_end
structure when accessing lr_attr_t elements located after the
variable sized array.  This makes the code more understandable
and resolves the accessing beyond the end of the field warnings.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13528
Closes #13575
2022-07-27 13:38:56 -07:00
Brian Behlendorf d7a8c573cf Silence -Winfinite-recursion warning in luaD_throw()
This code should be kept inline with the upstream lua version as much
as possible.  Therefore, we simply want to silence the warning.  This
check was enabled by default as part of -Wall in gcc 12.1.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13528
Closes #13575
2022-07-27 13:38:56 -07:00
наб 2d235d58f8 config: prune unused -Wno-bool-compare checks
Reviewed-by: Alejandro Colomar <alx.manpages@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13110
2022-07-27 13:38:56 -07:00
наб 37430e8211 libtpool: -Wno-clobbered
Also remove -Wno-unused-but-set-variable

Upstream-bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61118
Reviewed-by: Alejandro Colomar <alx.manpages@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13110
2022-07-27 13:38:56 -07:00
Tino Reichardt 4b0977027b Remove sha1 hashing from OpenZFS, it's not used anywhere.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Attila Fülöp <attila@fueloep.org>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12895
Closes #12902
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
2022-07-26 10:12:44 -07:00
Alexander Motin 15868d3ecb Fix scrub resume from newly created hole.
It may happen that scan bookmark points to a block that was turned
into a part of a big hole.  In such case dsl_scan_visitbp() may skip
it and dsl_scan_check_resume() will not be called for it.  As result
new scan suspend won't be possible until the end of the object, that
may take hours if the object is a multi-terabyte ZVOL on a slow HDD
pool, stretching TXG to all that time, creating all sorts of problems.

This patch changes the resume condition to any greater or equal block,
so even if we miss the bookmarked block, the next one we find will
delete the bookmark, allowing new suspend.

Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
2022-07-26 10:10:37 -07:00
Alexander Motin bbb50e6129 Avoid memory copy when verifying raidz/draid parity
Before this change for every valid parity column raidz_parity_verify()
allocated new buffer and copied there existing data, then recalculated
the parity and compared the result with the copy.  This patch removes
the memory copy, simply swapping original buffer pointers with newly
allocated empty ones for parity recalculation and comparison. Original
buffers with potentially incorrect parity data are then just freed,
while new recalculated ones are used for repair.

On a pool of 12 4-wide raidz vdevs, storing 1.5TB of 16MB blocks, this
change reduces memory traffic during scrub by 17% and total unhalted
CPU time by 25%.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13613
2022-07-26 10:10:37 -07:00
Alexander Motin 03e33b2bb8 Avoid memory copies during mirror scrub
Issuing several scrub reads for a block we may use the parent ZIO
buffer for one of child ZIOs.  If that read complete successfully,
then we won't need to copy the data explicitly.  If block has only
one copy (typical for root vdev, which is also a mirror inside),
then we never need to copy -- succeed or fail as-is.  Previous
code also copied data from buffer of every successfully completed
child ZIO, but that just does not make any sense.

On healthy N-wide mirror this saves all N+1 (or even more in case
of ditto blocks) memory copies for each scrubbed block, allowing
CPU to focus mostly on check-summing.  For other vdev types it
should save one memory copy per block copy at root vdev.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13606
2022-07-26 10:10:37 -07:00
Alexander Motin 4b8f16072d Fix and disable blocks statistics during scrub
Block statistics calculation during scrub I/O issue in case of sorted
scrub accounted ditto blocks several times.  Embedded blocks on other
side were not accounted at all.  This change moves the accounting from
issue to scan stage, that fixes both problems and also allows to avoid
pool-wide locking and the lock contention it created.

Since this statistics is quite specific and is not even exposed now
anywhere, disable its calculation by default to not waste CPU time.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13579
2022-07-26 10:10:37 -07:00
Alexander Motin 5e06805d8e Avoid two 64-bit divisions per scanned block
Change math to make it like the ARC, using multiplications instead.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13591
2022-07-26 10:10:37 -07:00
Alexander Motin dc91a6a660 Several B-tree optimizations
- Introduce first element offset within a leaf.  It allows to reduce
by ~50% average memmove() size when adding/removing elements.  If the
added/removed element is in the first half of the leaf, we may shift
elements before it and adjust the bth_first instead of moving more
elements after it.
 - Use memcpy() instead of memmove() when we know there is no overlap.
 - Switch from uint64_t to uint32_t.  It does not limit anything,
but 32-bit arches should appreciate it greatly in hot paths.
 - Store leaf capacity in struct btree to avoid 64-bit divisions.
 - Adjust zfs_btree_insert_into_leaf() to always result in balanced
leaves after splitting, no matter where the new element was inserted.
Not that we care about it much, but it should also allow B-trees with
as little as two elements per leaf instead of 4 previously.

When scrubbing pool of 12 SSDs, storing 1.5TB of 4KB zvol blocks this
reduces amount of time spent in memmove() inside the scan thread from
13.7% to 5.7% and total scrub time by ~15 seconds out of 9 minutes.
It should also reduce spacemaps load time, but I haven't measured it.

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13582
2022-07-26 10:10:37 -07:00
Alexander Motin a861aa2b9e Several sorted scrub optimizations
- Reduce size and comparison complexity of q_exts_by_size B-tree.
Previous code used two 64-bit divisions and many other operations to
compare two B-tree elements.  It created enormous overhead.  This
implementation moves the math to the upper level and stores the score
in the B-tree elements themselves.  Since all that we need to store in
that B-tree is the extent score and offset, those can fit into single
8 byte value instead of 24 bytes of q_exts_by_addr element and can be
compared with single operation.
 - Better decouple secondary tree logic from main range_tree by moving
rt_btree_ops and related functions into dsl_scan.c as ext_size_ops.
Those functions are very small to worry about the code duplication and
range_tree does not need to know details such as rt_btree_compare.
 - Instead of accounting number of pending bytes per pool, that needs
atomic on global variable per block, account the number of non-empty
per-vdev queues, that change much more rarely.
 - When extent scan is interrupted by TXG end, continue it in the next
TXG instead of selecting next best extent.  It allows to avoid leaving
one truncated (and so likely not the best any more) extent each TXG.

On top of some other optimizations this saves about 1.5 minutes out of
10 to scrub pool of 12 SSDs, storing 1.5TB of 4KB zvol blocks.

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tom Caputi <caputit1@tcnj.edu>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13576
2022-07-26 10:10:37 -07:00
Alexander Motin 881249de6f FreeBSD: Improve crypto_dispatch() handling
Handle crypto_dispatch() return values same as crp->crp_etype errors.
On FreeBSD 12 many drivers returned same errors both ways, and lack
of proper handling for the first ended up in assertion panic later.
It was changed in FreeBSD 13, but there is no reason to not be safe.

While there, skip waiting for completion, including locking and
wakeup() call, for sessions on synchronous crypto drivers, such as
typical aesni and software.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13563
2022-07-26 10:10:37 -07:00
Alexander Motin 916d9de158 Reduce ZIO io_lock contention on sorted scrub
During sorted scrub multiple threads (one per vdev) are issuing many
ZIOs same time, all using the same scn->scn_zio_root ZIO as parent.
It causes huge lock contention on the single global lock on that ZIO.
Improve it by introducing per-queue null ZIOs, children to that one,
and using them instead as proxy.

For 12 SSD pool storing 1.5TB of 4KB blocks on 80-core system this
dramatically reduces lock contention and reduces scrub time from 21
minutes down to 12.5, while actual read stages (not scan) are about
3x faster, reaching 100K blocks per second per vdev.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13553
2022-07-26 10:10:37 -07:00
Alexander Motin 813e15f28c AVL: Remove obsolete branching optimizations
Modern Clang and GCC can successfully implement simple conditions
without branching with math and flag operations.  Use of arrays for
translation no longer helps as much as it was 14+ years ago.

Disassemble of the code generated by Clang 13.0.0 on FreeBSD 13.1,
Clang 14.0.4 on FreeBSD 14 and GCC 10.2.1 on Debian 11 with this
change still shows no branching instructions.

Profiling of CPU-bound scan stage of sorted scrub shows reproducible
reduction of time spent inside avl_find() from 6.52% to 4.58%.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13540
2022-07-26 10:10:37 -07:00
Alexander Motin 884364ea85 More speculative prefetcher improvements
- Make prefetch distance adaptive: up to 4MB prefetch doubles for
every, hit same as before, but after that it grows by 1/8 every time
the prefetch read does not complete in time to satisfy the demand.
My tests show that 4MB is sufficient for wide NVMe pool to saturate
single reader thread at 2.5GB/s, while new 64MB maximum allows the
same thread to reach 1.5GB/s on wide HDD pool.  Further distance
increase may increase speed even more, but less dramatic and with
higher latency.

 - Allow early reuse of inactive prefetch streams: streams that never
saw hits can be reused immediately if there is a demand, while others
can be reused after 1s of inactivity, starting with the oldest.  After
2s of inactivity streams are deleted to free resources same as before.
This allows by several times increase strided read performance on HDD
pool in presence of simultaneous random reads, previously filling the
zfetch_max_streams limit for seconds and so blocking most of prefetch.

 - Always issue intermediate indirect block reads with SYNC priority.
Each of those reads if delayed for longer may delay up to 1024 other
block prefetches, that may be not good for wide pools.

Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13452
2022-07-26 10:10:37 -07:00
Alexander Motin 6e1e90d64c Improve mg_aliquot math
When calculating mg_aliquot alike to #12046 use number of unique data
disks in the vdev, not the total number of children vdev.  Increase
default value of the tunable from 512KB to 1MB to compensate.

Before this change each disk in striped pool was getting 512KB of
sequential data, in 2-wide mirror -- 1MB, in 3-wide RAIDZ1 -- 768KB.
After this change in all the cases each disk should get 1MB.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13388
2022-07-26 10:10:37 -07:00
Alexander Motin dd9c110ab5 Improve log spacemap load time
Previous flushing algorithm limited only total number of log blocks to
the minimum of 256K and 4x number of metaslabs in the pool.  As result,
system with 1500 disks with 1000 metaslabs each, touching several new
metaslabs each TXG could grow spacemap log to huge size without much
benefits.  We've observed one of such systems importing pool for about
45 minutes.

This patch improves the situation from five sides:
 - By limiting maximum period for each metaslab to be flushed to 1000
TXGs, that effectively limits maximum number of per-TXG spacemap logs
to load to the same number.
 - By making flushing more smooth via accounting number of metaslabs
that were touched after the last flush and actually need another flush,
not just ms_unflushed_txg bump.
 - By applying zfs_unflushed_log_block_pct to the number of metaslabs
that were touched after the last flush, not all metaslabs in the pool.
 - By aggressively prefetching per-TXG spacemap logs up to 16 TXGs in
advance, making log spacemap load process for wide HDD pool CPU-bound,
accelerating it by many times.
 - By reducing zfs_unflushed_log_block_max from 256K to 128K, reducing
single-threaded by nature log processing time from ~10 to ~5 minutes.

As further optimization we could skip bumping ms_unflushed_txg for
metaslabs not touched since the last flush, but that would be an
incompatible change, requiring new pool feature.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #12789
2022-07-26 10:10:37 -07:00
Alexander Motin fdb80a2301 Add more control/visibility to spa_load_verify().
Use error thresholds from policy to control whether to scrub data
and/or metadata.  If threshold is set to UINT64_MAX, then caller
probably does not care about result and we may skip that part.

By default import neither set the data error threshold nor read
the error counter, so skip the data scrub for faster import.
Metadata are still scrubbed and fail if even single error found.

While there just for symmetry return number of metadata errors in
case threshold is not set to zero and we haven't reached it.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #13022
2022-07-26 10:10:37 -07:00
Allan Jude 72a4709a59 spa.c: Replace VERIFY(nvlist_*(...) == 0) with fnvlist_* (#12678)
The fnvlist versions of the functions are fatal if they fail,
saving each call from having to include checking the result.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Signed-off-by: Allan Jude <allan@klarasystems.com>
2022-07-26 10:10:37 -07:00
Alexander Motin 415882d228 Avoid small buffer copying on write
It is wrong for arc_write_ready() to use zfs_abd_scatter_enabled to
decide whether to reallocate/copy the buffer, because the answer is
OS-specific and depends on the buffer size.  Instead of that use
abd_size_alloc_linear(), moved into public header.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12425
2022-07-26 10:10:37 -07:00
Alexander Motin 5b860ae1fb Remove refcount from spa_config_*()
The only reason for spa_config_*() to use refcount instead of simple
non-atomic (thanks to scl_lock) variable for scl_count is tracking,
hard disabled for the last 8 years.  Switch to simple int scl_count
reduces the lock hold time by avoiding atomic, plus makes structure
fit into single cache line, reducing the locks contention.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #12287
2022-07-26 10:10:37 -07:00
Brian Behlendorf 3920d7f325 Scrub mirror children without BPs
When scrubbing a raidz/draid pool, which contains a replacing or
sparing mirror with multiple online children, only one child will
be read.  This is not normally a serious concern because the DTL
records are used to determine where a good copy of the data is.
As long as the data can be read from one child the mirror vdev
will use it to repair gaps in any of its children.  Furthermore,
even if the data which was read is corrupt the raidz code will
detect this and issue its own repair I/O to correct the damage
in the mirror vdev.

However, in the scenario where the DTL is wrong due to silent
data corruption (say due to overwriting one child) and the scrub
happens to read from a child with good data, then the other damaged
mirror child will not be detected nor repaired.

While this is possible for both raidz and draid vdevs, it's most
pronounced when using draid.  This is because by default the zed
will sequentially rebuild a draid pool to a distributed spare,
and the distributed spare half of the mirror is always preferred
since it delivers better performance.  This means the damaged
half of the mirror will go undetected even after scrubbing.

For system administrations this behavior is non-intuitive and in
a worst case scenario could result in the only good copy of the
data being unknowingly detached from the mirror.

This change resolves the issue by reading all replacing/sparing
mirror children when scrubbing.  When the BP isn't available for
verification, then compare the data buffers from each child.  They
must all be identical, if not there's silent damage and an error
is returned to prompt the top-level vdev to issue a repair I/O to
rewrite the data on all of the mirror children.  Since we can't
tell which child was wrong a checksum error is logged against the
replacing or sparing mirror vdev.

Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13555
2022-07-14 10:21:29 -07:00
144 changed files with 3002 additions and 5267 deletions
+2 -2
View File
@@ -1,10 +1,10 @@
Meta: 1
Name: zfs
Branch: 1.0
Version: 2.1.5
Version: 2.1.6
Release: 1
Release-Tags: relext
License: CDDL
Author: OpenZFS
Linux-Maximum: 5.18
Linux-Maximum: 5.19
Linux-Minimum: 3.10
+6 -1
View File
@@ -114,6 +114,11 @@ commitcheck:
${top_srcdir}/scripts/commitcheck.sh; \
fi
if HAVE_PARALLEL
cstyle_line = -print0 | parallel -X0 ${top_srcdir}/scripts/cstyle.pl -cpP {}
else
cstyle_line = -exec ${top_srcdir}/scripts/cstyle.pl -cpP {} +
endif
PHONY += cstyle
cstyle:
@find ${top_srcdir} -name build -prune \
@@ -122,7 +127,7 @@ cstyle:
! -name 'opt_global.h' ! -name '*_if*.h' \
! -name 'zstd_compat_wrapper.h' \
! -path './module/zstd/lib/*' \
-exec ${top_srcdir}/scripts/cstyle.pl -cpP {} \+
$(cstyle_line)
filter_executable = -exec test -x '{}' \; -print
+1 -1
View File
@@ -271,7 +271,7 @@ def print_values():
if pretty_print:
fmt = lambda col: prettynum(cols[col][0], cols[col][1], v[col])
else:
fmt = lambda col: v[col]
fmt = lambda col: str(v[col])
sys.stdout.write(sep.join(fmt(col) for col in hdr))
sys.stdout.write("\n")
+83 -30
View File
@@ -112,7 +112,7 @@ extern int zfs_vdev_async_read_max_active;
extern boolean_t spa_load_verify_dryrun;
extern boolean_t spa_mode_readable_spacemaps;
extern int zfs_reconstruct_indirect_combinations_max;
extern int zfs_btree_verify_intensity;
extern uint_t zfs_btree_verify_intensity;
static const char cmdname[] = "zdb";
uint8_t dump_opt[256];
@@ -8272,6 +8272,23 @@ zdb_embedded_block(char *thing)
free(buf);
}
/* check for valid hex or decimal numeric string */
static boolean_t
zdb_numeric(char *str)
{
int i = 0;
if (strlen(str) == 0)
return (B_FALSE);
if (strncmp(str, "0x", 2) == 0 || strncmp(str, "0X", 2) == 0)
i = 2;
for (; i < strlen(str); i++) {
if (!isxdigit(str[i]))
return (B_FALSE);
}
return (B_TRUE);
}
int
main(int argc, char **argv)
{
@@ -8317,7 +8334,7 @@ main(int argc, char **argv)
zfs_btree_verify_intensity = 3;
while ((c = getopt(argc, argv,
"AbcCdDeEFGhiI:klLmMo:Op:PqrRsSt:uU:vVx:XYyZ")) != -1) {
"AbcCdDeEFGhiI:klLmMNo:Op:PqrRsSt:uU:vVx:XYyZ")) != -1) {
switch (c) {
case 'b':
case 'c':
@@ -8331,6 +8348,7 @@ main(int argc, char **argv)
case 'l':
case 'm':
case 'M':
case 'N':
case 'O':
case 'r':
case 'R':
@@ -8422,31 +8440,6 @@ main(int argc, char **argv)
(void) fprintf(stderr, "-p option requires use of -e\n");
usage();
}
if (dump_opt['d'] || dump_opt['r']) {
/* <pool>[/<dataset | objset id> is accepted */
if (argv[2] && (objset_str = strchr(argv[2], '/')) != NULL &&
objset_str++ != NULL) {
char *endptr;
errno = 0;
objset_id = strtoull(objset_str, &endptr, 0);
/* dataset 0 is the same as opening the pool */
if (errno == 0 && endptr != objset_str &&
objset_id != 0) {
target_is_spa = B_FALSE;
dataset_lookup = B_TRUE;
} else if (objset_id != 0) {
printf("failed to open objset %s "
"%llu %s", objset_str,
(u_longlong_t)objset_id,
strerror(errno));
exit(1);
}
/* normal dataset name not an objset ID */
if (endptr == objset_str) {
objset_id = -1;
}
}
}
#if defined(_LP64)
/*
@@ -8486,7 +8479,7 @@ main(int argc, char **argv)
verbose = MAX(verbose, 1);
for (c = 0; c < 256; c++) {
if (dump_all && strchr("AeEFklLOPrRSXy", c) == NULL)
if (dump_all && strchr("AeEFklLNOPrRSXy", c) == NULL)
dump_opt[c] = 1;
if (dump_opt[c])
dump_opt[c] += verbose;
@@ -8525,6 +8518,7 @@ main(int argc, char **argv)
return (dump_path(argv[0], argv[1], NULL));
}
if (dump_opt['r']) {
target_is_spa = B_FALSE;
if (argc != 3)
usage();
dump_opt['v'] = verbose;
@@ -8535,6 +8529,10 @@ main(int argc, char **argv)
rewind = ZPOOL_DO_REWIND |
(dump_opt['X'] ? ZPOOL_EXTREME_REWIND : 0);
/* -N implies -d */
if (dump_opt['N'] && dump_opt['d'] == 0)
dump_opt['d'] = dump_opt['N'];
if (nvlist_alloc(&policy, NV_UNIQUE_NAME_TYPE, 0) != 0 ||
nvlist_add_uint64(policy, ZPOOL_LOAD_REQUEST_TXG, max_txg) != 0 ||
nvlist_add_uint32(policy, ZPOOL_LOAD_REWIND_POLICY, rewind) != 0)
@@ -8553,6 +8551,34 @@ main(int argc, char **argv)
targetlen = strlen(target);
if (targetlen && target[targetlen - 1] == '/')
target[targetlen - 1] = '\0';
/*
* See if an objset ID was supplied (-d <pool>/<objset ID>).
* To disambiguate tank/100, consider the 100 as objsetID
* if -N was given, otherwise 100 is an objsetID iff
* tank/100 as a named dataset fails on lookup.
*/
objset_str = strchr(target, '/');
if (objset_str && strlen(objset_str) > 1 &&
zdb_numeric(objset_str + 1)) {
char *endptr;
errno = 0;
objset_str++;
objset_id = strtoull(objset_str, &endptr, 0);
/* dataset 0 is the same as opening the pool */
if (errno == 0 && endptr != objset_str &&
objset_id != 0) {
if (dump_opt['N'])
dataset_lookup = B_TRUE;
}
/* normal dataset name not an objset ID */
if (endptr == objset_str) {
objset_id = -1;
}
} else if (objset_str && !zdb_numeric(objset_str + 1) &&
dump_opt['N']) {
printf("Supply a numeric objset ID with -N\n");
exit(1);
}
} else {
target_pool = target;
}
@@ -8670,13 +8696,27 @@ main(int argc, char **argv)
}
return (error);
} else {
target_pool = strdup(target);
if (strpbrk(target, "/@") != NULL)
*strpbrk(target_pool, "/@") = '\0';
zdb_set_skip_mmp(target);
/*
* If -N was supplied, the user has indicated that
* zdb -d <pool>/<objsetID> is in effect. Otherwise
* we first assume that the dataset string is the
* dataset name. If dmu_objset_hold fails with the
* dataset string, and we have an objset_id, retry the
* lookup with the objsetID.
*/
boolean_t retry = B_TRUE;
retry_lookup:
if (dataset_lookup == B_TRUE) {
/*
* Use the supplied id to get the name
* for open_objset.
*/
error = spa_open(target, &spa, FTAG);
error = spa_open(target_pool, &spa, FTAG);
if (error == 0) {
error = name_from_objset_id(spa,
objset_id, dsname);
@@ -8685,10 +8725,23 @@ main(int argc, char **argv)
target = dsname;
}
}
if (error == 0)
if (error == 0) {
if (objset_id > 0 && retry) {
int err = dmu_objset_hold(target, FTAG,
&os);
if (err) {
dataset_lookup = B_TRUE;
retry = B_FALSE;
goto retry_lookup;
} else {
dmu_objset_rele(os, FTAG);
}
}
error = open_objset(target, FTAG, &os);
}
if (error == 0)
spa = dmu_objset_spa(os);
free(target_pool);
}
}
nvlist_free(policy);
+147 -8
View File
@@ -894,14 +894,90 @@ zfs_deliver_check(nvlist_t *nvl)
return (0);
}
/*
* Given a path to a vdev, lookup the vdev's physical size from its
* config nvlist.
*
* Returns the vdev's physical size in bytes on success, 0 on error.
*/
static uint64_t
vdev_size_from_config(zpool_handle_t *zhp, const char *vdev_path)
{
nvlist_t *nvl = NULL;
boolean_t avail_spare, l2cache, log;
vdev_stat_t *vs = NULL;
uint_t c;
nvl = zpool_find_vdev(zhp, vdev_path, &avail_spare, &l2cache, &log);
if (!nvl)
return (0);
verify(nvlist_lookup_uint64_array(nvl, ZPOOL_CONFIG_VDEV_STATS,
(uint64_t **)&vs, &c) == 0);
if (!vs) {
zed_log_msg(LOG_INFO, "%s: no nvlist for '%s'", __func__,
vdev_path);
return (0);
}
return (vs->vs_pspace);
}
/*
* Given a path to a vdev, lookup if the vdev is a "whole disk" in the
* config nvlist. "whole disk" means that ZFS was passed a whole disk
* at pool creation time, which it partitioned up and has full control over.
* Thus a partition with wholedisk=1 set tells us that zfs created the
* partition at creation time. A partition without whole disk set would have
* been created by externally (like with fdisk) and passed to ZFS.
*
* Returns the whole disk value (either 0 or 1).
*/
static uint64_t
vdev_whole_disk_from_config(zpool_handle_t *zhp, const char *vdev_path)
{
nvlist_t *nvl = NULL;
boolean_t avail_spare, l2cache, log;
uint64_t wholedisk;
nvl = zpool_find_vdev(zhp, vdev_path, &avail_spare, &l2cache, &log);
if (!nvl)
return (0);
verify(nvlist_lookup_uint64(nvl, ZPOOL_CONFIG_WHOLE_DISK,
&wholedisk) == 0);
return (wholedisk);
}
/*
* If the device size grew more than 1% then return true.
*/
#define DEVICE_GREW(oldsize, newsize) \
((newsize > oldsize) && \
((newsize / (newsize - oldsize)) <= 100))
static int
zfsdle_vdev_online(zpool_handle_t *zhp, void *data)
{
char *devname = data;
boolean_t avail_spare, l2cache;
nvlist_t *udev_nvl = data;
nvlist_t *tgt;
int error;
char *tmp_devname, devname[MAXPATHLEN];
uint64_t guid;
if (nvlist_lookup_uint64(udev_nvl, ZFS_EV_VDEV_GUID, &guid) == 0) {
sprintf(devname, "%llu", (u_longlong_t)guid);
} else if (nvlist_lookup_string(udev_nvl, DEV_PHYS_PATH,
&tmp_devname) == 0) {
strlcpy(devname, tmp_devname, MAXPATHLEN);
zfs_append_partition(devname, MAXPATHLEN);
} else {
zed_log_msg(LOG_INFO, "%s: no guid or physpath", __func__);
}
zed_log_msg(LOG_INFO, "zfsdle_vdev_online: searching for '%s' in '%s'",
devname, zpool_get_name(zhp));
@@ -953,12 +1029,75 @@ zfsdle_vdev_online(zpool_handle_t *zhp, void *data)
vdev_state_t newstate;
if (zpool_get_state(zhp) != POOL_STATE_UNAVAIL) {
error = zpool_vdev_online(zhp, fullpath, 0,
&newstate);
zed_log_msg(LOG_INFO, "zfsdle_vdev_online: "
"setting device '%s' to ONLINE state "
"in pool '%s': %d", fullpath,
zpool_get_name(zhp), error);
/*
* If this disk size has not changed, then
* there's no need to do an autoexpand. To
* check we look at the disk's size in its
* config, and compare it to the disk size
* that udev is reporting.
*/
uint64_t udev_size = 0, conf_size = 0,
wholedisk = 0, udev_parent_size = 0;
/*
* Get the size of our disk that udev is
* reporting.
*/
if (nvlist_lookup_uint64(udev_nvl, DEV_SIZE,
&udev_size) != 0) {
udev_size = 0;
}
/*
* Get the size of our disk's parent device
* from udev (where sda1's parent is sda).
*/
if (nvlist_lookup_uint64(udev_nvl,
DEV_PARENT_SIZE, &udev_parent_size) != 0) {
udev_parent_size = 0;
}
conf_size = vdev_size_from_config(zhp,
fullpath);
wholedisk = vdev_whole_disk_from_config(zhp,
fullpath);
/*
* Only attempt an autoexpand if the vdev size
* changed. There are two different cases
* to consider.
*
* 1. wholedisk=1
* If you do a 'zpool create' on a whole disk
* (like /dev/sda), then zfs will create
* partitions on the disk (like /dev/sda1). In
* that case, wholedisk=1 will be set in the
* partition's nvlist config. So zed will need
* to see if your parent device (/dev/sda)
* expanded in size, and if so, then attempt
* the autoexpand.
*
* 2. wholedisk=0
* If you do a 'zpool create' on an existing
* partition, or a device that doesn't allow
* partitions, then wholedisk=0, and you will
* simply need to check if the device itself
* expanded in size.
*/
if (DEVICE_GREW(conf_size, udev_size) ||
(wholedisk && DEVICE_GREW(conf_size,
udev_parent_size))) {
error = zpool_vdev_online(zhp, fullpath,
0, &newstate);
zed_log_msg(LOG_INFO,
"%s: autoexpanding '%s' from %llu"
" to %llu bytes in pool '%s': %d",
__func__, fullpath, conf_size,
MAX(udev_size, udev_parent_size),
zpool_get_name(zhp), error);
}
}
}
zpool_close(zhp);
@@ -989,7 +1128,7 @@ zfs_deliver_dle(nvlist_t *nvl)
zed_log_msg(LOG_INFO, "zfs_deliver_dle: no guid or physpath");
}
if (zpool_iter(g_zfshdl, zfsdle_vdev_online, name) != 1) {
if (zpool_iter(g_zfshdl, zfsdle_vdev_online, nvl) != 1) {
zed_log_msg(LOG_INFO, "zfs_deliver_dle: device '%s' not "
"found", name);
return (1);
+50 -10
View File
@@ -78,6 +78,8 @@ zed_udev_event(const char *class, const char *subclass, nvlist_t *nvl)
zed_log_msg(LOG_INFO, "\t%s: %s", DEV_PHYS_PATH, strval);
if (nvlist_lookup_uint64(nvl, DEV_SIZE, &numval) == 0)
zed_log_msg(LOG_INFO, "\t%s: %llu", DEV_SIZE, numval);
if (nvlist_lookup_uint64(nvl, DEV_PARENT_SIZE, &numval) == 0)
zed_log_msg(LOG_INFO, "\t%s: %llu", DEV_PARENT_SIZE, numval);
if (nvlist_lookup_uint64(nvl, ZFS_EV_POOL_GUID, &numval) == 0)
zed_log_msg(LOG_INFO, "\t%s: %llu", ZFS_EV_POOL_GUID, numval);
if (nvlist_lookup_uint64(nvl, ZFS_EV_VDEV_GUID, &numval) == 0)
@@ -130,6 +132,20 @@ dev_event_nvlist(struct udev_device *dev)
numval *= strtoull(value, NULL, 10);
(void) nvlist_add_uint64(nvl, DEV_SIZE, numval);
/*
* If the device has a parent, then get the parent block
* device's size as well. For example, /dev/sda1's parent
* is /dev/sda.
*/
struct udev_device *parent_dev = udev_device_get_parent(dev);
if ((value = udev_device_get_sysattr_value(parent_dev, "size"))
!= NULL) {
uint64_t numval = DEV_BSIZE;
numval *= strtoull(value, NULL, 10);
(void) nvlist_add_uint64(nvl, DEV_PARENT_SIZE, numval);
}
}
/*
@@ -169,7 +185,7 @@ zed_udev_monitor(void *arg)
while (1) {
struct udev_device *dev;
const char *action, *type, *part, *sectors;
const char *bus, *uuid;
const char *bus, *uuid, *devpath;
const char *class, *subclass;
nvlist_t *nvl;
boolean_t is_zfs = B_FALSE;
@@ -208,6 +224,12 @@ zed_udev_monitor(void *arg)
* if this is a disk and it is partitioned, then the
* zfs label will reside in a DEVTYPE=partition and
* we can skip passing this event
*
* Special case: Blank disks are sometimes reported with
* an erroneous 'atari' partition, and should not be
* excluded from being used as an autoreplace disk:
*
* https://github.com/openzfs/zfs/issues/13497
*/
type = udev_device_get_property_value(dev, "DEVTYPE");
part = udev_device_get_property_value(dev,
@@ -215,14 +237,23 @@ zed_udev_monitor(void *arg)
if (type != NULL && type[0] != '\0' &&
strcmp(type, "disk") == 0 &&
part != NULL && part[0] != '\0') {
zed_log_msg(LOG_INFO,
"%s: skip %s since it has a %s partition already",
__func__,
udev_device_get_property_value(dev, "DEVNAME"),
part);
/* skip and wait for partition event */
udev_device_unref(dev);
continue;
const char *devname =
udev_device_get_property_value(dev, "DEVNAME");
if (strcmp(part, "atari") == 0) {
zed_log_msg(LOG_INFO,
"%s: %s is reporting an atari partition, "
"but we're going to assume it's a false "
"positive and still use it (issue #13497)",
__func__, devname);
} else {
zed_log_msg(LOG_INFO,
"%s: skip %s since it has a %s partition "
"already", __func__, devname, part);
/* skip and wait for partition event */
udev_device_unref(dev);
continue;
}
}
/*
@@ -248,10 +279,19 @@ zed_udev_monitor(void *arg)
* device id string is required in the message schema
* for matching with vdevs. Preflight here for expected
* udev information.
*
* Special case:
* NVMe devices don't have ID_BUS set (at least on RHEL 7-8),
* but they are valid for autoreplace. Add a special case for
* them by searching for "/nvme/" in the udev DEVPATH:
*
* DEVPATH=/devices/pci0000:00/0000:00:1e.0/nvme/nvme2/nvme2n1
*/
bus = udev_device_get_property_value(dev, "ID_BUS");
uuid = udev_device_get_property_value(dev, "DM_UUID");
if (!is_zfs && (bus == NULL && uuid == NULL)) {
devpath = udev_device_get_devpath(dev);
if (!is_zfs && (bus == NULL && uuid == NULL &&
strstr(devpath, "/nvme/") == NULL)) {
zed_log_msg(LOG_INFO, "zed_udev_monitor: %s no devid "
"source", udev_device_get_devnode(dev));
udev_device_unref(dev);
+1 -1
View File
@@ -2480,7 +2480,7 @@ upgrade_set_callback(zfs_handle_t *zhp, void *data)
/* upgrade */
if (version < cb->cb_version) {
char verstr[16];
char verstr[24];
(void) snprintf(verstr, sizeof (verstr),
"%llu", (u_longlong_t)cb->cb_version);
if (cb->cb_lastfs[0] && !same_pool(zhp, cb->cb_lastfs)) {
+10 -4
View File
@@ -207,7 +207,6 @@ static int
zfs_project_handle_dir(const char *name, zfs_project_control_t *zpc,
list_t *head)
{
char fullname[PATH_MAX];
struct dirent *ent;
DIR *dir;
int ret = 0;
@@ -227,21 +226,28 @@ zfs_project_handle_dir(const char *name, zfs_project_control_t *zpc,
zpc->zpc_ignore_noent = B_TRUE;
errno = 0;
while (!ret && (ent = readdir(dir)) != NULL) {
char *fullname;
/* skip "." and ".." */
if (strcmp(ent->d_name, ".") == 0 ||
strcmp(ent->d_name, "..") == 0)
continue;
if (strlen(ent->d_name) + strlen(name) >=
sizeof (fullname) + 1) {
if (strlen(ent->d_name) + strlen(name) + 1 >= PATH_MAX) {
errno = ENAMETOOLONG;
break;
}
sprintf(fullname, "%s/%s", name, ent->d_name);
if (asprintf(&fullname, "%s/%s", name, ent->d_name) == -1) {
errno = ENOMEM;
break;
}
ret = zfs_project_handle_one(fullname, zpc);
if (!ret && zpc->zpc_recursive && ent->d_type == DT_DIR)
zfs_project_item_alloc(head, fullname);
free(fullname);
}
if (errno && !ret) {
+20 -8
View File
@@ -2438,7 +2438,14 @@ print_status_config(zpool_handle_t *zhp, status_cbdata_t *cb, const char *name,
(void) nvlist_lookup_uint64_array(root, ZPOOL_CONFIG_SCAN_STATS,
(uint64_t **)&ps, &c);
if (ps != NULL && ps->pss_state == DSS_SCANNING && children == 0) {
/*
* If you force fault a drive that's resilvering, its scan stats can
* get frozen in time, giving the false impression that it's
* being resilvered. That's why we check the state to see if the vdev
* is healthy before reporting "resilvering" or "repairing".
*/
if (ps != NULL && ps->pss_state == DSS_SCANNING && children == 0 &&
vs->vs_state == VDEV_STATE_HEALTHY) {
if (vs->vs_scan_processed != 0) {
(void) printf(gettext(" (%s)"),
(ps->pss_func == POOL_SCAN_RESILVER) ?
@@ -2450,7 +2457,7 @@ print_status_config(zpool_handle_t *zhp, status_cbdata_t *cb, const char *name,
/* The top-level vdevs have the rebuild stats */
if (vrs != NULL && vrs->vrs_state == VDEV_REBUILD_ACTIVE &&
children == 0) {
children == 0 && vs->vs_state == VDEV_STATE_HEALTHY) {
if (vs->vs_rebuild_processed != 0) {
(void) printf(gettext(" (resilvering)"));
}
@@ -5458,8 +5465,8 @@ get_namewidth_iostat(zpool_handle_t *zhp, void *data)
* get_namewidth() returns the maximum width of any name in that column
* for any pool/vdev/device line that will be output.
*/
width = get_namewidth(zhp, cb->cb_namewidth, cb->cb_name_flags,
cb->cb_verbose);
width = get_namewidth(zhp, cb->cb_namewidth,
cb->cb_name_flags | VDEV_NAME_TYPE_ID, cb->cb_verbose);
/*
* The width we are calculating is the width of the header and also the
@@ -6035,6 +6042,7 @@ print_one_column(zpool_prop_t prop, uint64_t value, const char *str,
size_t width = zprop_width(prop, &fixed, ZFS_TYPE_POOL);
switch (prop) {
case ZPOOL_PROP_SIZE:
case ZPOOL_PROP_EXPANDSZ:
case ZPOOL_PROP_CHECKPOINT:
case ZPOOL_PROP_DEDUPRATIO:
@@ -6130,8 +6138,12 @@ print_list_stats(zpool_handle_t *zhp, const char *name, nvlist_t *nv,
* 'toplevel' boolean value is passed to the print_one_column()
* to indicate that the value is valid.
*/
print_one_column(ZPOOL_PROP_SIZE, vs->vs_space, NULL, scripted,
toplevel, format);
if (vs->vs_pspace)
print_one_column(ZPOOL_PROP_SIZE, vs->vs_pspace, NULL,
scripted, B_TRUE, format);
else
print_one_column(ZPOOL_PROP_SIZE, vs->vs_space, NULL,
scripted, toplevel, format);
print_one_column(ZPOOL_PROP_ALLOCATED, vs->vs_alloc, NULL,
scripted, toplevel, format);
print_one_column(ZPOOL_PROP_FREE, vs->vs_space - vs->vs_alloc,
@@ -6282,8 +6294,8 @@ get_namewidth_list(zpool_handle_t *zhp, void *data)
list_cbdata_t *cb = data;
int width;
width = get_namewidth(zhp, cb->cb_namewidth, cb->cb_name_flags,
cb->cb_verbose);
width = get_namewidth(zhp, cb->cb_namewidth,
cb->cb_name_flags | VDEV_NAME_TYPE_ID, cb->cb_verbose);
if (width < 9)
width = 9;
+32 -36
View File
@@ -88,7 +88,7 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_CC_NO_FORMAT_TRUNCATION], [
])
dnl #
dnl # Check if gcc supports -Wno-format-truncation option.
dnl # Check if gcc supports -Wno-format-zero-length option.
dnl #
AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_CC_NO_FORMAT_ZERO_LENGTH], [
AC_MSG_CHECKING([whether $CC supports -Wno-format-zero-length])
@@ -108,57 +108,30 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_CC_NO_FORMAT_ZERO_LENGTH], [
AC_SUBST([NO_FORMAT_ZERO_LENGTH])
])
dnl #
dnl # Check if gcc supports -Wno-bool-compare option.
dnl # Check if gcc supports -Wno-clobbered option.
dnl #
dnl # We actually invoke gcc with the -Wbool-compare option
dnl # We actually invoke gcc with the -Wclobbered option
dnl # and infer the 'no-' version does or doesn't exist based upon
dnl # the results. This is required because when checking any of
dnl # no- prefixed options gcc always returns success.
dnl #
AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_CC_NO_BOOL_COMPARE], [
AC_MSG_CHECKING([whether $CC supports -Wno-bool-compare])
AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_CC_NO_CLOBBERED], [
AC_MSG_CHECKING([whether $CC supports -Wno-clobbered])
saved_flags="$CFLAGS"
CFLAGS="$CFLAGS -Werror -Wbool-compare"
CFLAGS="$CFLAGS -Werror -Wclobbered"
AC_COMPILE_IFELSE([AC_LANG_PROGRAM([], [])], [
NO_BOOL_COMPARE=-Wno-bool-compare
NO_CLOBBERED=-Wno-clobbered
AC_MSG_RESULT([yes])
], [
NO_BOOL_COMPARE=
NO_CLOBBERED=
AC_MSG_RESULT([no])
])
CFLAGS="$saved_flags"
AC_SUBST([NO_BOOL_COMPARE])
])
dnl #
dnl # Check if gcc supports -Wno-unused-but-set-variable option.
dnl #
dnl # We actually invoke gcc with the -Wunused-but-set-variable option
dnl # and infer the 'no-' version does or doesn't exist based upon
dnl # the results. This is required because when checking any of
dnl # no- prefixed options gcc always returns success.
dnl #
AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_CC_NO_UNUSED_BUT_SET_VARIABLE], [
AC_MSG_CHECKING([whether $CC supports -Wno-unused-but-set-variable])
saved_flags="$CFLAGS"
CFLAGS="$CFLAGS -Werror -Wunused-but-set-variable"
AC_COMPILE_IFELSE([AC_LANG_PROGRAM([], [])], [
NO_UNUSED_BUT_SET_VARIABLE=-Wno-unused-but-set-variable
AC_MSG_RESULT([yes])
], [
NO_UNUSED_BUT_SET_VARIABLE=
AC_MSG_RESULT([no])
])
CFLAGS="$saved_flags"
AC_SUBST([NO_UNUSED_BUT_SET_VARIABLE])
AC_SUBST([NO_CLOBBERED])
])
dnl #
@@ -184,6 +157,29 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_CC_IMPLICIT_FALLTHROUGH], [
AC_SUBST([IMPLICIT_FALLTHROUGH])
])
dnl #
dnl # Check if cc supports -Winfinite-recursion option.
dnl #
AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_CC_INFINITE_RECURSION], [
AC_MSG_CHECKING([whether $CC supports -Winfinite-recursion])
saved_flags="$CFLAGS"
CFLAGS="$CFLAGS -Werror -Winfinite-recursion"
AC_COMPILE_IFELSE([AC_LANG_PROGRAM([], [])], [
INFINITE_RECURSION=-Winfinite-recursion
AC_DEFINE([HAVE_INFINITE_RECURSION], 1,
[Define if compiler supports -Winfinite-recursion])
AC_MSG_RESULT([yes])
], [
INFINITE_RECURSION=
AC_MSG_RESULT([no])
])
CFLAGS="$saved_flags"
AC_SUBST([INFINITE_RECURSION])
])
dnl #
dnl # Check if gcc supports -fno-omit-frame-pointer option.
dnl #
+8
View File
@@ -0,0 +1,8 @@
dnl #
dnl # Check if GNU parallel is available.
dnl #
AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_PARALLEL], [
AC_CHECK_PROG([PARALLEL], [parallel], [yes])
AM_CONDITIONAL([HAVE_PARALLEL], [test "x$PARALLEL" = "xyes"])
])
+8 -8
View File
@@ -259,17 +259,17 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLK_QUEUE_FLUSH], [
ZFS_LINUX_TEST_SRC([blk_queue_flush], [
#include <linux/blkdev.h>
], [
struct request_queue *q = NULL;
struct request_queue *q __attribute__ ((unused)) = NULL;
(void) blk_queue_flush(q, REQ_FLUSH);
], [$NO_UNUSED_BUT_SET_VARIABLE], [ZFS_META_LICENSE])
], [], [ZFS_META_LICENSE])
ZFS_LINUX_TEST_SRC([blk_queue_write_cache], [
#include <linux/kernel.h>
#include <linux/blkdev.h>
], [
struct request_queue *q = NULL;
struct request_queue *q __attribute__ ((unused)) = NULL;
blk_queue_write_cache(q, true, true);
], [$NO_UNUSED_BUT_SET_VARIABLE], [ZFS_META_LICENSE])
], [], [ZFS_META_LICENSE])
])
AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_FLUSH], [
@@ -322,9 +322,9 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLK_QUEUE_MAX_HW_SECTORS], [
ZFS_LINUX_TEST_SRC([blk_queue_max_hw_sectors], [
#include <linux/blkdev.h>
], [
struct request_queue *q = NULL;
struct request_queue *q __attribute__ ((unused)) = NULL;
(void) blk_queue_max_hw_sectors(q, BLK_SAFE_MAX_SECTORS);
], [$NO_UNUSED_BUT_SET_VARIABLE])
], [])
])
AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_MAX_HW_SECTORS], [
@@ -345,9 +345,9 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLK_QUEUE_MAX_SEGMENTS], [
ZFS_LINUX_TEST_SRC([blk_queue_max_segments], [
#include <linux/blkdev.h>
], [
struct request_queue *q = NULL;
struct request_queue *q __attribute__ ((unused)) = NULL;
(void) blk_queue_max_segments(q, BLK_MAX_SEGMENTS);
], [$NO_UNUSED_BUT_SET_VARIABLE])
], [])
])
AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_MAX_SEGMENTS], [
+28
View File
@@ -294,6 +294,32 @@ AC_DEFUN([ZFS_AC_KERNEL_BLKDEV_BDEV_WHOLE], [
])
])
dnl #
dnl # 5.20 API change,
dnl # Removed bdevname(), snprintf(.., %pg) should be used.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLKDEV_BDEVNAME], [
ZFS_LINUX_TEST_SRC([bdevname], [
#include <linux/fs.h>
#include <linux/blkdev.h>
], [
struct block_device *bdev __attribute__ ((unused)) = NULL;
char path[BDEVNAME_SIZE];
(void) bdevname(bdev, path);
])
])
AC_DEFUN([ZFS_AC_KERNEL_BLKDEV_BDEVNAME], [
AC_MSG_CHECKING([whether bdevname() exists])
ZFS_LINUX_TEST_RESULT([bdevname], [
AC_DEFINE(HAVE_BDEVNAME, 1, [bdevname() is available])
AC_MSG_RESULT(yes)
], [
AC_MSG_RESULT(no)
])
])
dnl #
dnl # 5.19 API: blkdev_issue_secure_erase()
dnl # 3.10 API: blkdev_issue_discard(..., BLKDEV_DISCARD_SECURE)
@@ -377,6 +403,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLKDEV], [
ZFS_AC_KERNEL_SRC_BLKDEV_CHECK_DISK_CHANGE
ZFS_AC_KERNEL_SRC_BLKDEV_BDEV_CHECK_MEDIA_CHANGE
ZFS_AC_KERNEL_SRC_BLKDEV_BDEV_WHOLE
ZFS_AC_KERNEL_SRC_BLKDEV_BDEVNAME
ZFS_AC_KERNEL_SRC_BLKDEV_ISSUE_SECURE_ERASE
])
@@ -391,6 +418,7 @@ AC_DEFUN([ZFS_AC_KERNEL_BLKDEV], [
ZFS_AC_KERNEL_BLKDEV_CHECK_DISK_CHANGE
ZFS_AC_KERNEL_BLKDEV_BDEV_CHECK_MEDIA_CHANGE
ZFS_AC_KERNEL_BLKDEV_BDEV_WHOLE
ZFS_AC_KERNEL_BLKDEV_BDEVNAME
ZFS_AC_KERNEL_BLKDEV_GET_ERESTARTSYS
ZFS_AC_KERNEL_BLKDEV_ISSUE_SECURE_ERASE
])
+12 -5
View File
@@ -6,13 +6,16 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS_CHECK_EVENTS], [
#include <linux/blkdev.h>
unsigned int blk_check_events(struct gendisk *disk,
unsigned int clearing) { return (0); }
unsigned int clearing) {
(void) disk, (void) clearing;
return (0);
}
static const struct block_device_operations
bops __attribute__ ((unused)) = {
.check_events = blk_check_events,
};
], [], [$NO_UNUSED_BUT_SET_VARIABLE])
], [], [])
])
AC_DEFUN([ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_CHECK_EVENTS], [
@@ -31,7 +34,10 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS_RELEASE_VOID], [
ZFS_LINUX_TEST_SRC([block_device_operations_release_void], [
#include <linux/blkdev.h>
void blk_release(struct gendisk *g, fmode_t mode) { return; }
void blk_release(struct gendisk *g, fmode_t mode) {
(void) g, (void) mode;
return;
}
static const struct block_device_operations
bops __attribute__ ((unused)) = {
@@ -40,7 +46,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS_RELEASE_VOID], [
.ioctl = NULL,
.compat_ioctl = NULL,
};
], [], [$NO_UNUSED_BUT_SET_VARIABLE])
], [], [])
])
AC_DEFUN([ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_RELEASE_VOID], [
@@ -61,6 +67,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS_REVALIDATE_DISK], [
#include <linux/blkdev.h>
int blk_revalidate_disk(struct gendisk *disk) {
(void) disk;
return(0);
}
@@ -68,7 +75,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS_REVALIDATE_DISK], [
bops __attribute__ ((unused)) = {
.revalidate_disk = blk_revalidate_disk,
};
], [], [$NO_UNUSED_BUT_SET_VARIABLE])
], [], [])
])
AC_DEFUN([ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_REVALIDATE_DISK], [
+2 -2
View File
@@ -5,9 +5,9 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_GET_DISK_RO], [
ZFS_LINUX_TEST_SRC([get_disk_ro], [
#include <linux/blkdev.h>
],[
struct gendisk *disk = NULL;
struct gendisk *disk __attribute__ ((unused)) = NULL;
(void) get_disk_ro(disk);
], [$NO_UNUSED_BUT_SET_VARIABLE])
], [])
])
AC_DEFUN([ZFS_AC_KERNEL_GET_DISK_RO], [
+20
View File
@@ -49,6 +49,13 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_MAKE_REQUEST_FN], [
struct gendisk *disk __attribute__ ((unused));
disk = blk_alloc_disk(NUMA_NO_NODE);
])
ZFS_LINUX_TEST_SRC([blk_cleanup_disk], [
#include <linux/blkdev.h>
],[
struct gendisk *disk __attribute__ ((unused));
blk_cleanup_disk(disk);
])
])
AC_DEFUN([ZFS_AC_KERNEL_MAKE_REQUEST_FN], [
@@ -73,6 +80,19 @@ AC_DEFUN([ZFS_AC_KERNEL_MAKE_REQUEST_FN], [
ZFS_LINUX_TEST_RESULT([blk_alloc_disk], [
AC_MSG_RESULT(yes)
AC_DEFINE([HAVE_BLK_ALLOC_DISK], 1, [blk_alloc_disk() exists])
dnl #
dnl # 5.20 API change,
dnl # Removed blk_cleanup_disk(), put_disk() should be used.
dnl #
AC_MSG_CHECKING([whether blk_cleanup_disk() exists])
ZFS_LINUX_TEST_RESULT([blk_cleanup_disk], [
AC_MSG_RESULT(yes)
AC_DEFINE([HAVE_BLK_CLEANUP_DISK], 1,
[blk_cleanup_disk() exists])
], [
AC_MSG_RESULT(no)
])
], [
AC_MSG_RESULT(no)
])
+52 -15
View File
@@ -54,6 +54,21 @@ AC_DEFUN([ZFS_AC_KERNEL_SHRINK_CONTROL_HAS_NID], [
])
])
AC_DEFUN([ZFS_AC_KERNEL_SRC_REGISTER_SHRINKER_VARARG], [
ZFS_LINUX_TEST_SRC([register_shrinker_vararg], [
#include <linux/mm.h>
unsigned long shrinker_cb(struct shrinker *shrink,
struct shrink_control *sc) { return 0; }
],[
struct shrinker cache_shrinker = {
.count_objects = shrinker_cb,
.scan_objects = shrinker_cb,
.seeks = DEFAULT_SEEKS,
};
register_shrinker(&cache_shrinker, "vararg-reg-shrink-test");
])
])
AC_DEFUN([ZFS_AC_KERNEL_SRC_SHRINKER_CALLBACK], [
ZFS_LINUX_TEST_SRC([shrinker_cb_shrink_control], [
#include <linux/mm.h>
@@ -83,29 +98,50 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_SHRINKER_CALLBACK], [
AC_DEFUN([ZFS_AC_KERNEL_SHRINKER_CALLBACK],[
dnl #
dnl # 3.0 - 3.11 API change
dnl # cs->shrink(struct shrinker *, struct shrink_control *sc)
dnl # 6.0 API change
dnl # register_shrinker() becomes a var-arg function that takes
dnl # a printf-style format string as args > 0
dnl #
AC_MSG_CHECKING([whether new 2-argument shrinker exists])
ZFS_LINUX_TEST_RESULT([shrinker_cb_shrink_control], [
AC_MSG_CHECKING([whether new var-arg register_shrinker() exists])
ZFS_LINUX_TEST_RESULT([register_shrinker_vararg], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_SINGLE_SHRINKER_CALLBACK, 1,
[new shrinker callback wants 2 args])
AC_DEFINE(HAVE_REGISTER_SHRINKER_VARARG, 1,
[register_shrinker is vararg])
dnl # We assume that the split shrinker callback exists if the
dnl # vararg register_shrinker() exists, because the latter is
dnl # a much more recent addition, and the macro test for the
dnl # var-arg version only works if the callback is split
AC_DEFINE(HAVE_SPLIT_SHRINKER_CALLBACK, 1,
[cs->count_objects exists])
],[
AC_MSG_RESULT(no)
dnl #
dnl # 3.12 API change,
dnl # cs->shrink() is logically split in to
dnl # cs->count_objects() and cs->scan_objects()
dnl # 3.0 - 3.11 API change
dnl # cs->shrink(struct shrinker *, struct shrink_control *sc)
dnl #
AC_MSG_CHECKING([whether cs->count_objects callback exists])
ZFS_LINUX_TEST_RESULT([shrinker_cb_shrink_control_split], [
AC_MSG_CHECKING([whether new 2-argument shrinker exists])
ZFS_LINUX_TEST_RESULT([shrinker_cb_shrink_control], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_SPLIT_SHRINKER_CALLBACK, 1,
[cs->count_objects exists])
AC_DEFINE(HAVE_SINGLE_SHRINKER_CALLBACK, 1,
[new shrinker callback wants 2 args])
],[
ZFS_LINUX_TEST_ERROR([shrinker])
AC_MSG_RESULT(no)
dnl #
dnl # 3.12 API change,
dnl # cs->shrink() is logically split in to
dnl # cs->count_objects() and cs->scan_objects()
dnl #
AC_MSG_CHECKING([if cs->count_objects callback exists])
ZFS_LINUX_TEST_RESULT(
[shrinker_cb_shrink_control_split],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_SPLIT_SHRINKER_CALLBACK, 1,
[cs->count_objects exists])
],[
ZFS_LINUX_TEST_ERROR([shrinker])
])
])
])
])
@@ -141,6 +177,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_SHRINKER], [
ZFS_AC_KERNEL_SRC_SHRINK_CONTROL_HAS_NID
ZFS_AC_KERNEL_SRC_SHRINKER_CALLBACK
ZFS_AC_KERNEL_SRC_SHRINK_CONTROL_STRUCT
ZFS_AC_KERNEL_SRC_REGISTER_SHRINKER_VARARG
])
AC_DEFUN([ZFS_AC_KERNEL_SHRINKER], [
+28 -1
View File
@@ -100,6 +100,19 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_XATTR_HANDLER_GET], [
.get = get,
};
],[])
ZFS_LINUX_TEST_SRC([xattr_handler_get_dentry_inode_flags], [
#include <linux/xattr.h>
int get(const struct xattr_handler *handler,
struct dentry *dentry, struct inode *inode,
const char *name, void *buffer,
size_t size, int flags) { return 0; }
static const struct xattr_handler
xops __attribute__ ((unused)) = {
.get = get,
};
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_XATTR_HANDLER_GET], [
@@ -142,7 +155,21 @@ AC_DEFUN([ZFS_AC_KERNEL_XATTR_HANDLER_GET], [
AC_DEFINE(HAVE_XATTR_GET_DENTRY, 1,
[xattr_handler->get() wants dentry])
],[
ZFS_LINUX_TEST_ERROR([xattr get()])
dnl #
dnl # Android API change,
dnl # The xattr_handler->get() callback was
dnl # changed to take dentry, inode and flags.
dnl #
AC_MSG_RESULT(no)
AC_MSG_CHECKING(
[whether xattr_handler->get() wants dentry and inode and flags])
ZFS_LINUX_TEST_RESULT([xattr_handler_get_dentry_inode_flags], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_XATTR_GET_DENTRY_INODE_FLAGS, 1,
[xattr_handler->get() wants dentry and inode and flags])
],[
ZFS_LINUX_TEST_ERROR([xattr get()])
])
])
])
])
+3 -2
View File
@@ -209,8 +209,8 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS], [
AX_COUNT_CPUS([])
AC_SUBST(CPU_COUNT)
ZFS_AC_CONFIG_ALWAYS_CC_NO_UNUSED_BUT_SET_VARIABLE
ZFS_AC_CONFIG_ALWAYS_CC_NO_BOOL_COMPARE
ZFS_AC_CONFIG_ALWAYS_CC_NO_CLOBBERED
ZFS_AC_CONFIG_ALWAYS_CC_INFINITE_RECURSION
ZFS_AC_CONFIG_ALWAYS_CC_IMPLICIT_FALLTHROUGH
ZFS_AC_CONFIG_ALWAYS_CC_FRAME_LARGER_THAN
ZFS_AC_CONFIG_ALWAYS_CC_NO_FORMAT_TRUNCATION
@@ -226,6 +226,7 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS], [
ZFS_AC_CONFIG_ALWAYS_SED
ZFS_AC_CONFIG_ALWAYS_CPPCHECK
ZFS_AC_CONFIG_ALWAYS_SHELLCHECK
ZFS_AC_CONFIG_ALWAYS_PARALLEL
])
AC_DEFUN([ZFS_AC_CONFIG], [
@@ -8,5 +8,5 @@ ConditionKernelCommandLine=bootfs.snapshot
[Service]
Type=oneshot
ExecStart=/bin/sh -c '. /lib/dracut-zfs-lib.sh; decode_root_args || exit; [ "$root" = "zfs:AUTO" ] && root="$BOOTFS" SNAPNAME="$(getarg bootfs.snapshot)"; exec @sbindir@/zfs snapshot "$root@${SNAPNAME:-%v}"'
-ExecStart=/bin/sh -c '. /lib/dracut-zfs-lib.sh; decode_root_args || exit; [ "$root" = "zfs:AUTO" ] && root="$BOOTFS" SNAPNAME="$(getarg bootfs.snapshot)"; exec @sbindir@/zfs snapshot "$root@${SNAPNAME:-%v}"'
RemainAfterExit=yes
+1 -1
View File
@@ -104,7 +104,7 @@ zfs_errno = enum_with_offset(1024, [
)
# compat before we used the enum helper for these values
ZFS_ERR_CHECKPOINT_EXISTS = zfs_errno.ZFS_ERR_CHECKPOINT_EXISTS
assert(ZFS_ERR_CHECKPOINT_EXISTS == 1024)
assert (ZFS_ERR_CHECKPOINT_EXISTS == 1024)
ZFS_ERR_DISCARDING_CHECKPOINT = zfs_errno.ZFS_ERR_DISCARDING_CHECKPOINT
ZFS_ERR_NO_CHECKPOINT = zfs_errno.ZFS_ERR_NO_CHECKPOINT
ZFS_ERR_DEVRM_IN_PROGRESS = zfs_errno.ZFS_ERR_DEVRM_IN_PROGRESS
+1 -1
View File
@@ -52,7 +52,7 @@
#define ZFS_MODULE_PARAM_CALL_IMPL(parent, name, perm, args, desc) \
SYSCTL_DECL(parent); \
SYSCTL_PROC(parent, OID_AUTO, name, perm | args, desc)
SYSCTL_PROC(parent, OID_AUTO, name, CTLFLAG_MPSAFE | perm | args, desc)
#define ZFS_MODULE_PARAM_CALL(scope_prefix, name_prefix, name, func, _, perm, desc) \
ZFS_MODULE_PARAM_CALL_IMPL(_vfs_ ## scope_prefix, name, perm, func ## _args(name_prefix ## name), desc)
@@ -115,6 +115,20 @@ fn(struct dentry *dentry, const char *name, void *buffer, size_t size, \
{ \
return (__ ## fn(dentry->d_inode, name, buffer, size)); \
}
/*
* Android API change,
* The xattr_handler->get() callback was changed to take a dentry and inode
* and flags, because the dentry might not be attached to an inode yet.
*/
#elif defined(HAVE_XATTR_GET_DENTRY_INODE_FLAGS)
#define ZPL_XATTR_GET_WRAPPER(fn) \
static int \
fn(const struct xattr_handler *handler, struct dentry *dentry, \
struct inode *inode, const char *name, void *buffer, \
size_t size, int flags) \
{ \
return (__ ## fn(inode, name, buffer, size)); \
}
#else
#error "Unsupported kernel"
#endif
+4
View File
@@ -64,7 +64,11 @@
* }
*/
#ifdef HAVE_REGISTER_SHRINKER_VARARG
#define spl_register_shrinker(x) register_shrinker(x, "zfs-arc-shrinker")
#else
#define spl_register_shrinker(x) register_shrinker(x)
#endif
#define spl_unregister_shrinker(x) unregister_shrinker(x)
/*
+1
View File
@@ -91,6 +91,7 @@ abd_t *abd_alloc_linear(size_t, boolean_t);
abd_t *abd_alloc_gang(void);
abd_t *abd_alloc_for_io(size_t, boolean_t);
abd_t *abd_alloc_sametype(abd_t *, size_t);
boolean_t abd_size_alloc_linear(size_t);
void abd_gang_add(abd_t *, abd_t *, boolean_t);
void abd_free(abd_t *);
abd_t *abd_get_offset(abd_t *, size_t);
-1
View File
@@ -68,7 +68,6 @@ abd_t *abd_get_offset_scatter(abd_t *, abd_t *, size_t, size_t);
void abd_free_struct_impl(abd_t *);
void abd_alloc_chunks(abd_t *, size_t);
void abd_free_chunks(abd_t *);
boolean_t abd_size_alloc_linear(size_t);
void abd_update_scatter_stats(abd_t *, abd_stats_op_t);
void abd_update_linear_stats(abd_t *, abd_stats_op_t);
void abd_verify_scatter(abd_t *);
+1
View File
@@ -85,6 +85,7 @@ typedef void arc_prune_func_t(int64_t bytes, void *priv);
/* Shared module parameters */
extern int zfs_arc_average_blocksize;
extern int l2arc_exclude_special;
/* generic arc_done_func_t's which you can use */
arc_read_done_func_t arc_bcopy_func;
+7 -7
View File
@@ -30,22 +30,22 @@ typedef struct bqueue {
kmutex_t bq_lock;
kcondvar_t bq_add_cv;
kcondvar_t bq_pop_cv;
uint64_t bq_size;
uint64_t bq_maxsize;
uint64_t bq_fill_fraction;
size_t bq_size;
size_t bq_maxsize;
uint_t bq_fill_fraction;
size_t bq_node_offset;
} bqueue_t;
typedef struct bqueue_node {
list_node_t bqn_node;
uint64_t bqn_size;
size_t bqn_size;
} bqueue_node_t;
int bqueue_init(bqueue_t *, uint64_t, uint64_t, size_t);
int bqueue_init(bqueue_t *, uint_t, size_t, size_t);
void bqueue_destroy(bqueue_t *);
void bqueue_enqueue(bqueue_t *, void *, uint64_t);
void bqueue_enqueue_flush(bqueue_t *, void *, uint64_t);
void bqueue_enqueue(bqueue_t *, void *, size_t);
void bqueue_enqueue_flush(bqueue_t *, void *, size_t);
void *bqueue_dequeue(bqueue_t *);
boolean_t bqueue_empty(bqueue_t *);
+10 -2
View File
@@ -72,7 +72,11 @@ extern kmem_cache_t *zfs_btree_leaf_cache;
typedef struct zfs_btree_hdr {
struct zfs_btree_core *bth_parent;
boolean_t bth_core;
/*
* Set to -1 to indicate core nodes. Other values represent first
* valid element offset for leaf nodes.
*/
uint32_t bth_first;
/*
* For both leaf and core nodes, represents the number of elements in
* the node. For core nodes, they will have bth_count + 1 children.
@@ -91,9 +95,12 @@ typedef struct zfs_btree_leaf {
uint8_t btl_elems[];
} zfs_btree_leaf_t;
#define BTREE_LEAF_ESIZE (BTREE_LEAF_SIZE - \
offsetof(zfs_btree_leaf_t, btl_elems))
typedef struct zfs_btree_index {
zfs_btree_hdr_t *bti_node;
uint64_t bti_offset;
uint32_t bti_offset;
/*
* True if the location is before the list offset, false if it's at
* the listed offset.
@@ -105,6 +112,7 @@ typedef struct btree {
zfs_btree_hdr_t *bt_root;
int64_t bt_height;
size_t bt_elem_size;
uint32_t bt_leaf_cap;
uint64_t bt_num_elems;
uint64_t bt_num_nodes;
zfs_btree_leaf_t *bt_bulk; // non-null if bulk loading
-3
View File
@@ -32,9 +32,6 @@ int aes_mod_fini(void);
int edonr_mod_init(void);
int edonr_mod_fini(void);
int sha1_mod_init(void);
int sha1_mod_fini(void);
int sha2_mod_init(void);
int sha2_mod_fini(void);
+6 -14
View File
@@ -321,15 +321,16 @@ typedef struct dmu_buf_impl {
uint8_t db_dirtycnt;
} dmu_buf_impl_t;
#define DBUF_RWLOCKS 8192
#define DBUF_HASH_RWLOCK(h, idx) (&(h)->hash_rwlocks[(idx) & (DBUF_RWLOCKS-1)])
/* Note: the dbuf hash table is exposed only for the mdb module */
#define DBUF_MUTEXES 2048
#define DBUF_HASH_MUTEX(h, idx) (&(h)->hash_mutexes[(idx) & (DBUF_MUTEXES-1)])
typedef struct dbuf_hash_table {
uint64_t hash_table_mask;
dmu_buf_impl_t **hash_table;
krwlock_t hash_rwlocks[DBUF_RWLOCKS] ____cacheline_aligned;
kmutex_t hash_mutexes[DBUF_MUTEXES] ____cacheline_aligned;
} dbuf_hash_table_t;
typedef void (*dbuf_prefetch_fn)(void *, boolean_t);
typedef void (*dbuf_prefetch_fn)(void *, uint64_t, uint64_t, boolean_t);
uint64_t dbuf_whichblock(const struct dnode *di, const int64_t level,
const uint64_t offset);
@@ -441,16 +442,7 @@ dbuf_find_dirty_eq(dmu_buf_impl_t *db, uint64_t txg)
(dbuf_is_metadata(_db) && \
((_db)->db_objset->os_primary_cache == ZFS_CACHE_METADATA)))
#define DBUF_IS_L2CACHEABLE(_db) \
((_db)->db_objset->os_secondary_cache == ZFS_CACHE_ALL || \
(dbuf_is_metadata(_db) && \
((_db)->db_objset->os_secondary_cache == ZFS_CACHE_METADATA)))
#define DNODE_LEVEL_IS_L2CACHEABLE(_dn, _level) \
((_dn)->dn_objset->os_secondary_cache == ZFS_CACHE_ALL || \
(((_level) > 0 || \
DMU_OT_IS_METADATA((_dn)->dn_handle->dnh_dnode->dn_type)) && \
((_dn)->dn_objset->os_secondary_cache == ZFS_CACHE_METADATA)))
boolean_t dbuf_is_l2cacheable(dmu_buf_impl_t *db);
#ifdef ZFS_DEBUG
+4 -2
View File
@@ -136,7 +136,7 @@ typedef enum dmu_object_byteswap {
#endif
#define DMU_OT_IS_METADATA(ot) (((ot) & DMU_OT_NEWTYPE) ? \
((ot) & DMU_OT_METADATA) : \
(((ot) & DMU_OT_METADATA) != 0) : \
DMU_OT_IS_METADATA_IMPL(ot))
#define DMU_OT_IS_DDT(ot) \
@@ -147,7 +147,7 @@ typedef enum dmu_object_byteswap {
((ot) == DMU_OT_PLAIN_FILE_CONTENTS || (ot) == DMU_OT_UINT64_OTHER)
#define DMU_OT_IS_ENCRYPTED(ot) (((ot) & DMU_OT_NEWTYPE) ? \
((ot) & DMU_OT_ENCRYPTED) : \
(((ot) & DMU_OT_ENCRYPTED) != 0) : \
DMU_OT_IS_ENCRYPTED_IMPL(ot))
/*
@@ -1067,6 +1067,8 @@ int dmu_diff(const char *tosnap_name, const char *fromsnap_name,
#define ZFS_CRC64_POLY 0xC96C5795D7870F42ULL /* ECMA-182, reflected form */
extern uint64_t zfs_crc64_table[256];
extern int dmu_prefetch_max;
#ifdef __cplusplus
}
#endif
-4
View File
@@ -200,10 +200,6 @@ struct objset {
#define DMU_GROUPUSED_DNODE(os) ((os)->os_groupused_dnode.dnh_dnode)
#define DMU_PROJECTUSED_DNODE(os) ((os)->os_projectused_dnode.dnh_dnode)
#define DMU_OS_IS_L2CACHEABLE(os) \
((os)->os_secondary_cache == ZFS_CACHE_ALL || \
(os)->os_secondary_cache == ZFS_CACHE_METADATA)
/* called from zpl */
int dmu_objset_hold(const char *name, void *tag, objset_t **osp);
int dmu_objset_hold_flags(const char *name, boolean_t decrypt, void *tag,
+1
View File
@@ -125,6 +125,7 @@ typedef struct dmu_tx_stats {
kstat_named_t dmu_tx_dirty_delay;
kstat_named_t dmu_tx_dirty_over_max;
kstat_named_t dmu_tx_dirty_frees_delay;
kstat_named_t dmu_tx_wrlog_delay;
kstat_named_t dmu_tx_quota;
} dmu_tx_stats_t;
+7 -9
View File
@@ -49,20 +49,18 @@ typedef struct zfetch {
typedef struct zstream {
uint64_t zs_blkid; /* expect next access at this blkid */
uint64_t zs_pf_blkid1; /* first block to prefetch */
uint64_t zs_pf_blkid; /* block to prefetch up to */
/*
* We will next prefetch the L1 indirect block of this level-0
* block id.
*/
uint64_t zs_ipf_blkid1; /* first block to prefetch */
uint64_t zs_ipf_blkid; /* block to prefetch up to */
unsigned int zs_pf_dist; /* data prefetch distance in bytes */
unsigned int zs_ipf_dist; /* L1 prefetch distance in bytes */
uint64_t zs_pf_start; /* first data block to prefetch */
uint64_t zs_pf_end; /* data block to prefetch up to */
uint64_t zs_ipf_start; /* first data block to prefetch L1 */
uint64_t zs_ipf_end; /* data block to prefetch L1 up to */
list_node_t zs_node; /* link for zf_stream */
hrtime_t zs_atime; /* time last prefetch issued */
zfetch_t *zs_fetch; /* parent fetch */
boolean_t zs_missed; /* stream saw cache misses */
boolean_t zs_more; /* need more distant prefetch */
zfs_refcount_t zs_callers; /* number of pending callers */
/*
* Number of stream references: dnode, callers and pending blocks.
+7 -1
View File
@@ -40,6 +40,7 @@
#include <sys/rrwlock.h>
#include <sys/dsl_synctask.h>
#include <sys/mmp.h>
#include <sys/aggsum.h>
#ifdef __cplusplus
extern "C" {
@@ -58,6 +59,7 @@ struct dsl_deadlist;
extern unsigned long zfs_dirty_data_max;
extern unsigned long zfs_dirty_data_max_max;
extern unsigned long zfs_wrlog_data_max;
extern int zfs_dirty_data_sync_percent;
extern int zfs_dirty_data_max_percent;
extern int zfs_dirty_data_max_max_percent;
@@ -82,7 +84,6 @@ typedef struct zfs_blkstat {
typedef struct zfs_all_blkstats {
zfs_blkstat_t zab_type[DN_MAX_LEVELS + 1][DMU_OT_TOTAL + 1];
kmutex_t zab_lock;
} zfs_all_blkstats_t;
@@ -119,6 +120,9 @@ typedef struct dsl_pool {
uint64_t dp_mos_compressed_delta;
uint64_t dp_mos_uncompressed_delta;
aggsum_t dp_wrlog_pertxg[TXG_SIZE];
aggsum_t dp_wrlog_total;
/*
* Time of most recently scheduled (furthest in the future)
* wakeup for delayed transactions.
@@ -159,6 +163,8 @@ uint64_t dsl_pool_adjustedsize(dsl_pool_t *dp, zfs_space_check_t slop_policy);
uint64_t dsl_pool_unreserved_space(dsl_pool_t *dp,
zfs_space_check_t slop_policy);
uint64_t dsl_pool_deferred_space(dsl_pool_t *dp);
void dsl_pool_wrlog_count(dsl_pool_t *dp, int64_t size, uint64_t txg);
boolean_t dsl_pool_need_wrlog_delay(dsl_pool_t *dp);
void dsl_pool_dirty_space(dsl_pool_t *dp, int64_t space, dmu_tx_t *tx);
void dsl_pool_undirty_space(dsl_pool_t *dp, int64_t space, uint64_t txg);
void dsl_free(dsl_pool_t *dp, uint64_t txg, const blkptr_t *bpp);
+1 -1
View File
@@ -155,7 +155,7 @@ typedef struct dsl_scan {
dsl_scan_phys_t scn_phys; /* on disk representation of scan */
dsl_scan_phys_t scn_phys_cached;
avl_tree_t scn_queue; /* queue of datasets to scan */
uint64_t scn_bytes_pending; /* outstanding data to issue */
uint64_t scn_queues_pending; /* outstanding data to issue */
} dsl_scan_t;
typedef struct dsl_scan_io_queue dsl_scan_io_queue_t;
+43
View File
@@ -757,6 +757,7 @@ typedef struct zpool_load_policy {
/* Rewind data discovered */
#define ZPOOL_CONFIG_LOAD_TIME "rewind_txg_ts"
#define ZPOOL_CONFIG_LOAD_META_ERRORS "verify_meta_errors"
#define ZPOOL_CONFIG_LOAD_DATA_ERRORS "verify_data_errors"
#define ZPOOL_CONFIG_REWIND_TIME "seconds_of_rewind"
@@ -1101,6 +1102,7 @@ typedef struct vdev_stat {
uint64_t vs_configured_ashift; /* TLV vdev_ashift */
uint64_t vs_logical_ashift; /* vdev_logical_ashift */
uint64_t vs_physical_ashift; /* vdev_physical_ashift */
uint64_t vs_pspace; /* physical capacity */
} vdev_stat_t;
/* BEGIN CSTYLED */
@@ -1613,6 +1615,47 @@ typedef enum {
#define ZFS_EV_HIST_DSID "history_dsid"
#define ZFS_EV_RESILVER_TYPE "resilver_type"
/*
* We currently support block sizes from 512 bytes to 16MB.
* The benefits of larger blocks, and thus larger IO, need to be weighed
* against the cost of COWing a giant block to modify one byte, and the
* large latency of reading or writing a large block.
*
* The recordsize property can not be set larger than zfs_max_recordsize
* (default 16MB on 64-bit and 1MB on 32-bit). See the comment near
* zfs_max_recordsize in dsl_dataset.c for details.
*
* Note that although the LSIZE field of the blkptr_t can store sizes up
* to 32MB, the dnode's dn_datablkszsec can only store sizes up to
* 32MB - 512 bytes. Therefore, we limit SPA_MAXBLOCKSIZE to 16MB.
*/
#define SPA_MINBLOCKSHIFT 9
#define SPA_OLD_MAXBLOCKSHIFT 17
#define SPA_MAXBLOCKSHIFT 24
#define SPA_MINBLOCKSIZE (1ULL << SPA_MINBLOCKSHIFT)
#define SPA_OLD_MAXBLOCKSIZE (1ULL << SPA_OLD_MAXBLOCKSHIFT)
#define SPA_MAXBLOCKSIZE (1ULL << SPA_MAXBLOCKSHIFT)
/* supported encryption algorithms */
enum zio_encrypt {
ZIO_CRYPT_INHERIT = 0,
ZIO_CRYPT_ON,
ZIO_CRYPT_OFF,
ZIO_CRYPT_AES_128_CCM,
ZIO_CRYPT_AES_192_CCM,
ZIO_CRYPT_AES_256_CCM,
ZIO_CRYPT_AES_128_GCM,
ZIO_CRYPT_AES_192_GCM,
ZIO_CRYPT_AES_256_GCM,
ZIO_CRYPT_FUNCTIONS
};
#define ZIO_CRYPT_ON_VALUE ZIO_CRYPT_AES_256_GCM
#define ZIO_CRYPT_DEFAULT ZIO_CRYPT_OFF
#ifdef __cplusplus
}
#endif
+3
View File
@@ -49,11 +49,14 @@ int metaslab_init(metaslab_group_t *, uint64_t, uint64_t, uint64_t,
metaslab_t **);
void metaslab_fini(metaslab_t *);
void metaslab_set_unflushed_dirty(metaslab_t *, boolean_t);
void metaslab_set_unflushed_txg(metaslab_t *, uint64_t, dmu_tx_t *);
void metaslab_set_estimated_condensed_size(metaslab_t *, uint64_t, dmu_tx_t *);
boolean_t metaslab_unflushed_dirty(metaslab_t *);
uint64_t metaslab_unflushed_txg(metaslab_t *);
uint64_t metaslab_estimated_condensed_size(metaslab_t *);
int metaslab_sort_by_flushed(const void *, const void *);
void metaslab_unflushed_bump(metaslab_t *, dmu_tx_t *, boolean_t);
uint64_t metaslab_unflushed_changes_memused(metaslab_t *);
int metaslab_load(metaslab_t *);
+1
View File
@@ -553,6 +553,7 @@ struct metaslab {
* log space maps.
*/
uint64_t ms_unflushed_txg;
boolean_t ms_unflushed_dirty;
/* updated every time we are done syncing the metaslab's space map */
uint64_t ms_synced_length;
+5 -16
View File
@@ -63,12 +63,8 @@ typedef struct range_tree {
*/
uint8_t rt_shift;
uint64_t rt_start;
range_tree_ops_t *rt_ops;
/* rt_btree_compare should only be set if rt_arg is a b-tree */
const range_tree_ops_t *rt_ops;
void *rt_arg;
int (*rt_btree_compare)(const void *, const void *);
uint64_t rt_gap; /* allowable inter-segment gap */
/*
@@ -278,11 +274,11 @@ rs_set_fill(range_seg_t *rs, range_tree_t *rt, uint64_t fill)
typedef void range_tree_func_t(void *arg, uint64_t start, uint64_t size);
range_tree_t *range_tree_create_impl(range_tree_ops_t *ops,
range_tree_t *range_tree_create_gap(const range_tree_ops_t *ops,
range_seg_type_t type, void *arg, uint64_t start, uint64_t shift,
int (*zfs_btree_compare) (const void *, const void *), uint64_t gap);
range_tree_t *range_tree_create(range_tree_ops_t *ops, range_seg_type_t type,
void *arg, uint64_t start, uint64_t shift);
uint64_t gap);
range_tree_t *range_tree_create(const range_tree_ops_t *ops,
range_seg_type_t type, void *arg, uint64_t start, uint64_t shift);
void range_tree_destroy(range_tree_t *rt);
boolean_t range_tree_contains(range_tree_t *rt, uint64_t start, uint64_t size);
range_seg_t *range_tree_find(range_tree_t *rt, uint64_t start, uint64_t size);
@@ -316,13 +312,6 @@ void range_tree_remove_xor_add_segment(uint64_t start, uint64_t end,
void range_tree_remove_xor_add(range_tree_t *rt, range_tree_t *removefrom,
range_tree_t *addto);
void rt_btree_create(range_tree_t *rt, void *arg);
void rt_btree_destroy(range_tree_t *rt, void *arg);
void rt_btree_add(range_tree_t *rt, range_seg_t *rs, void *arg);
void rt_btree_remove(range_tree_t *rt, range_seg_t *rs, void *arg);
void rt_btree_vacate(range_tree_t *rt, void *arg);
extern range_tree_ops_t rt_btree_ops;
#ifdef __cplusplus
}
#endif
-21
View File
@@ -72,27 +72,6 @@ struct dsl_pool;
struct dsl_dataset;
struct dsl_crypto_params;
/*
* We currently support block sizes from 512 bytes to 16MB.
* The benefits of larger blocks, and thus larger IO, need to be weighed
* against the cost of COWing a giant block to modify one byte, and the
* large latency of reading or writing a large block.
*
* Note that although blocks up to 16MB are supported, the recordsize
* property can not be set larger than zfs_max_recordsize (default 1MB).
* See the comment near zfs_max_recordsize in dsl_dataset.c for details.
*
* Note that although the LSIZE field of the blkptr_t can store sizes up
* to 32MB, the dnode's dn_datablkszsec can only store sizes up to
* 32MB - 512 bytes. Therefore, we limit SPA_MAXBLOCKSIZE to 16MB.
*/
#define SPA_MINBLOCKSHIFT 9
#define SPA_OLD_MAXBLOCKSHIFT 17
#define SPA_MAXBLOCKSHIFT 24
#define SPA_MINBLOCKSIZE (1ULL << SPA_MINBLOCKSHIFT)
#define SPA_OLD_MAXBLOCKSIZE (1ULL << SPA_OLD_MAXBLOCKSHIFT)
#define SPA_MAXBLOCKSIZE (1ULL << SPA_MAXBLOCKSHIFT)
/*
* Alignment Shift (ashift) is an immutable, internal top-level vdev property
* which can only be set at vdev creation time. Physical writes are always done
+2 -2
View File
@@ -146,9 +146,9 @@ typedef struct spa_config_lock {
kmutex_t scl_lock;
kthread_t *scl_writer;
int scl_write_wanted;
int scl_count;
kcondvar_t scl_cv;
zfs_refcount_t scl_count;
} spa_config_lock_t;
} ____cacheline_aligned spa_config_lock_t;
typedef struct spa_config_dirent {
list_node_t scd_link;
+7 -2
View File
@@ -30,7 +30,10 @@
typedef struct log_summary_entry {
uint64_t lse_start; /* start TXG */
uint64_t lse_end; /* last TXG */
uint64_t lse_txgcount; /* # of TXGs */
uint64_t lse_mscount; /* # of metaslabs needed to be flushed */
uint64_t lse_msdcount; /* # of dirty metaslabs needed to be flushed */
uint64_t lse_blkcount; /* blocks held by this entry */
list_node_t lse_node;
} log_summary_entry_t;
@@ -50,6 +53,7 @@ typedef struct spa_log_sm {
uint64_t sls_nblocks; /* number of blocks in this log */
uint64_t sls_mscount; /* # of metaslabs flushed in the log's txg */
avl_node_t sls_node; /* node in spa_sm_logs_by_txg */
space_map_t *sls_sm; /* space map pointer, if open */
} spa_log_sm_t;
int spa_ld_log_spacemaps(spa_t *);
@@ -68,8 +72,9 @@ uint64_t spa_log_sm_memused(spa_t *);
void spa_log_sm_decrement_mscount(spa_t *, uint64_t);
void spa_log_sm_increment_current_mscount(spa_t *);
void spa_log_summary_add_flushed_metaslab(spa_t *);
void spa_log_summary_decrement_mscount(spa_t *, uint64_t);
void spa_log_summary_add_flushed_metaslab(spa_t *, boolean_t);
void spa_log_summary_dirty_flushed_metaslab(spa_t *, uint64_t);
void spa_log_summary_decrement_mscount(spa_t *, uint64_t, boolean_t);
void spa_log_summary_decrement_blkcount(spa_t *, uint64_t);
boolean_t spa_flush_all_logs_requested(spa_t *);
+3
View File
@@ -244,6 +244,9 @@ extern "C" {
#define DEV_PATH "path"
#define DEV_IS_PART "is_slice"
#define DEV_SIZE "dev_size"
/* Size of the whole parent block device (if dev is a partition) */
#define DEV_PARENT_SIZE "dev_parent_size"
#endif /* __linux__ */
#define EV_V1 1
+1 -1
View File
@@ -78,7 +78,7 @@ extern void txg_register_callbacks(txg_handle_t *txghp, list_t *tx_callbacks);
extern void txg_delay(struct dsl_pool *dp, uint64_t txg, hrtime_t delta,
hrtime_t resolution);
extern void txg_kick(struct dsl_pool *dp);
extern void txg_kick(struct dsl_pool *dp, uint64_t txg);
/*
* Wait until the given transaction group has finished syncing.
+1
View File
@@ -642,6 +642,7 @@ extern int vdev_obsolete_counts_are_precise(vdev_t *vd, boolean_t *are_precise);
*/
int vdev_checkpoint_sm_object(vdev_t *vd, uint64_t *sm_obj);
void vdev_metaslab_group_create(vdev_t *vd);
uint64_t vdev_best_ashift(uint64_t logical, uint64_t a, uint64_t b);
/*
* Vdev ashift optimization tunables
+10 -1
View File
@@ -221,6 +221,15 @@ typedef struct {
uint64_t lr_foid; /* object id */
} lr_ooo_t;
/*
* Additional lr_attr_t fields.
*/
typedef struct {
uint64_t lr_attr_attrs; /* all of the attributes */
uint64_t lr_attr_crtime[2]; /* create time */
uint8_t lr_attr_scanstamp[32];
} lr_attr_end_t;
/*
* Handle option extended vattr attributes.
*
@@ -231,7 +240,7 @@ typedef struct {
typedef struct {
uint32_t lr_attr_masksize; /* number of elements in array */
uint32_t lr_attr_bitmap; /* First entry of array */
/* remainder of array and any additional fields */
/* remainder of array and additional lr_attr_end_t fields */
} lr_attr_t;
/*
+2 -17
View File
@@ -108,23 +108,6 @@ enum zio_checksum {
#define ZIO_DEDUPCHECKSUM ZIO_CHECKSUM_SHA256
/* supported encryption algorithms */
enum zio_encrypt {
ZIO_CRYPT_INHERIT = 0,
ZIO_CRYPT_ON,
ZIO_CRYPT_OFF,
ZIO_CRYPT_AES_128_CCM,
ZIO_CRYPT_AES_192_CCM,
ZIO_CRYPT_AES_256_CCM,
ZIO_CRYPT_AES_128_GCM,
ZIO_CRYPT_AES_192_GCM,
ZIO_CRYPT_AES_256_GCM,
ZIO_CRYPT_FUNCTIONS
};
#define ZIO_CRYPT_ON_VALUE ZIO_CRYPT_AES_256_GCM
#define ZIO_CRYPT_DEFAULT ZIO_CRYPT_OFF
/* macros defining encryption lengths */
#define ZIO_OBJSET_MAC_LEN 32
#define ZIO_DATA_IV_LEN 12
@@ -699,6 +682,8 @@ extern void spa_handle_ignored_writes(spa_t *spa);
/* zbookmark_phys functions */
boolean_t zbookmark_subtree_completed(const struct dnode_phys *dnp,
const zbookmark_phys_t *subtree_root, const zbookmark_phys_t *last_block);
boolean_t zbookmark_subtree_tbd(const struct dnode_phys *dnp,
const zbookmark_phys_t *subtree_root, const zbookmark_phys_t *last_block);
int zbookmark_compare(uint16_t dbss1, uint8_t ibs1, uint16_t dbss2,
uint8_t ibs2, const zbookmark_phys_t *zb1, const zbookmark_phys_t *zb2);
+3
View File
@@ -5,6 +5,9 @@ VPATH = $(top_srcdir)/module/avl/
# Includes kernel code, generate warnings for large stack frames
AM_CFLAGS += $(FRAME_LARGER_THAN)
# See https://debbugs.gnu.org/cgi/bugreport.cgi?bug=54020
AM_CFLAGS += -no-suppress
noinst_LTLIBRARIES = libavl.la
KERNEL_C = \
+3
View File
@@ -2,6 +2,9 @@ include $(top_srcdir)/config/Rules.am
AM_CFLAGS += $(LIBUUID_CFLAGS) $(ZLIB_CFLAGS)
# See https://debbugs.gnu.org/cgi/bugreport.cgi?bug=54020
AM_CFLAGS += -no-suppress
noinst_LTLIBRARIES = libefi.la
USER_C = \
+2 -3
View File
@@ -6,6 +6,8 @@ VPATH = \
# Includes kernel code, generate warnings for large stack frames
AM_CFLAGS += $(FRAME_LARGER_THAN)
# See https://debbugs.gnu.org/cgi/bugreport.cgi?bug=54020
AM_CFLAGS += -no-suppress
noinst_LTLIBRARIES = libicp.la
@@ -17,7 +19,6 @@ ASM_SOURCES_AS = \
asm-x86_64/modes/gcm_pclmulqdq.S \
asm-x86_64/modes/aesni-gcm-x86_64.S \
asm-x86_64/modes/ghash-x86_64.S \
asm-x86_64/sha1/sha1-x86_64.S \
asm-x86_64/sha2/sha256_impl.S \
asm-x86_64/sha2/sha512_impl.S
else
@@ -46,7 +47,6 @@ KERNEL_C = \
algs/modes/ctr.c \
algs/modes/ccm.c \
algs/modes/ecb.c \
algs/sha1/sha1.c \
algs/sha2/sha2.c \
algs/skein/skein.c \
algs/skein/skein_block.c \
@@ -54,7 +54,6 @@ KERNEL_C = \
illumos-crypto.c \
io/aes.c \
io/edonr_mod.c \
io/sha1_mod.c \
io/sha2_mod.c \
io/skein_mod.c \
os/modhash.c \
+3
View File
@@ -8,6 +8,9 @@ VPATH = \
# and required CFLAGS for libtirpc
AM_CFLAGS += $(FRAME_LARGER_THAN) $(LIBTIRPC_CFLAGS)
# See https://debbugs.gnu.org/cgi/bugreport.cgi?bug=54020
AM_CFLAGS += -no-suppress
lib_LTLIBRARIES = libnvpair.la
include $(top_srcdir)/config/Abigail.am
+3
View File
@@ -2,6 +2,9 @@ include $(top_srcdir)/config/Rules.am
DEFAULT_INCLUDES += -I$(srcdir)
# See https://debbugs.gnu.org/cgi/bugreport.cgi?bug=54020
AM_CFLAGS += -no-suppress
noinst_LTLIBRARIES = libshare.la
USER_C = \
+3
View File
@@ -2,6 +2,9 @@ include $(top_srcdir)/config/Rules.am
SUBDIRS = include
# See https://debbugs.gnu.org/cgi/bugreport.cgi?bug=54020
AM_CFLAGS += -no-suppress
noinst_LTLIBRARIES = libspl_assert.la libspl.la
libspl_assert_la_SOURCES = \
+6
View File
@@ -1,5 +1,11 @@
include $(top_srcdir)/config/Rules.am
# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61118
AM_CFLAGS += $(NO_CLOBBERED)
# See https://debbugs.gnu.org/cgi/bugreport.cgi?bug=54020
AM_CFLAGS += -no-suppress
noinst_LTLIBRARIES = libtpool.la
USER_C = \
+3
View File
@@ -5,6 +5,9 @@ VPATH = $(top_srcdir)/module/unicode
# Includes kernel code, generate warnings for large stack frames
AM_CFLAGS += $(FRAME_LARGER_THAN)
# See https://debbugs.gnu.org/cgi/bugreport.cgi?bug=54020
AM_CFLAGS += -no-suppress
noinst_LTLIBRARIES = libunicode.la
KERNEL_C = \
+3
View File
@@ -1,5 +1,8 @@
include $(top_srcdir)/config/Rules.am
# See https://debbugs.gnu.org/cgi/bugreport.cgi?bug=54020
AM_CFLAGS += -no-suppress
lib_LTLIBRARIES = libuutil.la
include $(top_srcdir)/config/Abigail.am
+3 -1
View File
@@ -6,9 +6,11 @@ VPATH = \
$(top_srcdir)/lib/libzfs
# Suppress unused but set variable warnings often due to ASSERTs
AM_CFLAGS += $(NO_UNUSED_BUT_SET_VARIABLE)
AM_CFLAGS += $(LIBCRYPTO_CFLAGS) $(ZLIB_CFLAGS)
# See https://debbugs.gnu.org/cgi/bugreport.cgi?bug=54020
AM_CFLAGS += -no-suppress
pkgconfig_DATA = libzfs.pc
lib_LTLIBRARIES = libzfs.la
+3
View File
@@ -2,6 +2,9 @@ include $(top_srcdir)/config/Rules.am
pkgconfig_DATA = libzfs_core.pc
# See https://debbugs.gnu.org/cgi/bugreport.cgi?bug=54020
AM_CFLAGS += -no-suppress
lib_LTLIBRARIES = libzfs_core.la
include $(top_srcdir)/config/Abigail.am
+3
View File
@@ -2,6 +2,9 @@ include $(top_srcdir)/config/Rules.am
pkgconfig_DATA = libzfsbootenv.pc
# See https://debbugs.gnu.org/cgi/bugreport.cgi?bug=54020
AM_CFLAGS += -no-suppress
lib_LTLIBRARIES = libzfsbootenv.la
include $(top_srcdir)/config/Abigail.am
+3 -3
View File
@@ -17,9 +17,6 @@ endif
# Unconditionally enable debugging for libzpool
AM_CPPFLAGS += -DDEBUG -UNDEBUG -DZFS_DEBUG
# Suppress unused but set variable warnings often due to ASSERTs
AM_CFLAGS += $(NO_UNUSED_BUT_SET_VARIABLE)
# Includes kernel code generate warnings for large stack frames
AM_CFLAGS += $(FRAME_LARGER_THAN)
@@ -27,6 +24,9 @@ AM_CFLAGS += $(ZLIB_CFLAGS)
AM_CFLAGS += -DLIB_ZPOOL_BUILD
# See https://debbugs.gnu.org/cgi/bugreport.cgi?bug=54020
AM_CFLAGS += -no-suppress
lib_LTLIBRARIES = libzpool.la
USER_C = \
+2
View File
@@ -5,6 +5,8 @@ VPATH = $(top_srcdir)/module/zstd
# -fno-tree-vectorize is set for gcc in zstd/common/compiler.h
# Set it for other compilers, too.
AM_CFLAGS += -fno-tree-vectorize
# See https://debbugs.gnu.org/cgi/bugreport.cgi?bug=54020
AM_CFLAGS += -no-suppress
noinst_LTLIBRARIES = libzstd.la
+3 -2
View File
@@ -1,9 +1,10 @@
include $(top_srcdir)/config/Rules.am
# Suppress unused but set variable warnings often due to ASSERTs
AM_CFLAGS += $(NO_UNUSED_BUT_SET_VARIABLE)
AM_CFLAGS += $(LIBBLKID_CFLAGS) $(LIBUDEV_CFLAGS)
# See https://debbugs.gnu.org/cgi/bugreport.cgi?bug=54020
AM_CFLAGS += -no-suppress
DEFAULT_INCLUDES += -I$(srcdir)
noinst_LTLIBRARIES = libzutil.la
+2
View File
@@ -1660,6 +1660,8 @@ zpool_find_import_cached(libpc_handle_t *hdl, importargs_t *iarg)
* caller.
*/
nvpair_t *pair = nvlist_next_nvpair(nv, NULL);
if (pair == NULL)
continue;
fnvlist_add_nvlist(pools, nvpair_name(pair),
fnvpair_value_nvlist(pair));
+63 -11
View File
@@ -109,6 +109,11 @@ A value of
.Sy 100
disables this feature.
.
.It Sy l2arc_exclude_special Ns = Ns Sy 0 Ns | Ns 1 Pq int
Controls whether buffers present on special vdevs are eligibile for caching
into L2ARC.
If set to 1, exclude dbufs on special vdevs from being cached to L2ARC.
.
.It Sy l2arc_mfuonly Ns = Ns Sy 0 Ns | Ns 1 Pq int
Controls whether only MFU metadata and data are cached from ARC into L2ARC.
This may be desired to avoid wasting space on L2ARC when reading/writing large
@@ -213,12 +218,12 @@ For L2ARC devices less than 1GB, the amount of data
evicts is significant compared to the amount of restored L2ARC data.
In this case, do not write log blocks in L2ARC in order not to waste space.
.
.It Sy metaslab_aliquot Ns = Ns Sy 524288 Ns B Po 512kB Pc Pq ulong
.It Sy metaslab_aliquot Ns = Ns Sy 1048576 Ns B Po 1MB Pc Pq ulong
Metaslab granularity, in bytes.
This is roughly similar to what would be referred to as the "stripe size"
in traditional RAID arrays.
In normal operation, ZFS will try to write this amount of data
to a top-level vdev before moving on to the next one.
In normal operation, ZFS will try to write this amount of data to each disk
before moving on to the next top-level vdev.
.
.It Sy metaslab_bias_enabled Ns = Ns Sy 1 Ns | Ns 0 Pq int
Enable metaslab group biasing based on their vdevs' over- or under-utilization
@@ -342,9 +347,12 @@ When a vdev is added, target this number of metaslabs per top-level vdev.
.It Sy zfs_vdev_default_ms_shift Ns = Ns Sy 29 Po 512MB Pc Pq int
Default limit for metaslab size.
.
.It Sy zfs_vdev_max_auto_ashift Ns = Ns Sy ASHIFT_MAX Po 16 Pc Pq ulong
.It Sy zfs_vdev_max_auto_ashift Ns = Ns Sy 14 Pq ulong
Maximum ashift used when optimizing for logical -> physical sector size on new
top-level vdevs.
May be increased up to
.Sy ASHIFT_MAX Po 16 Pc ,
but this may negatively impact pool space efficiency.
.
.It Sy zfs_vdev_min_auto_ashift Ns = Ns Sy ASHIFT_MIN Po 9 Pc Pq ulong
Minimum ashift used when creating new top-level vdevs.
@@ -475,7 +483,15 @@ However, this is limited by
.It Sy zfetch_array_rd_sz Ns = Ns Sy 1048576 Ns B Po 1MB Pc Pq ulong
If prefetching is enabled, disable prefetching for reads larger than this size.
.
.It Sy zfetch_max_distance Ns = Ns Sy 8388608 Ns B Po 8MB Pc Pq uint
.It Sy zfetch_min_distance Ns = Ns Sy 4194304 Ns B Po 4 MiB Pc Pq uint
Min bytes to prefetch per stream.
Prefetch distance starts from the demand access size and quickly grows to
this value, doubling on each hit.
After that it may grow further by 1/8 per hit, but only if some prefetch
since last time haven't completed in time to satisfy demand request, i.e.
prefetch depth didn't cover the read latency or the pool got saturated.
.
.It Sy zfetch_max_distance Ns = Ns Sy 67108864 Ns B Po 64 MiB Pc Pq uint
Max bytes to prefetch per stream.
.
.It Sy zfetch_max_idistance Ns = Ns Sy 67108864 Ns B Po 64MB Pc Pq uint
@@ -484,8 +500,11 @@ Max bytes to prefetch indirects for per stream.
.It Sy zfetch_max_streams Ns = Ns Sy 8 Pq uint
Max number of streams per zfetch (prefetch streams per file).
.
.It Sy zfetch_min_sec_reap Ns = Ns Sy 2 Pq uint
Min time before an active prefetch stream can be reclaimed
.It Sy zfetch_min_sec_reap Ns = Ns Sy 1 Pq uint
Min time before inactive prefetch stream can be reclaimed
.
.It Sy zfetch_max_sec_reap Ns = Ns Sy 2 Pq uint
Max time before inactive prefetch stream can be deleted
.
.It Sy zfs_abd_scatter_enabled Ns = Ns Sy 1 Ns | Ns 0 Pq int
Enables ARC from using scatter/gather lists and forces all allocations to be
@@ -966,13 +985,13 @@ log spacemap in memory, in bytes.
Part of overall system memory that ZFS allows to be used
for unflushed metadata changes by the log spacemap, in millionths.
.
.It Sy zfs_unflushed_log_block_max Ns = Ns Sy 262144 Po 256k Pc Pq ulong
.It Sy zfs_unflushed_log_block_max Ns = Ns Sy 131072 Po 128k Pc Pq ulong
Describes the maximum number of log spacemap blocks allowed for each pool.
The default value means that the space in all the log spacemaps
can add up to no more than
.Sy 262144
.Sy 131072
blocks (which means
.Em 32GB
.Em 16GB
of logical space before compression and ditto blocks,
assuming that blocksize is
.Em 128kB ) .
@@ -1002,7 +1021,12 @@ Thus we always allow at least this many log blocks.
.It Sy zfs_unflushed_log_block_pct Ns = Ns Sy 400 Ns % Pq ulong
Tunable used to determine the number of blocks that can be used for
the spacemap log, expressed as a percentage of the total number of
metaslabs in the pool.
unflushed metaslabs in the pool.
.
.It Sy zfs_unflushed_log_txg_max Ns = Ns Sy 1000 Pq ulong
Tunable limiting maximum time in TXGs any metaslab may remain unflushed.
It effectively limits maximum number of unflushed per-TXG spacemap logs
that need to be read after unclean pool export.
.
.It Sy zfs_unlink_suspend_progress Ns = Ns Sy 0 Ns | Ns 1 Pq uint
When enabled, files will not be asynchronously removed from the list of pending
@@ -1075,6 +1099,18 @@ Start syncing out a transaction group if there's at least this much dirty data
This should be less than
.Sy zfs_vdev_async_write_active_min_dirty_percent .
.
.It Sy zfs_wrlog_data_max Ns = Pq int
The upper limit of write-transaction zil log data size in bytes.
Write operations are throttled when approaching the limit until log data is
cleared out after transaction group sync.
Because of some overhead, it should be set at least 2 times the size of
.Sy zfs_dirty_data_max
.No to prevent harming normal write throughput.
It also should be smaller than the size of the slog device if slog is present.
.Pp
Defaults to
.Sy zfs_dirty_data_max*2
.
.It Sy zfs_fallocate_reserve_percent Ns = Ns Sy 110 Ns % Pq uint
Since ZFS is a copy-on-write filesystem with snapshots, blocks cannot be
preallocated for a file in order to guarantee that later writes will not
@@ -1312,6 +1348,22 @@ _
.TE
.Sy \& * No Requires debug build.
.
.It Sy zfs_btree_verify_intensity Ns = Ns Sy 0 Pq uint
Enables btree verification.
The following settings are culminative:
.TS
box;
lbz r l l .
Value Description
1 Verify height.
2 Verify pointers from children to parent.
3 Verify element counts.
4 Verify element order. (expensive)
* 5 Verify unused memory is poisoned. (expensive)
.TE
.Sy \& * No Requires debug build.
.
.It Sy zfs_free_leak_on_eio Ns = Ns Sy 0 Ns | Ns 1 Pq int
If destroy encounters an
.Sy EIO
+17 -1
View File
@@ -23,7 +23,7 @@
.Nd display ZFS storage pool debugging and consistency information
.Sh SYNOPSIS
.Nm
.Op Fl AbcdDFGhikLMPsvXYy
.Op Fl AbcdDFGhikLMNPsvXYy
.Op Fl e Oo Fl V Oc Oo Fl p Ar path Oc Ns …
.Op Fl I Ar inflight I/Os
.Oo Fl o Ar var Ns = Ns Ar value Oc Ns …
@@ -137,6 +137,14 @@ also display the configuration that would be used were the pool to be imported.
Display information about datasets.
Specified once, displays basic dataset information: ID, create transaction,
size, and object count.
See
.Fl N
for determining if
.Op Ar poolname Ns Op / Ns Ar dataset | objset ID
is to use the specified
.Op Ar dataset | objset ID
as a string (dataset name) or a number (objset ID) when
datasets have numeric names.
.Pp
If specified multiple times provides greater and greater verbosity.
.Pp
@@ -272,6 +280,14 @@ Also display information about the maximum contiguous free space and the
percentage of free space in each space map.
.It Fl MMM
Display every spacemap record.
.It Fl N
Same as
.Fl d
but force zdb to interpret the
.Op Ar dataset | objset ID
in
.Op Ar poolname Ns Op / Ns Ar dataset | objset ID
as a numeric objset ID.
.It Fl O Ar dataset path
Look up the specified
.Ar path
+1
View File
@@ -22,5 +22,6 @@
/export_syms
/machine
/x86
/i386
!Makefile.in
+4 -20
View File
@@ -108,21 +108,6 @@
#include <sys/cmn_err.h>
#include <sys/mod.h>
/*
* Small arrays to translate between balance (or diff) values and child indices.
*
* Code that deals with binary tree data structures will randomly use
* left and right children when examining a tree. C "if()" statements
* which evaluate randomly suffer from very poor hardware branch prediction.
* In this code we avoid some of the branch mispredictions by using the
* following translation arrays. They replace random branches with an
* additional memory reference. Since the translation arrays are both very
* small the data should remain efficiently in cache.
*/
static const int avl_child2balance[2] = {-1, 1};
static const int avl_balance2child[] = {0, 0, 1};
/*
* Walk from one node to the previous valued node (ie. an infix walk
* towards the left). At any given node we do one of 2 things:
@@ -278,8 +263,7 @@ avl_find(avl_tree_t *tree, const void *value, avl_index_t *where)
#endif
return (AVL_NODE2DATA(node, off));
}
child = avl_balance2child[1 + diff];
child = (diff > 0);
}
if (where != NULL)
@@ -531,7 +515,7 @@ avl_insert(avl_tree_t *tree, void *new_data, avl_index_t where)
* Compute the new balance
*/
old_balance = AVL_XBALANCE(node);
new_balance = old_balance + avl_child2balance[which_child];
new_balance = old_balance + (which_child ? 1 : -1);
/*
* If we introduced equal balance, then we are done immediately
@@ -697,7 +681,7 @@ avl_remove(avl_tree_t *tree, void *data)
* choose node to swap from whichever side is taller
*/
old_balance = AVL_XBALANCE(delete);
left = avl_balance2child[old_balance + 1];
left = (old_balance > 0);
right = 1 - left;
/*
@@ -781,7 +765,7 @@ avl_remove(avl_tree_t *tree, void *data)
*/
node = parent;
old_balance = AVL_XBALANCE(node);
new_balance = old_balance - avl_child2balance[which_child];
new_balance = old_balance - (which_child ? 1 : -1);
parent = AVL_XPARENT(node);
which_child = AVL_XCHILD(node);
-6
View File
@@ -27,7 +27,6 @@ $(MODULE)-objs += core/kcf_prov_lib.o
$(MODULE)-objs += spi/kcf_spi.o
$(MODULE)-objs += io/aes.o
$(MODULE)-objs += io/edonr_mod.o
$(MODULE)-objs += io/sha1_mod.o
$(MODULE)-objs += io/sha2_mod.o
$(MODULE)-objs += io/skein_mod.o
$(MODULE)-objs += os/modhash.o
@@ -43,7 +42,6 @@ $(MODULE)-objs += algs/aes/aes_impl_generic.o
$(MODULE)-objs += algs/aes/aes_impl.o
$(MODULE)-objs += algs/aes/aes_modes.o
$(MODULE)-objs += algs/edonr/edonr.o
$(MODULE)-objs += algs/sha1/sha1.o
$(MODULE)-objs += algs/sha2/sha2.o
$(MODULE)-objs += algs/skein/skein.o
$(MODULE)-objs += algs/skein/skein_block.o
@@ -55,7 +53,6 @@ $(MODULE)-$(CONFIG_X86_64) += asm-x86_64/aes/aes_aesni.o
$(MODULE)-$(CONFIG_X86_64) += asm-x86_64/modes/gcm_pclmulqdq.o
$(MODULE)-$(CONFIG_X86_64) += asm-x86_64/modes/aesni-gcm-x86_64.o
$(MODULE)-$(CONFIG_X86_64) += asm-x86_64/modes/ghash-x86_64.o
$(MODULE)-$(CONFIG_X86_64) += asm-x86_64/sha1/sha1-x86_64.o
$(MODULE)-$(CONFIG_X86_64) += asm-x86_64/sha2/sha256_impl.o
$(MODULE)-$(CONFIG_X86_64) += asm-x86_64/sha2/sha512_impl.o
@@ -72,7 +69,6 @@ OBJECT_FILES_NON_STANDARD_ghash-x86_64.o := y
# Suppress objtool "unsupported stack pointer realignment" warnings. We are
# not using a DRAP register while aligning the stack to a 64 byte boundary.
# See #6950 for the reasoning.
OBJECT_FILES_NON_STANDARD_sha1-x86_64.o := y
OBJECT_FILES_NON_STANDARD_sha256_impl.o := y
OBJECT_FILES_NON_STANDARD_sha512_impl.o := y
@@ -86,13 +82,11 @@ ICP_DIRS = \
algs/aes \
algs/edonr \
algs/modes \
algs/sha1 \
algs/sha2 \
algs/skein \
asm-x86_64 \
asm-x86_64/aes \
asm-x86_64/modes \
asm-x86_64/sha1 \
asm-x86_64/sha2 \
asm-i386 \
asm-generic
+1 -1
View File
@@ -488,7 +488,7 @@ EdonRInit(EdonRState *state, size_t hashbitlen)
state->hashbitlen = 512;
state->bits_processed = 0;
state->unprocessed_bits = 0;
bcopy(i512p2, hashState224(state)->DoublePipe,
bcopy(i512p2, hashState512(state)->DoublePipe,
16 * sizeof (uint64_t));
break;
}
-835
View File
@@ -1,835 +0,0 @@
/*
* Copyright 2009 Sun Microsystems, Inc. All rights reserved.
* Use is subject to license terms.
*/
/*
* The basic framework for this code came from the reference
* implementation for MD5. That implementation is Copyright (C)
* 1991-2, RSA Data Security, Inc. Created 1991. All rights reserved.
*
* License to copy and use this software is granted provided that it
* is identified as the "RSA Data Security, Inc. MD5 Message-Digest
* Algorithm" in all material mentioning or referencing this software
* or this function.
*
* License is also granted to make and use derivative works provided
* that such works are identified as "derived from the RSA Data
* Security, Inc. MD5 Message-Digest Algorithm" in all material
* mentioning or referencing the derived work.
*
* RSA Data Security, Inc. makes no representations concerning either
* the merchantability of this software or the suitability of this
* software for any particular purpose. It is provided "as is"
* without express or implied warranty of any kind.
*
* These notices must be retained in any copies of any part of this
* documentation and/or software.
*
* NOTE: Cleaned-up and optimized, version of SHA1, based on the FIPS 180-1
* standard, available at http://www.itl.nist.gov/fipspubs/fip180-1.htm
* Not as fast as one would like -- further optimizations are encouraged
* and appreciated.
*/
#include <sys/zfs_context.h>
#include <sha1/sha1.h>
#include <sha1/sha1_consts.h>
#ifdef _LITTLE_ENDIAN
#include <sys/byteorder.h>
#define HAVE_HTONL
#endif
#define _RESTRICT_KYWD
static void Encode(uint8_t *, const uint32_t *, size_t);
#if defined(__sparc)
#define SHA1_TRANSFORM(ctx, in) \
SHA1Transform((ctx)->state[0], (ctx)->state[1], (ctx)->state[2], \
(ctx)->state[3], (ctx)->state[4], (ctx), (in))
static void SHA1Transform(uint32_t, uint32_t, uint32_t, uint32_t, uint32_t,
SHA1_CTX *, const uint8_t *);
#elif defined(__amd64)
#define SHA1_TRANSFORM(ctx, in) sha1_block_data_order((ctx), (in), 1)
#define SHA1_TRANSFORM_BLOCKS(ctx, in, num) sha1_block_data_order((ctx), \
(in), (num))
void sha1_block_data_order(SHA1_CTX *ctx, const void *inpp, size_t num_blocks);
#else
#define SHA1_TRANSFORM(ctx, in) SHA1Transform((ctx), (in))
static void SHA1Transform(SHA1_CTX *, const uint8_t *);
#endif
static uint8_t PADDING[64] = { 0x80, /* all zeros */ };
/*
* F, G, and H are the basic SHA1 functions.
*/
#define F(b, c, d) (((b) & (c)) | ((~b) & (d)))
#define G(b, c, d) ((b) ^ (c) ^ (d))
#define H(b, c, d) (((b) & (c)) | (((b)|(c)) & (d)))
/*
* SHA1Init()
*
* purpose: initializes the sha1 context and begins and sha1 digest operation
* input: SHA1_CTX * : the context to initializes.
* output: void
*/
void
SHA1Init(SHA1_CTX *ctx)
{
ctx->count[0] = ctx->count[1] = 0;
/*
* load magic initialization constants. Tell lint
* that these constants are unsigned by using U.
*/
ctx->state[0] = 0x67452301U;
ctx->state[1] = 0xefcdab89U;
ctx->state[2] = 0x98badcfeU;
ctx->state[3] = 0x10325476U;
ctx->state[4] = 0xc3d2e1f0U;
}
void
SHA1Update(SHA1_CTX *ctx, const void *inptr, size_t input_len)
{
uint32_t i, buf_index, buf_len;
const uint8_t *input = inptr;
#if defined(__amd64)
uint32_t block_count;
#endif /* __amd64 */
/* check for noop */
if (input_len == 0)
return;
/* compute number of bytes mod 64 */
buf_index = (ctx->count[1] >> 3) & 0x3F;
/* update number of bits */
if ((ctx->count[1] += (input_len << 3)) < (input_len << 3))
ctx->count[0]++;
ctx->count[0] += (input_len >> 29);
buf_len = 64 - buf_index;
/* transform as many times as possible */
i = 0;
if (input_len >= buf_len) {
/*
* general optimization:
*
* only do initial bcopy() and SHA1Transform() if
* buf_index != 0. if buf_index == 0, we're just
* wasting our time doing the bcopy() since there
* wasn't any data left over from a previous call to
* SHA1Update().
*/
if (buf_index) {
bcopy(input, &ctx->buf_un.buf8[buf_index], buf_len);
SHA1_TRANSFORM(ctx, ctx->buf_un.buf8);
i = buf_len;
}
#if !defined(__amd64)
for (; i + 63 < input_len; i += 64)
SHA1_TRANSFORM(ctx, &input[i]);
#else
block_count = (input_len - i) >> 6;
if (block_count > 0) {
SHA1_TRANSFORM_BLOCKS(ctx, &input[i], block_count);
i += block_count << 6;
}
#endif /* !__amd64 */
/*
* general optimization:
*
* if i and input_len are the same, return now instead
* of calling bcopy(), since the bcopy() in this case
* will be an expensive nop.
*/
if (input_len == i)
return;
buf_index = 0;
}
/* buffer remaining input */
bcopy(&input[i], &ctx->buf_un.buf8[buf_index], input_len - i);
}
/*
* SHA1Final()
*
* purpose: ends an sha1 digest operation, finalizing the message digest and
* zeroing the context.
* input: uchar_t * : A buffer to store the digest.
* : The function actually uses void* because many
* : callers pass things other than uchar_t here.
* SHA1_CTX * : the context to finalize, save, and zero
* output: void
*/
void
SHA1Final(void *digest, SHA1_CTX *ctx)
{
uint8_t bitcount_be[sizeof (ctx->count)];
uint32_t index = (ctx->count[1] >> 3) & 0x3f;
/* store bit count, big endian */
Encode(bitcount_be, ctx->count, sizeof (bitcount_be));
/* pad out to 56 mod 64 */
SHA1Update(ctx, PADDING, ((index < 56) ? 56 : 120) - index);
/* append length (before padding) */
SHA1Update(ctx, bitcount_be, sizeof (bitcount_be));
/* store state in digest */
Encode(digest, ctx->state, sizeof (ctx->state));
/* zeroize sensitive information */
bzero(ctx, sizeof (*ctx));
}
#if !defined(__amd64)
typedef uint32_t sha1word;
/*
* sparc optimization:
*
* on the sparc, we can load big endian 32-bit data easily. note that
* special care must be taken to ensure the address is 32-bit aligned.
* in the interest of speed, we don't check to make sure, since
* careful programming can guarantee this for us.
*/
#if defined(_ZFS_BIG_ENDIAN)
#define LOAD_BIG_32(addr) (*(uint32_t *)(addr))
#elif defined(HAVE_HTONL)
#define LOAD_BIG_32(addr) htonl(*((uint32_t *)(addr)))
#else
#define LOAD_BIG_32(addr) BE_32(*((uint32_t *)(addr)))
#endif /* _BIG_ENDIAN */
/*
* SHA1Transform()
*/
#if defined(W_ARRAY)
#define W(n) w[n]
#else /* !defined(W_ARRAY) */
#define W(n) w_ ## n
#endif /* !defined(W_ARRAY) */
/*
* ROTATE_LEFT rotates x left n bits.
*/
#if defined(__GNUC__) && defined(_LP64)
static __inline__ uint64_t
ROTATE_LEFT(uint64_t value, uint32_t n)
{
uint32_t t32;
t32 = (uint32_t)value;
return ((t32 << n) | (t32 >> (32 - n)));
}
#else
#define ROTATE_LEFT(x, n) \
(((x) << (n)) | ((x) >> ((sizeof (x) * NBBY)-(n))))
#endif
#if defined(__sparc)
/*
* sparc register window optimization:
*
* `a', `b', `c', `d', and `e' are passed into SHA1Transform
* explicitly since it increases the number of registers available to
* the compiler. under this scheme, these variables can be held in
* %i0 - %i4, which leaves more local and out registers available.
*
* purpose: sha1 transformation -- updates the digest based on `block'
* input: uint32_t : bytes 1 - 4 of the digest
* uint32_t : bytes 5 - 8 of the digest
* uint32_t : bytes 9 - 12 of the digest
* uint32_t : bytes 12 - 16 of the digest
* uint32_t : bytes 16 - 20 of the digest
* SHA1_CTX * : the context to update
* uint8_t [64]: the block to use to update the digest
* output: void
*/
void
SHA1Transform(uint32_t a, uint32_t b, uint32_t c, uint32_t d, uint32_t e,
SHA1_CTX *ctx, const uint8_t blk[64])
{
/*
* sparc optimization:
*
* while it is somewhat counter-intuitive, on sparc, it is
* more efficient to place all the constants used in this
* function in an array and load the values out of the array
* than to manually load the constants. this is because
* setting a register to a 32-bit value takes two ops in most
* cases: a `sethi' and an `or', but loading a 32-bit value
* from memory only takes one `ld' (or `lduw' on v9). while
* this increases memory usage, the compiler can find enough
* other things to do while waiting to keep the pipeline does
* not stall. additionally, it is likely that many of these
* constants are cached so that later accesses do not even go
* out to the bus.
*
* this array is declared `static' to keep the compiler from
* having to bcopy() this array onto the stack frame of
* SHA1Transform() each time it is called -- which is
* unacceptably expensive.
*
* the `const' is to ensure that callers are good citizens and
* do not try to munge the array. since these routines are
* going to be called from inside multithreaded kernelland,
* this is a good safety check. -- `sha1_consts' will end up in
* .rodata.
*
* unfortunately, loading from an array in this manner hurts
* performance under Intel. So, there is a macro,
* SHA1_CONST(), used in SHA1Transform(), that either expands to
* a reference to this array, or to the actual constant,
* depending on what platform this code is compiled for.
*/
static const uint32_t sha1_consts[] = {
SHA1_CONST_0, SHA1_CONST_1, SHA1_CONST_2, SHA1_CONST_3
};
/*
* general optimization:
*
* use individual integers instead of using an array. this is a
* win, although the amount it wins by seems to vary quite a bit.
*/
uint32_t w_0, w_1, w_2, w_3, w_4, w_5, w_6, w_7;
uint32_t w_8, w_9, w_10, w_11, w_12, w_13, w_14, w_15;
/*
* sparc optimization:
*
* if `block' is already aligned on a 4-byte boundary, use
* LOAD_BIG_32() directly. otherwise, bcopy() into a
* buffer that *is* aligned on a 4-byte boundary and then do
* the LOAD_BIG_32() on that buffer. benchmarks have shown
* that using the bcopy() is better than loading the bytes
* individually and doing the endian-swap by hand.
*
* even though it's quite tempting to assign to do:
*
* blk = bcopy(ctx->buf_un.buf32, blk, sizeof (ctx->buf_un.buf32));
*
* and only have one set of LOAD_BIG_32()'s, the compiler
* *does not* like that, so please resist the urge.
*/
if ((uintptr_t)blk & 0x3) { /* not 4-byte aligned? */
bcopy(blk, ctx->buf_un.buf32, sizeof (ctx->buf_un.buf32));
w_15 = LOAD_BIG_32(ctx->buf_un.buf32 + 15);
w_14 = LOAD_BIG_32(ctx->buf_un.buf32 + 14);
w_13 = LOAD_BIG_32(ctx->buf_un.buf32 + 13);
w_12 = LOAD_BIG_32(ctx->buf_un.buf32 + 12);
w_11 = LOAD_BIG_32(ctx->buf_un.buf32 + 11);
w_10 = LOAD_BIG_32(ctx->buf_un.buf32 + 10);
w_9 = LOAD_BIG_32(ctx->buf_un.buf32 + 9);
w_8 = LOAD_BIG_32(ctx->buf_un.buf32 + 8);
w_7 = LOAD_BIG_32(ctx->buf_un.buf32 + 7);
w_6 = LOAD_BIG_32(ctx->buf_un.buf32 + 6);
w_5 = LOAD_BIG_32(ctx->buf_un.buf32 + 5);
w_4 = LOAD_BIG_32(ctx->buf_un.buf32 + 4);
w_3 = LOAD_BIG_32(ctx->buf_un.buf32 + 3);
w_2 = LOAD_BIG_32(ctx->buf_un.buf32 + 2);
w_1 = LOAD_BIG_32(ctx->buf_un.buf32 + 1);
w_0 = LOAD_BIG_32(ctx->buf_un.buf32 + 0);
} else {
/* LINTED E_BAD_PTR_CAST_ALIGN */
w_15 = LOAD_BIG_32(blk + 60);
/* LINTED E_BAD_PTR_CAST_ALIGN */
w_14 = LOAD_BIG_32(blk + 56);
/* LINTED E_BAD_PTR_CAST_ALIGN */
w_13 = LOAD_BIG_32(blk + 52);
/* LINTED E_BAD_PTR_CAST_ALIGN */
w_12 = LOAD_BIG_32(blk + 48);
/* LINTED E_BAD_PTR_CAST_ALIGN */
w_11 = LOAD_BIG_32(blk + 44);
/* LINTED E_BAD_PTR_CAST_ALIGN */
w_10 = LOAD_BIG_32(blk + 40);
/* LINTED E_BAD_PTR_CAST_ALIGN */
w_9 = LOAD_BIG_32(blk + 36);
/* LINTED E_BAD_PTR_CAST_ALIGN */
w_8 = LOAD_BIG_32(blk + 32);
/* LINTED E_BAD_PTR_CAST_ALIGN */
w_7 = LOAD_BIG_32(blk + 28);
/* LINTED E_BAD_PTR_CAST_ALIGN */
w_6 = LOAD_BIG_32(blk + 24);
/* LINTED E_BAD_PTR_CAST_ALIGN */
w_5 = LOAD_BIG_32(blk + 20);
/* LINTED E_BAD_PTR_CAST_ALIGN */
w_4 = LOAD_BIG_32(blk + 16);
/* LINTED E_BAD_PTR_CAST_ALIGN */
w_3 = LOAD_BIG_32(blk + 12);
/* LINTED E_BAD_PTR_CAST_ALIGN */
w_2 = LOAD_BIG_32(blk + 8);
/* LINTED E_BAD_PTR_CAST_ALIGN */
w_1 = LOAD_BIG_32(blk + 4);
/* LINTED E_BAD_PTR_CAST_ALIGN */
w_0 = LOAD_BIG_32(blk + 0);
}
#else /* !defined(__sparc) */
void /* CSTYLED */
SHA1Transform(SHA1_CTX *ctx, const uint8_t blk[64])
{
/* CSTYLED */
sha1word a = ctx->state[0];
sha1word b = ctx->state[1];
sha1word c = ctx->state[2];
sha1word d = ctx->state[3];
sha1word e = ctx->state[4];
#if defined(W_ARRAY)
sha1word w[16];
#else /* !defined(W_ARRAY) */
sha1word w_0, w_1, w_2, w_3, w_4, w_5, w_6, w_7;
sha1word w_8, w_9, w_10, w_11, w_12, w_13, w_14, w_15;
#endif /* !defined(W_ARRAY) */
W(0) = LOAD_BIG_32((void *)(blk + 0));
W(1) = LOAD_BIG_32((void *)(blk + 4));
W(2) = LOAD_BIG_32((void *)(blk + 8));
W(3) = LOAD_BIG_32((void *)(blk + 12));
W(4) = LOAD_BIG_32((void *)(blk + 16));
W(5) = LOAD_BIG_32((void *)(blk + 20));
W(6) = LOAD_BIG_32((void *)(blk + 24));
W(7) = LOAD_BIG_32((void *)(blk + 28));
W(8) = LOAD_BIG_32((void *)(blk + 32));
W(9) = LOAD_BIG_32((void *)(blk + 36));
W(10) = LOAD_BIG_32((void *)(blk + 40));
W(11) = LOAD_BIG_32((void *)(blk + 44));
W(12) = LOAD_BIG_32((void *)(blk + 48));
W(13) = LOAD_BIG_32((void *)(blk + 52));
W(14) = LOAD_BIG_32((void *)(blk + 56));
W(15) = LOAD_BIG_32((void *)(blk + 60));
#endif /* !defined(__sparc) */
/*
* general optimization:
*
* even though this approach is described in the standard as
* being slower algorithmically, it is 30-40% faster than the
* "faster" version under SPARC, because this version has more
* of the constraints specified at compile-time and uses fewer
* variables (and therefore has better register utilization)
* than its "speedier" brother. (i've tried both, trust me)
*
* for either method given in the spec, there is an "assignment"
* phase where the following takes place:
*
* tmp = (main_computation);
* e = d; d = c; c = rotate_left(b, 30); b = a; a = tmp;
*
* we can make the algorithm go faster by not doing this work,
* but just pretending that `d' is now `e', etc. this works
* really well and obviates the need for a temporary variable.
* however, we still explicitly perform the rotate action,
* since it is cheaper on SPARC to do it once than to have to
* do it over and over again.
*/
/* round 1 */
e = ROTATE_LEFT(a, 5) + F(b, c, d) + e + W(0) + SHA1_CONST(0); /* 0 */
b = ROTATE_LEFT(b, 30);
d = ROTATE_LEFT(e, 5) + F(a, b, c) + d + W(1) + SHA1_CONST(0); /* 1 */
a = ROTATE_LEFT(a, 30);
c = ROTATE_LEFT(d, 5) + F(e, a, b) + c + W(2) + SHA1_CONST(0); /* 2 */
e = ROTATE_LEFT(e, 30);
b = ROTATE_LEFT(c, 5) + F(d, e, a) + b + W(3) + SHA1_CONST(0); /* 3 */
d = ROTATE_LEFT(d, 30);
a = ROTATE_LEFT(b, 5) + F(c, d, e) + a + W(4) + SHA1_CONST(0); /* 4 */
c = ROTATE_LEFT(c, 30);
e = ROTATE_LEFT(a, 5) + F(b, c, d) + e + W(5) + SHA1_CONST(0); /* 5 */
b = ROTATE_LEFT(b, 30);
d = ROTATE_LEFT(e, 5) + F(a, b, c) + d + W(6) + SHA1_CONST(0); /* 6 */
a = ROTATE_LEFT(a, 30);
c = ROTATE_LEFT(d, 5) + F(e, a, b) + c + W(7) + SHA1_CONST(0); /* 7 */
e = ROTATE_LEFT(e, 30);
b = ROTATE_LEFT(c, 5) + F(d, e, a) + b + W(8) + SHA1_CONST(0); /* 8 */
d = ROTATE_LEFT(d, 30);
a = ROTATE_LEFT(b, 5) + F(c, d, e) + a + W(9) + SHA1_CONST(0); /* 9 */
c = ROTATE_LEFT(c, 30);
e = ROTATE_LEFT(a, 5) + F(b, c, d) + e + W(10) + SHA1_CONST(0); /* 10 */
b = ROTATE_LEFT(b, 30);
d = ROTATE_LEFT(e, 5) + F(a, b, c) + d + W(11) + SHA1_CONST(0); /* 11 */
a = ROTATE_LEFT(a, 30);
c = ROTATE_LEFT(d, 5) + F(e, a, b) + c + W(12) + SHA1_CONST(0); /* 12 */
e = ROTATE_LEFT(e, 30);
b = ROTATE_LEFT(c, 5) + F(d, e, a) + b + W(13) + SHA1_CONST(0); /* 13 */
d = ROTATE_LEFT(d, 30);
a = ROTATE_LEFT(b, 5) + F(c, d, e) + a + W(14) + SHA1_CONST(0); /* 14 */
c = ROTATE_LEFT(c, 30);
e = ROTATE_LEFT(a, 5) + F(b, c, d) + e + W(15) + SHA1_CONST(0); /* 15 */
b = ROTATE_LEFT(b, 30);
W(0) = ROTATE_LEFT((W(13) ^ W(8) ^ W(2) ^ W(0)), 1); /* 16 */
d = ROTATE_LEFT(e, 5) + F(a, b, c) + d + W(0) + SHA1_CONST(0);
a = ROTATE_LEFT(a, 30);
W(1) = ROTATE_LEFT((W(14) ^ W(9) ^ W(3) ^ W(1)), 1); /* 17 */
c = ROTATE_LEFT(d, 5) + F(e, a, b) + c + W(1) + SHA1_CONST(0);
e = ROTATE_LEFT(e, 30);
W(2) = ROTATE_LEFT((W(15) ^ W(10) ^ W(4) ^ W(2)), 1); /* 18 */
b = ROTATE_LEFT(c, 5) + F(d, e, a) + b + W(2) + SHA1_CONST(0);
d = ROTATE_LEFT(d, 30);
W(3) = ROTATE_LEFT((W(0) ^ W(11) ^ W(5) ^ W(3)), 1); /* 19 */
a = ROTATE_LEFT(b, 5) + F(c, d, e) + a + W(3) + SHA1_CONST(0);
c = ROTATE_LEFT(c, 30);
/* round 2 */
W(4) = ROTATE_LEFT((W(1) ^ W(12) ^ W(6) ^ W(4)), 1); /* 20 */
e = ROTATE_LEFT(a, 5) + G(b, c, d) + e + W(4) + SHA1_CONST(1);
b = ROTATE_LEFT(b, 30);
W(5) = ROTATE_LEFT((W(2) ^ W(13) ^ W(7) ^ W(5)), 1); /* 21 */
d = ROTATE_LEFT(e, 5) + G(a, b, c) + d + W(5) + SHA1_CONST(1);
a = ROTATE_LEFT(a, 30);
W(6) = ROTATE_LEFT((W(3) ^ W(14) ^ W(8) ^ W(6)), 1); /* 22 */
c = ROTATE_LEFT(d, 5) + G(e, a, b) + c + W(6) + SHA1_CONST(1);
e = ROTATE_LEFT(e, 30);
W(7) = ROTATE_LEFT((W(4) ^ W(15) ^ W(9) ^ W(7)), 1); /* 23 */
b = ROTATE_LEFT(c, 5) + G(d, e, a) + b + W(7) + SHA1_CONST(1);
d = ROTATE_LEFT(d, 30);
W(8) = ROTATE_LEFT((W(5) ^ W(0) ^ W(10) ^ W(8)), 1); /* 24 */
a = ROTATE_LEFT(b, 5) + G(c, d, e) + a + W(8) + SHA1_CONST(1);
c = ROTATE_LEFT(c, 30);
W(9) = ROTATE_LEFT((W(6) ^ W(1) ^ W(11) ^ W(9)), 1); /* 25 */
e = ROTATE_LEFT(a, 5) + G(b, c, d) + e + W(9) + SHA1_CONST(1);
b = ROTATE_LEFT(b, 30);
W(10) = ROTATE_LEFT((W(7) ^ W(2) ^ W(12) ^ W(10)), 1); /* 26 */
d = ROTATE_LEFT(e, 5) + G(a, b, c) + d + W(10) + SHA1_CONST(1);
a = ROTATE_LEFT(a, 30);
W(11) = ROTATE_LEFT((W(8) ^ W(3) ^ W(13) ^ W(11)), 1); /* 27 */
c = ROTATE_LEFT(d, 5) + G(e, a, b) + c + W(11) + SHA1_CONST(1);
e = ROTATE_LEFT(e, 30);
W(12) = ROTATE_LEFT((W(9) ^ W(4) ^ W(14) ^ W(12)), 1); /* 28 */
b = ROTATE_LEFT(c, 5) + G(d, e, a) + b + W(12) + SHA1_CONST(1);
d = ROTATE_LEFT(d, 30);
W(13) = ROTATE_LEFT((W(10) ^ W(5) ^ W(15) ^ W(13)), 1); /* 29 */
a = ROTATE_LEFT(b, 5) + G(c, d, e) + a + W(13) + SHA1_CONST(1);
c = ROTATE_LEFT(c, 30);
W(14) = ROTATE_LEFT((W(11) ^ W(6) ^ W(0) ^ W(14)), 1); /* 30 */
e = ROTATE_LEFT(a, 5) + G(b, c, d) + e + W(14) + SHA1_CONST(1);
b = ROTATE_LEFT(b, 30);
W(15) = ROTATE_LEFT((W(12) ^ W(7) ^ W(1) ^ W(15)), 1); /* 31 */
d = ROTATE_LEFT(e, 5) + G(a, b, c) + d + W(15) + SHA1_CONST(1);
a = ROTATE_LEFT(a, 30);
W(0) = ROTATE_LEFT((W(13) ^ W(8) ^ W(2) ^ W(0)), 1); /* 32 */
c = ROTATE_LEFT(d, 5) + G(e, a, b) + c + W(0) + SHA1_CONST(1);
e = ROTATE_LEFT(e, 30);
W(1) = ROTATE_LEFT((W(14) ^ W(9) ^ W(3) ^ W(1)), 1); /* 33 */
b = ROTATE_LEFT(c, 5) + G(d, e, a) + b + W(1) + SHA1_CONST(1);
d = ROTATE_LEFT(d, 30);
W(2) = ROTATE_LEFT((W(15) ^ W(10) ^ W(4) ^ W(2)), 1); /* 34 */
a = ROTATE_LEFT(b, 5) + G(c, d, e) + a + W(2) + SHA1_CONST(1);
c = ROTATE_LEFT(c, 30);
W(3) = ROTATE_LEFT((W(0) ^ W(11) ^ W(5) ^ W(3)), 1); /* 35 */
e = ROTATE_LEFT(a, 5) + G(b, c, d) + e + W(3) + SHA1_CONST(1);
b = ROTATE_LEFT(b, 30);
W(4) = ROTATE_LEFT((W(1) ^ W(12) ^ W(6) ^ W(4)), 1); /* 36 */
d = ROTATE_LEFT(e, 5) + G(a, b, c) + d + W(4) + SHA1_CONST(1);
a = ROTATE_LEFT(a, 30);
W(5) = ROTATE_LEFT((W(2) ^ W(13) ^ W(7) ^ W(5)), 1); /* 37 */
c = ROTATE_LEFT(d, 5) + G(e, a, b) + c + W(5) + SHA1_CONST(1);
e = ROTATE_LEFT(e, 30);
W(6) = ROTATE_LEFT((W(3) ^ W(14) ^ W(8) ^ W(6)), 1); /* 38 */
b = ROTATE_LEFT(c, 5) + G(d, e, a) + b + W(6) + SHA1_CONST(1);
d = ROTATE_LEFT(d, 30);
W(7) = ROTATE_LEFT((W(4) ^ W(15) ^ W(9) ^ W(7)), 1); /* 39 */
a = ROTATE_LEFT(b, 5) + G(c, d, e) + a + W(7) + SHA1_CONST(1);
c = ROTATE_LEFT(c, 30);
/* round 3 */
W(8) = ROTATE_LEFT((W(5) ^ W(0) ^ W(10) ^ W(8)), 1); /* 40 */
e = ROTATE_LEFT(a, 5) + H(b, c, d) + e + W(8) + SHA1_CONST(2);
b = ROTATE_LEFT(b, 30);
W(9) = ROTATE_LEFT((W(6) ^ W(1) ^ W(11) ^ W(9)), 1); /* 41 */
d = ROTATE_LEFT(e, 5) + H(a, b, c) + d + W(9) + SHA1_CONST(2);
a = ROTATE_LEFT(a, 30);
W(10) = ROTATE_LEFT((W(7) ^ W(2) ^ W(12) ^ W(10)), 1); /* 42 */
c = ROTATE_LEFT(d, 5) + H(e, a, b) + c + W(10) + SHA1_CONST(2);
e = ROTATE_LEFT(e, 30);
W(11) = ROTATE_LEFT((W(8) ^ W(3) ^ W(13) ^ W(11)), 1); /* 43 */
b = ROTATE_LEFT(c, 5) + H(d, e, a) + b + W(11) + SHA1_CONST(2);
d = ROTATE_LEFT(d, 30);
W(12) = ROTATE_LEFT((W(9) ^ W(4) ^ W(14) ^ W(12)), 1); /* 44 */
a = ROTATE_LEFT(b, 5) + H(c, d, e) + a + W(12) + SHA1_CONST(2);
c = ROTATE_LEFT(c, 30);
W(13) = ROTATE_LEFT((W(10) ^ W(5) ^ W(15) ^ W(13)), 1); /* 45 */
e = ROTATE_LEFT(a, 5) + H(b, c, d) + e + W(13) + SHA1_CONST(2);
b = ROTATE_LEFT(b, 30);
W(14) = ROTATE_LEFT((W(11) ^ W(6) ^ W(0) ^ W(14)), 1); /* 46 */
d = ROTATE_LEFT(e, 5) + H(a, b, c) + d + W(14) + SHA1_CONST(2);
a = ROTATE_LEFT(a, 30);
W(15) = ROTATE_LEFT((W(12) ^ W(7) ^ W(1) ^ W(15)), 1); /* 47 */
c = ROTATE_LEFT(d, 5) + H(e, a, b) + c + W(15) + SHA1_CONST(2);
e = ROTATE_LEFT(e, 30);
W(0) = ROTATE_LEFT((W(13) ^ W(8) ^ W(2) ^ W(0)), 1); /* 48 */
b = ROTATE_LEFT(c, 5) + H(d, e, a) + b + W(0) + SHA1_CONST(2);
d = ROTATE_LEFT(d, 30);
W(1) = ROTATE_LEFT((W(14) ^ W(9) ^ W(3) ^ W(1)), 1); /* 49 */
a = ROTATE_LEFT(b, 5) + H(c, d, e) + a + W(1) + SHA1_CONST(2);
c = ROTATE_LEFT(c, 30);
W(2) = ROTATE_LEFT((W(15) ^ W(10) ^ W(4) ^ W(2)), 1); /* 50 */
e = ROTATE_LEFT(a, 5) + H(b, c, d) + e + W(2) + SHA1_CONST(2);
b = ROTATE_LEFT(b, 30);
W(3) = ROTATE_LEFT((W(0) ^ W(11) ^ W(5) ^ W(3)), 1); /* 51 */
d = ROTATE_LEFT(e, 5) + H(a, b, c) + d + W(3) + SHA1_CONST(2);
a = ROTATE_LEFT(a, 30);
W(4) = ROTATE_LEFT((W(1) ^ W(12) ^ W(6) ^ W(4)), 1); /* 52 */
c = ROTATE_LEFT(d, 5) + H(e, a, b) + c + W(4) + SHA1_CONST(2);
e = ROTATE_LEFT(e, 30);
W(5) = ROTATE_LEFT((W(2) ^ W(13) ^ W(7) ^ W(5)), 1); /* 53 */
b = ROTATE_LEFT(c, 5) + H(d, e, a) + b + W(5) + SHA1_CONST(2);
d = ROTATE_LEFT(d, 30);
W(6) = ROTATE_LEFT((W(3) ^ W(14) ^ W(8) ^ W(6)), 1); /* 54 */
a = ROTATE_LEFT(b, 5) + H(c, d, e) + a + W(6) + SHA1_CONST(2);
c = ROTATE_LEFT(c, 30);
W(7) = ROTATE_LEFT((W(4) ^ W(15) ^ W(9) ^ W(7)), 1); /* 55 */
e = ROTATE_LEFT(a, 5) + H(b, c, d) + e + W(7) + SHA1_CONST(2);
b = ROTATE_LEFT(b, 30);
W(8) = ROTATE_LEFT((W(5) ^ W(0) ^ W(10) ^ W(8)), 1); /* 56 */
d = ROTATE_LEFT(e, 5) + H(a, b, c) + d + W(8) + SHA1_CONST(2);
a = ROTATE_LEFT(a, 30);
W(9) = ROTATE_LEFT((W(6) ^ W(1) ^ W(11) ^ W(9)), 1); /* 57 */
c = ROTATE_LEFT(d, 5) + H(e, a, b) + c + W(9) + SHA1_CONST(2);
e = ROTATE_LEFT(e, 30);
W(10) = ROTATE_LEFT((W(7) ^ W(2) ^ W(12) ^ W(10)), 1); /* 58 */
b = ROTATE_LEFT(c, 5) + H(d, e, a) + b + W(10) + SHA1_CONST(2);
d = ROTATE_LEFT(d, 30);
W(11) = ROTATE_LEFT((W(8) ^ W(3) ^ W(13) ^ W(11)), 1); /* 59 */
a = ROTATE_LEFT(b, 5) + H(c, d, e) + a + W(11) + SHA1_CONST(2);
c = ROTATE_LEFT(c, 30);
/* round 4 */
W(12) = ROTATE_LEFT((W(9) ^ W(4) ^ W(14) ^ W(12)), 1); /* 60 */
e = ROTATE_LEFT(a, 5) + G(b, c, d) + e + W(12) + SHA1_CONST(3);
b = ROTATE_LEFT(b, 30);
W(13) = ROTATE_LEFT((W(10) ^ W(5) ^ W(15) ^ W(13)), 1); /* 61 */
d = ROTATE_LEFT(e, 5) + G(a, b, c) + d + W(13) + SHA1_CONST(3);
a = ROTATE_LEFT(a, 30);
W(14) = ROTATE_LEFT((W(11) ^ W(6) ^ W(0) ^ W(14)), 1); /* 62 */
c = ROTATE_LEFT(d, 5) + G(e, a, b) + c + W(14) + SHA1_CONST(3);
e = ROTATE_LEFT(e, 30);
W(15) = ROTATE_LEFT((W(12) ^ W(7) ^ W(1) ^ W(15)), 1); /* 63 */
b = ROTATE_LEFT(c, 5) + G(d, e, a) + b + W(15) + SHA1_CONST(3);
d = ROTATE_LEFT(d, 30);
W(0) = ROTATE_LEFT((W(13) ^ W(8) ^ W(2) ^ W(0)), 1); /* 64 */
a = ROTATE_LEFT(b, 5) + G(c, d, e) + a + W(0) + SHA1_CONST(3);
c = ROTATE_LEFT(c, 30);
W(1) = ROTATE_LEFT((W(14) ^ W(9) ^ W(3) ^ W(1)), 1); /* 65 */
e = ROTATE_LEFT(a, 5) + G(b, c, d) + e + W(1) + SHA1_CONST(3);
b = ROTATE_LEFT(b, 30);
W(2) = ROTATE_LEFT((W(15) ^ W(10) ^ W(4) ^ W(2)), 1); /* 66 */
d = ROTATE_LEFT(e, 5) + G(a, b, c) + d + W(2) + SHA1_CONST(3);
a = ROTATE_LEFT(a, 30);
W(3) = ROTATE_LEFT((W(0) ^ W(11) ^ W(5) ^ W(3)), 1); /* 67 */
c = ROTATE_LEFT(d, 5) + G(e, a, b) + c + W(3) + SHA1_CONST(3);
e = ROTATE_LEFT(e, 30);
W(4) = ROTATE_LEFT((W(1) ^ W(12) ^ W(6) ^ W(4)), 1); /* 68 */
b = ROTATE_LEFT(c, 5) + G(d, e, a) + b + W(4) + SHA1_CONST(3);
d = ROTATE_LEFT(d, 30);
W(5) = ROTATE_LEFT((W(2) ^ W(13) ^ W(7) ^ W(5)), 1); /* 69 */
a = ROTATE_LEFT(b, 5) + G(c, d, e) + a + W(5) + SHA1_CONST(3);
c = ROTATE_LEFT(c, 30);
W(6) = ROTATE_LEFT((W(3) ^ W(14) ^ W(8) ^ W(6)), 1); /* 70 */
e = ROTATE_LEFT(a, 5) + G(b, c, d) + e + W(6) + SHA1_CONST(3);
b = ROTATE_LEFT(b, 30);
W(7) = ROTATE_LEFT((W(4) ^ W(15) ^ W(9) ^ W(7)), 1); /* 71 */
d = ROTATE_LEFT(e, 5) + G(a, b, c) + d + W(7) + SHA1_CONST(3);
a = ROTATE_LEFT(a, 30);
W(8) = ROTATE_LEFT((W(5) ^ W(0) ^ W(10) ^ W(8)), 1); /* 72 */
c = ROTATE_LEFT(d, 5) + G(e, a, b) + c + W(8) + SHA1_CONST(3);
e = ROTATE_LEFT(e, 30);
W(9) = ROTATE_LEFT((W(6) ^ W(1) ^ W(11) ^ W(9)), 1); /* 73 */
b = ROTATE_LEFT(c, 5) + G(d, e, a) + b + W(9) + SHA1_CONST(3);
d = ROTATE_LEFT(d, 30);
W(10) = ROTATE_LEFT((W(7) ^ W(2) ^ W(12) ^ W(10)), 1); /* 74 */
a = ROTATE_LEFT(b, 5) + G(c, d, e) + a + W(10) + SHA1_CONST(3);
c = ROTATE_LEFT(c, 30);
W(11) = ROTATE_LEFT((W(8) ^ W(3) ^ W(13) ^ W(11)), 1); /* 75 */
e = ROTATE_LEFT(a, 5) + G(b, c, d) + e + W(11) + SHA1_CONST(3);
b = ROTATE_LEFT(b, 30);
W(12) = ROTATE_LEFT((W(9) ^ W(4) ^ W(14) ^ W(12)), 1); /* 76 */
d = ROTATE_LEFT(e, 5) + G(a, b, c) + d + W(12) + SHA1_CONST(3);
a = ROTATE_LEFT(a, 30);
W(13) = ROTATE_LEFT((W(10) ^ W(5) ^ W(15) ^ W(13)), 1); /* 77 */
c = ROTATE_LEFT(d, 5) + G(e, a, b) + c + W(13) + SHA1_CONST(3);
e = ROTATE_LEFT(e, 30);
W(14) = ROTATE_LEFT((W(11) ^ W(6) ^ W(0) ^ W(14)), 1); /* 78 */
b = ROTATE_LEFT(c, 5) + G(d, e, a) + b + W(14) + SHA1_CONST(3);
d = ROTATE_LEFT(d, 30);
W(15) = ROTATE_LEFT((W(12) ^ W(7) ^ W(1) ^ W(15)), 1); /* 79 */
ctx->state[0] += ROTATE_LEFT(b, 5) + G(c, d, e) + a + W(15) +
SHA1_CONST(3);
ctx->state[1] += b;
ctx->state[2] += ROTATE_LEFT(c, 30);
ctx->state[3] += d;
ctx->state[4] += e;
/* zeroize sensitive information */
W(0) = W(1) = W(2) = W(3) = W(4) = W(5) = W(6) = W(7) = W(8) = 0;
W(9) = W(10) = W(11) = W(12) = W(13) = W(14) = W(15) = 0;
}
#endif /* !__amd64 */
/*
* Encode()
*
* purpose: to convert a list of numbers from little endian to big endian
* input: uint8_t * : place to store the converted big endian numbers
* uint32_t * : place to get numbers to convert from
* size_t : the length of the input in bytes
* output: void
*/
static void
Encode(uint8_t *_RESTRICT_KYWD output, const uint32_t *_RESTRICT_KYWD input,
size_t len)
{
size_t i, j;
#if defined(__sparc)
if (IS_P2ALIGNED(output, sizeof (uint32_t))) {
for (i = 0, j = 0; j < len; i++, j += 4) {
/* LINTED E_BAD_PTR_CAST_ALIGN */
*((uint32_t *)(output + j)) = input[i];
}
} else {
#endif /* little endian -- will work on big endian, but slowly */
for (i = 0, j = 0; j < len; i++, j += 4) {
output[j] = (input[i] >> 24) & 0xff;
output[j + 1] = (input[i] >> 16) & 0xff;
output[j + 2] = (input[i] >> 8) & 0xff;
output[j + 3] = input[i] & 0xff;
}
#if defined(__sparc)
}
#endif
}
+12 -12
View File
@@ -208,7 +208,7 @@ _key_expansion_256a_local:
pxor %xmm1, %xmm0
movups %xmm0, (%rcx)
add $0x10, %rcx
ret
RET
nop
SET_SIZE(_key_expansion_128)
SET_SIZE(_key_expansion_256a)
@@ -236,7 +236,7 @@ _key_expansion_192a_local:
shufps $0b01001110, %xmm2, %xmm1
movups %xmm1, 0x10(%rcx)
add $0x20, %rcx
ret
RET
SET_SIZE(_key_expansion_192a)
@@ -257,7 +257,7 @@ _key_expansion_192b_local:
movups %xmm0, (%rcx)
add $0x10, %rcx
ret
RET
SET_SIZE(_key_expansion_192b)
@@ -271,7 +271,7 @@ _key_expansion_256b_local:
pxor %xmm1, %xmm2
movups %xmm2, (%rcx)
add $0x10, %rcx
ret
RET
SET_SIZE(_key_expansion_256b)
@@ -376,7 +376,7 @@ rijndael_key_setup_enc_intel_local:
mov $14, %rax // return # rounds = 14
#endif
FRAME_END
ret
RET
.align 4
.Lenc_key192:
@@ -413,7 +413,7 @@ rijndael_key_setup_enc_intel_local:
mov $12, %rax // return # rounds = 12
#endif
FRAME_END
ret
RET
.align 4
.Lenc_key128:
@@ -453,13 +453,13 @@ rijndael_key_setup_enc_intel_local:
mov $10, %rax // return # rounds = 10
#endif
FRAME_END
ret
RET
.Lenc_key_invalid_param:
#ifdef OPENSSL_INTERFACE
mov $-1, %rax // user key or AES key pointer is NULL
FRAME_END
ret
RET
#else
/* FALLTHROUGH */
#endif /* OPENSSL_INTERFACE */
@@ -471,7 +471,7 @@ rijndael_key_setup_enc_intel_local:
xor %rax, %rax // a key pointer is NULL or invalid keysize
#endif /* OPENSSL_INTERFACE */
FRAME_END
ret
RET
SET_SIZE(rijndael_key_setup_enc_intel)
@@ -548,7 +548,7 @@ FRAME_BEGIN
// OpenSolaris: rax = # rounds (10, 12, or 14) or 0 for error
// OpenSSL: rax = 0 for OK, or non-zero for error
FRAME_END
ret
RET
SET_SIZE(rijndael_key_setup_dec_intel)
@@ -655,7 +655,7 @@ ENTRY_NP(aes_encrypt_intel)
aesenclast %KEY, %STATE // last round
movups %STATE, (%OUTP) // output
ret
RET
SET_SIZE(aes_encrypt_intel)
@@ -738,7 +738,7 @@ ENTRY_NP(aes_decrypt_intel)
aesdeclast %KEY, %STATE // last round
movups %STATE, (%OUTP) // output
ret
RET
SET_SIZE(aes_decrypt_intel)
#endif /* lint || __lint */
+2 -2
View File
@@ -785,7 +785,7 @@ ENTRY_NP(aes_encrypt_amd64)
mov 2*8(%rsp), %rbp
mov 3*8(%rsp), %r12
add $[4*8], %rsp
ret
RET
SET_SIZE(aes_encrypt_amd64)
@@ -896,7 +896,7 @@ ENTRY_NP(aes_decrypt_amd64)
mov 2*8(%rsp), %rbp
mov 3*8(%rsp), %r12
add $[4*8], %rsp
ret
RET
SET_SIZE(aes_decrypt_amd64)
#endif /* lint || __lint */
@@ -1201,7 +1201,7 @@ aesni_gcm_encrypt:
.align 32
clear_fpu_regs_avx:
vzeroall
ret
RET
.size clear_fpu_regs_avx,.-clear_fpu_regs_avx
/*
@@ -1219,7 +1219,7 @@ gcm_xor_avx:
movdqu (%rsi), %xmm1
pxor %xmm1, %xmm0
movdqu %xmm0, (%rsi)
ret
RET
.size gcm_xor_avx,.-gcm_xor_avx
/*
@@ -1236,7 +1236,7 @@ atomic_toggle_boolean_nv:
jz 1f
movl $1, %eax
1:
ret
RET
.size atomic_toggle_boolean_nv,.-atomic_toggle_boolean_nv
.align 64
+1 -1
View File
@@ -244,7 +244,7 @@ ENTRY_NP(gcm_mul_pclmulqdq)
//
// Return
//
ret
RET
SET_SIZE(gcm_mul_pclmulqdq)
#endif /* lint || __lint */
File diff suppressed because it is too large Load Diff
+27 -1
View File
@@ -83,12 +83,21 @@ SHA256TransformBlocks(SHA2_CTX *ctx, const void *in, size_t num)
#include <sys/asm_linkage.h>
ENTRY_NP(SHA256TransformBlocks)
.cfi_startproc
movq %rsp, %rax
.cfi_def_cfa_register %rax
push %rbx
.cfi_offset %rbx,-16
push %rbp
.cfi_offset %rbp,-24
push %r12
.cfi_offset %r12,-32
push %r13
.cfi_offset %r13,-40
push %r14
.cfi_offset %r14,-48
push %r15
.cfi_offset %r15,-56
mov %rsp,%rbp # copy %rsp
shl $4,%rdx # num*16
sub $16*4+4*8,%rsp
@@ -99,6 +108,9 @@ ENTRY_NP(SHA256TransformBlocks)
mov %rsi,16*4+1*8(%rsp) # save inp, 2nd arg
mov %rdx,16*4+2*8(%rsp) # save end pointer, "3rd" arg
mov %rbp,16*4+3*8(%rsp) # save copy of %rsp
# echo ".cfi_cfa_expression %rsp+88,deref,+56" |
# openssl/crypto/perlasm/x86_64-xlate.pl
.cfi_escape 0x0f,0x06,0x77,0xd8,0x00,0x06,0x23,0x38
#.picmeup %rbp
# The .picmeup pseudo-directive, from perlasm/x86_64_xlate.pl, puts
@@ -2026,14 +2038,28 @@ ENTRY_NP(SHA256TransformBlocks)
jb .Lloop
mov 16*4+3*8(%rsp),%rsp
.cfi_def_cfa %rsp,56
pop %r15
.cfi_adjust_cfa_offset -8
.cfi_restore %r15
pop %r14
.cfi_adjust_cfa_offset -8
.cfi_restore %r14
pop %r13
.cfi_adjust_cfa_offset -8
.cfi_restore %r13
pop %r12
.cfi_adjust_cfa_offset -8
.cfi_restore %r12
pop %rbp
.cfi_adjust_cfa_offset -8
.cfi_restore %rbp
pop %rbx
.cfi_adjust_cfa_offset -8
.cfi_restore %rbx
ret
RET
.cfi_endproc
SET_SIZE(SHA256TransformBlocks)
.data
+27 -1
View File
@@ -84,12 +84,21 @@ SHA512TransformBlocks(SHA2_CTX *ctx, const void *in, size_t num)
#include <sys/asm_linkage.h>
ENTRY_NP(SHA512TransformBlocks)
.cfi_startproc
movq %rsp, %rax
.cfi_def_cfa_register %rax
push %rbx
.cfi_offset %rbx,-16
push %rbp
.cfi_offset %rbp,-24
push %r12
.cfi_offset %r12,-32
push %r13
.cfi_offset %r13,-40
push %r14
.cfi_offset %r14,-48
push %r15
.cfi_offset %r15,-56
mov %rsp,%rbp # copy %rsp
shl $4,%rdx # num*16
sub $16*8+4*8,%rsp
@@ -100,6 +109,9 @@ ENTRY_NP(SHA512TransformBlocks)
mov %rsi,16*8+1*8(%rsp) # save inp, 2nd arg
mov %rdx,16*8+2*8(%rsp) # save end pointer, "3rd" arg
mov %rbp,16*8+3*8(%rsp) # save copy of %rsp
# echo ".cfi_cfa_expression %rsp+152,deref,+56" |
# openssl/crypto/perlasm/x86_64-xlate.pl
.cfi_escape 0x0f,0x06,0x77,0x98,0x01,0x06,0x23,0x38
#.picmeup %rbp
# The .picmeup pseudo-directive, from perlasm/x86_64_xlate.pl, puts
@@ -2027,14 +2039,28 @@ ENTRY_NP(SHA512TransformBlocks)
jb .Lloop
mov 16*8+3*8(%rsp),%rsp
.cfi_def_cfa %rsp,56
pop %r15
.cfi_adjust_cfa_offset -8
.cfi_restore %r15
pop %r14
.cfi_adjust_cfa_offset -8
.cfi_restore %r14
pop %r13
.cfi_adjust_cfa_offset -8
.cfi_restore %r13
pop %r12
.cfi_adjust_cfa_offset -8
.cfi_restore %r12
pop %rbp
.cfi_adjust_cfa_offset -8
.cfi_restore %rbp
pop %rbx
.cfi_adjust_cfa_offset -8
.cfi_restore %rbx
ret
RET
.cfi_endproc
SET_SIZE(SHA512TransformBlocks)
.data
-2
View File
@@ -111,7 +111,6 @@ icp_fini(void)
{
skein_mod_fini();
sha2_mod_fini();
sha1_mod_fini();
edonr_mod_fini();
aes_mod_fini();
kcf_sched_destroy();
@@ -142,7 +141,6 @@ icp_init(void)
/* initialize algorithms */
aes_mod_init();
edonr_mod_init();
sha1_mod_init();
sha2_mod_init();
skein_mod_init();
-61
View File
@@ -1,61 +0,0 @@
/*
* CDDL HEADER START
*
* The contents of this file are subject to the terms of the
* Common Development and Distribution License (the "License").
* You may not use this file except in compliance with the License.
*
* You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
* or http://www.opensolaris.org/os/licensing.
* See the License for the specific language governing permissions
* and limitations under the License.
*
* When distributing Covered Code, include this CDDL HEADER in each
* file and include the License file at usr/src/OPENSOLARIS.LICENSE.
* If applicable, add the following below this CDDL HEADER, with the
* fields enclosed by brackets "[]" replaced with your own identifying
* information: Portions Copyright [yyyy] [name of copyright owner]
*
* CDDL HEADER END
*/
/*
* Copyright 2007 Sun Microsystems, Inc. All rights reserved.
* Use is subject to license terms.
*/
#ifndef _SYS_SHA1_H
#define _SYS_SHA1_H
#include <sys/types.h> /* for uint_* */
#ifdef __cplusplus
extern "C" {
#endif
/*
* NOTE: n2rng (Niagara2 RNG driver) accesses the state field of
* SHA1_CTX directly. NEVER change this structure without verifying
* compatibility with n2rng. The important thing is that the state
* must be in a field declared as uint32_t state[5].
*/
/* SHA-1 context. */
typedef struct {
uint32_t state[5]; /* state (ABCDE) */
uint32_t count[2]; /* number of bits, modulo 2^64 (msb first) */
union {
uint8_t buf8[64]; /* undigested input */
uint32_t buf32[16]; /* realigned input */
} buf_un;
} SHA1_CTX;
#define SHA1_DIGEST_LENGTH 20
void SHA1Init(SHA1_CTX *);
void SHA1Update(SHA1_CTX *, const void *, size_t);
void SHA1Final(void *, SHA1_CTX *);
#ifdef __cplusplus
}
#endif
#endif /* _SYS_SHA1_H */
-65
View File
@@ -1,65 +0,0 @@
/*
* CDDL HEADER START
*
* The contents of this file are subject to the terms of the
* Common Development and Distribution License, Version 1.0 only
* (the "License"). You may not use this file except in compliance
* with the License.
*
* You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
* or http://www.opensolaris.org/os/licensing.
* See the License for the specific language governing permissions
* and limitations under the License.
*
* When distributing Covered Code, include this CDDL HEADER in each
* file and include the License file at usr/src/OPENSOLARIS.LICENSE.
* If applicable, add the following below this CDDL HEADER, with the
* fields enclosed by brackets "[]" replaced with your own identifying
* information: Portions Copyright [yyyy] [name of copyright owner]
*
* CDDL HEADER END
*/
/*
* Copyright (c) 1998, by Sun Microsystems, Inc.
* All rights reserved.
*/
#ifndef _SYS_SHA1_CONSTS_H
#define _SYS_SHA1_CONSTS_H
#ifdef __cplusplus
extern "C" {
#endif
/*
* as explained in sha1.c, loading 32-bit constants on a sparc is expensive
* since it involves both a `sethi' and an `or'. thus, we instead use `ld'
* to load the constants from an array called `sha1_consts'. however, on
* intel (and perhaps other processors), it is cheaper to load the constant
* directly. thus, the c code in SHA1Transform() uses the macro SHA1_CONST()
* which either expands to a constant or an array reference, depending on
* the architecture the code is being compiled for.
*/
#include <sys/types.h> /* uint32_t */
extern const uint32_t sha1_consts[];
#if defined(__sparc)
#define SHA1_CONST(x) (sha1_consts[x])
#else
#define SHA1_CONST(x) (SHA1_CONST_ ## x)
#endif
/* constants, as provided in FIPS 180-1 */
#define SHA1_CONST_0 0x5a827999U
#define SHA1_CONST_1 0x6ed9eba1U
#define SHA1_CONST_2 0x8f1bbcdcU
#define SHA1_CONST_3 0xca62c1d6U
#ifdef __cplusplus
}
#endif
#endif /* _SYS_SHA1_CONSTS_H */
-73
View File
@@ -1,73 +0,0 @@
/*
* CDDL HEADER START
*
* The contents of this file are subject to the terms of the
* Common Development and Distribution License (the "License").
* You may not use this file except in compliance with the License.
*
* You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
* or http://www.opensolaris.org/os/licensing.
* See the License for the specific language governing permissions
* and limitations under the License.
*
* When distributing Covered Code, include this CDDL HEADER in each
* file and include the License file at usr/src/OPENSOLARIS.LICENSE.
* If applicable, add the following below this CDDL HEADER, with the
* fields enclosed by brackets "[]" replaced with your own identifying
* information: Portions Copyright [yyyy] [name of copyright owner]
*
* CDDL HEADER END
*/
/*
* Copyright 2009 Sun Microsystems, Inc. All rights reserved.
* Use is subject to license terms.
*/
#ifndef _SHA1_IMPL_H
#define _SHA1_IMPL_H
#ifdef __cplusplus
extern "C" {
#endif
#define SHA1_HASH_SIZE 20 /* SHA_1 digest length in bytes */
#define SHA1_DIGEST_LENGTH 20 /* SHA1 digest length in bytes */
#define SHA1_HMAC_BLOCK_SIZE 64 /* SHA1-HMAC block size */
#define SHA1_HMAC_MIN_KEY_LEN 1 /* SHA1-HMAC min key length in bytes */
#define SHA1_HMAC_MAX_KEY_LEN INT_MAX /* SHA1-HMAC max key length in bytes */
#define SHA1_HMAC_INTS_PER_BLOCK (SHA1_HMAC_BLOCK_SIZE/sizeof (uint32_t))
/*
* CSPI information (entry points, provider info, etc.)
*/
typedef enum sha1_mech_type {
SHA1_MECH_INFO_TYPE, /* SUN_CKM_SHA1 */
SHA1_HMAC_MECH_INFO_TYPE, /* SUN_CKM_SHA1_HMAC */
SHA1_HMAC_GEN_MECH_INFO_TYPE /* SUN_CKM_SHA1_HMAC_GENERAL */
} sha1_mech_type_t;
/*
* Context for SHA1 mechanism.
*/
typedef struct sha1_ctx {
sha1_mech_type_t sc_mech_type; /* type of context */
SHA1_CTX sc_sha1_ctx; /* SHA1 context */
} sha1_ctx_t;
/*
* Context for SHA1-HMAC and SHA1-HMAC-GENERAL mechanisms.
*/
typedef struct sha1_hmac_ctx {
sha1_mech_type_t hc_mech_type; /* type of context */
uint32_t hc_digest_len; /* digest len in bytes */
SHA1_CTX hc_icontext; /* inner SHA1 context */
SHA1_CTX hc_ocontext; /* outer SHA1 context */
} sha1_hmac_ctx_t;
#ifdef __cplusplus
}
#endif
#endif /* _SHA1_IMPL_H */
@@ -30,6 +30,12 @@
#include <sys/stack.h>
#include <sys/trap.h>
#if defined(__linux__) && defined(CONFIG_SLS)
#define RET ret; int3
#else
#define RET ret
#endif
#ifdef __cplusplus
extern "C" {
#endif
File diff suppressed because it is too large Load Diff
+15 -3
View File
@@ -494,7 +494,8 @@ skein_update(crypto_ctx_t *ctx, crypto_data_t *data, crypto_req_handle_t req)
*/
/*ARGSUSED*/
static int
skein_final(crypto_ctx_t *ctx, crypto_data_t *digest, crypto_req_handle_t req)
skein_final_nofree(crypto_ctx_t *ctx, crypto_data_t *digest,
crypto_req_handle_t req)
{
int error = CRYPTO_SUCCESS;
@@ -525,6 +526,17 @@ skein_final(crypto_ctx_t *ctx, crypto_data_t *digest, crypto_req_handle_t req)
else
digest->cd_length = 0;
return (error);
}
static int
skein_final(crypto_ctx_t *ctx, crypto_data_t *digest, crypto_req_handle_t req)
{
int error = skein_final_nofree(ctx, digest, req);
if (error == CRYPTO_BUFFER_TOO_SMALL)
return (error);
bzero(SKEIN_CTX(ctx), sizeof (*SKEIN_CTX(ctx)));
kmem_free(SKEIN_CTX(ctx), sizeof (*(SKEIN_CTX(ctx))));
SKEIN_CTX_LVALUE(ctx) = NULL;
@@ -560,7 +572,7 @@ skein_digest_atomic(crypto_provider_handle_t provider,
if ((error = skein_update(&ctx, data, digest)) != CRYPTO_SUCCESS)
goto out;
if ((error = skein_final(&ctx, data, digest)) != CRYPTO_SUCCESS)
if ((error = skein_final_nofree(&ctx, data, digest)) != CRYPTO_SUCCESS)
goto out;
out:
@@ -669,7 +681,7 @@ skein_mac_atomic(crypto_provider_handle_t provider,
if ((error = skein_update(&ctx, data, req)) != CRYPTO_SUCCESS)
goto errout;
if ((error = skein_final(&ctx, mac, req)) != CRYPTO_SUCCESS)
if ((error = skein_final_nofree(&ctx, mac, req)) != CRYPTO_SUCCESS)
goto errout;
return (CRYPTO_SUCCESS);
+12 -1
View File
@@ -168,6 +168,13 @@ static void seterrorobj (lua_State *L, int errcode, StkId oldtop) {
L->top = oldtop + 1;
}
/*
* Silence infinite recursion warning which was added to -Wall in gcc 12.1
*/
#if defined(HAVE_INFINITE_RECURSION)
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Winfinite-recursion"
#endif
l_noret luaD_throw (lua_State *L, int errcode) {
if (L->errorJmp) { /* thread has an error handler? */
@@ -190,6 +197,10 @@ l_noret luaD_throw (lua_State *L, int errcode) {
}
}
#if defined(HAVE_INFINITE_RECURSION)
#pragma GCC diagnostic pop
#endif
int luaD_rawrunprotected (lua_State *L, Pfunc f, void *ud) {
unsigned short oldnCcalls = L->nCcalls;
@@ -395,7 +406,7 @@ int luaD_precall (lua_State *L, StkId func, int nresults) {
StkId base;
Proto *p = clLvalue(func)->p;
n = cast_int(L->top - func) - 1; /* number of real arguments */
luaD_checkstack(L, p->maxstacksize);
luaD_checkstack(L, p->maxstacksize + p->numparams);
for (; n < p->numparams; n++)
setnilvalue(L->top++); /* complete missing arguments */
if (!p->is_vararg) {
+8 -2
View File
@@ -35,6 +35,12 @@ x:
.size x, [.-x]
#if defined(__linux__) && defined(CONFIG_SLS)
#define RET ret; int3
#else
#define RET ret
#endif
/*
* Setjmp and longjmp implement non-local gotos using state vectors
* type label_t.
@@ -52,7 +58,7 @@ x:
movq 0(%rsp), %rdx /* return address */
movq %rdx, 56(%rdi) /* rip */
xorl %eax, %eax /* return 0 */
ret
RET
SET_SIZE(setjmp)
ENTRY(longjmp)
@@ -67,7 +73,7 @@ x:
movq %rdx, 0(%rsp)
xorl %eax, %eax
incl %eax /* return 1 */
ret
RET
SET_SIZE(longjmp)
#ifdef __ELF__
+1 -1
View File
@@ -131,7 +131,7 @@ abd_scatter_chunkcnt(abd_t *abd)
boolean_t
abd_size_alloc_linear(size_t size)
{
return (size < zfs_abd_scatter_min_size ? B_TRUE : B_FALSE);
return (!zfs_abd_scatter_enabled || size < zfs_abd_scatter_min_size);
}
void
+10 -1
View File
@@ -161,6 +161,12 @@ arc_prune_task(void *arg)
int64_t nr_scan = (intptr_t)arg;
arc_reduce_target_size(ptob(nr_scan));
#ifndef __ILP32__
if (nr_scan > INT_MAX)
nr_scan = INT_MAX;
#endif
#if __FreeBSD_version >= 1300139
sx_xlock(&arc_vnlru_lock);
vnlru_free_vfsops(nr_scan, &zfs_vfsops, arc_vnlru_marker);
@@ -223,7 +229,10 @@ arc_lowmem(void *arg __unused, int howto __unused)
arc_warm = B_TRUE;
arc_growtime = gethrtime() + SEC2NSEC(arc_grow_retry);
free_memory = arc_available_memory();
to_free = (arc_c >> arc_shrink_shift) - MIN(free_memory, 0);
int64_t can_free = arc_c - arc_c_min;
if (can_free <= 0)
return;
to_free = (can_free >> arc_shrink_shift) - MIN(free_memory, 0);
DTRACE_PROBE2(arc__needfree, int64_t, free_memory, int64_t, to_free);
arc_reduce_target_size(to_free);
+30 -13
View File
@@ -151,6 +151,13 @@ freebsd_zfs_crypt_done(struct cryptop *crp)
return (0);
}
static int
freebsd_zfs_crypt_done_sync(struct cryptop *crp)
{
return (0);
}
void
freebsd_crypt_freesession(freebsd_crypt_session_t *sess)
{
@@ -160,26 +167,36 @@ freebsd_crypt_freesession(freebsd_crypt_session_t *sess)
}
static int
zfs_crypto_dispatch(freebsd_crypt_session_t *session, struct cryptop *crp)
zfs_crypto_dispatch(freebsd_crypt_session_t *session, struct cryptop *crp)
{
int error;
crp->crp_opaque = session;
crp->crp_callback = freebsd_zfs_crypt_done;
for (;;) {
#if __FreeBSD_version < 1400004
boolean_t async = ((crypto_ses2caps(crp->crp_session) &
CRYPTOCAP_F_SYNC) == 0);
#else
boolean_t async = !CRYPTO_SESS_SYNC(crp->crp_session);
#endif
crp->crp_callback = async ? freebsd_zfs_crypt_done :
freebsd_zfs_crypt_done_sync;
error = crypto_dispatch(crp);
if (error)
break;
mtx_lock(&session->fs_lock);
while (session->fs_done == false)
msleep(crp, &session->fs_lock, 0,
"zfs_crypto", 0);
mtx_unlock(&session->fs_lock);
if (crp->crp_etype == ENOMEM) {
pause("zcrnomem", 1);
} else if (crp->crp_etype != EAGAIN) {
if (error == 0) {
if (async) {
mtx_lock(&session->fs_lock);
while (session->fs_done == false) {
msleep(crp, &session->fs_lock, 0,
"zfs_crypto", 0);
}
mtx_unlock(&session->fs_lock);
}
error = crp->crp_etype;
}
if (error == ENOMEM) {
pause("zcrnomem", 1);
} else if (error != EAGAIN) {
break;
}
crp->crp_etype = 0;
+1 -2
View File
@@ -956,8 +956,7 @@ skip_open:
*logical_ashift = highbit(MAX(pp->sectorsize, SPA_MINBLOCKSIZE)) - 1;
*physical_ashift = 0;
if (pp->stripesize && pp->stripesize > (1 << *logical_ashift) &&
ISP2(pp->stripesize) && pp->stripesize <= (1 << ASHIFT_MAX) &&
pp->stripeoffset == 0)
ISP2(pp->stripesize) && pp->stripeoffset == 0)
*physical_ashift = highbit(pp->stripesize) - 1;
/*
+2 -1
View File
@@ -976,12 +976,13 @@ zfsctl_snapdir_lookup(struct vop_lookup_args *ap)
*/
VI_LOCK(*vpp);
if (((*vpp)->v_iflag & VI_MOUNT) == 0) {
VI_UNLOCK(*vpp);
/*
* Upgrade to exclusive lock in order to:
* - avoid race conditions
* - satisfy the contract of mount_snapshot()
*/
err = VOP_LOCK(*vpp, LK_TRYUPGRADE | LK_INTERLOCK);
err = VOP_LOCK(*vpp, LK_TRYUPGRADE);
if (err == 0)
break;
} else {
+4
View File
@@ -226,7 +226,11 @@ zfs_vop_fsync(vnode_t *vp)
struct mount *mp;
int error;
#if __FreeBSD_version < 1400068
if ((error = vn_start_write(vp, &mp, V_WAIT | PCATCH)) != 0)
#else
if ((error = vn_start_write(vp, &mp, V_WAIT | V_PCATCH)) != 0)
#endif
goto drop;
vn_lock(vp, LK_EXCLUSIVE | LK_RETRY);
error = VOP_FSYNC(vp, MNT_WAIT, curthread);
+30 -11
View File
@@ -981,13 +981,17 @@ zfs_lookup(vnode_t *dvp, const char *nm, vnode_t **vpp,
case RENAME:
if (error == ENOENT) {
error = EJUSTRETURN;
#if __FreeBSD_version < 1400068
cnp->cn_flags |= SAVENAME;
#endif
break;
}
fallthrough;
case DELETE:
#if __FreeBSD_version < 1400068
if (error == 0)
cnp->cn_flags |= SAVENAME;
#endif
break;
}
}
@@ -1337,7 +1341,10 @@ zfs_lookup_internal(znode_t *dzp, const char *name, vnode_t **vpp,
cnp->cn_nameptr = __DECONST(char *, name);
cnp->cn_namelen = strlen(name);
cnp->cn_nameiop = nameiop;
cnp->cn_flags = ISLASTCN | SAVENAME;
cnp->cn_flags = ISLASTCN;
#if __FreeBSD_version < 1400068
cnp->cn_flags |= SAVENAME;
#endif
cnp->cn_lkflags = LK_EXCLUSIVE | LK_RETRY;
cnp->cn_cred = kcred;
#if __FreeBSD_version < 1400037
@@ -4642,7 +4649,9 @@ zfs_freebsd_create(struct vop_create_args *ap)
znode_t *zp = NULL;
int rc, mode;
#if __FreeBSD_version < 1400068
ASSERT(cnp->cn_flags & SAVENAME);
#endif
vattr_init_mask(vap);
mode = vap->va_mode & ALLPERMS;
@@ -4672,7 +4681,9 @@ static int
zfs_freebsd_remove(struct vop_remove_args *ap)
{
#if __FreeBSD_version < 1400068
ASSERT(ap->a_cnp->cn_flags & SAVENAME);
#endif
return (zfs_remove_(ap->a_dvp, ap->a_vp, ap->a_cnp->cn_nameptr,
ap->a_cnp->cn_cred));
@@ -4694,7 +4705,9 @@ zfs_freebsd_mkdir(struct vop_mkdir_args *ap)
znode_t *zp = NULL;
int rc;
#if __FreeBSD_version < 1400068
ASSERT(ap->a_cnp->cn_flags & SAVENAME);
#endif
vattr_init_mask(vap);
*ap->a_vpp = NULL;
@@ -4720,7 +4733,9 @@ zfs_freebsd_rmdir(struct vop_rmdir_args *ap)
{
struct componentname *cnp = ap->a_cnp;
#if __FreeBSD_version < 1400068
ASSERT(cnp->cn_flags & SAVENAME);
#endif
return (zfs_rmdir_(ap->a_dvp, ap->a_vp, cnp->cn_nameptr, cnp->cn_cred));
}
@@ -4974,8 +4989,10 @@ zfs_freebsd_rename(struct vop_rename_args *ap)
vnode_t *tvp = ap->a_tvp;
int error;
#if __FreeBSD_version < 1400068
ASSERT(ap->a_fcnp->cn_flags & (SAVENAME|SAVESTART));
ASSERT(ap->a_tcnp->cn_flags & (SAVENAME|SAVESTART));
#endif
error = zfs_do_rename(fdvp, &fvp, ap->a_fcnp, tdvp, &tvp,
ap->a_tcnp, ap->a_fcnp->cn_cred);
@@ -5011,7 +5028,9 @@ zfs_freebsd_symlink(struct vop_symlink_args *ap)
#endif
int rc;
#if __FreeBSD_version < 1400068
ASSERT(cnp->cn_flags & SAVENAME);
#endif
vap->va_type = VLNK; /* FreeBSD: Syscall only sets va_mode. */
vattr_init_mask(vap);
@@ -5105,7 +5124,9 @@ zfs_freebsd_link(struct vop_link_args *ap)
if (tdvp->v_mount != vp->v_mount)
return (EXDEV);
#if __FreeBSD_version < 1400068
ASSERT(cnp->cn_flags & SAVENAME);
#endif
return (zfs_link(VTOZ(tdvp), VTOZ(vp),
cnp->cn_nameptr, cnp->cn_cred, 0));
@@ -5364,10 +5385,10 @@ zfs_getextattr_dir(struct vop_getextattr_args *ap, const char *attrname)
NDINIT_ATVP(&nd, LOOKUP, NOFOLLOW, UIO_SYSSPACE, attrname, xvp);
#endif
error = vn_open_cred(&nd, &flags, 0, VN_OPEN_INVFS, ap->a_cred, NULL);
vp = nd.ni_vp;
NDFREE_PNBUF(&nd);
if (error != 0)
return (error);
vp = nd.ni_vp;
NDFREE_PNBUF(&nd);
if (ap->a_size != NULL) {
error = VOP_GETATTR(vp, &va, ap->a_cred);
@@ -5481,12 +5502,10 @@ zfs_deleteextattr_dir(struct vop_deleteextattr_args *ap, const char *attrname)
UIO_SYSSPACE, attrname, xvp);
#endif
error = namei(&nd);
vp = nd.ni_vp;
if (error != 0) {
NDFREE_PNBUF(&nd);
if (error != 0)
return (error);
}
vp = nd.ni_vp;
error = VOP_REMOVE(nd.ni_dvp, vp, &nd.ni_cnd);
NDFREE_PNBUF(&nd);
@@ -5612,10 +5631,10 @@ zfs_setextattr_dir(struct vop_setextattr_args *ap, const char *attrname)
#endif
error = vn_open_cred(&nd, &flags, 0600, VN_OPEN_INVFS, ap->a_cred,
NULL);
vp = nd.ni_vp;
NDFREE_PNBUF(&nd);
if (error != 0)
return (error);
vp = nd.ni_vp;
NDFREE_PNBUF(&nd);
VATTR_NULL(&va);
va.va_size = 0;
@@ -5767,10 +5786,10 @@ zfs_listextattr_dir(struct vop_listextattr_args *ap, const char *attrprefix)
UIO_SYSSPACE, ".", xvp);
#endif
error = namei(&nd);
vp = nd.ni_vp;
NDFREE_PNBUF(&nd);
if (error != 0)
return (error);
vp = nd.ni_vp;
NDFREE_PNBUF(&nd);
auio.uio_iov = &aiov;
auio.uio_iovcnt = 1;
+1 -1
View File
@@ -638,7 +638,7 @@ abd_alloc_zero_scatter(void)
boolean_t
abd_size_alloc_linear(size_t size)
{
return (size < zfs_abd_scatter_min_size ? B_TRUE : B_FALSE);
return (!zfs_abd_scatter_enabled || size < zfs_abd_scatter_min_size);
}
void
+11 -1
View File
@@ -105,6 +105,16 @@ bdev_whole(struct block_device *bdev)
}
#endif
#if defined(HAVE_BDEVNAME)
#define vdev_bdevname(bdev, name) bdevname(bdev, name)
#else
static inline void
vdev_bdevname(struct block_device *bdev, char *name)
{
snprintf(name, BDEVNAME_SIZE, "%pg", bdev);
}
#endif
/*
* Returns the maximum expansion capacity of the block device (in bytes).
*
@@ -204,7 +214,7 @@ vdev_disk_open(vdev_t *v, uint64_t *psize, uint64_t *max_psize,
if (bdev) {
if (v->vdev_expanding && bdev != bdev_whole(bdev)) {
bdevname(bdev_whole(bdev), disk_name + 5);
vdev_bdevname(bdev_whole(bdev), disk_name + 5);
/*
* If userland has BLKPG_RESIZE_PARTITION,
* then it should have updated the partition
+4 -4
View File
@@ -1900,6 +1900,9 @@ zio_do_crypt_data(boolean_t encrypt, zio_crypt_key_t *key,
crypto_ctx_template_t tmpl;
uint8_t *authbuf = NULL;
memset(&puio, 0, sizeof (puio));
memset(&cuio, 0, sizeof (cuio));
/*
* If the needed key is the current one, just use it. Otherwise we
* need to generate a temporary one from the given salt + master key.
@@ -1937,7 +1940,7 @@ zio_do_crypt_data(boolean_t encrypt, zio_crypt_key_t *key,
*/
if (qat_crypt_use_accel(datalen) &&
ot != DMU_OT_INTENT_LOG && ot != DMU_OT_DNODE) {
uint8_t *srcbuf, *dstbuf;
uint8_t __attribute__((unused)) *srcbuf, *dstbuf;
if (encrypt) {
srcbuf = plainbuf;
@@ -1960,9 +1963,6 @@ zio_do_crypt_data(boolean_t encrypt, zio_crypt_key_t *key,
/* If the hardware implementation fails fall back to software */
}
bzero(&puio, sizeof (zfs_uio_t));
bzero(&cuio, sizeof (zfs_uio_t));
/* create uios for encryption */
ret = zio_crypt_init_uios(encrypt, key->zk_version, ot, plainbuf,
cipherbuf, datalen, byteswap, mac, &puio, &cuio, &enc_len,
+4
View File
@@ -954,7 +954,11 @@ zvol_free(zvol_state_t *zv)
del_gendisk(zv->zv_zso->zvo_disk);
#if defined(HAVE_SUBMIT_BIO_IN_BLOCK_DEVICE_OPERATIONS) && \
defined(HAVE_BLK_ALLOC_DISK)
#if defined(HAVE_BLK_CLEANUP_DISK)
blk_cleanup_disk(zv->zv_zso->zvo_disk);
#else
put_disk(zv->zv_zso->zvo_disk);
#endif
#else
blk_cleanup_queue(zv->zv_zso->zvo_queue);
put_disk(zv->zv_zso->zvo_disk);
+1 -1
View File
@@ -181,7 +181,7 @@ abd_free_struct(abd_t *abd)
abd_t *
abd_alloc(size_t size, boolean_t is_metadata)
{
if (!zfs_abd_scatter_enabled || abd_size_alloc_linear(size))
if (abd_size_alloc_linear(size))
return (abd_alloc_linear(size, is_metadata));
VERIFY3U(size, <=, SPA_MAXBLOCKSIZE);

Some files were not shown because too many files have changed in this diff Show More