Compare commits

212 Commits

Author SHA1 Message Date
Tony Hutter 92e0d9d183 Tag zfs-2.1.9
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2023-01-24 15:41:54 -08:00
Coleman Kane 232fc23c6e linux 6.2 compat: zpl_set_acl arg2 is now struct dentry
Linux 6.2 changes the second argument of the set_acl operation to be a
"struct dentry *" rather than a "struct inode *". The inode* parameter
is still available as dentry->d_inode, so adjust the call to the _impl
function to dereference the dentry and pass that pointer to it.

Also document that the get_acl -> get_inode_acl member name change from
commit 884a693 was an API change also introduced in Linux 6.2.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #14415
2023-01-24 15:36:08 -08:00
Tony Hutter 11bdc5c8e8 Revert "ztest fails assertion in zio_write_gang_member_ready()"
This reverts commit 0156253d29.

That commit was identified as causing IO errors on a user's
encrypted dataset:
https://github.com/openzfs/zfs/issues/14413

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2023-01-24 15:35:24 -08:00
Tony Hutter 04b02785b6 Tag zfs-2.1.8
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2023-01-19 13:00:56 -08:00
Gian-Carlo DeFazio f22254283a change how d_alias is replaced by d_u.d_alias
d_alias may need to be converted to d_u.d_alias
depending on the kernel version.
d_alias is currently in only one place in the code which
changes
"hlist_for_each_entry(dentry, &inode->i_dentry, d_alias)"
to
"hlist_for_each_entry(dentry, &inode->i_dentry, d_u.d_alias)"
as necessary.

This effectively results in a double macro expansion
for code that uses the zfs headers but already has its
own macro for just d_alias (lustre in this case).

Remove the conditional code for hlist_for_each_entry
and have a macro for "d_alias -> d_u.d_alias" instead.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gian-Carlo DeFazio <defazio1@llnl.gov>
Closes #14377
2023-01-19 12:50:42 -08:00
Richard Yao 7319a73921 Linux ppc64le ieee128 compat: Do not redefine __asm on external headers
There is an external assembly declaration extension in GNU C that glibc
uses when building with ieee128 floating point support on ppc64le.
Marking that as volatile makes no sense, so the build breaks.

It does not make sense to only mark this as volatile on Linux, since if
we do not want the compiler reordering things on Linux, we do not want the
compiler reordering things on any other platform, so we stop treating
Linux specially and just manually inline the CPP macro so that we can
eliminate it. This should fix the build on ppc64le.

Tested-by: @gyakovlev
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14308
Closes #14384
2023-01-19 12:50:42 -08:00
Vince van Oosten 596cfb6b15 include systemd overrides to zfs-dracut module
If a user that uses systemd and dracut wants to override certain
settings, they typically use `systemctl edit [unit]` or place a file in
`/etc/systemd/system/[unit].d/override.conf` directly.

The zfs-dracut module did not include those overrides however, so this
did not have any effect at boot time.

For zfs-import-scan.service and zfs-import-cache.service, overrides are
now included in the dracut initramfs image.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Vince van Oosten <techhazard@codeforyouand.me>
Closes #14075
Closes #14076
2023-01-19 12:50:42 -08:00
George Amanakis f806306ce0 Activate filesystem features only in syncing context
When activating filesystem features after receiving a snapshot, do
so only in syncing context.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #14304
Closes #14252
2023-01-19 12:50:42 -08:00
Richard Yao f33b298346 Illumos #15286: do_composition() needs sign awareness
Authored by: Dan McDonald <danmcd@mnx.io>
Reviewed by: Patrick Mooney <pmooney@pfmooney.com>
Reviewed by: Richard Lowe <richlowe@richlowe.net>
Approved by: Joshua M. Clulow <josh@sysmgr.org>
Ported-by: Richard Yao <richard.yao@alumni.stonybrook.edu>

Illumos-issue: https://www.illumos.org/issues/15286
Illumos-commit: https://github.com/illumos/illumos-gate/commit/f137b22e734e85642da3e56e8b94da3f5f027c73

Porting Notes:

The patch in illumos did not have much of a commit message, and did not
provide attribution to the reporter, while the original patch proposed to
OpenZFS did, so I am listing the reporter (myself) and original patch
author (also myself) below while including the original commit message
with some minor corrections as part of the porting notes:

In do_composition(), we have:

	size = u8_number_of_bytes[*p];
	if (size <= 1 || (p + size) > oslast)
		break;

There, we have type promotion from int8_t to size_t, which is unsigned.
C will sign extend the value as part of the widening before treating the
value as unsigned and the negative values we can encounter are error
values from U8_ILLEGAL_CHAR and U8_OUT_OF_RANGE_CHAR, which are -1 and
-2 respectively. The unsigned versions of these under two's complement
are SIZE_MAX and SIZE_MAX-1 respectively.

The bounds check is written under the assumption that `size <= 1` does a
signed comparison. This is followed by a pointer comparison to see if
the string has the correct length, which is fine.

A little further down we have:

	for (i = 0; i < size; i++)
		tc[i] = *p++;

When an error condition is encountered, this will attempt to iterate at
least SIZE_MAX-1 times, which will massively overflow the buffer, which
is not fine.
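
As an illustrative sketch only (this is not the code from the patch), the sign-extension problem described above can be reproduced in isolation. The function names and the `DEMO_*` constants are hypothetical stand-ins; only the error values -1 and -2 and the `size <= 1` check come from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the error values named above (-1 and -2). */
#define DEMO_ILLEGAL_CHAR       (-1)
#define DEMO_OUT_OF_RANGE_CHAR  (-2)

/* Widening int8_t to size_t sign-extends first: -1 becomes SIZE_MAX. */
static size_t
demo_widen(int8_t nbytes)
{
    return ((size_t)nbytes);
}

/*
 * Unsigned check: an error value has already turned into a huge size_t,
 * so "size <= 1" is false and the error slips through.
 */
static int
demo_unsigned_check_rejects(int8_t nbytes)
{
    size_t size = nbytes;

    return (size <= 1);
}

/* Sign-aware check: compare while the value is still signed. */
static int
demo_signed_check_rejects(int8_t nbytes)
{
    int size = nbytes;

    return (size <= 1);
}
```

With these helpers, `demo_unsigned_check_rejects(DEMO_ILLEGAL_CHAR)` evaluates to 0 (the error is not caught), while `demo_signed_check_rejects(DEMO_ILLEGAL_CHAR)` evaluates to 1.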

The kernel will kill the loop as soon as it hits the kernel stack guard
on Linux systems built with CONFIG_VMAP_STACK=y, which should be just
about all of them. That prevents arbitrary code execution and just about
any other bad thing that a black hat attacker might attempt with
knowledge of this buffer overflow. Other systems' kernels have
mitigations for unbounded in-kernel buffer overflows that will catch
this too.

Also, the patch in illumos-gate made an effort to fix C style issues
that had been fixed in the OpenZFS/ZFSOnLinux repository. Those issues
had been mentioned in the email that I originally sent them about this
issue. One of the fixes had not been already done, so it is included.
Another to collect_a_seq()'s arguments was handled differently in
OpenZFS. For the sake of avoiding unnecessary differences, it has been
adopted. This has the interesting effect that if you correct the paths
in the illumos-gate patch to match the current OpenZFS repository, you
can reverse apply it cleanly.

Original-patch-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reported-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Co-authored-by: Dan McDonald <danmcd@mnx.io>
Closes #14318
Closes #14342
2023-01-19 12:50:42 -08:00
Brian Behlendorf 04fcf13de0 dracut: fix typo in mount-zfs.sh.in
Format the `zpool get` command correctly.  The -o option must
be followed by "all" or the requested field name.

Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13602
2023-01-19 12:50:42 -08:00
Matthew Ahrens 37dbf91c8a removal of LegacyVersion broke ax_python_dev.m4
The 22.0 release of the python `packaging` package removed the
`LegacyVersion` trait, causing ZFS to no longer compile.

This commit replaces the sections of `ax_python_dev.m4` that rely on
`LegacyVersion` with updated implementations from the upstream
`autoconf-archive`.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #14297
2023-01-19 12:50:42 -08:00
Mateusz Guzik be697f4339 FreeBSD: catch up to 1400077
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #14328
2023-01-19 12:50:42 -08:00
Martin Rüegg c07a8660f0 Fix shebang for helper script of deb-utils
Shebang was missing the `!` between `#` and the actual path.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Martin Rüegg <martin.rueegg@metaworx.ch>
Closes #14339
2023-01-19 12:50:42 -08:00
Martin Rüegg ea62fb4ab7 Add quotation marks around $PATH for deb-utils
Fix #14338, failing to build deb-utils if existing `$PATH` variable
would include a whitespace.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Martin Rüegg <martin.rueegg@metaworx.ch>
Closes #14339
2023-01-19 12:50:42 -08:00
Brian Behlendorf 5aca6e1092 Documentation corrections
- Update the link to the OpenZFS Code of Conduct.
- Remove extra "the" from contrib/initramfs/scripts/zfs

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #14298
Closes #14307
2023-01-19 12:50:42 -08:00
George Melikov d72e004715 systemd: set restart=always for zfs-zed.service
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Co-authored-by: Attila Fülöp <attila@fueloep.org>
Closes #14294
2023-01-19 12:50:42 -08:00
Ethan Coe-Renner 9ef565a185 Add color output to zfs diff.
This adds support to color zfs diff (in the style of git diff)
conditional on the ZFS_COLOR environment variable.

Signed-off-by: Ethan Coe-Renner <coerenner1@llnl.gov>
2023-01-19 12:50:36 -08:00
наб 0e72f5fb83 libzfs: diff: simplify superfluous stdio
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12829
2023-01-19 12:50:36 -08:00
наб e9897e542d libzfs: diff: print_what() can return the symbol => get_what()
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12829
2023-01-19 12:50:36 -08:00
Doug Rabson 70b1b5bb98 FreeBSD: Remove stray debug printf
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Doug Rabson <dfr@rabson.org>
Closes #14286
Closes #14287
2023-01-19 12:50:36 -08:00
Richard Yao a2aabac123 Zero end of embedded block buffer in dump_write_embedded()
This fixes a kernel stack leak.
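
A minimal sketch of the general technique named in the title, assuming a fixed-size buffer that is only partly filled with payload. The helper `demo_copy_padded` is hypothetical and is not the code in dump_write_embedded(); it only shows why the unused tail has to be zeroed before the buffer leaves the function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy srclen bytes into a dstlen-byte buffer and zero the remainder,
 * so no stale bytes from the previous contents of dst are exposed.
 */
static void
demo_copy_padded(void *dst, size_t dstlen, const void *src, size_t srclen)
{
    assert(srclen <= dstlen);
    memcpy(dst, src, srclen);
    memset((char *)dst + srclen, 0, dstlen - srclen);
}
```

Without the `memset()`, whatever happened to be in the tail of `dst` (for a stack buffer, old stack contents) would be written out along with the payload.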

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Tested-by: Nicholas Sherlock <n.sherlock@gmail.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13778
Closes #14255
2023-01-19 12:50:36 -08:00
Marcel Menzel 3207803abf Change ZEVENT_POOL_GUID to ZEVENT_POOL to display pool names
Outgoing mails for ZFS pool events include the pool GUID,
but not the actual pool name. Let's change this for better
readability, as it is already done in the mails for finished
pool resilvers.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Marcel Menzel <mail@mcl.gg>
Closes #14272
2023-01-19 12:50:36 -08:00
Allan Jude 6219190d7f Restrict visibility of per-dataset kstats inside FreeBSD jails
When inside a jail, visibility on datasets not "jailed" to the
jail is restricted. However, it was possible to enumerate all
datasets in the pool by looking at the kstats sysctl MIB.

Only the kstats corresponding to datasets that the user has
visibility on are accessible now.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Closes #14254
2023-01-19 12:50:36 -08:00
Richard Yao 24a6d8316a Fix dereference after null check in enqueue_range
If the bp is NULL, we have a hole. However, when we build with
assertions, we will dereference bp when `blkid == DMU_SPILL_BLKID`. When
this happens on a hole, we will have a NULL pointer dereference.

Reported-by: Coverity (CID-1524670)
Reviewed-by: Damian Szuberski <szuberskidamian@gmail.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14264
2023-01-19 12:50:36 -08:00
Richard Yao e23ed1b330 Fix potential buffer overflow in zpool command
The ZPOOL_SCRIPTS_PATH environment variable can be passed here. This
allows for arbitrarily long strings to be passed to sprintf(), which can
overflow the buffer.

I missed this in my earlier audit of the codebase. CodeQL's
cpp/unbounded-write check caught this.
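
For illustration, a bounded alternative to an unchecked sprintf() might look like the sketch below. The helper name `demo_build_path` and the "dir/name" format are assumptions made for the example, not the code changed by this commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Build "<dir>/<name>" in buf. snprintf() never writes more than buflen
 * bytes and returns the length it wanted to write, so truncation can be
 * detected. Returns 0 on success, -1 if the result does not fit.
 */
static int
demo_build_path(char *buf, size_t buflen, const char *dir, const char *name)
{
    int n;

    assert(buf != NULL && buflen > 0);
    n = snprintf(buf, buflen, "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= buflen)
        return (-1);
    return (0);
}
```

An arbitrarily long value taken from an environment variable then produces an error return instead of a write past the end of the buffer.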

Reviewed-by: Damian Szuberski <szuberskidamian@gmail.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14264
2023-01-19 12:50:36 -08:00
Richard Yao 572114d846 FreeBSD: zfs_register_callbacks() must implement error check correctly
I read the following article and noticed a couple of ZFS bugs mentioned:

https://pvs-studio.com/en/blog/posts/cpp/0377/

I decided to search for them in the modern OpenZFS codebase and then
found one that matched the description of the first one:

V593 Consider reviewing the expression of the 'A = B != C' kind. The
expression is calculated as following: 'A = (B != C)'. zfs_vfsops.c 498

The consequence of this is that the error value is replaced with `1`
when there is an error. When there is no error, 0 is correctly passed.
This is a very minor issue that is unlikely to cause any real problems.
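
The precedence issue behind V593 can be shown in a small self-contained sketch. `demo_register` is a hypothetical stand-in for a call that returns 0 or an errno value; it is not the function in zfs_vfsops.c.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical call that fails with EBUSY when asked to. */
static int
demo_register(int should_fail)
{
    return (should_fail ? EBUSY : 0);
}

/* Buggy: parsed as error = (demo_register(...) != 0), so error is 0 or 1. */
static int
demo_buggy(int should_fail)
{
    int error;

    if ((error = demo_register(should_fail) != 0))
        return (error);
    return (0);
}

/* Fixed: parenthesize the assignment so the real error code is kept. */
static int
demo_fixed(int should_fail)
{
    int error;

    if ((error = demo_register(should_fail)) != 0)
        return (error);
    return (0);
}
```

On failure, `demo_buggy(1)` returns 1 while `demo_fixed(1)` returns EBUSY; both return 0 on success, which matches the behavior described above.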

The incorrect error code would either be returned to the mount command
on a failure or any of `zfs receive`, `zfs recv`, `zfs rollback` or `zfs
upgrade`.

The second one has already been fixed.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Damian Szuberski <szuberskidamian@gmail.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14261
2023-01-19 12:50:36 -08:00
наб 6af8e80310 fgrep -> grep -F
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13259
2023-01-19 12:50:36 -08:00
наб f8a124b104 egrep -> grep -E
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13259
2023-01-19 12:50:25 -08:00
Tony Hutter 689c53f2c5 Update META to 6.1 kernel
ZFS successfully builds against the 6.1.4 kernel.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #14371
2023-01-10 16:12:11 -08:00
Matthew Ahrens 0156253d29 ztest fails assertion in zio_write_gang_member_ready()
Encrypted blocks can have up to 2 DVA's, as the third DVA is reserved
for the salt+IV.  However, dmu_write_policy() allows non-encrypted
blocks (e.g. DMU_OT_OBJSET) inside encrypted datasets to request and
allocate 3 DVA's, since they don't need a salt+IV (they are merely
authenticated).

However, if such a block becomes a gang block, the gang code incorrectly
limits the gang block header to 2 DVA's.  This leads to a "NDVAs
inversion", where a parent block (the gang block header) has fewer DVA's
than its children (the gang members), causing an assertion failure in
zio_write_gang_member_ready().

This commit addresses the problem by only restricting the gang block
header to 2 DVA's if the block is actually encrypted (and thus its gang
block members can have at most 2 DVA's).

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #14250
Closes #14356
2023-01-10 08:44:55 -08:00
Antonio Russo 3e0962a236 Introduce ZFS_LINUX_REQUIRE_API autoconf macro
Currently, if API tests fail, we either ignore the failures, or
unconditionally halt the kernel build.  This leads to situations where
incompatibilities with existing APIs may develop, but not trip the
configure compatibility checks.

This introduces a new mechanism to require APIs for kernels above a
particular version.  While not perfect, this at least guarantees
mainline kernels do not break existing APIs without at least providing
some warning.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Antonio Russo <aerusso@aerusso.net>
Closes #14343
2023-01-10 08:43:49 -08:00
Coleman Kane 3c0b8c874b linux 6.2 compat: bio->bi_rw was renamed bio->bi_opf
The bi_rw member of struct bio was renamed to bi_opf in Linux 6.2.
As well, Linux's implementation of bio_set_op_attrs(...) has been
removed.

The HAVE_BIO_BI_OPF macro already appears to be defined, but the
removal of the bio_set_op_attrs(...) implementation makes the build
fall back on the locally-defined implementation, which isn't updated
for the bio->bi_opf change. This commit adds that update.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #14324
Closes #14331
2023-01-10 08:43:49 -08:00
Coleman Kane b586ea5d93 linux 6.2 compat: get_acl() got moved to get_inode_acl() in 6.2
Linux 6.2 renamed the get_acl() operation to get_inode_acl() in
the inode_operations struct. This should fix Issue #14323.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #14323
Closes #14331
2023-01-10 08:43:49 -08:00
Antonio Russo 138d2b29dd Linux 6.1 compat: open inside tmpfile()
commit d27c81847b upstream

Linux 863f144 modified the .tmpfile interface to pass a struct file,
rather than a struct dentry, and expect the tmpfile implementation to
open inside of tmpfile().

This patch implements a configuration test that checks for this new API
and appropriately sets a HAVE_TMPFILE_DENTRY flag that tracks this old
API.  Contingent on this flag, the appropriate API is implemented.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Antonio Russo <aerusso@aerusso.net>
Closes #14301
Closes #14343
2023-01-09 17:15:22 -08:00
Antonio Russo 5371d8dae7 ZTS: close in mmapwrite.c
commit a7304ab9c1 upstream

mmapwrite is used during the ZTS to identify issues with mmap-ed files.
This helper program exercises this pathway by continuously writing to a
file.  ee6bf97c7 modified the writing threads to terminate after a set
amount of total data is written.  This change allows standard program
execution to reach the end of a writer thread without closing the file
descriptor, introducing a resource "leak."

This patch appeases resource leak analyses by close()-ing the file at
the end of the thread.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Antonio Russo <aerusso@aerusso.net>
Closes #14353
2023-01-09 17:15:22 -08:00
Antonio Russo a75af541cf ZTS: limit mmapwrite file size
commit ee6bf97c77 upstream

mmapwrite spawns several threads, all of which perform writes on a file
for the purpose of testing the behavior of mmap(2)-ed files.  One
thread performs an mmap and a write to the beginning of that region,
while the others perform regular writes after lseek(2)-ing to the end of
the file.

Because these regular writes are set in a while (1) loop, they will
write an unbounded amount of data to disk.  The mmap_write_001_pos test
script SIGKILLs them after 30 seconds, but on fast testbeds, this may
be enough time to exhaust the available space in the filesystem,
leading to spurious test failures.

Instead, limit the total file size by checking that the lseek return
value is no greater than 250 * 1024*1024 bytes, which is less than the
default minimum vdev size defined in includes/default.cfg .
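
A rough sketch of the size cap described above, assuming a writer that appends fixed-size records. The helper names and `DEMO_MAX_FILE_SIZE` are hypothetical; only the 250 * 1024*1024 figure and the idea of checking the lseek return value come from this message.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Cap taken from the text above: 250 * 1024*1024 bytes. */
#define DEMO_MAX_FILE_SIZE ((off_t)250 * 1024 * 1024)

/*
 * Append len bytes to fd unless the file already exceeds cap.
 * Returns 1 if data was written, 0 if the cap was reached, -1 on error.
 */
static int
demo_bounded_append(int fd, const void *buf, size_t len, off_t cap)
{
    off_t end = lseek(fd, 0, SEEK_END);

    if (end < 0)
        return (-1);    /* the lseek return value is checked */
    if (end > cap)
        return (0);     /* stop instead of writing without bound */
    if (write(fd, buf, len) < 0)
        return (-1);
    return (1);
}

/* Append 4-byte records to a temporary file until the cap stops the loop. */
static int
demo_count_writes(off_t cap)
{
    FILE *f = tmpfile();
    int n = 0;

    if (f == NULL)
        return (-1);
    while (demo_bounded_append(fileno(f), "abcd", 4, cap) == 1)
        n++;
    fclose(f);
    return (n);
}
```

A writer thread would pass `DEMO_MAX_FILE_SIZE` as the cap and leave its loop when the helper stops returning 1, rather than relying on being killed from outside.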

This also includes part of 2a493a4c71,
which checks the return value of lseek.

Signed-off-by: Antonio Russo <aerusso@aerusso.net>
Closes #14277
Closes #14345
2023-01-09 17:15:22 -08:00
Ameer Hamza 75fbe7eb99 skip permission checks for extended attributes
zfs_zaccess_trivial() calls the generic_permission() to read
xattr attributes. This causes deadlock if called from
zpl_xattr_set_dir() context as xattr and the dent locks are
already held in this scenario. This commit skips the permissions
checks for extended attributes since the Linux VFS stack already
checks it before passing us the control.

Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
2023-01-05 11:10:28 -08:00
Ameer Hamza d0f350c962 Allow receiver to override encryption properties in case of replication
Currently, the receiver fails to override the encryption
property for the plain replicated dataset with the error:
"cannot receive incremental stream: encryption property
'encryption' cannot be set for incremental streams.". The
problem is resolved by allowing the receiver to override
the encryption property for plain replicated send.

Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
2023-01-05 11:10:04 -08:00
Ameer Hamza 2f2d6bece8 zed: unclean disk attachment faults the vdev
If the attached disk already contains a vdev GUID, it
means the disk is not clean. In such a scenario, the
physical path would be a match that makes the disk
faulted when trying to online it. So, we would only
want to proceed if either GUID matches with the last
attached disk or the disk is in a clean state.

Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
2023-01-05 11:09:36 -08:00
Ryan Moeller fbbc375d43 FreeBSD: Fix potential boot panic with bad label
vdev_geom_read_pool_label() can leave NULL in configs.  Check for it
and skip consistently when generating rootconf.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #14291
(cherry picked from commit dc8c2f6158)
2023-01-05 11:00:09 -08:00
Rich Ercolani e84a2ed7a8 Add workaround for broken Linux pipes
Linux has an unresolved hang if you resize a pipe with bytes
in it.

Since there's no obvious way to detect this happening, added a
workaround to disable resizing the pipe buffer if you set an
environment variable.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #13309
2023-01-05 10:47:25 -08:00
Ryan Moeller f28c7302cb initramfs: Fix legacy mountpoint rootfs
Legacy mountpoint datasets should not pass `-o zfsutil` to `mount.zfs`.
Fix the logic in `mount_fs()` to not forget we have a legacy mountpoint
when checking for an `org.zol:mountpoint` userprop.

Reviewed-by: Richard Yao <ryao@gentoo.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #14274
(cherry picked from commit 786ff6a6cb)
2022-12-13 17:33:33 -08:00
szubersk 4767037bcf vdev_raidz_math_aarch64_neonx2.c: suppress diagnostic only for GCC
Signed-off-by: szubersk <szuberskidamian@gmail.com>
2022-12-09 12:07:38 -08:00
szubersk d50ce5c9ec tests: mkfile: usage: () -> (void)
Signed-off-by: szubersk <szuberskidamian@gmail.com>
2022-12-09 12:07:38 -08:00
szubersk 05732da4d1 Use Ubuntu 20.04 and remove Ubuntu 18.04 from workflows
- `ubuntu-latest` now resolves to `ubuntu-22.04`. Explicit pinning
  is needed.

- cherry-pick #14238

Signed-off-by: szubersk <szuberskidamian@gmail.com>
2022-12-09 10:57:10 -08:00
Savyasachee Jha 8f7826f73b dracut: skip zfsexpandknowledge when zfs_devs is present in dracut
PR 1711 (https://github.com/dracutdevs/dracut/pull/1711) adds a zfs_devs
function to dracut to detect the physical devices backing zfs pools. If
this function exists in the version of dracut this module is being
called from, then it does not need to run.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Signed-off-by: Savyasachee Jha <hi@savyasacheejha.com>
Closes #13121
2022-12-09 10:42:46 -08:00
Tony Hutter 21bd766133 Tag zfs-2.1.7
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2022-12-01 12:39:45 -08:00
Tony Hutter 7819b12f2c zfs-2.1.7: Use ubuntu-20.04 for zloop and sanity builders
The zfs-2.1.7 branch is still using the older 'python-dev'
package names rather than the newer 'python3-dev' packages that
are required for 'ubuntu-latest'.  Use 'ubuntu-20.04' instead of
'ubuntu-latest' to get around this.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2022-12-01 12:39:45 -08:00
George Amanakis c8d2ab05e1 Fix setting the large_block feature after receiving a snapshot
We are not allowed to dirty a filesystem when done receiving
a snapshot. In this case the flag SPA_FEATURE_LARGE_BLOCKS will
not be set on that filesystem since the filesystem is not on
dp_dirty_datasets, and a subsequent encrypted raw send will fail.
Fix this by checking in dsl_dataset_snapshot_sync_impl() if the feature
needs to be activated and do so if appropriate.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #13699
Closes #13782
2022-12-01 12:39:45 -08:00
Damian Szuberski 2c50512ad2 Make autodetection disable pyzfs for kernel/srpm configurations
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes #13394
Closes #14178
2022-12-01 12:39:44 -08:00
Brooks Davis c4468a70c3 Don't leak packed received properties
When local properties (e.g., from -o and -x) are provided, don't leak
the packed representation of the received properties due to variable
reuse.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Brooks Davis <brooks.davis@sri.com>
Closes #14197
2022-12-01 12:39:44 -08:00
Richard Yao e48aaef89f Fix NULL pointer dereference in dbuf_prefetch_indirect_done()
When ZFS is built with assertions and a prefetch is done on a redacted
blkptr while `dpa->dpa_dnode` is NULL, we will have a NULL pointer
dereference in `dbuf_prefetch_indirect_done()`.

Both Coverity and Clang's Static Analyzer caught this.

Reported-by: Coverity (CID 1524671)
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14210
2022-12-01 12:39:44 -08:00
Richard Yao 0e3abd2994 Lua: Fix bad bitshift in lua_strx2number()
The port of lua to OpenZFS modified lua to use int64_t for numbers
instead of double. As part of this, a function for calculating
exponentiation was replaced with a bit shift. Unfortunately, it did not
handle negative values. Also, it only supported exponents with up to
7 digits before overflow. This supports exponents up to 15 digits
before overflow.

Clang's static analyzer reported this as "Result of operation is garbage
or undefined" because the exponent was negative.
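
To illustrate why a negative exponent is a problem for a plain bit shift, here is a hedged sketch of scaling an integer by a power of two with a signed exponent. `demo_scale_pow2` is a hypothetical helper, not the code in lua_strx2number(), and returning 0 for out-of-range exponents is an arbitrary choice made for the example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compute r * 2^e for a signed exponent without ever shifting by a
 * negative or too-large count (both are undefined behavior in C).
 */
static int64_t
demo_scale_pow2(int64_t r, int e)
{
    if (e <= -63 || e >= 63)
        return (0);     /* out of range for this sketch */
    if (e < 0)
        return (r / ((int64_t)1 << -e));    /* negative: divide instead */
    /* Shift as unsigned so a negative r is not left-shifted directly. */
    return ((int64_t)((uint64_t)r << e));
}
```

For example, `demo_scale_pow2(3, 4)` yields 48 and `demo_scale_pow2(48, -4)` yields 3, whereas evaluating `48 << -4` directly would be undefined. The sketch does not detect overflow of the result for large `r`.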

Reviewed-by: Damian Szuberski <szuberskidamian@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14204
2022-12-01 12:39:44 -08:00
Damian Szuberski 3d1e808096 Fix clang 13 compilation errors
```
os/linux/zfs/zvol_os.c:1111:3: error: ignoring return value of function
  declared with 'warn_unused_result' attribute [-Werror,-Wunused-result]
                add_disk(zv->zv_zso->zvo_disk);
                ^~~~~~~~ ~~~~~~~~~~~~~~~~~~~~

zpl_xattr.c:1579:1: warning: no previous prototype for function
  'zpl_posix_acl_release_impl' [-Wmissing-prototypes]
```

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes #13551
(cherry picked from commit 9884319666)
2022-12-01 12:39:44 -08:00
наб 108c07c655 Remove final K&R definitions
Clang trunk now warns -Wstrict-prototypes on this, and they're removed
in C2x

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13447
2022-12-01 12:39:44 -08:00
наб 32f7499acf module: zfs: vdev_removal: remove unused num_indirect
Found with -Wunused-but-set-variable on Clang trunk

Fixes: a1d477c24c ("OpenZFS 7614, 9064 - zfs device evacuation/removal")
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13304
2022-12-01 12:39:44 -08:00
наб 670d66e7a0 tests: cmd: draid: remove unused and undocumented -v
Found with -Wunused-but-set-variable on Clang trunk

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13304
2022-12-01 12:39:44 -08:00
наб ad0379bf0e linux: libspl: zone: () -> (void)
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12968
2022-12-01 12:39:44 -08:00
Laura Hild 2662b8e72b Correct multipathd.target to .service
https://github.com/openzfs/zfs/pull/9863 says it "orders
zfs-import-cache.service and zfs-import-scan.service after
multipathd.service" but the commit (79add96) actually
ordered them after .target.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Laura Hild <lsh@jlab.org>
Closes #12709
Closes #14171
2022-12-01 12:39:44 -08:00
Rich Ercolani fa7d572a8a Handle and detect #13709's unlock regression (#14161)
In #13709, as in #11294 before it, it turns out that 63a26454 still had
the same failure mode as when it was first landed as d1d47691, and
fails to unlock certain datasets that formerly worked.

Rather than reverting it again, let's add handling to just throw out
the accounting metadata that failed to unlock when that happens, as
well as a test with a pre-broken pool image to ensure that we never get
bitten by this again.

Fixes: #13709

Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
2022-12-01 12:39:43 -08:00
shodanshok d9de079a4b Fix arc_p aggressive increase
The original ARC paper called for an initial 50/50 MRU/MFU split
and this is accounted for in various places where arc_p = arc_c >> 1,
with further adjustment based on ghost lists size/hit. However, in
current code both arc_adapt() and arc_get_data_impl() aggressively
grow arc_p until arc_c is reached, causing unneeded pressure on
MFU and greatly reducing its scan-resistance until ghost list
adjustments kick in.

This patch restores the original behavior of initially having arc_p
as 1/2 of total ARC, without preventing MRU from using up to 100% of total
ARC when MFU is empty.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gionatan Danti <g.danti@assyoma.it>
Closes #14137
Closes #14120
2022-12-01 12:39:43 -08:00
Richard Yao 957c3776f2 FreeBSD: Fix out of bounds read in zfs_ioctl_ozfs_to_legacy()
There is an off by 1 error in the check. Fortunately, this function does
not appear to be used in kernel space, despite being compiled as part of
the kernel module. However, it is used in userspace. Callers of
lzc_ioctl_fd() likely will crash if they attempt to use the
unimplemented request number.

This was reported by FreeBSD's coverity scan.

Reported-by: Coverity (CID 1432059)
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Damian Szuberski <szuberskidamian@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14135
2022-12-01 12:39:43 -08:00
Serapheim Dimitropoulos 85537f77a3 Expose zfs_vdev_open_timeout_ms as a tunable
Some of our customers have been occasionally hitting zfs import failures
in Linux because udevd doesn't create the by-id symbolic links in time
for zpool import to use them. The main issue is that the
systemd-udev-settle.service that zfs-import-cache.service and other
services depend on is racy. There is also an openzfs issue filed (see
https://github.com/openzfs/zfs/issues/10891) outlining the problem and
potential solutions.

With the proper solutions being significant in terms of complexity and
the priority of the issue being low for the time being, this patch
exposes `zfs_vdev_open_timeout_ms` as a tunable so people that are
experiencing this issue often can increase it as a workaround.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Don Brady <don.brady@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes #14133
2022-12-01 12:39:43 -08:00
Brooks Davis 5f53a444b3 Remove an unused variable
Clang-16 detects this set-but-unused variable which is assigned and
incremented, but never referenced otherwise.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Brooks Davis <brooks.davis@sri.com>
Closes #14125
2022-12-01 12:39:43 -08:00
Brooks Davis 572bd18c1f Make 1-bit bitfields unsigned
This fixes -Wsingle-bit-bitfield-constant-conversion warning from
clang-16 like:

lib/libzfs/libzfs_dataset.c:4529:19: error: implicit truncation
  from 'int' to a one-bit wide bit-field changes value from
  1 to -1 [-Werror,-Wsingle-bit-bitfield-constant-conversion]
                flags.nounmount = B_TRUE;
				^ ~~~~~~
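
A minimal sketch of why the warning fires, using hypothetical structs
rather than the real libzfs flag types: a signed one-bit field can hold
only 0 and -1, so storing 1 cannot round-trip, while an unsigned
one-bit field holds 0 and 1.

```c
/* Hypothetical flag structs for illustration; not the libzfs types. */
struct signed_flags {
	signed int nounmount : 1;	/* representable values: 0 and -1 */
};

struct unsigned_flags {
	unsigned int nounmount : 1;	/* representable values: 0 and 1 */
};

/* Store a value in the signed one-bit field and read it back. */
int
signed_flag_roundtrip(int value)
{
	struct signed_flags f = { 0 };

	/* Storing 1 is an out-of-range conversion; GCC and Clang give -1. */
	f.nounmount = value;
	return (f.nounmount);
}

/* Store 1 in the unsigned one-bit field and read it back. */
int
unsigned_flag_roundtrip(void)
{
	struct unsigned_flags f = { 0 };

	f.nounmount = 1;
	return (f.nounmount);
}
```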

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Brooks Davis <brooks.davis@sri.com>
Closes #14125
2022-12-01 12:39:43 -08:00
Richard Yao 256b74d0b0 Address warnings about possible division by zero from clangsa
 * The complaint in ztest_replay_write() is only possible if something
   went horribly wrong. An assertion will silence this and if it goes
   off, we will know that something is wrong.
 * The complaint in spa_estimate_metaslabs_to_flush() is not impossible,
   but seems very unlikely. We resolve this by passing the value from
   the `MIN()` that does not go to infinity when the variable is zero.

There was a third report from Clang's scan-build, but that was a
definite false positive and disappeared when checked again through
Clang's static analyzer with Z3 refutation via CodeChecker.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14124
2022-12-01 12:39:43 -08:00
Allan Jude ac01b876c9 Avoid null pointer dereference in dsl_fs_ss_limit_check()
Check for cr == NULL before dereferencing it in
dsl_enforce_ds_ss_limits() to lookup the zone/jail ID.

Reported-by: Coverity (CID 1210459)
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Closes #14103
2022-12-01 12:39:43 -08:00
Richard Yao e9a8fb17b5 Fix too few arguments to formatting function
CodeQL reported that when the VERIFY3U condition is false, we do not
pass enough arguments to `spl_panic()`. This is because the format
string from `snprintf()` was concatenated into the format string for
`spl_panic()`, which causes us to have an unexpected format specifier.

A CodeQL developer suggested fixing the macro to have a `%s` format
string that takes a stringified RIGHT argument, which would fix this.
However, upon inspection, the VERIFY3U check was never necessary in the
first place, so we remove it in favor of just calling `snprintf()`.

Lastly, it is interesting that every other static analyzer run on the
codebase did not catch this, including some that made an effort to catch
such things. Presumably, all of them relied on header annotations, which
we have not yet done on `spl_panic()`. CodeQL apparently is able to
track the flow of arguments on their way to annotated functions, which
allowed it to catch this when others did not. A future patch that I have
in development should annotate `spl_panic()`, so the others will catch
this too.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14098
2022-12-01 12:39:43 -08:00
Pavel Snajdr 52e658edd7 Remove zpl_revalidate: fix snapshot rollback
Open files that are not present in the snapshot being rolled back to
need to disappear from the visible VFS image of the dataset.

The kernel provides the d_drop function to drop an invalid entry from
the dcache, but an inode can be referenced by multiple dentries.

The introduced zpl_d_drop_aliases function walks and invalidates
all aliases of an inode.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Pavel Snajdr <snajpa@snajpa.net>
Closes #9600
Closes #14070
2022-12-01 12:39:42 -08:00
Richard Yao 4c59fde1f5 Fix theoretical use of uninitialized values
Clang's static analyzer complains about this.

In get_configs(), if we have an invalid configuration that has no top
level vdevs, we can read a couple of uninitialized variables. Aborting
upon seeing this would break the userland tools for healthy pools, so we
instead initialize the two variables to 0 to allow the userland tools to
continue functioning for the pools with valid configurations.

In zfs_do_wait(), if no wait activities are enabled, we read an
uninitialized error variable.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14043
2022-12-01 12:39:42 -08:00
Richard Yao 3830858c5c Fix memory leaks in dmu_send()/dmu_send_obj()
If we encounter an EXDEV error when using the redacted snapshots
feature, the memory used by dspp.fromredactsnaps is leaked.

Clang's static analyzer caught this during an experiment in which I had
annotated various headers in an attempt to improve the results of static
analysis.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13973
2022-12-01 12:39:42 -08:00
Richard Yao af2e53f62c Fix possible NULL pointer dereference in sha2_mac_init()
If mechanism->cm_param is NULL, passing mechanism to
PROV_SHA2_GET_DIGEST_LEN() will dereference a NULL pointer.

Coverity reported this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14044
2022-12-01 12:39:42 -08:00
Richard Yao 89c41f3979 set_global_var() should not pass NULL pointers to dlclose()
Both Coverity and Clang's static analyzer caught this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14044
2022-12-01 12:39:42 -08:00
Richard Yao 409c99a1d3 Fix NULL pointer dereference in spa_open_common()
Calling spa_open() will pass a NULL pointer to spa_open_common()'s
config parameter. Under the right circumstances, we will dereference the
config parameter without doing a NULL check.

Clang's static analyzer found this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14044
2022-12-01 12:39:42 -08:00
Richard Yao bbec0e60a8 Fix NULL pointer passed to strlcpy from zap_lookup_impl()
Clang's static analyzer pointed out that whenever zap_lookup_by_dnode()
is called, we have the following stack where strlcpy() is passed a NULL
pointer for realname from zap_lookup_by_dnode():

strlcpy()
zap_lookup_impl()
zap_lookup_norm_by_dnode()
zap_lookup_by_dnode()

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14044
2022-12-01 12:39:42 -08:00
Richard Yao a5f17a94d3 fm_fmri_hc_create() must call va_end() before returning
clang-tidy caught this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14044
2022-12-01 12:39:42 -08:00
Richard Yao 5eaad8bdb5 Fix NULL pointer dereference in zdb
Clang's static analyzer complained that we dereference a NULL pointer in
dump_path() if we return 0 when there is an error.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14044
2022-12-01 12:39:42 -08:00
Richard Yao 4351d18fb0 ZED: Fix uninitialized value reads
Coverity complained about a couple of uninitialized value reads in ZED.

 * zfs_deliver_dle() can pass an uninitialized string to zed_log_msg()
 * An uninitialized sev.sigev_signo is passed to timer_create()

The former would log garbage while the latter is not a real issue, but
we might as well suppress it by initializing the field to 0 for
consistency's sake.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14047
2022-12-01 12:39:41 -08:00
Richard Yao 2453f90350 Fix theoretical array overflow in lua_typename()
Out of the 12 defects in lua that coverity reports, 5 of them involve
`lua_typename()` and out of the dozens of defects in ZFS that Coverity
reports, 3 of them involve `lua_typename()` due to the ZCP code. Given
all of the uses of `lua_typename()` in the ZCP code, I was surprised
that there were not more. It appears that only 2 were reported because
only 3 called `lua_type()`, which does a defective sanity check that
allows invalid types to be passed.

lua/lua@d4fb848be7 addressed this in
upstream lua 5.3. Unfortunately, we did not get that fix since we use
lua 5.2 and we do not have assertions enabled in lua, so the upstream
solution would not do anything.

While we could adopt the upstream solution and enable assertions, a
simpler solution is to fix the issue by making `lua_typename()` return
`internal_type_error` whenever it is called with an invalid type. This
avoids the array overflow and if we ever see it appear somewhere, we
will know there is a problem with the lua interpreter.
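
The idea can be sketched as follows. This is an illustration with a
made-up type table and function name, not the actual lua code: the
lookup validates the index itself, so it does not depend on assertions
being enabled.

```c
#include <stddef.h>

/* Hypothetical type table; the real lua table has different entries. */
static const char *const type_names[] = {
	"nil", "boolean", "number", "string"
};

#define	NUM_TYPES	(sizeof (type_names) / sizeof (type_names[0]))

/* Return the name for type t, or a sentinel string for invalid types. */
const char *
safe_typename(int t)
{
	/* Reject anything outside the table instead of indexing past it. */
	if (t < 0 || (size_t)t >= NUM_TYPES)
		return ("internal_type_error");
	return (type_names[t]);
}
```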

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13947
2022-12-01 12:39:41 -08:00
Richard Yao d016ca1a92 Fix potential NULL pointer dereference in lzc_ioctl()
Users are allowed to pass NULL to resultp, but we unconditionally assume
that they never do. When an external user does pass NULL to resultp, we
dereference a NULL pointer.
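
A minimal sketch of the pattern, assuming a hypothetical wrapper
(do_request() is not the libzfs_core function): an optional output
parameter is written only when the caller supplied one.

```c
#include <stddef.h>

/* Hypothetical wrapper; the doubling stands in for the real work. */
int
do_request(int input, int *resultp)
{
	int result = input * 2;

	/* resultp is optional: store only when the caller asked for it. */
	if (resultp != NULL)
		*resultp = result;
	return (0);
}
```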

Clang's static analyzer complained about this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14008
2022-12-01 12:39:41 -08:00
Richard Yao d05f247aec scripts/enum-extract.pl should not hard code perl path
This is a portability issue. The issue had already been fixed for
scripts/cstyle.pl by 2dbf1bf829.
scripts/enum-extract.pl was added to the repository the following year
without this portability fix.

Michael Bishop informed me that this broke his attempt to build ZFS
2.1.6 on NixOS, since he was building manually outside of their package
manager (that usually rewrites the shebangs to NixOS' unusual paths).
NixOS puts all of the paths into $PATH, so scripts that portably rely
on env to find the interpreter still work.

Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14012
2022-12-01 12:39:41 -08:00
Richard Yao fa74250cd3 PAM: Fix unchecked return value from zfs_key_config_load()
9a49c6b782 was intended to fix this issue,
but I had missed the case in pam_sm_open_session(). Clang's static
analyzer had not reported it and I forgot to look for other cases.

Interestingly, GCC gcc-12.1.1_p20220625's static analyzer had caught
this as multiple double-free bugs, since another failure after the
failure in zfs_key_config_load() will cause us to attempt to free the
memory that zfs_key_config_load() was supposed to allocate, but had
cleaned up upon failure.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13978
2022-12-01 12:39:41 -08:00
Richard Yao c562bbefc0 Fix potential NULL pointer dereference in dsl_dataset_promote_check()
If the `list_head()` returns NULL, we dereference it, right before we
check to see if it returned NULL.

We have defined two different pointers that both point to the same
thing, which are `origin_head` and `origin_ds`. Almost everything uses
`origin_ds`, so we switch them to use `origin_ds`.

We also promote `origin_ds` to a const pointer so that the compiler
verifies that nothing modifies it.

Coverity complained about this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13967
2022-12-01 12:39:41 -08:00
Richard Yao d4df36de5d Fix unreachable code in zstreamdump
82226e4f44 was intended to prevent a
warning from being printed in situations where it was inappropriate, but
accidentally disabled it entirely by setting featureflags in the wrong
case statement.

Coverity reported this as dead code.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13946
2022-12-01 12:39:41 -08:00
Richard Yao 531361114b PAM: Fix uninitialized value read
Clang's static analyzer found that config.uid is uninitialized when
zfs_key_config_load() returns an error.

Oddly, this was not included in the unchecked return values that
Coverity found.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13957
2022-12-01 12:39:41 -08:00
Richard Yao e11c4327f1 set_global_var_parse_kv() should pass the pointer from strdup()
A comment says that the caller should free k_out, but the pointer passed
via k_out is not the same pointer we received from strdup(). Instead,
it is a pointer into the region we received from strdup(). The free
function should always be called with the original pointer, so this is
likely a bug.

We solve this by calling `strdup()` a second time and then freeing the
original pointer.
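
As a sketch of that approach, with a hypothetical parser rather than
the actual set_global_var_parse_kv() code: the key is copied out of the
working buffer with a second `strdup()`, so the pointer handed to the
caller is one that `free()` accepts.

```c
#define	_POSIX_C_SOURCE	200809L	/* for strdup() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical parser: extract the key from "  key=value" and return
 * it in *k_out. The caller owns *k_out and must free() it.
 */
int
parse_key(const char *arg, char **k_out)
{
	char *dup, *p, *eq, *key;

	assert(arg != NULL && k_out != NULL);

	if ((dup = strdup(arg)) == NULL)
		return (-1);

	/* p walks forward, so it no longer equals the strdup() result. */
	for (p = dup; *p == ' '; p++)
		;
	if ((eq = strchr(p, '=')) == NULL) {
		free(dup);
		return (-1);
	}
	*eq = '\0';

	/* Copy the interior string, then release the original block. */
	key = strdup(p);
	free(dup);
	if (key == NULL)
		return (-1);

	*k_out = key;	/* exactly what strdup() returned */
	return (0);
}
```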

Coverity reported this as a memory leak.

Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13867
2022-12-01 12:39:41 -08:00
Richard Yao fbe150fe5b Call va_end() before return in zpool_standard_error_fmt()
Commit ecd6cf800b63704be73fb264c3f5b6e0dafc068d by marks in OpenSolaris
at Tue Jun 26 07:44:24 2007 -0700 introduced a bug where we fail to call
`va_end()` before returning.

The man page for va_start() says:

"Each invocation of va_start() must be matched by a corresponding
invocation of va_end() in the same function."

Coverity complained about this.
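
A minimal sketch of the rule, using a hypothetical helper rather than
zpool_standard_error_fmt() itself: every return path after va_start()
goes through va_end(), including the early one.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical helper: format a message into buf. Returns 0 on
 * success and -1 on failure or truncation.
 */
int
format_error(char *buf, size_t len, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(buf != NULL && fmt != NULL);

	va_start(ap, fmt);
	if (len == 0) {
		/* Early return: va_end() is still required here. */
		va_end(ap);
		return (-1);
	}
	n = vsnprintf(buf, len, fmt, ap);
	va_end(ap);

	/* Treat encoding errors and truncation as failure. */
	return ((n < 0 || (size_t)n >= len) ? -1 : 0);
}
```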

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Chunwei Chen <david.chen@nutanix.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13904
2022-12-01 12:39:41 -08:00
Richard Yao 1ff8f41851 Fix potential NULL pointer dereference in zfsdle_vdev_online()
Coverity complained about this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Chunwei Chen <david.chen@nutanix.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13903
2022-12-01 12:39:40 -08:00
Richard Yao c6d93d0a80 FreeBSD: Fix uninitialized pointer read in spa_import_rootpool()
The FreeBSD project's coverity scans found this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13923
2022-12-01 12:39:40 -08:00
Richard Yao 9f1691a964 Linux: Fix use-after-free in zfsvfs_create()
Coverity reported that we pass a pointer to zfsvfs to
`dmu_objset_disown()` after freeing zfsvfs in zfsvfs_create_impl() after
a failure in zfsvfs_init().

We have nearly identical duplicate versions of this code for FreeBSD and
Linux, but interestingly, the FreeBSD version of this code differs in
such a way that it does not suffer from this bug. We remove the
difference from the FreeBSD version to fix this bug.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13883
2022-12-01 12:39:40 -08:00
Richard Yao 12b859c970 Fix null pointer dereferences in PAM
Coverity caught these.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13889
2022-12-01 12:39:40 -08:00
наб 39a39b8ab9 Handle ECKSUM as new EZFS_CKSUM ‒ "insufficient replicas"
Add a meaningful error message for ECKSUM to common error messages.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #6805
Closes #13808
Closes #13898
2022-12-01 12:39:40 -08:00
Richard Yao 1d5e569a69 Fix use-after-free bugs in icp code
These were reported by Coverity as "Read from pointer after free" bugs.
Presumably, it did not report it as a use-after-free bug because it does
not understand the inline assembly that implements the atomic
instruction.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13881
2022-12-01 12:39:40 -08:00
Richard Yao 3f380df778 Remove incorrect free() in zfs_get_pci_slots_sys_path()
Coverity found this. We attempted to free tmp, which is a pointer to a
string that should be freed by the caller.

Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13864
2022-12-01 12:39:40 -08:00
Richard Yao b247d47be1 Cleanup: Make memory barrier definitions consistent across kernels
We inherited membar_consumer() and membar_producer() from OpenSolaris,
but we had replaced membar_consumer() with Linux's smp_rmb() in
zfs_ioctl.c. The FreeBSD SPL consequently implemented a shim for the
Linux-only smp_rmb().

We reinstate membar_consumer() in platform independent code and fix the
FreeBSD SPL to implement membar_consumer() in a way analogous to Linux.

Reviewed-by: Konstantin Belousov <kib@FreeBSD.org>
Reviewed-by: Mateusz Guzik <mjguzik@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13843
2022-12-01 12:39:40 -08:00
Richard Yao 792825724b zpool_load_compat() should create strings of length ZFS_MAXPROPLEN
Otherwise, `strlcat()` can overflow them.

Coverity found this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13866
2022-12-01 12:39:40 -08:00
Alexander Lobakin ab22031d79 icp: fix all !ENDBR objtool warnings in x86 Asm code
Currently, only Blake3 x86 Asm code has signs of being ENDBR-aware.
At least, under certain conditions it includes some header file and
uses some custom macro from there.
Linux has its own NOENDBR since several releases ago. It's defined
in the same <asm/linkage.h>, so currently <sys/asm_linkage.h>
already is provided with it.

Let's unify those two into one %ENDBR macro. First, check if it's
present already. If so, use the Linux kernel version. Otherwise, try
the second way and use %_CET_ENDBR from <cet.h> if available.
If not, fall back to an empty definition.
This fixes a couple more 'relocations to !ENDBR' across the module.
And now that we always have the latest/actual ENDBR definition, use
it at the entrance of the few corresponding functions that objtool
still complains about. This matches the way it's used in the
upstream x86 core Asm code.

Reviewed-by: Attila Fülöp <attila@fueloep.org>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Closes #14035
2022-12-01 12:39:39 -08:00
Alexander Lobakin 33bc03dea7 icp: fix rodata being marked as text in x86 Asm code
objtool properly complains that it can't decode some of the
instructions from ICP x86 Asm code. As mentioned in the Makefile,
where those object files were excluded from objtool check (but they
can still be visible under IBT and LTO), those are just constants,
not code.
In that case, they must be placed in .rodata, so they won't be
marked as "allocatable, executable" (ax) in ELF headers and this
effectively prevents objtool from trying to decode this data. That
reveals a whole bunch of other issues in ICP Asm code, as previously
objtool was bailing out after that warning message.

Reviewed-by: Attila Fülöp <attila@fueloep.org>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Closes #14035

Conflicts:
	module/Kbuild.in
2022-11-30 10:15:58 -08:00
Alexander Lobakin ee93cbc9d4 icp: properly fix all RETs in x86_64 Asm code
Commit 43569ee374 ("Fix objtool: missing int3 after ret warning")
addressed replacing all `ret`s in x86 asm code to a macro in the
Linux kernel in order to enable SLS. That was done by copying the
upstream macro definitions and fixed objtool complaints.
Since then, several more mitigations were introduced, including
Rethunk. It requires a jump to one of the thunks in order to work,
so the RET macro was changed again. And, as ZFS code didn't use the
mainline definition, but copied it, this is currently missing.

Objtool reminds us about it from time to time (Clang 16, CONFIG_RETHUNK=y):

fs/zfs/lua/zlua.o: warning: objtool: setjmp+0x25: 'naked' return
 found in RETHUNK build
fs/zfs/lua/zlua.o: warning: objtool: longjmp+0x27: 'naked' return
 found in RETHUNK build

Do it the following way:
* if we're building under Linux, unconditionally include
  <linux/linkage.h> in the related files. It is available in x86
  sources since even pre-2.6 times, so doesn't need any conftests;
* then, if the RET macro is available, it will be used directly, so
  that we will always have the version matching the kernel we build
  against;
* if there's no such macro, we define it as a simple `ret`, as it
  was in pre-SLS times.

This ensures we always have the up-to-date definition with no need
to update it manually, and at the same time is safe for the whole
variety of kernels ZFS module supports.
Then, there's a couple more "naked" rets left in the code, they're
just defined as:

	.byte 0xf3,0xc3

In fact, this is just:

	rep ret

`rep ret` instead of just `ret` seems to mitigate performance issues
on some old AMD processors and most likely makes no sense as of
today.
Anyways, address those rets, so that they will be protected with
Rethunk and SLS. Include <sys/asm_linkage.h> here which now always
has RET definition and replace those constructs with just RET.
This wipes the last couple of places with unpatched rets objtool's
been complaining about.

Reviewed-by: Attila Fülöp <attila@fueloep.org>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Closes #14035
2022-11-30 10:15:58 -08:00
Ryan Moeller 1d9aa838ed libzfs recv: Check if user prop before inheritable
User props trigger an assert in zfs_prop_inheritable(), we must check
if the prop is a user prop first.

Signed-off-by: Ryan Moeller <ryan@iXsystems.com>

Backported as snippet from:
63652e1 Add --enable-asan and --enable-ubsan switches
2022-11-30 10:13:23 -08:00
Damian Szuberski 0f4ee295ba dsl_prop_known_index(): check for invalid prop
Resolve UBSAN array-index-out-of-bounds error in zprop_desc_t.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes #14142
Closes #14147
2022-11-08 10:16:21 -08:00
Ameer Hamza 8c0684d326 zed: Avoid core dump if wholedisk property does not exist
zed aborts and dumps core in vdev_whole_disk_from_config() if the
wholedisk property does not exist. make_leaf_vdev() adds the
property but there may already be pools that don't have
wholedisk in the label.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #14062
2022-11-08 10:10:05 -08:00
Ameer Hamza ca3a675c74 zed: Prevent special vdev to be replaced by hot spare
Special vdevs should not be replaced by a hot spare.
Log vdevs already support this, extending the
functionality for special vdevs.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #14129
2022-11-07 13:36:57 -08:00
Attila Fülöp cd1f023846 Deny receiving into encrypted datasets if the keys are not loaded (#14139)
Commit 68ddc06b61 introduced support
for receiving unencrypted datasets as children of encrypted ones but
unfortunately got the logic upside down. This resulted in failing to
deny receives of incremental sends into encrypted datasets without
their keys loaded. If receiving a filesystem, the receive was done
into a newly created unencrypted child dataset of the target. In
case of volumes the receive made the target volume undeletable since
a dataset was created below it, which we obviously can't handle.
Incremental streams with embedded blocks are affected as well.

We fix the broken logic to properly deny receives in such cases.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes #13598
Closes #14055
Closes #14119
2022-11-04 11:07:29 -07:00
Ryan Moeller b27c7a1457 zil: Relax assertion in zil_parse
Rather than panic debug builds when we fail to parse a whole ZIL, let's
instead improve the logging of errors and continue like in a release
build.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #14116
2022-11-01 12:49:14 -07:00
Mariusz Zaborski 186e39f336 quota: extend quota for dataset
This patch relaxes the quota limitation for datasets by around 3%.
This means that a user can write more data than the quota is
set to. However, thanks to that we get more stable bandwidth in
the case where we are overwriting data in place and not consuming any
additional space.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Signed-off-by: Mariusz Zaborski <oshogbo@vexillium.org>
Sponsored-by: Zededa Inc.
Sponsored-by: Klara Inc.
Closes #13839
2022-11-01 12:48:37 -07:00
shodanshok 1d2b0563f7 Fix ARC target collapse when zfs_arc_meta_limit_percent=100
Reclaim metadata when arc_available_memory < 0 even if
meta_used is not bigger than arc_meta_limit.

As described in https://github.com/openzfs/zfs/issues/14054 if
zfs_arc_meta_limit_percent=100 then ARC target can collapse to
arc_min due to arc_prune not freeing any metadata.

This patch lets arc_prune do its work when arc_available_memory
is negative even if meta_used is not bigger than arc_meta_limit,
avoiding ARC target collapse.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gionatan Danti <g.danti@assyoma.it>
Closes #14054
Closes #14093
2022-11-01 12:48:30 -07:00
vaclavskala 8929355b4c Propagate extent_bytes change to autotrim thread
The autotrim thread reads the zfs_trim_extent_bytes_min and
zfs_trim_extent_bytes_max variables only on thread start. We
should check for parameter changes during thread execution to
allow parameter changes to take effect without needing to disable
and then restart autotrim.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Václav Skála <skala@vshosting.cz>
Closes #14077
2022-11-01 12:48:23 -07:00
Coleman Kane 212ba9bd97 Linux 6.1 compat: change order of sys/mutex.h includes
After Linux 6.1-rc1 came out, the build started failing on a
couple of the files in the Linux SPL code due to the mutex_init
redefinition. Moving the sys/mutex.h include to a lower position within
these two files appears to fix the problem.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #14040
2022-11-01 12:44:56 -07:00
Brian Behlendorf 7ce097c874 Linux 6.0 compat: META
Update the META file to reflect compatibility with the 6.0 kernel.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #14091
2022-11-01 12:43:49 -07:00
Alexander 3e767e34bd Linux compat: fix DECLARE_EVENT_CLASS() test when ZFS is built-in
ZFS_LINUX_TRY_COMPILE_HEADER macro doesn't take CONFIG_ZFS=y into
account. As a result, on several latest Linux versions, configure
script marks DECLARE_EVENT_CLASS() available for non-GPL when ZFS
is being built as a module, but marks it unavailable when ZFS is
built-in.
Follow the logic of the neighbor macros and adjust
ZFS_LINUX_TRY_COMPILE_HEADER accordingly, so that it doesn't try
to look for a .ko when ZFS is built-in.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Closes #14006
2022-11-01 12:43:29 -07:00
Christian Schwarz df000276b8 zfs_domount: fix double-disown of dataset / double-free of zfsvfs_t
Before this patch, in zfs_domount, if zfs_root or d_make_root fails, we
leave zfsvfs != NULL. This will lead to execution of the error handling
`if` statement at the `out` label, and hence to a call to
dmu_objset_disown and zfsvfs_free.

However, zfs_umount, which we call upon failure of zfs_root and
d_make_root already does dmu_objset_disown and zfsvfs_free.
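
The failure mode and one way to avoid it can be sketched as below. This
is a simplified illustration with hypothetical names, not the actual
zfs_domount() code: once the unmount path has released the object, the
local pointer is cleared so the shared cleanup does not release it a
second time.

```c
#include <stdlib.h>

struct fs {
	int mounted;
};

static int free_count;	/* counts releases so the sketch can be checked */

static void
fs_free(struct fs *f)
{
	free(f);
	free_count++;
}

/* Stand-in for the unmount path, which already releases the object. */
static void
fs_umount(struct fs *f)
{
	fs_free(f);
}

int
fs_free_count(void)
{
	return (free_count);
}

/* Hypothetical mount routine; root_fails simulates a late failure. */
int
fs_mount(int root_fails)
{
	struct fs *f = calloc(1, sizeof (*f));
	int error = 0;

	if (f == NULL)
		return (-1);

	if (root_fails) {
		error = -1;
		fs_umount(f);	/* releases f on our behalf */
		f = NULL;	/* the cleanup below must not see it */
	}

	/* Shared cleanup; this demo releases f on success as well. */
	if (f != NULL)
		fs_free(f);
	return (error);
}
```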

I suppose this patch rather adds to the brittleness of this part of the
code base, but I don't want to invest more time in this right now.
To add a regression test, we'd need some kind of fault injection
facility for zfs_root or d_make_root, which doesn't exist right now.
And even then, I think that regression test would be too closely tied
to the implementation.

To repro the double-disown / double-free, do the following:
1. patch zfs_root to always return an error
2. mount a ZFS filesystem

Here's the stack trace you would see then:

  VERIFY3(ds->ds_owner == tag) failed (0000000000000000 == ffff9142361e8000)
  PANIC at dsl_dataset.c:1003:dsl_dataset_disown()
  Showing stack for process 28332
  CPU: 2 PID: 28332 Comm: zpool Tainted: G           O      5.10.103-1.nutanix.el7.x86_64 #1
  Call Trace:
   dump_stack+0x74/0x92
   spl_dumpstack+0x29/0x2b [spl]
   spl_panic+0xd4/0xfc [spl]
   dsl_dataset_disown+0xe9/0x150 [zfs]
   dmu_objset_disown+0xd6/0x150 [zfs]
   zfs_domount+0x17b/0x4b0 [zfs]
   zpl_mount+0x174/0x220 [zfs]
   legacy_get_tree+0x2b/0x50
   vfs_get_tree+0x2a/0xc0
   path_mount+0x2fa/0xa70
   do_mount+0x7c/0xa0
   __x64_sys_mount+0x8b/0xe0
   do_syscall_64+0x38/0x50
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Co-authored-by: Christian Schwarz <christian.schwarz@nutanix.com>
Signed-off-by: Christian Schwarz <christian.schwarz@nutanix.com>
Closes #14025
2022-11-01 12:42:32 -07:00
Richard Yao 7a1b6c51d0 Linux: Remove ZFS_AC_KERNEL_SRC_MODULE_PARAM_CALL_CONST autotools check
On older kernels, the definition for `module_param_call()` typecasts
function pointers to `(void *)`, which triggers -Werror, causing the
check to return false when it should return true.

Fixing this breaks the build process on some older kernels because they
define a `__check_old_set_param()` function in their headers that checks
for a non-constified `->set()`. We work around that through the C
preprocessor by defining `__check_old_set_param(set)` to `(set)`, which
prevents the build failures.

However, it is now apparent that all kernels that we support have
adopted the GRSecurity change, so there is no need to have an explicit
autotools check for it anymore. We therefore remove the autotools check,
while adding the workaround to our headers for the build time
non-constified `->set()` check done by older kernel headers.
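
The workaround can be illustrated roughly as follows. This is a sketch
under assumptions: `mock_set_fn_t`, `mock_set()` and
`mock_registered_set()` are hypothetical, and only the macro definition
itself comes from the description above.

```c
#include <stddef.h>

/* Hypothetical callback type standing in for a constified ->set(). */
typedef int (*mock_set_fn_t)(const char *val, const void *kp);

/*
 * The macro from the commit message. Any later call-like use of the
 * name expands to its bare argument, so no helper function with a
 * non-const signature ever sees the callback.
 */
#define	__check_old_set_param(set)	(set)

static int
mock_set(const char *val, const void *kp)
{
	(void) kp;
	return (val == NULL ? -1 : 0);
}

static mock_set_fn_t
mock_registered_set(void)
{
	/* Expands to (mock_set); no type check on the callback runs. */
	return (__check_old_set_param(mock_set));
}
```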

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13984
Closes #14004
2022-11-01 12:42:01 -07:00
George Melikov 4dd9c3b08e CI: bump actions/upload-artifact to v3
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #14018
2022-11-01 12:38:22 -07:00
George Melikov 1bbc09e1f7 CI: bump actions/checkout to v3
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #14018
2022-11-01 12:38:09 -07:00
Serapheim Dimitropoulos 37d5a3e04b Stop ganging due to past vdev write errors
= Problem

While examining a customer's system we noticed unreasonable space
usage from a few snapshots due to gang blocks. Under some further
analysis we discovered that the pool would create gang blocks because
all its disks had non-zero write error counts and they'd be skipped
for normal metaslab allocations due to the following if-clause in
`metaslab_alloc_dva()`:
```
	/*
	 * Avoid writing single-copy data to a failing,
	 * non-redundant vdev, unless we've already tried all
	 * other vdevs.
	 */
	if ((vd->vdev_stat.vs_write_errors > 0 ||
	    vd->vdev_state < VDEV_STATE_HEALTHY) &&
	    d == 0 && !try_hard && vd->vdev_children == 0) {
		metaslab_trace_add(zal, mg, NULL, psize, d,
		    TRACE_VDEV_ERROR, allocator);
		goto next;
	}
```

= Proposed Solution

Get rid of the predicate in the if-clause that checks the past
write errors of the selected vdev. We still try to allocate from
HEALTHY vdevs anyway by checking vdev_state, so the past write
error count doesn't seem to help us (quite the opposite - it can cause
issues in long-lived pools like the one from our customer).
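
As a rough sketch of the change in the predicate (not the patch itself),
with `mock_vdev_t`, the `MOCK_*` constants and both helper functions
being simplified, hypothetical stand-ins for the real structures:

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative state values; only their ordering matters here. */
enum { MOCK_VDEV_STATE_DEGRADED = 6, MOCK_VDEV_STATE_HEALTHY = 7 };

/* Simplified stand-in for the vdev fields the if-clause looks at. */
typedef struct mock_vdev {
	uint64_t write_errors;	/* vd->vdev_stat.vs_write_errors */
	int state;		/* vd->vdev_state */
	uint64_t children;	/* vd->vdev_children */
} mock_vdev_t;

/* Old condition: any past write error makes the vdev get skipped. */
static bool
skip_vdev_old(const mock_vdev_t *vd, int d, bool try_hard)
{
	return ((vd->write_errors > 0 ||
	    vd->state < MOCK_VDEV_STATE_HEALTHY) &&
	    d == 0 && !try_hard && vd->children == 0);
}

/* Proposed condition: only the current vdev state is considered. */
static bool
skip_vdev_new(const mock_vdev_t *vd, int d, bool try_hard)
{
	return (vd->state < MOCK_VDEV_STATE_HEALTHY &&
	    d == 0 && !try_hard && vd->children == 0);
}
```

A healthy leaf vdev with one historical write error is skipped by the
old condition but not by the new one.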

= Testing

I first created a pool with 3 vdevs:
```
$ zpool list -v volpool
NAME        SIZE  ALLOC   FREE
volpool    22.5G   117M  22.4G
  xvdb     7.99G  40.2M  7.46G
  xvdc     7.99G  39.1M  7.46G
  xvdd     7.99G  37.8M  7.46G
```

And used `zinject` like so with each one of them:
```
$ sudo zinject -d xvdb -e io -T write -f 0.1 volpool
```

And got the vdevs to the following state:
```
$ zpool status volpool
  pool: volpool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.
...<cropped>..
action: Determine if the device needs to be replaced, and clear the
...<cropped>..
config:

	NAME        STATE     READ WRITE CKSUM
	volpool     ONLINE       0     0     0
	  xvdb      ONLINE       0     1     0
	  xvdc      ONLINE       0     1     0
	  xvdd      ONLINE       0     4     0

```

I also double-checked their write error counters with sdb:
```
sdb> spa volpool | vdev | member vdev_stat.vs_write_errors
(uint64_t)0  # <---- this is the root vdev
(uint64_t)2
(uint64_t)1
(uint64_t)1
```

Then I checked that the problem was reproduced in my VM, as the
gang count was growing in zdb as I was writing more data:
```
$ sudo zdb volpool | grep gang
        ganged count:              1384

$ sudo zdb volpool | grep gang
        ganged count:              1393

$ sudo zdb volpool | grep gang
        ganged count:              1402

$ sudo zdb volpool | grep gang
        ganged count:              1414
```

Then I updated my bits with this patch and the gang count stayed the
same.

Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes #14003
2022-11-01 12:36:25 -07:00
Serapheim Dimitropoulos 25096e1180 zvol_wait logic may terminate prematurely
Setups that have a lot of zvols may see zvol_wait terminate prematurely
even though the script is still making progress.  For example, we have a
customer that called zvol_wait for ~7100 zvols and by the last iteration
of that script it was still waiting on ~2900. Similarly another one
called zvol_wait for 2200 and by the time the script terminated there
were only 50 left.

This patch adjusts the logic to stay within the outer loop of the script
if we are making any progress whatsoever.

Reviewed-by: George Wilson <gwilson@delphix.com>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Don Brady <don.brady@delphix.com>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes #13998
2022-11-01 12:35:36 -07:00
shodanshok 820edcbf91 Remove ambiguity on demand vs prefetch stats reported by arc_summary
arc_summary currently lists prefetch stats as "demand prefetch".
However, a hit/miss can be due to demand or prefetch, not both.
To remove any confusion, this patch removes the "Demand" word
from the affected lines.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Gionatan Danti <g.danti@assyoma.it>
Closes #13985
2022-11-01 12:35:05 -07:00
Serapheim Dimitropoulos 37763ea2a6 Fix panic in dsl_process_sub_livelist for EINTR
= Issue

Recently we hit an assertion panic in `dsl_process_sub_livelist` while
exporting the spa and interrupting `bpobj_iterate_nofree`. In that case
`bpobj_iterate_nofree` stops mid-way returning an EINTR without clearing
the intermediate AVL tree that keeps track of the livelist entries it
has encountered so far. At that point the code has a VERIFY for the
number of elements of the AVL expecting it to be zero (which is not the
case for EINTR).

= Fix

Clean up any intermediate state before destroying the AVL when
encountering EINTR. Also added a comment documenting the scenario where
the EINTR comes up. There is no need to do anything else for the callers
of `dsl_process_sub_livelist` as they already handle the EINTR case.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes #13939
2022-11-01 12:34:08 -07:00
Mateusz Guzik c8d6a91a99 Bring per_txg_dirty_frees_percent back to 30
The current value causes significant artificial slowdown during mass
parallel file removal, which can be observed both on FreeBSD and Linux
when running real workloads.

Sample results from Linux doing make -j 96 clean after an allyesconfig
modules build:

before: 4.14s user 6.79s system 48% cpu 22.631 total
after:	4.17s user 6.44s system 153% cpu 6.927 total

FreeBSD results in the ticket.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by:	Mateusz Guzik <mjguzik@gmail.com>
Closes #13932
Closes #13938
2022-11-01 12:32:40 -07:00
Akash B 7ac732b8d6 Add options to zfs redundant_metadata property
Currently, additional/extra copies are created for metadata in
addition to the redundancy provided by the pool (mirror/raidz/draid).
As a result, twice as much space is used per inode, which decreases
the total number of inodes that can be created in the filesystem. By
setting redundant_metadata to none, no additional copies of metadata
are created, which reduces the space consumed by the additional
metadata copies and increases the total number of inodes that can be
created in the filesystem.  Additionally, this can improve file create
performance due to the reduced amount of metadata which needs
to be written.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Dipak Ghosh <dipak.ghosh@hpe.com>
Signed-off-by: Akash B <akash-b@hpe.com>
Closes #13680
2022-11-01 12:25:58 -07:00
Andriy Gapon 04f1983aab FreeBSD: vn_flush_cached_data: observe vnode locking contract
vm_object_page_clean() expects that the associated vnode is locked
as VOP_PUTPAGES() may get called on the vnode.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Andriy Gapon <avg@FreeBSD.org>
Closes #14079
(cherry picked from commit 41133c9794)
2022-10-27 16:14:57 -07:00
Mark Johnston 4e3fecbdfd FreeBSD: Fix a pair of bugs in zfs_fhtovp()
- Add a zfs_exit() call in an error path, otherwise a lock is leaked.
- Remove the fid_gen > 1 check.  That appears to be Linux-specific:
  zfsctl_snapdir_fid() sets fid_gen to 0 or 1 depending on whether the
  snapshot directory is mounted.  On FreeBSD it fails, making snapshot
  dirs inaccessible via NFS.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Andriy Gapon <avg@FreeBSD.org>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Fixes: 43dbf88178 ("FreeBSD: vfsops: use setgen for error case")
Closes #14001
Closes #13974
(cherry picked from commit ed566bf1cd)
2022-10-26 14:59:25 -07:00
samwyc fc1c0053f9 Fix sequential resilver drive failure race condition
This patch handles the race condition on simultaneous failure of
2 drives, which misses the vdev_rebuild_reset_wanted signal in
vdev_rebuild_thread. We retry to catch this inside the
vdev_rebuild_complete_sync function.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Dipak Ghosh <dipak.ghosh@hpe.com>
Reviewed-by: Akash B <akash-b@hpe.com>
Signed-off-by: Samuel Wycliffe J <samwyc@hpe.com>
Closes #14041
Closes #14050
2022-10-21 14:05:06 -07:00
Brian Behlendorf 7795975681 contrib: dracut: zfs-snapshot-bootfs: exit status fix
Correct misplaced `-` in the original backport of #13769.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #13769
2022-10-20 11:37:21 -07:00
gregory-lee-bartholomew 3b935cc3ed contrib: dracut: zfs-{rollback,snapshot}-bootfs: explicit snapname fix
Due to a missing semicolon on the ExecStart line, it wasn't possible
to specify the snapshot name on the bootfs.{rollback,snapshot}
kernel parameters if the boot dataset name was obtained from the
root=zfs:... kernel parameter.

Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gregory Bartholomew <gregory.lee.bartholomew@gmail.com>
Closes #13585
2022-10-20 11:34:59 -07:00
Richard Yao b0bc882395 kcfpool_alloc() should have its argument list marked void
This error occurred when building on Gentoo with debugging enabled:

zfs-kmod-2.1.6/work/zfs-2.1.6/module/icp/core/kcf_sched.c:1277:14:
error: a function declaration without a prototype is deprecated
in all versions of C [-Werror,-Wstrict-prototypes]
  kcfpool_alloc()
               ^
               void
1 error generated.

This function is not present in master.
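
For illustration, the distinction the compiler is pointing at, with
`mock_pool_alloc()` as a hypothetical stand-in for the real function:

```c
/*
 * Before C23, an empty parameter list "()" declares a function without
 * a prototype, so calls are not checked against its parameters.
 * Writing "(void)" states explicitly that it takes no arguments, which
 * is what -Wstrict-prototypes asks for.
 */
static int
mock_pool_alloc(void)
{
	return (0);
}
```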

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14023
2022-10-12 15:47:39 -07:00
наб 8cf59e97c4 etc: mask zfs-load-key.service
Otherwise, systemd-sysv-generator will generate a service equivalent
that breaks the boot: under systemd this is covered by
zfs-mount-generator

We already do this for zfs-import.service, and other init scripts are
suppressed automatically by the "actual" .service files

Fixes: commit f04b976200 ("Add init script
 to load keys")
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #14010
Closes #14019
2022-10-12 15:29:21 -07:00
Damian Szuberski 4d22befde6 initramfs: use mount.zfs instead of mount
A followup to d7a67402a8

For the `mount -t zfs -o opts ds mp` command line,
some implementations of `mount(8)`, e.g. Busybox in Debian,
work as follows:

```
newfstatat(AT_FDCWD, "ds", 0x7fff826f4ab0, 0) = -1
mount("ds", "mp", "zfs", MS_SILENT, NULL) = 0
```

The logic above completely skips `mount.zfs` and prevents us
from reading filesystem properties and applying mount options.

For comparison, the util-linux `mount(8)` implementation does:

```
openat(AT_FDCWD, "/proc/filesystems", O_RDONLY|O_CLOEXEC) = 3
// figure out that zfs is a `nodev` filesystem and look for a helper
newfstatat(AT_FDCWD, "/sbin/mount.zfs" ...) = 0
execve("/sbin/mount.zfs" ...) = 0
```

Using `mount.zfs` in initramfs would help circumvent deficiencies
of some `mount(8)` implementations. `mount -t zfs` translates
to a `mount.zfs` invocation, except when explicitly disabled
by `-i`.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes #13305
(cherry picked from commit 35d81a75a8)
2022-10-05 17:01:39 -07:00
Tony Hutter 6a6bd49398 Tag zfs-2.1.6
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2022-09-28 17:25:10 -07:00
Richard Yao 566e908fa0 Fix bad free in skein code
Clang's static analyzer found a bad free caused by skein_mac_atomic().
It will allocate a context on the stack and then pass it to
skein_final(), which attempts to free it. Upon inspection,
skein_digest_atomic() also has the same problem.

These functions were created to match the OpenSolaris ICP API, so I was
curious how we avoided this in other providers and looked at the SHA2
code. It appears that SHA2 has a SHA2Final() helper function that is
called by the exported sha2_mac_final()/sha2_digest_final() as well as
the sha2_mac_atomic() and sha2_digest_atomic() functions. The real work
is done in SHA2Final() while some checks and the free are done in
sha2_mac_final()/sha2_digest_final().

We fix the bad free in the skein code by taking inspiration from
the SHA2 code. We introduce a skein_final_nofree() that does most of the
work, and make skein_final() into a function that calls it and then
frees the memory.
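
The split can be sketched as below. This only illustrates the pattern;
`mock_ctx_t` and the `mock_*` functions are hypothetical and do not
compute a real digest.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical context; the real one holds the hash state. */
typedef struct mock_ctx {
	unsigned char state[8];
} mock_ctx_t;

/* Does the real work and never frees ctx, so stack contexts are safe. */
static void
mock_final_nofree(mock_ctx_t *ctx, unsigned char *digest)
{
	memcpy(digest, ctx->state, sizeof (ctx->state));
}

/* For heap-allocated contexts: finish, then free. */
static void
mock_final(mock_ctx_t *ctx, unsigned char *digest)
{
	mock_final_nofree(ctx, digest);
	free(ctx);
}

/* Atomic path: the context lives on the stack, so nothing is freed. */
static void
mock_atomic(const unsigned char *in, unsigned char *digest)
{
	mock_ctx_t ctx;

	memcpy(ctx.state, in, sizeof (ctx.state));
	mock_final_nofree(&ctx, digest);
}
```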

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13954
2022-09-28 17:25:10 -07:00
Tony Hutter a2705b1dd5 zpool: Don't print "repairing" on force faulted drives
If you force fault a drive that's resilvering, its scan stats can get
frozen in time, giving the false impression that it's being resilvered.
This commit checks the vdev state to see if the vdev is healthy before
reporting "resilvering" or "repairing" in zpool status.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #13927
Closes #13930
2022-09-28 12:41:23 -07:00
Mateusz Guzik 63d4838b4a FreeBSD: handle V_PCATCH
See https://cgit.FreeBSD.org/src/commit/?id=a75d1ddd74312f5dd79bc1e965f7077679659f2e

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #13910
2022-09-28 10:35:13 -07:00
Mateusz Guzik eec942cc54 FreeBSD: catch up to 1400068
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #13909
2022-09-28 10:35:13 -07:00
Mateusz Guzik 2c8e3e4b28 FreeBSD: stop passing LK_INTERLOCK to VOP_LOCK
There is an ongoing effort to eliminate this feature.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #13908
2022-09-28 10:35:13 -07:00
Richard Yao 55816c64da FreeBSD: Fix integer conversion for vnlru_free{,_vfsops}()
When reviewing #13875, I noticed that our FreeBSD code has an issue
where it converts from `int64_t` to `int` when calling
`vnlru_free{,_vfsops}()`. The result is that if the int64_t is `1 <<
36`, the int will be 0, since the low bits are 0. Even when some low
bits are set, a value such as `((1 << 36) + 1)` would truncate to 1,
which is wrong.

There is protection against this on 32-bit platforms, but on 64-bit
platforms, there is no check to protect us, so we add a check.
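
A minimal sketch of the kind of check described, assuming a simple
clamp. `clamp_to_int()` is a hypothetical helper and the actual patch
may be written differently:

```c
#include <limits.h>
#include <stdint.h>

/* Clamp a 64-bit count into int range instead of truncating it. */
static int
clamp_to_int(int64_t count)
{
	if (count > INT_MAX)
		return (INT_MAX);
	if (count < INT_MIN)
		return (INT_MIN);
	return ((int)count);
}
```

A request for `(int64_t)1 << 36` items then becomes INT_MAX rather than
0, and small values pass through unchanged.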

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13882
2022-09-28 10:35:13 -07:00
Ryan Moeller 8dcd6af623 FreeBSD: Ignore symlink to i386 includes
A symlink to i386 includes is created in the build dir on amd64 since
freebsd/freebsd-src@d07600c563

Tell git to ignore it like the other include links.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #13719
2022-09-28 10:35:13 -07:00
Richard Yao c973929b29 LUA: Fix CVE-2014-5461
Apply the fix from upstream.

http://www.lua.org/bugs.html#5.2.2-1
https://www.opencve.io/cve/CVE-2014-5461

It should be noted that exploiting this requires the `SYS_CONFIG`
privilege, and anyone with that privilege likely has other opportunities
to do exploits, so it is unlikely that bad actors could exploit this
unless system administrators are executing untrusted ZFS Channel
Programs.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13949
2022-09-27 16:49:02 -07:00
Richard Yao 835e03682c Linux: Fix uninitialized variable usage in zio_do_crypt_data()
Coverity complained about this. An error from `hkdf_sha512()` before uio
initialization will cause pointers to uninitialized memory to be passed
to `zio_crypt_destroy_uio()`. This is a regression that was introduced
by cf63739191. Interestingly, this never
affected FreeBSD, since the FreeBSD version never had that patch ported.
Since moving uio initialization to the top of this function would slow
down the qat_crypt() path, we only move the `memset()` calls to the top
of the function. This is sufficient to fix this problem.
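
The shape of the fix can be sketched as follows. All names here
(`mock_uio_t`, `mock_destroy_uio()`, `mock_do_crypt()`) are hypothetical
and the key derivation is reduced to a flag:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified uio. */
typedef struct mock_uio {
	void *iov;
	int iovcnt;
} mock_uio_t;

/* Cleanup helper; returns 1 if it had buffers to release. */
static int
mock_destroy_uio(mock_uio_t *uio)
{
	int had_buffers = (uio->iov != NULL);

	uio->iov = NULL;
	uio->iovcnt = 0;
	return (had_buffers);
}

static int
mock_do_crypt(int key_error, int *released)
{
	static char buf[16];
	mock_uio_t puio, cuio;
	int error = 0;

	/* Zero both uios first so every error path sees defined data. */
	memset(&puio, 0, sizeof (puio));
	memset(&cuio, 0, sizeof (cuio));

	if (key_error) {
		error = -1;
		goto out;
	}

	/* The more expensive uio setup stays where it was. */
	puio.iov = buf;
	puio.iovcnt = 1;
	cuio.iov = buf;
	cuio.iovcnt = 1;
out:
	*released = mock_destroy_uio(&puio) + mock_destroy_uio(&cuio);
	return (error);
}
```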

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13944
2022-09-27 15:43:26 -07:00
Alexander Motin 33223cbc3c Refactor Log Size Limit
The original Log Size Limit implementation blocked all writes once the
limit was reached, until the TXG was committed and the log freed.  It
caused huge delays followed by speed spikes in application writes.

This implementation instead smoothly throttles writes, using exactly
the same mechanism as used for dirty data.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: jxdking <lostking2008@hotmail.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Issue #12284
Closes #13476
2022-09-26 14:55:27 -07:00
Brian Behlendorf 91e02156dd Revert "Reduce dbuf_find() lock contention"
This reverts commit 34dbc618f5.  While this
change resolved the lock contention observed for certain workloads, it
inadvertently reduced the maximum hash inserts/removes per second.  This
appears to be due to the slightly higher acquisition cost of a rwlock vs
a mutex.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
2022-09-21 13:15:51 -07:00
Richard Yao b66f8d3c2b Add zfs_btree_verify_intensity kernel module parameter
I see a few issues in the issue tracker that might be aided by being
able to turn this on. We have no module parameter for it, so I would
like to add one.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13874
2022-09-21 13:15:51 -07:00
Richard Yao 5096ed31c8 Fix incorrect size given to bqueue_enqueue() call in dmu_redact.c
We pass sizeof (struct redact_record *) rather than sizeof (struct
redact_record). Passing the pointer size is wrong.

Coverity caught this in two places.
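
To illustrate the difference, using a hypothetical `struct mock_record`
in place of the real record type:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record; only its size matters for this example. */
struct mock_record {
	uint64_t start;
	uint64_t end;
	uint64_t datablksz;
	int eos;
};

/* Size of a pointer to the record: what was passed by mistake. */
static size_t
wrong_size(void)
{
	return (sizeof (struct mock_record *));
}

/* Size of the record itself: what the queue should account for. */
static size_t
right_size(void)
{
	return (sizeof (struct mock_record));
}
```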

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13885
2022-09-21 13:15:51 -07:00
Ameer Hamza 035e52f591 Delay ZFS_PROP_SHARESMB property to handle it for encrypted raw receive
For encrypted raw receive, objset creation is delayed until a call to
dmu_recv_stream(). ZFS_PROP_SHARESMB property requires objset to be
populated when calling zpl_earlier_version(). To correctly handle the
ZFS_PROP_SHARESMB property for encrypted raw receive, this change
delays setting the property.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #13878
2022-09-21 13:15:26 -07:00
Ameer Hamza d5105f068f zfs recv hangs if max recordsize is less than received recordsize
- Some optimizations for bqueue enqueue/dequeue.
- Added a fix to prevent deadlock when both bqueue_enqueue_impl()
and bqueue_dequeue() wait for a signal to be triggered.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #13855
2022-09-21 13:15:26 -07:00
наб faa1e4082d include: move SPA_MINBLOCKSHIFT and zio_encrypt to sys/fs/zfs.h
These are used by userspace, so should live in a public header

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12116
2022-09-21 13:15:26 -07:00
Alexander Motin 44cec45f72 Improve too large physical ashift handling
When iterating through children physical ashifts for vdev, prefer
ones above the maximum logical ashift, that we can actually use,
but within the administrator defined maximum.

When selecting the top-level vdev ashift, do not set it to the defined
maximum if the physical ashift is even higher; just ignore that value.
Using the maximum does not prevent misaligned writes, but reduces
space efficiency.  Since ZFS tries to write data sequentially and
aggregates the writes, in many cases large misaligned writes may be
not as bad as the space penalty otherwise.

Allow internal physical ashifts for vdevs higher than SHIFT_MAX.
Maybe one day the allocator or aggregation could benefit from that.

Reduce zfs_vdev_max_auto_ashift default from 16 (64KB) to 14 (16KB),
so that ZFS may still use bigger ashifts up to SHIFT_MAX (64KB),
but only if it really has to or explicitly told to, but not as an
"optimization".

There are some read-intensive NVMe SSDs that report a Preferred Write
Alignment of 64KB, and an attempt to build a RAIDZ2 of those leads to a
space inefficiency that can't be justified.  Instead these changes
make ZFS fall back to a logical ashift of 12 (4KB) by default and
only warn the user that it may be suboptimal for performance.
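
The ashift values quoted above are base-2 exponents of the allocation
size. A small sketch of the arithmetic, with `ashift_to_bytes()` being a
hypothetical helper rather than a ZFS function:

```c
#include <stdint.h>

/* Convert an ashift (log2 of the sector size) to a size in bytes. */
static uint64_t
ashift_to_bytes(unsigned ashift)
{
	return ((uint64_t)1 << ashift);
}
```

So ashift 16 corresponds to 64KB, 14 to 16KB and 12 to 4KB.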

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes #13798
2022-09-21 13:15:15 -07:00
Rich Ercolani ebbbe01e31 Ask libtool to stop hiding some errors
For #13083, curiously, it did not print the actual error, just
that the compile failed with "Error 1".

In theory, this flag should cause it to report errors twice sometimes.
In practice, I'm pretty okay with reporting some twice if it avoids
reporting some never.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Damian Szuberski <szuberskidamian@gmail.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #13086
2022-09-21 16:12:14 -07:00
Kevin Jin d05f3039f7 Add Module Parameter Regarding Log Size Limit
zfs_wrlog_data_max
The upper limit of TX_WRITE log data. Once it is reached,
write operation is blocked, until log data is cleared out
after txg sync. It only counts TX_WRITE log with WR_COPIED
or WR_NEED_COPY.

Reviewed-by: Prakash Surya <prakash.surya@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: jxdking <lostking2008@hotmail.com>
Closes #12284
2022-09-21 16:12:14 -07:00
Kevin Jin 999830a021 Optimize txg_kick() process (#12274)
Use dp_dirty_pertxg[] for txg_kick(), instead of dp_dirty_total in
original code. Extra parameter "txg" is added for txg_kick(), thus it
knows which txg to kick. Also txg_kick() call is moved from
dsl_pool_need_dirty_delay() to dsl_pool_dirty_space() so that we can
know the txg number assigned for txg_kick().

Some unnecessary code regarding dp_dirty_total in txg_sync_thread() is
also cleaned up.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: jxdking <lostking2008@hotmail.com>
Closes #12274
2022-09-21 16:12:14 -07:00
Ameer Hamza a5b0d42540 zfs recv hangs if max recordsize is less than received recordsize
- Some optimizations for bqueue enqueue/dequeue.
- Added a fix to prevent deadlock when both bqueue_enqueue_impl()
and bqueue_dequeue() wait for a signal to be triggered.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #13855
2022-09-19 09:39:07 -07:00
Christian Schwarz cde04badd1 make DMU_OT_IS_METADATA and DMU_OT_IS_ENCRYPTED return B_TRUE or B_FALSE
Without this patch, the

    ASSERT3U(dbuf_is_metadata(db), ==, arc_is_metadata(buf));

at the beginning of dbuf_assign_arcbuf can panic
if the object type is a DMU_OT_NEWTYPE that has
DMU_OT_METADATA set.

While we're at it, fix DMU_OT_IS_ENCRYPTED as well.
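
A sketch of why the comparison can fail, using hypothetical `MOCK_*`
names and an assumed flag value instead of the real macros:

```c
/* Hypothetical boolean type mirroring B_FALSE / B_TRUE. */
typedef enum { MOCK_B_FALSE = 0, MOCK_B_TRUE = 1 } mock_boolean_t;

/* Assumed flag bit, for illustration only. */
#define	MOCK_OT_METADATA	0x40

/* Returns the raw masked bit, i.e. 0x40 when the flag is set. */
#define	MOCK_IS_METADATA_OLD(ot)	((ot) & MOCK_OT_METADATA)

/* Normalizes the result so it compares equal to a boolean. */
#define	MOCK_IS_METADATA_NEW(ot) \
	((((ot) & MOCK_OT_METADATA) != 0) ? MOCK_B_TRUE : MOCK_B_FALSE)
```

An equality assertion against a function that returns a boolean only
holds for the normalized form, since 0x40 is not equal to 1.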

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Christian Schwarz <christian.schwarz@nutanix.com>
Closes #13842
2022-09-15 16:58:35 -07:00
Richard Yao 3f7c174b50 vdev_draid_lookup_map() should not iterate outside draid_maps
Coverity reported this as an out-of-bounds read.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13865
2022-09-15 16:58:35 -07:00
Akash B 03fa3ef264 Add physical device size to SIZE column in 'zpool list -v'
Add physical device size/capacity only for physical devices in
'zpool list -v' instead of displaying "-" in the SIZE column.
This would make it easier to see the individual device capacity and
to determine which spares are large enough to replace which devices.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Dipak Ghosh <dipak.ghosh@hpe.com>
Signed-off-by: Akash B <akash-b@hpe.com>
Closes #12561
Closes #13106
2022-09-15 10:23:01 -07:00
George Amanakis 8bd3dca9bf Introduce a tunable to exclude special class buffers from L2ARC
Special allocation class or dedup vdevs may have roughly the same
performance as L2ARC vdevs. Introduce a new tunable to exclude those
buffers from being cacheable on L2ARC.

Reviewed-by: Don Brady <don.brady@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #11761
Closes #12285
2022-09-14 11:27:00 -07:00
наб c8f795ba53 config: check for parallel(1), use it for cstyle
Before:
$ time make cstyle
real    0m23.118s
user    0m23.002s
sys     0m0.114s

After:
$ time make cstyle
real    0m4.577s
user    0m31.487s
sys     0m0.699s

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Issue #12899
2022-09-14 11:23:25 -07:00
Tony Hutter 7bbfac9d04 zed: Fix config_sync autoexpand flood
Users were seeing floods of `config_sync` events when autoexpand was
enabled.  This happened because all "disk status change" udev events
invoke the autoexpand codepath, which calls zpool_relabel_disk(),
which in turn cause another "disk status change" event to happen,
in a feedback loop.  Note that "disk status change" happens every time
a user calls close() on a block device.

This commit breaks the feedback loop by only allowing an autoexpand
to happen if the disk actually changed size.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes: #7132
Closes: #7366
Closes #13729
2022-09-14 09:57:44 -07:00
Walter Huf 2010c183bc Add xattr_handler support for Android kernels
Some ARM BSPs run the Android kernel, which has
a modified xattr_handler->get() function signature.
This adds support to compile against these kernels.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Walter Huf <hufman@gmail.com>
Closes #13824
2022-09-14 09:57:37 -07:00
Samuel aa9e887d2a Fix column width in 'zpool iostat -v' and 'zpool list -v'
This commit fixes a minor spacing issue caused when
enumerating vdev names, which originated from #13031

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Akash B <akash-b@hpe.com>
Signed-off-by: Samuel Wycliffe <samuelwycliffe@gmail.com>
Closes #13811
2022-09-14 09:57:05 -07:00
Ryan Moeller 78206a2e44 FreeBSD: Mark ZFS_MODULE_PARAM_CALL as MPSAFE
ZFS_MODULE_PARAM_CALL handlers implement their own locking if needed
and do not require Giant.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #13756
2022-09-13 17:59:15 -07:00
Alexander Motin b6ebf270eb Apply arc_shrink_shift to ARC above arc_c_min
It makes sense to free memory in smaller chunks when approaching
arc_c_min to let other kernel subsystems free more, since after
that point we can't free anything.  This also matches behavior on
Linux, where only the size above arc_c_min is reported to the shrinker.
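
A minimal sketch of the arithmetic this describes, assuming the shift
is applied to the portion of the target size above the minimum.
`mock_shrink_amount()` is a hypothetical helper, not the real code:

```c
#include <stdint.h>

/* Amount to free in one step: only the part above the floor counts. */
static uint64_t
mock_shrink_amount(uint64_t arc_c, uint64_t arc_c_min, int shrink_shift)
{
	if (arc_c <= arc_c_min)
		return (0);
	return ((arc_c - arc_c_min) >> shrink_shift);
}
```

The closer the target size gets to the minimum, the smaller each step
becomes, and nothing is freed once the minimum is reached.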

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #13794
2022-09-13 17:59:10 -07:00
George Wilson 15b64fbc94 Importing from cachefile can trip assertion
When importing from cachefile, it is possible that the builtin retry
logic will trip an assertion because it also fails to find the pool.
This fix addresses that case and returns the correct error message to
the user.

Reviewed-by: Richard Yao <ryao@gentoo.org>
Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Wilson <gwilson@delphix.com>
Closes #13781
2022-09-13 17:59:04 -07:00
Tony Hutter b1be0a5c15 ZTS: Fix zpool_expand_001_pos
`zpool_expand_001_pos` was often failing due to not seeing autoexpand
commands in the `zpool history`.  During testing, I found this to be
unreliable (sometimes the "online" wouldn't appear in `zpool history`)
and unnecessary, as we could simply check that the pool increased in
size.

This commit revamps the test to check for the expanded pool size
and corresponding new free space.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #13743
2022-09-13 17:58:03 -07:00
Tony Hutter 65f8f92d12 zed: Look for NVMe DEVPATH if no ID_BUS
We tried replacing an NVMe drive using autoreplace, only
to see zed reject it with:

zed[27955]: zed_udev_monitor: /dev/nvme5n1 no devid source

This happened because ZED saw that ID_BUS was not set by udev
for the NVMe drive, and thus didn't think it was a "real drive".
This commit allows NVMe drives to be autoreplaced even if
ID_BUS is not set.

Reviewed-by: Don Brady <don.brady@intel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #13512
Closes #13646
2022-09-13 17:51:11 -07:00
Tony Hutter acd7464639 zed: Ignore false 'atari' partitions in autoreplace
libudev will sometimes falsely identify an 'atari' partition on a
blank disk, preventing it from being used in an autoreplace.  This
seems to be a known issue.  The workaround is to just ignore the
fake partition and continue with the autoreplace.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #13497
Closes #13632
2022-09-13 17:51:06 -07:00
Tony Hutter f48d9b4269 rpm: Silence "unversioned Obsoletes" warnings on EL 9
Get rid of RPM warnings on AlmaLinux 9:

"It's not recommended to have unversioned Obsoletes"

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #13584
Closes #13638
2022-09-13 17:50:59 -07:00
Neal Gompa (ニール・ゴンパ) e1b49e3f1d rpm: Use the correct version-release information in dependencies
This tightly links the subpackages together and ensures that everything
is upgraded together.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Neal Gompa <ngompa@datto.com>
Closes #13489
2022-09-13 17:50:42 -07:00
Richard Yao 8131a96544 Fix use-after-free in btree code
Coverity static analysis found these.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #10989
Closes #13861
2022-09-13 16:15:38 -07:00
gregory-lee-bartholomew 979fd5a434 contrib: dracut: zfs-snapshot-bootfs: exit status fix
When the zfs-snapshot-bootfs service attempts to create a snapshot
that already exists, the exit status of the command is non-zero and
the service reports failed to the systemd service manager. This is a
common occurrence if bootfs.snapshot is left set on the kernel command
line and it should not be considered a failure.

This service was originally set to ignore this error by prefixing
the command with - on the ExecStart line, but the leading - appears
to have been dropped in #13359.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gregory Bartholomew <gregory.lee.bartholomew@gmail.com>
Closes #13769
2022-08-12 14:31:51 -07:00
r-ricci 533779f5f2 arcstat: fix -p option
When the -p option is used, a list of floats is passed to sep.join(),
which expects strings. Fix this by converting each value to a string.
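
A minimal sketch of the failure and the fix, assuming nothing about
arcstat's internals: `join_values` and the `;` separator are
illustrative names and values, not taken from the script.

```python
def join_values(values, sep=";"):
    """Join mixed numeric values into one delimited line."""
    # str.join() accepts only strings, so convert each value first.
    return sep.join(str(v) for v in values)


line = join_values([1.5, 2.0, 3])
```

Passing the floats to `sep.join()` directly raises `TypeError`, which
is the failure mode the commit describes.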

Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Roberto Ricci <ricci@disroot.org>
Closes #12916
Closes #13767
2022-08-12 14:29:24 -07:00
Paul Zuchowski db5fd16f0b Fix problem with zdb_objset_id test.
Use large numbers for datasets with
numeric names to avoid name and id
collisions.

Signed-off-by: Paul Zuchowski <pzuchowski@datto.com>
2022-08-09 11:46:12 -07:00
Coleman Kane e0dbab1a14 Linux 6.0 compat: register_shrinker() now var-arg
The 6.0 kernel added a printf-style var-arg for args > 0 to the
register_shrinker function, in order to add names to shrinkers, in
commit e33c267ab70de4249d22d7eab1cc7d68a889bac2. This enables the
shrinkers to have friendly names exposed in /sys/kernel/debug/shrinker/.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #13748
2022-08-09 09:41:06 -07:00
Brian Behlendorf 4063d7b6b4 Linux 5.20 compat: blk_cleanup_disk()
As of the Linux 5.20 kernel blk_cleanup_disk() has been removed,
all callers should use put_disk().

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13728
2022-08-09 09:41:06 -07:00
Brian Behlendorf 58571ba447 Linux 5.20 compat: bdevname()
As of the Linux 5.20 kernel bdevname() has been removed, all
callers should use snprintf() and the "%pg" format specifier.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13728
2022-08-09 09:41:06 -07:00
Brian Behlendorf 57e1052d33 Linux 5.19 compat: META
Update the META file to reflect compatibility with the 5.19 kernel.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13715
2022-08-09 09:41:06 -07:00
Paul Zuchowski fcbddc7f7c Fix problem with zdb -d
zdb -d <pool>/<objset ID> does not work when
other command line arguments are included i.e.
zdb -U <cachefile> -d <pool>/<objset ID>
This change fixes the command line parsing
to handle this situation.  Also fix issue
where zdb -r <dataset> <file> does not handle
the root <dataset> of the pool. Introduce -N
option to force <objset ID> to be interpreted
as a numeric objsetID.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Paul Zuchowski <pzuchowski@datto.com>
Closes #12845
Closes #12944
2022-08-08 16:56:38 -07:00
Tino Reichardt b06aff105c Fix checkstyle warning: E275 missing whitespace after keyword
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #13710
2022-08-02 10:05:14 -07:00
Rich Ercolani 035ee628cf Revert behavior of 59eab109 on not-Linux
It turns out that short-circuiting the EFAULT behavior on a short read
breaks things on FreeBSD. So until there's a nicer solution, let's
just revert the behavior for not-Linux.

Reference:
https://reviews.freebsd.org/R10:70f51f0e474ffe1fb74cb427423a2fba3637544d

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12698
2022-08-02 10:05:14 -07:00
Rich Ercolani 5c56591b57 Handle partial reads in zfs_read
Currently, dmu_read_uio_dnode can read 64K of a requested 1M in one
loop, get EFAULT back from zfs_uiomove() (because the iovec only holds
64k), and return EFAULT, which turns into EAGAIN on the way out. EAGAIN
gets interpreted as "I didn't read anything", the caller tries again
without consuming the 64k we already read, and we're stuck.

This apparently works on newer kernels because the caller which breaks
on older Linux kernels by happily passing along a 1M read request and a
64k iovec just requests 64k at a time.

With this, we now won't return EFAULT if we got a partial read.
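
The behavior can be modeled with a small loop.  This is a sketch under
stated assumptions, not the ZFS read path: `read_chunks` and its
parameters are hypothetical, and the destination is reduced to a byte
capacity.

```python
import errno


def read_chunks(requested, dest_capacity, chunk=64 * 1024):
    """Copy `requested` bytes in `chunk` steps into a bounded destination."""
    copied = 0
    while copied < requested:
        step = min(chunk, requested - copied)
        if copied + step > dest_capacity:
            # The destination is exhausted (the EFAULT case above).
            if copied > 0:
                # Partial read: report the progress instead of an error.
                return copied
            raise OSError(errno.EFAULT, "destination cannot hold any data")
        copied += step
    return copied


# A 1M request against a destination that only holds 64k.
got = read_chunks(1024 * 1024, 64 * 1024)
```

The caller sees a short read of 64k and can continue from there,
instead of retrying the same request forever.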

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12370
Closes #12509
Closes #12516
2022-08-02 10:05:14 -07:00
наб 17512aba0c module: lua: ldo: fix pragma name
/home/nabijaczleweli/store/code/zfs/module/lua/ldo.c:175:32: warning:
unknown option after ‘#pragma GCC diagnostic’ kind [-Wpragmas]
  175 | #pragma GCC diagnostic ignored "-Winfinite-recursion"a
      |                                ^~~~~~~~~~~~~~~~~~~~~~

Fixes: a6e8113fed ("Silence
-Winfinite-recursion warning in luaD_throw()")

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13348
2022-07-28 14:17:38 -07:00
Brian Behlendorf 98315be036 ZTS: Fix io_uring support check
Not all Linux distribution kernels enable io_uring support by
default.  Update the run time check to verify that the booted
kernel was built with CONFIG_IO_URING=y.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Co-authored-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13648
Closes #13685
2022-07-27 13:38:56 -07:00
Brian Behlendorf 69ad0bd769 Fix objtool: missing int3 after ret warning
Resolve straight-line speculation warnings reported by objtool
for x86_64 assembly on Linux when CONFIG_SLS is set.  See the
following LWN article for the complete details.

https://lwn.net/Articles/877845/

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13528
Closes #13575
2022-07-27 13:38:56 -07:00
Attila Fülöp b9d862f2db ICP: Add missing stack frame info to SHA asm files
Since the assembly routines calculating SHA checksums don't use
a standard stack layout, CFI directives are needed to unroll the
stack.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes #11733
2022-07-27 13:38:56 -07:00
Brian Behlendorf d2ff2196a5 Fix -Wformat-overflow warning in zfs_project_handle_dir()
Switch to using asprintf() to satisfy the compiler and resolve the
potential format-overflow warning.  Note that the conditional before
the sprintf() would have prevented the overflow regardless.

    cmd/zfs/zfs_project.c: In function ‘zfs_project_handle_dir’:
    cmd/zfs/zfs_project.c:241:38: error: ‘/’ directive writing
    1 byte into a region of size between 0 and 4352
    [-Werror=format-overflow=]
    cmd/zfs/zfs_project.c:241:17: note: ‘sprintf’ output between
    2 and 4609 bytes into a destination of size 4352

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13528
Closes #13575
2022-07-27 13:38:56 -07:00
Brian Behlendorf d483ef3744 Fix -Wformat-truncation warning in upgrade_set_callback()
Extend the buffer slightly to resolve the warning.

    cmd/zfs/zfs_main.c: In function ‘upgrade_set_callback’:
    cmd/zfs/zfs_main.c:2446:22: error: ‘%llu’ directive output
    may be truncated writing between 1 and 20 bytes into a
    region of size 16 [-Werror=format-truncation=]
    cmd/zfs/zfs_main.c:2445:24: note: ‘snprintf’ output between
    2 and 21 bytes into a destination of size 16
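
The 21-byte upper bound in the warning can be checked with a few lines
of arithmetic.  This is only a worked calculation; the variable names
are illustrative.

```python
UINT64_MAX = 2**64 - 1

# Widest possible "%llu" rendering of a 64-bit unsigned value.
max_digits = len(str(UINT64_MAX))

# A C string also needs one byte for the terminating NUL.
required_bytes = max_digits + 1
```

A 16-byte destination is therefore too small for the largest values,
which is what the compiler reports.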

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13528
Closes #13575
2022-07-27 13:38:56 -07:00
Brian Behlendorf 60f2cfd24f Fix -Wuse-after-free warning in dbuf_destroy()
Move the use of the db pointer to before it is freed.  It's only used as
a tag so a dereference would never occur, but there's no reason we
can't invert the order to resolve the warning.

    module/zfs/dbuf.c: In function 'dbuf_destroy':
    module/zfs/dbuf.c:2953:17: error:
    pointer 'db' may be used after 'free' [-Werror=use-after-free]

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13528
Closes #13575
2022-07-27 13:38:56 -07:00
Brian Behlendorf 6a81173026 Fix -Wuse-after-free warning in dbuf_issue_final_prefetch_done()
Move the use of the private pointer to before it is freed.  It's only
used as a tag so a dereference would never occur, but there's no
harm in inverting the order to resolve the warning.

    module/zfs/dbuf.c: In function 'dbuf_issue_final_prefetch_done':
    module/zfs/dbuf.c:3204:17: error:
    pointer 'private' may be used after 'free' [-Werror=use-after-free]

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13528
Closes #13575
2022-07-27 13:38:56 -07:00
Brian Behlendorf 087f5dedd5 Fix -Wattribute-warning in dsl layer
The memcpy(), memmove(), and memset() functions have been annotated
to perform bounds checking when using FORTIFY_SOURCE.  A warning is
now generated when writing beyond the end of the specified field.

Alternately, the new struct_group() macro could be used to create
an anonymous union member for use by memcpy().  However, since this
is the only place the macro would be helpful it's preferable to
restructure the code slightly to avoid the need for additional
compatibility code when the macro does not exist.

https://lore.kernel.org/lkml/20211118183807.1283332-1-keescook@chromium.org/T/

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13528
Closes #13575
2022-07-27 13:38:56 -07:00
Brian Behlendorf c771583f23 Fix -Wattribute-warning in edonr
The wrong union member was being accessed in EdonRInit resulting in
a write beyond size of field compiler warning.  Reference the correct
member to resolve the warning.  The warning was correct, but in this
case the mistake was harmless.

    In function ‘fortify_memcpy_chk’,
    inlined from ‘EdonRInit’ at zfs/module/icp/algs/edonr/edonr.c:494:3:
    ./include/linux/fortify-string.h:344:25: error: call to
    ‘__write_overflow_field’ declared with attribute warning:
    detected write beyond size of field (1st parameter);
    maybe use struct_group()? [-Werror=attribute-warning]

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13528
Closes #13575
2022-07-27 13:38:56 -07:00
Brian Behlendorf ef0e506f46 Fix -Wattribute-warning in zfs_log_xvattr()
Restructure the code in zfs_log_xvattr() to use a lr_attr_end
structure when accessing lr_attr_t elements located after the
variable sized array.  This makes the code more understandable
and resolves the accessing beyond the end of the field warnings.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13528
Closes #13575
2022-07-27 13:38:56 -07:00
Brian Behlendorf d7a8c573cf Silence -Winfinite-recursion warning in luaD_throw()
This code should be kept in line with the upstream lua version as much
as possible.  Therefore, we simply want to silence the warning.  This
check was enabled by default as part of -Wall in gcc 12.1.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13528
Closes #13575
2022-07-27 13:38:56 -07:00
наб 2d235d58f8 config: prune unused -Wno-bool-compare checks
Reviewed-by: Alejandro Colomar <alx.manpages@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13110
2022-07-27 13:38:56 -07:00
наб 37430e8211 libtpool: -Wno-clobbered
Also remove -Wno-unused-but-set-variable

Upstream-bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61118
Reviewed-by: Alejandro Colomar <alx.manpages@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13110
2022-07-27 13:38:56 -07:00
Tino Reichardt 4b0977027b Remove sha1 hashing from OpenZFS, it's not used anywhere.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Attila Fülöp <attila@fueloep.org>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12895
Closes #12902
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
2022-07-26 10:12:44 -07:00
Alexander Motin 15868d3ecb Fix scrub resume from newly created hole.
It may happen that scan bookmark points to a block that was turned
into a part of a big hole.  In such case dsl_scan_visitbp() may skip
it and dsl_scan_check_resume() will not be called for it.  As a result,
a new scan suspend won't be possible until the end of the object, which
may take hours if the object is a multi-terabyte ZVOL on a slow HDD
pool, stretching the TXG for all that time and creating all sorts of
problems.

This patch changes the resume condition to any greater or equal block,
so even if we miss the bookmarked block, the next one we find will
delete the bookmark, allowing new suspend.
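
The difference between the two resume conditions can be sketched with
a toy model.  Block positions are plain integers here, which is an
assumption made for illustration; real scan bookmarks have several
fields and their own comparison routine.  All names are hypothetical.

```python
def first_resume_point(bookmark, visited_blocks, matches):
    """Return the first visited block that clears the bookmark, if any."""
    for blk in visited_blocks:
        if matches(blk, bookmark):
            return blk
    return None


def exact(blk, bookmark):
    return blk == bookmark


def greater_or_equal(blk, bookmark):
    return blk >= bookmark


# Blocks 12 and 13 were turned into a hole, so the scan never visits them.
visited = [10, 11, 14, 15]
old_result = first_resume_point(12, visited, exact)
new_result = first_resume_point(12, visited, greater_or_equal)
```

With the exact match the bookmark at 12 is never cleared; with the
relaxed condition the next visited block clears it.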

Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
2022-07-26 10:10:37 -07:00
Alexander Motin bbb50e6129 Avoid memory copy when verifying raidz/draid parity
Before this change for every valid parity column raidz_parity_verify()
allocated new buffer and copied there existing data, then recalculated
the parity and compared the result with the copy.  This patch removes
the memory copy, simply swapping original buffer pointers with newly
allocated empty ones for parity recalculation and comparison. Original
buffers with potentially incorrect parity data are then just freed,
while new recalculated ones are used for repair.

On a pool of 12 4-wide raidz vdevs, storing 1.5TB of 16MB blocks, this
change reduces memory traffic during scrub by 17% and total unhalted
CPU time by 25%.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13613
2022-07-26 10:10:37 -07:00
Alexander Motin 03e33b2bb8 Avoid memory copies during mirror scrub
When issuing several scrub reads for a block, we may use the parent ZIO
buffer for one of the child ZIOs.  If that read completes successfully,
then we won't need to copy the data explicitly.  If block has only
one copy (typical for root vdev, which is also a mirror inside),
then we never need to copy -- succeed or fail as-is.  Previous
code also copied data from buffer of every successfully completed
child ZIO, but that just does not make any sense.

On healthy N-wide mirror this saves all N+1 (or even more in case
of ditto blocks) memory copies for each scrubbed block, allowing
CPU to focus mostly on check-summing.  For other vdev types it
should save one memory copy per block copy at root vdev.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13606
2022-07-26 10:10:37 -07:00
Alexander Motin 4b8f16072d Fix and disable blocks statistics during scrub
Block statistics calculation during scrub I/O issue in case of sorted
scrub accounted ditto blocks several times.  Embedded blocks, on the
other hand, were not accounted at all.  This change moves the accounting
from the issue stage to the scan stage, which fixes both problems and
also avoids pool-wide locking and the lock contention it created.

Since these statistics are quite specific and are not even exposed
anywhere now, disable their calculation by default to not waste CPU time.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13579
2022-07-26 10:10:37 -07:00
Alexander Motin 5e06805d8e Avoid two 64-bit divisions per scanned block
Change math to make it like the ARC, using multiplications instead.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13591
2022-07-26 10:10:37 -07:00
Alexander Motin dc91a6a660 Several B-tree optimizations
- Introduce first element offset within a leaf.  It allows to reduce
by ~50% average memmove() size when adding/removing elements.  If the
added/removed element is in the first half of the leaf, we may shift
elements before it and adjust the bth_first instead of moving more
elements after it.
 - Use memcpy() instead of memmove() when we know there is no overlap.
 - Switch from uint64_t to uint32_t.  It does not limit anything,
but 32-bit arches should appreciate it greatly in hot paths.
 - Store leaf capacity in struct btree to avoid 64-bit divisions.
 - Adjust zfs_btree_insert_into_leaf() to always result in balanced
leaves after splitting, no matter where the new element was inserted.
Not that we care about it much, but it should also allow B-trees with
as little as two elements per leaf instead of 4 previously.
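
The first item above (a first-element offset within a leaf) can be
sketched as follows.  This is a simplified model, not the zfs_btree
code: the `Leaf` class, its starting offset, and the move counter are
assumptions for illustration, with `first` playing the role of
bth_first.

```python
class Leaf:
    """Fixed-capacity leaf whose elements start at slot `first`."""

    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.first = capacity // 2  # assumption: start mid-array
        self.count = 0

    def items(self):
        return self.slots[self.first:self.first + self.count]

    def insert(self, idx, value):
        """Insert at logical index idx; return how many elements moved."""
        assert 0 <= idx <= self.count < len(self.slots)
        room_before = self.first > 0
        room_after = self.first + self.count < len(self.slots)
        moved_if_left = idx
        moved_if_right = self.count - idx
        if room_before and (moved_if_left <= moved_if_right or not room_after):
            # Shift the shorter prefix one slot left and lower `first`.
            src = self.first
            self.slots[src - 1:src - 1 + idx] = self.slots[src:src + idx]
            self.first -= 1
            moved = moved_if_left
        else:
            # Shift the suffix one slot right, as a plain array would.
            start = self.first + idx
            end = self.first + self.count
            self.slots[start + 1:end + 1] = self.slots[start:end]
            moved = moved_if_right
        self.slots[self.first + idx] = value
        self.count += 1
        return moved


leaf = Leaf(8)
for i, v in enumerate([10, 20, 30, 40]):
    leaf.insert(i, v)
moved = leaf.insert(1, 15)
```

Inserting near the front moves only the one element before the
insertion point, where a leaf without the offset would move the three
elements after it.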

When scrubbing pool of 12 SSDs, storing 1.5TB of 4KB zvol blocks this
reduces amount of time spent in memmove() inside the scan thread from
13.7% to 5.7% and total scrub time by ~15 seconds out of 9 minutes.
It should also reduce spacemaps load time, but I haven't measured it.

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13582
2022-07-26 10:10:37 -07:00
Alexander Motin a861aa2b9e Several sorted scrub optimizations
- Reduce size and comparison complexity of q_exts_by_size B-tree.
Previous code used two 64-bit divisions and many other operations to
compare two B-tree elements.  It created enormous overhead.  This
implementation moves the math to the upper level and stores the score
in the B-tree elements themselves.  Since all that we need to store in
that B-tree is the extent score and offset, those can fit into single
8 byte value instead of 24 bytes of q_exts_by_addr element and can be
compared with single operation.
 - Better decouple secondary tree logic from main range_tree by moving
rt_btree_ops and related functions into dsl_scan.c as ext_size_ops.
Those functions are very small to worry about the code duplication and
range_tree does not need to know details such as rt_btree_compare.
 - Instead of accounting number of pending bytes per pool, that needs
atomic on global variable per block, account the number of non-empty
per-vdev queues, that change much more rarely.
 - When extent scan is interrupted by TXG end, continue it in the next
TXG instead of selecting next best extent.  It allows to avoid leaving
one truncated (and so likely not the best any more) extent each TXG.
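
The first item above, storing score and offset in one 8-byte value, can
be sketched like this.  The bit split is an assumption chosen for
illustration and the helper names are hypothetical; the commit does
not state the actual layout.

```python
OFFSET_BITS = 48  # assumption: illustrative split of the 64 bits
SCORE_BITS = 64 - OFFSET_BITS


def pack(score, offset):
    """Combine score (high bits) and offset (low bits) into one integer."""
    assert 0 <= score < (1 << SCORE_BITS)
    assert 0 <= offset < (1 << OFFSET_BITS)
    return (score << OFFSET_BITS) | offset


def unpack(key):
    return key >> OFFSET_BITS, key & ((1 << OFFSET_BITS) - 1)


extents = [(7, 4096), (3, 8192), (7, 512), (3, 0)]
by_key = sorted(pack(score, offset) for score, offset in extents)
```

Because the score occupies the high bits, a single integer comparison
orders keys by score first and offset second.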

On top of some other optimizations this saves about 1.5 minutes out of
10 to scrub pool of 12 SSDs, storing 1.5TB of 4KB zvol blocks.

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tom Caputi <caputit1@tcnj.edu>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13576
2022-07-26 10:10:37 -07:00
Alexander Motin 881249de6f FreeBSD: Improve crypto_dispatch() handling
Handle crypto_dispatch() return values same as crp->crp_etype errors.
On FreeBSD 12 many drivers returned same errors both ways, and lack
of proper handling for the first ended up in assertion panic later.
It was changed in FreeBSD 13, but there is no reason to not be safe.

While there, skip waiting for completion, including locking and
wakeup() call, for sessions on synchronous crypto drivers, such as
typical aesni and software.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13563
2022-07-26 10:10:37 -07:00
Alexander Motin 916d9de158 Reduce ZIO io_lock contention on sorted scrub
During sorted scrub multiple threads (one per vdev) are issuing many
ZIOs same time, all using the same scn->scn_zio_root ZIO as parent.
It causes huge lock contention on the single global lock on that ZIO.
Improve it by introducing per-queue null ZIOs, children to that one,
and using them instead as proxy.

For 12 SSD pool storing 1.5TB of 4KB blocks on 80-core system this
dramatically reduces lock contention and reduces scrub time from 21
minutes down to 12.5, while actual read stages (not scan) are about
3x faster, reaching 100K blocks per second per vdev.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13553
2022-07-26 10:10:37 -07:00
Alexander Motin 813e15f28c AVL: Remove obsolete branching optimizations
Modern Clang and GCC can successfully implement simple conditions
without branching with math and flag operations.  Use of arrays for
translation no longer helps as much as it was 14+ years ago.

Disassembly of the code generated by Clang 13.0.0 on FreeBSD 13.1,
Clang 14.0.4 on FreeBSD 14 and GCC 10.2.1 on Debian 11 with this
change still shows no branching instructions.

Profiling of CPU-bound scan stage of sorted scrub shows reproducible
reduction of time spent inside avl_find() from 6.52% to 4.58%.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13540
2022-07-26 10:10:37 -07:00
Alexander Motin 884364ea85 More speculative prefetcher improvements
- Make prefetch distance adaptive: up to 4MB prefetch doubles for
every hit, same as before, but after that it grows by 1/8 every time
the prefetch read does not complete in time to satisfy the demand.
My tests show that 4MB is sufficient for wide NVMe pool to saturate
single reader thread at 2.5GB/s, while new 64MB maximum allows the
same thread to reach 1.5GB/s on wide HDD pool.  Further distance
increase may increase speed even more, but less dramatic and with
higher latency.

 - Allow early reuse of inactive prefetch streams: streams that never
saw hits can be reused immediately if there is a demand, while others
can be reused after 1s of inactivity, starting with the oldest.  After
2s of inactivity streams are deleted to free resources same as before.
This increases strided read performance on HDD pools several times over
in the presence of simultaneous random reads, which previously filled
the zfetch_max_streams limit for seconds, blocking most prefetches.

 - Always issue intermediate indirect block reads with SYNC priority.
Each of those reads if delayed for longer may delay up to 1024 other
block prefetches, that may be not good for wide pools.
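
The adaptive distance rule from the first item can be sketched as
below.  It is a model of the description only: `next_distance` and its
parameter names are hypothetical, and the 4MB and 64MB limits are the
values quoted above.

```python
MIB = 1 << 20


def next_distance(distance, hit, late,
                  doubling_limit=4 * MIB, max_distance=64 * MIB):
    """Return the new prefetch distance after one demand read."""
    if distance < doubling_limit:
        # Below the first limit: double on every hit, as before.
        if hit:
            distance = min(distance * 2, doubling_limit)
    elif late:
        # Past it: grow by 1/8 only when the prefetch was too slow.
        distance = min(distance + distance // 8, max_distance)
    return distance


d = next_distance(4 * MIB, hit=True, late=True)
```

A stream whose prefetches always complete in time stays at 4MB, while
one that keeps waiting on the disks creeps up toward the 64MB cap.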

Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13452
2022-07-26 10:10:37 -07:00
Alexander Motin 6e1e90d64c Improve mg_aliquot math
When calculating mg_aliquot alike to #12046 use number of unique data
disks in the vdev, not the total number of children vdev.  Increase
default value of the tunable from 512KB to 1MB to compensate.

Before this change each disk in striped pool was getting 512KB of
sequential data, in 2-wide mirror -- 1MB, in 3-wide RAIDZ1 -- 768KB.
After this change in all the cases each disk should get 1MB.
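
The figures above can be reproduced with simple arithmetic.  The
helper names are hypothetical, and treating the per-disk amount as the
group's aliquot divided across its data disks is an interpretation
that happens to match the quoted numbers.

```python
KIB = 1024


def per_disk_old(children, data_disks, aliquot=512 * KIB):
    """Old rule: the aliquot scales with the total number of children."""
    return aliquot * children // data_disks


def per_disk_new(data_disks, aliquot=1024 * KIB):
    """New rule: the aliquot scales with the unique data disks."""
    return aliquot * data_disks // data_disks


# layout name -> (children, unique data disks)
layouts = {"stripe": (1, 1), "2-wide mirror": (2, 1), "3-wide RAIDZ1": (3, 2)}
old = {name: per_disk_old(c, d) for name, (c, d) in layouts.items()}
new = {name: per_disk_new(d) for name, (c, d) in layouts.items()}
```

Under the new rule every layout ends up with the same 1MB of
sequential data per disk.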

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13388
2022-07-26 10:10:37 -07:00
Alexander Motin dd9c110ab5 Improve log spacemap load time
Previous flushing algorithm limited only total number of log blocks to
the minimum of 256K and 4x number of metaslabs in the pool.  As a result,
a system with 1500 disks with 1000 metaslabs each, touching several new
metaslabs each TXG, could grow the spacemap log to a huge size without
much benefit.  We've observed one such system importing a pool for
about 45 minutes.

This patch improves the situation from five sides:
 - By limiting maximum period for each metaslab to be flushed to 1000
TXGs, that effectively limits maximum number of per-TXG spacemap logs
to load to the same number.
 - By making flushing more smooth via accounting number of metaslabs
that were touched after the last flush and actually need another flush,
not just ms_unflushed_txg bump.
 - By applying zfs_unflushed_log_block_pct to the number of metaslabs
that were touched after the last flush, not all metaslabs in the pool.
 - By aggressively prefetching per-TXG spacemap logs up to 16 TXGs in
advance, making log spacemap load process for wide HDD pool CPU-bound,
accelerating it by many times.
 - By reducing zfs_unflushed_log_block_max from 256K to 128K, reducing
single-threaded by nature log processing time from ~10 to ~5 minutes.
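
The change in the block limit can be sketched as follows.  This is a
rough model under assumptions: the function names are hypothetical, the
4x factor is assumed to carry over unchanged, and the count of 5000
touched metaslabs is an invented input for the example.

```python
K = 1024


def old_block_limit(metaslabs_in_pool, block_max=256 * K, factor=4):
    """Old rule: scale with every metaslab in the pool."""
    return min(block_max, factor * metaslabs_in_pool)


def new_block_limit(metaslabs_touched, block_max=128 * K, factor=4):
    """New rule: scale only with metaslabs touched since their last flush."""
    return min(block_max, factor * metaslabs_touched)


pool_metaslabs = 1500 * 1000  # the 1500-disk system described above
old_limit = old_block_limit(pool_metaslabs)
new_limit = new_block_limit(5000)  # illustrative touched count
```

On the large pool the old rule always hits the 256K ceiling, while the
new rule follows the amount of recent activity.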

As further optimization we could skip bumping ms_unflushed_txg for
metaslabs not touched since the last flush, but that would be an
incompatible change, requiring new pool feature.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #12789
2022-07-26 10:10:37 -07:00
Alexander Motin fdb80a2301 Add more control/visibility to spa_load_verify().
Use error thresholds from policy to control whether to scrub data
and/or metadata.  If threshold is set to UINT64_MAX, then caller
probably does not care about result and we may skip that part.

By default import neither sets the data error threshold nor reads
the error counter, so skip the data scrub for faster import.
Metadata is still scrubbed, and verification fails if even a single
error is found.

While there just for symmetry return number of metadata errors in
case threshold is not set to zero and we haven't reached it.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #13022
2022-07-26 10:10:37 -07:00
Allan Jude 72a4709a59 spa.c: Replace VERIFY(nvlist_*(...) == 0) with fnvlist_* (#12678)
The fnvlist versions of the functions are fatal if they fail,
saving each call from having to include checking the result.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Signed-off-by: Allan Jude <allan@klarasystems.com>
2022-07-26 10:10:37 -07:00
Alexander Motin 415882d228 Avoid small buffer copying on write
It is wrong for arc_write_ready() to use zfs_abd_scatter_enabled to
decide whether to reallocate/copy the buffer, because the answer is
OS-specific and depends on the buffer size.  Instead of that use
abd_size_alloc_linear(), moved into public header.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12425
2022-07-26 10:10:37 -07:00
Alexander Motin 5b860ae1fb Remove refcount from spa_config_*()
The only reason for spa_config_*() to use refcount instead of simple
non-atomic (thanks to scl_lock) variable for scl_count is tracking,
hard disabled for the last 8 years.  Switching to a simple int scl_count
reduces the lock hold time by avoiding atomics, plus makes the structure
fit into a single cache line, reducing lock contention.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #12287
2022-07-26 10:10:37 -07:00
Brian Behlendorf 3920d7f325 Scrub mirror children without BPs
When scrubbing a raidz/draid pool, which contains a replacing or
sparing mirror with multiple online children, only one child will
be read.  This is not normally a serious concern because the DTL
records are used to determine where a good copy of the data is.
As long as the data can be read from one child the mirror vdev
will use it to repair gaps in any of its children.  Furthermore,
even if the data which was read is corrupt the raidz code will
detect this and issue its own repair I/O to correct the damage
in the mirror vdev.

However, in the scenario where the DTL is wrong due to silent
data corruption (say due to overwriting one child) and the scrub
happens to read from a child with good data, then the other damaged
mirror child will not be detected nor repaired.

While this is possible for both raidz and draid vdevs, it's most
pronounced when using draid.  This is because by default the zed
will sequentially rebuild a draid pool to a distributed spare,
and the distributed spare half of the mirror is always preferred
since it delivers better performance.  This means the damaged
half of the mirror will go undetected even after scrubbing.

For system administrators this behavior is non-intuitive and in
a worst case scenario could result in the only good copy of the
data being unknowingly detached from the mirror.

This change resolves the issue by reading all replacing/sparing
mirror children when scrubbing.  When the BP isn't available for
verification, then compare the data buffers from each child.  They
must all be identical; if not, there's silent damage and an error
is returned to prompt the top-level vdev to issue a repair I/O to
rewrite the data on all of the mirror children.  Since we can't
tell which child was wrong, a checksum error is logged against the
replacing or sparing mirror vdev.

Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13555
2022-07-14 10:21:29 -07:00
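The buffer comparison described in the "Scrub mirror children without BPs" commit can be pictured with a small standalone C sketch. This is not the vdev_mirror code: the function name `children_match` and the flat array-of-buffers layout are assumptions made purely for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns true when every child buffer holds the
 * same len bytes as the first one. Without a block pointer checksum to
 * verify against, a mismatch only says that some child is damaged, not
 * which one.
 */
static bool
children_match(const void *const bufs[], size_t nchildren, size_t len)
{
	for (size_t c = 1; c < nchildren; c++) {
		if (memcmp(bufs[0], bufs[c], len) != 0)
			return (false);
	}
	return (true);
}
```

In the commit's terms, a false result corresponds to returning an error so that the top-level vdev issues the repair I/O.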
276 changed files with 4446 additions and 5849 deletions
+2 -2
@@ -8,7 +8,7 @@ jobs:
checkstyle:
runs-on: ubuntu-20.04
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v3
with:
ref: ${{ github.event.pull_request.head.sha }}
- name: Install dependencies
@@ -43,7 +43,7 @@ jobs:
if: failure() && steps.CheckABI.outcome == 'failure'
run: |
find -name *.abi | tar -cf abi_files.tar -T -
- uses: actions/upload-artifact@v2
- uses: actions/upload-artifact@v3
if: failure() && steps.CheckABI.outcome == 'failure'
with:
name: New ABI files (use only if you're sure about interface changes)
+3 -3
@@ -9,10 +9,10 @@ jobs:
strategy:
fail-fast: false
matrix:
os: [18.04, 20.04]
os: [20.04]
runs-on: ubuntu-${{ matrix.os }}
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v3
with:
ref: ${{ github.event.pull_request.head.sha }}
- name: Install dependencies
@@ -75,7 +75,7 @@ jobs:
sudo chmod +r $RESULTS_PATH/*
# Replace ':' in dir names, actions/upload-artifact doesn't support it
for f in $(find /var/tmp/test_results -name '*:*'); do mv "$f" "${f//:/__}"; done
- uses: actions/upload-artifact@v2
- uses: actions/upload-artifact@v3
if: failure()
with:
name: Test logs Ubuntu-${{ matrix.os }}
+3 -3
@@ -6,9 +6,9 @@ on:
jobs:
tests:
runs-on: ubuntu-latest
runs-on: ubuntu-20.04
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v3
with:
ref: ${{ github.event.pull_request.head.sha }}
- name: Install dependencies
@@ -71,7 +71,7 @@ jobs:
sudo chmod +r $RESULTS_PATH/*
# Replace ':' in dir names, actions/upload-artifact doesn't support it
for f in $(find /var/tmp/test_results -name '*:*'); do mv "$f" "${f//:/__}"; done
- uses: actions/upload-artifact@v2
- uses: actions/upload-artifact@v3
if: failure()
with:
name: Test logs
+4 -4
@@ -6,11 +6,11 @@ on:
jobs:
tests:
runs-on: ubuntu-latest
runs-on: ubuntu-20.04
env:
TEST_DIR: /var/tmp/zloop
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v3
with:
ref: ${{ github.event.pull_request.head.sha }}
- name: Install dependencies
@@ -50,7 +50,7 @@ jobs:
if: failure()
run: |
sudo chmod +r -R $TEST_DIR/
- uses: actions/upload-artifact@v2
- uses: actions/upload-artifact@v3
if: failure()
with:
name: Logs
@@ -58,7 +58,7 @@ jobs:
/var/tmp/zloop/*/
!/var/tmp/zloop/*/vdev/
if-no-files-found: ignore
- uses: actions/upload-artifact@v2
- uses: actions/upload-artifact@v3
if: failure()
with:
name: Pool files
+1 -1
@@ -1,2 +1,2 @@
The [OpenZFS Code of Conduct](http://www.open-zfs.org/wiki/Code_of_Conduct)
The [OpenZFS Code of Conduct](https://openzfs.org/wiki/Code_of_Conduct)
applies to spaces associated with the OpenZFS project, including GitHub.
+2 -2
@@ -1,10 +1,10 @@
Meta: 1
Name: zfs
Branch: 1.0
Version: 2.1.5
Version: 2.1.9
Release: 1
Release-Tags: relext
License: CDDL
Author: OpenZFS
Linux-Maximum: 5.18
Linux-Maximum: 6.1
Linux-Minimum: 3.10
+6 -1
@@ -114,6 +114,11 @@ commitcheck:
${top_srcdir}/scripts/commitcheck.sh; \
fi
if HAVE_PARALLEL
cstyle_line = -print0 | parallel -X0 ${top_srcdir}/scripts/cstyle.pl -cpP {}
else
cstyle_line = -exec ${top_srcdir}/scripts/cstyle.pl -cpP {} +
endif
PHONY += cstyle
cstyle:
@find ${top_srcdir} -name build -prune \
@@ -122,7 +127,7 @@ cstyle:
! -name 'opt_global.h' ! -name '*_if*.h' \
! -name 'zstd_compat_wrapper.h' \
! -path './module/zstd/lib/*' \
-exec ${top_srcdir}/scripts/cstyle.pl -cpP {} \+
$(cstyle_line)
filter_executable = -exec test -x '{}' \; -print
+4 -4
@@ -686,9 +686,9 @@ def section_archits(kstats_dict):
print()
print('Cache hits by data type:')
dt_todo = (('Demand data:', arc_stats['demand_data_hits']),
('Demand prefetch data:', arc_stats['prefetch_data_hits']),
('Prefetch data:', arc_stats['prefetch_data_hits']),
('Demand metadata:', arc_stats['demand_metadata_hits']),
('Demand prefetch metadata:',
('Prefetch metadata:',
arc_stats['prefetch_metadata_hits']))
for title, value in dt_todo:
@@ -697,10 +697,10 @@ def section_archits(kstats_dict):
print()
print('Cache misses by data type:')
dm_todo = (('Demand data:', arc_stats['demand_data_misses']),
('Demand prefetch data:',
('Prefetch data:',
arc_stats['prefetch_data_misses']),
('Demand metadata:', arc_stats['demand_metadata_misses']),
('Demand prefetch metadata:',
('Prefetch metadata:',
arc_stats['prefetch_metadata_misses']))
for title, value in dm_todo:
+1 -1
@@ -271,7 +271,7 @@ def print_values():
if pretty_print:
fmt = lambda col: prettynum(cols[col][0], cols[col][1], v[col])
else:
fmt = lambda col: v[col]
fmt = lambda col: str(v[col])
sys.stdout.write(sep.join(fmt(col) for col in hdr))
sys.stdout.write("\n")
+84 -31
@@ -112,7 +112,7 @@ extern int zfs_vdev_async_read_max_active;
extern boolean_t spa_load_verify_dryrun;
extern boolean_t spa_mode_readable_spacemaps;
extern int zfs_reconstruct_indirect_combinations_max;
extern int zfs_btree_verify_intensity;
extern uint_t zfs_btree_verify_intensity;
static const char cmdname[] = "zdb";
uint8_t dump_opt[256];
@@ -2995,7 +2995,7 @@ open_objset(const char *path, void *tag, objset_t **osp)
}
sa_os = *osp;
return (0);
return (err);
}
static void
@@ -8272,6 +8272,23 @@ zdb_embedded_block(char *thing)
free(buf);
}
/* check for valid hex or decimal numeric string */
static boolean_t
zdb_numeric(char *str)
{
int i = 0;
if (strlen(str) == 0)
return (B_FALSE);
if (strncmp(str, "0x", 2) == 0 || strncmp(str, "0X", 2) == 0)
i = 2;
for (; i < strlen(str); i++) {
if (!isxdigit(str[i]))
return (B_FALSE);
}
return (B_TRUE);
}
int
main(int argc, char **argv)
{
@@ -8317,7 +8334,7 @@ main(int argc, char **argv)
zfs_btree_verify_intensity = 3;
while ((c = getopt(argc, argv,
"AbcCdDeEFGhiI:klLmMo:Op:PqrRsSt:uU:vVx:XYyZ")) != -1) {
"AbcCdDeEFGhiI:klLmMNo:Op:PqrRsSt:uU:vVx:XYyZ")) != -1) {
switch (c) {
case 'b':
case 'c':
@@ -8331,6 +8348,7 @@ main(int argc, char **argv)
case 'l':
case 'm':
case 'M':
case 'N':
case 'O':
case 'r':
case 'R':
@@ -8422,31 +8440,6 @@ main(int argc, char **argv)
(void) fprintf(stderr, "-p option requires use of -e\n");
usage();
}
if (dump_opt['d'] || dump_opt['r']) {
/* <pool>[/<dataset | objset id> is accepted */
if (argv[2] && (objset_str = strchr(argv[2], '/')) != NULL &&
objset_str++ != NULL) {
char *endptr;
errno = 0;
objset_id = strtoull(objset_str, &endptr, 0);
/* dataset 0 is the same as opening the pool */
if (errno == 0 && endptr != objset_str &&
objset_id != 0) {
target_is_spa = B_FALSE;
dataset_lookup = B_TRUE;
} else if (objset_id != 0) {
printf("failed to open objset %s "
"%llu %s", objset_str,
(u_longlong_t)objset_id,
strerror(errno));
exit(1);
}
/* normal dataset name not an objset ID */
if (endptr == objset_str) {
objset_id = -1;
}
}
}
#if defined(_LP64)
/*
@@ -8486,7 +8479,7 @@ main(int argc, char **argv)
verbose = MAX(verbose, 1);
for (c = 0; c < 256; c++) {
if (dump_all && strchr("AeEFklLOPrRSXy", c) == NULL)
if (dump_all && strchr("AeEFklLNOPrRSXy", c) == NULL)
dump_opt[c] = 1;
if (dump_opt[c])
dump_opt[c] += verbose;
@@ -8525,6 +8518,7 @@ main(int argc, char **argv)
return (dump_path(argv[0], argv[1], NULL));
}
if (dump_opt['r']) {
target_is_spa = B_FALSE;
if (argc != 3)
usage();
dump_opt['v'] = verbose;
@@ -8535,6 +8529,10 @@ main(int argc, char **argv)
rewind = ZPOOL_DO_REWIND |
(dump_opt['X'] ? ZPOOL_EXTREME_REWIND : 0);
/* -N implies -d */
if (dump_opt['N'] && dump_opt['d'] == 0)
dump_opt['d'] = dump_opt['N'];
if (nvlist_alloc(&policy, NV_UNIQUE_NAME_TYPE, 0) != 0 ||
nvlist_add_uint64(policy, ZPOOL_LOAD_REQUEST_TXG, max_txg) != 0 ||
nvlist_add_uint32(policy, ZPOOL_LOAD_REWIND_POLICY, rewind) != 0)
@@ -8553,6 +8551,34 @@ main(int argc, char **argv)
targetlen = strlen(target);
if (targetlen && target[targetlen - 1] == '/')
target[targetlen - 1] = '\0';
/*
* See if an objset ID was supplied (-d <pool>/<objset ID>).
* To disambiguate tank/100, consider the 100 as objsetID
* if -N was given, otherwise 100 is an objsetID iff
* tank/100 as a named dataset fails on lookup.
*/
objset_str = strchr(target, '/');
if (objset_str && strlen(objset_str) > 1 &&
zdb_numeric(objset_str + 1)) {
char *endptr;
errno = 0;
objset_str++;
objset_id = strtoull(objset_str, &endptr, 0);
/* dataset 0 is the same as opening the pool */
if (errno == 0 && endptr != objset_str &&
objset_id != 0) {
if (dump_opt['N'])
dataset_lookup = B_TRUE;
}
/* normal dataset name not an objset ID */
if (endptr == objset_str) {
objset_id = -1;
}
} else if (objset_str && !zdb_numeric(objset_str + 1) &&
dump_opt['N']) {
printf("Supply a numeric objset ID with -N\n");
exit(1);
}
} else {
target_pool = target;
}
@@ -8670,13 +8696,27 @@ main(int argc, char **argv)
}
return (error);
} else {
target_pool = strdup(target);
if (strpbrk(target, "/@") != NULL)
*strpbrk(target_pool, "/@") = '\0';
zdb_set_skip_mmp(target);
/*
* If -N was supplied, the user has indicated that
* zdb -d <pool>/<objsetID> is in effect. Otherwise
* we first assume that the dataset string is the
* dataset name. If dmu_objset_hold fails with the
* dataset string, and we have an objset_id, retry the
* lookup with the objsetID.
*/
boolean_t retry = B_TRUE;
retry_lookup:
if (dataset_lookup == B_TRUE) {
/*
* Use the supplied id to get the name
* for open_objset.
*/
error = spa_open(target, &spa, FTAG);
error = spa_open(target_pool, &spa, FTAG);
if (error == 0) {
error = name_from_objset_id(spa,
objset_id, dsname);
@@ -8685,10 +8725,23 @@ main(int argc, char **argv)
target = dsname;
}
}
if (error == 0)
if (error == 0) {
if (objset_id > 0 && retry) {
int err = dmu_objset_hold(target, FTAG,
&os);
if (err) {
dataset_lookup = B_TRUE;
retry = B_FALSE;
goto retry_lookup;
} else {
dmu_objset_rele(os, FTAG);
}
}
error = open_objset(target, FTAG, &os);
}
if (error == 0)
spa = dmu_objset_spa(os);
free(target_pool);
}
}
nvlist_free(policy);
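The numeric check that zdb_numeric() adds in the diff above can be tried in isolation. The sketch below mirrors its logic under a hypothetical name, `looks_numeric`; the `unsigned char` cast before isxdigit() is an addition of this sketch, not something the diff contains.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the zdb_numeric() check: accept a non-empty
 * string made only of hex digits, with an optional 0x/0X prefix.
 */
static bool
looks_numeric(const char *str)
{
	size_t len = strlen(str);
	size_t i = 0;

	if (len == 0)
		return (false);
	if (strncmp(str, "0x", 2) == 0 || strncmp(str, "0X", 2) == 0)
		i = 2;
	for (; i < len; i++) {
		/* Cast avoids undefined behavior for negative char values. */
		if (!isxdigit((unsigned char)str[i]))
			return (false);
	}
	return (true);
}
```

Because hex digits are accepted even without a prefix, a dataset component such as `abc` also passes this test, as does a bare `0x`. That is consistent with the diff still inspecting the strtoull() end pointer afterwards and treating an unparsed string as a normal dataset name.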
+1
@@ -599,6 +599,7 @@ fmd_timer_install(fmd_hdl_t *hdl, void *arg, fmd_event_t *ep, hrtime_t delta)
sev.sigev_notify_function = _timer_notify;
sev.sigev_notify_attributes = NULL;
sev.sigev_value.sival_ptr = ftp;
sev.sigev_signo = 0;
timer_create(CLOCK_REALTIME, &sev, &ftp->ft_tid);
timer_settime(ftp->ft_tid, 0, &its, NULL);
+170 -21
@@ -525,6 +525,7 @@ typedef struct dev_data {
boolean_t dd_islabeled;
uint64_t dd_pool_guid;
uint64_t dd_vdev_guid;
uint64_t dd_new_vdev_guid;
const char *dd_new_devid;
} dev_data_t;
@@ -535,6 +536,7 @@ zfs_iter_vdev(zpool_handle_t *zhp, nvlist_t *nvl, void *data)
char *path = NULL;
uint_t c, children;
nvlist_t **child;
uint64_t guid = 0;
/*
* First iterate over any children.
@@ -562,17 +564,14 @@ zfs_iter_vdev(zpool_handle_t *zhp, nvlist_t *nvl, void *data)
/* once a vdev was matched and processed there is nothing left to do */
if (dp->dd_found)
return;
(void) nvlist_lookup_uint64(nvl, ZPOOL_CONFIG_GUID, &guid);
/*
* Match by GUID if available otherwise fallback to devid or physical
*/
if (dp->dd_vdev_guid != 0) {
uint64_t guid;
if (nvlist_lookup_uint64(nvl, ZPOOL_CONFIG_GUID,
&guid) != 0 || guid != dp->dd_vdev_guid) {
if (guid != dp->dd_vdev_guid)
return;
}
zed_log_msg(LOG_INFO, " zfs_iter_vdev: matched on %llu", guid);
dp->dd_found = B_TRUE;
@@ -582,6 +581,12 @@ zfs_iter_vdev(zpool_handle_t *zhp, nvlist_t *nvl, void *data)
* illumos, substring matching is not required to accommodate
* the partition suffix. An exact match will be present in
* the dp->dd_compare value.
* If the attached disk already contains a vdev GUID, it means
* the disk is not clean. In such a scenario, the physical path
* would be a match that makes the disk faulted when trying to
* online it. So, we would only want to proceed if either GUID
* matches with the last attached disk or the disk is in clean
* state.
*/
if (nvlist_lookup_string(nvl, dp->dd_prop, &path) != 0 ||
strcmp(dp->dd_compare, path) != 0) {
@@ -589,6 +594,12 @@ zfs_iter_vdev(zpool_handle_t *zhp, nvlist_t *nvl, void *data)
__func__, dp->dd_compare, path);
return;
}
if (dp->dd_new_vdev_guid != 0 && dp->dd_new_vdev_guid != guid) {
zed_log_msg(LOG_INFO, " %s: no match (GUID:%llu"
" != vdev GUID:%llu)", __func__,
dp->dd_new_vdev_guid, guid);
return;
}
zed_log_msg(LOG_INFO, " zfs_iter_vdev: matched %s on %s",
dp->dd_prop, path);
@@ -670,7 +681,7 @@ zfs_iter_pool(zpool_handle_t *zhp, void *data)
*/
static boolean_t
devphys_iter(const char *physical, const char *devid, zfs_process_func_t func,
boolean_t is_slice)
boolean_t is_slice, uint64_t new_vdev_guid)
{
dev_data_t data = { 0 };
@@ -680,6 +691,7 @@ devphys_iter(const char *physical, const char *devid, zfs_process_func_t func,
data.dd_found = B_FALSE;
data.dd_islabeled = is_slice;
data.dd_new_devid = devid; /* used by auto replace code */
data.dd_new_vdev_guid = new_vdev_guid;
(void) zpool_iter(g_zfshdl, zfs_iter_pool, &data);
@@ -848,7 +860,7 @@ zfs_deliver_add(nvlist_t *nvl, boolean_t is_lofi)
if (devid_iter(devid, zfs_process_add, is_slice))
return (0);
if (devpath != NULL && devphys_iter(devpath, devid, zfs_process_add,
is_slice))
is_slice, vdev_guid))
return (0);
if (vdev_guid != 0)
(void) guid_iter(pool_guid, vdev_guid, devid, zfs_process_add,
@@ -894,21 +906,96 @@ zfs_deliver_check(nvlist_t *nvl)
return (0);
}
/*
* Given a path to a vdev, lookup the vdev's physical size from its
* config nvlist.
*
* Returns the vdev's physical size in bytes on success, 0 on error.
*/
static uint64_t
vdev_size_from_config(zpool_handle_t *zhp, const char *vdev_path)
{
nvlist_t *nvl = NULL;
boolean_t avail_spare, l2cache, log;
vdev_stat_t *vs = NULL;
uint_t c;
nvl = zpool_find_vdev(zhp, vdev_path, &avail_spare, &l2cache, &log);
if (!nvl)
return (0);
verify(nvlist_lookup_uint64_array(nvl, ZPOOL_CONFIG_VDEV_STATS,
(uint64_t **)&vs, &c) == 0);
if (!vs) {
zed_log_msg(LOG_INFO, "%s: no nvlist for '%s'", __func__,
vdev_path);
return (0);
}
return (vs->vs_pspace);
}
/*
* Given a path to a vdev, lookup if the vdev is a "whole disk" in the
* config nvlist. "whole disk" means that ZFS was passed a whole disk
* at pool creation time, which it partitioned up and has full control over.
* Thus a partition with wholedisk=1 set tells us that zfs created the
* partition at creation time. A partition without whole disk set would have
* been created externally (like with fdisk) and passed to ZFS.
*
* Returns the whole disk value (either 0 or 1).
*/
static uint64_t
vdev_whole_disk_from_config(zpool_handle_t *zhp, const char *vdev_path)
{
nvlist_t *nvl = NULL;
boolean_t avail_spare, l2cache, log;
uint64_t wholedisk = 0;
nvl = zpool_find_vdev(zhp, vdev_path, &avail_spare, &l2cache, &log);
if (!nvl)
return (0);
(void) nvlist_lookup_uint64(nvl, ZPOOL_CONFIG_WHOLE_DISK, &wholedisk);
return (wholedisk);
}
/*
* If the device size grew more than 1% then return true.
*/
#define DEVICE_GREW(oldsize, newsize) \
((newsize > oldsize) && \
((newsize / (newsize - oldsize)) <= 100))
static int
zfsdle_vdev_online(zpool_handle_t *zhp, void *data)
{
char *devname = data;
boolean_t avail_spare, l2cache;
nvlist_t *udev_nvl = data;
nvlist_t *tgt;
int error;
char *tmp_devname, devname[MAXPATHLEN] = "";
uint64_t guid;
if (nvlist_lookup_uint64(udev_nvl, ZFS_EV_VDEV_GUID, &guid) == 0) {
sprintf(devname, "%llu", (u_longlong_t)guid);
} else if (nvlist_lookup_string(udev_nvl, DEV_PHYS_PATH,
&tmp_devname) == 0) {
strlcpy(devname, tmp_devname, MAXPATHLEN);
zfs_append_partition(devname, MAXPATHLEN);
} else {
zed_log_msg(LOG_INFO, "%s: no guid or physpath", __func__);
}
zed_log_msg(LOG_INFO, "zfsdle_vdev_online: searching for '%s' in '%s'",
devname, zpool_get_name(zhp));
if ((tgt = zpool_find_vdev_by_physpath(zhp, devname,
&avail_spare, &l2cache, NULL)) != NULL) {
char *path, fullpath[MAXPATHLEN];
uint64_t wholedisk;
uint64_t wholedisk = 0;
error = nvlist_lookup_string(tgt, ZPOOL_CONFIG_PATH, &path);
if (error) {
@@ -916,10 +1003,8 @@ zfsdle_vdev_online(zpool_handle_t *zhp, void *data)
return (0);
}
error = nvlist_lookup_uint64(tgt, ZPOOL_CONFIG_WHOLE_DISK,
(void) nvlist_lookup_uint64(tgt, ZPOOL_CONFIG_WHOLE_DISK,
&wholedisk);
if (error)
wholedisk = 0;
if (wholedisk) {
path = strrchr(path, '/');
@@ -953,12 +1038,75 @@ zfsdle_vdev_online(zpool_handle_t *zhp, void *data)
vdev_state_t newstate;
if (zpool_get_state(zhp) != POOL_STATE_UNAVAIL) {
error = zpool_vdev_online(zhp, fullpath, 0,
&newstate);
zed_log_msg(LOG_INFO, "zfsdle_vdev_online: "
"setting device '%s' to ONLINE state "
"in pool '%s': %d", fullpath,
zpool_get_name(zhp), error);
/*
* If this disk size has not changed, then
* there's no need to do an autoexpand. To
* check we look at the disk's size in its
* config, and compare it to the disk size
* that udev is reporting.
*/
uint64_t udev_size = 0, conf_size = 0,
wholedisk = 0, udev_parent_size = 0;
/*
* Get the size of our disk that udev is
* reporting.
*/
if (nvlist_lookup_uint64(udev_nvl, DEV_SIZE,
&udev_size) != 0) {
udev_size = 0;
}
/*
* Get the size of our disk's parent device
* from udev (where sda1's parent is sda).
*/
if (nvlist_lookup_uint64(udev_nvl,
DEV_PARENT_SIZE, &udev_parent_size) != 0) {
udev_parent_size = 0;
}
conf_size = vdev_size_from_config(zhp,
fullpath);
wholedisk = vdev_whole_disk_from_config(zhp,
fullpath);
/*
* Only attempt an autoexpand if the vdev size
* changed. There are two different cases
* to consider.
*
* 1. wholedisk=1
* If you do a 'zpool create' on a whole disk
* (like /dev/sda), then zfs will create
* partitions on the disk (like /dev/sda1). In
* that case, wholedisk=1 will be set in the
* partition's nvlist config. So zed will need
* to see if your parent device (/dev/sda)
* expanded in size, and if so, then attempt
* the autoexpand.
*
* 2. wholedisk=0
* If you do a 'zpool create' on an existing
* partition, or a device that doesn't allow
* partitions, then wholedisk=0, and you will
* simply need to check if the device itself
* expanded in size.
*/
if (DEVICE_GREW(conf_size, udev_size) ||
(wholedisk && DEVICE_GREW(conf_size,
udev_parent_size))) {
error = zpool_vdev_online(zhp, fullpath,
0, &newstate);
zed_log_msg(LOG_INFO,
"%s: autoexpanding '%s' from %llu"
" to %llu bytes in pool '%s': %d",
__func__, fullpath, conf_size,
MAX(udev_size, udev_parent_size),
zpool_get_name(zhp), error);
}
}
}
zpool_close(zhp);
@@ -986,10 +1134,11 @@ zfs_deliver_dle(nvlist_t *nvl)
strlcpy(name, devname, MAXPATHLEN);
zfs_append_partition(name, MAXPATHLEN);
} else {
sprintf(name, "unknown");
zed_log_msg(LOG_INFO, "zfs_deliver_dle: no guid or physpath");
}
if (zpool_iter(g_zfshdl, zfsdle_vdev_online, name) != 1) {
if (zpool_iter(g_zfshdl, zfsdle_vdev_online, nvl) != 1) {
zed_log_msg(LOG_INFO, "zfs_deliver_dle: device '%s' not "
"found", name);
return (1);
@@ -1075,7 +1224,7 @@ zfs_enum_pools(void *arg)
* For now, each agent has its own libzfs instance
*/
int
zfs_slm_init()
zfs_slm_init(void)
{
if ((g_zfshdl = libzfs_init()) == NULL)
return (-1);
@@ -1101,7 +1250,7 @@ zfs_slm_init()
}
void
zfs_slm_fini()
zfs_slm_fini(void)
{
unavailpool_t *pool;
pendingdev_t *device;
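The DEVICE_GREW macro introduced in the autoexpand hunk above relies on integer division, so its threshold is easiest to see with concrete numbers. The following is a function-form sketch of the same test; the name `device_grew` is hypothetical and the code is only an illustration of the arithmetic, not part of zed.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Function form of the DEVICE_GREW test: true when newsize exceeds
 * oldsize and the difference is at least about 1% of newsize.
 * Integer division is intentional and matches the macro.
 */
static bool
device_grew(uint64_t oldsize, uint64_t newsize)
{
	if (newsize <= oldsize)
		return (false);	/* also keeps the divisor non-zero */
	return (newsize / (newsize - oldsize) <= 100);
}
```

With these definitions, growing from 1000 to 1010 bytes does not count (1010 / 10 is 101), while growing to 1011 does (1011 / 11 is 91). Note also that an old size of 0, which is what vdev_size_from_config() returns on error, makes any non-zero new size count as growth.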
+2 -2
@@ -37,7 +37,7 @@ if [ "${ZEVENT_VDEV_STATE_STR}" != "FAULTED" ] \
fi
umask 077
note_subject="ZFS device fault for pool ${ZEVENT_POOL_GUID} on $(hostname)"
note_subject="ZFS device fault for pool ${ZEVENT_POOL} on $(hostname)"
note_pathname="$(mktemp)"
{
if [ "${ZEVENT_VDEV_STATE_STR}" = "FAULTED" ] ; then
@@ -65,7 +65,7 @@ note_pathname="$(mktemp)"
[ -n "${ZEVENT_VDEV_GUID}" ] && echo " vguid: ${ZEVENT_VDEV_GUID}"
[ -n "${ZEVENT_VDEV_DEVID}" ] && echo " devid: ${ZEVENT_VDEV_DEVID}"
echo " pool: ${ZEVENT_POOL_GUID}"
echo " pool: ${ZEVENT_POOL} (${ZEVENT_POOL_GUID})"
} > "${note_pathname}"
+54 -14
@@ -78,6 +78,8 @@ zed_udev_event(const char *class, const char *subclass, nvlist_t *nvl)
zed_log_msg(LOG_INFO, "\t%s: %s", DEV_PHYS_PATH, strval);
if (nvlist_lookup_uint64(nvl, DEV_SIZE, &numval) == 0)
zed_log_msg(LOG_INFO, "\t%s: %llu", DEV_SIZE, numval);
if (nvlist_lookup_uint64(nvl, DEV_PARENT_SIZE, &numval) == 0)
zed_log_msg(LOG_INFO, "\t%s: %llu", DEV_PARENT_SIZE, numval);
if (nvlist_lookup_uint64(nvl, ZFS_EV_POOL_GUID, &numval) == 0)
zed_log_msg(LOG_INFO, "\t%s: %llu", ZFS_EV_POOL_GUID, numval);
if (nvlist_lookup_uint64(nvl, ZFS_EV_VDEV_GUID, &numval) == 0)
@@ -130,6 +132,20 @@ dev_event_nvlist(struct udev_device *dev)
numval *= strtoull(value, NULL, 10);
(void) nvlist_add_uint64(nvl, DEV_SIZE, numval);
/*
* If the device has a parent, then get the parent block
* device's size as well. For example, /dev/sda1's parent
* is /dev/sda.
*/
struct udev_device *parent_dev = udev_device_get_parent(dev);
if ((value = udev_device_get_sysattr_value(parent_dev, "size"))
!= NULL) {
uint64_t numval = DEV_BSIZE;
numval *= strtoull(value, NULL, 10);
(void) nvlist_add_uint64(nvl, DEV_PARENT_SIZE, numval);
}
}
/*
@@ -169,7 +185,7 @@ zed_udev_monitor(void *arg)
while (1) {
struct udev_device *dev;
const char *action, *type, *part, *sectors;
const char *bus, *uuid;
const char *bus, *uuid, *devpath;
const char *class, *subclass;
nvlist_t *nvl;
boolean_t is_zfs = B_FALSE;
@@ -208,6 +224,12 @@ zed_udev_monitor(void *arg)
* if this is a disk and it is partitioned, then the
* zfs label will reside in a DEVTYPE=partition and
* we can skip passing this event
*
* Special case: Blank disks are sometimes reported with
* an erroneous 'atari' partition, and should not be
* excluded from being used as an autoreplace disk:
*
* https://github.com/openzfs/zfs/issues/13497
*/
type = udev_device_get_property_value(dev, "DEVTYPE");
part = udev_device_get_property_value(dev,
@@ -215,14 +237,23 @@ zed_udev_monitor(void *arg)
if (type != NULL && type[0] != '\0' &&
strcmp(type, "disk") == 0 &&
part != NULL && part[0] != '\0') {
zed_log_msg(LOG_INFO,
"%s: skip %s since it has a %s partition already",
__func__,
udev_device_get_property_value(dev, "DEVNAME"),
part);
/* skip and wait for partition event */
udev_device_unref(dev);
continue;
const char *devname =
udev_device_get_property_value(dev, "DEVNAME");
if (strcmp(part, "atari") == 0) {
zed_log_msg(LOG_INFO,
"%s: %s is reporting an atari partition, "
"but we're going to assume it's a false "
"positive and still use it (issue #13497)",
__func__, devname);
} else {
zed_log_msg(LOG_INFO,
"%s: skip %s since it has a %s partition "
"already", __func__, devname, part);
/* skip and wait for partition event */
udev_device_unref(dev);
continue;
}
}
/*
@@ -248,10 +279,19 @@ zed_udev_monitor(void *arg)
* device id string is required in the message schema
* for matching with vdevs. Preflight here for expected
* udev information.
*
* Special case:
* NVMe devices don't have ID_BUS set (at least on RHEL 7-8),
* but they are valid for autoreplace. Add a special case for
* them by searching for "/nvme/" in the udev DEVPATH:
*
* DEVPATH=/devices/pci0000:00/0000:00:1e.0/nvme/nvme2/nvme2n1
*/
bus = udev_device_get_property_value(dev, "ID_BUS");
uuid = udev_device_get_property_value(dev, "DM_UUID");
if (!is_zfs && (bus == NULL && uuid == NULL)) {
devpath = udev_device_get_devpath(dev);
if (!is_zfs && (bus == NULL && uuid == NULL &&
strstr(devpath, "/nvme/") == NULL)) {
zed_log_msg(LOG_INFO, "zed_udev_monitor: %s no devid "
"source", udev_device_get_devnode(dev));
udev_device_unref(dev);
@@ -362,7 +402,7 @@ zed_udev_monitor(void *arg)
}
int
zed_disk_event_init()
zed_disk_event_init(void)
{
int fd, fflags;
@@ -398,7 +438,7 @@ zed_disk_event_init()
}
void
zed_disk_event_fini()
zed_disk_event_fini(void)
{
/* cancel monitor thread at recvmsg() */
(void) pthread_cancel(g_mon_tid);
@@ -416,13 +456,13 @@ zed_disk_event_fini()
#include "zed_disk_event.h"
int
zed_disk_event_init()
zed_disk_event_init(void)
{
return (0);
}
void
zed_disk_event_fini()
zed_disk_event_fini(void)
{
}
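The NVMe special case added to zed_udev_monitor() above boils down to a three-way test on udev properties. A minimal sketch of that decision as a pure function follows; `has_devid_source` is a hypothetical name, and the NULL guard on the device path is an assumption of this sketch rather than something the diff does.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical predicate: does a non-ZFS block device carry enough
 * identity for autoreplace? Any of ID_BUS, DM_UUID, or an NVMe-looking
 * DEVPATH is treated as sufficient. NULL means "property not set".
 */
static bool
has_devid_source(const char *bus, const char *uuid, const char *devpath)
{
	if (bus != NULL || uuid != NULL)
		return (true);
	/* The NULL guard on devpath is an addition in this sketch. */
	return (devpath != NULL && strstr(devpath, "/nvme/") != NULL);
}
```

Keeping the test free of libudev calls makes it straightforward to exercise with sample DEVPATH strings such as the one quoted in the comment.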
+2 -2
@@ -2480,7 +2480,7 @@ upgrade_set_callback(zfs_handle_t *zhp, void *data)
/* upgrade */
if (version < cb->cb_version) {
char verstr[16];
char verstr[24];
(void) snprintf(verstr, sizeof (verstr),
"%llu", (u_longlong_t)cb->cb_version);
if (cb->cb_lastfs[0] && !same_pool(zhp, cb->cb_lastfs)) {
@@ -8535,7 +8535,7 @@ static int
zfs_do_wait(int argc, char **argv)
{
boolean_t enabled[ZFS_WAIT_NUM_ACTIVITIES];
int error, i;
int error = 0, i;
int c;
/* By default, wait for all types of activity. */
+10 -4
@@ -207,7 +207,6 @@ static int
zfs_project_handle_dir(const char *name, zfs_project_control_t *zpc,
list_t *head)
{
char fullname[PATH_MAX];
struct dirent *ent;
DIR *dir;
int ret = 0;
@@ -227,21 +226,28 @@ zfs_project_handle_dir(const char *name, zfs_project_control_t *zpc,
zpc->zpc_ignore_noent = B_TRUE;
errno = 0;
while (!ret && (ent = readdir(dir)) != NULL) {
char *fullname;
/* skip "." and ".." */
if (strcmp(ent->d_name, ".") == 0 ||
strcmp(ent->d_name, "..") == 0)
continue;
if (strlen(ent->d_name) + strlen(name) >=
sizeof (fullname) + 1) {
if (strlen(ent->d_name) + strlen(name) + 1 >= PATH_MAX) {
errno = ENAMETOOLONG;
break;
}
sprintf(fullname, "%s/%s", name, ent->d_name);
if (asprintf(&fullname, "%s/%s", name, ent->d_name) == -1) {
errno = ENOMEM;
break;
}
ret = zfs_project_handle_one(fullname, zpc);
if (!ret && zpc->zpc_recursive && ent->d_type == DT_DIR)
zfs_project_item_alloc(head, fullname);
free(fullname);
}
if (errno && !ret) {
+27 -9
@@ -2438,7 +2438,14 @@ print_status_config(zpool_handle_t *zhp, status_cbdata_t *cb, const char *name,
(void) nvlist_lookup_uint64_array(root, ZPOOL_CONFIG_SCAN_STATS,
(uint64_t **)&ps, &c);
if (ps != NULL && ps->pss_state == DSS_SCANNING && children == 0) {
/*
* If you force fault a drive that's resilvering, its scan stats can
* get frozen in time, giving the false impression that it's
* being resilvered. That's why we check the state to see if the vdev
* is healthy before reporting "resilvering" or "repairing".
*/
if (ps != NULL && ps->pss_state == DSS_SCANNING && children == 0 &&
vs->vs_state == VDEV_STATE_HEALTHY) {
if (vs->vs_scan_processed != 0) {
(void) printf(gettext(" (%s)"),
(ps->pss_func == POOL_SCAN_RESILVER) ?
@@ -2450,7 +2457,7 @@ print_status_config(zpool_handle_t *zhp, status_cbdata_t *cb, const char *name,
/* The top-level vdevs have the rebuild stats */
if (vrs != NULL && vrs->vrs_state == VDEV_REBUILD_ACTIVE &&
children == 0) {
children == 0 && vs->vs_state == VDEV_STATE_HEALTHY) {
if (vs->vs_rebuild_processed != 0) {
(void) printf(gettext(" (resilvering)"));
}
@@ -5407,7 +5414,13 @@ print_zpool_dir_scripts(char *dirpath)
if ((dir = opendir(dirpath)) != NULL) {
/* print all the files and directories within directory */
while ((ent = readdir(dir)) != NULL) {
sprintf(fullpath, "%s/%s", dirpath, ent->d_name);
if (snprintf(fullpath, sizeof (fullpath), "%s/%s",
dirpath, ent->d_name) >= sizeof (fullpath)) {
(void) fprintf(stderr,
gettext("internal error: "
"ZPOOL_SCRIPTS_PATH too large.\n"));
exit(1);
}
/* Print the scripts */
if (stat(fullpath, &dir_stat) == 0)
@@ -5458,8 +5471,8 @@ get_namewidth_iostat(zpool_handle_t *zhp, void *data)
* get_namewidth() returns the maximum width of any name in that column
* for any pool/vdev/device line that will be output.
*/
width = get_namewidth(zhp, cb->cb_namewidth, cb->cb_name_flags,
cb->cb_verbose);
width = get_namewidth(zhp, cb->cb_namewidth,
cb->cb_name_flags | VDEV_NAME_TYPE_ID, cb->cb_verbose);
/*
* The width we are calculating is the width of the header and also the
@@ -6035,6 +6048,7 @@ print_one_column(zpool_prop_t prop, uint64_t value, const char *str,
size_t width = zprop_width(prop, &fixed, ZFS_TYPE_POOL);
switch (prop) {
case ZPOOL_PROP_SIZE:
case ZPOOL_PROP_EXPANDSZ:
case ZPOOL_PROP_CHECKPOINT:
case ZPOOL_PROP_DEDUPRATIO:
@@ -6130,8 +6144,12 @@ print_list_stats(zpool_handle_t *zhp, const char *name, nvlist_t *nv,
* 'toplevel' boolean value is passed to the print_one_column()
* to indicate that the value is valid.
*/
print_one_column(ZPOOL_PROP_SIZE, vs->vs_space, NULL, scripted,
toplevel, format);
if (vs->vs_pspace)
print_one_column(ZPOOL_PROP_SIZE, vs->vs_pspace, NULL,
scripted, B_TRUE, format);
else
print_one_column(ZPOOL_PROP_SIZE, vs->vs_space, NULL,
scripted, toplevel, format);
print_one_column(ZPOOL_PROP_ALLOCATED, vs->vs_alloc, NULL,
scripted, toplevel, format);
print_one_column(ZPOOL_PROP_FREE, vs->vs_space - vs->vs_alloc,
@@ -6282,8 +6300,8 @@ get_namewidth_list(zpool_handle_t *zhp, void *data)
list_cbdata_t *cb = data;
int width;
width = get_namewidth(zhp, cb->cb_namewidth, cb->cb_name_flags,
cb->cb_verbose);
width = get_namewidth(zhp, cb->cb_namewidth,
cb->cb_name_flags | VDEV_NAME_TYPE_ID, cb->cb_verbose);
if (width < 9)
width = 9;
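The print_zpool_dir_scripts() hunk above replaces sprintf() with a length-checked snprintf(). The pattern is sketched below as a standalone helper; the name `join_path` and its boolean return convention are assumptions for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: write "dir/name" into buf and report whether it
 * fit. snprintf() returns the length the full string would have had,
 * so a result >= buflen means the output was truncated.
 */
static bool
join_path(char *buf, size_t buflen, const char *dir, const char *name)
{
	int n = snprintf(buf, buflen, "%s/%s", dir, name);

	return (n >= 0 && (size_t)n < buflen);
}
```

Casting the return value to size_t only after checking that it is non-negative avoids the signed/unsigned comparison that a direct `>= sizeof (buf)` test would involve.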
+3 -3
@@ -363,9 +363,6 @@ zstream_do_dump(int argc, char *argv[])
BSWAP_64(drrb->drr_fromguid);
}
featureflags =
DMU_GET_FEATUREFLAGS(drrb->drr_versioninfo);
(void) printf("BEGIN record\n");
(void) printf("\thdrtype = %lld\n",
DMU_GET_STREAM_HDRTYPE(drrb->drr_versioninfo));
@@ -465,6 +462,9 @@ zstream_do_dump(int argc, char *argv[])
BSWAP_64(drro->drr_maxblkid);
}
featureflags =
DMU_GET_FEATUREFLAGS(drrb->drr_versioninfo);
if (featureflags & DMU_BACKUP_FEATURE_RAW &&
drro->drr_bonuslen > drro->drr_raw_bonuslen) {
(void) fprintf(stderr,
+1
@@ -2193,6 +2193,7 @@ ztest_replay_write(void *arg1, void *arg2, boolean_t byteswap)
* but not always, because we also want to verify correct
* behavior when the data was not recently read into cache.
*/
ASSERT(doi.doi_data_block_size);
ASSERT0(offset % doi.doi_data_block_size);
if (ztest_random(4) != 0) {
int prefetch = ztest_random(2) ?
+7
@@ -109,6 +109,13 @@ while [ "$outer_loop" -lt 20 ]; do
exit 0
fi
fi
#
# zvol_count made some progress - let's stay in this loop.
#
if [ "$old_zvols_count" -gt "$zvols_count" ]; then
outer_loop=$((outer_loop - 1))
fi
done
echo "Timed out waiting on zvol links"
+32 -36
@@ -88,7 +88,7 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_CC_NO_FORMAT_TRUNCATION], [
])
dnl #
dnl # Check if gcc supports -Wno-format-truncation option.
dnl # Check if gcc supports -Wno-format-zero-length option.
dnl #
AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_CC_NO_FORMAT_ZERO_LENGTH], [
AC_MSG_CHECKING([whether $CC supports -Wno-format-zero-length])
@@ -108,57 +108,30 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_CC_NO_FORMAT_ZERO_LENGTH], [
AC_SUBST([NO_FORMAT_ZERO_LENGTH])
])
dnl #
dnl # Check if gcc supports -Wno-bool-compare option.
dnl # Check if gcc supports -Wno-clobbered option.
dnl #
dnl # We actually invoke gcc with the -Wbool-compare option
dnl # We actually invoke gcc with the -Wclobbered option
dnl # and infer the 'no-' version does or doesn't exist based upon
dnl # the results. This is required because when checking any of
dnl # no- prefixed options gcc always returns success.
dnl #
AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_CC_NO_BOOL_COMPARE], [
AC_MSG_CHECKING([whether $CC supports -Wno-bool-compare])
AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_CC_NO_CLOBBERED], [
AC_MSG_CHECKING([whether $CC supports -Wno-clobbered])
saved_flags="$CFLAGS"
CFLAGS="$CFLAGS -Werror -Wbool-compare"
CFLAGS="$CFLAGS -Werror -Wclobbered"
AC_COMPILE_IFELSE([AC_LANG_PROGRAM([], [])], [
NO_BOOL_COMPARE=-Wno-bool-compare
NO_CLOBBERED=-Wno-clobbered
AC_MSG_RESULT([yes])
], [
NO_BOOL_COMPARE=
NO_CLOBBERED=
AC_MSG_RESULT([no])
])
CFLAGS="$saved_flags"
AC_SUBST([NO_BOOL_COMPARE])
])
dnl #
dnl # Check if gcc supports -Wno-unused-but-set-variable option.
dnl #
dnl # We actually invoke gcc with the -Wunused-but-set-variable option
dnl # and infer the 'no-' version does or doesn't exist based upon
dnl # the results. This is required because when checking any of
dnl # no- prefixed options gcc always returns success.
dnl #
AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_CC_NO_UNUSED_BUT_SET_VARIABLE], [
AC_MSG_CHECKING([whether $CC supports -Wno-unused-but-set-variable])
saved_flags="$CFLAGS"
CFLAGS="$CFLAGS -Werror -Wunused-but-set-variable"
AC_COMPILE_IFELSE([AC_LANG_PROGRAM([], [])], [
NO_UNUSED_BUT_SET_VARIABLE=-Wno-unused-but-set-variable
AC_MSG_RESULT([yes])
], [
NO_UNUSED_BUT_SET_VARIABLE=
AC_MSG_RESULT([no])
])
CFLAGS="$saved_flags"
AC_SUBST([NO_UNUSED_BUT_SET_VARIABLE])
AC_SUBST([NO_CLOBBERED])
])
dnl #
@@ -184,6 +157,29 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_CC_IMPLICIT_FALLTHROUGH], [
AC_SUBST([IMPLICIT_FALLTHROUGH])
])
dnl #
dnl # Check if cc supports -Winfinite-recursion option.
dnl #
AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_CC_INFINITE_RECURSION], [
AC_MSG_CHECKING([whether $CC supports -Winfinite-recursion])
saved_flags="$CFLAGS"
CFLAGS="$CFLAGS -Werror -Winfinite-recursion"
AC_COMPILE_IFELSE([AC_LANG_PROGRAM([], [])], [
INFINITE_RECURSION=-Winfinite-recursion
AC_DEFINE([HAVE_INFINITE_RECURSION], 1,
[Define if compiler supports -Winfinite-recursion])
AC_MSG_RESULT([yes])
], [
INFINITE_RECURSION=
AC_MSG_RESULT([no])
])
CFLAGS="$saved_flags"
AC_SUBST([INFINITE_RECURSION])
])
dnl #
dnl # Check if gcc supports -fno-omit-frame-pointer option.
dnl #
+8
@@ -0,0 +1,8 @@
dnl #
dnl # Check if GNU parallel is available.
dnl #
AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_PARALLEL], [
AC_CHECK_PROG([PARALLEL], [parallel], [yes])
AM_CONDITIONAL([HAVE_PARALLEL], [test "x$PARALLEL" = "xyes"])
])
+10
@@ -46,6 +46,16 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_PYZFS], [
])
AC_SUBST(DEFINE_PYZFS)
dnl #
dnl # Autodetection disables pyzfs if kernel or srpm config
dnl #
AS_IF([test "x$enable_pyzfs" = xcheck], [
AS_IF([test "x$ZFS_CONFIG" = xkernel -o "x$ZFS_CONFIG" = xsrpm ], [
enable_pyzfs=no
AC_MSG_NOTICE([Disabling pyzfs for kernel/srpm config])
])
])
dnl #
dnl # Python "packaging" (or, failing that, "distlib") module is required to build and install pyzfs
dnl #
+35 -36
@@ -97,23 +97,13 @@ AC_DEFUN([AX_PYTHON_DEVEL],[
# Check for a version of Python >= 2.1.0
#
AC_MSG_CHECKING([for a version of Python >= '2.1.0'])
ac_supports_python_ver=`cat<<EOD | $PYTHON -
from __future__ import print_function;
import sys;
try:
from packaging import version;
except ImportError:
from distlib import version;
ver = sys.version.split ()[[0]];
(tst_cmp, tst_ver) = ">= '2.1.0'".split ();
tst_ver = tst_ver.strip ("'");
eval ("print (version.LegacyVersion (ver)"+ tst_cmp +"version.LegacyVersion (tst_ver))")
EOD`
ac_supports_python_ver=`$PYTHON -c "import sys; \
ver = sys.version.split ()[[0]]; \
print (ver >= '2.1.0')"`
if test "$ac_supports_python_ver" != "True"; then
if test -z "$PYTHON_NOVERSIONCHECK"; then
AC_MSG_RESULT([no])
m4_ifvaln([$2],[$2],[
AC_MSG_FAILURE([
AC_MSG_FAILURE([
This version of the AC@&t@_PYTHON_DEVEL macro
doesn't work properly with versions of Python before
2.1.0. You may need to re-run configure, setting the
@@ -122,7 +112,6 @@ PYTHON_EXTRA_LIBS and PYTHON_EXTRA_LDFLAGS by hand.
Moreover, to disable this check, set PYTHON_NOVERSIONCHECK
to something else than an empty string.
])
])
else
AC_MSG_RESULT([skip at user request])
fi
@@ -131,37 +120,47 @@ to something else than an empty string.
fi
#
# if the macro parameter ``version'' is set, honour it
# If the macro parameter ``version'' is set, honour it.
# A Python shim class, VPy, is used to implement correct version comparisons via
# string expressions, since e.g. a naive textual ">= 2.7.3" won't work for
# Python 2.7.10 (the ".1" being evaluated as less than ".3").
#
if test -n "$1"; then
AC_MSG_CHECKING([for a version of Python $1])
# Why the strip ()? Because if we don't, version.parse
# will, for example, report 3.10.0 >= '3.11.0'
ac_supports_python_ver=`cat<<EOD | $PYTHON -
from __future__ import print_function;
import sys;
try:
from packaging import version;
except ImportError:
from distlib import version;
ver = sys.version.split ()[[0]];
(tst_cmp, tst_ver) = "$1".split ();
tst_ver = tst_ver.strip ("'");
eval ("print (version.LegacyVersion (ver)"+ tst_cmp +"version.LegacyVersion (tst_ver))")
EOD`
cat << EOF > ax_python_devel_vpy.py
class VPy:
def vtup(self, s):
return tuple(map(int, s.strip().replace("rc", ".").split(".")))
def __init__(self):
import sys
self.vpy = tuple(sys.version_info)
def __eq__(self, s):
return self.vpy == self.vtup(s)
def __ne__(self, s):
return self.vpy != self.vtup(s)
def __lt__(self, s):
return self.vpy < self.vtup(s)
def __gt__(self, s):
return self.vpy > self.vtup(s)
def __le__(self, s):
return self.vpy <= self.vtup(s)
def __ge__(self, s):
return self.vpy >= self.vtup(s)
EOF
ac_supports_python_ver=`$PYTHON -c "import ax_python_devel_vpy; \
ver = ax_python_devel_vpy.VPy(); \
print (ver $1)"`
rm -rf ax_python_devel_vpy*.py* __pycache__/ax_python_devel_vpy*.py*
if test "$ac_supports_python_ver" = "True"; then
AC_MSG_RESULT([yes])
AC_MSG_RESULT([yes])
else
AC_MSG_RESULT([no])
m4_ifvaln([$2],[$2],[
AC_MSG_ERROR([this package requires Python $1.
AC_MSG_ERROR([this package requires Python $1.
If you have it installed, but it isn't the default Python
interpreter in your system path, please pass the PYTHON_VERSION
variable to configure. See ``configure --help'' for reference.
])
PYTHON_VERSION=""
])
PYTHON_VERSION=""
fi
fi
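The comment in this hunk notes that a naive textual comparison such as ">= 2.7.3" gives the wrong answer for Python 2.7.10. The sketch below shows the difference between comparing version strings and comparing integer tuples. The standalone `vtup` function mirrors the `VPy.vtup` helper from the hunk; running it outside the class is only for illustration.

```python
def vtup(s):
    """Turn a dotted version string into a tuple of ints (mirrors VPy.vtup)."""
    return tuple(map(int, s.strip().replace("rc", ".").split(".")))


# String comparison is character by character, so "1" < "3" decides it.
naive = "2.7.10" >= "2.7.3"

# Tuple comparison is numeric per component, so 10 > 3 decides it.
numeric = vtup("2.7.10") >= vtup("2.7.3")

print(naive, numeric)
```

Comparing tuples of integers orders each component numerically, which is why the macro builds a tuple before applying the user-supplied comparison.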
+2 -2
@@ -66,7 +66,7 @@ deb-utils: deb-local rpm-utils-initramfs
## to do this, so we install a shim onto the path which calls the real
## dh_shlibdeps with the required arguments.
path_prepend=`mktemp -d /tmp/intercept.XXXXXX`; \
echo "#$(SHELL)" > $${path_prepend}/dh_shlibdeps; \
echo "#!$(SHELL)" > $${path_prepend}/dh_shlibdeps; \
echo "`which dh_shlibdeps` -- \
-xlibuutil3linux -xlibnvpair3linux -xlibzfs5linux -xlibzpool5linux" \
>> $${path_prepend}/dh_shlibdeps; \
@@ -74,7 +74,7 @@ deb-utils: deb-local rpm-utils-initramfs
## Debianized packages from the auto-generated dependencies of the new debs,
## which should NOT be mixed with the alien-generated debs created here
chmod +x $${path_prepend}/dh_shlibdeps; \
env PATH=$${path_prepend}:$${PATH} \
env "PATH=$${path_prepend}:$${PATH}" \
fakeroot $(ALIEN) --bump=0 --scripts --to-deb --target=$$debarch \
$$pkg1 $$pkg2 $$pkg3 $$pkg4 $$pkg5 $$pkg6 $$pkg7 \
$$pkg8 $$pkg9 $$pkg10 || exit 1; \
+46 -4
@@ -165,6 +165,9 @@ dnl #
dnl # 5.15 API change,
dnl # Added the bool rcu argument to get_acl for rcu path walk.
dnl #
dnl # 6.2 API change,
dnl # get_acl() was renamed to get_inode_acl()
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_INODE_OPERATIONS_GET_ACL], [
ZFS_LINUX_TEST_SRC([inode_operations_get_acl], [
#include <linux/fs.h>
@@ -189,6 +192,18 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_INODE_OPERATIONS_GET_ACL], [
.get_acl = get_acl_fn,
};
],[])
ZFS_LINUX_TEST_SRC([inode_operations_get_inode_acl], [
#include <linux/fs.h>
struct posix_acl *get_inode_acl_fn(struct inode *inode, int type,
bool rcu) { return NULL; }
static const struct inode_operations
iops __attribute__ ((unused)) = {
.get_inode_acl = get_inode_acl_fn,
};
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_INODE_OPERATIONS_GET_ACL], [
@@ -201,7 +216,12 @@ AC_DEFUN([ZFS_AC_KERNEL_INODE_OPERATIONS_GET_ACL], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_GET_ACL_RCU, 1, [iops->get_acl() takes rcu])
],[
ZFS_LINUX_TEST_ERROR([iops->get_acl()])
ZFS_LINUX_TEST_RESULT([inode_operations_get_inode_acl], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_GET_INODE_ACL, 1, [has iops->get_inode_acl()])
],[
ZFS_LINUX_TEST_ERROR([iops->get_acl() or iops->get_inode_acl()])
])
])
])
])
@@ -213,7 +233,22 @@ dnl #
dnl # 5.12 API change,
dnl # set_acl() added a user_namespace* parameter first
dnl #
dnl # 6.2 API change,
dnl # set_acl() second parameter changed to a struct dentry *
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_INODE_OPERATIONS_SET_ACL], [
ZFS_LINUX_TEST_SRC([inode_operations_set_acl_userns_dentry], [
#include <linux/fs.h>
int set_acl_fn(struct user_namespace *userns,
struct dentry *dent, struct posix_acl *acl,
int type) { return 0; }
static const struct inode_operations
iops __attribute__ ((unused)) = {
.set_acl = set_acl_fn,
};
],[])
ZFS_LINUX_TEST_SRC([inode_operations_set_acl_userns], [
#include <linux/fs.h>
@@ -246,11 +281,18 @@ AC_DEFUN([ZFS_AC_KERNEL_INODE_OPERATIONS_SET_ACL], [
AC_DEFINE(HAVE_SET_ACL, 1, [iops->set_acl() exists])
AC_DEFINE(HAVE_SET_ACL_USERNS, 1, [iops->set_acl() takes 4 args])
],[
ZFS_LINUX_TEST_RESULT([inode_operations_set_acl], [
ZFS_LINUX_TEST_RESULT([inode_operations_set_acl_userns_dentry], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_SET_ACL, 1, [iops->set_acl() exists, takes 3 args])
AC_DEFINE(HAVE_SET_ACL, 1, [iops->set_acl() exists])
AC_DEFINE(HAVE_SET_ACL_USERNS_DENTRY_ARG2, 1,
[iops->set_acl() takes 4 args, arg2 is struct dentry *])
],[
AC_MSG_RESULT(no)
ZFS_LINUX_TEST_RESULT([inode_operations_set_acl], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_SET_ACL, 1, [iops->set_acl() exists, takes 3 args])
],[
ZFS_LINUX_REQUIRE_API([i_op->set_acl()], [3.14])
])
])
])
])
+1 -2
@@ -7,8 +7,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_ADD_DISK], [
#include <linux/blkdev.h>
], [
struct gendisk *disk = NULL;
int err = add_disk(disk);
err = err;
int error __attribute__ ((unused)) = add_disk(disk);
])
])
+8 -8
@@ -259,17 +259,17 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLK_QUEUE_FLUSH], [
ZFS_LINUX_TEST_SRC([blk_queue_flush], [
#include <linux/blkdev.h>
], [
struct request_queue *q = NULL;
struct request_queue *q __attribute__ ((unused)) = NULL;
(void) blk_queue_flush(q, REQ_FLUSH);
], [$NO_UNUSED_BUT_SET_VARIABLE], [ZFS_META_LICENSE])
], [], [ZFS_META_LICENSE])
ZFS_LINUX_TEST_SRC([blk_queue_write_cache], [
#include <linux/kernel.h>
#include <linux/blkdev.h>
], [
struct request_queue *q = NULL;
struct request_queue *q __attribute__ ((unused)) = NULL;
blk_queue_write_cache(q, true, true);
], [$NO_UNUSED_BUT_SET_VARIABLE], [ZFS_META_LICENSE])
], [], [ZFS_META_LICENSE])
])
AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_FLUSH], [
@@ -322,9 +322,9 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLK_QUEUE_MAX_HW_SECTORS], [
ZFS_LINUX_TEST_SRC([blk_queue_max_hw_sectors], [
#include <linux/blkdev.h>
], [
struct request_queue *q = NULL;
struct request_queue *q __attribute__ ((unused)) = NULL;
(void) blk_queue_max_hw_sectors(q, BLK_SAFE_MAX_SECTORS);
], [$NO_UNUSED_BUT_SET_VARIABLE])
], [])
])
AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_MAX_HW_SECTORS], [
@@ -345,9 +345,9 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLK_QUEUE_MAX_SEGMENTS], [
ZFS_LINUX_TEST_SRC([blk_queue_max_segments], [
#include <linux/blkdev.h>
], [
struct request_queue *q = NULL;
struct request_queue *q __attribute__ ((unused)) = NULL;
(void) blk_queue_max_segments(q, BLK_MAX_SEGMENTS);
], [$NO_UNUSED_BUT_SET_VARIABLE])
], [])
])
AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_MAX_SEGMENTS], [
+28
@@ -294,6 +294,32 @@ AC_DEFUN([ZFS_AC_KERNEL_BLKDEV_BDEV_WHOLE], [
])
])
dnl #
dnl # 5.20 API change,
dnl # Removed bdevname(), snprintf(.., %pg) should be used.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLKDEV_BDEVNAME], [
ZFS_LINUX_TEST_SRC([bdevname], [
#include <linux/fs.h>
#include <linux/blkdev.h>
], [
struct block_device *bdev __attribute__ ((unused)) = NULL;
char path[BDEVNAME_SIZE];
(void) bdevname(bdev, path);
])
])
AC_DEFUN([ZFS_AC_KERNEL_BLKDEV_BDEVNAME], [
AC_MSG_CHECKING([whether bdevname() exists])
ZFS_LINUX_TEST_RESULT([bdevname], [
AC_DEFINE(HAVE_BDEVNAME, 1, [bdevname() is available])
AC_MSG_RESULT(yes)
], [
AC_MSG_RESULT(no)
])
])
dnl #
dnl # 5.19 API: blkdev_issue_secure_erase()
dnl # 3.10 API: blkdev_issue_discard(..., BLKDEV_DISCARD_SECURE)
@@ -377,6 +403,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLKDEV], [
ZFS_AC_KERNEL_SRC_BLKDEV_CHECK_DISK_CHANGE
ZFS_AC_KERNEL_SRC_BLKDEV_BDEV_CHECK_MEDIA_CHANGE
ZFS_AC_KERNEL_SRC_BLKDEV_BDEV_WHOLE
ZFS_AC_KERNEL_SRC_BLKDEV_BDEVNAME
ZFS_AC_KERNEL_SRC_BLKDEV_ISSUE_SECURE_ERASE
])
@@ -391,6 +418,7 @@ AC_DEFUN([ZFS_AC_KERNEL_BLKDEV], [
ZFS_AC_KERNEL_BLKDEV_CHECK_DISK_CHANGE
ZFS_AC_KERNEL_BLKDEV_BDEV_CHECK_MEDIA_CHANGE
ZFS_AC_KERNEL_BLKDEV_BDEV_WHOLE
ZFS_AC_KERNEL_BLKDEV_BDEVNAME
ZFS_AC_KERNEL_BLKDEV_GET_ERESTARTSYS
ZFS_AC_KERNEL_BLKDEV_ISSUE_SECURE_ERASE
])
+12 -5
@@ -6,13 +6,16 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS_CHECK_EVENTS], [
#include <linux/blkdev.h>
unsigned int blk_check_events(struct gendisk *disk,
unsigned int clearing) { return (0); }
unsigned int clearing) {
(void) disk, (void) clearing;
return (0);
}
static const struct block_device_operations
bops __attribute__ ((unused)) = {
.check_events = blk_check_events,
};
], [], [$NO_UNUSED_BUT_SET_VARIABLE])
], [], [])
])
AC_DEFUN([ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_CHECK_EVENTS], [
@@ -31,7 +34,10 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS_RELEASE_VOID], [
ZFS_LINUX_TEST_SRC([block_device_operations_release_void], [
#include <linux/blkdev.h>
void blk_release(struct gendisk *g, fmode_t mode) { return; }
void blk_release(struct gendisk *g, fmode_t mode) {
(void) g, (void) mode;
return;
}
static const struct block_device_operations
bops __attribute__ ((unused)) = {
@@ -40,7 +46,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS_RELEASE_VOID], [
.ioctl = NULL,
.compat_ioctl = NULL,
};
], [], [$NO_UNUSED_BUT_SET_VARIABLE])
], [], [])
])
AC_DEFUN([ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_RELEASE_VOID], [
@@ -61,6 +67,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS_REVALIDATE_DISK], [
#include <linux/blkdev.h>
int blk_revalidate_disk(struct gendisk *disk) {
(void) disk;
return(0);
}
@@ -68,7 +75,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS_REVALIDATE_DISK], [
bops __attribute__ ((unused)) = {
.revalidate_disk = blk_revalidate_disk,
};
], [], [$NO_UNUSED_BUT_SET_VARIABLE])
], [], [])
])
AC_DEFUN([ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_REVALIDATE_DISK], [
+30
@@ -0,0 +1,30 @@
dnl #
dnl # 3.18 API change
dnl # Dentry aliases are in d_u struct dentry member
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_DENTRY_ALIAS_D_U], [
ZFS_LINUX_TEST_SRC([dentry_alias_d_u], [
#include <linux/fs.h>
#include <linux/dcache.h>
#include <linux/list.h>
], [
struct inode *inode __attribute__ ((unused)) = NULL;
struct dentry *dentry __attribute__ ((unused)) = NULL;
hlist_for_each_entry(dentry, &inode->i_dentry,
d_u.d_alias) {
d_drop(dentry);
}
])
])
AC_DEFUN([ZFS_AC_KERNEL_DENTRY_ALIAS_D_U], [
AC_MSG_CHECKING([whether dentry aliases are in d_u member])
ZFS_LINUX_TEST_RESULT([dentry_alias_d_u], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_DENTRY_D_U_ALIASES, 1,
[dentry aliases are in d_u member])
],[
AC_MSG_RESULT(no)
])
])
+2 -2
@@ -5,9 +5,9 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_GET_DISK_RO], [
ZFS_LINUX_TEST_SRC([get_disk_ro], [
#include <linux/blkdev.h>
],[
struct gendisk *disk = NULL;
struct gendisk *disk __attribute__ ((unused)) = NULL;
(void) get_disk_ro(disk);
], [$NO_UNUSED_BUT_SET_VARIABLE])
], [])
])
AC_DEFUN([ZFS_AC_KERNEL_GET_DISK_RO], [
+1 -1
@@ -55,7 +55,7 @@ dnl #
AC_DEFUN([ZFS_AC_KERNEL_ENUM_MEMBER], [
AC_MSG_CHECKING([whether enum $2 contains $1])
AS_IF([AC_TRY_COMMAND(
"${srcdir}/scripts/enum-extract.pl" "$2" "$3" | egrep -qx $1)],[
"${srcdir}/scripts/enum-extract.pl" "$2" "$3" | grep -Eqx $1)],[
AC_MSG_RESULT([yes])
AC_DEFINE(m4_join([_], [ZFS_ENUM], m4_toupper($2), $1), 1,
[enum $2 contains $1])
+20
@@ -49,6 +49,13 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_MAKE_REQUEST_FN], [
struct gendisk *disk __attribute__ ((unused));
disk = blk_alloc_disk(NUMA_NO_NODE);
])
ZFS_LINUX_TEST_SRC([blk_cleanup_disk], [
#include <linux/blkdev.h>
],[
struct gendisk *disk __attribute__ ((unused));
blk_cleanup_disk(disk);
])
])
AC_DEFUN([ZFS_AC_KERNEL_MAKE_REQUEST_FN], [
@@ -73,6 +80,19 @@ AC_DEFUN([ZFS_AC_KERNEL_MAKE_REQUEST_FN], [
ZFS_LINUX_TEST_RESULT([blk_alloc_disk], [
AC_MSG_RESULT(yes)
AC_DEFINE([HAVE_BLK_ALLOC_DISK], 1, [blk_alloc_disk() exists])
dnl #
dnl # 5.20 API change,
dnl # Removed blk_cleanup_disk(), put_disk() should be used.
dnl #
AC_MSG_CHECKING([whether blk_cleanup_disk() exists])
ZFS_LINUX_TEST_RESULT([blk_cleanup_disk], [
AC_MSG_RESULT(yes)
AC_DEFINE([HAVE_BLK_CLEANUP_DISK], 1,
[blk_cleanup_disk() exists])
], [
AC_MSG_RESULT(no)
])
], [
AC_MSG_RESULT(no)
])
-33
@@ -1,33 +0,0 @@
dnl #
dnl # Grsecurity kernel API change
dnl # constified parameters of module_param_call() methods
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_MODULE_PARAM_CALL_CONST], [
ZFS_LINUX_TEST_SRC([module_param_call], [
#include <linux/module.h>
#include <linux/moduleparam.h>
int param_get(char *b, const struct kernel_param *kp)
{
return (0);
}
int param_set(const char *b, const struct kernel_param *kp)
{
return (0);
}
module_param_call(p, param_set, param_get, NULL, 0644);
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_MODULE_PARAM_CALL_CONST], [
AC_MSG_CHECKING([whether module_param_call() is hardened])
ZFS_LINUX_TEST_RESULT([module_param_call], [
AC_MSG_RESULT(yes)
AC_DEFINE(MODULE_PARAM_CALL_CONST, 1,
[hardened module_param_call])
],[
AC_MSG_RESULT(no)
])
])
+52 -15
@@ -54,6 +54,21 @@ AC_DEFUN([ZFS_AC_KERNEL_SHRINK_CONTROL_HAS_NID], [
])
])
AC_DEFUN([ZFS_AC_KERNEL_SRC_REGISTER_SHRINKER_VARARG], [
ZFS_LINUX_TEST_SRC([register_shrinker_vararg], [
#include <linux/mm.h>
unsigned long shrinker_cb(struct shrinker *shrink,
struct shrink_control *sc) { return 0; }
],[
struct shrinker cache_shrinker = {
.count_objects = shrinker_cb,
.scan_objects = shrinker_cb,
.seeks = DEFAULT_SEEKS,
};
register_shrinker(&cache_shrinker, "vararg-reg-shrink-test");
])
])
AC_DEFUN([ZFS_AC_KERNEL_SRC_SHRINKER_CALLBACK], [
ZFS_LINUX_TEST_SRC([shrinker_cb_shrink_control], [
#include <linux/mm.h>
@@ -83,29 +98,50 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_SHRINKER_CALLBACK], [
AC_DEFUN([ZFS_AC_KERNEL_SHRINKER_CALLBACK],[
dnl #
dnl # 3.0 - 3.11 API change
dnl # cs->shrink(struct shrinker *, struct shrink_control *sc)
dnl # 6.0 API change
dnl # register_shrinker() becomes a var-arg function that takes
dnl # a printf-style format string as args > 0
dnl #
AC_MSG_CHECKING([whether new 2-argument shrinker exists])
ZFS_LINUX_TEST_RESULT([shrinker_cb_shrink_control], [
AC_MSG_CHECKING([whether new var-arg register_shrinker() exists])
ZFS_LINUX_TEST_RESULT([register_shrinker_vararg], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_SINGLE_SHRINKER_CALLBACK, 1,
[new shrinker callback wants 2 args])
AC_DEFINE(HAVE_REGISTER_SHRINKER_VARARG, 1,
[register_shrinker is vararg])
dnl # We assume that the split shrinker callback exists if the
dnl # vararg register_shrinker() exists, because the latter is
dnl # a much more recent addition, and the macro test for the
dnl # var-arg version only works if the callback is split
AC_DEFINE(HAVE_SPLIT_SHRINKER_CALLBACK, 1,
[cs->count_objects exists])
],[
AC_MSG_RESULT(no)
dnl #
dnl # 3.12 API change,
dnl # cs->shrink() is logically split in to
dnl # cs->count_objects() and cs->scan_objects()
dnl # 3.0 - 3.11 API change
dnl # cs->shrink(struct shrinker *, struct shrink_control *sc)
dnl #
AC_MSG_CHECKING([whether cs->count_objects callback exists])
ZFS_LINUX_TEST_RESULT([shrinker_cb_shrink_control_split], [
AC_MSG_CHECKING([whether new 2-argument shrinker exists])
ZFS_LINUX_TEST_RESULT([shrinker_cb_shrink_control], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_SPLIT_SHRINKER_CALLBACK, 1,
[cs->count_objects exists])
AC_DEFINE(HAVE_SINGLE_SHRINKER_CALLBACK, 1,
[new shrinker callback wants 2 args])
],[
ZFS_LINUX_TEST_ERROR([shrinker])
AC_MSG_RESULT(no)
dnl #
dnl # 3.12 API change,
dnl # cs->shrink() is logically split in to
dnl # cs->count_objects() and cs->scan_objects()
dnl #
AC_MSG_CHECKING([if cs->count_objects callback exists])
ZFS_LINUX_TEST_RESULT(
[shrinker_cb_shrink_control_split],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_SPLIT_SHRINKER_CALLBACK, 1,
[cs->count_objects exists])
],[
ZFS_LINUX_TEST_ERROR([shrinker])
])
])
])
])
@@ -141,6 +177,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_SHRINKER], [
ZFS_AC_KERNEL_SRC_SHRINK_CONTROL_HAS_NID
ZFS_AC_KERNEL_SRC_SHRINKER_CALLBACK
ZFS_AC_KERNEL_SRC_SHRINK_CONTROL_STRUCT
ZFS_AC_KERNEL_SRC_REGISTER_SHRINKER_VARARG
])
AC_DEFUN([ZFS_AC_KERNEL_SHRINKER], [
+27 -5
@@ -3,11 +3,25 @@ dnl # 3.11 API change
dnl # Add support for i_op->tmpfile
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_TMPFILE], [
dnl #
dnl # 6.1 API change
dnl # use struct file instead of struct dentry
dnl #
ZFS_LINUX_TEST_SRC([inode_operations_tmpfile], [
#include <linux/fs.h>
int tmpfile(struct user_namespace *userns,
struct inode *inode, struct file *file,
umode_t mode) { return 0; }
static struct inode_operations
iops __attribute__ ((unused)) = {
.tmpfile = tmpfile,
};
],[])
dnl #
dnl # 5.11 API change
dnl # add support for userns parameter to tmpfile
dnl #
ZFS_LINUX_TEST_SRC([inode_operations_tmpfile_userns], [
ZFS_LINUX_TEST_SRC([inode_operations_tmpfile_dentry_userns], [
#include <linux/fs.h>
int tmpfile(struct user_namespace *userns,
struct inode *inode, struct dentry *dentry,
@@ -17,7 +31,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_TMPFILE], [
.tmpfile = tmpfile,
};
],[])
ZFS_LINUX_TEST_SRC([inode_operations_tmpfile], [
ZFS_LINUX_TEST_SRC([inode_operations_tmpfile_dentry], [
#include <linux/fs.h>
int tmpfile(struct inode *inode, struct dentry *dentry,
umode_t mode) { return 0; }
@@ -30,16 +44,24 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_TMPFILE], [
AC_DEFUN([ZFS_AC_KERNEL_TMPFILE], [
AC_MSG_CHECKING([whether i_op->tmpfile() exists])
ZFS_LINUX_TEST_RESULT([inode_operations_tmpfile_userns], [
ZFS_LINUX_TEST_RESULT([inode_operations_tmpfile], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_TMPFILE, 1, [i_op->tmpfile() exists])
AC_DEFINE(HAVE_TMPFILE_USERNS, 1, [i_op->tmpfile() has userns])
],[
ZFS_LINUX_TEST_RESULT([inode_operations_tmpfile], [
ZFS_LINUX_TEST_RESULT([inode_operations_tmpfile_dentry_userns], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_TMPFILE, 1, [i_op->tmpfile() exists])
AC_DEFINE(HAVE_TMPFILE_USERNS, 1, [i_op->tmpfile() has userns])
AC_DEFINE(HAVE_TMPFILE_DENTRY, 1, [i_op->tmpfile() uses old dentry signature])
],[
AC_MSG_RESULT(no)
ZFS_LINUX_TEST_RESULT([inode_operations_tmpfile_dentry], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_TMPFILE, 1, [i_op->tmpfile() exists])
AC_DEFINE(HAVE_TMPFILE_DENTRY, 1, [i_op->tmpfile() uses old dentry signature])
],[
ZFS_LINUX_REQUIRE_API([i_op->tmpfile()], [3.11])
])
])
])
])
+28 -1
@@ -100,6 +100,19 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_XATTR_HANDLER_GET], [
.get = get,
};
],[])
ZFS_LINUX_TEST_SRC([xattr_handler_get_dentry_inode_flags], [
#include <linux/xattr.h>
int get(const struct xattr_handler *handler,
struct dentry *dentry, struct inode *inode,
const char *name, void *buffer,
size_t size, int flags) { return 0; }
static const struct xattr_handler
xops __attribute__ ((unused)) = {
.get = get,
};
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_XATTR_HANDLER_GET], [
@@ -142,7 +155,21 @@ AC_DEFUN([ZFS_AC_KERNEL_XATTR_HANDLER_GET], [
AC_DEFINE(HAVE_XATTR_GET_DENTRY, 1,
[xattr_handler->get() wants dentry])
],[
ZFS_LINUX_TEST_ERROR([xattr get()])
dnl #
dnl # Android API change,
dnl # The xattr_handler->get() callback was
dnl # changed to take dentry, inode and flags.
dnl #
AC_MSG_RESULT(no)
AC_MSG_CHECKING(
[whether xattr_handler->get() wants dentry and inode and flags])
ZFS_LINUX_TEST_RESULT([xattr_handler_get_dentry_inode_flags], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_XATTR_GET_DENTRY_INODE_FLAGS, 1,
[xattr_handler->get() wants dentry and inode and flags])
],[
ZFS_LINUX_TEST_ERROR([xattr get()])
])
])
])
])
+48 -9
@@ -93,6 +93,7 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_SRC], [
ZFS_AC_KERNEL_SRC_SETATTR_PREPARE
ZFS_AC_KERNEL_SRC_INSERT_INODE_LOCKED
ZFS_AC_KERNEL_SRC_DENTRY
ZFS_AC_KERNEL_SRC_DENTRY_ALIAS_D_U
ZFS_AC_KERNEL_SRC_TRUNCATE_SETSIZE
ZFS_AC_KERNEL_SRC_SECURITY_INODE
ZFS_AC_KERNEL_SRC_FST_MOUNT
@@ -119,7 +120,6 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_SRC], [
ZFS_AC_KERNEL_SRC_FMODE_T
ZFS_AC_KERNEL_SRC_KUIDGID_T
ZFS_AC_KERNEL_SRC_KUID_HELPERS
ZFS_AC_KERNEL_SRC_MODULE_PARAM_CALL_CONST
ZFS_AC_KERNEL_SRC_RENAME
ZFS_AC_KERNEL_SRC_CURRENT_TIME
ZFS_AC_KERNEL_SRC_USERNS_CAPABILITIES
@@ -210,6 +210,7 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_RESULT], [
ZFS_AC_KERNEL_SETATTR_PREPARE
ZFS_AC_KERNEL_INSERT_INODE_LOCKED
ZFS_AC_KERNEL_DENTRY
ZFS_AC_KERNEL_DENTRY_ALIAS_D_U
ZFS_AC_KERNEL_TRUNCATE_SETSIZE
ZFS_AC_KERNEL_SECURITY_INODE
ZFS_AC_KERNEL_FST_MOUNT
@@ -236,7 +237,6 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_RESULT], [
ZFS_AC_KERNEL_FMODE_T
ZFS_AC_KERNEL_KUIDGID_T
ZFS_AC_KERNEL_KUID_HELPERS
ZFS_AC_KERNEL_MODULE_PARAM_CALL_CONST
ZFS_AC_KERNEL_RENAME
ZFS_AC_KERNEL_CURRENT_TIME
ZFS_AC_KERNEL_USERNS_CAPABILITIES
@@ -404,11 +404,11 @@ AC_DEFUN([ZFS_AC_KERNEL], [
utsrelease1=$kernelbuild/include/linux/version.h
utsrelease2=$kernelbuild/include/linux/utsrelease.h
utsrelease3=$kernelbuild/include/generated/utsrelease.h
AS_IF([test -r $utsrelease1 && fgrep -q UTS_RELEASE $utsrelease1], [
AS_IF([test -r $utsrelease1 && grep -qF UTS_RELEASE $utsrelease1], [
utsrelease=$utsrelease1
], [test -r $utsrelease2 && fgrep -q UTS_RELEASE $utsrelease2], [
], [test -r $utsrelease2 && grep -qF UTS_RELEASE $utsrelease2], [
utsrelease=$utsrelease2
], [test -r $utsrelease3 && fgrep -q UTS_RELEASE $utsrelease3], [
], [test -r $utsrelease3 && grep -qF UTS_RELEASE $utsrelease3], [
utsrelease=$utsrelease3
])
@@ -934,8 +934,47 @@ dnl # like ZFS_LINUX_TRY_COMPILE, except the contents conftest.h are
dnl # provided via the fifth parameter
dnl #
AC_DEFUN([ZFS_LINUX_TRY_COMPILE_HEADER], [
ZFS_LINUX_COMPILE_IFELSE(
[ZFS_LINUX_TEST_PROGRAM([[$1]], [[$2]], [[ZFS_META_LICENSE]])],
[test -f build/conftest/conftest.ko],
[$3], [$4], [$5])
AS_IF([test "x$enable_linux_builtin" = "xyes"], [
ZFS_LINUX_COMPILE_IFELSE(
[ZFS_LINUX_TEST_PROGRAM([[$1]], [[$2]],
[[ZFS_META_LICENSE]])],
[test -f build/conftest/conftest.o], [$3], [$4], [$5])
], [
ZFS_LINUX_COMPILE_IFELSE(
[ZFS_LINUX_TEST_PROGRAM([[$1]], [[$2]],
[[ZFS_META_LICENSE]])],
[test -f build/conftest/conftest.ko], [$3], [$4], [$5])
])
])
dnl #
dnl # AS_VERSION_COMPARE_LE
dnl # like AS_VERSION_COMPARE, but runs $3 if (and only if) $1 <= $2
dnl # AS_VERSION_COMPARE_LE (version-1, version-2, [action-if-less-or-equal], [action-if-greater])
dnl #
AC_DEFUN([AS_VERSION_COMPARE_LE], [
AS_VERSION_COMPARE([$1], [$2], [$3], [$3], [$4])
])
dnl #
dnl # ZFS_LINUX_REQUIRE_API
dnl # like ZFS_LINUX_TEST_ERROR, except only fails if the kernel is
dnl # at least some specified version.
dnl #
AC_DEFUN([ZFS_LINUX_REQUIRE_API], [
AS_VERSION_COMPARE_LE([$2], [$kernsrcver], [
AC_MSG_ERROR([
*** None of the expected "$1" interfaces were detected. This
*** interface is expected for kernels version "$2" and above.
*** This may be because your kernel version is newer than what is
*** supported, or you are using a patched custom kernel with
*** incompatible modifications. Newer kernels may have incompatible
*** APIs.
***
*** ZFS Version: $ZFS_META_ALIAS
*** Compatible Kernels: $ZFS_META_KVER_MIN - $ZFS_META_KVER_MAX
])
], [
AC_MSG_RESULT(no)
])
])
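ZFS_LINUX_REQUIRE_API, as described above, aborts only when the kernel is at least the version in which the interface is expected; on older kernels it just reports "no". A minimal sketch of that decision follows, under stated assumptions: `require_api` and `version_tuple` are hypothetical names, and versions are treated as plain dotted integers, which is simpler than what AS_VERSION_COMPARE accepts.

```python
def version_tuple(version):
    """Parse a plain dotted version such as "5.15.0" (assumed format)."""
    return tuple(int(part) for part in version.split("."))


def require_api(name, min_version, kernel_version):
    """Fail if `name` is missing on a kernel that should provide it.

    Called only after every probe for `name` has failed. Raises when
    min_version <= kernel_version, otherwise returns "no".
    """
    if version_tuple(min_version) <= version_tuple(kernel_version):
        raise RuntimeError(
            'None of the expected "%s" interfaces were detected; '
            'expected for kernels %s and above' % (name, min_version))
    return "no"


# A 3.10 kernel predates i_op->tmpfile() (3.11), so this is not an error.
print(require_api("i_op->tmpfile()", "3.11", "3.10.0"))
```

The effect is that an old kernel lacking an optional interface still configures, while a new or patched kernel with an unrecognised signature fails early at configure time.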
+4 -3
@@ -173,7 +173,7 @@ AC_DEFUN([ZFS_AC_DEBUG_KMEM_TRACKING], [
])
AC_DEFUN([ZFS_AC_DEBUG_INVARIANTS_DETECT_FREEBSD], [
AS_IF([sysctl -n kern.conftxt | fgrep -qx $'options\tINVARIANTS'],
AS_IF([sysctl -n kern.conftxt | grep -Fqx $'options\tINVARIANTS'],
[enable_invariants="yes"],
[enable_invariants="no"])
])
@@ -209,8 +209,8 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS], [
AX_COUNT_CPUS([])
AC_SUBST(CPU_COUNT)
ZFS_AC_CONFIG_ALWAYS_CC_NO_UNUSED_BUT_SET_VARIABLE
ZFS_AC_CONFIG_ALWAYS_CC_NO_BOOL_COMPARE
ZFS_AC_CONFIG_ALWAYS_CC_NO_CLOBBERED
ZFS_AC_CONFIG_ALWAYS_CC_INFINITE_RECURSION
ZFS_AC_CONFIG_ALWAYS_CC_IMPLICIT_FALLTHROUGH
ZFS_AC_CONFIG_ALWAYS_CC_FRAME_LARGER_THAN
ZFS_AC_CONFIG_ALWAYS_CC_NO_FORMAT_TRUNCATION
@@ -226,6 +226,7 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS], [
ZFS_AC_CONFIG_ALWAYS_SED
ZFS_AC_CONFIG_ALWAYS_CPPCHECK
ZFS_AC_CONFIG_ALWAYS_SHELLCHECK
ZFS_AC_CONFIG_ALWAYS_PARALLEL
])
AC_DEFUN([ZFS_AC_CONFIG], [
@@ -57,6 +57,12 @@ array_contains () {
}
check() {
# https://github.com/dracutdevs/dracut/pull/1711 provides a zfs_devs
# function to detect the physical devices backing zfs pools. If this
# function exists in the version of dracut this module is being called
# from, then it does not need to run.
type zfs_devs >/dev/null 2>&1 && return 1
local mp
local dev
local blockdevs
+10
@@ -89,6 +89,16 @@ install() {
"zfs-import-cache.service"; do
inst_simple "${systemdsystemunitdir}/${_service}"
systemctl -q --root "${initdir}" add-wants zfs-import.target "${_service}"
# Add user-provided unit overrides
# - /etc/systemd/system/zfs-import-{scan,cache}.service
# - /etc/systemd/system/zfs-import-{scan,cache}.service.d/overrides.conf
# -H ensures they are marked host-only
# -o ensures there is no error upon absence of these files
inst_multiple -o -H \
"${systemdsystemconfdir}/${_service}" \
"${systemdsystemconfdir}/${_service}.d/"*.conf
done
for _service in \
+1 -1
@@ -82,7 +82,7 @@ ZFS_DATASET="${ZFS_DATASET:-${root}}"
ZFS_POOL="${ZFS_DATASET%%/*}"
if ! zpool get -Ho name "${ZFS_POOL}" > /dev/null 2>&1; then
if ! zpool get -Ho value name "${ZFS_POOL}" > /dev/null 2>&1; then
info "ZFS: Importing pool ${ZFS_POOL}..."
# shellcheck disable=SC2086
if ! zpool import -N ${ZPOOL_IMPORT_OPTS} "${ZFS_POOL}"; then
@@ -8,5 +8,5 @@ ConditionKernelCommandLine=bootfs.rollback
[Service]
Type=oneshot
ExecStart=/bin/sh -c '. /lib/dracut-zfs-lib.sh; decode_root_args || exit; [ "$root" = "zfs:AUTO" ] && root="$BOOTFS" SNAPNAME="$(getarg bootfs.rollback)"; exec @sbindir@/zfs rollback -Rf "$root@${SNAPNAME:-%v}"'
ExecStart=/bin/sh -c '. /lib/dracut-zfs-lib.sh; decode_root_args || exit; [ "$root" = "zfs:AUTO" ] && root="$BOOTFS"; SNAPNAME="$(getarg bootfs.rollback)"; exec @sbindir@/zfs rollback -Rf "$root@${SNAPNAME:-%v}"'
RemainAfterExit=yes
@@ -8,5 +8,5 @@ ConditionKernelCommandLine=bootfs.snapshot
[Service]
Type=oneshot
ExecStart=/bin/sh -c '. /lib/dracut-zfs-lib.sh; decode_root_args || exit; [ "$root" = "zfs:AUTO" ] && root="$BOOTFS" SNAPNAME="$(getarg bootfs.snapshot)"; exec @sbindir@/zfs snapshot "$root@${SNAPNAME:-%v}"'
ExecStart=-/bin/sh -c '. /lib/dracut-zfs-lib.sh; decode_root_args || exit; [ "$root" = "zfs:AUTO" ] && root="$BOOTFS"; SNAPNAME="$(getarg bootfs.snapshot)"; exec @sbindir@/zfs snapshot "$root@${SNAPNAME:-%v}"'
RemainAfterExit=yes
+12 -16
@@ -326,30 +326,26 @@ mount_fs()
# Need the _original_ datasets mountpoint!
mountpoint=$(get_fs_value "$fs" mountpoint)
ZFS_CMD="mount -o zfsutil -t zfs"
ZFS_CMD="mount.zfs -o zfsutil"
if [ "$mountpoint" = "legacy" ] || [ "$mountpoint" = "none" ]; then
# Can't use the mountpoint property. Might be one of our
# clones. Check the 'org.zol:mountpoint' property set in
# clone_snap() if that's usable.
mountpoint=$(get_fs_value "$fs" org.zol:mountpoint)
if [ "$mountpoint" = "legacy" ] ||
[ "$mountpoint" = "none" ] ||
[ "$mountpoint" = "-" ]
mountpoint1=$(get_fs_value "$fs" org.zol:mountpoint)
if [ "$mountpoint1" = "legacy" ] ||
[ "$mountpoint1" = "none" ] ||
[ "$mountpoint1" = "-" ]
then
if [ "$fs" != "${ZFS_BOOTFS}" ]; then
# We don't have a proper mountpoint and this
# isn't the root fs.
return 0
else
# Last hail-mary: Hope 'rootmnt' is set!
mountpoint=""
fi
fi
# If it's not a legacy filesystem, it can only be a
# native one...
if [ "$mountpoint" = "legacy" ]; then
ZFS_CMD="mount -t zfs"
ZFS_CMD="mount.zfs"
# Last hail-mary: Hope 'rootmnt' is set!
mountpoint=""
else
mountpoint="$mountpoint1"
fi
fi
@@ -503,7 +499,7 @@ clone_snap()
echo "Error: $ZFS_ERROR"
echo ""
echo "Failed to clone snapshot."
echo "Make sure that the any problems are corrected and then make sure"
echo "Make sure that any problems are corrected and then make sure"
echo "that the dataset '$destfs' exists and is bootable."
shell
else
@@ -915,7 +911,7 @@ mountroot()
echo " not specified on the kernel command line."
echo ""
echo "Manually mount the root filesystem on $rootmnt and then exit."
echo "Hint: Try: mount -o zfsutil -t zfs ${ZFS_RPOOL-rpool}/ROOT/system $rootmnt"
echo "Hint: Try: mount.zfs -o zfsutil ${ZFS_RPOOL-rpool}/ROOT/system $rootmnt"
shell
fi
+11 -3
@@ -496,7 +496,6 @@ zfs_key_config_get_dataset(zfs_key_config_t *config)
if (zhp == NULL) {
pam_syslog(NULL, LOG_ERR, "dataset %s not found",
config->homes_prefix);
zfs_close(zhp);
return (NULL);
}
@@ -508,6 +507,10 @@ zfs_key_config_get_dataset(zfs_key_config_t *config)
return (dsname);
}
if (config->homes_prefix == NULL) {
return (NULL);
}
size_t len = ZFS_MAX_DATASET_NAME_LEN;
size_t total_len = strlen(config->homes_prefix) + 1
+ strlen(config->username);
@@ -711,7 +714,10 @@ pam_sm_open_session(pam_handle_t *pamh, int flags,
return (PAM_SUCCESS);
}
zfs_key_config_t config;
zfs_key_config_load(pamh, &config, argc, argv);
if (zfs_key_config_load(pamh, &config, argc, argv) != 0) {
return (PAM_SESSION_ERR);
}
if (config.uid < 1000) {
zfs_key_config_free(&config);
return (PAM_SUCCESS);
@@ -765,7 +771,9 @@ pam_sm_close_session(pam_handle_t *pamh, int flags,
return (PAM_SUCCESS);
}
zfs_key_config_t config;
zfs_key_config_load(pamh, &config, argc, argv);
if (zfs_key_config_load(pamh, &config, argc, argv) != 0) {
return (PAM_SESSION_ERR);
}
if (config.uid < 1000) {
zfs_key_config_free(&config);
return (PAM_SUCCESS);
+1 -1
@@ -104,7 +104,7 @@ zfs_errno = enum_with_offset(1024, [
)
# compat before we used the enum helper for these values
ZFS_ERR_CHECKPOINT_EXISTS = zfs_errno.ZFS_ERR_CHECKPOINT_EXISTS
assert(ZFS_ERR_CHECKPOINT_EXISTS == 1024)
assert (ZFS_ERR_CHECKPOINT_EXISTS == 1024)
ZFS_ERR_DISCARDING_CHECKPOINT = zfs_errno.ZFS_ERR_DISCARDING_CHECKPOINT
ZFS_ERR_NO_CHECKPOINT = zfs_errno.ZFS_ERR_NO_CHECKPOINT
ZFS_ERR_DEVRM_IN_PROGRESS = zfs_errno.ZFS_ERR_DEVRM_IN_PROGRESS
+1
@@ -22,3 +22,4 @@ SUBSTFILES += $(systemdpreset_DATA) $(systemdunit_DATA)
install-data-hook:
$(MKDIR_P) "$(DESTDIR)$(systemdunitdir)"
ln -sf /dev/null "$(DESTDIR)$(systemdunitdir)/zfs-import.service"
ln -sf /dev/null "$(DESTDIR)$(systemdunitdir)/zfs-load-key.service"
@@ -5,7 +5,7 @@ DefaultDependencies=no
Requires=systemd-udev-settle.service
After=systemd-udev-settle.service
After=cryptsetup.target
After=multipathd.target
After=multipathd.service
After=systemd-remount-fs.service
Before=zfs-import.target
ConditionFileNotEmpty=@sysconfdir@/zfs/zpool.cache
@@ -5,7 +5,7 @@ DefaultDependencies=no
Requires=systemd-udev-settle.service
After=systemd-udev-settle.service
After=cryptsetup.target
After=multipathd.target
After=multipathd.service
Before=zfs-import.target
ConditionFileNotEmpty=!@sysconfdir@/zfs/zpool.cache
ConditionPathIsDirectory=/sys/module/zfs
+1 -1
@@ -5,7 +5,7 @@ ConditionPathIsDirectory=/sys/module/zfs
[Service]
ExecStart=@sbindir@/zed -F
Restart=on-abort
Restart=always
[Install]
Alias=zed.service
+6 -5
@@ -150,6 +150,7 @@ typedef enum zfs_error {
EZFS_NO_RESILVER_DEFER, /* pool doesn't support resilver_defer */
EZFS_EXPORT_IN_PROGRESS, /* currently exporting the pool */
EZFS_REBUILDING, /* resilvering (sequential reconstruction) */
EZFS_CKSUM, /* insufficient replicas */
EZFS_UNKNOWN
} zfs_error_t;
@@ -257,10 +258,10 @@ extern int zpool_add(zpool_handle_t *, nvlist_t *);
typedef struct splitflags {
/* do not split, but return the config that would be split off */
int dryrun : 1;
unsigned int dryrun : 1;
/* after splitting, import the pool */
int import : 1;
unsigned int import : 1;
int name_flags;
} splitflags_t;
@@ -649,13 +650,13 @@ extern int zfs_rollback(zfs_handle_t *, zfs_handle_t *, boolean_t);
typedef struct renameflags {
/* recursive rename */
int recursive : 1;
unsigned int recursive : 1;
/* don't unmount file systems */
int nounmount : 1;
unsigned int nounmount : 1;
/* force unmount file systems */
int forceunmount : 1;
unsigned int forceunmount : 1;
} renameflags_t;
extern int zfs_rename(zfs_handle_t *, const char *, renameflags_t);
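The two declarations of each one-bit flag above differ only in signedness. Whether a plain `int` bit-field is signed is implementation-defined in C, and a signed one-bit field can represent only 0 and -1, so a check such as `flags.recursive == 1` may never be true. A minimal sketch of the difference follows; the struct and function names are hypothetical and are not part of libzfs, and `signed int` is spelled out to make the signed case explicit.

```c
#include <assert.h>

/* Hypothetical stand-ins for the flag structs shown in the diff. */
struct signed_flags {
	signed int recursive : 1;	/* representable values: 0 and -1 */
};

struct unsigned_flags {
	unsigned int recursive : 1;	/* representable values: 0 and 1 */
};

/*
 * Store v in a signed one-bit field and read it back. Storing 1 is out of
 * range, so the result is implementation-defined (commonly -1).
 */
static int
signed_flag_readback(int v)
{
	struct signed_flags f = { 0 };

	f.recursive = v;
	return (f.recursive);
}

/* Store v (0 or 1) in an unsigned one-bit field and read it back. */
static int
unsigned_flag_readback(int v)
{
	struct unsigned_flags f = { 0 };

	assert(v == 0 || v == 1);
	f.recursive = (unsigned int)v;
	return ((int)f.recursive);
}
```

With the unsigned field, `unsigned_flag_readback(1)` returns 1. With the signed field, the value read back cannot be 1, which is the kind of surprise the `unsigned int` spelling avoids.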
+4 -2
@@ -151,13 +151,15 @@ int zfs_ioctl_fd(int fd, unsigned long request, struct zfs_cmd *zc);
* List of colors to use
*/
#define ANSI_RED "\033[0;31m"
#define ANSI_GREEN "\033[0;32m"
#define ANSI_YELLOW "\033[0;33m"
#define ANSI_BLUE "\033[0;34m"
#define ANSI_RESET "\033[0m"
#define ANSI_BOLD "\033[1m"
void color_start(char *color);
void color_start(const char *color);
void color_end(void);
int printf_color(char *color, char *format, ...);
int printf_color(const char *color, char *format, ...);
/*
* These functions are used by the ZFS libraries and cmd/zpool code, but are
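The `const char *color` variants of the prototypes above let callers pass the `ANSI_*` string-literal macros without discarding qualifiers. The following is a minimal sketch of how such a wrapper could be written, not the library's actual implementation: it always emits the escape codes (the real code may first check whether color output is enabled), and it also declares `format` as `const`, which the diff does not.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

#define	ANSI_GREEN	"\033[0;32m"
#define	ANSI_RESET	"\033[0m"

/* Emit the escape sequence as-is; const allows string literals. */
static void
color_start(const char *color)
{
	fputs(color, stdout);
}

/* Restore the terminal's default attributes. */
static void
color_end(void)
{
	fputs(ANSI_RESET, stdout);
}

/* printf-style wrapper; returns whatever vprintf() returns. */
static int
printf_color(const char *color, const char *format, ...)
{
	va_list ap;
	int rv;

	assert(color != NULL && format != NULL);
	color_start(color);
	va_start(ap, format);
	rv = vprintf(format, ap);
	va_end(ap);
	color_end();
	return (rv);
}
```

A call such as `printf_color(ANSI_GREEN, "%s", "ok")` prints the text between the color and reset sequences and returns the number of characters `vprintf()` wrote, not counting the escape codes.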
-1
@@ -83,7 +83,6 @@
#define __printf(a, b) __printflike(a, b)
#define barrier() __asm__ __volatile__("": : :"memory")
#define smp_rmb() rmb()
#define ___PASTE(a, b) a##b
#define __PASTE(a, b) ___PASTE(a, b)
+2 -1
@@ -57,7 +57,8 @@ extern uint64_t atomic_cas_64(volatile uint64_t *target, uint64_t cmp,
uint64_t newval);
#endif
#define membar_producer atomic_thread_fence_rel
#define membar_consumer() atomic_thread_fence_acq()
#define membar_producer() atomic_thread_fence_rel()
static __inline uint32_t
atomic_add_32_nv(volatile uint32_t *target, int32_t delta)
+1 -1
@@ -52,7 +52,7 @@
#define ZFS_MODULE_PARAM_CALL_IMPL(parent, name, perm, args, desc) \
SYSCTL_DECL(parent); \
SYSCTL_PROC(parent, OID_AUTO, name, perm | args, desc)
SYSCTL_PROC(parent, OID_AUTO, name, CTLFLAG_MPSAFE | perm | args, desc)
#define ZFS_MODULE_PARAM_CALL(scope_prefix, name_prefix, name, func, _, perm, desc) \
ZFS_MODULE_PARAM_CALL_IMPL(_vfs_ ## scope_prefix, name, perm, func ## _args(name_prefix ## name), desc)
+2
@@ -95,9 +95,11 @@ vn_flush_cached_data(vnode_t *vp, boolean_t sync)
if (vp->v_object->flags & OBJ_MIGHTBEDIRTY) {
#endif
int flags = sync ? OBJPC_SYNC : 0;
vn_lock(vp, LK_SHARED | LK_RETRY);
zfs_vmobject_wlock(vp->v_object);
vm_object_page_clean(vp->v_object, 0, 0, flags);
zfs_vmobject_wunlock(vp->v_object);
VOP_UNLOCK(vp);
}
}
@@ -357,7 +357,11 @@ vdev_lookup_bdev(const char *path, dev_t *dev)
static inline void
bio_set_op_attrs(struct bio *bio, unsigned rw, unsigned flags)
{
#if defined(HAVE_BIO_BI_OPF)
bio->bi_opf = rw | flags;
#else
bio->bi_rw |= rw | flags;
#endif /* HAVE_BIO_BI_OPF */
}
#endif
@@ -35,6 +35,10 @@
#define d_make_root(inode) d_alloc_root(inode)
#endif /* HAVE_D_MAKE_ROOT */
#ifdef HAVE_DENTRY_D_U_ALIASES
#define d_alias d_u.d_alias
#endif
/*
* 2.6.30 API change,
* The const keyword was added to the 'struct dentry_operations' in
@@ -61,4 +65,21 @@ d_clear_d_op(struct dentry *dentry)
DCACHE_OP_REVALIDATE | DCACHE_OP_DELETE);
}
/*
* Walk and invalidate all dentry aliases of an inode
* unless it's a mountpoint
*/
static inline void
zpl_d_drop_aliases(struct inode *inode)
{
struct dentry *dentry;
spin_lock(&inode->i_lock);
hlist_for_each_entry(dentry, &inode->i_dentry, d_alias) {
if (!IS_ROOT(dentry) && !d_mountpoint(dentry) &&
(dentry->d_inode == inode)) {
d_drop(dentry);
}
}
spin_unlock(&inode->i_lock);
}
#endif /* _ZFS_DCACHE_H */
+9 -5
@@ -30,12 +30,16 @@
#include <linux/module.h>
#include <linux/moduleparam.h>
/* Grsecurity kernel API change */
#ifdef MODULE_PARAM_CALL_CONST
/*
* Despite constifying struct kernel_param_ops, some older kernels define a
* `__check_old_set_param()` function in their headers that checks for a
* non-constified `->set()`. This has long been fixed in Linux mainline, but
* since we support older kernels, we work around it by using a preprocessor
* definition to disable it.
*/
#define __check_old_set_param(_) (0)
typedef const struct kernel_param zfs_kernel_param_t;
#else
typedef struct kernel_param zfs_kernel_param_t;
#endif
#define ZMOD_RW 0644
#define ZMOD_RD 0444
@@ -115,6 +115,20 @@ fn(struct dentry *dentry, const char *name, void *buffer, size_t size, \
{ \
return (__ ## fn(dentry->d_inode, name, buffer, size)); \
}
/*
* Android API change,
* The xattr_handler->get() callback was changed to take a dentry and inode
* and flags, because the dentry might not be attached to an inode yet.
*/
#elif defined(HAVE_XATTR_GET_DENTRY_INODE_FLAGS)
#define ZPL_XATTR_GET_WRAPPER(fn) \
static int \
fn(const struct xattr_handler *handler, struct dentry *dentry, \
struct inode *inode, const char *name, void *buffer, \
size_t size, int flags) \
{ \
return (__ ## fn(inode, name, buffer, size)); \
}
#else
#error "Unsupported kernel"
#endif
+4
@@ -64,7 +64,11 @@
* }
*/
#ifdef HAVE_REGISTER_SHRINKER_VARARG
#define spl_register_shrinker(x) register_shrinker(x, "zfs-arc-shrinker")
#else
#define spl_register_shrinker(x) register_shrinker(x)
#endif
#define spl_unregister_shrinker(x) unregister_shrinker(x)
/*
+2
@@ -44,7 +44,9 @@
#define zfs_totalhigh_pages totalhigh_pages
#endif
#define membar_consumer() smp_rmb()
#define membar_producer() smp_wmb()
#define physmem zfs_totalram_pages
#define xcopyin(from, to, size) copy_from_user(to, from, size)
+2 -4
@@ -62,7 +62,6 @@ DECLARE_EVENT_CLASS(zfs_ace_class,
__field(boolean_t, z_is_sa)
__field(boolean_t, z_is_mapped)
__field(boolean_t, z_is_ctldir)
__field(boolean_t, z_is_stale)
__field(uint32_t, i_uid)
__field(uint32_t, i_gid)
@@ -95,7 +94,6 @@ DECLARE_EVENT_CLASS(zfs_ace_class,
__entry->z_is_sa = zn->z_is_sa;
__entry->z_is_mapped = zn->z_is_mapped;
__entry->z_is_ctldir = zn->z_is_ctldir;
__entry->z_is_stale = zn->z_is_stale;
__entry->i_uid = KUID_TO_SUID(ZTOI(zn)->i_uid);
__entry->i_gid = KGID_TO_SGID(ZTOI(zn)->i_gid);
@@ -117,7 +115,7 @@ DECLARE_EVENT_CLASS(zfs_ace_class,
"zn_prefetch %u blksz %u seq %u "
"mapcnt %llu size %llu pflags %llu "
"sync_cnt %u mode 0x%x is_sa %d "
"is_mapped %d is_ctldir %d is_stale %d inode { "
"is_mapped %d is_ctldir %d inode { "
"uid %u gid %u ino %lu nlink %u size %lli "
"blkbits %u bytes %u mode 0x%x generation %x } } "
"ace { type %u flags %u access_mask %u } mask_matched %u",
@@ -126,7 +124,7 @@ DECLARE_EVENT_CLASS(zfs_ace_class,
__entry->z_seq, __entry->z_mapcnt, __entry->z_size,
__entry->z_pflags, __entry->z_sync_cnt, __entry->z_mode,
__entry->z_is_sa, __entry->z_is_mapped,
__entry->z_is_ctldir, __entry->z_is_stale, __entry->i_uid,
__entry->z_is_ctldir, __entry->i_uid,
__entry->i_gid, __entry->i_ino, __entry->i_nlink,
__entry->i_size, __entry->i_blkbits,
__entry->i_bytes, __entry->i_mode, __entry->i_generation,
+6 -2
@@ -45,7 +45,8 @@ extern const struct inode_operations zpl_inode_operations;
extern const struct inode_operations zpl_dir_inode_operations;
extern const struct inode_operations zpl_symlink_inode_operations;
extern const struct inode_operations zpl_special_inode_operations;
extern dentry_operations_t zpl_dentry_operations;
/* zpl_file.c */
extern const struct address_space_operations zpl_address_space_operations;
extern const struct file_operations zpl_file_operations;
extern const struct file_operations zpl_dir_file_operations;
@@ -66,11 +67,14 @@ extern int zpl_xattr_security_init(struct inode *ip, struct inode *dip,
#if defined(HAVE_SET_ACL_USERNS)
extern int zpl_set_acl(struct user_namespace *userns, struct inode *ip,
struct posix_acl *acl, int type);
#elif defined(HAVE_SET_ACL_USERNS_DENTRY_ARG2)
extern int zpl_set_acl(struct user_namespace *userns, struct dentry *dentry,
struct posix_acl *acl, int type);
#else
extern int zpl_set_acl(struct inode *ip, struct posix_acl *acl, int type);
#endif /* HAVE_SET_ACL_USERNS */
#endif /* HAVE_SET_ACL */
#if defined(HAVE_GET_ACL_RCU)
#if defined(HAVE_GET_ACL_RCU) || defined(HAVE_GET_INODE_ACL)
extern struct posix_acl *zpl_get_acl(struct inode *ip, int type, bool rcu);
#elif defined(HAVE_GET_ACL)
extern struct posix_acl *zpl_get_acl(struct inode *ip, int type);
+1
@@ -91,6 +91,7 @@ abd_t *abd_alloc_linear(size_t, boolean_t);
abd_t *abd_alloc_gang(void);
abd_t *abd_alloc_for_io(size_t, boolean_t);
abd_t *abd_alloc_sametype(abd_t *, size_t);
boolean_t abd_size_alloc_linear(size_t);
void abd_gang_add(abd_t *, abd_t *, boolean_t);
void abd_free(abd_t *);
abd_t *abd_get_offset(abd_t *, size_t);
-1
@@ -68,7 +68,6 @@ abd_t *abd_get_offset_scatter(abd_t *, abd_t *, size_t, size_t);
void abd_free_struct_impl(abd_t *);
void abd_alloc_chunks(abd_t *, size_t);
void abd_free_chunks(abd_t *);
boolean_t abd_size_alloc_linear(size_t);
void abd_update_scatter_stats(abd_t *, abd_stats_op_t);
void abd_update_linear_stats(abd_t *, abd_stats_op_t);
void abd_verify_scatter(abd_t *);
+1
@@ -85,6 +85,7 @@ typedef void arc_prune_func_t(int64_t bytes, void *priv);
/* Shared module parameters */
extern int zfs_arc_average_blocksize;
extern int l2arc_exclude_special;
/* generic arc_done_func_t's which you can use */
arc_read_done_func_t arc_bcopy_func;
+7 -7
@@ -30,22 +30,22 @@ typedef struct bqueue {
kmutex_t bq_lock;
kcondvar_t bq_add_cv;
kcondvar_t bq_pop_cv;
uint64_t bq_size;
uint64_t bq_maxsize;
uint64_t bq_fill_fraction;
size_t bq_size;
size_t bq_maxsize;
uint_t bq_fill_fraction;
size_t bq_node_offset;
} bqueue_t;
typedef struct bqueue_node {
list_node_t bqn_node;
uint64_t bqn_size;
size_t bqn_size;
} bqueue_node_t;
int bqueue_init(bqueue_t *, uint64_t, uint64_t, size_t);
int bqueue_init(bqueue_t *, uint_t, size_t, size_t);
void bqueue_destroy(bqueue_t *);
void bqueue_enqueue(bqueue_t *, void *, uint64_t);
void bqueue_enqueue_flush(bqueue_t *, void *, uint64_t);
void bqueue_enqueue(bqueue_t *, void *, size_t);
void bqueue_enqueue_flush(bqueue_t *, void *, size_t);
void *bqueue_dequeue(bqueue_t *);
boolean_t bqueue_empty(bqueue_t *);
+10 -2
@@ -72,7 +72,11 @@ extern kmem_cache_t *zfs_btree_leaf_cache;
typedef struct zfs_btree_hdr {
struct zfs_btree_core *bth_parent;
boolean_t bth_core;
/*
* Set to -1 to indicate core nodes. Other values represent first
* valid element offset for leaf nodes.
*/
uint32_t bth_first;
/*
* For both leaf and core nodes, represents the number of elements in
* the node. For core nodes, they will have bth_count + 1 children.
@@ -91,9 +95,12 @@ typedef struct zfs_btree_leaf {
uint8_t btl_elems[];
} zfs_btree_leaf_t;
#define BTREE_LEAF_ESIZE (BTREE_LEAF_SIZE - \
offsetof(zfs_btree_leaf_t, btl_elems))
typedef struct zfs_btree_index {
zfs_btree_hdr_t *bti_node;
uint64_t bti_offset;
uint32_t bti_offset;
/*
* True if the location is before the list offset, false if it's at
* the listed offset.
@@ -105,6 +112,7 @@ typedef struct btree {
zfs_btree_hdr_t *bt_root;
int64_t bt_height;
size_t bt_elem_size;
uint32_t bt_leaf_cap;
uint64_t bt_num_elems;
uint64_t bt_num_nodes;
zfs_btree_leaf_t *bt_bulk; // non-null if bulk loading
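Two ideas in the header above can be illustrated in isolation: using the all-ones value of a `uint32_t` ("-1") as a marker for core nodes, and using `offsetof()` on a flexible array member to compute how many bytes of a fixed-size leaf remain for elements. The sketch below uses hypothetical names and an assumed 4096-byte leaf size; it does not reproduce the real `zfs_btree` layout or its capacity calculation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	LEAF_SIZE	4096	/* assumed node size for this sketch */

typedef struct hdr {
	struct hdr *h_parent;
	uint32_t h_first;	/* (uint32_t)-1 marks a core node */
	uint32_t h_count;
} hdr_t;

typedef struct leaf {
	hdr_t l_hdr;
	uint8_t l_elems[];	/* flexible array member: element storage */
} leaf_t;

/* Bytes left for elements once the header is accounted for. */
#define	LEAF_ESIZE	(LEAF_SIZE - offsetof(leaf_t, l_elems))

/* A node is a core node when its first-element offset is all ones. */
static int
hdr_is_core(const hdr_t *h)
{
	return (h->h_first == (uint32_t)-1);
}

/* Number of elements of elem_size bytes that fit in one leaf. */
static uint32_t
leaf_capacity(size_t elem_size)
{
	assert(elem_size > 0 && elem_size <= LEAF_ESIZE);
	return ((uint32_t)(LEAF_ESIZE / elem_size));
}
```

Because the element array has byte alignment, it starts directly after the header, so the capacity for 16-byte elements is `(LEAF_SIZE - sizeof (hdr_t)) / 16`.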
-3
@@ -32,9 +32,6 @@ int aes_mod_fini(void);
int edonr_mod_init(void);
int edonr_mod_fini(void);
int sha1_mod_init(void);
int sha1_mod_fini(void);
int sha2_mod_init(void);
int sha2_mod_fini(void);
+6 -14
@@ -321,15 +321,16 @@ typedef struct dmu_buf_impl {
uint8_t db_dirtycnt;
} dmu_buf_impl_t;
#define DBUF_RWLOCKS 8192
#define DBUF_HASH_RWLOCK(h, idx) (&(h)->hash_rwlocks[(idx) & (DBUF_RWLOCKS-1)])
/* Note: the dbuf hash table is exposed only for the mdb module */
#define DBUF_MUTEXES 2048
#define DBUF_HASH_MUTEX(h, idx) (&(h)->hash_mutexes[(idx) & (DBUF_MUTEXES-1)])
typedef struct dbuf_hash_table {
uint64_t hash_table_mask;
dmu_buf_impl_t **hash_table;
krwlock_t hash_rwlocks[DBUF_RWLOCKS] ____cacheline_aligned;
kmutex_t hash_mutexes[DBUF_MUTEXES] ____cacheline_aligned;
} dbuf_hash_table_t;
typedef void (*dbuf_prefetch_fn)(void *, boolean_t);
typedef void (*dbuf_prefetch_fn)(void *, uint64_t, uint64_t, boolean_t);
uint64_t dbuf_whichblock(const struct dnode *di, const int64_t level,
const uint64_t offset);
@@ -441,16 +442,7 @@ dbuf_find_dirty_eq(dmu_buf_impl_t *db, uint64_t txg)
(dbuf_is_metadata(_db) && \
((_db)->db_objset->os_primary_cache == ZFS_CACHE_METADATA)))
#define DBUF_IS_L2CACHEABLE(_db) \
((_db)->db_objset->os_secondary_cache == ZFS_CACHE_ALL || \
(dbuf_is_metadata(_db) && \
((_db)->db_objset->os_secondary_cache == ZFS_CACHE_METADATA)))
#define DNODE_LEVEL_IS_L2CACHEABLE(_dn, _level) \
((_dn)->dn_objset->os_secondary_cache == ZFS_CACHE_ALL || \
(((_level) > 0 || \
DMU_OT_IS_METADATA((_dn)->dn_handle->dnh_dnode->dn_type)) && \
((_dn)->dn_objset->os_secondary_cache == ZFS_CACHE_METADATA)))
boolean_t dbuf_is_l2cacheable(dmu_buf_impl_t *db);
#ifdef ZFS_DEBUG
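Both `DBUF_HASH_RWLOCK` and `DBUF_HASH_MUTEX` above pick a lock with `idx & (N - 1)`, which is a cheap modulo that works only when the lock count is a power of two. The sketch below shows that mapping on its own; `NLOCKS` and `lock_slot` are illustrative names, and the array of actual locks is omitted.

```c
#include <assert.h>
#include <stdint.h>

#define	NLOCKS	2048	/* must be a power of two for the mask to work */

/* The mask trick is only a valid modulo for power-of-two counts. */
static_assert((NLOCKS & (NLOCKS - 1)) == 0, "NLOCKS must be a power of two");

/* Map a hash-bucket index to one of NLOCKS lock slots. */
static uint64_t
lock_slot(uint64_t idx)
{
	return (idx & (NLOCKS - 1));
}
```

Many buckets share one lock this way: indices 1 and 2049 land in the same slot. A larger lock count reduces that sharing at the cost of more memory for the lock array.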
+11 -2
@@ -27,6 +27,7 @@
* Copyright (c) 2014 Spectra Logic Corporation, All rights reserved.
* Copyright 2013 Saso Kiselkov. All rights reserved.
* Copyright (c) 2017, Intel Corporation.
* Copyright (c) 2022 Hewlett Packard Enterprise Development LP.
*/
/* Portions Copyright 2010 Robert Milkowski */
@@ -136,18 +137,24 @@ typedef enum dmu_object_byteswap {
#endif
#define DMU_OT_IS_METADATA(ot) (((ot) & DMU_OT_NEWTYPE) ? \
((ot) & DMU_OT_METADATA) : \
(((ot) & DMU_OT_METADATA) != 0) : \
DMU_OT_IS_METADATA_IMPL(ot))
#define DMU_OT_IS_DDT(ot) \
((ot) == DMU_OT_DDT_ZAP)
#define DMU_OT_IS_CRITICAL(ot) \
(DMU_OT_IS_METADATA(ot) && \
(ot) != DMU_OT_DNODE && \
(ot) != DMU_OT_DIRECTORY_CONTENTS && \
(ot) != DMU_OT_SA)
/* Note: ztest uses DMU_OT_UINT64_OTHER as a proxy for file blocks */
#define DMU_OT_IS_FILE(ot) \
((ot) == DMU_OT_PLAIN_FILE_CONTENTS || (ot) == DMU_OT_UINT64_OTHER)
#define DMU_OT_IS_ENCRYPTED(ot) (((ot) & DMU_OT_NEWTYPE) ? \
((ot) & DMU_OT_ENCRYPTED) : \
(((ot) & DMU_OT_ENCRYPTED) != 0) : \
DMU_OT_IS_ENCRYPTED_IMPL(ot))
/*
@@ -1067,6 +1074,8 @@ int dmu_diff(const char *tosnap_name, const char *fromsnap_name,
#define ZFS_CRC64_POLY 0xC96C5795D7870F42ULL /* ECMA-182, reflected form */
extern uint64_t zfs_crc64_table[256];
extern int dmu_prefetch_max;
#ifdef __cplusplus
}
#endif
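The two spellings of `DMU_OT_IS_METADATA` and `DMU_OT_IS_ENCRYPTED` above differ in whether the masked bit is normalized with `!= 0`. One situation where that can matter is sketched below: a raw mask result keeps its bit position, so storing it in a one-bit field or comparing it with 1 loses the flag. The flag value 0x40 and all names here are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define	OT_METADATA	0x40	/* assumed flag bit for this sketch */

/* Raw mask: nonzero when set, but the value is 0x40, not 1. */
#define	IS_METADATA_RAW(ot)	((ot) & OT_METADATA)
/* Normalized: always exactly 0 or 1. */
#define	IS_METADATA_BOOL(ot)	(((ot) & OT_METADATA) != 0)

static_assert(IS_METADATA_RAW(0x40) == 0x40, "raw mask keeps bit position");
static_assert(IS_METADATA_BOOL(0x40) == 1, "normalized result is 0 or 1");

typedef struct {
	unsigned int is_meta : 1;
} one_bit_t;

/* Store the raw mask in a one-bit field: only the low bit survives. */
static unsigned int
store_raw(uint8_t ot)
{
	one_bit_t b = { 0 };

	b.is_meta = IS_METADATA_RAW(ot);
	return (b.is_meta);
}

/* Store the normalized result: the flag is preserved. */
static unsigned int
store_bool(uint8_t ot)
{
	one_bit_t b = { 0 };

	b.is_meta = IS_METADATA_BOOL(ot);
	return (b.is_meta);
}
```

For an object type of 0xC0, `store_raw()` yields 0 while `store_bool()` yields 1, even though both macros are "true" in an `if` condition.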
-4
@@ -200,10 +200,6 @@ struct objset {
#define DMU_GROUPUSED_DNODE(os) ((os)->os_groupused_dnode.dnh_dnode)
#define DMU_PROJECTUSED_DNODE(os) ((os)->os_projectused_dnode.dnh_dnode)
#define DMU_OS_IS_L2CACHEABLE(os) \
((os)->os_secondary_cache == ZFS_CACHE_ALL || \
(os)->os_secondary_cache == ZFS_CACHE_METADATA)
/* called from zpl */
int dmu_objset_hold(const char *name, void *tag, objset_t **osp);
int dmu_objset_hold_flags(const char *name, boolean_t decrypt, void *tag,
+1
@@ -125,6 +125,7 @@ typedef struct dmu_tx_stats {
kstat_named_t dmu_tx_dirty_delay;
kstat_named_t dmu_tx_dirty_over_max;
kstat_named_t dmu_tx_dirty_frees_delay;
kstat_named_t dmu_tx_wrlog_delay;
kstat_named_t dmu_tx_quota;
} dmu_tx_stats_t;
+7 -9
@@ -49,20 +49,18 @@ typedef struct zfetch {
typedef struct zstream {
uint64_t zs_blkid; /* expect next access at this blkid */
uint64_t zs_pf_blkid1; /* first block to prefetch */
uint64_t zs_pf_blkid; /* block to prefetch up to */
/*
* We will next prefetch the L1 indirect block of this level-0
* block id.
*/
uint64_t zs_ipf_blkid1; /* first block to prefetch */
uint64_t zs_ipf_blkid; /* block to prefetch up to */
unsigned int zs_pf_dist; /* data prefetch distance in bytes */
unsigned int zs_ipf_dist; /* L1 prefetch distance in bytes */
uint64_t zs_pf_start; /* first data block to prefetch */
uint64_t zs_pf_end; /* data block to prefetch up to */
uint64_t zs_ipf_start; /* first data block to prefetch L1 */
uint64_t zs_ipf_end; /* data block to prefetch L1 up to */
list_node_t zs_node; /* link for zf_stream */
hrtime_t zs_atime; /* time last prefetch issued */
zfetch_t *zs_fetch; /* parent fetch */
boolean_t zs_missed; /* stream saw cache misses */
boolean_t zs_more; /* need more distant prefetch */
zfs_refcount_t zs_callers; /* number of pending callers */
/*
* Number of stream references: dnode, callers and pending blocks.
+7 -1
@@ -40,6 +40,7 @@
#include <sys/rrwlock.h>
#include <sys/dsl_synctask.h>
#include <sys/mmp.h>
#include <sys/aggsum.h>
#ifdef __cplusplus
extern "C" {
@@ -58,6 +59,7 @@ struct dsl_deadlist;
extern unsigned long zfs_dirty_data_max;
extern unsigned long zfs_dirty_data_max_max;
extern unsigned long zfs_wrlog_data_max;
extern int zfs_dirty_data_sync_percent;
extern int zfs_dirty_data_max_percent;
extern int zfs_dirty_data_max_max_percent;
@@ -82,7 +84,6 @@ typedef struct zfs_blkstat {
typedef struct zfs_all_blkstats {
zfs_blkstat_t zab_type[DN_MAX_LEVELS + 1][DMU_OT_TOTAL + 1];
kmutex_t zab_lock;
} zfs_all_blkstats_t;
@@ -119,6 +120,9 @@ typedef struct dsl_pool {
uint64_t dp_mos_compressed_delta;
uint64_t dp_mos_uncompressed_delta;
aggsum_t dp_wrlog_pertxg[TXG_SIZE];
aggsum_t dp_wrlog_total;
/*
* Time of most recently scheduled (furthest in the future)
* wakeup for delayed transactions.
@@ -159,6 +163,8 @@ uint64_t dsl_pool_adjustedsize(dsl_pool_t *dp, zfs_space_check_t slop_policy);
uint64_t dsl_pool_unreserved_space(dsl_pool_t *dp,
zfs_space_check_t slop_policy);
uint64_t dsl_pool_deferred_space(dsl_pool_t *dp);
void dsl_pool_wrlog_count(dsl_pool_t *dp, int64_t size, uint64_t txg);
boolean_t dsl_pool_need_wrlog_delay(dsl_pool_t *dp);
void dsl_pool_dirty_space(dsl_pool_t *dp, int64_t space, dmu_tx_t *tx);
void dsl_pool_undirty_space(dsl_pool_t *dp, int64_t space, uint64_t txg);
void dsl_free(dsl_pool_t *dp, uint64_t txg, const blkptr_t *bpp);
+1 -1
@@ -155,7 +155,7 @@ typedef struct dsl_scan {
dsl_scan_phys_t scn_phys; /* on disk representation of scan */
dsl_scan_phys_t scn_phys_cached;
avl_tree_t scn_queue; /* queue of datasets to scan */
uint64_t scn_bytes_pending; /* outstanding data to issue */
uint64_t scn_queues_pending; /* outstanding data to issue */
} dsl_scan_t;
typedef struct dsl_scan_io_queue dsl_scan_io_queue_t;
+47 -1
@@ -29,6 +29,7 @@
* Copyright (c) 2019 Datto Inc.
* Portions Copyright 2010 Robert Milkowski
* Copyright (c) 2021, Colm Buckley <colm@tuatha.org>
* Copyright (c) 2022 Hewlett Packard Enterprise Development LP.
*/
#ifndef _SYS_FS_ZFS_H
@@ -423,7 +424,9 @@ typedef enum {
typedef enum {
ZFS_REDUNDANT_METADATA_ALL,
ZFS_REDUNDANT_METADATA_MOST
ZFS_REDUNDANT_METADATA_MOST,
ZFS_REDUNDANT_METADATA_SOME,
ZFS_REDUNDANT_METADATA_NONE
} zfs_redundant_metadata_type_t;
typedef enum {
@@ -757,6 +760,7 @@ typedef struct zpool_load_policy {
/* Rewind data discovered */
#define ZPOOL_CONFIG_LOAD_TIME "rewind_txg_ts"
#define ZPOOL_CONFIG_LOAD_META_ERRORS "verify_meta_errors"
#define ZPOOL_CONFIG_LOAD_DATA_ERRORS "verify_data_errors"
#define ZPOOL_CONFIG_REWIND_TIME "seconds_of_rewind"
@@ -1101,6 +1105,7 @@ typedef struct vdev_stat {
uint64_t vs_configured_ashift; /* TLV vdev_ashift */
uint64_t vs_logical_ashift; /* vdev_logical_ashift */
uint64_t vs_physical_ashift; /* vdev_physical_ashift */
uint64_t vs_pspace; /* physical capacity */
} vdev_stat_t;
/* BEGIN CSTYLED */
@@ -1613,6 +1618,47 @@ typedef enum {
#define ZFS_EV_HIST_DSID "history_dsid"
#define ZFS_EV_RESILVER_TYPE "resilver_type"
/*
* We currently support block sizes from 512 bytes to 16MB.
* The benefits of larger blocks, and thus larger IO, need to be weighed
* against the cost of COWing a giant block to modify one byte, and the
* large latency of reading or writing a large block.
*
* The recordsize property can not be set larger than zfs_max_recordsize
* (default 16MB on 64-bit and 1MB on 32-bit). See the comment near
* zfs_max_recordsize in dsl_dataset.c for details.
*
* Note that although the LSIZE field of the blkptr_t can store sizes up
* to 32MB, the dnode's dn_datablkszsec can only store sizes up to
* 32MB - 512 bytes. Therefore, we limit SPA_MAXBLOCKSIZE to 16MB.
*/
#define SPA_MINBLOCKSHIFT 9
#define SPA_OLD_MAXBLOCKSHIFT 17
#define SPA_MAXBLOCKSHIFT 24
#define SPA_MINBLOCKSIZE (1ULL << SPA_MINBLOCKSHIFT)
#define SPA_OLD_MAXBLOCKSIZE (1ULL << SPA_OLD_MAXBLOCKSHIFT)
#define SPA_MAXBLOCKSIZE (1ULL << SPA_MAXBLOCKSHIFT)
/* supported encryption algorithms */
enum zio_encrypt {
ZIO_CRYPT_INHERIT = 0,
ZIO_CRYPT_ON,
ZIO_CRYPT_OFF,
ZIO_CRYPT_AES_128_CCM,
ZIO_CRYPT_AES_192_CCM,
ZIO_CRYPT_AES_256_CCM,
ZIO_CRYPT_AES_128_GCM,
ZIO_CRYPT_AES_192_GCM,
ZIO_CRYPT_AES_256_GCM,
ZIO_CRYPT_FUNCTIONS
};
#define ZIO_CRYPT_ON_VALUE ZIO_CRYPT_AES_256_GCM
#define ZIO_CRYPT_DEFAULT ZIO_CRYPT_OFF
#ifdef __cplusplus
}
#endif
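The arithmetic behind the block-size comment in this header can be checked directly. The sketch below restates the shift macros from the diff and assumes, as the comment implies, that the dnode stores its data block size as a 16-bit count of 512-byte sectors; the helper names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

#define	SPA_MINBLOCKSHIFT	9
#define	SPA_OLD_MAXBLOCKSHIFT	17
#define	SPA_MAXBLOCKSHIFT	24
#define	SPA_MINBLOCKSIZE	(1ULL << SPA_MINBLOCKSHIFT)
#define	SPA_OLD_MAXBLOCKSIZE	(1ULL << SPA_OLD_MAXBLOCKSHIFT)
#define	SPA_MAXBLOCKSIZE	(1ULL << SPA_MAXBLOCKSHIFT)

static_assert(SPA_MINBLOCKSIZE == 512, "2^9 bytes");
static_assert(SPA_OLD_MAXBLOCKSIZE == 128 * 1024, "2^17 bytes = 128 KiB");
static_assert(SPA_MAXBLOCKSIZE == 16 * 1024 * 1024, "2^24 bytes = 16 MiB");

/* Largest byte size expressible as a 16-bit count of 512-byte sectors. */
static uint64_t
max_size_in_u16_sectors(void)
{
	return ((uint64_t)UINT16_MAX << SPA_MINBLOCKSHIFT);
}

/* Largest power of two that does not exceed x (x must be at least 1). */
static uint64_t
largest_pow2_at_most(uint64_t x)
{
	uint64_t p = 1;

	assert(x >= 1);
	while (p <= x / 2)
		p <<= 1;
	return (p);
}
```

A 16-bit sector count tops out at 65535 * 512 bytes, which is 32 MiB minus 512 bytes, and the largest power of two that fits under that limit is 16 MiB, matching `SPA_MAXBLOCKSIZE`.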
+3
@@ -49,11 +49,14 @@ int metaslab_init(metaslab_group_t *, uint64_t, uint64_t, uint64_t,
metaslab_t **);
void metaslab_fini(metaslab_t *);
void metaslab_set_unflushed_dirty(metaslab_t *, boolean_t);
void metaslab_set_unflushed_txg(metaslab_t *, uint64_t, dmu_tx_t *);
void metaslab_set_estimated_condensed_size(metaslab_t *, uint64_t, dmu_tx_t *);
boolean_t metaslab_unflushed_dirty(metaslab_t *);
uint64_t metaslab_unflushed_txg(metaslab_t *);
uint64_t metaslab_estimated_condensed_size(metaslab_t *);
int metaslab_sort_by_flushed(const void *, const void *);
void metaslab_unflushed_bump(metaslab_t *, dmu_tx_t *, boolean_t);
uint64_t metaslab_unflushed_changes_memused(metaslab_t *);
int metaslab_load(metaslab_t *);
+1
@@ -553,6 +553,7 @@ struct metaslab {
* log space maps.
*/
uint64_t ms_unflushed_txg;
boolean_t ms_unflushed_dirty;
/* updated every time we are done syncing the metaslab's space map */
uint64_t ms_synced_length;
+5 -16
@@ -63,12 +63,8 @@ typedef struct range_tree {
*/
uint8_t rt_shift;
uint64_t rt_start;
range_tree_ops_t *rt_ops;
/* rt_btree_compare should only be set if rt_arg is a b-tree */
const range_tree_ops_t *rt_ops;
void *rt_arg;
int (*rt_btree_compare)(const void *, const void *);
uint64_t rt_gap; /* allowable inter-segment gap */
/*
@@ -278,11 +274,11 @@ rs_set_fill(range_seg_t *rs, range_tree_t *rt, uint64_t fill)
typedef void range_tree_func_t(void *arg, uint64_t start, uint64_t size);
range_tree_t *range_tree_create_impl(range_tree_ops_t *ops,
range_tree_t *range_tree_create_gap(const range_tree_ops_t *ops,
range_seg_type_t type, void *arg, uint64_t start, uint64_t shift,
int (*zfs_btree_compare) (const void *, const void *), uint64_t gap);
range_tree_t *range_tree_create(range_tree_ops_t *ops, range_seg_type_t type,
void *arg, uint64_t start, uint64_t shift);
uint64_t gap);
range_tree_t *range_tree_create(const range_tree_ops_t *ops,
range_seg_type_t type, void *arg, uint64_t start, uint64_t shift);
void range_tree_destroy(range_tree_t *rt);
boolean_t range_tree_contains(range_tree_t *rt, uint64_t start, uint64_t size);
range_seg_t *range_tree_find(range_tree_t *rt, uint64_t start, uint64_t size);
@@ -316,13 +312,6 @@ void range_tree_remove_xor_add_segment(uint64_t start, uint64_t end,
void range_tree_remove_xor_add(range_tree_t *rt, range_tree_t *removefrom,
range_tree_t *addto);
void rt_btree_create(range_tree_t *rt, void *arg);
void rt_btree_destroy(range_tree_t *rt, void *arg);
void rt_btree_add(range_tree_t *rt, range_seg_t *rs, void *arg);
void rt_btree_remove(range_tree_t *rt, range_seg_t *rs, void *arg);
void rt_btree_vacate(range_tree_t *rt, void *arg);
extern range_tree_ops_t rt_btree_ops;
#ifdef __cplusplus
}
#endif
-21
@@ -72,27 +72,6 @@ struct dsl_pool;
struct dsl_dataset;
struct dsl_crypto_params;
/*
* We currently support block sizes from 512 bytes to 16MB.
* The benefits of larger blocks, and thus larger IO, need to be weighed
* against the cost of COWing a giant block to modify one byte, and the
* large latency of reading or writing a large block.
*
* Note that although blocks up to 16MB are supported, the recordsize
* property can not be set larger than zfs_max_recordsize (default 1MB).
* See the comment near zfs_max_recordsize in dsl_dataset.c for details.
*
* Note that although the LSIZE field of the blkptr_t can store sizes up
* to 32MB, the dnode's dn_datablkszsec can only store sizes up to
* 32MB - 512 bytes. Therefore, we limit SPA_MAXBLOCKSIZE to 16MB.
*/
#define SPA_MINBLOCKSHIFT 9
#define SPA_OLD_MAXBLOCKSHIFT 17
#define SPA_MAXBLOCKSHIFT 24
#define SPA_MINBLOCKSIZE (1ULL << SPA_MINBLOCKSHIFT)
#define SPA_OLD_MAXBLOCKSIZE (1ULL << SPA_OLD_MAXBLOCKSHIFT)
#define SPA_MAXBLOCKSIZE (1ULL << SPA_MAXBLOCKSHIFT)
/*
* Alignment Shift (ashift) is an immutable, internal top-level vdev property
* which can only be set at vdev creation time. Physical writes are always done
+2 -2
@@ -146,9 +146,9 @@ typedef struct spa_config_lock {
kmutex_t scl_lock;
kthread_t *scl_writer;
int scl_write_wanted;
int scl_count;
kcondvar_t scl_cv;
zfs_refcount_t scl_count;
} spa_config_lock_t;
} ____cacheline_aligned spa_config_lock_t;
typedef struct spa_config_dirent {
list_node_t scd_link;
+7 -2
@@ -30,7 +30,10 @@
typedef struct log_summary_entry {
uint64_t lse_start; /* start TXG */
uint64_t lse_end; /* last TXG */
uint64_t lse_txgcount; /* # of TXGs */
uint64_t lse_mscount; /* # of metaslabs needed to be flushed */
uint64_t lse_msdcount; /* # of dirty metaslabs needed to be flushed */
uint64_t lse_blkcount; /* blocks held by this entry */
list_node_t lse_node;
} log_summary_entry_t;
@@ -50,6 +53,7 @@ typedef struct spa_log_sm {
uint64_t sls_nblocks; /* number of blocks in this log */
uint64_t sls_mscount; /* # of metaslabs flushed in the log's txg */
avl_node_t sls_node; /* node in spa_sm_logs_by_txg */
space_map_t *sls_sm; /* space map pointer, if open */
} spa_log_sm_t;
int spa_ld_log_spacemaps(spa_t *);
@@ -68,8 +72,9 @@ uint64_t spa_log_sm_memused(spa_t *);
void spa_log_sm_decrement_mscount(spa_t *, uint64_t);
void spa_log_sm_increment_current_mscount(spa_t *);
void spa_log_summary_add_flushed_metaslab(spa_t *);
void spa_log_summary_decrement_mscount(spa_t *, uint64_t);
void spa_log_summary_add_flushed_metaslab(spa_t *, boolean_t);
void spa_log_summary_dirty_flushed_metaslab(spa_t *, uint64_t);
void spa_log_summary_decrement_mscount(spa_t *, uint64_t, boolean_t);
void spa_log_summary_decrement_blkcount(spa_t *, uint64_t);
boolean_t spa_flush_all_logs_requested(spa_t *);
+3
@@ -244,6 +244,9 @@ extern "C" {
#define DEV_PATH "path"
#define DEV_IS_PART "is_slice"
#define DEV_SIZE "dev_size"
/* Size of the whole parent block device (if dev is a partition) */
#define DEV_PARENT_SIZE "dev_parent_size"
#endif /* __linux__ */
#define EV_V1 1
+1 -1
@@ -78,7 +78,7 @@ extern void txg_register_callbacks(txg_handle_t *txghp, list_t *tx_callbacks);
extern void txg_delay(struct dsl_pool *dp, uint64_t txg, hrtime_t delta,
hrtime_t resolution);
extern void txg_kick(struct dsl_pool *dp);
extern void txg_kick(struct dsl_pool *dp, uint64_t txg);
/*
* Wait until the given transaction group has finished syncing.
+1
@@ -642,6 +642,7 @@ extern int vdev_obsolete_counts_are_precise(vdev_t *vd, boolean_t *are_precise);
*/
int vdev_checkpoint_sm_object(vdev_t *vd, uint64_t *sm_obj);
void vdev_metaslab_group_create(vdev_t *vd);
uint64_t vdev_best_ashift(uint64_t logical, uint64_t a, uint64_t b);
/*
* Vdev ashift optimization tunables
-1
@@ -190,7 +190,6 @@ typedef struct znode {
boolean_t z_is_sa; /* are we native sa? */
boolean_t z_is_mapped; /* are we mmap'ed */
boolean_t z_is_ctldir; /* are we .zfs entry */
boolean_t z_is_stale; /* are we stale due to rollback? */
boolean_t z_suspended; /* extra ref from a suspend? */
uint_t z_blksz; /* block size in bytes */
uint_t z_seq; /* modification sequence number */
+10 -1
View File
@@ -221,6 +221,15 @@ typedef struct {
uint64_t lr_foid; /* object id */
} lr_ooo_t;
/*
* Additional lr_attr_t fields.
*/
typedef struct {
uint64_t lr_attr_attrs; /* all of the attributes */
uint64_t lr_attr_crtime[2]; /* create time */
uint8_t lr_attr_scanstamp[32];
} lr_attr_end_t;
/*
* Handle option extended vattr attributes.
*
@@ -231,7 +240,7 @@ typedef struct {
typedef struct {
uint32_t lr_attr_masksize; /* number of elements in array */
uint32_t lr_attr_bitmap; /* First entry of array */
/* remainder of array and any additional fields */
/* remainder of array and additional lr_attr_end_t fields */
} lr_attr_t;
/*
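The comments on `lr_attr_t` and `lr_attr_end_t` above describe a variable-length record: a bitmap array of `lr_attr_masksize` 32-bit words, of which the struct declares only the first, followed by the fixed trailing fields. The sketch below computes where those trailing fields would start, assuming they directly follow the bitmap array; the two helper functions are hypothetical and the structs are copied from the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
	uint32_t lr_attr_masksize;	/* number of elements in array */
	uint32_t lr_attr_bitmap;	/* first entry of array */
} lr_attr_t;

typedef struct {
	uint64_t lr_attr_attrs;		/* all of the attributes */
	uint64_t lr_attr_crtime[2];	/* create time */
	uint8_t lr_attr_scanstamp[32];
} lr_attr_end_t;

/* Byte offset of the trailing fields from the start of lr_attr_t. */
static size_t
lr_attr_end_offset(uint32_t masksize)
{
	/* The struct already holds one bitmap word, so add masksize - 1. */
	assert(masksize >= 1);
	return (sizeof (lr_attr_t) + (masksize - 1) * sizeof (uint32_t));
}

/* Total bytes occupied by the header, bitmap array and trailing fields. */
static size_t
lr_attr_total_size(uint32_t masksize)
{
	return (lr_attr_end_offset(masksize) + sizeof (lr_attr_end_t));
}
```

With a single bitmap word the trailing fields start at byte 8 and the whole record takes 64 bytes; each extra bitmap word shifts both figures by 4 bytes.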
+2 -17
@@ -108,23 +108,6 @@ enum zio_checksum {
#define ZIO_DEDUPCHECKSUM ZIO_CHECKSUM_SHA256
/* supported encryption algorithms */
enum zio_encrypt {
ZIO_CRYPT_INHERIT = 0,
ZIO_CRYPT_ON,
ZIO_CRYPT_OFF,
ZIO_CRYPT_AES_128_CCM,
ZIO_CRYPT_AES_192_CCM,
ZIO_CRYPT_AES_256_CCM,
ZIO_CRYPT_AES_128_GCM,
ZIO_CRYPT_AES_192_GCM,
ZIO_CRYPT_AES_256_GCM,
ZIO_CRYPT_FUNCTIONS
};
#define ZIO_CRYPT_ON_VALUE ZIO_CRYPT_AES_256_GCM
#define ZIO_CRYPT_DEFAULT ZIO_CRYPT_OFF
/* macros defining encryption lengths */
#define ZIO_OBJSET_MAC_LEN 32
#define ZIO_DATA_IV_LEN 12
@@ -699,6 +682,8 @@ extern void spa_handle_ignored_writes(spa_t *spa);
/* zbookmark_phys functions */
boolean_t zbookmark_subtree_completed(const struct dnode_phys *dnp,
const zbookmark_phys_t *subtree_root, const zbookmark_phys_t *last_block);
boolean_t zbookmark_subtree_tbd(const struct dnode_phys *dnp,
const zbookmark_phys_t *subtree_root, const zbookmark_phys_t *last_block);
int zbookmark_compare(uint16_t dbss1, uint8_t ibs1, uint16_t dbss2,
uint8_t ibs2, const zbookmark_phys_t *zb1, const zbookmark_phys_t *zb2);
+3
@@ -5,6 +5,9 @@ VPATH = $(top_srcdir)/module/avl/
# Includes kernel code, generate warnings for large stack frames
AM_CFLAGS += $(FRAME_LARGER_THAN)
# See https://debbugs.gnu.org/cgi/bugreport.cgi?bug=54020
AM_CFLAGS += -no-suppress
noinst_LTLIBRARIES = libavl.la
KERNEL_C = \
+3
@@ -2,6 +2,9 @@ include $(top_srcdir)/config/Rules.am
AM_CFLAGS += $(LIBUUID_CFLAGS) $(ZLIB_CFLAGS)
# See https://debbugs.gnu.org/cgi/bugreport.cgi?bug=54020
AM_CFLAGS += -no-suppress
noinst_LTLIBRARIES = libefi.la
USER_C = \
+2 -3
@@ -6,6 +6,8 @@ VPATH = \
# Includes kernel code, generate warnings for large stack frames
AM_CFLAGS += $(FRAME_LARGER_THAN)
# See https://debbugs.gnu.org/cgi/bugreport.cgi?bug=54020
AM_CFLAGS += -no-suppress
noinst_LTLIBRARIES = libicp.la
@@ -17,7 +19,6 @@ ASM_SOURCES_AS = \
asm-x86_64/modes/gcm_pclmulqdq.S \
asm-x86_64/modes/aesni-gcm-x86_64.S \
asm-x86_64/modes/ghash-x86_64.S \
asm-x86_64/sha1/sha1-x86_64.S \
asm-x86_64/sha2/sha256_impl.S \
asm-x86_64/sha2/sha512_impl.S
else
@@ -46,7 +47,6 @@ KERNEL_C = \
algs/modes/ctr.c \
algs/modes/ccm.c \
algs/modes/ecb.c \
algs/sha1/sha1.c \
algs/sha2/sha2.c \
algs/skein/skein.c \
algs/skein/skein_block.c \
@@ -54,7 +54,6 @@ KERNEL_C = \
illumos-crypto.c \
io/aes.c \
io/edonr_mod.c \
io/sha1_mod.c \
io/sha2_mod.c \
io/skein_mod.c \
os/modhash.c \
+3
@@ -8,6 +8,9 @@ VPATH = \
# and required CFLAGS for libtirpc
AM_CFLAGS += $(FRAME_LARGER_THAN) $(LIBTIRPC_CFLAGS)
# See https://debbugs.gnu.org/cgi/bugreport.cgi?bug=54020
AM_CFLAGS += -no-suppress
lib_LTLIBRARIES = libnvpair.la
include $(top_srcdir)/config/Abigail.am
+3
@@ -2,6 +2,9 @@ include $(top_srcdir)/config/Rules.am
DEFAULT_INCLUDES += -I$(srcdir)
# See https://debbugs.gnu.org/cgi/bugreport.cgi?bug=54020
AM_CFLAGS += -no-suppress
noinst_LTLIBRARIES = libshare.la
USER_C = \
+3
@@ -2,6 +2,9 @@ include $(top_srcdir)/config/Rules.am
SUBDIRS = include
# See https://debbugs.gnu.org/cgi/bugreport.cgi?bug=54020
AM_CFLAGS += -no-suppress
noinst_LTLIBRARIES = libspl_assert.la libspl.la
libspl_assert_la_SOURCES = \
+1 -1
@@ -26,7 +26,7 @@
#include <zone.h>
zoneid_t
-getzoneid()
+getzoneid(void)
{
return (GLOBAL_ZONEID);
}
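The hunk above changes an empty parameter list to `(void)`. Before C23, a definition written with `()` does not give the function a prototype, so the compiler cannot check the arguments passed at call sites; `(void)` declares that the function takes none. The diff does not state its motivation, so reading it as a strict-prototypes cleanup is an assumption. The sketch below uses hypothetical stand-in names (`example_zoneid_t`, `EXAMPLE_GLOBAL_ZONEID`, `example_getzoneid`) rather than the real libspl definitions.

```c
#include <assert.h>

/* Hypothetical stand-ins for zoneid_t and GLOBAL_ZONEID. */
typedef int example_zoneid_t;
#define	EXAMPLE_GLOBAL_ZONEID	0

/*
 * Prototype-style definition: "(void)" states that the function takes
 * no arguments, so a call such as example_getzoneid(1) is rejected at
 * compile time instead of being silently accepted.
 */
example_zoneid_t
example_getzoneid(void)
{
	return (EXAMPLE_GLOBAL_ZONEID);
}

/* Small self-check showing the only valid call form. */
static void
example_check_zoneid(void)
{
	assert(example_getzoneid() == EXAMPLE_GLOBAL_ZONEID);
}
```

Building with `-Wstrict-prototypes` in GCC or Clang flags the old `()` form, which is one way to find remaining instances.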
+6
@@ -1,5 +1,11 @@
include $(top_srcdir)/config/Rules.am
# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61118
AM_CFLAGS += $(NO_CLOBBERED)
# See https://debbugs.gnu.org/cgi/bugreport.cgi?bug=54020
AM_CFLAGS += -no-suppress
noinst_LTLIBRARIES = libtpool.la
USER_C = \

Some files were not shown because too many files have changed in this diff.