Compare commits

...

309 Commits

Author SHA1 Message Date
Tony Hutter 92e0d9d183 Tag zfs-2.1.9
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2023-01-24 15:41:54 -08:00
Coleman Kane 232fc23c6e linux 6.2 compat: zpl_set_acl arg2 is now struct dentry
Linux 6.2 changes the second argument of the set_acl operation to be a
"struct dentry *" rather than a "struct inode *". The inode* parameter
is still available as dentry->d_inode, so adjust the call to the _impl
function call to dereference and pass that pointer to it.

Also document that the get_acl -> get_inode_acl member name change from
commit 884a693 was an API change also introduced in Linux 6.2.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #14415
2023-01-24 15:36:08 -08:00
Tony Hutter 11bdc5c8e8 Revert "ztest fails assertion in zio_write_gang_member_ready()"
This reverts commit 0156253d29.

That commit was identified as causing IO errors on a user's
encrypted dataset:
https://github.com/openzfs/zfs/issues/14413

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2023-01-24 15:35:24 -08:00
Tony Hutter 04b02785b6 Tag zfs-2.1.8
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2023-01-19 13:00:56 -08:00
Gian-Carlo DeFazio f22254283a change how d_alias is replaced by du.d_alias
d_alias may need to be converted to du.d_alias
depending on the kernel version.
d_alias is currently in only one place in the code which
changes
"hlist_for_each_entry(dentry, &inode->i_dentry, d_alias)"
to
"hlist_for_each_entry(dentry, &inode->i_dentry, d_u.d_alias)"
as neccesary.

This effectively results in a double macro expansion
for code that uses the zfs headers but already has its
own macro for just d_alias (lustre in this case).

Remove the conditional code for hlist_for_each_entry
and have a macro for "d_alias -> du.d_alias" instead.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gian-Carlo DeFazio <defazio1@llnl.gov>
Closes #14377
2023-01-19 12:50:42 -08:00
Richard Yao 7319a73921 Linux ppc64le ieee128 compat: Do not redefine __asm on external headers
There is an external assembly declaration extension in GNU C that glibc
uses when building with ieee128 floating point support on ppc64le.
Marking that as volatile makes no sense, so the build breaks.

It does not make sense to only mark this as volatile on Linux, since if
do not want the compiler reordering things on Linux, we do not want the
compiler reordering things on any other platform, so we stop treating
Linux specially and just manually inline the CPP macro so that we can
eliminate it. This should fix the build on ppc64le.

Tested-by: @gyakovlev
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14308
Closes #14384
2023-01-19 12:50:42 -08:00
Vince van Oosten 596cfb6b15 include systemd overrides to zfs-dracut module
If a user that uses systemd and dracut wants to overide certain
settings, they typically use `systemctl edit [unit]` or place a file in
`/etc/systemd/system/[unit].d/override.conf` directly.

The zfs-dracut module did not include those overrides however, so this
did not have any effect at boot time.

For zfs-import-scan.service and zfs-import-cache.service, overrides are
now included in the dracut initramfs image.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Vince van Oosten <techhazard@codeforyouand.me>
Closes #14075
Closes #14076
2023-01-19 12:50:42 -08:00
George Amanakis f806306ce0 Activate filesystem features only in syncing context
When activating filesystem features after receiving a snapshot, do
so only in syncing context.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #14304
Closes #14252
2023-01-19 12:50:42 -08:00
Richard Yao f33b298346 Illumos #15286: do_composition() needs sign awareness
Authored by: Dan McDonald <danmcd@mnx.io>
Reviewed by: Patrick Mooney <pmooney@pfmooney.com>
Reviewed by: Richard Lowe <richlowe@richlowe.net>
Approved by: Joshua M. Clulow <josh@sysmgr.org>
Ported-by: Richard Yao <richard.yao@alumni.stonybrook.edu>

Illumos-issue: https://www.illumos.org/issues/15286
Illumos-commit: https://github.com/illumos/illumos-gate/commit/f137b22e734e85642da3e56e8b94da3f5f027c73

Porting Notes:

The patch in illumos did not have much of a commit message, and did not
provide attribution to the reporter, while original patch proposed to
OpenZFS did, so I am listing the reporter (myself) and original patch
author (also myself) below while including the original commit message
with some minor corrections as part of the porting notes:

In do_composition(), we have:

size = u8_number_of_bytes[*p];
if (size <= 1 || (p + size) > oslast)
	break;

There, we have type promotion from int8_t to size_t, which is unsigned.
C will sign extend the value as part of the widening before treating the
value as unsigned and the negative values we can counter are error
values from U8_ILLEGAL_CHAR and U8_OUT_OF_RANGE_CHAR, which are -1 and
-2 respectively. The unsigned versions of these under two's complement
are SIZE_MAX and SIZE_MAX-1 respectively.

The bounds check is written under the assumption that `size <= 1` does a
signed comparison. This is followed by a pointer comparison to see if
the string has the correct length, which is fine.

A little further down we have:

for (i = 0; i < size; i++)
	tc[i] = *p++;

When an error condition is encountered, this will attempt to iterate at
least SIZE_MAX-1 times, which will massively overflow the buffer, which
is not fine.

The kernel will kill the loop as soon as it hits the kernel stack guard
on Linux systems built with CONFIG_VMAP_STACK=y, which should be just
about all of them. That prevents arbitrary code execution and just about
any other bad thing that a black hat attacker might attempt with
knowledge of this buffer overflow. Other systems' kernels have
mitigations for unbounded in-kernel buffer overflows that will catch
this too.

Also, the patch in illumos-gate made an effort to fix C style issues
that had been fixed in the OpenZFS/ZFSOnLinux repository. Those issues
had been mentioned in the email that I originally sent them about this
issue. One of the fixes had not been already done, so it is included.
Another to collect_a_seq()'s arguments was handled differently in
OpenZFS. For the sake of avoiding unnecessary differences, it has been
adopted. This has the interesting effect that if you correct the paths
in the illumos-gate patch to match the current OpenZFS repository, you
can reverse apply it cleanly.

Original-patch-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reported-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Co-authored-by: Dan McDonald <danmcd@mnx.io>
Closes #14318
Closes #14342
2023-01-19 12:50:42 -08:00
Brian Behlendorf 04fcf13de0 dracut: fix typo in mount-zfs.sh.in
Format the `zpool get` command correctly.  The -o option must
be followed by "all" or the requested field name.

Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13602
2023-01-19 12:50:42 -08:00
Matthew Ahrens 37dbf91c8a removal of LegacyVersion broke ax_python_dev.m4
The 22.0 release of the python `packaging` package removed the
`LegacyVersion` trait, causing ZFS to no longer compile.

This commit replaces the sections of `ax_python_dev.m4` that rely on
`LegacyVersion` with updated implementations from the upstream
`autoconf-archive`.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #14297
2023-01-19 12:50:42 -08:00
Mateusz Guzik be697f4339 FreeBSD: catch up to 1400077
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #14328
2023-01-19 12:50:42 -08:00
Martin Rüegg c07a8660f0 Fix shebang for helper script of deb-utils
Shebang was missing the `!` between `#` and the actual path.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Martin Rüegg <martin.rueegg@metaworx.ch>
Closes #14339
2023-01-19 12:50:42 -08:00
Martin Rüegg ea62fb4ab7 Add quotation marks around $PATH for deb-utils
Fix #14338, failing to build deb-utils if existing `$PATH` variable
would include a whitespace.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Martin Rüegg <martin.rueegg@metaworx.ch>
Closes #14339
2023-01-19 12:50:42 -08:00
Brian Behlendorf 5aca6e1092 Documentation corrections
- Update the link to the OpenZFS Code of Conduct.
- Remove extra "the" from contrib/initramfs/scripts/zfs

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #14298
Closes #14307
2023-01-19 12:50:42 -08:00
George Melikov d72e004715 systemd: set restart=always for zfs-zed.service
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Co-authored-by: Attila Fülöp <attila@fueloep.org>
Closes #14294
2023-01-19 12:50:42 -08:00
Ethan Coe-Renner 9ef565a185 Add color output to zfs diff.
This adds support to color zfs diff (in the style of git diff)
conditional on the ZFS_COLOR environment variable.

Signed-off-by: Ethan Coe-Renner <coerenner1@llnl.gov>
2023-01-19 12:50:36 -08:00
наб 0e72f5fb83 libzfs: diff: simplify superfluous stdio
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12829
2023-01-19 12:50:36 -08:00
наб e9897e542d libzfs: diff: print_what() can return the symbol => get_what()
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12829
2023-01-19 12:50:36 -08:00
Doug Rabson 70b1b5bb98 FreeBSD: Remove stray debug printf
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Doug Rabson <dfr@rabson.org>
Closes #14286
Closes #14287
2023-01-19 12:50:36 -08:00
Richard Yao a2aabac123 Zero end of embedded block buffer in dump_write_embedded()
This fixes a kernel stack leak.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Tested-by: Nicholas Sherlock <n.sherlock@gmail.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13778
Closes #14255
2023-01-19 12:50:36 -08:00
Marcel Menzel 3207803abf Change ZEVENT_POOL_GUID to ZEVENT_POOL to display pool names
Outgoing mails for ZFS pool events include the pool GUID,
but not the actual pool name. Let's change this for better
readability, as it is already done in the mails for finished
pool resilvers.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Marcel Menzel <mail@mcl.gg>
Closes #14272
2023-01-19 12:50:36 -08:00
Allan Jude 6219190d7f Restrict visibility of per-dataset kstats inside FreeBSD jails
When inside a jail, visibility on datasets not "jailed" to the
jail is restricted. However, it was possible to enumerate all
datasets in the pool by looking at the kstats sysctl MIB.

Only the kstats corresponding to datasets that the user has
visibility on are accessible now.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Closes #14254
2023-01-19 12:50:36 -08:00
Richard Yao 24a6d8316a Fix dereference after null check in enqueue_range
If the bp is NULL, we have a hole. However, when we build with
assertions, we will dereference bp when `blkid == DMU_SPILL_BLKID`. When
this happens on a hole, we will have a NULL pointer dereference.

Reported-by: Coverity (CID-1524670)
Reviewed-by: Damian Szuberski <szuberskidamian@gmail.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14264
2023-01-19 12:50:36 -08:00
Richard Yao e23ed1b330 Fix potential buffer overflow in zpool command
The ZPOOL_SCRIPTS_PATH environment variable can be passed here. This
allows for arbitrarily long strings to be passed to sprintf(), which can
overflow the buffer.

I missed this in my earlier audit of the codebase. CodeQL's
cpp/unbounded-write check caught this.

Reviewed-by: Damian Szuberski <szuberskidamian@gmail.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14264
2023-01-19 12:50:36 -08:00
Richard Yao 572114d846 FreeBSD: zfs_register_callbacks() must implement error check correctly
I read the following article and noticed a couple of ZFS bugs mentioned:

https://pvs-studio.com/en/blog/posts/cpp/0377/

I decided to search for them in the modern OpenZFS codebase and then
found one that matched the description of the first one:

V593 Consider reviewing the expression of the 'A = B != C' kind. The
expression is calculated as following: 'A = (B != C)'. zfs_vfsops.c 498

The consequence of this is that the error value is replaced with `1`
when there is an error. When there is no error, 0 is correctly passed.
This is a very minor issue that is unlikely to cause any real problems.

The incorrect error code would either be returned to the mount command
on a failure or any of `zfs receive`, `zfs recv`, `zfs rollback` or `zfs
upgrade`.

The second one has already been fixed.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Damian Szuberski <szuberskidamian@gmail.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14261
2023-01-19 12:50:36 -08:00
наб 6af8e80310 fgrep -> grep -F
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13259
2023-01-19 12:50:36 -08:00
наб f8a124b104 egrep -> grep -E
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13259
2023-01-19 12:50:25 -08:00
Tony Hutter 689c53f2c5 Update META to 6.1 kernel
ZFS successfully builds against the 6.1.4 kernel.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #14371
2023-01-10 16:12:11 -08:00
Matthew Ahrens 0156253d29 ztest fails assertion in zio_write_gang_member_ready()
Encrypted blocks can have up to 2 DVA's, as the third DVA is reserved
for the salt+IV.  However, dmu_write_policy() allows non-encrypted
blocks (e.g. DMU_OT_OBJSET) inside encrypted datasets to request and
allocate 3 DVA's, since they don't need a salt+IV (they are merely
authenicated).

However, if such a block becomes a gang block, the gang code incorrectly
limits the gang block header to 2 DVA's.  This leads to a "NDVAs
inversion", where a parent block (the gang block header) has less DVA's
than its children (the gang members), causing an assertion failure in
zio_write_gang_member_ready().

This commit addresses the problem by only restricting the gang block
header to 2 DVA's if the block is actually encrypted (and thus its gang
block members can have at most 2 DVA's).

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #14250
Closes #14356
2023-01-10 08:44:55 -08:00
Antonio Russo 3e0962a236 Introduce ZFS_LINUX_REQUIRE_API autoconf macro
Currently, if API tests fail, we either ignore the failures, or
unconditionally halt the kernel build.  This leads to situations where
incompatibilities with existing APIs may develop, but not trip the
configure compatibility checks.

This introduces a new mechanism to require APIs for kernels above a
particular version.  While not perfect, this at least guarantees
mainline kernels do not break existing APIs without at least providing
some warning.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Antonio Russo <aerusso@aerusso.net>
Closes #14343
2023-01-10 08:43:49 -08:00
Coleman Kane 3c0b8c874b linux 6.2 compat: bio->bi_rw was renamed bio->bi_opf
The bi_rw member of struct bio was renamed to bi_opf in Linux 6.2.
As well, Linux's implementation of bio_set_op_attrs(...) has been
removed.

The HAVE_BIO_BI_OPF macro already appears to be defined, but the
removal of the bio_set_op_attrs(...) implementation makes the build
fall back on the locally-defined implementation, which isn't updated
for the bio->bi_opf change. This commit adds that update.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #14324
Closes #14331
2023-01-10 08:43:49 -08:00
Coleman Kane b586ea5d93 linux 6.2 compat: get_acl() got moved to get_inode_acl() in 6.2
Linux 6.2 renamed the get_acl() operation to get_inode_acl() in
the inode_operations struct. This should fix Issue #14323.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #14323
Closes #14331
2023-01-10 08:43:49 -08:00
Antonio Russo 138d2b29dd Linux 6.1 compat: open inside tmpfile()
commit d27c81847b upstream

Linux 863f144 modified the .tmpfile interface to pass a struct file,
rather than a struct dentry, and expect the tmpfile implementation to
open inside of tmpfile().

This patch implements a configuration test that checks for this new API
and appropriately sets a HAVE_TMPFILE_DENTRY flag that tracks this old
API.  Contingent on this flag, the appropriate API is implemented.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Antonio Russo <aerusso@aerusso.net>
Closes #14301
Closes #14343
2023-01-09 17:15:22 -08:00
Antonio Russo 5371d8dae7 ZTS: close in mmapwrite.c
commit a7304ab9c1 upstream

mmapwrite is used during the ZTS to identify issues with mmap-ed files.
This helper program exercises this pathway by continuously writing to a
file.  ee6bf97c7 modified the writing threads to terminate after a set
amount of total data is written.  This change allows standard program
execution to reach the end of a writer thread without closing the file
descriptor, introducing a resource "leak."

This patch appeases resource leak analyses by close()-ing the file at
the end of the thread.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Antonio Russo <aerusso@aerusso.net>
Closes #14353
2023-01-09 17:15:22 -08:00
Antonio Russo a75af541cf ZTS: limit mmapwrite file size
commit ee6bf97c77 upstream

mmapwrite spawns several threads, all of which perform writes on a file
for the purpose of testing the behavior of mmap(2)-ed files.  One
thread performs an mmap and a write to the beginning of that region,
while the others perform regular writes after lseek(2)-ing the end of
the file.

Because these regular writes are set in a while (1) loop, they will
write an unbounded amount of data to disk.  The mmap_write_001_pos test
script SIGKILLs them after 30 seconds, but on fast testbeds, this may
be enough time to exhaust the available space in the filesystem,
leading to spurious test failures.

Instead, limit the total file size by checking that the lseek return
value is no greater than 250 * 1024*1024 bytes, which is less than the
default minimum vdev size defined in includes/default.cfg .

This also includes part of 2a493a4c71,
which checks the return value of lseek.

Signed-off-by: Antonio Russo <aerusso@aerusso.net>
Closes #14277
Closes #14345
2023-01-09 17:15:22 -08:00
Ameer Hamza 75fbe7eb99 skip permission checks for extended attributes
zfs_zaccess_trivial() calls the generic_permission() to read
xattr attributes. This causes deadlock if called from
zpl_xattr_set_dir() context as xattr and the dent locks are
already held in this scenario. This commit skips the permissions
checks for extended attributes since the Linux VFS stack already
checks it before passing us the control.

Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
2023-01-05 11:10:28 -08:00
Ameer Hamza d0f350c962 Allow receiver to override encryption properties in case of replication
Currently, the receiver fails to override the encryption
property for the plain replicated dataset with the error:
"cannot receive incremental stream: encryption property
'encryption' cannot be set for incremental streams.". The
problem is resolved by allowing the receiver to override
the encryption property for plain replicated send.

Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
2023-01-05 11:10:04 -08:00
Ameer Hamza 2f2d6bece8 zed: unclean disk attachment faults the vdev
If the attached disk already contains a vdev GUID, it
means the disk is not clean. In such a scenario, the
physical path would be a match that makes the disk
faulted when trying to online it. So, we would only
want to proceed if either GUID matches with the last
attached disk or the disk is in a clean state.

Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
2023-01-05 11:09:36 -08:00
Ryan Moeller fbbc375d43 FreeBSD: Fix potential boot panic with bad label
vdev_geom_read_pool_label() can leave NULL in configs.  Check for it
and skip consistently when generating rootconf.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #14291
(cherry picked from commit dc8c2f6158)
2023-01-05 11:00:09 -08:00
Rich Ercolani e84a2ed7a8 Add workaround for broken Linux pipes
Linux has an unresolved hang if you resize a pipe with bytes
in it.

Since there's no obvious way to detect this happening, added a
workaround to disable resizing the pipe buffer if you set an
environment variable.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #13309
2023-01-05 10:47:25 -08:00
Ryan Moeller f28c7302cb initramfs: Fix legacy mountpoint rootfs
Legacy mountpoint datasets should not pass `-o zfsutil` to `mount.zfs`.
Fix the logic in `mount_fs()` to not forget we have a legacy mountpoint
when checking for an `org.zol:mountpoint` userprop.

Reviewed-by: Richard Yao <ryao@gentoo.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #14274
(cherry picked from commit 786ff6a6cb)
2022-12-13 17:33:33 -08:00
szubersk 4767037bcf vdev_raidz_math_aarch64_neonx2.c: suppress diagnostic only for GCC
Signed-off-by: szubersk <szuberskidamian@gmail.com>
2022-12-09 12:07:38 -08:00
szubersk d50ce5c9ec tests: mkfile: usage: () -> (void)
Signed-off-by: szubersk <szuberskidamian@gmail.com>
2022-12-09 12:07:38 -08:00
szubersk 05732da4d1 Use Ubuntu 20.04 and remove Ubuntu 18.04 from workflows
- `ubuntu-latest` now resolves to `ubuntu-22.04`. Explicit pinning
  is needed.

- cherry-pick #14238

Signed-off-by: szubersk <szuberskidamian@gmail.com>
2022-12-09 10:57:10 -08:00
Savyasachee Jha 8f7826f73b dracut: skip zfsexpandknoweldge when zfs_devs is present in dracut
PR 1711 (https://github.com/dracutdevs/dracut/pull/1711) adds a zfs_devs
function to dracut to detect the physical devices backing zfs pools. If
this function exists in the version of dracut this module is being
called from, then it does not need to run.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Signed-off-by: Savyasachee Jha <hi@savyasacheejha.com>
Closes #13121
2022-12-09 10:42:46 -08:00
Tony Hutter 21bd766133 Tag zfs-2.1.7
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2022-12-01 12:39:45 -08:00
Tony Hutter 7819b12f2c zfs-2.1.7: Use ubuntu-20.04 for zloop and sanity builders
The zfs-2.1.7 branch is still using the older 'python-dev'
package names rather than the newer 'python3-dev' packages that
are required for 'ubuntu-latest'.  Use 'ubuntu-20.04' instead of
'ubuntu-latest' to get around this.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2022-12-01 12:39:45 -08:00
George Amanakis c8d2ab05e1 Fix setting the large_block feature after receiving a snapshot
We are not allowed to dirty a filesystem when done receiving
a snapshot. In this case the flag SPA_FEATURE_LARGE_BLOCKS will
not be set on that filesystem since the filesystem is not on
dp_dirty_datasets, and a subsequent encrypted raw send will fail.
Fix this by checking in dsl_dataset_snapshot_sync_impl() if the feature
needs to be activated and do so if appropriate.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #13699
Closes #13782
2022-12-01 12:39:45 -08:00
Damian Szuberski 2c50512ad2 Make autodetection disable pyzfs for kernel/srpm configurations
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes #13394
Closes #14178
2022-12-01 12:39:44 -08:00
Brooks Davis c4468a70c3 Don't leak packed recieved proprties
When local properties (e.g., from -o and -x) are provided, don't leak
the packed representation of the received properties due to variable
reuse.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Brooks Davis <brooks.davis@sri.com>
Closes #14197
2022-12-01 12:39:44 -08:00
Richard Yao e48aaef89f Fix NULL pointer dereference in dbuf_prefetch_indirect_done()
When ZFS is built with assertions, a prefetch is done on a redacted
blkptr and `dpa->dpa_dnode` is NULL, we will have a NULL pointer
dereference in `dbuf_prefetch_indirect_done()`.

Both Coverity and Clang's Static Analyzer caught this.

Reported-by: Coverity (CID 1524671)
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14210
2022-12-01 12:39:44 -08:00
Richard Yao 0e3abd2994 Lua: Fix bad bitshift in lua_strx2number()
The port of lua to OpenZFS modified lua to use int64_t for numbers
instead of double. As part of this, a function for calculating
exponentiation was replaced with a bit shift. Unfortunately, it did not
handle negative values. Also, it only supported exponents numbers with
7 digits before before overflow. This supports exponents up to 15 digits
before overflow.

Clang's static analyzer reported this as "Result of operation is garbage
or undefined" because the exponent was negative.

Reviewed-by: Damian Szuberski <szuberskidamian@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14204
2022-12-01 12:39:44 -08:00
Damian Szuberski 3d1e808096 Fix clang 13 compilation errors
```
os/linux/zfs/zvol_os.c:1111:3: error: ignoring return value of function
  declared with 'warn_unused_result' attribute [-Werror,-Wunused-result]
                add_disk(zv->zv_zso->zvo_disk);
                ^~~~~~~~ ~~~~~~~~~~~~~~~~~~~~

zpl_xattr.c:1579:1: warning: no previous prototype for function
  'zpl_posix_acl_release_impl' [-Wmissing-prototypes]
```

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes #13551
(cherry picked from commit 9884319666)
2022-12-01 12:39:44 -08:00
наб 108c07c655 Remove final K&R definitions
Clang trunk now warns -Wstrict-prototypes on this, and they're removed
in C2x

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13447
2022-12-01 12:39:44 -08:00
наб 32f7499acf module: zfs: vdev_removal: remove unused num_indirect
Found with -Wunused-but-set-variable on Clang trunk

Fixes: a1d477c24c ("OpenZFS 7614, 9064 - zfs device evacuation/removal")
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13304
2022-12-01 12:39:44 -08:00
наб 670d66e7a0 tests: cmd: draid: remove unused and undocumented -v
Found with -Wunused-but-set-variable on Clang trunk

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13304
2022-12-01 12:39:44 -08:00
наб ad0379bf0e linux: libspl: zone: () -> (void)
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12968
2022-12-01 12:39:44 -08:00
Laura Hild 2662b8e72b Correct multipathd.target to .service
https://github.com/openzfs/zfs/pull/9863 says it "orders
zfs-import-cache.service and zfs-import-scan.service after
multipathd.service" but the commit (79add96) actually
ordered them after .target.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Laura Hild <lsh@jlab.org>
Closes #12709
Closes #14171
2022-12-01 12:39:44 -08:00
Rich Ercolani fa7d572a8a Handle and detect #13709's unlock regression (#14161)
In #13709, as in #11294 before it, it turns out that 63a26454 still had
the same failure mode as when it was first landed as d1d47691, and
fails to unlock certain datasets that formerly worked.

Rather than reverting it again, let's add handling to just throw out
the accounting metadata that failed to unlock when that happens, as
well as a test with a pre-broken pool image to ensure that we never get
bitten by this again.

Fixes: #13709

Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
2022-12-01 12:39:43 -08:00
shodanshok d9de079a4b Fix arc_p aggressive increase
The original ARC paper called for an initial 50/50 MRU/MFU split
and this is accounted in various places where arc_p = arc_c >> 1,
with further adjustment based on ghost lists size/hit. However, in
current code both arc_adapt() and arc_get_data_impl() aggressively
grow arc_p until arc_c is reached, causing unneeded pressure on
MFU and greatly reducing its scan-resistance until ghost list
adjustments kick in.

This patch restores the original behavior of initially having arc_p
as 1/2 of total ARC, without preventing MRU to use up to 100% total
ARC when MFU is empty.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gionatan Danti <g.danti@assyoma.it>
Closes #14137
Closes #14120
2022-12-01 12:39:43 -08:00
Richard Yao 957c3776f2 FreeBSD: Fix out of bounds read in zfs_ioctl_ozfs_to_legacy()
There is an off by 1 error in the check. Fortunately, this function does
not appear to be used in kernel space, despite being compiled as part of
the kernel module. However, it is used in userspace. Callers of
lzc_ioctl_fd() likely will crash if they attempt to use the
unimplemented request number.

This was reported by FreeBSD's coverity scan.

Reported-by: Coverity (CID 1432059)
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Damian Szuberski <szuberskidamian@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14135
2022-12-01 12:39:43 -08:00
Serapheim Dimitropoulos 85537f77a3 Expose zfs_vdev_open_timeout_ms as a tunable
Some of our customers have been occasionally hitting zfs import failures
in Linux because udevd doesn't create the by-id symbolic links in time
for zpool import to use them. The main issue is that the
systemd-udev-settle.service that zfs-import-cache.service and other
services depend on is racy. There is also an openzfs issue filed (see
https://github.com/openzfs/zfs/issues/10891) outlining the problem and
potential solutions.

With the proper solutions being significant in terms of complexity and
the priority of the issue being low for the time being, this patch
exposes `zfs_vdev_open_timeout_ms` as a tunable so people that are
experiencing this issue often can increase it as a workaround.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Don Brady <don.brady@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes #14133
2022-12-01 12:39:43 -08:00
Brooks Davis 5f53a444b3 Remove an unused variable
Clang-16 detects this set-but-unused variable which is assigned and
incremented, but never referenced otherwise.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Brooks Davis <brooks.davis@sri.com>
Closes #14125
2022-12-01 12:39:43 -08:00
Brooks Davis 572bd18c1f Make 1-bit bitfields unsigned
This fixes -Wsingle-bit-bitfield-constant-conversion warning from
clang-16 like:

lib/libzfs/libzfs_dataset.c:4529:19: error: implicit truncation
  from 'int' to a one-bit wide bit-field changes value from
  1 to -1 [-Werror,-Wsingle-bit-bitfield-constant-conversion]
                flags.nounmount = B_TRUE;
				^ ~~~~~~

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Brooks Davis <brooks.davis@sri.com>
Closes #14125
2022-12-01 12:39:43 -08:00
Richard Yao 256b74d0b0 Address warnings about possible division by zero from clangsa
* The complaint in ztest_replay_write() is only possible if something
   went horribly wrong. An assertion will silence this and if it goes
   off, we will know that something is wrong.
 * The complaint in spa_estimate_metaslabs_to_flush() is not impossible,
   but seems very unlikely. We resolve this by passing the value from
   the `MIN()` that does not go to infinity when the variable is zero.

There was a third report from Clang's scan-build, but that was a
definite false positive and disappeared when checked again through
Clang's static analyzer with Z3 refution via CodeChecker.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14124
2022-12-01 12:39:43 -08:00
Allan Jude ac01b876c9 Avoid null pointer dereference in dsl_fs_ss_limit_check()
Check for cr == NULL before dereferencing it in
dsl_enforce_ds_ss_limits() to lookup the zone/jail ID.

Reported-by: Coverity (CID 1210459)
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Closes #14103
2022-12-01 12:39:43 -08:00
Richard Yao e9a8fb17b5 Fix too few arguments to formatting function
CodeQL reported that when the VERIFY3U condition is false, we do not
pass enough arguments to `spl_panic()`. This is because the format
string from `snprintf()` was concatenated into the format string for
`spl_panic()`, which causes us to have an unexpected format specifier.

A CodeQL developer suggested fixing the macro to have a `%s` format
string that takes a stringified RIGHT argument, which would fix this.
However, upon inspection, the VERIFY3U check was never necessary in the
first place, so we remove it in favor of just calling `snprintf()`.

Lastly, it is interesting that every other static analyzer run on the
codebase did not catch this, including some that made an effort to catch
such things. Presumably, all of them relied on header annotations, which
we have not yet done on `spl_panic()`. CodeQL apparently is able to
track the flow of arguments on their way to annotated functions, which
llowed it to catch this when others did not. A future patch that I have
in development should annotate `spl_panic()`, so the others will catch
this too.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14098
2022-12-01 12:39:43 -08:00
Pavel Snajdr 52e658edd7 Remove zpl_revalidate: fix snapshot rollback
Open files, which aren't present in the snapshot, which is being
roll-backed to, need to disappear from the visible VFS image of
the dataset.

Kernel provides d_drop function to drop invalid entry from
the dcache, but inode can be referenced by dentry multiple dentries.

The introduced zpl_d_drop_aliases function walks and invalidates
all aliases of an inode.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Pavel Snajdr <snajpa@snajpa.net>
Closes #9600
Closes #14070
2022-12-01 12:39:42 -08:00
Richard Yao 4c59fde1f5 Fix theoretical use of uninitialized values
Clang's static analyzer complains about this.

In get_configs(), if we have an invalid configuration that has no top
level vdevs, we can read a couple of uninitialized variables. Aborting
upon seeing this would break the userland tools for healthy pools, so we
instead initialize the two variables to 0 to allow the userland tools to
continue functioning for the pools with valid configurations.

In zfs_do_wait(), if no wait activities are enabled, we read an
uninitialized error variable.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14043
2022-12-01 12:39:42 -08:00
Richard Yao 3830858c5c Fix memory leaks in dmu_send()/dmu_send_obj()
If we encounter an EXDEV error when using the redacted snapshots
feature, the memory used by dspp.fromredactsnaps is leaked.

Clang's static analyzer caught this during an experiment in which I had
annotated various headers in an attempt to improve the results of static
analysis.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13973
2022-12-01 12:39:42 -08:00
Richard Yao af2e53f62c Fix possible NULL pointer dereference in sha2_mac_init()
If mechanism->cm_param is NULL, passing mechanism to
PROV_SHA2_GET_DIGEST_LEN() will dereference a NULL pointer.

Coverity reported this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14044
2022-12-01 12:39:42 -08:00
Richard Yao 89c41f3979 set_global_var() should not pass NULL pointers to dlclose()
Both Coverity and Clang's static analyzer caught this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14044
2022-12-01 12:39:42 -08:00
Richard Yao 409c99a1d3 Fix NULL pointer dereference in spa_open_common()
Calling spa_open() will pass a NULL pointer to spa_open_common()'s
config parameter. Under the right circumstances, we will dereference the
config parameter without doing a NULL check.

Clang's static analyzer found this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14044
2022-12-01 12:39:42 -08:00
Richard Yao bbec0e60a8 Fix NULL pointer passed to strlcpy from zap_lookup_impl()
Clang's static analyzer pointed out that whenever zap_lookup_by_dnode()
is called, we have the following stack where strlcpy() is passed a NULL
pointer for realname from zap_lookup_by_dnode():

strlcpy()
zap_lookup_impl()
zap_lookup_norm_by_dnode()
zap_lookup_by_dnode()

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14044
2022-12-01 12:39:42 -08:00
Richard Yao a5f17a94d3 fm_fmri_hc_create() must call va_end() before returning
clang-tidy caught this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14044
2022-12-01 12:39:42 -08:00
Richard Yao 5eaad8bdb5 Fix NULL pointer dereference in zdb
Clang's static analyzer complained that we dereference a NULL pointer in
dump_path() if we return 0 when there is an error.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14044
2022-12-01 12:39:42 -08:00
Richard Yao 4351d18fb0 ZED: Fix uninitialized value reads
Coverity complained about a couple of uninitialized value reads in ZED.

 * zfs_deliver_dle() can pass an uninitialized string to zed_log_msg()
 * An uninitialized sev.sigev_signo is passed to timer_create()

The former would log garbage while the latter is not a real issue, but
we might as well suppress it by initializing the field to 0 for
consistency's sake.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14047
2022-12-01 12:39:41 -08:00
Richard Yao 2453f90350 Fix theoretical array overflow in lua_typename()
Out of the 12 defects in lua that coverity reports, 5 of them involve
`lua_typename()` and out of the dozens of defects in ZFS that lua
reports, 3 of them involve `lua_typename()` due to the ZCP code. Given
all of the uses of `lua_typename()` in the ZCP code, I was surprised
that there were not more. It appears that only 2 were reported because
only 3 called `lua_type()`, which does a defective sanity check that
allows invalid types to be passed.

lua/lua@d4fb848be7 addressed this in
upstream lua 5.3. Unfortunately, we did not get that fix since we use
lua 5.2 and we do not have assertions enabled in lua, so the upstream
solution would not do anything.

While we could adopt the upstream solution and enable assertions, a
simpler solution is to fix the issue by making `lua_typename()` return
`internal_type_error` whenever it is called with an invalid type. This
avoids the array overflow and if we ever see it appear somewhere, we
will know there is a problem with the lua interpreter.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13947
2022-12-01 12:39:41 -08:00
Richard Yao d016ca1a92 Fix potential NULL pointer dereference in lzc_ioctl()
Users are allowed to pass NULL to resultp, but we unconditionally assume
that they never do. When an external user does pass NULL to resultp, we
dereference a NULL pointer.

Clang's static analyzer complained about this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14008
2022-12-01 12:39:41 -08:00
Richard Yao d05f247aec scripts/enum-extract.pl should not hard code perl path
This is a portability issue. The issue had already been fixed for
scripts/cstyle.pl by 2dbf1bf829.
scripts/enum-extract.pl was added to the repository the following year
without this portability fix.

Michael Bishop informed me that this broke his attempt to build ZFS
2.1.6 on NixOS, since he was building manually outside of their package
manager (that usually rewrites the shebangs to NixOS' unusual paths).
NixOS puts all of the paths into $PATH, so scripts that portably rely
on env to find the interpreter still work.

Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14012
2022-12-01 12:39:41 -08:00
Richard Yao fa74250cd3 PAM: Fix unchecked return value from zfs_key_config_load()
9a49c6b782 was intended to fix this issue,
but I had missed the case in pam_sm_open_session(). Clang's static
analyzer had not reported it and I forgot to look for other cases.

Interestingly, GCC gcc-12.1.1_p20220625's static analyzer had caught
this as multiple double-free bugs, since another failure after the
failure in zfs_key_config_load() will cause us to attempt to free the
memory that zfs_key_config_load() was supposed to allocate, but had
cleaned up upon failure.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13978
2022-12-01 12:39:41 -08:00
Richard Yao c562bbefc0 Fix potential NULL pointer dereference in dsl_dataset_promote_check()
If the `list_head()` returns NULL, we dereference it, right before we
check to see if it returned NULL.

We have defined two different pointers that both point to the same
thing, which are `origin_head` and `origin_ds`. Almost everything uses
`origin_ds`, so we switch them to use `origin_ds`.

We also promote `origin_ds` to a const pointer so that the compiler
verifies that nothing modifies it.

Coverity complained about this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13967
2022-12-01 12:39:41 -08:00
Richard Yao d4df36de5d Fix unreachable code in zstreamdump
82226e4f44 was intended to prevent a
warning from being printed in situations where it was inappropriate, but
accidentally disabled it entirely by setting featureflags in the wrong
case statement.

Coverity reported this as dead code.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13946
2022-12-01 12:39:41 -08:00
Richard Yao 531361114b PAM: Fix uninitialized value read
Clang's static analyzer found that config.uid is uninitialized when
zfs_key_config_load() returns an error.

Oddly, this was not included in the unchecked return values that
Coverity found.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13957
2022-12-01 12:39:41 -08:00
Richard Yao e11c4327f1 set_global_var_parse_kv() should pass the pointer from strdup()
A comment says that the caller should free k_out, but the pointer passed
via k_out is not the same pointer we received from strdup(). Instead,
it is a pointer into the region we received from strdup(). The free
function should always be called with the original pointer, so this is
likely a bug.

We solve this by calling `strdup()` a second time and then freeing the
original pointer.

Coverity reported this as a memory leak.

Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13867
2022-12-01 12:39:41 -08:00
Richard Yao fbe150fe5b Call va_end() before return in zpool_standard_error_fmt()
Commit ecd6cf800b63704be73fb264c3f5b6e0dafc068d by marks in OpenSolaris
at Tue Jun 26 07:44:24 2007 -0700 introduced a bug where we fail to call
`va_end()` before returning.

The man page for va_start() says:

"Each invocation of va_start() must be matched by a corresponding
invocation of va_end() in the same function."

Coverity complained about this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Chunwei Chen <david.chen@nutanix.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13904
2022-12-01 12:39:41 -08:00
Richard Yao 1ff8f41851 Fix potential NULL pointer dereference in zfsdle_vdev_online()
Coverity complained about this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Chunwei Chen <david.chen@nutanix.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13903
2022-12-01 12:39:40 -08:00
Richard Yao c6d93d0a80 FreeBSD: Fix uninitialized pointer read in spa_import_rootpool()
The FreeBSD project's coverity scans found this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13923
2022-12-01 12:39:40 -08:00
Richard Yao 9f1691a964 Linux: Fix use-after-free in zfsvfs_create()
Coverity reported that we pass a pointer to zfsvfs to
`dmu_objset_disown()` after freeing zfsvfs in zfsvfs_create_impl() after
a failure in zfsvfs_init().

We have nearly identical duplicate versions of this code for FreeBSD and
Linux, but interestingly, the FreeBSD version of this code differs in
such a way that it does not suffer from this bug. We remove the
difference from the FreeBSD version to fix this bug.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13883
2022-12-01 12:39:40 -08:00
Richard Yao 12b859c970 Fix null pointer dereferences in PAM
Coverity caught these.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13889
2022-12-01 12:39:40 -08:00
наб 39a39b8ab9 Handle ECKSUM as new EZFS_CKSUM ‒ "insufficient replicas"
Add a meaningful error message for ECKSUM to common error messages.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #6805
Closes #13808
Closes #13898
2022-12-01 12:39:40 -08:00
Richard Yao 1d5e569a69 Fix use-after-free bugs in icp code
These were reported by Coverity as "Read from pointer after free" bugs.
Presumably, it did not report it as a use-after-free bug because it does
not understand the inline assembly that implements the atomic
instruction.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13881
2022-12-01 12:39:40 -08:00
Richard Yao 3f380df778 Remove incorrect free() in zfs_get_pci_slots_sys_path()
Coverity found this. We attempted to free tmp, which is a pointer to a
string that should be freed by the caller.

Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13864
2022-12-01 12:39:40 -08:00
Richard Yao b247d47be1 Cleanup: Make memory barrier definitions consistent across kernels
We inherited membar_consumer() and membar_producer() from OpenSolaris,
but we had replaced membar_consumer() with Linux's smp_rmb() in
zfs_ioctl.c. The FreeBSD SPL consequently implemented a shim for the
Linux-only smp_rmb().

We reinstate membar_consumer() in platform independent code and fix the
FreeBSD SPL to implement membar_consumer() in a way analogous to Linux.

Reviewed-by: Konstantin Belousov <kib@FreeBSD.org>
Reviewed-by: Mateusz Guzik <mjguzik@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13843
2022-12-01 12:39:40 -08:00
Richard Yao 792825724b zpool_load_compat() should create strings of length ZFS_MAXPROPLEN
Otherwise, `strlcat()` can overflow them.

Coverity found this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13866
2022-12-01 12:39:40 -08:00
Alexander Lobakin ab22031d79 icp: fix all !ENDBR objtool warnings in x86 Asm code
Currently, only Blake3 x86 Asm code has signs of being ENDBR-aware.
At least, under certain conditions it includes some header file and
uses some custom macro from there.
Linux has its own NOENDBR since several releases ago. It's defined
in the same <asm/linkage.h>, so currently <sys/asm_linkage.h>
already is provided with it.

Let's unify those two into one %ENDBR macro. At first, check if it's
present already. If so -- use Linux kernel version. Otherwise, try
to go that second way and use %_CET_ENDBR from <cet.h> if available.
If no, fall back to just empty definition.
This fixes a couple more 'relocations to !ENDBR' across the module.
And now that we always have the latest/actual ENDBR definition, use
it at the entrance of the few corresponding functions that objtool
still complains about. This matches the way how it's used in the
upstream x86 core Asm code.

Reviewed-by: Attila Fülöp <attila@fueloep.org>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Closes #14035
2022-12-01 12:39:39 -08:00
Alexander Lobakin 33bc03dea7 icp: fix rodata being marked as text in x86 Asm code
objtool properly complains that it can't decode some of the
instructions from ICP x86 Asm code. As mentioned in the Makefile,
where those object files were excluded from objtool check (but they
can still be visible under IBT and LTO), those are just constants,
not code.
In that case, they must be placed in .rodata, so they won't be
marked as "allocatable, executable" (ax) in EFL headers and this
effectively prevents objtool from trying to decode this data. That
reveals a whole bunch of other issues in ICP Asm code, as previously
objtool was bailing out after that warning message.

Reviewed-by: Attila Fülöp <attila@fueloep.org>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Closes #14035

Conflicts:
	module/Kbuild.in
2022-11-30 10:15:58 -08:00
Alexander Lobakin ee93cbc9d4 icp: properly fix all RETs in x86_64 Asm code
Commit 43569ee374 ("Fix objtool: missing int3 after ret warning")
addressed replacing all `ret`s in x86 asm code to a macro in the
Linux kernel in order to enable SLS. That was done by copying the
upstream macro definitions and fixed objtool complaints.
Since then, several more mitigations were introduced, including
Rethunk. It requires to have a jump to one of the thunks in order
to work, so the RET macro was changed again. And, as ZFS code
didn't use the mainline defition, but copied it, this is currently
missing.

Objtool reminds about it time to time (Clang 16, CONFIG_RETHUNK=y):

fs/zfs/lua/zlua.o: warning: objtool: setjmp+0x25: 'naked' return
 found in RETHUNK build
fs/zfs/lua/zlua.o: warning: objtool: longjmp+0x27: 'naked' return
 found in RETHUNK build

Do it the following way:
* if we're building under Linux, unconditionally include
  <linux/linkage.h> in the related files. It is available in x86
  sources since even pre-2.6 times, so doesn't need any conftests;
* then, if RET macro is available, it will be used directly, so that
  we will always have the version actual to the kernel we build;
* if there's no such macro, we define it as a simple `ret`, as it
  was on pre-SLS times.

This ensures we always have the up-to-date definition with no need
to update it manually, and at the same time is safe for the whole
variety of kernels ZFS module supports.
Then, there's a couple more "naked" rets left in the code, they're
just defined as:

	.byte 0xf3,0xc3

In fact, this is just:

	rep ret

`rep ret` instead of just `ret` seems to mitigate performance issues
on some old AMD processors and most likely makes no sense as of
today.
Anyways, address those rets, so that they will be protected with
Rethunk and SLS. Include <sys/asm_linkage.h> here which now always
has RET definition and replace those constructs with just RET.
This wipes the last couple of places with unpatched rets objtool's
been complaining about.

Reviewed-by: Attila Fülöp <attila@fueloep.org>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Closes #14035
2022-11-30 10:15:58 -08:00
Ryan Moeller 1d9aa838ed libzfs recv: Check if user prop before inheritable
User props trigger an assert in zfs_prop_inheritable(), we must check
if the prop is a user prop first.

Signed-off-by: Ryan Moeller <ryan@iXsystems.com>

Backported as snippit from:
63652e1 Add --enable-asan and --enable-ubsan switches
2022-11-30 10:13:23 -08:00
Damian Szuberski 0f4ee295ba dsl_prop_known_index(): check for invalid prop
Resolve UBSAN array-index-out-of-bounds error in zprop_desc_t.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes #14142
Closes #14147
2022-11-08 10:16:21 -08:00
Ameer Hamza 8c0684d326 zed: Avoid core dump if wholedisk property does not exist
zed aborts and dumps core in vdev_whole_disk_from_config() if
wholedisk property does not exist. make_leaf_vdev() adds the
property but there may be already pools that don't have the
wholedisk in the label.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #14062
2022-11-08 10:10:05 -08:00
Ameer Hamza ca3a675c74 zed: Prevent special vdev to be replaced by hot spare
Special vdevs should not be replaced by a hot spare.
Log vdevs already support this, extending the
functionality for special vdevs.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #14129
2022-11-07 13:36:57 -08:00
Attila Fülöp cd1f023846 Deny receiving into encrypted datasets if the keys are not loaded (#14139)
Commit 68ddc06b61 introduced support
for receiving unencrypted datasets as children of encrypted ones but
unfortunately got the logic upside down. This resulted in failing to
deny receives of incremental sends into encrypted datasets without
their keys loaded. If receiving a filesystem, the receive was done
into a newly created unencrypted child dataset of the target. In
case of volumes the receive made the target volume undeletable since
a dataset was created below it, which we obviously can't handle.
Incremental streams with embedded blocks are affected as well.

We fix the broken logic to properly deny receives in such cases.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes #13598
Closes #14055
Closes #14119
2022-11-04 11:07:29 -07:00
Ryan Moeller b27c7a1457 zil: Relax assertion in zil_parse
Rather than panic debug builds when we fail to parse a whole ZIL, let's
instead improve the logging of errors and continue like in a release
build.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #14116
2022-11-01 12:49:14 -07:00
Mariusz Zaborski 186e39f336 quota: extend quota for dataset
This patch relax the quota limitation for dataset by around 3%.
What this means is that user can write more data then the quota is
set to. However thanks to that we can get more stable bandwidth, in
case when we are overwriting data in-place, and not consuming any
additional space.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Signed-off-by: Mariusz Zaborski <oshogbo@vexillium.org>
Sponsored-by: Zededa Inc.
Sponsored-by: Klara Inc.
Closes #13839
2022-11-01 12:48:37 -07:00
shodanshok 1d2b0563f7 Fix ARC target collapse when zfs_arc_meta_limit_percent=100
Reclaim metadata when arc_available_memory < 0 even if
meta_used is not bigger than arc_meta_limit.

As described in https://github.com/openzfs/zfs/issues/14054 if
zfs_arc_meta_limit_percent=100 then ARC target can collapse to
arc_min due to arc_purge not freeing any metadata.

This patch lets arc_prune to do its work when arc_available_memory
is negative even if meta_used is not bigger than arc_meta_limit,
avoiding ARC target collapse.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gionatan Danti <g.danti@assyoma.it>
Closes #14054 
Closes #14093
2022-11-01 12:48:30 -07:00
vaclavskala 8929355b4c Propagate extent_bytes change to autotrim thread
The autotrim thread only reads zfs_trim_extent_bytes_min and
zfs_trim_extent_bytes_max variable only on thread start.  We
should check for parameter changes during thread execution to
allow parameter changes take effect without needing to disable
then restart the autotrim.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Václav Skála <skala@vshosting.cz>
Closes #14077
2022-11-01 12:48:23 -07:00
Coleman Kane 212ba9bd97 Linux 6.1 compat: change order of sys/mutex.h includes
After Linux 6.1-rc1 came out, the build started failing to build a
couple of the files in the linux spl code due to the mutex_init
redefinition. Moving the sys/mutex.h include to a lower position within
these two files appears to fix the problem.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #14040
2022-11-01 12:44:56 -07:00
Brian Behlendorf 7ce097c874 Linux 6.0 compat: META
Update the META file to reflect compatibility with the 6.0 kernel.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #14091
2022-11-01 12:43:49 -07:00
Alexander 3e767e34bd Linux compat: fix DECLARE_EVENT_CLASS() test when ZFS is built-in
ZFS_LINUX_TRY_COMPILE_HEADER macro doesn't take CONFIG_ZFS=y into
account. As a result, on several latest Linux versions, configure
script marks DECLARE_EVENT_CLASS() available for non-GPL when ZFS
is being built as a module, but marks it unavailable when ZFS is
built-in.
Follow the logic of the neighbor macros and adjust
ZFS_LINUX_TRY_COMPILE_HEADER accordingly, so that it doesn't try
to look for a .ko when ZFS is built-in.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Closes #14006
2022-11-01 12:43:29 -07:00
Christian Schwarz df000276b8 zfs_domount: fix double-disown of dataset / double-free of zfsvfs_t
Before this patch, in zfs_domount, if zfs_root or d_make_root fails, we
leave zfsvfs != NULL. This will lead to execution of the error handling
`if` statement at the `out` label, and hence to a call to
dmu_objset_disown and zfsvfs_free.

However, zfs_umount, which we call upon failure of zfs_root and
d_make_root already does dmu_objset_disown and zfsvfs_free.

I suppose this patch rather adds to the brittleness of this part of the
code base, but I don't want to invest more time in this right now.
To add a regression test, we'd need some kind of fault injection
facility for zfs_root or d_make_root, which doesn't exist right now.
And even then, I think that regression test would be too closely tied
to the implementation.

To repro the double-disown / double-free, do the following:
1. patch zfs_root to always return an error
2. mount a ZFS filesystem

Here's the stack trace you would see then:

  VERIFY3(ds->ds_owner == tag) failed (0000000000000000 == ffff9142361e8000)
  PANIC at dsl_dataset.c:1003:dsl_dataset_disown()
  Showing stack for process 28332
  CPU: 2 PID: 28332 Comm: zpool Tainted: G           O      5.10.103-1.nutanix.el7.x86_64 #1
  Call Trace:
   dump_stack+0x74/0x92
   spl_dumpstack+0x29/0x2b [spl]
   spl_panic+0xd4/0xfc [spl]
   dsl_dataset_disown+0xe9/0x150 [zfs]
   dmu_objset_disown+0xd6/0x150 [zfs]
   zfs_domount+0x17b/0x4b0 [zfs]
   zpl_mount+0x174/0x220 [zfs]
   legacy_get_tree+0x2b/0x50
   vfs_get_tree+0x2a/0xc0
   path_mount+0x2fa/0xa70
   do_mount+0x7c/0xa0
   __x64_sys_mount+0x8b/0xe0
   do_syscall_64+0x38/0x50
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Co-authored-by: Christian Schwarz <christian.schwarz@nutanix.com>
Signed-off-by: Christian Schwarz <christian.schwarz@nutanix.com>
Closes #14025
2022-11-01 12:42:32 -07:00
Richard Yao 7a1b6c51d0 Linux: Remove ZFS_AC_KERNEL_SRC_MODULE_PARAM_CALL_CONST autotools check
On older kernels, the definition for `module_param_call()` typecasts
function pointers to `(void *)`, which triggers -Werror, causing the
check to return false when it should return true.

Fixing this breaks the build process on some older kernels because they
define a `__check_old_set_param()` function in their headers that checks
for a non-constified `->set()`. We workaround that through the c
preprocessor by defining `__check_old_set_param(set)` to `(set)`, which
prevents the build failures.

However, it is now apparent that all kernels that we support have
adopted the GRSecurity change, so there is no need to have an explicit
autotools check for it anymore. We therefore remove the autotools check,
while adding the workaround to our headers for the build time
non-constified `->set()` check done by older kernel headers.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13984
Closes #14004
2022-11-01 12:42:01 -07:00
George Melikov 4dd9c3b08e CI: bump actions/upload-artifact to v3
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #14018
2022-11-01 12:38:22 -07:00
George Melikov 1bbc09e1f7 CI: bump actions/checkout to v3
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #14018
2022-11-01 12:38:09 -07:00
Serapheim Dimitropoulos 37d5a3e04b Stop ganging due to past vdev write errors
= Problem

While examining a customer's system we noticed unreasonable space
usage from a few snapshots due to gang blocks. Under some further
analysis we discovered that the pool would create gang blocks because
all its disks had non-zero write error counts and they'd be skipped
for normal metaslab allocations due to the following if-clause in
`metaslab_alloc_dva()`:
```
	/*
	 * Avoid writing single-copy data to a failing,
	 * non-redundant vdev, unless we've already tried all
	 * other vdevs.
	 */
	if ((vd->vdev_stat.vs_write_errors > 0 ||
	    vd->vdev_state < VDEV_STATE_HEALTHY) &&
	    d == 0 && !try_hard && vd->vdev_children == 0) {
		metaslab_trace_add(zal, mg, NULL, psize, d,
		    TRACE_VDEV_ERROR, allocator);
		goto next;
	}
```

= Proposed Solution

Get rid of the predicate in the if-clause that checks the past
write errors of the selected vdev. We still try to allocate from
HEALTHY vdevs anyway by checking vdev_state so the past write
errors doesn't seem to help us (quite the opposite - it can cause
issues in long-lived pools like the one from our customer).

= Testing

I first created a pool with 3 vdevs:
```
$ zpool list -v volpool
NAME        SIZE  ALLOC   FREE
volpool    22.5G   117M  22.4G
  xvdb     7.99G  40.2M  7.46G
  xvdc     7.99G  39.1M  7.46G
  xvdd     7.99G  37.8M  7.46G
```

And used `zinject` like so with each one of them:
```
$ sudo zinject -d xvdb -e io -T write -f 0.1 volpool
```

And got the vdevs to the following state:
```
$ zpool status volpool
  pool: volpool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.
...<cropped>..
action: Determine if the device needs to be replaced, and clear the
...<cropped>..
config:

	NAME        STATE     READ WRITE CKSUM
	volpool     ONLINE       0     0     0
	  xvdb      ONLINE       0     1     0
	  xvdc      ONLINE       0     1     0
	  xvdd      ONLINE       0     4     0

```

I also double-checked their write error counters with sdb:
```
sdb> spa volpool | vdev | member vdev_stat.vs_write_errors
(uint64_t)0  # <---- this is the root vdev
(uint64_t)2
(uint64_t)1
(uint64_t)1
```

Then I checked that I the problem was reproduced in my VM as I the
gang count was growing in zdb as I was writting more data:
```
$ sudo zdb volpool | grep gang
        ganged count:              1384

$ sudo zdb volpool | grep gang
        ganged count:              1393

$ sudo zdb volpool | grep gang
        ganged count:              1402

$ sudo zdb volpool | grep gang
        ganged count:              1414
```

Then I updated my bits with this patch and the gang count stayed the
same.

Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes #14003
2022-11-01 12:36:25 -07:00
Serapheim Dimitropoulos 25096e1180 zvol_wait logic may terminate prematurely
Setups that have a lot of zvols may see zvol_wait terminate prematurely
even though the script is still making progress.  For example, we have a
customer that called zvol_wait for ~7100 zvols and by the last iteration
of that script it was still waiting on ~2900. Similarly another one
called zvol_wait for 2200 and by the time the script terminated there
were only 50 left.

This patch adjusts the logic to stay within the outer loop of the script
if we are making any progress whatsoever.

Reviewed-by: George Wilson <gwilson@delphix.com>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Don Brady <don.brady@delphix.com>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes #13998
2022-11-01 12:35:36 -07:00
shodanshok 820edcbf91 Remove ambiguity on demand vs prefetch stats reported by arc_summary
arc_summary currently list prefetch stats as "demand prefetch"
However, a hit/miss can be due to demand or prefetch, not both.
To remove any confusion, this patch removes the "Demand" word
from the affected lines.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Gionatan Danti <g.danti@assyoma.it>
Closes #13985
2022-11-01 12:35:05 -07:00
Serapheim Dimitropoulos 37763ea2a6 Fix panic in dsl_process_sub_livelist for EINTR
= Issue

Recently we hit an assertion panic in `dsl_process_sub_livelist` while
exporting the spa and interrupting `bpobj_iterate_nofree`. In that case
`bpobj_iterate_nofree` stops mid-way returning an EINTR without clearing
the intermediate AVL tree that keeps track of the livelist entries it
has encountered so far. At that point the code has a VERIFY for the
number of elements of the AVL expecting it to be zero (which is not the
case for EINTR).

= Fix

Cleanup any intermediate state before destroying the AVL when
encountering EINTR. Also added a comment documenting the scenario where
the EINTR comes up. There is no need to do anything else for the calles
of `dsl_process_sub_livelist` as they already handle the EINTR case.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes #13939
2022-11-01 12:34:08 -07:00
Mateusz Guzik c8d6a91a99 Bring per_txg_dirty_frees_percent back to 30
The current value causes significant artificial slowdown during mass
parallel file removal, which can be observed both on FreeBSD and Linux
when running real workloads.

Sample results from Linux doing make -j 96 clean after an allyesconfig
modules build:

before: 4.14s user 6.79s system 48% cpu 22.631 total
after:	4.17s user 6.44s system 153% cpu 6.927 total

FreeBSD results in the ticket.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by:	Mateusz Guzik <mjguzik@gmail.com>
Closes #13932
Closes #13938
2022-11-01 12:32:40 -07:00
Akash B 7ac732b8d6 Add options to zfs redundant_metadata property
Currently, additional/extra copies are created for metadata in
addition to the redundancy provided by the pool(mirror/raidz/draid),
due to this 2 times more space is utilized per inode and this decreases
the total number of inodes that can be created in the filesystem. By
setting redundant_metadata to none, no additional copies of metadata
are created, hence can reduce the space consumed by the additional
metadata copies and increase the total number of inodes that can be
created in the filesystem.  Additionally, this can improve file create
performance due to the reduced amount of metadata which needs
to be written.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Dipak Ghosh <dipak.ghosh@hpe.com>
Signed-off-by: Akash B <akash-b@hpe.com>
Closes #13680
2022-11-01 12:25:58 -07:00
Andriy Gapon 04f1983aab FreeBSD: vn_flush_cached_data: observe vnode locking contract
vm_object_page_clean() expects that the associated vnode is locked
as VOP_PUTPAGES() may get called on the vnode.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Andriy Gapon <avg@FreeBSD.org>
Closes #14079
(cherry picked from commit 41133c9794)
2022-10-27 16:14:57 -07:00
Mark Johnston 4e3fecbdfd FreeBSD: Fix a pair of bugs in zfs_fhtovp()
- Add a zfs_exit() call in an error path, otherwise a lock is leaked.
- Remove the fid_gen > 1 check.  That appears to be Linux-specific:
  zfsctl_snapdir_fid() sets fid_gen to 0 or 1 depending on whether the
  snapshot directory is mounted.  On FreeBSD it fails, making snapshot
  dirs inaccessible via NFS.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Andriy Gapon <avg@FreeBSD.org>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Fixes: 43dbf88178 ("FreeBSD: vfsops: use setgen for error case")
Closes #14001
Closes #13974
(cherry picked from commit ed566bf1cd)
2022-10-26 14:59:25 -07:00
samwyc fc1c0053f9 Fix sequential resilver drive failure race condition
This patch handles the race condition on simultaneous failure of
2 drives, which misses the vdev_rebuild_reset_wanted signal in
vdev_rebuild_thread. We retry to catch this inside the
vdev_rebuild_complete_sync function.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Dipak Ghosh <dipak.ghosh@hpe.com>
Reviewed-by: Akash B <akash-b@hpe.com>
Signed-off-by: Samuel Wycliffe J <samwyc@hpe.com>
Closes #14041
Closes #14050
2022-10-21 14:05:06 -07:00
Brian Behlendorf 7795975681 contrib: dracut: zfs-snapshot-bootfs: exit status fix
Correct misplaced `-` is the original backport of #13769.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #13769
2022-10-20 11:37:21 -07:00
gregory-lee-bartholomew 3b935cc3ed contrib: dracut: zfs-{rollback,snapshot}-bootfs: explicit snapname fix
Due to a missing semicolon on the ExecStart line, it wasn't possible
to specify the snapshot name on the bootfs.{rollback,snapshot}
kernel parameters if the boot dataset name was obtained from the
root=zfs:... kernel parameter.

Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gregory Bartholomew <gregory.lee.bartholomew@gmail.com>
Closes #13585
2022-10-20 11:34:59 -07:00
Richard Yao b0bc882395 kcfpool_alloc() should have its argument list marked void
This error occurred when building on Gentoo with debugging enabled:

zfs-kmod-2.1.6/work/zfs-2.1.6/module/icp/core/kcf_sched.c:1277:14:
error: a function declaration without a prototype is deprecated
in all versions of C [-Werror,-Wstrict-prototypes]
  kcfpool_alloc()
               ^
               void
1 error generated.

This function is not present in master.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14023
2022-10-12 15:47:39 -07:00
наб 8cf59e97c4 etc: mask zfs-load-key.service
Otherwise, systemd-sysv-generator will generate a service equivalent
that breaks the boot: under systemd this is covered by
zfs-mount-generator

We already do this for zfs-import.service, and other init scripts are
suppressed automatically by the "actual" .service files

Fixes: commit f04b976200 ("Add init script
 to load keys")
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #14010
Closes #14019
2022-10-12 15:29:21 -07:00
Damian Szuberski 4d22befde6 initramfs: use mount.zfs instead of mount
A followup to d7a67402a8

For `mount -t zfs -o opts ds mp` command line
some implementations of `mount(8)`, e. g. Busybox in Debian
work as follows:

```
newfstatat(AT_FDCWD, "ds", 0x7fff826f4ab0, 0) = -1
mount("ds", "mp", "zfs", MS_SILENT, NULL) = 0
```

The logic above skips completely `mount.zfs` and prevents us
from reading filesystem properties and applying mount options.

For comparison, the coreutils `mount(8)` implementation does:

```
openat(AT_FDCWD, "/proc/filesystems", O_RDONLY|O_CLOEXEC) = 3
// figure out that zfs is a `nodev` filesystem and look for a helper
newfstatat(AT_FDCWD, "/sbin/mount.zfs" ...) = 0
execve("/sbin/mount.zfs" ...) = 0
```

Using `mount.zfs` in initramfs would help circumvent deficiencies
of some of `mount(8)` implementations. `mount -t zfs` translates
to `mount.zfs` invocation, except for cases when explicitly disabled
by `-i`.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes #13305
(cherry picked from commit 35d81a75a8)
2022-10-05 17:01:39 -07:00
Tony Hutter 6a6bd49398 Tag zfs-2.1.6
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2022-09-28 17:25:10 -07:00
Richard Yao 566e908fa0 Fix bad free in skein code
Clang's static analyzer found a bad free caused by skein_mac_atomic().
It will allocate a context on the stack and then pass it to
skein_final(), which attempts to free it. Upon inspection,
skein_digest_atomic() also has the same problem.

These functions were created to match the OpenSolaris ICP API, so I was
curious how we avoided this in other providers and looked at the SHA2
code. It appears that SHA2 has a SHA2Final() helper function that is
called by the exported sha2_mac_final()/sha2_digest_final() as well as
the sha2_mac_atomic() and sha2_digest_atomic() functions. The real work
is done in SHA2Final() while some checks and the free are done in
sha2_mac_final()/sha2_digest_final().

We fix the use after free in the skein code by taking inspiration from
the SHA2 code. We introduce a skein_final_nofree() that does most of the
work, and make skein_final() into a function that calls it and then
frees the memory.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13954
2022-09-28 17:25:10 -07:00
Tony Hutter a2705b1dd5 zpool: Don't print "repairing" on force faulted drives
If you force fault a drive that's resilvering, it's scan stats can get
frozen in time, giving the false impression that it's being resilvered.
This commit checks the vdev state to see if the vdev is healthy before
reporting "resilvering" or "repairing" in zpool status.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #13927
Closes #13930
2022-09-28 12:41:23 -07:00
Mateusz Guzik 63d4838b4a FreeBSD: handle V_PCATCH
See https://cgit.FreeBSD.org/src/commit/?id=a75d1ddd74312f5dd79bc1e965f7077679659f2e

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #13910
2022-09-28 10:35:13 -07:00
Mateusz Guzik eec942cc54 FreeBSD: catch up to 1400068
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #13909
2022-09-28 10:35:13 -07:00
Mateusz Guzik 2c8e3e4b28 FreeBSD: stop passing LK_INTERLOCK to VOP_LOCK
There is an ongoing effort to eliminate this feature.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #13908
2022-09-28 10:35:13 -07:00
Richard Yao 55816c64da FreeBSD: Fix integer conversion for vnlru_free{,_vfsops}()
When reviewing #13875, I noticed that our FreeBSD code has an issue
where it converts from `int64_t` to `int` when calling
`vnlru_free{,_vfsops}()`. The result is that if the int64_t is `1 <<
36`, the int will be 0, since the low bits are 0. Even when some low
bits are set, a value such as `((1 << 36) + 1)` would truncate to 1,
which is wrong.

There is protection against this on 32-bit platforms, but on 64-bit
platforms, there is no check to protect us, so we add a check.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13882
2022-09-28 10:35:13 -07:00
Ryan Moeller 8dcd6af623 FreeBSD: Ignore symlink to i386 includes
A symlink to i386 includes is created in the build dir on amd64 since
freebsd/freebsd-src@d07600c563

Tell git to ignore it like the other include links.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #13719
2022-09-28 10:35:13 -07:00
Richard Yao c973929b29 LUA: Fix CVE-2014-5461
Apply the fix from upstream.

http://www.lua.org/bugs.html#5.2.2-1
https://www.opencve.io/cve/CVE-2014-5461

It should be noted that exploiting this requires the `SYS_CONFIG`
privilege, and anyone with that privilege likely has other opportunities
to do exploits, so it is unlikely that bad actors could exploit this
unless system administrators are executing untrusted ZFS Channel
Programs.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13949
2022-09-27 16:49:02 -07:00
Richard Yao 835e03682c Linux: Fix uninitialized variable usage in zio_do_crypt_data()
Coverity complained about this. An error from `hkdf_sha512()` before uio
initialization will cause pointers to uninitialized memory to be passed
to `zio_crypt_destroy_uio()`. This is a regression that was introduced
by cf63739191. Interestingly, this never
affected FreeBSD, since the FreeBSD version never had that patch ported.
Since moving uio initialization to the top of this function would slow
down the qat_crypt() path, we only move the `memset()` calls to the top
of the function. This is sufficient to fix this problem.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13944
2022-09-27 15:43:26 -07:00
Alexander Motin 33223cbc3c Refactor Log Size Limit
Original Log Size Limit implementation blocked all writes in case of
limit reached until the TXG is committed and the log is freed.  It
caused huge delays and following speed spikes in application writes.

This implementation instead smoothly throttles writes, using exactly
the same mechanism as used for dirty data.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: jxdking <lostking2008@hotmail.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Issue #12284
Closes #13476
2022-09-26 14:55:27 -07:00
Brian Behlendorf 91e02156dd Revert "Reduce dbuf_find() lock contention"
This reverts commit 34dbc618f5.  While this
change resolved the lock contention observed for certain workloads, it
inadventantly reduced the maximum hash inserts/removes per second.  This
appears to be due to the slightly higher acquisition cost of a rwlock vs
a mutex.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
2022-09-21 13:15:51 -07:00
Richard Yao b66f8d3c2b Add zfs_btree_verify_intensity kernel module parameter
I see a few issues in the issue tracker that might be aided by being
able to turn this on. We have no module parameter for it, so I would
like to add one.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13874
2022-09-21 13:15:51 -07:00
Richard Yao 5096ed31c8 Fix incorrect size given to bqueue_enqueue() call in dmu_redact.c
We pass sizeof (struct redact_record *) rather than sizeof (struct
redact_record). Passing the pointer size is wrong.

Coverity caught this in two places.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13885
2022-09-21 13:15:51 -07:00
Ameer Hamza 035e52f591 Delay ZFS_PROP_SHARESMB property to handle it for encrypted raw receive
For encrypted raw receive, objset creation is delayed until a call to
dmu_recv_stream(). ZFS_PROP_SHARESMB property requires objset to be
populated when calling zpl_earlier_version(). To correctly handle the
ZFS_PROP_SHARESMB property for encrypted raw receive, this change
delays setting the property.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #13878
2022-09-21 13:15:26 -07:00
Ameer Hamza d5105f068f zfs recv hangs if max recordsize is less than received recordsize
- Some optimizations for bqueue enqueue/dequeue.
- Added a fix to prevent deadlock when both bqueue_enqueue_impl()
and bqueue_dequeue() waits for signal to be triggered.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #13855
2022-09-21 13:15:26 -07:00
наб faa1e4082d include: move SPA_MINBLOCKSHIFT and zio_encrypt to sys/fs/zfs.h
These are used by userspace, so should live in a public header

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12116
2022-09-21 13:15:26 -07:00
Alexander Motin 44cec45f72 Improve too large physical ashift handling
When iterating through children physical ashifts for vdev, prefer
ones above the maximum logical ashift, that we can actually use,
but within the administrator defined maximum.

When selecting top-level vdev ashift, do not set it to the defined
maximum in case physical ashift is even higher, but just ignore one.
Using the maximum does not prevent misaligned writes, but reduces
space efficiency.  Since ZFS tries to write data sequentially and
aggregates the writes, in many cases large misanigned writes may be
not as bad as the space penalty otherwise.

Allow internal physical ashifts for vdevs higher than SHIFT_MAX.
May be one day allocator or aggregation could benefit from that.

Reduce zfs_vdev_max_auto_ashift default from 16 (64KB) to 14 (16KB),
so that ZFS may still use bigger ashifts up to SHIFT_MAX (64KB),
but only if it really has to or explicitly told to, but not as an
"optimization".

There are some read-intensive NVMe SSDs that report Preferred Write
Alignment of 64KB, and attempt to build RAIDZ2 of those leads to a
space inefficiency that can't be justified.  Instead these changes
make ZFS fall back to logical ashift of 12 (4KB) by default and
only warn user that it may be suboptimal for performance.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes #13798
2022-09-21 13:15:15 -07:00
Rich Ercolani ebbbe01e31 Ask libtool to stop hiding some errors
For #13083, curiously, it did not print the actual error, just
that the compile failed with "Error 1".

In theory, this flag should cause it to report errors twice sometimes.
In practice, I'm pretty okay with reporting some twice if it avoids
reporting some never.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Damian Szuberski <szuberskidamian@gmail.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #13086
2022-09-21 16:12:14 -07:00
Kevin Jin d05f3039f7 Add Module Parameter Regarding Log Size Limit
zfs_wrlog_data_max
The upper limit of TX_WRITE log data. Once it is reached,
write operation is blocked, until log data is cleared out
after txg sync. It only counts TX_WRITE log with WR_COPIED
or WR_NEED_COPY.

Reviewed-by: Prakash Surya <prakash.surya@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: jxdking <lostking2008@hotmail.com>
Closes #12284
2022-09-21 16:12:14 -07:00
Kevin Jin 999830a021 Optimize txg_kick() process (#12274)
Use dp_dirty_pertxg[] for txg_kick(), instead of dp_dirty_total in
original code. Extra parameter "txg" is added for txg_kick(), thus it
knows which txg to kick. Also txg_kick() call is moved from
dsl_pool_need_dirty_delay() to dsl_pool_dirty_space() so that we can
know the txg number assigned for txg_kick().

Some unnecessary code regarding dp_dirty_total in txg_sync_thread() is
also cleaned up.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: jxdking <lostking2008@hotmail.com>
Closes #12274
2022-09-21 16:12:14 -07:00
Ameer Hamza a5b0d42540 zfs recv hangs if max recordsize is less than received recordsize
- Some optimizations for bqueue enqueue/dequeue.
- Added a fix to prevent deadlock when both bqueue_enqueue_impl()
and bqueue_dequeue() waits for signal to be triggered.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #13855
2022-09-19 09:39:07 -07:00
Christian Schwarz cde04badd1 make DMU_OT_IS_METADATA and DMU_OT_IS_ENCRYPTED return B_TRUE or B_FALSE
Without this patch, the

    ASSERT3U(dbuf_is_metadata(db), ==, arc_is_metadata(buf));

at the beginning of dbuf_assign_arcbuf can panic
if the object type is a DMU_OT_NEWTYPE that has
DMU_OT_METADATA set.

While we're at it, fix DMU_OT_IS_ENCRYPTED as well.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Christian Schwarz <christian.schwarz@nutanix.com>
Closes #13842
2022-09-15 16:58:35 -07:00
Richard Yao 3f7c174b50 vdev_draid_lookup_map() should not iterate outside draid_maps
Coverity reported this as an out-of-bounds read.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13865
2022-09-15 16:58:35 -07:00
Akash B 03fa3ef264 Add physical device size to SIZE column in 'zpool list -v'
Add physical device size/capacity only for physical devices in
'zpool list -v' instead of displaying "-" in the SIZE column.
This would make it easier to see the individual device capacity and
to determine which spares are large enough to replace which devices.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Dipak Ghosh <dipak.ghosh@hpe.com>
Signed-off-by: Akash B <akash-b@hpe.com>
Closes #12561
Closes #13106
2022-09-15 10:23:01 -07:00
George Amanakis 8bd3dca9bf Introduce a tunable to exclude special class buffers from L2ARC
Special allocation class or dedup vdevs may have roughly the same
performance as L2ARC vdevs. Introduce a new tunable to exclude those
buffers from being cacheable on L2ARC.

Reviewed-by: Don Brady <don.brady@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #11761
Closes #12285
2022-09-14 11:27:00 -07:00
наб c8f795ba53 config: check for parallel(1), use it for cstyle
Before:
$ time make cstyle
real    0m23.118s
user    0m23.002s
sys     0m0.114s

After:
$ time make cstyle
real    0m4.577s
user    0m31.487s
sys     0m0.699s

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Issue #12899
2022-09-14 11:23:25 -07:00
Tony Hutter 7bbfac9d04 zed: Fix config_sync autoexpand flood
Users were seeing floods of `config_sync` events when autoexpand was
enabled.  This happened because all "disk status change" udev events
invoke the autoexpand codepath, which calls zpool_relabel_disk(),
which in turn cause another "disk status change" event to happen,
in a feedback loop.  Note that "disk status change" happens every time
a user calls close() on a block device.

This commit breaks the feedback loop by only allowing an autoexpand
to happen if the disk actually changed size.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes: #7132
Closes: #7366
Closes #13729
2022-09-14 09:57:44 -07:00
Walter Huf 2010c183bc Add xattr_handler support for Android kernels
Some ARM BSPs run the Android kernel, which has
a modified xattr_handler->get() function signature.
This adds support to compile against these kernels.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Walter Huf <hufman@gmail.com>
Closes #13824
2022-09-14 09:57:37 -07:00
Samuel aa9e887d2a Fix column width in 'zpool iostat -v' and 'zpool list -v'
This commit fixes a minor spacing issue caused when
enumerating vdev names, which originated from #13031

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Akash B <akash-b@hpe.com>
Signed-off-by: Samuel Wycliffe <samuelwycliffe@gmail.com>
Closes #13811
2022-09-14 09:57:05 -07:00
Ryan Moeller 78206a2e44 FreeBSD: Mark ZFS_MODULE_PARAM_CALL as MPSAFE
ZFS_MODULE_PARAM_CALL handlers implement their own locking if needed
and do not require Giant.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #13756
2022-09-13 17:59:15 -07:00
Alexander Motin b6ebf270eb Apply arc_shrink_shift to ARC above arc_c_min
It makes sense to free memory in smaller chunks when approaching
arc_c_min to let other kernel subsystems to free more, since after
that point we can't free anything.  This also matches behavior on
Linux, where to shrinker reported only the size above arc_c_min.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #13794
2022-09-13 17:59:10 -07:00
George Wilson 15b64fbc94 Importing from cachefile can trip assertion
When importing from cachefile, it is possible that the builtin retry
logic will trip an assertion because it also fails to find the pool.
This fix addresses that case and returns the correct error message to
the user.

Reviewed-by: Richard Yao <ryao@gentoo.org>
Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Wilson <gwilson@delphix.com>
Closes #13781
2022-09-13 17:59:04 -07:00
Tony Hutter b1be0a5c15 ZTS: Fix zpool_expand_001_pos
`zpool_expand_001_pos` was often failing due to not seeing autoexpand
commands in the `zpool history`.  During testing, I found this to be
unreliable (sometimes the "online" wouldn't appear in `zpool history`)
and unnecessary, as we could simply check that the pool increased in
size.

This commit revamps the test to check for the expanded pool size
and corresponding new free space.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #13743
2022-09-13 17:58:03 -07:00
Tony Hutter 65f8f92d12 zed: Look for NVMe DEVPATH if no ID_BUS
We tried replacing an NVMe drive using autoreplace, only
to see zed reject it with:

zed[27955]: zed_udev_monitor: /dev/nvme5n1 no devid source

This happened because ZED saw that ID_BUS was not set by udev
for the NVMe drive, and thus didn't think it was "real drive".
This commit allows NVMe drives to be autoreplaced even if
ID_BUS is not set.

Reviewed-by: Don Brady <don.brady@intel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #13512
Closes #13646
2022-09-13 17:51:11 -07:00
Tony Hutter acd7464639 zed: Ignore false 'atari' partitions in autoreplace
libudev will sometimes falsely identify an 'atari' partition on a
blank disk, preventing it from being used in an autoreplace.  This
seems to be a known issue.  The workaround is to just ignore the
fake partition and continue with the autoreplace.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #13497
Closes #13632
2022-09-13 17:51:06 -07:00
Tony Hutter f48d9b4269 rpm: Silence "unversioned Obsoletes" warnings on EL 9
Get rid of RPM warnings on AlmaLinux 9:

"It's not recommended to have unversioned Obsoletes"

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #13584
Closes #13638
2022-09-13 17:50:59 -07:00
Neal Gompa (ニール・ゴンパ) e1b49e3f1d rpm: Use the correct version-release information in dependencies
This tightly links the subpackages together and ensures that everything
is upgraded together.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Neal Gompa <ngompa@datto.com>
Closes #13489
2022-09-13 17:50:42 -07:00
Richard Yao 8131a96544 Fix use-after-free in btree code
Coverty static analysis found these.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #10989
Closes #13861
2022-09-13 16:15:38 -07:00
gregory-lee-bartholomew 979fd5a434 contrib: dracut: zfs-snapshot-bootfs: exit status fix
When the zfs-snapshot-bootfs service attempts to create a snapshot
that already exists, the exit status of the command is non-zero and
the service reports failed to the systemd service manager. This is a
common occurrence if bootfs.snapshot is left set on the kernel command
line and it should not be considered a failure.

This service was originally set to ignore this error by prefixing
the command with - on the ExecStart line, but the leading - appears
to have been dropped in #13359.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gregory Bartholomew <gregory.lee.bartholomew@gmail.com>
Closes #13769
2022-08-12 14:31:51 -07:00
r-ricci 533779f5f2 arcstat: fix -p option
When the -p option is used, a list of floats is passed to sep.join(),
which expects strings. Fix this by converting each value to a string.

Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Roberto Ricci <ricci@disroot.org>
Closes #12916 
Closes #13767
2022-08-12 14:29:24 -07:00
Paul Zuchowski db5fd16f0b Fix problem with zdb_objset_id test.
Use large numbers for datasets with
numeric names to avoid name and id
collisions.

Signed-off-by: Paul Zuchowski <pzuchowski@datto.com>
2022-08-09 11:46:12 -07:00
Coleman Kane e0dbab1a14 Linux 6.0 compat: register_shrinker() now var-arg
The 6.0 kernel added a printf-style var-arg for args > 0 to the
register_shrinker function, in order to add names to shrinkers, in
commit e33c267ab70de4249d22d7eab1cc7d68a889bac2. This enables the
shrinkers to have friendly names exposed in /sys/kernel/debug/shrinker/.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #13748
2022-08-09 09:41:06 -07:00
Brian Behlendorf 4063d7b6b4 Linux 5.20 compat: blk_cleanup_disk()
As of the Linux 5.20 kernel blk_cleanup_disk() has been removed,
all callers should use put_disk().

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13728
2022-08-09 09:41:06 -07:00
Brian Behlendorf 58571ba447 Linux 5.20 compat: bdevname()
As of the Linux 5.20 kernel bdevname() has been removed, all
callers should use snprintf() and the "%pg" format specifier.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13728
2022-08-09 09:41:06 -07:00
Brian Behlendorf 57e1052d33 Linux 5.19 compat: META
Update the META file to reflect compatibility with the 5.19 kernel.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13715
2022-08-09 09:41:06 -07:00
Paul Zuchowski fcbddc7f7c Fix problem with zdb -d
zdb -d <pool>/<objset ID> does not work when
other command line arguments are included i.e.
zdb -U <cachefile> -d <pool>/<objset ID>
This change fixes the command line parsing
to handle this situation.  Also fix issue
where zdb -r <dataset> <file> does not handle
the root <dataset> of the pool. Introduce -N
option to force <objset ID> to be interpreted
as a numeric objsetID.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Paul Zuchowski <pzuchowski@datto.com>
Closes #12845
Closes #12944
2022-08-08 16:56:38 -07:00
Tino Reichardt b06aff105c Fix checkstyle warning: E275 missing whitespace after keyword
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #13710
2022-08-02 10:05:14 -07:00
Rich Ercolani 035ee628cf Revert behavior of 59eab109 on not-Linux
It turns out that short-circuiting the EFAULT behavior on a short read
breaks things on FreeBSD. So until there's a nicer solution, let's
just revert the behavior for not-Linux.

Reference:
https://reviews.freebsd.org/R10:70f51f0e474ffe1fb74cb427423a2fba3637544d

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12698
2022-08-02 10:05:14 -07:00
Rich Ercolani 5c56591b57 Handle partial reads in zfs_read
Currently, dmu_read_uio_dnode can read 64K of a requested 1M in one
loop, get EFAULT back from zfs_uiomove() (because the iovec only holds
64k), and return EFAULT, which turns into EAGAIN on the way out. EAGAIN
gets interpreted as "I didn't read anything", the caller tries again
without consuming the 64k we already read, and we're stuck.

This apparently works on newer kernels because the caller which breaks
on older Linux kernels by happily passing along a 1M read request and a
64k iovec just requests 64k at a time.

With this, we now won't return EFAULT if we got a partial read.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12370
Closes #12509
Closes #12516
2022-08-02 10:05:14 -07:00
наб 17512aba0c module: lua: ldo: fix pragma name
/home/nabijaczleweli/store/code/zfs/module/lua/ldo.c:175:32: warning:
unknown option after ‘#pragma GCC diagnostic’ kind [-Wpragmas]
  175 | #pragma GCC diagnostic ignored "-Winfinite-recursion"a
      |                                ^~~~~~~~~~~~~~~~~~~~~~

Fixes: a6e8113fed ("Silence
-Winfinite-recursion warning in luaD_throw()")

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13348
2022-07-28 14:17:38 -07:00
Brian Behlendorf 98315be036 ZTS: Fix io_uring support check
Not all Linux distribution kernels enable io_uring support by
default.  Update the run time check to verify that the booted
kernel was built with CONFIG_IO_URING=y.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Co-authored-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13648
Closes #13685
2022-07-27 13:38:56 -07:00
Brian Behlendorf 69ad0bd769 Fix objtool: missing int3 after ret warning
Resolve straight-line speculation warnings reported by objtool
for x86_64 assembly on Linux when CONFIG_SLS is set.  See the
following LWN article for the complete details.

https://lwn.net/Articles/877845/

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13528
Closes #13575
2022-07-27 13:38:56 -07:00
Attila Fülöp b9d862f2db ICP: Add missing stack frame info to SHA asm files
Since the assembly routines calculating SHA checksums don't use
a standard stack layout, CFI directives are needed to unroll the
stack.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes #11733
2022-07-27 13:38:56 -07:00
Brian Behlendorf d2ff2196a5 Fix -Wformat-overflow warning in zfs_project_handle_dir()
Switch to using asprintf() to satisfy the compiler and resolve the
potential format-overflow warning.  Not the conditional before the
sprintf() would have prevented this regardless.

    cmd/zfs/zfs_project.c: In function ‘zfs_project_handle_dir’:
    cmd/zfs/zfs_project.c:241:38: error: ‘/’ directive writing
    1 byte into a region of size between 0 and 4352
    [-Werror=format-overflow=]
    cmd/zfs/zfs_project.c:241:17: note: ‘sprintf’ output between
    2 and 4609 bytes into a destination of size 4352

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13528
Closes #13575
2022-07-27 13:38:56 -07:00
Brian Behlendorf d483ef3744 Fix -Wformat-truncation warning in upgrade_set_callback()
Extend the buffer slightly resolve the warning.

    cmd/zfs/zfs_main.c: In function ‘upgrade_set_callback’:
    cmd/zfs/zfs_main.c:2446:22: error: ‘%llu’ directive output
    may be truncated writing between 1 and 20 bytes into a
    region of size 16 [-Werror=format-truncation=]
    cmd/zfs/zfs_main.c:2445:24: note: ‘snprintf’ output between
    2 and 21 bytes into a destination of size 16

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13528
Closes #13575
2022-07-27 13:38:56 -07:00
Brian Behlendorf 60f2cfd24f Fix -Wuse-after-free warning in dbuf_destroy()
Move the use of the db pointer after it is freed.  It's only used as
a tag so a dereference would never occur, but there's no reason we
can't invert the order to resolve the warning.

    module/zfs/dbuf.c: In function 'dbuf_destroy':
    module/zfs/dbuf.c:2953:17: error:
    pointer 'db' may be used after 'free' [-Werror=use-after-free]

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13528
Closes #13575
2022-07-27 13:38:56 -07:00
Brian Behlendorf 6a81173026 Fix -Wuse-after-free warning in dbuf_issue_final_prefetch_done()
Move the use of the private pointer after it is freed.  It's only
used as a tag so a dereference would never occur, but there's no
harm in inverting the order to resolve the warning.

    module/zfs/dbuf.c: In function 'dbuf_issue_final_prefetch_done':
    module/zfs/dbuf.c:3204:17: error:
    pointer 'private' may be used after 'free' [-Werror=use-after-free]

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13528
Closes #13575
2022-07-27 13:38:56 -07:00
Brian Behlendorf 087f5dedd5 Fix -Wattribute-warning in dsl layer
The memcpy(), memmove(), and memset() functions have been annotated
to perform bounds checking when using FORTIFY_SOURCE.  A warning is
now generted when writing beyond the end of the specified field.

Alternately, the new struct_group() macro could be used to create
an anonymous union member for use by memcpy().  However, since this
is the only place the macro would be helpful it's preferable to
restructure the code slights to avoid the need for additional
compatibility code when the macro does not exist.

https://lore.kernel.org/lkml/20211118183807.1283332-1-keescook@chromium.org/T/

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13528
Closes #13575
2022-07-27 13:38:56 -07:00
Brian Behlendorf c771583f23 Fix -Wattribute-warning in edonr
The wrong union memory was being accessed in EdonRInit resulting in
a write beyond size of field compiler warning.  Reference the correct
member to resolve the warning.  The warning was correct and this in
case the mistake was harmless.

    In function ‘fortify_memcpy_chk’,
    inlined from ‘EdonRInit’ at zfs/module/icp/algs/edonr/edonr.c:494:3:
    ./include/linux/fortify-string.h:344:25: error: call to
    ‘__write_overflow_field’ declared with attribute warning:
    detected write beyond size of field (1st parameter);
    maybe use struct_group()? [-Werror=attribute-warning]

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13528
Closes #13575
2022-07-27 13:38:56 -07:00
Brian Behlendorf ef0e506f46 Fix -Wattribute-warning in zfs_log_xvattr()
Restructure the code in zfs_log_xvattr() to use a lr_attr_end
structure when accessing lr_attr_t elements located after the
variable sized array.  This makes the code more understandable
and resolves the accessing beyond the end of the field warnings.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13528
Closes #13575
2022-07-27 13:38:56 -07:00
Brian Behlendorf d7a8c573cf Silence -Winfinite-recursion warning in luaD_throw()
This code should be kept inline with the upstream lua version as much
as possible.  Therefore, we simply want to silence the warning.  This
check was enabled by default as part of -Wall in gcc 12.1.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13528
Closes #13575
2022-07-27 13:38:56 -07:00
наб 2d235d58f8 config: prune unused -Wno-bool-compare checks
Reviewed-by: Alejandro Colomar <alx.manpages@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13110
2022-07-27 13:38:56 -07:00
наб 37430e8211 libtpool: -Wno-clobbered
Also remove -Wno-unused-but-set-variable

Upstream-bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61118
Reviewed-by: Alejandro Colomar <alx.manpages@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13110
2022-07-27 13:38:56 -07:00
Tino Reichardt 4b0977027b Remove sha1 hashing from OpenZFS, it's not used anywhere.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Attila Fülöp <attila@fueloep.org>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12895
Closes #12902
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
2022-07-26 10:12:44 -07:00
Alexander Motin 15868d3ecb Fix scrub resume from newly created hole.
It may happen that scan bookmark points to a block that was turned
into a part of a big hole.  In such case dsl_scan_visitbp() may skip
it and dsl_scan_check_resume() will not be called for it.  As result
new scan suspend won't be possible until the end of the object, that
may take hours if the object is a multi-terabyte ZVOL on a slow HDD
pool, stretching TXG to all that time, creating all sorts of problems.

This patch changes the resume condition to any greater or equal block,
so even if we miss the bookmarked block, the next one we find will
delete the bookmark, allowing new suspend.

Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
2022-07-26 10:10:37 -07:00
Alexander Motin bbb50e6129 Avoid memory copy when verifying raidz/draid parity
Before this change for every valid parity column raidz_parity_verify()
allocated new buffer and copied there existing data, then recalculated
the parity and compared the result with the copy.  This patch removes
the memory copy, simply swapping original buffer pointers with newly
allocated empty ones for parity recalculation and comparison. Original
buffers with potentially incorrect parity data are then just freed,
while new recalculated ones are used for repair.

On a pool of 12 4-wide raidz vdevs, storing 1.5TB of 16MB blocks, this
change reduces memory traffic during scrub by 17% and total unhalted
CPU time by 25%.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13613
2022-07-26 10:10:37 -07:00
Alexander Motin 03e33b2bb8 Avoid memory copies during mirror scrub
Issuing several scrub reads for a block we may use the parent ZIO
buffer for one of child ZIOs.  If that read complete successfully,
then we won't need to copy the data explicitly.  If block has only
one copy (typical for root vdev, which is also a mirror inside),
then we never need to copy -- succeed or fail as-is.  Previous
code also copied data from buffer of every successfully completed
child ZIO, but that just does not make any sense.

On healthy N-wide mirror this saves all N+1 (or even more in case
of ditto blocks) memory copies for each scrubbed block, allowing
CPU to focus mostly on check-summing.  For other vdev types it
should save one memory copy per block copy at root vdev.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13606
2022-07-26 10:10:37 -07:00
Alexander Motin 4b8f16072d Fix and disable blocks statistics during scrub
Block statistics calculation during scrub I/O issue in case of sorted
scrub accounted ditto blocks several times.  Embedded blocks on other
side were not accounted at all.  This change moves the accounting from
issue to scan stage, that fixes both problems and also allows to avoid
pool-wide locking and the lock contention it created.

Since this statistics is quite specific and is not even exposed now
anywhere, disable its calculation by default to not waste CPU time.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13579
2022-07-26 10:10:37 -07:00
Alexander Motin 5e06805d8e Avoid two 64-bit divisions per scanned block
Change math to make it like the ARC, using multiplications instead.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13591
2022-07-26 10:10:37 -07:00
Alexander Motin dc91a6a660 Several B-tree optimizations
- Introduce first element offset within a leaf.  It allows to reduce
by ~50% average memmove() size when adding/removing elements.  If the
added/removed element is in the first half of the leaf, we may shift
elements before it and adjust the bth_first instead of moving more
elements after it.
 - Use memcpy() instead of memmove() when we know there is no overlap.
 - Switch from uint64_t to uint32_t.  It does not limit anything,
but 32-bit arches should appreciate it greatly in hot paths.
 - Store leaf capacity in struct btree to avoid 64-bit divisions.
 - Adjust zfs_btree_insert_into_leaf() to always result in balanced
leaves after splitting, no matter where the new element was inserted.
Not that we care about it much, but it should also allow B-trees with
as little as two elements per leaf instead of 4 previously.

When scrubbing pool of 12 SSDs, storing 1.5TB of 4KB zvol blocks this
reduces amount of time spent in memmove() inside the scan thread from
13.7% to 5.7% and total scrub time by ~15 seconds out of 9 minutes.
It should also reduce spacemaps load time, but I haven't measured it.

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13582
2022-07-26 10:10:37 -07:00
Alexander Motin a861aa2b9e Several sorted scrub optimizations
- Reduce size and comparison complexity of q_exts_by_size B-tree.
Previous code used two 64-bit divisions and many other operations to
compare two B-tree elements.  It created enormous overhead.  This
implementation moves the math to the upper level and stores the score
in the B-tree elements themselves.  Since all that we need to store in
that B-tree is the extent score and offset, those can fit into single
8 byte value instead of 24 bytes of q_exts_by_addr element and can be
compared with single operation.
 - Better decouple secondary tree logic from main range_tree by moving
rt_btree_ops and related functions into dsl_scan.c as ext_size_ops.
Those functions are very small to worry about the code duplication and
range_tree does not need to know details such as rt_btree_compare.
 - Instead of accounting number of pending bytes per pool, that needs
atomic on global variable per block, account the number of non-empty
per-vdev queues, that change much more rarely.
 - When extent scan is interrupted by TXG end, continue it in the next
TXG instead of selecting next best extent.  It allows to avoid leaving
one truncated (and so likely not the best any more) extent each TXG.

On top of some other optimizations this saves about 1.5 minutes out of
10 to scrub pool of 12 SSDs, storing 1.5TB of 4KB zvol blocks.

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tom Caputi <caputit1@tcnj.edu>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13576
2022-07-26 10:10:37 -07:00
Alexander Motin 881249de6f FreeBSD: Improve crypto_dispatch() handling
Handle crypto_dispatch() return values same as crp->crp_etype errors.
On FreeBSD 12 many drivers returned same errors both ways, and lack
of proper handling for the first ended up in assertion panic later.
It was changed in FreeBSD 13, but there is no reason to not be safe.

While there, skip waiting for completion, including locking and
wakeup() call, for sessions on synchronous crypto drivers, such as
typical aesni and software.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13563
2022-07-26 10:10:37 -07:00
Alexander Motin 916d9de158 Reduce ZIO io_lock contention on sorted scrub
During sorted scrub multiple threads (one per vdev) are issuing many
ZIOs same time, all using the same scn->scn_zio_root ZIO as parent.
It causes huge lock contention on the single global lock on that ZIO.
Improve it by introducing per-queue null ZIOs, children to that one,
and using them instead as proxy.

For 12 SSD pool storing 1.5TB of 4KB blocks on 80-core system this
dramatically reduces lock contention and reduces scrub time from 21
minutes down to 12.5, while actual read stages (not scan) are about
3x faster, reaching 100K blocks per second per vdev.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13553
2022-07-26 10:10:37 -07:00
Alexander Motin 813e15f28c AVL: Remove obsolete branching optimizations
Modern Clang and GCC can successfully implement simple conditions
without branching with math and flag operations.  Use of arrays for
translation no longer helps as much as it was 14+ years ago.

Disassemble of the code generated by Clang 13.0.0 on FreeBSD 13.1,
Clang 14.0.4 on FreeBSD 14 and GCC 10.2.1 on Debian 11 with this
change still shows no branching instructions.

Profiling of CPU-bound scan stage of sorted scrub shows reproducible
reduction of time spent inside avl_find() from 6.52% to 4.58%.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13540
2022-07-26 10:10:37 -07:00
Alexander Motin 884364ea85 More speculative prefetcher improvements
- Make prefetch distance adaptive: up to 4MB prefetch doubles for
every, hit same as before, but after that it grows by 1/8 every time
the prefetch read does not complete in time to satisfy the demand.
My tests show that 4MB is sufficient for wide NVMe pool to saturate
single reader thread at 2.5GB/s, while new 64MB maximum allows the
same thread to reach 1.5GB/s on wide HDD pool.  Further distance
increase may increase speed even more, but less dramatic and with
higher latency.

 - Allow early reuse of inactive prefetch streams: streams that never
saw hits can be reused immediately if there is a demand, while others
can be reused after 1s of inactivity, starting with the oldest.  After
2s of inactivity streams are deleted to free resources same as before.
This allows by several times increase strided read performance on HDD
pool in presence of simultaneous random reads, previously filling the
zfetch_max_streams limit for seconds and so blocking most of prefetch.

 - Always issue intermediate indirect block reads with SYNC priority.
Each of those reads if delayed for longer may delay up to 1024 other
block prefetches, that may be not good for wide pools.

Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13452
2022-07-26 10:10:37 -07:00
Alexander Motin 6e1e90d64c Improve mg_aliquot math
When calculating mg_aliquot alike to #12046 use number of unique data
disks in the vdev, not the total number of children vdev.  Increase
default value of the tunable from 512KB to 1MB to compensate.

Before this change each disk in striped pool was getting 512KB of
sequential data, in 2-wide mirror -- 1MB, in 3-wide RAIDZ1 -- 768KB.
After this change in all the cases each disk should get 1MB.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13388
2022-07-26 10:10:37 -07:00
Alexander Motin dd9c110ab5 Improve log spacemap load time
Previous flushing algorithm limited only total number of log blocks to
the minimum of 256K and 4x number of metaslabs in the pool.  As result,
system with 1500 disks with 1000 metaslabs each, touching several new
metaslabs each TXG could grow spacemap log to huge size without much
benefits.  We've observed one of such systems importing pool for about
45 minutes.

This patch improves the situation from five sides:
 - By limiting maximum period for each metaslab to be flushed to 1000
TXGs, that effectively limits maximum number of per-TXG spacemap logs
to load to the same number.
 - By making flushing more smooth via accounting number of metaslabs
that were touched after the last flush and actually need another flush,
not just ms_unflushed_txg bump.
 - By applying zfs_unflushed_log_block_pct to the number of metaslabs
that were touched after the last flush, not all metaslabs in the pool.
 - By aggressively prefetching per-TXG spacemap logs up to 16 TXGs in
advance, making log spacemap load process for wide HDD pool CPU-bound,
accelerating it by many times.
 - By reducing zfs_unflushed_log_block_max from 256K to 128K, reducing
single-threaded by nature log processing time from ~10 to ~5 minutes.

As further optimization we could skip bumping ms_unflushed_txg for
metaslabs not touched since the last flush, but that would be an
incompatible change, requiring new pool feature.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #12789
2022-07-26 10:10:37 -07:00
Alexander Motin fdb80a2301 Add more control/visibility to spa_load_verify().
Use error thresholds from policy to control whether to scrub data
and/or metadata.  If threshold is set to UINT64_MAX, then caller
probably does not care about result and we may skip that part.

By default import neither set the data error threshold nor read
the error counter, so skip the data scrub for faster import.
Metadata are still scrubbed and fail if even single error found.

While there just for symmetry return number of metadata errors in
case threshold is not set to zero and we haven't reached it.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #13022
2022-07-26 10:10:37 -07:00
Allan Jude 72a4709a59 spa.c: Replace VERIFY(nvlist_*(...) == 0) with fnvlist_* (#12678)
The fnvlist versions of the functions are fatal if they fail,
saving each call from having to include checking the result.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Signed-off-by: Allan Jude <allan@klarasystems.com>
2022-07-26 10:10:37 -07:00
Alexander Motin 415882d228 Avoid small buffer copying on write
It is wrong for arc_write_ready() to use zfs_abd_scatter_enabled to
decide whether to reallocate/copy the buffer, because the answer is
OS-specific and depends on the buffer size.  Instead of that use
abd_size_alloc_linear(), moved into public header.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12425
2022-07-26 10:10:37 -07:00
Alexander Motin 5b860ae1fb Remove refcount from spa_config_*()
The only reason for spa_config_*() to use refcount instead of simple
non-atomic (thanks to scl_lock) variable for scl_count is tracking,
hard disabled for the last 8 years.  Switch to simple int scl_count
reduces the lock hold time by avoiding atomic, plus makes structure
fit into single cache line, reducing the locks contention.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #12287
2022-07-26 10:10:37 -07:00
Brian Behlendorf 3920d7f325 Scrub mirror children without BPs
When scrubbing a raidz/draid pool, which contains a replacing or
sparing mirror with multiple online children, only one child will
be read.  This is not normally a serious concern because the DTL
records are used to determine where a good copy of the data is.
As long as the data can be read from one child the mirror vdev
will use it to repair gaps in any of its children.  Furthermore,
even if the data which was read is corrupt the raidz code will
detect this and issue its own repair I/O to correct the damage
in the mirror vdev.

However, in the scenario where the DTL is wrong due to silent
data corruption (say due to overwriting one child) and the scrub
happens to read from a child with good data, then the other damaged
mirror child will not be detected nor repaired.

While this is possible for both raidz and draid vdevs, it's most
pronounced when using draid.  This is because by default the zed
will sequentially rebuild a draid pool to a distributed spare,
and the distributed spare half of the mirror is always preferred
since it delivers better performance.  This means the damaged
half of the mirror will go undetected even after scrubbing.

For system administrations this behavior is non-intuitive and in
a worst case scenario could result in the only good copy of the
data being unknowingly detached from the mirror.

This change resolves the issue by reading all replacing/sparing
mirror children when scrubbing.  When the BP isn't available for
verification, then compare the data buffers from each child.  They
must all be identical, if not there's silent damage and an error
is returned to prompt the top-level vdev to issue a repair I/O to
rewrite the data on all of the mirror children.  Since we can't
tell which child was wrong a checksum error is logged against the
replacing or sparing mirror vdev.

Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13555
2022-07-14 10:21:29 -07:00
Tony Hutter 6c3c5fcfbe Tag zfs-2.1.5
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2022-06-21 17:00:34 -07:00
Matthew Thode 6e954130d4 Remove install of zfs-load-module.service for dracut
The zfs-load-module.service service is not currently provided by
the OpenZFS repository so we cannot safely assume it exists.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Thode <mthode@mthode.org>
Closes #13574
2022-06-21 10:53:46 -07:00
Ryan Moeller 403d4bc66e FreeBSD: Silence clang unused-but-set-variable
Quick and dirty build fix for warnings being treated as errors.

Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
2022-06-15 11:27:28 -07:00
Alexander Motin 6ff89fe126 Improve sorted scan memory accounting
Since we use two B-trees q_exts_by_size and q_exts_by_addr, we should
count 2x sizeof (range_seg_gap_t) per node.  And since average B-tree
memory efficiency is about 75%, we should increase it to 3x.

Previous code under-counted up to 30% of the memory usage.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13537
2022-06-15 11:23:49 -07:00
Rich Ercolani cc565f557b Corrected edge case in uncompressed ARC->L2ARC handling
I genuinely don't know why this didn't come up before,
but adding the LZ4 early abort pointed out this flaw,
in which we're allocating a buffer of one size, and
then telling the compressor that we're handing it buffers
of a different size, which may be Very Different - say,
allocating 512b and then telling it the inputs are 128k.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Amanakis <gamanakis@gmail.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #13375
2022-06-14 18:10:21 -07:00
Alexander Motin 338188562b Remove wrong assertion in log spacemap
It is typical, but not generally true that if log summary has more
blocks it must also have unflushed metaslabs.  Normally with metaslabs
flushed in order it works, but there are known exceptions, such as
device removal or metaslab being loaded during its flush attempt.

Before 600a02b884 if spa_flush_metaslabs() hit loading metaslab it
usually stopped (unless memlimit is also exceeded), but now it may
flush more metaslabs, just skipping that particular one.  This
increased chances of assertion to fire when the skipped metaslab is
flushed on next iteration if all other metaslabs in that summary
entry are already flushed out of order.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #13486 
Closes #13513
2022-06-06 16:57:56 -07:00
Ryan Moeller 1fdd768d7f libzfs: Fail making a dataset handle gracefully
When a dataset is in the process of being received it gets marked as
inconsistent and should not be used.  We should check for this when
opening a dataset handle in libzfs and return with an appropriate error
set, rather than hitting an abort because of the incomplete data.

zfs_open() passes errno to zfs_standard_error() after observing
make_dataset_handle() fail, which ends up aborting if errno is 0.
Set errno before returning where we know it has not been set already.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #13077
2022-06-06 16:57:51 -07:00
наб 56eed508d4 libzfs: mount: don't leak mnt_param_t if mnt_func fails
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12968
2022-06-06 16:57:46 -07:00
Rich Ercolani 271241187b Reject zfs send -RI with nonexistent fromsnap
Right now, zfs send -I dataset@nonexistent dataset@existent fails, but
zfs send -RI dataset@nonexistent dataset@existent does not.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12574
Closes #12575
2022-06-06 16:57:41 -07:00
Brian Behlendorf fc18fa92c8 Linux 5.18 compat: META
Update the META file to reflect compatibility with the 5.18 kernel.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13527
2022-06-01 14:24:49 -07:00
Brian Behlendorf db530f6aa0 autoconf: AC_MSG_CHECKING consistency
Make the wording more consistent for the kernel AC_MSG_CHECKING
output (e.g. "checking whether ...".).  Additionally, group some
of the VFS interface checks with the others.  No functional change.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Attila Fülöp <attila@fueloep.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13529
2022-06-01 14:24:49 -07:00
Brian Behlendorf 1c4e6a312c Linux 5.19 compat: asm/fpu/internal.h
As of the Linux 5.19 kernel the asm/fpu/internal.h header was
entirely removed.  It has been effectively empty since the 5.16
kernel and provides no required functionality.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Attila Fülöp <attila@fueloep.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13529
2022-06-01 14:24:49 -07:00
Brian Behlendorf 69430e39e3 Linux 5.19 compat: zap_flags_t conflict
As of the Linux 5.19 kernel an identically named zap_flags_t typedef
is declared in the include/linux/mm_types.h linux header.  Sadly,
the inclusion of this header cannot be easily avoided.  To resolve
the conflict a #define is used to remap the name in the OpenZFS
sources when building against the Linux kernel.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13515
2022-06-01 14:24:49 -07:00
Brian Behlendorf ee84970d4f Linux 5.19 compat: bdev_start_io_acct() / bdev_end_io_acct()
As of the Linux 5.19 kernel the disk_*_io_acct() helper functions
have been replaced by the bdev_*_io_acct() functions.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13515
2022-06-01 14:24:49 -07:00
Brian Behlendorf fec407fb69 Linux 5.19 compat: aops->read_folio()
As of the Linux 5.19 kernel the readpage() address space operation
has been replaced by read_folio().

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13515
2022-06-01 14:24:49 -07:00
Brian Behlendorf 7ae5ea8864 Linux 5.19 compat: blkdev_issue_secure_erase()
Linux 5.19 commit torvalds/linux@44abff2c0 splits the secure
erase functionality from the blkdev_issue_discard() function.
The blkdev_issue_secure_erase() must now be issued to issue
a secure erase.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13515
2022-06-01 14:24:49 -07:00
Brian Behlendorf 048301b6dc Linux 5.19 compat: bdev_max_secure_erase_sectors()
Linux 5.19 commit torvalds/linux@44abff2c0 removed the
blk_queue_secure_erase() helper function.  The preferred
interface is to now use the bdev_max_secure_erase_sectors()
function to check for discard support.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13515
2022-06-01 14:24:49 -07:00
Brian Behlendorf 9ce5eb18ef Linux 5.19 compat: bdev_max_discard_sectors()
Linux 5.19 commit torvalds/linux@70200574cc removed the
blk_queue_discard() helper function.  The preferred interface
is to now use the bdev_max_discard_sectors() function to check
for discard support.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13515
2022-06-01 14:24:49 -07:00
Brian Behlendorf 5a639f0802 Linux 5.18 compat: bio_alloc()
As for the Linux 5.18 kernel bio_alloc() expects a block_device struct
as an argument.  This removes the need for the bio_set_dev() compatibility
code for 5.18 and newer kernels.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13515
2022-06-01 14:24:49 -07:00
Ryan Moeller 090bda59e3 Silence unused-but-set-variable warning
This was breaking the kmod port build on FreeBSD with Clang 13.

Use the same trick as we do for ASSERT() to make DNODE_VERIFY() use
its parameter at compile time without actually using it at run time
in non-debug builds.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #13507
2022-05-27 09:19:37 -07:00
heeplr 2458c7e63a zed: support subject as header in zed_notify_email()
Some minimal MUAs don't support passing the subjects as cmdline option.
This commit checks if "@SUBJECT@" is missing in ZED_EMAIL_OPTS and then
prepends a subject header to the notification message.
Also set a default for ${subject}.

Reviewed-by: Ahelenia Ziemia<C5><84>ska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Daniel Hiepler <d-git@coderdu.de>
Closes #13440
2022-05-27 09:19:37 -07:00
Umer Saleem 0a688b2345 rpm: Keep debug symbols if configured with '--enable-debuginfo'
Do not strip debug information from packages if '--enable-debuginfo' is
configured.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
Closes #13500
2022-05-27 09:19:37 -07:00
Ryan Moeller fde66e583d FreeBSD: libspl: Add locking around statfs globals
Makes getmntent and getmntany thread-safe for external consumers of
libzfs zpool_disable_datasets, zfs_iter_mounted, libzfs_mnttab_update,
libzfs_mnttab_find.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #13484
2022-05-27 09:19:37 -07:00
Brian Behlendorf ed16dd7635 Standardize RHEL version check in packages
This is a follow up to 3c35662299 which standardizes how the RHEL
version check is done.  This simpler "0%{?rhel}" check is used
elsewhere in the packages so we do the same here.

Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13501
2022-05-27 09:19:37 -07:00
Rich Ercolani 5d9c527536 Modified ncompress requirement in RPM to exclude RHEL9
The bug this was working around is no longer present.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #13480
Closes #13490
2022-05-27 09:19:37 -07:00
Brian Behlendorf 5d534f1371 zed: Take no action on scrub/resilver checksum errors
When scrubbing/resilvering a pool it can be counter productive to
cancel the scan and kick of a replace operation to a hot spare
when encountering checksum errors.  In this case, the best course
of action is to allow the scrub/resilver to complete as quickly
as possible and to keep the vdevs fully online if possible.

Realistically, this is less of an issue for a RAIDZ since a
traditional resilver must be used and checksums will be verified.
However, this is not the case for a mirror or dRAID pool which is
sequentially resilvered and checksum verification is deferred
until after the replace operation completes.

Regardless, we apply this policy to all pool types since it's
a good idea for all vdevs.  Degrading additional vdevs has the
potential to make a bad situation worse.  Note the checksum
errors will still be reported as both an event and by
`zpool status`.  This change only prevents the ZED from
proactively taking any action.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13499
2022-05-27 09:19:37 -07:00
Mark Johnston 4184b78be1 zdb: Fix handling of nul termination in symlink targets
The SA attribute containing the symlink target does not include a nul
terminator, so when printing the target zdb would sometimes include
garbage at the end of the string.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #13482
2022-05-27 09:19:37 -07:00
наб 96c7c63994 automake: don't install /e/d/zfs or /e/z/zfs-functions +x
Closes #13496
Backport-of: #13503
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
2022-05-25 14:57:09 -07:00
Savyasachee Jha 115e059818 Multiple dracut module install script cleanups
- Replaced intances of `dracut_install` with `inst_simple`
- Removed calls to `test -x mark_hostonly` because the function is an
inbuilt dracut function
- Removed redundant installation of `systemd-ask-password` and
`systemd-tty-ask-password-agent` because they are already installed by
the systemd module. There is no need to install them again
- Removed multiple calls to the `mark_hostonly` function because the
`inst_simple` has a command-line switch for it
- Cleaned up the installation of the `zpool.cache`, `vdev_id.conf` and
`hostid` files to make the logic easier to follow
- Cleaned up and simplified the systemd service installation logic by
invoking systemctl instead of creating symlinks manually
- Replaced various hard-coded paths with dracut equivalents to better
conform with expected dracut behaviour
- Removed redundant call to `mkdir` (`inst_simple` creates the parent
directory if it does not exist on the destination initrd)

Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: Andrew J. Hesford <ajh@sideband.org>
Signed-off-by: Savyasachee Jha <hi@savyasacheejha.com>
Closes #13010
2022-05-25 11:09:23 -07:00
Savyasachee Jha 4252517f5f Remove absolute paths to udev rules and binaries for dracut
Since dracut functions can locate both udev rules and binaries, there is
no point in keeping absolute paths in the module setup script. It also
breaks the --sysroot option in dracut. This commit removes mentions to
absolute paths for binaries and udev rules.

Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: Andrew J. Hesford <ajh@sideband.org>
Signed-off-by: Savyasachee Jha <hi@savyasacheejha.com>
Closes #13010
2022-05-25 11:09:23 -07:00
Savyasachee Jha ebbfc6cb85 Make dracut fail if essential files cannot be installed
Dracut will now fail in initramfs generation if essential files cannot
be installed.

Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: Andrew J. Hesford <ajh@sideband.org>
Signed-off-by: Savyasachee Jha <hi@savyasacheejha.com>
Closes #13010
2022-05-25 11:09:23 -07:00
Savyasachee Jha 0671f72706 Make better use of dracut functions when building initramfs
Setting up the module involves multiple redundant calls to a bunch of
dracut functions wheich can be combined into one. Additionally, the mass
of code required to load libgcc_s.so* can be replaced with one dracut
function. This has the additional effect of removing errors involving
the non-installation of libgcc_s.so* which are seen on debian bullseye
when using version 2.1.2-1~bpo11+1 from the backports repository.

The systemd binaries are separated out into their own `dracut_install`
function call so they do not get pulled in when dracut does not load the
systemd module.

Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: Andrew J. Hesford <ajh@sideband.org>
Signed-off-by: Savyasachee Jha <hi@savyasacheejha.com>
Closes #13010
2022-05-25 11:09:23 -07:00
Coleman Kane 05147319b0 Fix compiler warnings about zero-length arrays in inline bitops
The compiler appears to be expanding the unused NULL pointer into a
zero-length array via the inline bitops code. When -Werror=array-bounds
is used, this causes a build failure. Recommended solution is allocate
temporary structures, fill with zeros (to avoid uninitialized data use
warnings), and pass the pointer to those to the inline calls.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #13463 
Closes #13465
2022-05-20 10:33:24 -07:00
Brian Behlendorf 0112bc2312 Add missing AC_MSG_RESULT(no) to configure
When the HAVE_IOPS_MKDIR_USERNS check fails output result
as required.
 
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13454
2022-05-20 10:33:24 -07:00
hping b28c0c4bf8 abd_os: remove redundant refcount creation for abd_children
Refcount creation for abd_zero_scatter->abd_children is redundant in
abd_alloc_zero_scatter, as it has been done in abd_init_struct.

In addition, abd_children is undefined when ZFS_DEBUG is disabled, the
reference of abd_children in abd_alloc_zero_scatter breaks build of
libzpool when ZFS_DEBUG is disabled.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Ping Huang <huangping@smartx.com>
Closes #13429
2022-05-20 10:33:24 -07:00
Aidan Harris eee389ba2e Fix functions without a prototype
clang-15 emits the following error message for functions without
a prototype:

fs/zfs/os/linux/spl/spl-kmem-cache.c:1423:27: error:
  a function declaration without a prototype is deprecated
  in all versions of C [-Werror,-Wstrict-prototypes]

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Aidan Harris <me@aidanharr.is>
Closes #13421
2022-05-20 10:33:24 -07:00
Mateusz Guzik 2c5c8bb0a6 FreeBSD: use zero_region instead of allocating a dedicated page
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #13406
2022-05-20 10:33:24 -07:00
szubersk 756c3e085b autoconf: Fail when __copy_from_user_inatomic is a non-GPL symbol
A followup to 849c14e048
Fix https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1009242

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes #13389
2022-05-20 10:33:24 -07:00
Damian Szuberski 13b1f336d3 PPC get_user workaround
Linux 5.12 PPC 5.12 get_user() and __copy_from_user_inatomic()
inline helpers very indirectly include a reference to the GPL'd
array mmu_feature_keys[] and fails to build. Workaround this by
using copy_from_user() and throwing EFAULT for any calls to
__copy_from_user_inatomic(). This is a workaround until a fix
for Linux commit 7613f5a66becfd0e43a0f34de8518695888f5458
"powerpc/64s/kuap: Use mmu_has_feature()" is fully addressed.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Authored-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes #11958
Closes #12590
Closes #13367
2022-05-20 10:33:24 -07:00
Brian Atkinson 60fc173251 Adding ZERO_PAGE detection
On some architectures ZERO_PAGE is unavailable because it references
a GPL exported symbol of empty_zero_page. Originally e08b993 removed
the call to PAGE_ZERO(0) for assignment to the abd_zero_page. However,
a simple check can be done to avoid a kernel allocation and free for
the abd_zero_page if ZERO_PAGE is available.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Brian Atkinson <batkinson@lanl.gov>
Closes #13199
2022-05-20 10:33:24 -07:00
Damian Szuberski 3bb068d4d5 autoconf: Pretend CONFIG_MODULES is always on
- Unconditionally inject `CONFIG_MODULES` make variable
  and `#define CONFIG_MODULES` to Kbuild in `ZFS_LINUX_COMPILE`
  autoconf function to emulate loadable kernel modules support.
  This allows OpenZFS to perform Linux checks despite
  `CONFIG_MODULES=n` in the actual Linux config.

- Add `ZFS_AC_KERNEL_CONFIG_MODULES` check which encompasses
  the logic from `ZFS_AC_KERNEL_TEST_MODULE` with additional
  diagnostic messages to the user

- Removed `ZFS_AC_KERNEL_TEST_MODULE` as it merely duplicates
  every check in `ZFS_AC_KERNEL_CONFIG_DEFINED`

- Moved `ZFS_AC_MODULE_SYMVERS` after `ZFS_AC_KERNEL_CONFIG_DEFINED`
  so the user has a chance to see the proper diagnostic from the
  steps before.

A workaround for Linux's

```
commit 3e3005df73b535cb849cf4ec8075d6aa3c460f68
Author: Masahiro Yamada <masahiroy@kernel.org>
Date:   Wed Mar 31 22:38:03 2021 +0900

kbuild: unify modules(_install) for in-tree and external modules

If you attempt to build or install modules ('make modules(_install)'
with CONFIG_MODULES disabled, you will get a clear error message, but
nothing for external module builds.

Factor out the modules and modules_install rules into the common part,
so you will get the same error message when you try to build external
modules with CONFIG_MODULES=n.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
```

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes #10832
Closes #13361
2022-05-20 10:33:24 -07:00
Damian Szuberski 210b33109d Strengthen Linux kernel capabilities detection
- Add `CONFIG_BLOCK` Linux config requirement to
  `ZFS_AC_KERNEL_CONFIG_DEFINED`. OpenZFS won't compile without
  that block device support due to large amount of functional
  dependencies on it.

- Remove dependency on `groups_alloc()` in
  `ZFS_AC_KERNEL_SRC_GROUP_INFO_GID` to circumvent the missing stub
  in Linux 4.X kernel headers.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes #13351
2022-05-20 10:33:24 -07:00
Richard Laager 1d54deb42f zvol_wait: Ignore locked zvols
"When an encrypted zvol is locked the zfs-volume-wait service does not
start.  The /sbin/zvol_wait should not wait for links when the volume
has property keystatus=unavailable."
-- https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1888405

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Damian Szuberski <szuberskidamian@gmail.com>
Thanks: James Dingwall <james-launchpad@dingwall.me.uk>
Signed-off-by: Richard Laager <rlaager@wiktel.com>
Closes #10662
2022-05-20 10:33:24 -07:00
Ka Ho Ng 1f31889046 FreeBSD: Implement hole-punching support
This adds supports for hole-punching facilities in the FreeBSD kernel
starting from __FreeBSD_version 1400032.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ka Ho Ng <khng@FreeBSD.org>
Sponsored-by: The FreeBSD Foundation
Closes #12458
2022-05-17 11:15:29 -07:00
наб 1467a1bb33 module: zstd: check we don't leak symbols; regenerate symbol map
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12988
Closes #13209
(cherry picked from commit 6ef00196db)
2022-05-16 15:48:21 -07:00
наб 2a64eeb6c7 man: zpool-import.8: -d -or -c
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13437
2022-05-10 13:36:37 -07:00
Brian Behlendorf bb29f1eb38 Reduce dbuf_find() lock contention
Holding a dbuf is a common operation which can become highly contended
in dbuf_find() when acquiring the dbuf hash mutex.  This is particularly
true on Linux when reading/writing volumes since by default up to 32
threads from the zvol_taskq may be taking a hold of the same dbuf.
This should also be observable on FreeBSD as long as there are enough
processes accessing the volume concurrently.

This is further aggregrated by the fact that only the block id will
be unique when calculating the dbuf hash for a single volume.  The
objset id, object id, and level will be the same for data blocks.
This has been observed to result in a somehwat less than uniform hash
distribution and a longer than expected max hash chain depth (~20)
on a large memory system (256 GB) using volumes.

This commit improves the siutation by switching the hash mutex to
an rwlock to allow concurrent lookups, and increasing DBUF_RWLOCKS
from 2048 to 8192 to further reduce the odds of a hash collision.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13405
2022-05-06 12:02:45 -07:00
наб 1184df6b93 contrib: dracut: remove getargbool polyfill
It was originally released in dracut 008 in February 2011;
we can probably drop it now

Upstream-commit: 47a02e3972
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13291
2022-05-06 12:01:48 -07:00
наб 1781ee703b Add dracut.zfs.7
Thorough documentation with a dracut.bootup(7)-style flowchart,
dracut.cmdline(7)-style cmdline listing,
and per-file docs like the old README

Upstream-commit: e3fc330d6c
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13291
2022-05-06 12:01:48 -07:00
наб 0947096044 contrib: dracut: zfs-needshutdown: don't list
Upstream-commit: 1cc9cc2f89
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13291
2022-05-06 12:01:48 -07:00
наб a0e81a4074 contrib: dracut: zfs-{rollback,snapshot}-bootfs: order after key loading
This fixes at least one race I got with an encrypted root

Upstream-commit: 6ebdb0b20d
Upstream-commit: b8d9679f36
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13291
2022-05-06 12:01:48 -07:00
наб fc41be5a8d contrib: dracut: don't require essentials to be under the same encroot
Upstream-commit: 30c6dce7f7
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13291
2022-05-06 12:01:48 -07:00
наб 059a563810 contrib: dracut: inline single-use import_pool, move single-use ask_for_password
Also don't set ROOTFS_MOUNTED; the final mention was removed in dracut
011 from July 2011

Upstream-commit: eaf1e06045
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13291
2022-05-06 12:01:48 -07:00
наб 5c97f76f5a contrib: dracut: zfs-lib: remove find_bootfs
Upstream-commit: dac0b0785a
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13291
2022-05-06 12:01:48 -07:00
наб 5c0aa409ed contrib: dracut: zfs-lib: simplify ask_for_password
The only user is mount-zfs.sh (non-systemd systems),
so reduce it to what it needs

Upstream-commit: 5d31169d7c
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13291
2022-05-06 12:01:48 -07:00
наб 71a1d8e5dc contrib; dracut: flatten zfs-load-key, simplify zfs-env-bootfs
Upstream-commit: fec2c613a4
Upstream-change: drop 90zfs/module-setup.sh.in cleanups that don't apply
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13291
2022-05-06 12:01:48 -07:00
наб 0864c29e7c contrib; dracut: centralise root= parsing, actually support root=s
So far, everything parsed root= manually, which meant that while
zfs-parse.sh was updated, and supposedly supported + -> ' ' conversion,
it meant nothing

Instead, centralise parsing, and allow:
  root=
  root=zfs
  root=zfs:
  root=zfs:AUTO

  root=ZFS=data/set
  root=zfs:data/set
  root=zfs:ZFS=data/set (as a side-effect; allowed but undocumented)

  rootfstype=zfs AND root=data/set <=> root=data/set
  rootfstype=zfs AND root=         <=> root=zfs:AUTO

So rootfstype=zfs /also/ behaves as expected, and + decoding works

Upstream-commit: 245529d85f
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13291
2022-05-06 12:01:48 -07:00
наб b551725df4 contrib: dracut: parse-zfs: stop pretending we support FILESYSTEM=
It was added in the original ae26d0465a ("Add dracut support") commit
in 2011, and was then broken a bit later with the advent of
dracut-zfs-generator, or maybe earlier as part of other churn

Either way, it's broken, and has been in 2.0+ as well, and no-one
complained. Stop pretending we support it at all

Upstream-commit: 2c74617bcf
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13291
2022-05-06 12:01:48 -07:00
наб ae054e690e contrib: dracut: parse-zfs: drop initqueue-finished for i/f
The switch was released in dracut 009 in March 2011,
we can safely get rid of the compatibility hook

Upstream-commit: 47636f5661
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13291
2022-05-06 12:01:48 -07:00
наб 0657247548 contrib/dracut: zfs-lib: export_all: replace with inline zpool export -a
07a3312f17, which introduced this in
October of 2014, didn't have zpool export -a available; we do

Upstream-commit: 6a41310c70
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13093
2022-05-06 12:01:48 -07:00
jokersus bc03fee94d Remove REMAKE_INITRD
The option has been deprecated in dkms and will break packaging in
future versions. See https://github.com/dell/dkms/commit/7114c62

Upstream-commit: b5c16861e9
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Amanakis <gamanakis@gmail.com>
Signed-off-by: jokersus <jokersus.cava@gmail.com>
Closes #12781
2022-05-06 11:32:45 -07:00
Rich Ercolani ce8ae064d2 Python 3.10 fixes, part 2
There was a fallback case I overlooked in the initial patch, with
a similarly imperfect version extractor.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12045 
Closes #12673
2022-05-04 11:37:49 -07:00
Brian Behlendorf 4c9c96aba4 Silence unused-but-set-variable warnings
Clang 13.0.0 added support for `Wunused-but-set-parameter` and
`-Wunused-but-set-variable` which correctly detects two unused
variables in zstd resulting in a build failure.  This commit
annotates these instances accordingly.

  https://releases.llvm.org/13.0.1/tools/clang/docs/ReleaseNotes.html#id6

In FSE_createCTable(), malloc() is intentionally defined as NULL when
compiled in the kernel so the variable is unused.

  zstd/lib/compress/fse_compress.c:307:12: error: variable 'size'
  set but not used [-Werror,-Wunused-but-set-variable]

Additionally, in ZSTD_seqDecompressedSize() the assert is compiled
out similarly resulting in an unused variable.

  zstd/lib/compress/zstd_compress_superblock.c:412:12: error: variable
  'litLengthSum' set but not used [-Werror,-Wunused-but-set-variable]

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2022-05-02 15:42:58 -07:00
наб ecec151c14 module: zfs: freebsd: fix unused, remove argsused
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12844
2022-05-02 15:42:58 -07:00
наб a4f582f0b6 FreeBSD: remove unused variable
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Issue #12899
2022-05-02 15:42:58 -07:00
наб 9e68b734b3 zvol: remove unused variable
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12917
2022-05-02 15:42:58 -07:00
наб a175fe82e6 fm: remove unused variables
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12917
2022-05-02 15:42:58 -07:00
наб b8e1366ee6 zvol: remove unused variable
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12917
2022-05-02 15:42:58 -07:00
наб 7536ad35ca module/zfs: vdev_removal: spa_vdev_remove_thread: remove unused variable
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12187
2022-05-02 15:42:58 -07:00
наб 986d64ccca module/zfs: vdev_indirect: vdev_indirect_repair: remove unused variable
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12187
2022-05-02 15:42:58 -07:00
наб 18e9268087 module/zfs: dbuf: dbuf_read_impl: remove unused variable
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12187
2022-05-02 15:42:58 -07:00
наб 4149e19dfc module/zfs: arc: arc_hdr_realloc_crypt: remove unused variables
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12187
2022-05-02 15:42:58 -07:00
наб 116d447fb5 libzfs: zfs_send: remove unused variable
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12187
2022-05-02 15:42:58 -07:00
наб a1a54b3e47 libzutil: zpool_find_config: remove unused variable
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12187
2022-05-02 15:42:58 -07:00
Brian Behlendorf ce8d41ef75 Skip spacemaps reading in case of pool readonly import
The only zdb utility require to read metaslab-related data during
read-only pool import because of spacemaps validation. Add global
variable which will allow zdb read spacemaps in case of readonly
import mode.

Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Fedor Uporov <fuporov.vstack@gmail.com>
Closes #9095
Closes #12687
2022-04-28 16:47:12 -07:00
наб c0ff5f1560 zfs: holds: dequadratify
Before:
  15  0m0.177s
  30  0m0.653s
  45  0m1.289s
  60  0m2.129s
  75  0m3.264s
  90  0m4.397s
  100 0m5.996s
  117 0m8.552s

After:
  30  0m0.053s
  117 0m0.125s

Upstream-commit: 2a70a09072
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13372
Closes #13373
2022-04-28 15:18:47 -07:00
Brian Behlendorf 49c1346c10 Linux 5.18 compat: replace __set_page_dirty_nobuffers
Replace __set_page_dirty_nobuffers with filemap_dirty_folio.

Upstream-commit: 6b1f86f8e9c7f9de7ca1cb987b2cf25e99b1ae3a
("Merge tag 'folio-5.18b' of
git://git.infradead.org/users/willy/pagecache ")

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Authored-by: Satadru Pramanik <satadru@gmail.com>
Signed-off-by: Satadru Pramanik <satadru@gmail.com>
Closes #13325
Closes #13380
2022-04-28 15:17:38 -07:00
Brian Behlendorf 71cd3726c0 Fix O_APPEND for Linux 3.15 and older kernels
When using a Linux kernel which predates the iov_iter interface the
O_APPEND flag should be applied in zpl_aio_write() via the call to
generic_write_checks().  The updated pos variable  was incorrectly
ignored resulting in the current offset being used.

This issue should only realistically impact the RHEL/CentOS 7.x
kernels which are based on Linux 3.10.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13370 
Closes #13377
2022-04-28 15:15:28 -07:00
наб 642426095a Linux 5.18 compat: kobj_type.default_attrs replaced with default_groups
Upstream-commit: cdb4f26a63c391317e335e6e683a614358e70aeb ("kobject:
 kobj_type: remove default_attrs")
Upstream-commit: 0cdda2edb3
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13357
2022-04-25 10:00:09 -07:00
Alexander Motin 972637dc06 FreeBSD: Fix translation from ABD to physical pages.
In hypothetical case of non-linear ABD with single segment, multiple
to page size but not aligned to it, vdev_geom_fill_unmap_cb() could
fill one page less into bio_ma array.

I am not sure it is expoitable, but better to be safe than sorry.

Reported-by: Mark Johnston <markj@FreeBSD.org>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
(cherry picked from commit 5352f85cddce44e82fb1c4caec3b333e3666d7fd)
2022-04-21 16:59:09 -07:00
Rich Ercolani c220771a47 Corrected oversight in ZERO_RANGE behavior
It turns out, no, in fact, ZERO_RANGE and PUNCH_HOLE do
have differing semantics in some ways - in particular,
one requires KEEP_SIZE, and the other does not.

Also added a zero-range test to catch this, corrected a flaw
that made the punch-hole test succeed vacuously, and a typo
in file_write.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #13329 
Closes #13338
2022-04-21 16:58:07 -07:00
наб 361dc138b1 Document zfs inherit -S's interaction with noninheritable properties
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Damian Szuberski <szuberskidamian@gmail.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Upstream-commit: 92295af800
Closes #11894
Closes #13335
2022-04-21 11:09:35 -07:00
Brian Behlendorf aa1c3c1d1d Linux 5.17 compat: GENHD_FL_EXT_DEVT / GENHD_FL_NO_PART_SCAN
As of the 5.17 kernel the GENHD_FL_EXT_DEVT flag has been removed
and the GENHD_FL_NO_PART_SCAN flag renamed GENHD_FL_NO_PART. Update
zvol_alloc() to set GENHD_FL_NO_PART for the newer kernels which
is sufficient.  The behavior for prior kernels remains unchanged.

1ebe2e5f ("block: remove GENHD_FL_EXT_DEVT")
46e7eac6 ("block: rename GENHD_FL_NO_PART_SCAN to GENHD_FL_NO_PART")

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13294
Closes #13297
2022-04-20 13:44:19 -07:00
Mark Johnston b7546f92ea FreeBSD: Return Mach error codes from VOP_(GET|PUT)PAGES
FreeBSD's memory management system uses its own error numbers and gets
confused when these VOPs return EIO.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reported-by: Peter Holm <pho@FreeBSD.org>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #13311
2022-04-19 10:42:54 -07:00
Mark Johnston e9cd90f6e5 FreeBSD: Parameterize ZFS_ENTER/ZFS_VERIFY_VP with an error code
For legacy reasons, a couple of VOPs have to return error numbers that
don't come from the usual errno namespace.  To handle the cases where
ZFS_ENTER or ZFS_VERIFY_ZP fail, we need to be able to override the
default error return value of EIO.  Extend the macros to permit this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #13311
2022-04-19 10:42:54 -07:00
наб ff23ef0c99 libzfs: import: zpool_clear_label: actually fail if clearing l2arc header fails
Found with -Wunused-but-set-variable on Clang trunk

Upstream-commit: a4e0cee178
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13304
2022-04-15 14:16:59 -07:00
наб 1f4c79b1ce libzfs: sendrecv: always cancel progress thread in zfs_send_one()
This is in line with all the other uses of the progress thread

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #11560
Closes #13284
2022-04-11 15:48:46 -07:00
Riccardo Schirone 35ddd8ee2e Linux 5.18 compat: use address_space_operations->readahead
->readpages was removed and replaced by ->readahead. Define
zpl_readahead for kernels that don't have ->readpages.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Riccardo Schirone <rschirone91@gmail.com>
Closes #13278
2022-04-06 13:15:27 -07:00
Riccardo Schirone 10a9f5fc47 Linux 5.18 compat: blkg_tryget is moved to private headers
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Riccardo Schirone <rschirone91@gmail.com>
Closes #13278
2022-04-06 13:15:27 -07:00
наб 9f7f704507 Linux 5.18 compat: replace genhd.h with blkdev.h includes
blkdev.h includes genhd.h since dawn of upstream git, so this is
globally safe

Upstream-commit: 322cbb50de711814c42fb088f6d31901502c711a ("block:
 remove genhd.h")

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13251
2022-04-06 13:15:27 -07:00
наб 215a8255a9 Linux 5.18 compat: 4-argument bio_alloc()
bio_alloc(gfp_t gfp_mask, unsigned short nr_iovecs)

became

  bio_alloc(struct block_device *bdev, unsigned short nr_vecs,
            unsigned int opf, gfp_t gfp_mask)
passing NULL/0 continues previous behaviour

Upstream-commit: 07888c665b405b1cd3577ddebfeb74f4717a84c4 ("block:
 pass a block_device and opf to bio_alloc")

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13251
2022-04-06 13:15:27 -07:00
Ryan Moeller a5a28723bd FreeBSD: Use NDFREE_PNBUF if available
NDF_ONLY_PNBUF has been removed from FreeBSD in favor of NDFREE_PNBUF.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #13277
2022-04-06 10:29:53 -07:00
Brian Behlendorf 5a9994f5ae Export minimal zfs_refcount interfaces
Lustre makes light use of the zfs_refcount interfaces which
isn't a problem when using a non-debug build of OpenZFS. However,
when debugging is enabled the required symbols are not exported.

Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12613
2022-04-06 10:29:00 -07:00
Brian Behlendorf 9f6943504a Default to zfs_dmu_offset_next_sync=1
Strict hole reporting was previously disabled by default as a
performance optimization.  However, this has lead to confusion
over the expected behavior and a variety of workarounds being
adopted by consumers of ZFS.  Change the default behavior to
always report holes and force the TXG sync.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Upstream-commit: 05b3eb6d23
Ref: #13261
Closes #12746
2022-04-01 09:59:47 -07:00
наб fe6f2651f5 etc/systemd/zfs-mount-generator: serialise, handle keylocation=http[s]://
* etc/systemd/zfs-mount-generator: serialise

The wins for a relatively normal workload are rather slim:
	real	0.02119s/0.00985s=2.15029x
	user	0.02130s/0.00346s=6.15560x
	sys	0.03858s/0.00643s=6.00062x

	wall-total	0.014518s/0.005925s=2.45009x
	wall-init	0.014518s/0.002457s=5.90684x
	wall-real	0.014518s/0.003467s=4.18668x

But this is a big win on machines with a lot of datasets and expensive
forks.

For example, the gain on a VM on my work laptop with 900+ legacy-mount
Docker datasets, the original gains from the C rewrite were
only five-fold:
	real    0.516s/0.102s=5.05882x
	user    0.237s/0.143s=1.65734x
	sys     0.287s/0.100s=2.87x

And this serial variant gains this back there as well:
	real    0.102s/0.008s=12.75x
	user    0.143s/0.007s=20.42857
	sys     0.100s/0.001s=100x

	wall-total	0.09717s/0.00319s=30.40255x
	wall-init	0.00203s/0.00200s=1.015941x
	wall-real	0.09513s/0.00118s=80.02043x

For a total of
	real    0.516s/0.008s=64.5x
	user    0.237s/0.007s=33.85714x
	sys     0.287s/0.001s=287x

Suggested-by: Richard Laager <rlaager@wiktel.com>

* etc/systemd/zfs-mount-generator: pull in network for keylocation=https

Also simplify RequiresMountsFor= handling
Ref: #11956

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Upstream-commit: 4325de09cd
Closes #12138
2022-04-01 09:58:45 -07:00
наб 7fbb90feea libzfs: diff: stream_bytes: use fputc, %hho formats chars
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Upstream-commit: a72129edcb
Closes #12829
2022-04-01 09:58:45 -07:00
наб 5a21214be8 zfs, libzfs: diff: accept -h/ZFS_DIFF_NO_MANGLE, disabling path escaping
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Upstream-commit: 344bbc82e7
Closes #12829
2022-04-01 09:58:45 -07:00
344 changed files with 6717 additions and 7339 deletions
+2 -2
View File
@@ -8,7 +8,7 @@ jobs:
checkstyle:
runs-on: ubuntu-20.04
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v3
with:
ref: ${{ github.event.pull_request.head.sha }}
- name: Install dependencies
@@ -43,7 +43,7 @@ jobs:
if: failure() && steps.CheckABI.outcome == 'failure'
run: |
find -name *.abi | tar -cf abi_files.tar -T -
- uses: actions/upload-artifact@v2
- uses: actions/upload-artifact@v3
if: failure() && steps.CheckABI.outcome == 'failure'
with:
name: New ABI files (use only if you're sure about interface changes)
+3 -3
View File
@@ -9,10 +9,10 @@ jobs:
strategy:
fail-fast: false
matrix:
os: [18.04, 20.04]
os: [20.04]
runs-on: ubuntu-${{ matrix.os }}
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v3
with:
ref: ${{ github.event.pull_request.head.sha }}
- name: Install dependencies
@@ -75,7 +75,7 @@ jobs:
sudo chmod +r $RESULTS_PATH/*
# Replace ':' in dir names, actions/upload-artifact doesn't support it
for f in $(find /var/tmp/test_results -name '*:*'); do mv "$f" "${f//:/__}"; done
- uses: actions/upload-artifact@v2
- uses: actions/upload-artifact@v3
if: failure()
with:
name: Test logs Ubuntu-${{ matrix.os }}
+3 -3
View File
@@ -6,9 +6,9 @@ on:
jobs:
tests:
runs-on: ubuntu-latest
runs-on: ubuntu-20.04
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v3
with:
ref: ${{ github.event.pull_request.head.sha }}
- name: Install dependencies
@@ -71,7 +71,7 @@ jobs:
sudo chmod +r $RESULTS_PATH/*
# Replace ':' in dir names, actions/upload-artifact doesn't support it
for f in $(find /var/tmp/test_results -name '*:*'); do mv "$f" "${f//:/__}"; done
- uses: actions/upload-artifact@v2
- uses: actions/upload-artifact@v3
if: failure()
with:
name: Test logs
+4 -4
View File
@@ -6,11 +6,11 @@ on:
jobs:
tests:
runs-on: ubuntu-latest
runs-on: ubuntu-20.04
env:
TEST_DIR: /var/tmp/zloop
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v3
with:
ref: ${{ github.event.pull_request.head.sha }}
- name: Install dependencies
@@ -50,7 +50,7 @@ jobs:
if: failure()
run: |
sudo chmod +r -R $TEST_DIR/
- uses: actions/upload-artifact@v2
- uses: actions/upload-artifact@v3
if: failure()
with:
name: Logs
@@ -58,7 +58,7 @@ jobs:
/var/tmp/zloop/*/
!/var/tmp/zloop/*/vdev/
if-no-files-found: ignore
- uses: actions/upload-artifact@v2
- uses: actions/upload-artifact@v3
if: failure()
with:
name: Pool files
+1 -1
View File
@@ -1,2 +1,2 @@
The [OpenZFS Code of Conduct](http://www.open-zfs.org/wiki/Code_of_Conduct)
The [OpenZFS Code of Conduct](https://openzfs.org/wiki/Code_of_Conduct)
applies to spaces associated with the OpenZFS project, including GitHub.
+2 -2
View File
@@ -1,10 +1,10 @@
Meta: 1
Name: zfs
Branch: 1.0
Version: 2.1.4
Version: 2.1.9
Release: 1
Release-Tags: relext
License: CDDL
Author: OpenZFS
Linux-Maximum: 5.17
Linux-Maximum: 6.1
Linux-Minimum: 3.10
+12 -2
View File
@@ -103,7 +103,7 @@ endif
endif
PHONY += codecheck
codecheck: cstyle shellcheck checkbashisms flake8 mancheck testscheck vcscheck
codecheck: cstyle shellcheck checkbashisms flake8 mancheck testscheck vcscheck zstdcheck
PHONY += checkstyle
checkstyle: codecheck commitcheck
@@ -114,14 +114,20 @@ commitcheck:
${top_srcdir}/scripts/commitcheck.sh; \
fi
if HAVE_PARALLEL
cstyle_line = -print0 | parallel -X0 ${top_srcdir}/scripts/cstyle.pl -cpP {}
else
cstyle_line = -exec ${top_srcdir}/scripts/cstyle.pl -cpP {} +
endif
PHONY += cstyle
cstyle:
@find ${top_srcdir} -name build -prune \
-o -type f -name '*.[hc]' \
! -name 'zfs_config.*' ! -name '*.mod.c' \
! -name 'opt_global.h' ! -name '*_if*.h' \
! -name 'zstd_compat_wrapper.h' \
! -path './module/zstd/lib/*' \
-exec ${top_srcdir}/scripts/cstyle.pl -cpP {} \+
$(cstyle_line)
filter_executable = -exec test -x '{}' \; -print
@@ -173,6 +179,10 @@ vcscheck:
awk '{c++; print} END {if(c>0) exit 1}' ; \
fi
PHONY += zstdcheck
zstdcheck:
@$(MAKE) -C module/zstd checksymbols
PHONY += lint
lint: cppcheck paxcheck
+4 -4
View File
@@ -686,9 +686,9 @@ def section_archits(kstats_dict):
print()
print('Cache hits by data type:')
dt_todo = (('Demand data:', arc_stats['demand_data_hits']),
('Demand prefetch data:', arc_stats['prefetch_data_hits']),
('Prefetch data:', arc_stats['prefetch_data_hits']),
('Demand metadata:', arc_stats['demand_metadata_hits']),
('Demand prefetch metadata:',
('Prefetch metadata:',
arc_stats['prefetch_metadata_hits']))
for title, value in dt_todo:
@@ -697,10 +697,10 @@ def section_archits(kstats_dict):
print()
print('Cache misses by data type:')
dm_todo = (('Demand data:', arc_stats['demand_data_misses']),
('Demand prefetch data:',
('Prefetch data:',
arc_stats['prefetch_data_misses']),
('Demand metadata:', arc_stats['demand_metadata_misses']),
('Demand prefetch metadata:',
('Prefetch metadata:',
arc_stats['prefetch_metadata_misses']))
for title, value in dm_todo:
+1 -1
View File
@@ -271,7 +271,7 @@ def print_values():
if pretty_print:
fmt = lambda col: prettynum(cols[col][0], cols[col][1], v[col])
else:
fmt = lambda col: v[col]
fmt = lambda col: str(v[col])
sys.stdout.write(sep.join(fmt(col) for col in hdr))
sys.stdout.write("\n")
+96 -32
View File
@@ -110,8 +110,9 @@ extern int zfs_recover;
extern unsigned long zfs_arc_meta_min, zfs_arc_meta_limit;
extern int zfs_vdev_async_read_max_active;
extern boolean_t spa_load_verify_dryrun;
extern boolean_t spa_mode_readable_spacemaps;
extern int zfs_reconstruct_indirect_combinations_max;
extern int zfs_btree_verify_intensity;
extern uint_t zfs_btree_verify_intensity;
static const char cmdname[] = "zdb";
uint8_t dump_opt[256];
@@ -2994,7 +2995,7 @@ open_objset(const char *path, void *tag, objset_t **osp)
}
sa_os = *osp;
return (0);
return (err);
}
static void
@@ -3124,13 +3125,18 @@ dump_znode_symlink(sa_handle_t *hdl)
{
int sa_symlink_size = 0;
char linktarget[MAXPATHLEN];
linktarget[0] = '\0';
int error;
error = sa_size(hdl, sa_attr_table[ZPL_SYMLINK], &sa_symlink_size);
if (error || sa_symlink_size == 0) {
return;
}
if (sa_symlink_size >= sizeof (linktarget)) {
(void) printf("symlink size %d is too large\n",
sa_symlink_size);
return;
}
linktarget[sa_symlink_size] = '\0';
if (sa_lookup(hdl, sa_attr_table[ZPL_SYMLINK],
&linktarget, sa_symlink_size) == 0)
(void) printf("\ttarget %s\n", linktarget);
@@ -8266,6 +8272,23 @@ zdb_embedded_block(char *thing)
free(buf);
}
/* check for valid hex or decimal numeric string */
static boolean_t
zdb_numeric(char *str)
{
int i = 0;
if (strlen(str) == 0)
return (B_FALSE);
if (strncmp(str, "0x", 2) == 0 || strncmp(str, "0X", 2) == 0)
i = 2;
for (; i < strlen(str); i++) {
if (!isxdigit(str[i]))
return (B_FALSE);
}
return (B_TRUE);
}
int
main(int argc, char **argv)
{
@@ -8311,7 +8334,7 @@ main(int argc, char **argv)
zfs_btree_verify_intensity = 3;
while ((c = getopt(argc, argv,
"AbcCdDeEFGhiI:klLmMo:Op:PqrRsSt:uU:vVx:XYyZ")) != -1) {
"AbcCdDeEFGhiI:klLmMNo:Op:PqrRsSt:uU:vVx:XYyZ")) != -1) {
switch (c) {
case 'b':
case 'c':
@@ -8325,6 +8348,7 @@ main(int argc, char **argv)
case 'l':
case 'm':
case 'M':
case 'N':
case 'O':
case 'r':
case 'R':
@@ -8416,31 +8440,6 @@ main(int argc, char **argv)
(void) fprintf(stderr, "-p option requires use of -e\n");
usage();
}
if (dump_opt['d'] || dump_opt['r']) {
/* <pool>[/<dataset | objset id> is accepted */
if (argv[2] && (objset_str = strchr(argv[2], '/')) != NULL &&
objset_str++ != NULL) {
char *endptr;
errno = 0;
objset_id = strtoull(objset_str, &endptr, 0);
/* dataset 0 is the same as opening the pool */
if (errno == 0 && endptr != objset_str &&
objset_id != 0) {
target_is_spa = B_FALSE;
dataset_lookup = B_TRUE;
} else if (objset_id != 0) {
printf("failed to open objset %s "
"%llu %s", objset_str,
(u_longlong_t)objset_id,
strerror(errno));
exit(1);
}
/* normal dataset name not an objset ID */
if (endptr == objset_str) {
objset_id = -1;
}
}
}
#if defined(_LP64)
/*
@@ -8469,13 +8468,18 @@ main(int argc, char **argv)
*/
spa_load_verify_dryrun = B_TRUE;
/*
* ZDB should have ability to read spacemaps.
*/
spa_mode_readable_spacemaps = B_TRUE;
kernel_init(SPA_MODE_READ);
if (dump_all)
verbose = MAX(verbose, 1);
for (c = 0; c < 256; c++) {
if (dump_all && strchr("AeEFklLOPrRSXy", c) == NULL)
if (dump_all && strchr("AeEFklLNOPrRSXy", c) == NULL)
dump_opt[c] = 1;
if (dump_opt[c])
dump_opt[c] += verbose;
@@ -8514,6 +8518,7 @@ main(int argc, char **argv)
return (dump_path(argv[0], argv[1], NULL));
}
if (dump_opt['r']) {
target_is_spa = B_FALSE;
if (argc != 3)
usage();
dump_opt['v'] = verbose;
@@ -8524,6 +8529,10 @@ main(int argc, char **argv)
rewind = ZPOOL_DO_REWIND |
(dump_opt['X'] ? ZPOOL_EXTREME_REWIND : 0);
/* -N implies -d */
if (dump_opt['N'] && dump_opt['d'] == 0)
dump_opt['d'] = dump_opt['N'];
if (nvlist_alloc(&policy, NV_UNIQUE_NAME_TYPE, 0) != 0 ||
nvlist_add_uint64(policy, ZPOOL_LOAD_REQUEST_TXG, max_txg) != 0 ||
nvlist_add_uint32(policy, ZPOOL_LOAD_REWIND_POLICY, rewind) != 0)
@@ -8542,6 +8551,34 @@ main(int argc, char **argv)
targetlen = strlen(target);
if (targetlen && target[targetlen - 1] == '/')
target[targetlen - 1] = '\0';
/*
* See if an objset ID was supplied (-d <pool>/<objset ID>).
* To disambiguate tank/100, consider the 100 as objsetID
* if -N was given, otherwise 100 is an objsetID iff
* tank/100 as a named dataset fails on lookup.
*/
objset_str = strchr(target, '/');
if (objset_str && strlen(objset_str) > 1 &&
zdb_numeric(objset_str + 1)) {
char *endptr;
errno = 0;
objset_str++;
objset_id = strtoull(objset_str, &endptr, 0);
/* dataset 0 is the same as opening the pool */
if (errno == 0 && endptr != objset_str &&
objset_id != 0) {
if (dump_opt['N'])
dataset_lookup = B_TRUE;
}
/* normal dataset name not an objset ID */
if (endptr == objset_str) {
objset_id = -1;
}
} else if (objset_str && !zdb_numeric(objset_str + 1) &&
dump_opt['N']) {
printf("Supply a numeric objset ID with -N\n");
exit(1);
}
} else {
target_pool = target;
}
@@ -8659,13 +8696,27 @@ main(int argc, char **argv)
}
return (error);
} else {
target_pool = strdup(target);
if (strpbrk(target, "/@") != NULL)
*strpbrk(target_pool, "/@") = '\0';
zdb_set_skip_mmp(target);
/*
* If -N was supplied, the user has indicated that
* zdb -d <pool>/<objsetID> is in effect. Otherwise
* we first assume that the dataset string is the
* dataset name. If dmu_objset_hold fails with the
* dataset string, and we have an objset_id, retry the
* lookup with the objsetID.
*/
boolean_t retry = B_TRUE;
retry_lookup:
if (dataset_lookup == B_TRUE) {
/*
* Use the supplied id to get the name
* for open_objset.
*/
error = spa_open(target, &spa, FTAG);
error = spa_open(target_pool, &spa, FTAG);
if (error == 0) {
error = name_from_objset_id(spa,
objset_id, dsname);
@@ -8674,10 +8725,23 @@ main(int argc, char **argv)
target = dsname;
}
}
if (error == 0)
if (error == 0) {
if (objset_id > 0 && retry) {
int err = dmu_objset_hold(target, FTAG,
&os);
if (err) {
dataset_lookup = B_TRUE;
retry = B_FALSE;
goto retry_lookup;
} else {
dmu_objset_rele(os, FTAG);
}
}
error = open_objset(target, FTAG, &os);
}
if (error == 0)
spa = dmu_objset_spa(os);
free(target_pool);
}
}
nvlist_free(policy);
+1
View File
@@ -599,6 +599,7 @@ fmd_timer_install(fmd_hdl_t *hdl, void *arg, fmd_event_t *ep, hrtime_t delta)
sev.sigev_notify_function = _timer_notify;
sev.sigev_notify_attributes = NULL;
sev.sigev_value.sival_ptr = ftp;
sev.sigev_signo = 0;
timer_create(CLOCK_REALTIME, &sev, &ftp->ft_tid);
timer_settime(ftp->ft_tid, 0, &its, NULL);
+20
View File
@@ -35,6 +35,7 @@
#include <sys/fs/zfs.h>
#include <sys/fm/protocol.h>
#include <sys/fm/fs/zfs.h>
#include <sys/zio.h>
#include "zfs_agents.h"
#include "fmd_api.h"
@@ -773,6 +774,8 @@ zfs_fm_recv(fmd_hdl_t *hdl, fmd_event_t *ep, nvlist_t *nvl, const char *class)
ZFS_MAKE_EREPORT(FM_EREPORT_ZFS_PROBE_FAILURE))) {
char *failmode = NULL;
boolean_t checkremove = B_FALSE;
uint32_t pri = 0;
int32_t flags = 0;
/*
* If this is a checksum or I/O error, then toss it into the
@@ -795,6 +798,23 @@ zfs_fm_recv(fmd_hdl_t *hdl, fmd_event_t *ep, nvlist_t *nvl, const char *class)
checkremove = B_TRUE;
} else if (fmd_nvl_class_match(hdl, nvl,
ZFS_MAKE_EREPORT(FM_EREPORT_ZFS_CHECKSUM))) {
/*
* We ignore ereports for checksum errors generated by
* scrub/resilver I/O to avoid potentially further
* degrading the pool while it's being repaired.
*/
if (((nvlist_lookup_uint32(nvl,
FM_EREPORT_PAYLOAD_ZFS_ZIO_PRIORITY, &pri) == 0) &&
(pri == ZIO_PRIORITY_SCRUB ||
pri == ZIO_PRIORITY_REBUILD)) ||
((nvlist_lookup_int32(nvl,
FM_EREPORT_PAYLOAD_ZFS_ZIO_FLAGS, &flags) == 0) &&
(flags & (ZIO_FLAG_SCRUB | ZIO_FLAG_RESILVER)))) {
fmd_hdl_debug(hdl, "ignoring '%s' for "
"scrub/resilver I/O", class);
return;
}
if (zcp->zc_data.zc_serd_checksum[0] == '\0') {
zfs_serd_name(zcp->zc_data.zc_serd_checksum,
pool_guid, vdev_guid, "checksum");
+170 -21
View File
@@ -525,6 +525,7 @@ typedef struct dev_data {
boolean_t dd_islabeled;
uint64_t dd_pool_guid;
uint64_t dd_vdev_guid;
uint64_t dd_new_vdev_guid;
const char *dd_new_devid;
} dev_data_t;
@@ -535,6 +536,7 @@ zfs_iter_vdev(zpool_handle_t *zhp, nvlist_t *nvl, void *data)
char *path = NULL;
uint_t c, children;
nvlist_t **child;
uint64_t guid = 0;
/*
* First iterate over any children.
@@ -562,17 +564,14 @@ zfs_iter_vdev(zpool_handle_t *zhp, nvlist_t *nvl, void *data)
/* once a vdev was matched and processed there is nothing left to do */
if (dp->dd_found)
return;
(void) nvlist_lookup_uint64(nvl, ZPOOL_CONFIG_GUID, &guid);
/*
* Match by GUID if available otherwise fallback to devid or physical
*/
if (dp->dd_vdev_guid != 0) {
uint64_t guid;
if (nvlist_lookup_uint64(nvl, ZPOOL_CONFIG_GUID,
&guid) != 0 || guid != dp->dd_vdev_guid) {
if (guid != dp->dd_vdev_guid)
return;
}
zed_log_msg(LOG_INFO, " zfs_iter_vdev: matched on %llu", guid);
dp->dd_found = B_TRUE;
@@ -582,6 +581,12 @@ zfs_iter_vdev(zpool_handle_t *zhp, nvlist_t *nvl, void *data)
* illumos, substring matching is not required to accommodate
* the partition suffix. An exact match will be present in
* the dp->dd_compare value.
* If the attached disk already contains a vdev GUID, it means
* the disk is not clean. In such a scenario, the physical path
* would be a match that makes the disk faulted when trying to
* online it. So, we would only want to proceed if either GUID
* matches with the last attached disk or the disk is in clean
* state.
*/
if (nvlist_lookup_string(nvl, dp->dd_prop, &path) != 0 ||
strcmp(dp->dd_compare, path) != 0) {
@@ -589,6 +594,12 @@ zfs_iter_vdev(zpool_handle_t *zhp, nvlist_t *nvl, void *data)
__func__, dp->dd_compare, path);
return;
}
if (dp->dd_new_vdev_guid != 0 && dp->dd_new_vdev_guid != guid) {
zed_log_msg(LOG_INFO, " %s: no match (GUID:%llu"
" != vdev GUID:%llu)", __func__,
dp->dd_new_vdev_guid, guid);
return;
}
zed_log_msg(LOG_INFO, " zfs_iter_vdev: matched %s on %s",
dp->dd_prop, path);
@@ -670,7 +681,7 @@ zfs_iter_pool(zpool_handle_t *zhp, void *data)
*/
static boolean_t
devphys_iter(const char *physical, const char *devid, zfs_process_func_t func,
boolean_t is_slice)
boolean_t is_slice, uint64_t new_vdev_guid)
{
dev_data_t data = { 0 };
@@ -680,6 +691,7 @@ devphys_iter(const char *physical, const char *devid, zfs_process_func_t func,
data.dd_found = B_FALSE;
data.dd_islabeled = is_slice;
data.dd_new_devid = devid; /* used by auto replace code */
data.dd_new_vdev_guid = new_vdev_guid;
(void) zpool_iter(g_zfshdl, zfs_iter_pool, &data);
@@ -848,7 +860,7 @@ zfs_deliver_add(nvlist_t *nvl, boolean_t is_lofi)
if (devid_iter(devid, zfs_process_add, is_slice))
return (0);
if (devpath != NULL && devphys_iter(devpath, devid, zfs_process_add,
is_slice))
is_slice, vdev_guid))
return (0);
if (vdev_guid != 0)
(void) guid_iter(pool_guid, vdev_guid, devid, zfs_process_add,
@@ -894,21 +906,96 @@ zfs_deliver_check(nvlist_t *nvl)
return (0);
}
/*
* Given a path to a vdev, lookup the vdev's physical size from its
* config nvlist.
*
* Returns the vdev's physical size in bytes on success, 0 on error.
*/
static uint64_t
vdev_size_from_config(zpool_handle_t *zhp, const char *vdev_path)
{
nvlist_t *nvl = NULL;
boolean_t avail_spare, l2cache, log;
vdev_stat_t *vs = NULL;
uint_t c;
nvl = zpool_find_vdev(zhp, vdev_path, &avail_spare, &l2cache, &log);
if (!nvl)
return (0);
verify(nvlist_lookup_uint64_array(nvl, ZPOOL_CONFIG_VDEV_STATS,
(uint64_t **)&vs, &c) == 0);
if (!vs) {
zed_log_msg(LOG_INFO, "%s: no nvlist for '%s'", __func__,
vdev_path);
return (0);
}
return (vs->vs_pspace);
}
/*
* Given a path to a vdev, lookup if the vdev is a "whole disk" in the
* config nvlist. "whole disk" means that ZFS was passed a whole disk
* at pool creation time, which it partitioned up and has full control over.
* Thus a partition with wholedisk=1 set tells us that zfs created the
* partition at creation time. A partition without whole disk set would have
* been created by externally (like with fdisk) and passed to ZFS.
*
* Returns the whole disk value (either 0 or 1).
*/
static uint64_t
vdev_whole_disk_from_config(zpool_handle_t *zhp, const char *vdev_path)
{
nvlist_t *nvl = NULL;
boolean_t avail_spare, l2cache, log;
uint64_t wholedisk = 0;
nvl = zpool_find_vdev(zhp, vdev_path, &avail_spare, &l2cache, &log);
if (!nvl)
return (0);
(void) nvlist_lookup_uint64(nvl, ZPOOL_CONFIG_WHOLE_DISK, &wholedisk);
return (wholedisk);
}
/*
* If the device size grew more than 1% then return true.
*/
#define DEVICE_GREW(oldsize, newsize) \
((newsize > oldsize) && \
((newsize / (newsize - oldsize)) <= 100))
static int
zfsdle_vdev_online(zpool_handle_t *zhp, void *data)
{
char *devname = data;
boolean_t avail_spare, l2cache;
nvlist_t *udev_nvl = data;
nvlist_t *tgt;
int error;
char *tmp_devname, devname[MAXPATHLEN] = "";
uint64_t guid;
if (nvlist_lookup_uint64(udev_nvl, ZFS_EV_VDEV_GUID, &guid) == 0) {
sprintf(devname, "%llu", (u_longlong_t)guid);
} else if (nvlist_lookup_string(udev_nvl, DEV_PHYS_PATH,
&tmp_devname) == 0) {
strlcpy(devname, tmp_devname, MAXPATHLEN);
zfs_append_partition(devname, MAXPATHLEN);
} else {
zed_log_msg(LOG_INFO, "%s: no guid or physpath", __func__);
}
zed_log_msg(LOG_INFO, "zfsdle_vdev_online: searching for '%s' in '%s'",
devname, zpool_get_name(zhp));
if ((tgt = zpool_find_vdev_by_physpath(zhp, devname,
&avail_spare, &l2cache, NULL)) != NULL) {
char *path, fullpath[MAXPATHLEN];
uint64_t wholedisk;
uint64_t wholedisk = 0;
error = nvlist_lookup_string(tgt, ZPOOL_CONFIG_PATH, &path);
if (error) {
@@ -916,10 +1003,8 @@ zfsdle_vdev_online(zpool_handle_t *zhp, void *data)
return (0);
}
error = nvlist_lookup_uint64(tgt, ZPOOL_CONFIG_WHOLE_DISK,
(void) nvlist_lookup_uint64(tgt, ZPOOL_CONFIG_WHOLE_DISK,
&wholedisk);
if (error)
wholedisk = 0;
if (wholedisk) {
path = strrchr(path, '/');
@@ -953,12 +1038,75 @@ zfsdle_vdev_online(zpool_handle_t *zhp, void *data)
vdev_state_t newstate;
if (zpool_get_state(zhp) != POOL_STATE_UNAVAIL) {
error = zpool_vdev_online(zhp, fullpath, 0,
&newstate);
zed_log_msg(LOG_INFO, "zfsdle_vdev_online: "
"setting device '%s' to ONLINE state "
"in pool '%s': %d", fullpath,
zpool_get_name(zhp), error);
/*
* If this disk size has not changed, then
* there's no need to do an autoexpand. To
* check we look at the disk's size in its
* config, and compare it to the disk size
* that udev is reporting.
*/
uint64_t udev_size = 0, conf_size = 0,
wholedisk = 0, udev_parent_size = 0;
/*
* Get the size of our disk that udev is
* reporting.
*/
if (nvlist_lookup_uint64(udev_nvl, DEV_SIZE,
&udev_size) != 0) {
udev_size = 0;
}
/*
* Get the size of our disk's parent device
* from udev (where sda1's parent is sda).
*/
if (nvlist_lookup_uint64(udev_nvl,
DEV_PARENT_SIZE, &udev_parent_size) != 0) {
udev_parent_size = 0;
}
conf_size = vdev_size_from_config(zhp,
fullpath);
wholedisk = vdev_whole_disk_from_config(zhp,
fullpath);
/*
* Only attempt an autoexpand if the vdev size
* changed. There are two different cases
* to consider.
*
* 1. wholedisk=1
* If you do a 'zpool create' on a whole disk
* (like /dev/sda), then zfs will create
* partitions on the disk (like /dev/sda1). In
* that case, wholedisk=1 will be set in the
* partition's nvlist config. So zed will need
* to see if your parent device (/dev/sda)
* expanded in size, and if so, then attempt
* the autoexpand.
*
* 2. wholedisk=0
* If you do a 'zpool create' on an existing
* partition, or a device that doesn't allow
* partitions, then wholedisk=0, and you will
* simply need to check if the device itself
* expanded in size.
*/
if (DEVICE_GREW(conf_size, udev_size) ||
(wholedisk && DEVICE_GREW(conf_size,
udev_parent_size))) {
error = zpool_vdev_online(zhp, fullpath,
0, &newstate);
zed_log_msg(LOG_INFO,
"%s: autoexpanding '%s' from %llu"
" to %llu bytes in pool '%s': %d",
__func__, fullpath, conf_size,
MAX(udev_size, udev_parent_size),
zpool_get_name(zhp), error);
}
}
}
zpool_close(zhp);
@@ -986,10 +1134,11 @@ zfs_deliver_dle(nvlist_t *nvl)
strlcpy(name, devname, MAXPATHLEN);
zfs_append_partition(name, MAXPATHLEN);
} else {
sprintf(name, "unknown");
zed_log_msg(LOG_INFO, "zfs_deliver_dle: no guid or physpath");
}
if (zpool_iter(g_zfshdl, zfsdle_vdev_online, name) != 1) {
if (zpool_iter(g_zfshdl, zfsdle_vdev_online, nvl) != 1) {
zed_log_msg(LOG_INFO, "zfs_deliver_dle: device '%s' not "
"found", name);
return (1);
@@ -1075,7 +1224,7 @@ zfs_enum_pools(void *arg)
* For now, each agent has its own libzfs instance
*/
int
zfs_slm_init()
zfs_slm_init(void)
{
if ((g_zfshdl = libzfs_init()) == NULL)
return (-1);
@@ -1101,7 +1250,7 @@ zfs_slm_init()
}
void
zfs_slm_fini()
zfs_slm_fini(void)
{
unavailpool_t *pool;
pendingdev_t *device;
+2 -2
View File
@@ -37,7 +37,7 @@ if [ "${ZEVENT_VDEV_STATE_STR}" != "FAULTED" ] \
fi
umask 077
note_subject="ZFS device fault for pool ${ZEVENT_POOL_GUID} on $(hostname)"
note_subject="ZFS device fault for pool ${ZEVENT_POOL} on $(hostname)"
note_pathname="$(mktemp)"
{
if [ "${ZEVENT_VDEV_STATE_STR}" = "FAULTED" ] ; then
@@ -65,7 +65,7 @@ note_pathname="$(mktemp)"
[ -n "${ZEVENT_VDEV_GUID}" ] && echo " vguid: ${ZEVENT_VDEV_GUID}"
[ -n "${ZEVENT_VDEV_DEVID}" ] && echo " devid: ${ZEVENT_VDEV_DEVID}"
echo " pool: ${ZEVENT_POOL_GUID}"
echo " pool: ${ZEVENT_POOL} (${ZEVENT_POOL_GUID})"
} > "${note_pathname}"
+17 -4
View File
@@ -224,6 +224,8 @@ zed_notify()
# ZED_EMAIL_OPTS. This undergoes the following keyword substitutions:
# - @ADDRESS@ is replaced with the space-delimited recipient email address(es)
# - @SUBJECT@ is replaced with the notification subject
# If @SUBJECT@ was omited here, a "Subject: ..." header will be added to notification
#
#
# Arguments
# subject: notification subject
@@ -241,7 +243,7 @@ zed_notify()
#
zed_notify_email()
{
local subject="$1"
local subject="${1:-"ZED notification"}"
local pathname="${2:-"/dev/null"}"
: "${ZED_EMAIL_PROG:="mail"}"
@@ -262,12 +264,23 @@ zed_notify_email()
return 1
fi
ZED_EMAIL_OPTS="$(echo "${ZED_EMAIL_OPTS}" \
# construct cmdline options
ZED_EMAIL_OPTS_PARSED="$(echo "${ZED_EMAIL_OPTS}" \
| sed -e "s/@ADDRESS@/${ZED_EMAIL_ADDR}/g" \
-e "s/@SUBJECT@/${subject}/g")"
# shellcheck disable=SC2086
eval ${ZED_EMAIL_PROG} ${ZED_EMAIL_OPTS} < "${pathname}" >/dev/null 2>&1
# pipe message to email prog
# shellcheck disable=SC2086,SC2248
{
# no subject passed as option?
if [ "${ZED_EMAIL_OPTS%@SUBJECT@*}" = "${ZED_EMAIL_OPTS}" ] ; then
# inject subject header
printf "Subject: %s\n" "${subject}"
fi
# output message
cat "${pathname}"
} |
eval ${ZED_EMAIL_PROG} ${ZED_EMAIL_OPTS_PARSED} >/dev/null 2>&1
rv=$?
if [ "${rv}" -ne 0 ]; then
zed_log_err "${ZED_EMAIL_PROG##*/} exit=${rv}"
+1
View File
@@ -30,6 +30,7 @@ ZED_EMAIL_ADDR="root"
# The string @SUBJECT@ will be replaced with the notification subject;
# this should be protected with quotes to prevent word-splitting.
# Email will only be sent if ZED_EMAIL_ADDR is defined.
# If @SUBJECT@ was omited here, a "Subject: ..." header will be added to notification
#
#ZED_EMAIL_OPTS="-s '@SUBJECT@' @ADDRESS@"
+54 -14
View File
@@ -78,6 +78,8 @@ zed_udev_event(const char *class, const char *subclass, nvlist_t *nvl)
zed_log_msg(LOG_INFO, "\t%s: %s", DEV_PHYS_PATH, strval);
if (nvlist_lookup_uint64(nvl, DEV_SIZE, &numval) == 0)
zed_log_msg(LOG_INFO, "\t%s: %llu", DEV_SIZE, numval);
if (nvlist_lookup_uint64(nvl, DEV_PARENT_SIZE, &numval) == 0)
zed_log_msg(LOG_INFO, "\t%s: %llu", DEV_PARENT_SIZE, numval);
if (nvlist_lookup_uint64(nvl, ZFS_EV_POOL_GUID, &numval) == 0)
zed_log_msg(LOG_INFO, "\t%s: %llu", ZFS_EV_POOL_GUID, numval);
if (nvlist_lookup_uint64(nvl, ZFS_EV_VDEV_GUID, &numval) == 0)
@@ -130,6 +132,20 @@ dev_event_nvlist(struct udev_device *dev)
numval *= strtoull(value, NULL, 10);
(void) nvlist_add_uint64(nvl, DEV_SIZE, numval);
/*
* If the device has a parent, then get the parent block
* device's size as well. For example, /dev/sda1's parent
* is /dev/sda.
*/
struct udev_device *parent_dev = udev_device_get_parent(dev);
if ((value = udev_device_get_sysattr_value(parent_dev, "size"))
!= NULL) {
uint64_t numval = DEV_BSIZE;
numval *= strtoull(value, NULL, 10);
(void) nvlist_add_uint64(nvl, DEV_PARENT_SIZE, numval);
}
}
/*
@@ -169,7 +185,7 @@ zed_udev_monitor(void *arg)
while (1) {
struct udev_device *dev;
const char *action, *type, *part, *sectors;
const char *bus, *uuid;
const char *bus, *uuid, *devpath;
const char *class, *subclass;
nvlist_t *nvl;
boolean_t is_zfs = B_FALSE;
@@ -208,6 +224,12 @@ zed_udev_monitor(void *arg)
* if this is a disk and it is partitioned, then the
* zfs label will reside in a DEVTYPE=partition and
* we can skip passing this event
*
* Special case: Blank disks are sometimes reported with
* an erroneous 'atari' partition, and should not be
* excluded from being used as an autoreplace disk:
*
* https://github.com/openzfs/zfs/issues/13497
*/
type = udev_device_get_property_value(dev, "DEVTYPE");
part = udev_device_get_property_value(dev,
@@ -215,14 +237,23 @@ zed_udev_monitor(void *arg)
if (type != NULL && type[0] != '\0' &&
strcmp(type, "disk") == 0 &&
part != NULL && part[0] != '\0') {
zed_log_msg(LOG_INFO,
"%s: skip %s since it has a %s partition already",
__func__,
udev_device_get_property_value(dev, "DEVNAME"),
part);
/* skip and wait for partition event */
udev_device_unref(dev);
continue;
const char *devname =
udev_device_get_property_value(dev, "DEVNAME");
if (strcmp(part, "atari") == 0) {
zed_log_msg(LOG_INFO,
"%s: %s is reporting an atari partition, "
"but we're going to assume it's a false "
"positive and still use it (issue #13497)",
__func__, devname);
} else {
zed_log_msg(LOG_INFO,
"%s: skip %s since it has a %s partition "
"already", __func__, devname, part);
/* skip and wait for partition event */
udev_device_unref(dev);
continue;
}
}
/*
@@ -248,10 +279,19 @@ zed_udev_monitor(void *arg)
* device id string is required in the message schema
* for matching with vdevs. Preflight here for expected
* udev information.
*
* Special case:
* NVMe devices don't have ID_BUS set (at least on RHEL 7-8),
* but they are valid for autoreplace. Add a special case for
* them by searching for "/nvme/" in the udev DEVPATH:
*
* DEVPATH=/devices/pci0000:00/0000:00:1e.0/nvme/nvme2/nvme2n1
*/
bus = udev_device_get_property_value(dev, "ID_BUS");
uuid = udev_device_get_property_value(dev, "DM_UUID");
if (!is_zfs && (bus == NULL && uuid == NULL)) {
devpath = udev_device_get_devpath(dev);
if (!is_zfs && (bus == NULL && uuid == NULL &&
strstr(devpath, "/nvme/") == NULL)) {
zed_log_msg(LOG_INFO, "zed_udev_monitor: %s no devid "
"source", udev_device_get_devnode(dev));
udev_device_unref(dev);
@@ -362,7 +402,7 @@ zed_udev_monitor(void *arg)
}
int
zed_disk_event_init()
zed_disk_event_init(void)
{
int fd, fflags;
@@ -398,7 +438,7 @@ zed_disk_event_init()
}
void
zed_disk_event_fini()
zed_disk_event_fini(void)
{
/* cancel monitor thread at recvmsg() */
(void) pthread_cancel(g_mon_tid);
@@ -416,13 +456,13 @@ zed_disk_event_fini()
#include "zed_disk_event.h"
int
zed_disk_event_init()
zed_disk_event_init(void)
{
return (0);
}
void
zed_disk_event_fini()
zed_disk_event_fini(void)
{
}
+7 -4
View File
@@ -2480,7 +2480,7 @@ upgrade_set_callback(zfs_handle_t *zhp, void *data)
/* upgrade */
if (version < cb->cb_version) {
char verstr[16];
char verstr[24];
(void) snprintf(verstr, sizeof (verstr),
"%llu", (u_longlong_t)cb->cb_version);
if (cb->cb_lastfs[0] && !same_pool(zhp, cb->cb_lastfs)) {
@@ -6593,7 +6593,7 @@ zfs_do_holds(int argc, char **argv)
/*
* 1. collect holds data, set format options
*/
ret = zfs_for_each(argc, argv, flags, types, NULL, NULL, limit,
ret = zfs_for_each(1, argv + i, flags, types, NULL, NULL, limit,
holds_callback, &cb);
if (ret != 0)
++errors;
@@ -7672,7 +7672,7 @@ zfs_do_diff(int argc, char **argv)
int c;
struct sigaction sa;
while ((c = getopt(argc, argv, "FHt")) != -1) {
while ((c = getopt(argc, argv, "FHth")) != -1) {
switch (c) {
case 'F':
flags |= ZFS_DIFF_CLASSIFY;
@@ -7683,6 +7683,9 @@ zfs_do_diff(int argc, char **argv)
case 't':
flags |= ZFS_DIFF_TIMESTAMP;
break;
case 'h':
flags |= ZFS_DIFF_NO_MANGLE;
break;
default:
(void) fprintf(stderr,
gettext("invalid option '%c'\n"), optopt);
@@ -8532,7 +8535,7 @@ static int
zfs_do_wait(int argc, char **argv)
{
boolean_t enabled[ZFS_WAIT_NUM_ACTIVITIES];
int error, i;
int error = 0, i;
int c;
/* By default, wait for all types of activity. */
+10 -4
View File
@@ -207,7 +207,6 @@ static int
zfs_project_handle_dir(const char *name, zfs_project_control_t *zpc,
list_t *head)
{
char fullname[PATH_MAX];
struct dirent *ent;
DIR *dir;
int ret = 0;
@@ -227,21 +226,28 @@ zfs_project_handle_dir(const char *name, zfs_project_control_t *zpc,
zpc->zpc_ignore_noent = B_TRUE;
errno = 0;
while (!ret && (ent = readdir(dir)) != NULL) {
char *fullname;
/* skip "." and ".." */
if (strcmp(ent->d_name, ".") == 0 ||
strcmp(ent->d_name, "..") == 0)
continue;
if (strlen(ent->d_name) + strlen(name) >=
sizeof (fullname) + 1) {
if (strlen(ent->d_name) + strlen(name) + 1 >= PATH_MAX) {
errno = ENAMETOOLONG;
break;
}
sprintf(fullname, "%s/%s", name, ent->d_name);
if (asprintf(&fullname, "%s/%s", name, ent->d_name) == -1) {
errno = ENOMEM;
break;
}
ret = zfs_project_handle_one(fullname, zpc);
if (!ret && zpc->zpc_recursive && ent->d_type == DT_DIR)
zfs_project_item_alloc(head, fullname);
free(fullname);
}
if (errno && !ret) {
+27 -9
View File
@@ -2438,7 +2438,14 @@ print_status_config(zpool_handle_t *zhp, status_cbdata_t *cb, const char *name,
(void) nvlist_lookup_uint64_array(root, ZPOOL_CONFIG_SCAN_STATS,
(uint64_t **)&ps, &c);
if (ps != NULL && ps->pss_state == DSS_SCANNING && children == 0) {
/*
* If you force fault a drive that's resilvering, its scan stats can
* get frozen in time, giving the false impression that it's
* being resilvered. That's why we check the state to see if the vdev
* is healthy before reporting "resilvering" or "repairing".
*/
if (ps != NULL && ps->pss_state == DSS_SCANNING && children == 0 &&
vs->vs_state == VDEV_STATE_HEALTHY) {
if (vs->vs_scan_processed != 0) {
(void) printf(gettext(" (%s)"),
(ps->pss_func == POOL_SCAN_RESILVER) ?
@@ -2450,7 +2457,7 @@ print_status_config(zpool_handle_t *zhp, status_cbdata_t *cb, const char *name,
/* The top-level vdevs have the rebuild stats */
if (vrs != NULL && vrs->vrs_state == VDEV_REBUILD_ACTIVE &&
children == 0) {
children == 0 && vs->vs_state == VDEV_STATE_HEALTHY) {
if (vs->vs_rebuild_processed != 0) {
(void) printf(gettext(" (resilvering)"));
}
@@ -5407,7 +5414,13 @@ print_zpool_dir_scripts(char *dirpath)
if ((dir = opendir(dirpath)) != NULL) {
/* print all the files and directories within directory */
while ((ent = readdir(dir)) != NULL) {
sprintf(fullpath, "%s/%s", dirpath, ent->d_name);
if (snprintf(fullpath, sizeof (fullpath), "%s/%s",
dirpath, ent->d_name) >= sizeof (fullpath)) {
(void) fprintf(stderr,
gettext("internal error: "
"ZPOOL_SCRIPTS_PATH too large.\n"));
exit(1);
}
/* Print the scripts */
if (stat(fullpath, &dir_stat) == 0)
@@ -5458,8 +5471,8 @@ get_namewidth_iostat(zpool_handle_t *zhp, void *data)
* get_namewidth() returns the maximum width of any name in that column
* for any pool/vdev/device line that will be output.
*/
width = get_namewidth(zhp, cb->cb_namewidth, cb->cb_name_flags,
cb->cb_verbose);
width = get_namewidth(zhp, cb->cb_namewidth,
cb->cb_name_flags | VDEV_NAME_TYPE_ID, cb->cb_verbose);
/*
* The width we are calculating is the width of the header and also the
@@ -6035,6 +6048,7 @@ print_one_column(zpool_prop_t prop, uint64_t value, const char *str,
size_t width = zprop_width(prop, &fixed, ZFS_TYPE_POOL);
switch (prop) {
case ZPOOL_PROP_SIZE:
case ZPOOL_PROP_EXPANDSZ:
case ZPOOL_PROP_CHECKPOINT:
case ZPOOL_PROP_DEDUPRATIO:
@@ -6130,8 +6144,12 @@ print_list_stats(zpool_handle_t *zhp, const char *name, nvlist_t *nv,
* 'toplevel' boolean value is passed to the print_one_column()
* to indicate that the value is valid.
*/
print_one_column(ZPOOL_PROP_SIZE, vs->vs_space, NULL, scripted,
toplevel, format);
if (vs->vs_pspace)
print_one_column(ZPOOL_PROP_SIZE, vs->vs_pspace, NULL,
scripted, B_TRUE, format);
else
print_one_column(ZPOOL_PROP_SIZE, vs->vs_space, NULL,
scripted, toplevel, format);
print_one_column(ZPOOL_PROP_ALLOCATED, vs->vs_alloc, NULL,
scripted, toplevel, format);
print_one_column(ZPOOL_PROP_FREE, vs->vs_space - vs->vs_alloc,
@@ -6282,8 +6300,8 @@ get_namewidth_list(zpool_handle_t *zhp, void *data)
list_cbdata_t *cb = data;
int width;
width = get_namewidth(zhp, cb->cb_namewidth, cb->cb_name_flags,
cb->cb_verbose);
width = get_namewidth(zhp, cb->cb_namewidth,
cb->cb_name_flags | VDEV_NAME_TYPE_ID, cb->cb_verbose);
if (width < 9)
width = 9;
+3 -3
View File
@@ -363,9 +363,6 @@ zstream_do_dump(int argc, char *argv[])
BSWAP_64(drrb->drr_fromguid);
}
featureflags =
DMU_GET_FEATUREFLAGS(drrb->drr_versioninfo);
(void) printf("BEGIN record\n");
(void) printf("\thdrtype = %lld\n",
DMU_GET_STREAM_HDRTYPE(drrb->drr_versioninfo));
@@ -465,6 +462,9 @@ zstream_do_dump(int argc, char *argv[])
BSWAP_64(drro->drr_maxblkid);
}
featureflags =
DMU_GET_FEATUREFLAGS(drrb->drr_versioninfo);
if (featureflags & DMU_BACKUP_FEATURE_RAW &&
drro->drr_bonuslen > drro->drr_raw_bonuslen) {
(void) fprintf(stderr,
+1
View File
@@ -2193,6 +2193,7 @@ ztest_replay_write(void *arg1, void *arg2, boolean_t byteswap)
* but not always, because we also want to verify correct
* behavior when the data was not recently read into cache.
*/
ASSERT(doi.doi_data_block_size);
ASSERT0(offset % doi.doi_data_block_size);
if (ztest_random(4) != 0) {
int prefetch = ztest_random(2) ?
+13 -4
View File
@@ -28,15 +28,17 @@ filter_out_deleted_zvols() {
list_zvols() {
read -r default_volmode < /sys/module/zfs/parameters/zvol_volmode
zfs list -t volume -H -o \
name,volmode,receive_resume_token,redact_snaps |
while IFS=" " read -r name volmode token redacted; do # IFS=\t here!
name,volmode,receive_resume_token,redact_snaps,keystatus |
while IFS=" " read -r name volmode token redacted keystatus; do # IFS=\t here!
# /dev links are not created for zvols with volmode = "none"
# or for redacted zvols.
# /dev links are not created for zvols with volmode = "none",
# redacted zvols, or encrypted zvols for which the key has not
# been loaded.
[ "$volmode" = "none" ] && continue
[ "$volmode" = "default" ] && [ "$default_volmode" = "3" ] &&
continue
[ "$redacted" = "-" ] || continue
[ "$keystatus" = "unavailable" ] && continue
# We also ignore partially received zvols if it is
# not an incremental receive, as those won't even have a block
@@ -107,6 +109,13 @@ while [ "$outer_loop" -lt 20 ]; do
exit 0
fi
fi
#
# zvol_count made some progress - let's stay in this loop.
#
if [ "$old_zvols_count" -gt "$zvols_count" ]; then
outer_loop=$((outer_loop - 1))
fi
done
echo "Timed out waiting on zvol links"
+32 -36
View File
@@ -88,7 +88,7 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_CC_NO_FORMAT_TRUNCATION], [
])
dnl #
dnl # Check if gcc supports -Wno-format-truncation option.
dnl # Check if gcc supports -Wno-format-zero-length option.
dnl #
AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_CC_NO_FORMAT_ZERO_LENGTH], [
AC_MSG_CHECKING([whether $CC supports -Wno-format-zero-length])
@@ -108,57 +108,30 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_CC_NO_FORMAT_ZERO_LENGTH], [
AC_SUBST([NO_FORMAT_ZERO_LENGTH])
])
dnl #
dnl # Check if gcc supports -Wno-bool-compare option.
dnl # Check if gcc supports -Wno-clobbered option.
dnl #
dnl # We actually invoke gcc with the -Wbool-compare option
dnl # We actually invoke gcc with the -Wclobbered option
dnl # and infer the 'no-' version does or doesn't exist based upon
dnl # the results. This is required because when checking any of
dnl # no- prefixed options gcc always returns success.
dnl #
AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_CC_NO_BOOL_COMPARE], [
AC_MSG_CHECKING([whether $CC supports -Wno-bool-compare])
AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_CC_NO_CLOBBERED], [
AC_MSG_CHECKING([whether $CC supports -Wno-clobbered])
saved_flags="$CFLAGS"
CFLAGS="$CFLAGS -Werror -Wbool-compare"
CFLAGS="$CFLAGS -Werror -Wclobbered"
AC_COMPILE_IFELSE([AC_LANG_PROGRAM([], [])], [
NO_BOOL_COMPARE=-Wno-bool-compare
NO_CLOBBERED=-Wno-clobbered
AC_MSG_RESULT([yes])
], [
NO_BOOL_COMPARE=
NO_CLOBBERED=
AC_MSG_RESULT([no])
])
CFLAGS="$saved_flags"
AC_SUBST([NO_BOOL_COMPARE])
])
dnl #
dnl # Check if gcc supports -Wno-unused-but-set-variable option.
dnl #
dnl # We actually invoke gcc with the -Wunused-but-set-variable option
dnl # and infer the 'no-' version does or doesn't exist based upon
dnl # the results. This is required because when checking any of
dnl # no- prefixed options gcc always returns success.
dnl #
AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_CC_NO_UNUSED_BUT_SET_VARIABLE], [
AC_MSG_CHECKING([whether $CC supports -Wno-unused-but-set-variable])
saved_flags="$CFLAGS"
CFLAGS="$CFLAGS -Werror -Wunused-but-set-variable"
AC_COMPILE_IFELSE([AC_LANG_PROGRAM([], [])], [
NO_UNUSED_BUT_SET_VARIABLE=-Wno-unused-but-set-variable
AC_MSG_RESULT([yes])
], [
NO_UNUSED_BUT_SET_VARIABLE=
AC_MSG_RESULT([no])
])
CFLAGS="$saved_flags"
AC_SUBST([NO_UNUSED_BUT_SET_VARIABLE])
AC_SUBST([NO_CLOBBERED])
])
dnl #
@@ -184,6 +157,29 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_CC_IMPLICIT_FALLTHROUGH], [
AC_SUBST([IMPLICIT_FALLTHROUGH])
])
dnl #
dnl # Check if cc supports -Winfinite-recursion option.
dnl #
AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_CC_INFINITE_RECURSION], [
AC_MSG_CHECKING([whether $CC supports -Winfinite-recursion])
saved_flags="$CFLAGS"
CFLAGS="$CFLAGS -Werror -Winfinite-recursion"
AC_COMPILE_IFELSE([AC_LANG_PROGRAM([], [])], [
INFINITE_RECURSION=-Winfinite-recursion
AC_DEFINE([HAVE_INFINITE_RECURSION], 1,
[Define if compiler supports -Winfinite-recursion])
AC_MSG_RESULT([yes])
], [
INFINITE_RECURSION=
AC_MSG_RESULT([no])
])
CFLAGS="$saved_flags"
AC_SUBST([INFINITE_RECURSION])
])
dnl #
dnl # Check if gcc supports -fno-omit-frame-pointer option.
dnl #
+8
View File
@@ -0,0 +1,8 @@
dnl #
dnl # Check if GNU parallel is available.
dnl #
AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_PARALLEL], [
AC_CHECK_PROG([PARALLEL], [parallel], [yes])
AM_CONDITIONAL([HAVE_PARALLEL], [test "x$PARALLEL" = "xyes"])
])
+10
View File
@@ -46,6 +46,16 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_PYZFS], [
])
AC_SUBST(DEFINE_PYZFS)
dnl #
dnl # Autodetection disables pyzfs if kernel or srpm config
dnl #
AS_IF([test "x$enable_pyzfs" = xcheck], [
AS_IF([test "x$ZFS_CONFIG" = xkernel -o "x$ZFS_CONFIG" = xsrpm ], [
enable_pyzfs=no
AC_MSG_NOTICE([Disabling pyzfs for kernel/srpm config])
])
])
dnl #
dnl # Python "packaging" (or, failing that, "distlib") module is required to build and install pyzfs
dnl #
+36 -37
View File
@@ -97,23 +97,13 @@ AC_DEFUN([AX_PYTHON_DEVEL],[
# Check for a version of Python >= 2.1.0
#
AC_MSG_CHECKING([for a version of Python >= '2.1.0'])
ac_supports_python_ver=`cat<<EOD | $PYTHON -
from __future__ import print_function;
import sys;
try:
from packaging import version;
except ImportError:
from distlib import version;
ver = sys.version.split ()[[0]];
(tst_cmp, tst_ver) = ">= '2.1.0'".split ();
tst_ver = tst_ver.strip ("'");
eval ("print (version.LegacyVersion (ver)"+ tst_cmp +"version.LegacyVersion (tst_ver))")
EOD`
ac_supports_python_ver=`$PYTHON -c "import sys; \
ver = sys.version.split ()[[0]]; \
print (ver >= '2.1.0')"`
if test "$ac_supports_python_ver" != "True"; then
if test -z "$PYTHON_NOVERSIONCHECK"; then
AC_MSG_RESULT([no])
m4_ifvaln([$2],[$2],[
AC_MSG_FAILURE([
AC_MSG_FAILURE([
This version of the AC@&t@_PYTHON_DEVEL macro
doesn't work properly with versions of Python before
2.1.0. You may need to re-run configure, setting the
@@ -122,7 +112,6 @@ PYTHON_EXTRA_LIBS and PYTHON_EXTRA_LDFLAGS by hand.
Moreover, to disable this check, set PYTHON_NOVERSIONCHECK
to something else than an empty string.
])
])
else
AC_MSG_RESULT([skip at user request])
fi
@@ -131,37 +120,47 @@ to something else than an empty string.
fi
#
# if the macro parameter ``version'' is set, honour it
# If the macro parameter ``version'' is set, honour it.
# A Python shim class, VPy, is used to implement correct version comparisons via
# string expressions, since e.g. a naive textual ">= 2.7.3" won't work for
# Python 2.7.10 (the ".1" being evaluated as less than ".3").
#
if test -n "$1"; then
AC_MSG_CHECKING([for a version of Python $1])
# Why the strip ()? Because if we don't, version.parse
# will, for example, report 3.10.0 >= '3.11.0'
ac_supports_python_ver=`cat<<EOD | $PYTHON -
from __future__ import print_function;
import sys;
try:
from packaging import version;
except ImportError:
from distlib import version;
ver = sys.version.split ()[[0]];
(tst_cmp, tst_ver) = "$1".split ();
tst_ver = tst_ver.strip ("'");
eval ("print (version.LegacyVersion (ver)"+ tst_cmp +"version.LegacyVersion (tst_ver))")
EOD`
cat << EOF > ax_python_devel_vpy.py
class VPy:
def vtup(self, s):
return tuple(map(int, s.strip().replace("rc", ".").split(".")))
def __init__(self):
import sys
self.vpy = tuple(sys.version_info)
def __eq__(self, s):
return self.vpy == self.vtup(s)
def __ne__(self, s):
return self.vpy != self.vtup(s)
def __lt__(self, s):
return self.vpy < self.vtup(s)
def __gt__(self, s):
return self.vpy > self.vtup(s)
def __le__(self, s):
return self.vpy <= self.vtup(s)
def __ge__(self, s):
return self.vpy >= self.vtup(s)
EOF
ac_supports_python_ver=`$PYTHON -c "import ax_python_devel_vpy; \
ver = ax_python_devel_vpy.VPy(); \
print (ver $1)"`
rm -rf ax_python_devel_vpy*.py* __pycache__/ax_python_devel_vpy*.py*
if test "$ac_supports_python_ver" = "True"; then
AC_MSG_RESULT([yes])
AC_MSG_RESULT([yes])
else
AC_MSG_RESULT([no])
m4_ifvaln([$2],[$2],[
AC_MSG_ERROR([this package requires Python $1.
AC_MSG_ERROR([this package requires Python $1.
If you have it installed, but it isn't the default Python
interpreter in your system path, please pass the PYTHON_VERSION
variable to configure. See ``configure --help'' for reference.
])
PYTHON_VERSION=""
])
PYTHON_VERSION=""
fi
fi
@@ -224,7 +223,7 @@ EOD`
ac_python_version=$PYTHON_VERSION
else
ac_python_version=`$PYTHON -c "import sys; \
print (sys.version[[:3]])"`
print ('.'.join(sys.version.split('.')[[:2]]))"`
fi
fi
+2 -2
View File
@@ -66,7 +66,7 @@ deb-utils: deb-local rpm-utils-initramfs
## to do this, so we install a shim onto the path which calls the real
## dh_shlibdeps with the required arguments.
path_prepend=`mktemp -d /tmp/intercept.XXXXXX`; \
echo "#$(SHELL)" > $${path_prepend}/dh_shlibdeps; \
echo "#!$(SHELL)" > $${path_prepend}/dh_shlibdeps; \
echo "`which dh_shlibdeps` -- \
-xlibuutil3linux -xlibnvpair3linux -xlibzfs5linux -xlibzpool5linux" \
>> $${path_prepend}/dh_shlibdeps; \
@@ -74,7 +74,7 @@ deb-utils: deb-local rpm-utils-initramfs
## Debianized packages from the auto-generated dependencies of the new debs,
## which should NOT be mixed with the alien-generated debs created here
chmod +x $${path_prepend}/dh_shlibdeps; \
env PATH=$${path_prepend}:$${PATH} \
env "PATH=$${path_prepend}:$${PATH}" \
fakeroot $(ALIEN) --bump=0 --scripts --to-deb --target=$$debarch \
$$pkg1 $$pkg2 $$pkg3 $$pkg4 $$pkg5 $$pkg6 $$pkg7 \
$$pkg8 $$pkg9 $$pkg10 || exit 1; \
+46 -4
View File
@@ -165,6 +165,9 @@ dnl #
dnl # 5.15 API change,
dnl # Added the bool rcu argument to get_acl for rcu path walk.
dnl #
dnl # 6.2 API change,
dnl # get_acl() was renamed to get_inode_acl()
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_INODE_OPERATIONS_GET_ACL], [
ZFS_LINUX_TEST_SRC([inode_operations_get_acl], [
#include <linux/fs.h>
@@ -189,6 +192,18 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_INODE_OPERATIONS_GET_ACL], [
.get_acl = get_acl_fn,
};
],[])
ZFS_LINUX_TEST_SRC([inode_operations_get_inode_acl], [
#include <linux/fs.h>
struct posix_acl *get_inode_acl_fn(struct inode *inode, int type,
bool rcu) { return NULL; }
static const struct inode_operations
iops __attribute__ ((unused)) = {
.get_inode_acl = get_inode_acl_fn,
};
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_INODE_OPERATIONS_GET_ACL], [
@@ -201,7 +216,12 @@ AC_DEFUN([ZFS_AC_KERNEL_INODE_OPERATIONS_GET_ACL], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_GET_ACL_RCU, 1, [iops->get_acl() takes rcu])
],[
ZFS_LINUX_TEST_ERROR([iops->get_acl()])
ZFS_LINUX_TEST_RESULT([inode_operations_get_inode_acl], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_GET_INODE_ACL, 1, [has iops->get_inode_acl()])
],[
ZFS_LINUX_TEST_ERROR([iops->get_acl() or iops->get_inode_acl()])
])
])
])
])
@@ -213,7 +233,22 @@ dnl #
dnl # 5.12 API change,
dnl # set_acl() added a user_namespace* parameter first
dnl #
dnl # 6.2 API change,
dnl # set_acl() second paramter changed to a struct dentry *
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_INODE_OPERATIONS_SET_ACL], [
ZFS_LINUX_TEST_SRC([inode_operations_set_acl_userns_dentry], [
#include <linux/fs.h>
int set_acl_fn(struct user_namespace *userns,
struct dentry *dent, struct posix_acl *acl,
int type) { return 0; }
static const struct inode_operations
iops __attribute__ ((unused)) = {
.set_acl = set_acl_fn,
};
],[])
ZFS_LINUX_TEST_SRC([inode_operations_set_acl_userns], [
#include <linux/fs.h>
@@ -246,11 +281,18 @@ AC_DEFUN([ZFS_AC_KERNEL_INODE_OPERATIONS_SET_ACL], [
AC_DEFINE(HAVE_SET_ACL, 1, [iops->set_acl() exists])
AC_DEFINE(HAVE_SET_ACL_USERNS, 1, [iops->set_acl() takes 4 args])
],[
ZFS_LINUX_TEST_RESULT([inode_operations_set_acl], [
ZFS_LINUX_TEST_RESULT([inode_operations_set_acl_userns_dentry], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_SET_ACL, 1, [iops->set_acl() exists, takes 3 args])
AC_DEFINE(HAVE_SET_ACL, 1, [iops->set_acl() exists])
AC_DEFINE(HAVE_SET_ACL_USERNS_DENTRY_ARG2, 1,
[iops->set_acl() takes 4 args, arg2 is struct dentry *])
],[
AC_MSG_RESULT(no)
ZFS_LINUX_TEST_RESULT([inode_operations_set_acl], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_SET_ACL, 1, [iops->set_acl() exists, takes 3 args])
],[
ZFS_LINUX_REQUIRE_API([i_op->set_acl()], [3.14])
])
])
])
])
+3 -5
View File
@@ -3,16 +3,14 @@ dnl # 5.16 API change
dnl # add_disk grew a must-check return code
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_ADD_DISK], [
ZFS_LINUX_TEST_SRC([add_disk_ret], [
#include <linux/genhd.h>
#include <linux/blkdev.h>
], [
struct gendisk *disk = NULL;
int err = add_disk(disk);
err = err;
int error __attribute__ ((unused)) = add_disk(disk);
])
])
AC_DEFUN([ZFS_AC_KERNEL_ADD_DISK], [
AC_MSG_CHECKING([whether add_disk() returns int])
ZFS_LINUX_TEST_RESULT([add_disk_ret],
+38 -1
View File
@@ -464,7 +464,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLK_CGROUP_HEADER], [
])
AC_DEFUN([ZFS_AC_KERNEL_BLK_CGROUP_HEADER], [
AC_MSG_CHECKING([for existence of linux/blk-cgroup.h])
AC_MSG_CHECKING([whether linux/blk-cgroup.h exists])
ZFS_LINUX_TEST_RESULT([blk_cgroup_header],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_LINUX_BLK_CGROUP_HEADER, 1,
@@ -474,6 +474,41 @@ AC_DEFUN([ZFS_AC_KERNEL_BLK_CGROUP_HEADER], [
])
])
dnl #
dnl # Linux 5.18 API
dnl #
dnl # In 07888c665b405b1cd3577ddebfeb74f4717a84c4 ("block: pass a block_device and opf to bio_alloc")
dnl # bio_alloc(gfp_t gfp_mask, unsigned short nr_iovecs)
dnl # became
dnl # bio_alloc(struct block_device *bdev, unsigned short nr_vecs, unsigned int opf, gfp_t gfp_mask)
dnl # however
dnl # > NULL/0 can be passed, both for the
dnl # > passthrough case on a raw request_queue and to temporarily avoid
dnl # > refactoring some nasty code.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BIO_ALLOC_4ARG], [
ZFS_LINUX_TEST_SRC([bio_alloc_4arg], [
#include <linux/bio.h>
],[
gfp_t gfp_mask = 0;
unsigned short nr_iovecs = 0;
struct block_device *bdev = NULL;
unsigned int opf = 0;
struct bio *__attribute__((unused)) allocated = bio_alloc(bdev, nr_iovecs, opf, gfp_mask);
])
])
AC_DEFUN([ZFS_AC_KERNEL_BIO_ALLOC_4ARG], [
AC_MSG_CHECKING([whether bio_alloc() wants 4 args])
ZFS_LINUX_TEST_RESULT([bio_alloc_4arg],[
AC_MSG_RESULT(yes)
AC_DEFINE([HAVE_BIO_ALLOC_4ARG], 1, [bio_alloc() takes 4 arguments])
],[
AC_MSG_RESULT(no)
])
])
AC_DEFUN([ZFS_AC_KERNEL_SRC_BIO], [
ZFS_AC_KERNEL_SRC_REQ
ZFS_AC_KERNEL_SRC_BIO_OPS
@@ -488,6 +523,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BIO], [
ZFS_AC_KERNEL_SRC_BDEV_SUBMIT_BIO_RETURNS_VOID
ZFS_AC_KERNEL_SRC_BIO_SET_DEV_MACRO
ZFS_AC_KERNEL_SRC_BLK_CGROUP_HEADER
ZFS_AC_KERNEL_SRC_BIO_ALLOC_4ARG
])
AC_DEFUN([ZFS_AC_KERNEL_BIO], [
@@ -512,4 +548,5 @@ AC_DEFUN([ZFS_AC_KERNEL_BIO], [
ZFS_AC_KERNEL_BIO_BDEV_DISK
ZFS_AC_KERNEL_BDEV_SUBMIT_BIO_RETURNS_VOID
ZFS_AC_KERNEL_BLK_CGROUP_HEADER
ZFS_AC_KERNEL_BIO_ALLOC_4ARG
])
+74 -30
View File
@@ -74,6 +74,8 @@ AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_UPDATE_READAHEAD], [
AC_DEFINE(HAVE_BLK_QUEUE_UPDATE_READAHEAD, 1,
[blk_queue_update_readahead() exists])
],[
AC_MSG_RESULT(no)
AC_MSG_CHECKING([whether disk_update_readahead() exists])
ZFS_LINUX_TEST_RESULT([disk_update_readahead], [
AC_MSG_RESULT(yes)
@@ -86,69 +88,111 @@ AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_UPDATE_READAHEAD], [
])
dnl #
dnl # 2.6.32 API,
dnl # blk_queue_discard()
dnl # 5.19: bdev_max_discard_sectors() available
dnl # 2.6.32: blk_queue_discard() available
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLK_QUEUE_DISCARD], [
ZFS_LINUX_TEST_SRC([bdev_max_discard_sectors], [
#include <linux/blkdev.h>
],[
struct block_device *bdev __attribute__ ((unused)) = NULL;
unsigned int error __attribute__ ((unused));
error = bdev_max_discard_sectors(bdev);
])
ZFS_LINUX_TEST_SRC([blk_queue_discard], [
#include <linux/blkdev.h>
],[
struct request_queue *q __attribute__ ((unused)) = NULL;
struct request_queue r;
struct request_queue *q = &r;
int value __attribute__ ((unused));
memset(q, 0, sizeof(r));
value = blk_queue_discard(q);
])
])
AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_DISCARD], [
AC_MSG_CHECKING([whether blk_queue_discard() is available])
ZFS_LINUX_TEST_RESULT([blk_queue_discard], [
AC_MSG_CHECKING([whether bdev_max_discard_sectors() is available])
ZFS_LINUX_TEST_RESULT([bdev_max_discard_sectors], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BDEV_MAX_DISCARD_SECTORS, 1,
[bdev_max_discard_sectors() is available])
],[
ZFS_LINUX_TEST_ERROR([blk_queue_discard])
AC_MSG_RESULT(no)
AC_MSG_CHECKING([whether blk_queue_discard() is available])
ZFS_LINUX_TEST_RESULT([blk_queue_discard], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BLK_QUEUE_DISCARD, 1,
[blk_queue_discard() is available])
],[
ZFS_LINUX_TEST_ERROR([blk_queue_discard])
])
])
])
dnl #
dnl # 4.8 API,
dnl # blk_queue_secure_erase()
dnl #
dnl # 2.6.36 - 4.7 API,
dnl # blk_queue_secdiscard()
dnl # 5.19: bdev_max_secure_erase_sectors() available
dnl # 4.8: blk_queue_secure_erase() available
dnl # 2.6.36: blk_queue_secdiscard() available
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLK_QUEUE_SECURE_ERASE], [
ZFS_LINUX_TEST_SRC([bdev_max_secure_erase_sectors], [
#include <linux/blkdev.h>
],[
struct block_device *bdev __attribute__ ((unused)) = NULL;
unsigned int error __attribute__ ((unused));
error = bdev_max_secure_erase_sectors(bdev);
])
ZFS_LINUX_TEST_SRC([blk_queue_secure_erase], [
#include <linux/blkdev.h>
],[
struct request_queue *q __attribute__ ((unused)) = NULL;
struct request_queue r;
struct request_queue *q = &r;
int value __attribute__ ((unused));
memset(q, 0, sizeof(r));
value = blk_queue_secure_erase(q);
])
ZFS_LINUX_TEST_SRC([blk_queue_secdiscard], [
#include <linux/blkdev.h>
],[
struct request_queue *q __attribute__ ((unused)) = NULL;
struct request_queue r;
struct request_queue *q = &r;
int value __attribute__ ((unused));
memset(q, 0, sizeof(r));
value = blk_queue_secdiscard(q);
])
])
AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_SECURE_ERASE], [
AC_MSG_CHECKING([whether blk_queue_secure_erase() is available])
ZFS_LINUX_TEST_RESULT([blk_queue_secure_erase], [
AC_MSG_CHECKING([whether bdev_max_secure_erase_sectors() is available])
ZFS_LINUX_TEST_RESULT([bdev_max_secure_erase_sectors], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BLK_QUEUE_SECURE_ERASE, 1,
[blk_queue_secure_erase() is available])
AC_DEFINE(HAVE_BDEV_MAX_SECURE_ERASE_SECTORS, 1,
[bdev_max_secure_erase_sectors() is available])
],[
AC_MSG_RESULT(no)
AC_MSG_CHECKING([whether blk_queue_secdiscard() is available])
ZFS_LINUX_TEST_RESULT([blk_queue_secdiscard], [
AC_MSG_CHECKING([whether blk_queue_secure_erase() is available])
ZFS_LINUX_TEST_RESULT([blk_queue_secure_erase], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BLK_QUEUE_SECDISCARD, 1,
[blk_queue_secdiscard() is available])
AC_DEFINE(HAVE_BLK_QUEUE_SECURE_ERASE, 1,
[blk_queue_secure_erase() is available])
],[
ZFS_LINUX_TEST_ERROR([blk_queue_secure_erase])
AC_MSG_RESULT(no)
AC_MSG_CHECKING([whether blk_queue_secdiscard() is available])
ZFS_LINUX_TEST_RESULT([blk_queue_secdiscard], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BLK_QUEUE_SECDISCARD, 1,
[blk_queue_secdiscard() is available])
],[
ZFS_LINUX_TEST_ERROR([blk_queue_secure_erase])
])
])
])
])
@@ -215,17 +259,17 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLK_QUEUE_FLUSH], [
ZFS_LINUX_TEST_SRC([blk_queue_flush], [
#include <linux/blkdev.h>
], [
struct request_queue *q = NULL;
struct request_queue *q __attribute__ ((unused)) = NULL;
(void) blk_queue_flush(q, REQ_FLUSH);
], [$NO_UNUSED_BUT_SET_VARIABLE], [ZFS_META_LICENSE])
], [], [ZFS_META_LICENSE])
ZFS_LINUX_TEST_SRC([blk_queue_write_cache], [
#include <linux/kernel.h>
#include <linux/blkdev.h>
], [
struct request_queue *q = NULL;
struct request_queue *q __attribute__ ((unused)) = NULL;
blk_queue_write_cache(q, true, true);
], [$NO_UNUSED_BUT_SET_VARIABLE], [ZFS_META_LICENSE])
], [], [ZFS_META_LICENSE])
])
AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_FLUSH], [
@@ -278,9 +322,9 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLK_QUEUE_MAX_HW_SECTORS], [
ZFS_LINUX_TEST_SRC([blk_queue_max_hw_sectors], [
#include <linux/blkdev.h>
], [
struct request_queue *q = NULL;
struct request_queue *q __attribute__ ((unused)) = NULL;
(void) blk_queue_max_hw_sectors(q, BLK_SAFE_MAX_SECTORS);
], [$NO_UNUSED_BUT_SET_VARIABLE])
], [])
])
AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_MAX_HW_SECTORS], [
@@ -301,9 +345,9 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLK_QUEUE_MAX_SEGMENTS], [
ZFS_LINUX_TEST_SRC([blk_queue_max_segments], [
#include <linux/blkdev.h>
], [
struct request_queue *q = NULL;
struct request_queue *q __attribute__ ((unused)) = NULL;
(void) blk_queue_max_segments(q, BLK_MAX_SEGMENTS);
], [$NO_UNUSED_BUT_SET_VARIABLE])
], [])
])
AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_MAX_SEGMENTS], [
+81
View File
@@ -294,6 +294,83 @@ AC_DEFUN([ZFS_AC_KERNEL_BLKDEV_BDEV_WHOLE], [
])
])
dnl #
dnl # 5.20 API change,
dnl # Removed bdevname(), snprintf(.., %pg) should be used.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLKDEV_BDEVNAME], [
ZFS_LINUX_TEST_SRC([bdevname], [
#include <linux/fs.h>
#include <linux/blkdev.h>
], [
struct block_device *bdev __attribute__ ((unused)) = NULL;
char path[BDEVNAME_SIZE];
(void) bdevname(bdev, path);
])
])
AC_DEFUN([ZFS_AC_KERNEL_BLKDEV_BDEVNAME], [
AC_MSG_CHECKING([whether bdevname() exists])
ZFS_LINUX_TEST_RESULT([bdevname], [
AC_DEFINE(HAVE_BDEVNAME, 1, [bdevname() is available])
AC_MSG_RESULT(yes)
], [
AC_MSG_RESULT(no)
])
])
dnl #
dnl # 5.19 API: blkdev_issue_secure_erase()
dnl # 3.10 API: blkdev_issue_discard(..., BLKDEV_DISCARD_SECURE)
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLKDEV_ISSUE_SECURE_ERASE], [
ZFS_LINUX_TEST_SRC([blkdev_issue_secure_erase], [
#include <linux/blkdev.h>
],[
struct block_device *bdev = NULL;
sector_t sector = 0;
sector_t nr_sects = 0;
int error __attribute__ ((unused));
error = blkdev_issue_secure_erase(bdev,
sector, nr_sects, GFP_KERNEL);
])
ZFS_LINUX_TEST_SRC([blkdev_issue_discard_flags], [
#include <linux/blkdev.h>
],[
struct block_device *bdev = NULL;
sector_t sector = 0;
sector_t nr_sects = 0;
unsigned long flags = 0;
int error __attribute__ ((unused));
error = blkdev_issue_discard(bdev,
sector, nr_sects, GFP_KERNEL, flags);
])
])
AC_DEFUN([ZFS_AC_KERNEL_BLKDEV_ISSUE_SECURE_ERASE], [
AC_MSG_CHECKING([whether blkdev_issue_secure_erase() is available])
ZFS_LINUX_TEST_RESULT([blkdev_issue_secure_erase], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BLKDEV_ISSUE_SECURE_ERASE, 1,
[blkdev_issue_secure_erase() is available])
],[
AC_MSG_RESULT(no)
AC_MSG_CHECKING([whether blkdev_issue_discard() is available])
ZFS_LINUX_TEST_RESULT([blkdev_issue_discard_flags], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BLKDEV_ISSUE_DISCARD, 1,
[blkdev_issue_discard() is available])
],[
ZFS_LINUX_TEST_ERROR([blkdev_issue_discard()])
])
])
])
dnl #
dnl # 5.13 API change
dnl # blkdev_get_by_path() no longer handles ERESTARTSYS
@@ -326,6 +403,8 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLKDEV], [
ZFS_AC_KERNEL_SRC_BLKDEV_CHECK_DISK_CHANGE
ZFS_AC_KERNEL_SRC_BLKDEV_BDEV_CHECK_MEDIA_CHANGE
ZFS_AC_KERNEL_SRC_BLKDEV_BDEV_WHOLE
ZFS_AC_KERNEL_SRC_BLKDEV_BDEVNAME
ZFS_AC_KERNEL_SRC_BLKDEV_ISSUE_SECURE_ERASE
])
AC_DEFUN([ZFS_AC_KERNEL_BLKDEV], [
@@ -339,5 +418,7 @@ AC_DEFUN([ZFS_AC_KERNEL_BLKDEV], [
ZFS_AC_KERNEL_BLKDEV_CHECK_DISK_CHANGE
ZFS_AC_KERNEL_BLKDEV_BDEV_CHECK_MEDIA_CHANGE
ZFS_AC_KERNEL_BLKDEV_BDEV_WHOLE
ZFS_AC_KERNEL_BLKDEV_BDEVNAME
ZFS_AC_KERNEL_BLKDEV_GET_ERESTARTSYS
ZFS_AC_KERNEL_BLKDEV_ISSUE_SECURE_ERASE
])
+12 -5
View File
@@ -6,13 +6,16 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS_CHECK_EVENTS], [
#include <linux/blkdev.h>
unsigned int blk_check_events(struct gendisk *disk,
unsigned int clearing) { return (0); }
unsigned int clearing) {
(void) disk, (void) clearing;
return (0);
}
static const struct block_device_operations
bops __attribute__ ((unused)) = {
.check_events = blk_check_events,
};
], [], [$NO_UNUSED_BUT_SET_VARIABLE])
], [], [])
])
AC_DEFUN([ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_CHECK_EVENTS], [
@@ -31,7 +34,10 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS_RELEASE_VOID], [
ZFS_LINUX_TEST_SRC([block_device_operations_release_void], [
#include <linux/blkdev.h>
void blk_release(struct gendisk *g, fmode_t mode) { return; }
void blk_release(struct gendisk *g, fmode_t mode) {
(void) g, (void) mode;
return;
}
static const struct block_device_operations
bops __attribute__ ((unused)) = {
@@ -40,7 +46,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS_RELEASE_VOID], [
.ioctl = NULL,
.compat_ioctl = NULL,
};
], [], [$NO_UNUSED_BUT_SET_VARIABLE])
], [], [])
])
AC_DEFUN([ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_RELEASE_VOID], [
@@ -61,6 +67,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS_REVALIDATE_DISK], [
#include <linux/blkdev.h>
int blk_revalidate_disk(struct gendisk *disk) {
(void) disk;
return(0);
}
@@ -68,7 +75,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS_REVALIDATE_DISK], [
bops __attribute__ ((unused)) = {
.revalidate_disk = blk_revalidate_disk,
};
], [], [$NO_UNUSED_BUT_SET_VARIABLE])
], [], [])
])
AC_DEFUN([ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_REVALIDATE_DISK], [
+86 -2
View File
@@ -19,19 +19,48 @@ AC_DEFUN([ZFS_AC_KERNEL_CONFIG_DEFINED], [
])
])
ZFS_AC_KERNEL_SRC_CONFIG_MODULES
ZFS_AC_KERNEL_SRC_CONFIG_BLOCK
ZFS_AC_KERNEL_SRC_CONFIG_DEBUG_LOCK_ALLOC
ZFS_AC_KERNEL_SRC_CONFIG_TRIM_UNUSED_KSYMS
ZFS_AC_KERNEL_SRC_CONFIG_ZLIB_INFLATE
ZFS_AC_KERNEL_SRC_CONFIG_ZLIB_DEFLATE
ZFS_AC_KERNEL_SRC_CONFIG_ZLIB_INFLATE
AC_MSG_CHECKING([for kernel config option compatibility])
ZFS_LINUX_TEST_COMPILE_ALL([config])
AC_MSG_RESULT([done])
ZFS_AC_KERNEL_CONFIG_MODULES
ZFS_AC_KERNEL_CONFIG_BLOCK
ZFS_AC_KERNEL_CONFIG_DEBUG_LOCK_ALLOC
ZFS_AC_KERNEL_CONFIG_TRIM_UNUSED_KSYMS
ZFS_AC_KERNEL_CONFIG_ZLIB_INFLATE
ZFS_AC_KERNEL_CONFIG_ZLIB_DEFLATE
ZFS_AC_KERNEL_CONFIG_ZLIB_INFLATE
])
dnl #
dnl # Check CONFIG_BLOCK
dnl #
dnl # Verify the kernel has CONFIG_BLOCK support enabled.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_CONFIG_BLOCK], [
ZFS_LINUX_TEST_SRC([config_block], [
#if !defined(CONFIG_BLOCK)
#error CONFIG_BLOCK not defined
#endif
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_CONFIG_BLOCK], [
AC_MSG_CHECKING([whether CONFIG_BLOCK is defined])
ZFS_LINUX_TEST_RESULT([config_block], [
AC_MSG_RESULT([yes])
],[
AC_MSG_RESULT([no])
AC_MSG_ERROR([
*** This kernel does not include the required block device support.
*** Rebuild the kernel with CONFIG_BLOCK=y set.])
])
])
dnl #
@@ -72,6 +101,61 @@ AC_DEFUN([ZFS_AC_KERNEL_CONFIG_DEBUG_LOCK_ALLOC], [
])
])
dnl #
dnl # Check CONFIG_MODULES
dnl #
dnl # Verify the kernel has CONFIG_MODULES support enabled.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_CONFIG_MODULES], [
ZFS_LINUX_TEST_SRC([config_modules], [
#if !defined(CONFIG_MODULES)
#error CONFIG_MODULES not defined
#endif
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_CONFIG_MODULES], [
AC_MSG_CHECKING([whether CONFIG_MODULES is defined])
AS_IF([test "x$enable_linux_builtin" != xyes], [
ZFS_LINUX_TEST_RESULT([config_modules], [
AC_MSG_RESULT([yes])
],[
AC_MSG_RESULT([no])
AC_MSG_ERROR([
*** This kernel does not include the required loadable module
*** support!
***
*** To build OpenZFS as a loadable Linux kernel module
*** enable loadable module support by setting
*** `CONFIG_MODULES=y` in the kernel configuration and run
*** `make modules_prepare` in the Linux source tree.
***
*** If you don't intend to enable loadable kernel module
*** support, please compile OpenZFS as a Linux kernel built-in.
***
*** Prepare the Linux source tree by running `make prepare`,
*** use the OpenZFS `--enable-linux-builtin` configure option,
*** copy the OpenZFS sources into the Linux source tree using
*** `./copy-builtin <linux source directory>`,
*** set `CONFIG_ZFS=y` in the kernel configuration and compile
*** kernel as usual.
])
])
], [
ZFS_LINUX_TRY_COMPILE([], [], [
AC_MSG_RESULT([not needed])
],[
AC_MSG_RESULT([error])
AC_MSG_ERROR([
*** This kernel is unable to compile object files.
***
*** Please make sure you prepared the Linux source tree
*** by running `make prepare` there.
])
])
])
])
dnl #
dnl # Check CONFIG_TRIM_UNUSED_KSYMS
dnl #
+29
View File
@@ -0,0 +1,29 @@
dnl #
dnl # On certain architectures `__copy_from_user_inatomic`
dnl # is a GPL exported variable and cannot be used by OpenZFS.
dnl #
dnl #
dnl # Checking if `__copy_from_user_inatomic` is available.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC___COPY_FROM_USER_INATOMIC], [
ZFS_LINUX_TEST_SRC([__copy_from_user_inatomic], [
#include <linux/uaccess.h>
], [
int result __attribute__ ((unused)) = __copy_from_user_inatomic(NULL, NULL, 0);
], [], [ZFS_META_LICENSE])
])
AC_DEFUN([ZFS_AC_KERNEL___COPY_FROM_USER_INATOMIC], [
AC_MSG_CHECKING([whether __copy_from_user_inatomic is available])
ZFS_LINUX_TEST_RESULT([__copy_from_user_inatomic_license], [
AC_MSG_RESULT(yes)
], [
AC_MSG_RESULT(no)
AC_MSG_ERROR([
*** The `__copy_from_user_inatomic()` Linux kernel function is
*** incompatible with the CDDL license and will prevent the module
*** linking stage from succeeding. OpenZFS cannot be compiled.
])
])
])
+30
View File
@@ -0,0 +1,30 @@
dnl #
dnl # 3.18 API change
dnl # Dentry aliases are in d_u struct dentry member
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_DENTRY_ALIAS_D_U], [
ZFS_LINUX_TEST_SRC([dentry_alias_d_u], [
#include <linux/fs.h>
#include <linux/dcache.h>
#include <linux/list.h>
], [
struct inode *inode __attribute__ ((unused)) = NULL;
struct dentry *dentry __attribute__ ((unused)) = NULL;
hlist_for_each_entry(dentry, &inode->i_dentry,
d_u.d_alias) {
d_drop(dentry);
}
])
])
AC_DEFUN([ZFS_AC_KERNEL_DENTRY_ALIAS_D_U], [
AC_MSG_CHECKING([whether dentry aliases are in d_u member])
ZFS_LINUX_TEST_RESULT([dentry_alias_d_u], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_DENTRY_D_U_ALIASES, 1,
[dentry aliases are in d_u member])
],[
AC_MSG_RESULT(no)
])
])
+24 -7
View File
@@ -2,6 +2,9 @@ dnl #
dnl # Handle differences in kernel FPU code.
dnl #
dnl # Kernel
dnl # 5.19: The asm/fpu/internal.h header was removed, it has been
dnl # effectively empty since the 5.16 kernel.
dnl #
dnl # 5.16: XCR code put into asm/fpu/xcr.h
dnl # HAVE_KERNEL_FPU_XCR_HEADER
dnl #
@@ -33,21 +36,31 @@ AC_DEFUN([ZFS_AC_KERNEL_FPU_HEADER], [
],[
AC_DEFINE(HAVE_KERNEL_FPU_API_HEADER, 1,
[kernel has asm/fpu/api.h])
AC_MSG_RESULT(asm/fpu/api.h)
AC_MSG_CHECKING([whether fpu/xcr header is available])
fpu_headers="asm/fpu/api.h"
ZFS_LINUX_TRY_COMPILE([
#include <linux/module.h>
#include <asm/fpu/xcr.h>
],[
],[
AC_DEFINE(HAVE_KERNEL_FPU_XCR_HEADER, 1,
[kernel has asm/fpu/xcr.h])
AC_MSG_RESULT(asm/fpu/xcr.h)
],[
AC_MSG_RESULT(no asm/fpu/xcr.h)
[kernel has asm/fpu/xcr.h])
fpu_headers="$fpu_headers asm/fpu/xcr.h"
])
ZFS_LINUX_TRY_COMPILE([
#include <linux/module.h>
#include <asm/fpu/internal.h>
],[
],[
AC_DEFINE(HAVE_KERNEL_FPU_INTERNAL_HEADER, 1,
[kernel has asm/fpu/internal.h])
fpu_headers="$fpu_headers asm/fpu/internal.h"
])
AC_MSG_RESULT([$fpu_headers])
],[
AC_MSG_RESULT(i387.h & xcr.h)
AC_MSG_RESULT([i387.h & xcr.h])
])
])
@@ -93,7 +106,9 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_FPU], [
#include <linux/types.h>
#ifdef HAVE_KERNEL_FPU_API_HEADER
#include <asm/fpu/api.h>
#ifdef HAVE_KERNEL_FPU_INTERNAL_HEADER
#include <asm/fpu/internal.h>
#endif
#else
#include <asm/i387.h>
#include <asm/xcr.h>
@@ -130,7 +145,9 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_FPU], [
#include <linux/types.h>
#ifdef HAVE_KERNEL_FPU_API_HEADER
#include <asm/fpu/api.h>
#ifdef HAVE_KERNEL_FPU_INTERNAL_HEADER
#include <asm/fpu/internal.h>
#endif
#else
#include <asm/i387.h>
#include <asm/xcr.h>
+55 -28
View File
@@ -2,6 +2,19 @@ dnl #
dnl # Check for generic io accounting interface.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_GENERIC_IO_ACCT], [
ZFS_LINUX_TEST_SRC([bdev_io_acct], [
#include <linux/blkdev.h>
], [
struct block_device *bdev = NULL;
struct bio *bio = NULL;
unsigned long passed_time = 0;
unsigned long start_time;
start_time = bdev_start_io_acct(bdev, bio_sectors(bio),
bio_op(bio), passed_time);
bdev_end_io_acct(bdev, bio_op(bio), start_time);
])
ZFS_LINUX_TEST_SRC([disk_io_acct], [
#include <linux/blkdev.h>
], [
@@ -50,61 +63,75 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_GENERIC_IO_ACCT], [
AC_DEFUN([ZFS_AC_KERNEL_GENERIC_IO_ACCT], [
dnl #
dnl # 5.12 API,
dnl # 5.19 API,
dnl #
dnl # bio_start_io_acct() and bio_end_io_acct() became GPL-exported
dnl # so use disk_start_io_acct() and disk_end_io_acct() instead
dnl # disk_start_io_acct() and disk_end_io_acct() have been replaced by
dnl # bdev_start_io_acct() and bdev_end_io_acct().
dnl #
AC_MSG_CHECKING([whether generic disk_*_io_acct() are available])
ZFS_LINUX_TEST_RESULT([disk_io_acct], [
AC_MSG_CHECKING([whether generic bdev_*_io_acct() are available])
ZFS_LINUX_TEST_RESULT([bdev_io_acct], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_DISK_IO_ACCT, 1, [disk_*_io_acct() available])
AC_DEFINE(HAVE_BDEV_IO_ACCT, 1, [bdev_*_io_acct() available])
], [
AC_MSG_RESULT(no)
dnl #
dnl # 5.7 API,
dnl # 5.12 API,
dnl #
dnl # Added bio_start_io_acct() and bio_end_io_acct() helpers.
dnl # bio_start_io_acct() and bio_end_io_acct() became GPL-exported
dnl # so use disk_start_io_acct() and disk_end_io_acct() instead
dnl #
AC_MSG_CHECKING([whether generic bio_*_io_acct() are available])
ZFS_LINUX_TEST_RESULT([bio_io_acct], [
AC_MSG_CHECKING([whether generic disk_*_io_acct() are available])
ZFS_LINUX_TEST_RESULT([disk_io_acct], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BIO_IO_ACCT, 1, [bio_*_io_acct() available])
AC_DEFINE(HAVE_DISK_IO_ACCT, 1, [disk_*_io_acct() available])
], [
AC_MSG_RESULT(no)
dnl #
dnl # 4.14 API,
dnl # 5.7 API,
dnl #
dnl # generic_start_io_acct/generic_end_io_acct now require
dnl # request_queue to be provided. No functional changes,
dnl # but preparation for inflight accounting.
dnl # Added bio_start_io_acct() and bio_end_io_acct() helpers.
dnl #
AC_MSG_CHECKING([whether generic_*_io_acct wants 4 args])
ZFS_LINUX_TEST_RESULT_SYMBOL([generic_acct_4args],
[generic_start_io_acct], [block/bio.c], [
AC_MSG_CHECKING([whether generic bio_*_io_acct() are available])
ZFS_LINUX_TEST_RESULT([bio_io_acct], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_GENERIC_IO_ACCT_4ARG, 1,
[generic_*_io_acct() 4 arg available])
AC_DEFINE(HAVE_BIO_IO_ACCT, 1, [bio_*_io_acct() available])
], [
AC_MSG_RESULT(no)
dnl #
dnl # 3.19 API addition
dnl # 4.14 API,
dnl #
dnl # torvalds/linux@394ffa50 allows us to increment
dnl # iostat counters without generic_make_request().
dnl # generic_start_io_acct/generic_end_io_acct now require
dnl # request_queue to be provided. No functional changes,
dnl # but preparation for inflight accounting.
dnl #
AC_MSG_CHECKING(
[whether generic_*_io_acct wants 3 args])
ZFS_LINUX_TEST_RESULT_SYMBOL([generic_acct_3args],
AC_MSG_CHECKING([whether generic_*_io_acct wants 4 args])
ZFS_LINUX_TEST_RESULT_SYMBOL([generic_acct_4args],
[generic_start_io_acct], [block/bio.c], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_GENERIC_IO_ACCT_3ARG, 1,
[generic_*_io_acct() 3 arg available])
AC_DEFINE(HAVE_GENERIC_IO_ACCT_4ARG, 1,
[generic_*_io_acct() 4 arg available])
], [
AC_MSG_RESULT(no)
dnl #
dnl # 3.19 API addition
dnl #
dnl # torvalds/linux@394ffa50 allows us to increment
dnl # iostat counters without generic_make_request().
dnl #
AC_MSG_CHECKING(
[whether generic_*_io_acct wants 3 args])
ZFS_LINUX_TEST_RESULT_SYMBOL([generic_acct_3args],
[generic_start_io_acct], [block/bio.c], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_GENERIC_IO_ACCT_3ARG, 1,
[generic_*_io_acct() 3 arg available])
], [
AC_MSG_RESULT(no)
])
])
])
])
+58
View File
@@ -0,0 +1,58 @@
dnl #
dnl # 5.17 API change,
dnl #
dnl # GENHD_FL_EXT_DEVT flag removed
dnl # GENHD_FL_NO_PART_SCAN renamed GENHD_FL_NO_PART
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_GENHD_FLAGS], [
ZFS_LINUX_TEST_SRC([genhd_fl_ext_devt], [
#include <linux/blkdev.h>
], [
int flags __attribute__ ((unused)) = GENHD_FL_EXT_DEVT;
])
ZFS_LINUX_TEST_SRC([genhd_fl_no_part], [
#include <linux/blkdev.h>
], [
int flags __attribute__ ((unused)) = GENHD_FL_NO_PART;
])
ZFS_LINUX_TEST_SRC([genhd_fl_no_part_scan], [
#include <linux/blkdev.h>
], [
int flags __attribute__ ((unused)) = GENHD_FL_NO_PART_SCAN;
])
])
AC_DEFUN([ZFS_AC_KERNEL_GENHD_FLAGS], [
AC_MSG_CHECKING([whether GENHD_FL_EXT_DEVT flag is available])
ZFS_LINUX_TEST_RESULT([genhd_fl_ext_devt], [
AC_MSG_RESULT(yes)
AC_DEFINE(ZFS_GENHD_FL_EXT_DEVT, GENHD_FL_EXT_DEVT,
[GENHD_FL_EXT_DEVT flag is available])
], [
AC_MSG_RESULT(no)
AC_DEFINE(ZFS_GENHD_FL_EXT_DEVT, 0,
[GENHD_FL_EXT_DEVT flag is not available])
])
AC_MSG_CHECKING([whether GENHD_FL_NO_PART flag is available])
ZFS_LINUX_TEST_RESULT([genhd_fl_no_part], [
AC_MSG_RESULT(yes)
AC_DEFINE(ZFS_GENHD_FL_NO_PART, GENHD_FL_NO_PART,
[GENHD_FL_NO_PART flag is available])
], [
AC_MSG_RESULT(no)
AC_MSG_CHECKING([whether GENHD_FL_NO_PART_SCAN flag is available])
ZFS_LINUX_TEST_RESULT([genhd_fl_no_part_scan], [
AC_MSG_RESULT(yes)
AC_DEFINE(ZFS_GENHD_FL_NO_PART, GENHD_FL_NO_PART_SCAN,
[GENHD_FL_NO_PART_SCAN flag is available])
], [
ZFS_LINUX_TEST_ERROR([GENHD_FL_NO_PART|GENHD_FL_NO_PART_SCAN])
])
])
])
+2 -2
View File
@@ -5,9 +5,9 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_GET_DISK_RO], [
ZFS_LINUX_TEST_SRC([get_disk_ro], [
#include <linux/blkdev.h>
],[
struct gendisk *disk = NULL;
struct gendisk *disk __attribute__ ((unused)) = NULL;
(void) get_disk_ro(disk);
], [$NO_UNUSED_BUT_SET_VARIABLE])
], [])
])
AC_DEFUN([ZFS_AC_KERNEL_GET_DISK_RO], [
+1 -1
View File
@@ -55,7 +55,7 @@ dnl #
AC_DEFUN([ZFS_AC_KERNEL_ENUM_MEMBER], [
AC_MSG_CHECKING([whether enum $2 contains $1])
AS_IF([AC_TRY_COMMAND(
"${srcdir}/scripts/enum-extract.pl" "$2" "$3" | egrep -qx $1)],[
"${srcdir}/scripts/enum-extract.pl" "$2" "$3" | grep -Eqx $1)],[
AC_MSG_RESULT([yes])
AC_DEFINE(m4_join([_], [ZFS_ENUM], m4_toupper($2), $1), 1,
[enum $2 contains $1])
+2 -2
View File
@@ -6,8 +6,8 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_GROUP_INFO_GID], [
ZFS_LINUX_TEST_SRC([group_info_gid], [
#include <linux/cred.h>
],[
struct group_info *gi = groups_alloc(1);
gi->gid[0] = KGIDT_INIT(0);
struct group_info gi __attribute__ ((unused)) = {};
gi.gid[0] = KGIDT_INIT(0);
])
])
+20
View File
@@ -49,6 +49,13 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_MAKE_REQUEST_FN], [
struct gendisk *disk __attribute__ ((unused));
disk = blk_alloc_disk(NUMA_NO_NODE);
])
ZFS_LINUX_TEST_SRC([blk_cleanup_disk], [
#include <linux/blkdev.h>
],[
struct gendisk *disk __attribute__ ((unused));
blk_cleanup_disk(disk);
])
])
AC_DEFUN([ZFS_AC_KERNEL_MAKE_REQUEST_FN], [
@@ -73,6 +80,19 @@ AC_DEFUN([ZFS_AC_KERNEL_MAKE_REQUEST_FN], [
ZFS_LINUX_TEST_RESULT([blk_alloc_disk], [
AC_MSG_RESULT(yes)
AC_DEFINE([HAVE_BLK_ALLOC_DISK], 1, [blk_alloc_disk() exists])
dnl #
dnl # 5.20 API change,
dnl # Removed blk_cleanup_disk(), put_disk() should be used.
dnl #
AC_MSG_CHECKING([whether blk_cleanup_disk() exists])
ZFS_LINUX_TEST_RESULT([blk_cleanup_disk], [
AC_MSG_RESULT(yes)
AC_DEFINE([HAVE_BLK_CLEANUP_DISK], 1,
[blk_cleanup_disk() exists])
], [
AC_MSG_RESULT(no)
])
], [
AC_MSG_RESULT(no)
])
+2
View File
@@ -53,6 +53,8 @@ AC_DEFUN([ZFS_AC_KERNEL_MKDIR], [
AC_DEFINE(HAVE_IOPS_MKDIR_USERNS, 1,
[iops->mkdir() takes struct user_namespace*])
],[
AC_MSG_RESULT(no)
AC_MSG_CHECKING([whether iops->mkdir() takes umode_t])
ZFS_LINUX_TEST_RESULT([inode_operations_mkdir], [
AC_MSG_RESULT(yes)
-33
View File
@@ -1,33 +0,0 @@
dnl #
dnl # Grsecurity kernel API change
dnl # constified parameters of module_param_call() methods
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_MODULE_PARAM_CALL_CONST], [
ZFS_LINUX_TEST_SRC([module_param_call], [
#include <linux/module.h>
#include <linux/moduleparam.h>
int param_get(char *b, const struct kernel_param *kp)
{
return (0);
}
int param_set(const char *b, const struct kernel_param *kp)
{
return (0);
}
module_param_call(p, param_set, param_get, NULL, 0644);
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_MODULE_PARAM_CALL_CONST], [
AC_MSG_CHECKING([whether module_param_call() is hardened])
ZFS_LINUX_TEST_RESULT([module_param_call], [
AC_MSG_RESULT(yes)
AC_DEFINE(MODULE_PARAM_CALL_CONST, 1,
[hardened module_param_call])
],[
AC_MSG_RESULT(no)
])
])
+1 -1
View File
@@ -15,7 +15,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_PAGEMAP_FOLIO_WAIT_BIT], [
])
AC_DEFUN([ZFS_AC_KERNEL_PAGEMAP_FOLIO_WAIT_BIT], [
AC_MSG_CHECKING([folio_wait_bit() exists])
AC_MSG_CHECKING([whether folio_wait_bit() exists])
ZFS_LINUX_TEST_RESULT([pagemap_has_folio_wait_bit], [
AC_MSG_RESULT([yes])
AC_DEFINE(HAVE_PAGEMAP_FOLIO_WAIT_BIT, 1,
+25
View File
@@ -0,0 +1,25 @@
dnl #
dnl # Linux 5.18 removes address_space_operations ->readpages in favour of
dnl # ->readahead
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_VFS_READPAGES], [
ZFS_LINUX_TEST_SRC([vfs_has_readpages], [
#include <linux/fs.h>
static const struct address_space_operations
aops __attribute__ ((unused)) = {
.readpages = NULL,
};
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_VFS_READPAGES], [
AC_MSG_CHECKING([whether aops->readpages exists])
ZFS_LINUX_TEST_RESULT([vfs_has_readpages], [
AC_MSG_RESULT([yes])
AC_DEFINE(HAVE_VFS_READPAGES, 1,
[address_space_operations->readpages exists])
],[
AC_MSG_RESULT([no])
])
])
+2 -2
View File
@@ -8,14 +8,14 @@ dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_REVALIDATE_DISK], [
ZFS_LINUX_TEST_SRC([revalidate_disk_size], [
#include <linux/genhd.h>
#include <linux/blkdev.h>
], [
struct gendisk *disk = NULL;
(void) revalidate_disk_size(disk, false);
])
ZFS_LINUX_TEST_SRC([revalidate_disk], [
#include <linux/genhd.h>
#include <linux/blkdev.h>
], [
struct gendisk *disk = NULL;
(void) revalidate_disk(disk);
+52 -15
View File
@@ -54,6 +54,21 @@ AC_DEFUN([ZFS_AC_KERNEL_SHRINK_CONTROL_HAS_NID], [
])
])
AC_DEFUN([ZFS_AC_KERNEL_SRC_REGISTER_SHRINKER_VARARG], [
ZFS_LINUX_TEST_SRC([register_shrinker_vararg], [
#include <linux/mm.h>
unsigned long shrinker_cb(struct shrinker *shrink,
struct shrink_control *sc) { return 0; }
],[
struct shrinker cache_shrinker = {
.count_objects = shrinker_cb,
.scan_objects = shrinker_cb,
.seeks = DEFAULT_SEEKS,
};
register_shrinker(&cache_shrinker, "vararg-reg-shrink-test");
])
])
AC_DEFUN([ZFS_AC_KERNEL_SRC_SHRINKER_CALLBACK], [
ZFS_LINUX_TEST_SRC([shrinker_cb_shrink_control], [
#include <linux/mm.h>
@@ -83,29 +98,50 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_SHRINKER_CALLBACK], [
AC_DEFUN([ZFS_AC_KERNEL_SHRINKER_CALLBACK],[
dnl #
dnl # 3.0 - 3.11 API change
dnl # ->shrink(struct shrinker *, struct shrink_control *sc)
dnl # 6.0 API change
dnl # register_shrinker() becomes a var-arg function that takes
dnl # a printf-style format string as args > 0
dnl #
AC_MSG_CHECKING([whether new 2-argument shrinker exists])
ZFS_LINUX_TEST_RESULT([shrinker_cb_shrink_control], [
AC_MSG_CHECKING([whether new var-arg register_shrinker() exists])
ZFS_LINUX_TEST_RESULT([register_shrinker_vararg], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_SINGLE_SHRINKER_CALLBACK, 1,
[new shrinker callback wants 2 args])
AC_DEFINE(HAVE_REGISTER_SHRINKER_VARARG, 1,
[register_shrinker is vararg])
dnl # We assume that the split shrinker callback exists if the
dnl # vararg register_shrinker() exists, because the latter is
dnl # a much more recent addition, and the macro test for the
dnl # var-arg version only works if the callback is split
AC_DEFINE(HAVE_SPLIT_SHRINKER_CALLBACK, 1,
[cs->count_objects exists])
],[
AC_MSG_RESULT(no)
dnl #
dnl # 3.12 API change,
dnl # ->shrink() is logically split in to
dnl # ->count_objects() and ->scan_objects()
dnl # 3.0 - 3.11 API change
dnl # cs->shrink(struct shrinker *, struct shrink_control *sc)
dnl #
AC_MSG_CHECKING([whether ->count_objects callback exists])
ZFS_LINUX_TEST_RESULT([shrinker_cb_shrink_control_split], [
AC_MSG_CHECKING([whether new 2-argument shrinker exists])
ZFS_LINUX_TEST_RESULT([shrinker_cb_shrink_control], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_SPLIT_SHRINKER_CALLBACK, 1,
[->count_objects exists])
AC_DEFINE(HAVE_SINGLE_SHRINKER_CALLBACK, 1,
[new shrinker callback wants 2 args])
],[
ZFS_LINUX_TEST_ERROR([shrinker])
AC_MSG_RESULT(no)
dnl #
dnl # 3.12 API change,
dnl # cs->shrink() is logically split in to
dnl # cs->count_objects() and cs->scan_objects()
dnl #
AC_MSG_CHECKING([if cs->count_objects callback exists])
ZFS_LINUX_TEST_RESULT(
[shrinker_cb_shrink_control_split],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_SPLIT_SHRINKER_CALLBACK, 1,
[cs->count_objects exists])
],[
ZFS_LINUX_TEST_ERROR([shrinker])
])
])
])
])
@@ -141,6 +177,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_SHRINKER], [
ZFS_AC_KERNEL_SRC_SHRINK_CONTROL_HAS_NID
ZFS_AC_KERNEL_SRC_SHRINKER_CALLBACK
ZFS_AC_KERNEL_SRC_SHRINK_CONTROL_STRUCT
ZFS_AC_KERNEL_SRC_REGISTER_SHRINKER_VARARG
])
AC_DEFUN([ZFS_AC_KERNEL_SHRINKER], [
+37
View File
@@ -0,0 +1,37 @@
dnl #
dnl # Linux 5.2/5.18 API
dnl #
dnl # In cdb4f26a63c391317e335e6e683a614358e70aeb ("kobject: kobj_type: remove default_attrs")
dnl # struct kobj_type.default_attrs
dnl # was finally removed in favour of
dnl # struct kobj_type.default_groups
dnl #
dnl # This was added in aa30f47cf666111f6bbfd15f290a27e8a7b9d854 ("kobject: Add support for default attribute groups to kobj_type"),
dnl # if both are present (5.2-5.17), we prefer default_groups; they're otherwise equivalent
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_SYSFS_DEFAULT_GROUPS], [
ZFS_LINUX_TEST_SRC([sysfs_default_groups], [
#include <linux/kobject.h>
],[
struct kobj_type __attribute__ ((unused)) kt = {
.default_groups = (const struct attribute_group **)NULL };
])
])
AC_DEFUN([ZFS_AC_KERNEL_SYSFS_DEFAULT_GROUPS], [
AC_MSG_CHECKING([whether struct kobj_type.default_groups exists])
ZFS_LINUX_TEST_RESULT([sysfs_default_groups],[
AC_MSG_RESULT(yes)
AC_DEFINE([HAVE_SYSFS_DEFAULT_GROUPS], 1, [struct kobj_type has default_groups])
],[
AC_MSG_RESULT(no)
])
])
AC_DEFUN([ZFS_AC_KERNEL_SRC_SYSFS], [
ZFS_AC_KERNEL_SRC_SYSFS_DEFAULT_GROUPS
])
AC_DEFUN([ZFS_AC_KERNEL_SYSFS], [
ZFS_AC_KERNEL_SYSFS_DEFAULT_GROUPS
])
+27 -5
View File
@@ -3,11 +3,25 @@ dnl # 3.11 API change
dnl # Add support for i_op->tmpfile
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_TMPFILE], [
dnl #
dnl # 6.1 API change
dnl # use struct file instead of struct dentry
dnl #
ZFS_LINUX_TEST_SRC([inode_operations_tmpfile], [
#include <linux/fs.h>
int tmpfile(struct user_namespace *userns,
struct inode *inode, struct file *file,
umode_t mode) { return 0; }
static struct inode_operations
iops __attribute__ ((unused)) = {
.tmpfile = tmpfile,
};
],[])
dnl #
dnl # 5.11 API change
dnl # add support for userns parameter to tmpfile
dnl #
ZFS_LINUX_TEST_SRC([inode_operations_tmpfile_userns], [
ZFS_LINUX_TEST_SRC([inode_operations_tmpfile_dentry_userns], [
#include <linux/fs.h>
int tmpfile(struct user_namespace *userns,
struct inode *inode, struct dentry *dentry,
@@ -17,7 +31,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_TMPFILE], [
.tmpfile = tmpfile,
};
],[])
ZFS_LINUX_TEST_SRC([inode_operations_tmpfile], [
ZFS_LINUX_TEST_SRC([inode_operations_tmpfile_dentry], [
#include <linux/fs.h>
int tmpfile(struct inode *inode, struct dentry *dentry,
umode_t mode) { return 0; }
@@ -30,16 +44,24 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_TMPFILE], [
AC_DEFUN([ZFS_AC_KERNEL_TMPFILE], [
AC_MSG_CHECKING([whether i_op->tmpfile() exists])
ZFS_LINUX_TEST_RESULT([inode_operations_tmpfile_userns], [
ZFS_LINUX_TEST_RESULT([inode_operations_tmpfile], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_TMPFILE, 1, [i_op->tmpfile() exists])
AC_DEFINE(HAVE_TMPFILE_USERNS, 1, [i_op->tmpfile() has userns])
],[
ZFS_LINUX_TEST_RESULT([inode_operations_tmpfile], [
ZFS_LINUX_TEST_RESULT([inode_operations_tmpfile_dentry_userns], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_TMPFILE, 1, [i_op->tmpfile() exists])
AC_DEFINE(HAVE_TMPFILE_USERNS, 1, [i_op->tmpfile() has userns])
AC_DEFINE(HAVE_TMPFILE_DENTRY, 1, [i_op->tmpfile() uses old dentry signature])
],[
AC_MSG_RESULT(no)
ZFS_LINUX_TEST_RESULT([inode_operations_tmpfile_dentry], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_TMPFILE, 1, [i_op->tmpfile() exists])
AC_DEFINE(HAVE_TMPFILE_DENTRY, 1, [i_op->tmpfile() uses old dentry signature])
],[
ZFS_LINUX_REQUIRE_API([i_op->tmpfile()], [3.11])
])
])
])
])
+30
View File
@@ -0,0 +1,30 @@
dnl #
dnl # Linux 5.18 uses filemap_dirty_folio in lieu of
dnl # ___set_page_dirty_nobuffers
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_VFS_FILEMAP_DIRTY_FOLIO], [
ZFS_LINUX_TEST_SRC([vfs_has_filemap_dirty_folio], [
#include <linux/pagemap.h>
#include <linux/writeback.h>
static const struct address_space_operations
aops __attribute__ ((unused)) = {
.dirty_folio = filemap_dirty_folio,
};
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_VFS_FILEMAP_DIRTY_FOLIO], [
dnl #
dnl # Linux 5.18 uses filemap_dirty_folio in lieu of
dnl # ___set_page_dirty_nobuffers
dnl #
AC_MSG_CHECKING([whether filemap_dirty_folio exists])
ZFS_LINUX_TEST_RESULT([vfs_has_filemap_dirty_folio], [
AC_MSG_RESULT([yes])
AC_DEFINE(HAVE_VFS_FILEMAP_DIRTY_FOLIO, 1,
[filemap_dirty_folio exists])
],[
AC_MSG_RESULT([no])
])
])
+2
View File
@@ -134,6 +134,8 @@ AC_DEFUN([ZFS_AC_KERNEL_VFS_IOV_ITER], [
AC_DEFINE(HAVE_IOV_ITER_FAULT_IN_READABLE, 1,
[iov_iter_fault_in_readable() is available])
],[
AC_MSG_RESULT(no)
AC_MSG_CHECKING([whether fault_in_iov_iter_readable() is available])
ZFS_LINUX_TEST_RESULT([fault_in_iov_iter_readable], [
AC_MSG_RESULT(yes)
+32
View File
@@ -0,0 +1,32 @@
dnl #
dnl # Linux 5.19 uses read_folio in lieu of readpage
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_VFS_READ_FOLIO], [
ZFS_LINUX_TEST_SRC([vfs_has_read_folio], [
#include <linux/fs.h>
static int
test_read_folio(struct file *file, struct folio *folio) {
(void) file; (void) folio;
return (0);
}
static const struct address_space_operations
aops __attribute__ ((unused)) = {
.read_folio = test_read_folio,
};
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_VFS_READ_FOLIO], [
dnl #
dnl # Linux 5.19 uses read_folio in lieu of readpage
dnl #
AC_MSG_CHECKING([whether read_folio exists])
ZFS_LINUX_TEST_RESULT([vfs_has_read_folio], [
AC_MSG_RESULT([yes])
AC_DEFINE(HAVE_VFS_READ_FOLIO, 1, [read_folio exists])
],[
AC_MSG_RESULT([no])
])
])
+1 -1
View File
@@ -23,7 +23,7 @@ AC_DEFUN([ZFS_AC_KERNEL_VFS_SET_PAGE_DIRTY_NOBUFFERS], [
dnl # Linux 5.14 change requires set_page_dirty() to be assigned
dnl # in address_space_operations()
dnl #
AC_MSG_CHECKING([__set_page_dirty_nobuffers exists])
AC_MSG_CHECKING([whether __set_page_dirty_nobuffers exists])
ZFS_LINUX_TEST_RESULT([vfs_has_set_page_dirty_nobuffers], [
AC_MSG_RESULT([yes])
AC_DEFINE(HAVE_VFS_SET_PAGE_DIRTY_NOBUFFERS, 1,
+28 -1
View File
@@ -100,6 +100,19 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_XATTR_HANDLER_GET], [
.get = get,
};
],[])
ZFS_LINUX_TEST_SRC([xattr_handler_get_dentry_inode_flags], [
#include <linux/xattr.h>
int get(const struct xattr_handler *handler,
struct dentry *dentry, struct inode *inode,
const char *name, void *buffer,
size_t size, int flags) { return 0; }
static const struct xattr_handler
xops __attribute__ ((unused)) = {
.get = get,
};
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_XATTR_HANDLER_GET], [
@@ -142,7 +155,21 @@ AC_DEFUN([ZFS_AC_KERNEL_XATTR_HANDLER_GET], [
AC_DEFINE(HAVE_XATTR_GET_DENTRY, 1,
[xattr_handler->get() wants dentry])
],[
ZFS_LINUX_TEST_ERROR([xattr get()])
dnl #
dnl # Android API change,
dnl # The xattr_handler->get() callback was
dnl # changed to take dentry, inode and flags.
dnl #
AC_MSG_RESULT(no)
AC_MSG_CHECKING(
[whether xattr_handler->get() wants dentry and inode and flags])
ZFS_LINUX_TEST_RESULT([xattr_handler_get_dentry_inode_flags], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_XATTR_GET_DENTRY_INODE_FLAGS, 1,
[xattr_handler->get() wants dentry and inode and flags])
],[
ZFS_LINUX_TEST_ERROR([xattr get()])
])
])
])
])
+27
View File
@@ -0,0 +1,27 @@
dnl #
dnl # ZERO_PAGE() is an alias for emtpy_zero_page. On certain architectures
dnl # this is a GPL exported variable.
dnl #
dnl #
dnl # Checking if ZERO_PAGE is exported GPL-only
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_ZERO_PAGE], [
ZFS_LINUX_TEST_SRC([zero_page], [
#include <asm/pgtable.h>
], [
struct page *p __attribute__ ((unused));
p = ZERO_PAGE(0);
], [], [ZFS_META_LICENSE])
])
AC_DEFUN([ZFS_AC_KERNEL_ZERO_PAGE], [
AC_MSG_CHECKING([whether ZERO_PAGE() is GPL-only])
ZFS_LINUX_TEST_RESULT([zero_page_license], [
AC_MSG_RESULT(no)
], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_ZERO_PAGE_GPL_ONLY, 1,
[ZERO_PAGE() is GPL-only])
])
])
+69 -37
View File
@@ -8,8 +8,8 @@ AC_DEFUN([ZFS_AC_CONFIG_KERNEL], [
ZFS_AC_QAT
dnl # Sanity checks for module building and CONFIG_* defines
ZFS_AC_KERNEL_TEST_MODULE
ZFS_AC_KERNEL_CONFIG_DEFINED
ZFS_AC_MODULE_SYMVERS
dnl # Sequential ZFS_LINUX_TRY_COMPILE tests
ZFS_AC_KERNEL_FPU_HEADER
@@ -61,6 +61,7 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_SRC], [
ZFS_AC_KERNEL_SRC_BIO
ZFS_AC_KERNEL_SRC_BLKDEV
ZFS_AC_KERNEL_SRC_BLK_QUEUE
ZFS_AC_KERNEL_SRC_GENHD_FLAGS
ZFS_AC_KERNEL_SRC_REVALIDATE_DISK
ZFS_AC_KERNEL_SRC_GET_DISK_RO
ZFS_AC_KERNEL_SRC_GENERIC_READLINK_GLOBAL
@@ -92,6 +93,7 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_SRC], [
ZFS_AC_KERNEL_SRC_SETATTR_PREPARE
ZFS_AC_KERNEL_SRC_INSERT_INODE_LOCKED
ZFS_AC_KERNEL_SRC_DENTRY
ZFS_AC_KERNEL_SRC_DENTRY_ALIAS_D_U
ZFS_AC_KERNEL_SRC_TRUNCATE_SETSIZE
ZFS_AC_KERNEL_SRC_SECURITY_INODE
ZFS_AC_KERNEL_SRC_FST_MOUNT
@@ -99,10 +101,14 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_SRC], [
ZFS_AC_KERNEL_SRC_SET_NLINK
ZFS_AC_KERNEL_SRC_SGET
ZFS_AC_KERNEL_SRC_LSEEK_EXECUTE
ZFS_AC_KERNEL_SRC_VFS_FILEMAP_DIRTY_FOLIO
ZFS_AC_KERNEL_SRC_VFS_READ_FOLIO
ZFS_AC_KERNEL_SRC_VFS_GETATTR
ZFS_AC_KERNEL_SRC_VFS_FSYNC_2ARGS
ZFS_AC_KERNEL_SRC_VFS_ITERATE
ZFS_AC_KERNEL_SRC_VFS_DIRECT_IO
ZFS_AC_KERNEL_SRC_VFS_READPAGES
ZFS_AC_KERNEL_SRC_VFS_SET_PAGE_DIRTY_NOBUFFERS
ZFS_AC_KERNEL_SRC_VFS_RW_ITERATE
ZFS_AC_KERNEL_SRC_VFS_GENERIC_WRITE_CHECKS
ZFS_AC_KERNEL_SRC_VFS_IOV_ITER
@@ -114,7 +120,6 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_SRC], [
ZFS_AC_KERNEL_SRC_FMODE_T
ZFS_AC_KERNEL_SRC_KUIDGID_T
ZFS_AC_KERNEL_SRC_KUID_HELPERS
ZFS_AC_KERNEL_SRC_MODULE_PARAM_CALL_CONST
ZFS_AC_KERNEL_SRC_RENAME
ZFS_AC_KERNEL_SRC_CURRENT_TIME
ZFS_AC_KERNEL_SRC_USERNS_CAPABILITIES
@@ -131,12 +136,14 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_SRC], [
ZFS_AC_KERNEL_SRC_BIO_MAX_SEGS
ZFS_AC_KERNEL_SRC_SIGNAL_STOP
ZFS_AC_KERNEL_SRC_SIGINFO
ZFS_AC_KERNEL_SRC_SYSFS
ZFS_AC_KERNEL_SRC_SET_SPECIAL_STATE
ZFS_AC_KERNEL_SRC_VFS_SET_PAGE_DIRTY_NOBUFFERS
ZFS_AC_KERNEL_SRC_STANDALONE_LINUX_STDARG
ZFS_AC_KERNEL_SRC_PAGEMAP_FOLIO_WAIT_BIT
ZFS_AC_KERNEL_SRC_ADD_DISK
ZFS_AC_KERNEL_SRC_KTHREAD
ZFS_AC_KERNEL_SRC_ZERO_PAGE
ZFS_AC_KERNEL_SRC___COPY_FROM_USER_INATOMIC
AC_MSG_CHECKING([for available kernel interfaces])
ZFS_LINUX_TEST_COMPILE_ALL([kabi])
@@ -171,6 +178,7 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_RESULT], [
ZFS_AC_KERNEL_BIO
ZFS_AC_KERNEL_BLKDEV
ZFS_AC_KERNEL_BLK_QUEUE
ZFS_AC_KERNEL_GENHD_FLAGS
ZFS_AC_KERNEL_REVALIDATE_DISK
ZFS_AC_KERNEL_GET_DISK_RO
ZFS_AC_KERNEL_GENERIC_READLINK_GLOBAL
@@ -202,6 +210,7 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_RESULT], [
ZFS_AC_KERNEL_SETATTR_PREPARE
ZFS_AC_KERNEL_INSERT_INODE_LOCKED
ZFS_AC_KERNEL_DENTRY
ZFS_AC_KERNEL_DENTRY_ALIAS_D_U
ZFS_AC_KERNEL_TRUNCATE_SETSIZE
ZFS_AC_KERNEL_SECURITY_INODE
ZFS_AC_KERNEL_FST_MOUNT
@@ -209,10 +218,14 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_RESULT], [
ZFS_AC_KERNEL_SET_NLINK
ZFS_AC_KERNEL_SGET
ZFS_AC_KERNEL_LSEEK_EXECUTE
ZFS_AC_KERNEL_VFS_FILEMAP_DIRTY_FOLIO
ZFS_AC_KERNEL_VFS_READ_FOLIO
ZFS_AC_KERNEL_VFS_GETATTR
ZFS_AC_KERNEL_VFS_FSYNC_2ARGS
ZFS_AC_KERNEL_VFS_ITERATE
ZFS_AC_KERNEL_VFS_DIRECT_IO
ZFS_AC_KERNEL_VFS_READPAGES
ZFS_AC_KERNEL_VFS_SET_PAGE_DIRTY_NOBUFFERS
ZFS_AC_KERNEL_VFS_RW_ITERATE
ZFS_AC_KERNEL_VFS_GENERIC_WRITE_CHECKS
ZFS_AC_KERNEL_VFS_IOV_ITER
@@ -224,7 +237,6 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_RESULT], [
ZFS_AC_KERNEL_FMODE_T
ZFS_AC_KERNEL_KUIDGID_T
ZFS_AC_KERNEL_KUID_HELPERS
ZFS_AC_KERNEL_MODULE_PARAM_CALL_CONST
ZFS_AC_KERNEL_RENAME
ZFS_AC_KERNEL_CURRENT_TIME
ZFS_AC_KERNEL_USERNS_CAPABILITIES
@@ -241,12 +253,14 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_RESULT], [
ZFS_AC_KERNEL_BIO_MAX_SEGS
ZFS_AC_KERNEL_SIGNAL_STOP
ZFS_AC_KERNEL_SIGINFO
ZFS_AC_KERNEL_SYSFS
ZFS_AC_KERNEL_SET_SPECIAL_STATE
ZFS_AC_KERNEL_VFS_SET_PAGE_DIRTY_NOBUFFERS
ZFS_AC_KERNEL_STANDALONE_LINUX_STDARG
ZFS_AC_KERNEL_PAGEMAP_FOLIO_WAIT_BIT
ZFS_AC_KERNEL_ADD_DISK
ZFS_AC_KERNEL_KTHREAD
ZFS_AC_KERNEL_ZERO_PAGE
ZFS_AC_KERNEL___COPY_FROM_USER_INATOMIC
])
dnl #
@@ -390,11 +404,11 @@ AC_DEFUN([ZFS_AC_KERNEL], [
utsrelease1=$kernelbuild/include/linux/version.h
utsrelease2=$kernelbuild/include/linux/utsrelease.h
utsrelease3=$kernelbuild/include/generated/utsrelease.h
AS_IF([test -r $utsrelease1 && fgrep -q UTS_RELEASE $utsrelease1], [
AS_IF([test -r $utsrelease1 && grep -qF UTS_RELEASE $utsrelease1], [
utsrelease=$utsrelease1
], [test -r $utsrelease2 && fgrep -q UTS_RELEASE $utsrelease2], [
], [test -r $utsrelease2 && grep -qF UTS_RELEASE $utsrelease2], [
utsrelease=$utsrelease2
], [test -r $utsrelease3 && fgrep -q UTS_RELEASE $utsrelease3], [
], [test -r $utsrelease3 && grep -qF UTS_RELEASE $utsrelease3], [
utsrelease=$utsrelease3
])
@@ -435,8 +449,6 @@ AC_DEFUN([ZFS_AC_KERNEL], [
AC_SUBST(LINUX)
AC_SUBST(LINUX_OBJ)
AC_SUBST(LINUX_VERSION)
ZFS_AC_MODULE_SYMVERS
])
dnl #
@@ -531,27 +543,6 @@ AC_DEFUN([ZFS_AC_QAT], [
])
])
dnl #
dnl # Basic toolchain sanity check.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_TEST_MODULE], [
AC_MSG_CHECKING([whether modules can be built])
ZFS_LINUX_TRY_COMPILE([], [], [
AC_MSG_RESULT([yes])
],[
AC_MSG_RESULT([no])
if test "x$enable_linux_builtin" != xyes; then
AC_MSG_ERROR([
*** Unable to build an empty module.
])
else
AC_MSG_ERROR([
*** Unable to build an empty module.
*** Please run 'make scripts' inside the kernel source tree.])
fi
])
])
dnl #
dnl # ZFS_LINUX_CONFTEST_H
dnl #
@@ -654,8 +645,10 @@ AC_DEFUN([ZFS_LINUX_COMPILE], [
build kernel modules with LLVM/CLANG toolchain])
AC_TRY_COMMAND([
KBUILD_MODPOST_NOFINAL="$5" KBUILD_MODPOST_WARN="$6"
make modules -k -j$TEST_JOBS ${KERNEL_CC:+CC=$KERNEL_CC} ${KERNEL_LD:+LD=$KERNEL_LD} ${KERNEL_LLVM:+LLVM=$KERNEL_LLVM} -C $LINUX_OBJ $ARCH_UM
M=$PWD/$1 >$1/build.log 2>&1])
make modules -k -j$TEST_JOBS ${KERNEL_CC:+CC=$KERNEL_CC}
${KERNEL_LD:+LD=$KERNEL_LD} ${KERNEL_LLVM:+LLVM=$KERNEL_LLVM}
CONFIG_MODULES=y CFLAGS_MODULE=-DCONFIG_MODULES
-C $LINUX_OBJ $ARCH_UM M=$PWD/$1 >$1/build.log 2>&1])
AS_IF([AC_TRY_COMMAND([$2])], [$3], [$4])
])
@@ -941,8 +934,47 @@ dnl # like ZFS_LINUX_TRY_COMPILE, except the contents conftest.h are
dnl # provided via the fifth parameter
dnl #
AC_DEFUN([ZFS_LINUX_TRY_COMPILE_HEADER], [
ZFS_LINUX_COMPILE_IFELSE(
[ZFS_LINUX_TEST_PROGRAM([[$1]], [[$2]], [[ZFS_META_LICENSE]])],
[test -f build/conftest/conftest.ko],
[$3], [$4], [$5])
AS_IF([test "x$enable_linux_builtin" = "xyes"], [
ZFS_LINUX_COMPILE_IFELSE(
[ZFS_LINUX_TEST_PROGRAM([[$1]], [[$2]],
[[ZFS_META_LICENSE]])],
[test -f build/conftest/conftest.o], [$3], [$4], [$5])
], [
ZFS_LINUX_COMPILE_IFELSE(
[ZFS_LINUX_TEST_PROGRAM([[$1]], [[$2]],
[[ZFS_META_LICENSE]])],
[test -f build/conftest/conftest.ko], [$3], [$4], [$5])
])
])
dnl #
dnl # AS_VERSION_COMPARE_LE
dnl # like AS_VERSION_COMPARE_LE, but runs $3 if (and only if) $1 <= $2
dnl # AS_VERSION_COMPARE_LE (version-1, version-2, [action-if-less-or-equal], [action-if-greater])
dnl #
AC_DEFUN([AS_VERSION_COMPARE_LE], [
AS_VERSION_COMPARE([$1], [$2], [$3], [$3], [$4])
])
dnl #
dnl # ZFS_LINUX_REQUIRE_API
dnl # like ZFS_LINUX_TEST_ERROR, except only fails if the kernel is
dnl # at least some specified version.
dnl #
AC_DEFUN([ZFS_LINUX_REQUIRE_API], [
AS_VERSION_COMPARE_LE([$2], [$kernsrcver], [
AC_MSG_ERROR([
*** None of the expected "$1" interfaces were detected. This
*** interface is expected for kernels version "$2" and above.
*** This may be because your kernel version is newer than what is
*** supported, or you are using a patched custom kernel with
*** incompatible modifications. Newer kernels may have incompatible
*** APIs.
***
*** ZFS Version: $ZFS_META_ALIAS
*** Compatible Kernels: $ZFS_META_KVER_MIN - $ZFS_META_KVER_MAX
])
], [
AC_MSG_RESULT(no)
])
])
+8 -3
View File
@@ -173,7 +173,7 @@ AC_DEFUN([ZFS_AC_DEBUG_KMEM_TRACKING], [
])
AC_DEFUN([ZFS_AC_DEBUG_INVARIANTS_DETECT_FREEBSD], [
AS_IF([sysctl -n kern.conftxt | fgrep -qx $'options\tINVARIANTS'],
AS_IF([sysctl -n kern.conftxt | grep -Fqx $'options\tINVARIANTS'],
[enable_invariants="yes"],
[enable_invariants="no"])
])
@@ -209,8 +209,8 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS], [
AX_COUNT_CPUS([])
AC_SUBST(CPU_COUNT)
ZFS_AC_CONFIG_ALWAYS_CC_NO_UNUSED_BUT_SET_VARIABLE
ZFS_AC_CONFIG_ALWAYS_CC_NO_BOOL_COMPARE
ZFS_AC_CONFIG_ALWAYS_CC_NO_CLOBBERED
ZFS_AC_CONFIG_ALWAYS_CC_INFINITE_RECURSION
ZFS_AC_CONFIG_ALWAYS_CC_IMPLICIT_FALLTHROUGH
ZFS_AC_CONFIG_ALWAYS_CC_FRAME_LARGER_THAN
ZFS_AC_CONFIG_ALWAYS_CC_NO_FORMAT_TRUNCATION
@@ -226,6 +226,7 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS], [
ZFS_AC_CONFIG_ALWAYS_SED
ZFS_AC_CONFIG_ALWAYS_CPPCHECK
ZFS_AC_CONFIG_ALWAYS_SHELLCHECK
ZFS_AC_CONFIG_ALWAYS_PARALLEL
])
AC_DEFUN([ZFS_AC_CONFIG], [
@@ -323,6 +324,10 @@ AC_DEFUN([ZFS_AC_RPM], [
RPM_DEFINE_COMMON=${RPM_DEFINE_COMMON}' --define "$(DEBUG_KMEM_TRACKING_ZFS) 1"'
RPM_DEFINE_COMMON=${RPM_DEFINE_COMMON}' --define "$(ASAN_ZFS) 1"'
AS_IF([test "x$enable_debuginfo" = xyes], [
RPM_DEFINE_COMMON=${RPM_DEFINE_COMMON}' --define "__strip /bin/true"'
])
RPM_DEFINE_UTIL=' --define "_initconfdir $(initconfdir)"'
dnl # Make the next three RPM_DEFINE_UTIL additions conditional, since
@@ -57,6 +57,12 @@ array_contains () {
}
check() {
# https://github.com/dracutdevs/dracut/pull/1711 provides a zfs_devs
# function to detect the physical devices backing zfs pools. If this
# function exists in the version of dracut this module is being called
# from, then it does not need to run.
type zfs_devs >/dev/null 2>&1 && return 1
local mp
local dev
local blockdevs
+1 -3
View File
@@ -1,14 +1,12 @@
#!/bin/sh
. /lib/dracut-zfs-lib.sh
_do_zpool_export() {
ret=0
errs=""
final="${1}"
info "ZFS: Exporting ZFS storage pools..."
errs=$(export_all -F 2>&1)
errs=$(zpool export -aF 2>&1)
ret=$?
[ -z "${errs}" ] || echo "${errs}" | vwarn
if [ "x${ret}" != "x0" ]; then
+71 -101
View File
@@ -6,8 +6,8 @@ check() {
[ "${1}" = "-d" ] && return 0
# Verify the zfs tool chain
for tool in "@sbindir@/zgenhostid" "@sbindir@/zpool" "@sbindir@/zfs" "@mounthelperdir@/mount.zfs" ; do
test -x "$tool" || return 1
for tool in "zgenhostid" "zpool" "zfs" "mount.zfs"; do
command -v "${tool}" >/dev/null || return 1
done
return 0
@@ -19,125 +19,95 @@ depends() {
}
installkernel() {
instmods zfs
instmods zcommon
instmods znvpair
instmods zavl
instmods zunicode
instmods zlua
instmods icp
instmods spl
instmods zlib_deflate
instmods zlib_inflate
instmods -c zfs
}
install() {
inst_rules @udevruledir@/90-zfs.rules
inst_rules @udevruledir@/69-vdev.rules
inst_rules @udevruledir@/60-zvol.rules
dracut_install hostid
dracut_install grep
dracut_install @sbindir@/zgenhostid
dracut_install @sbindir@/zfs
dracut_install @sbindir@/zpool
# Workaround for https://github.com/openzfs/zfs/issues/4749 by
# ensuring libgcc_s.so(.1) is included
if ldd @sbindir@/zpool | grep -qF 'libgcc_s.so'; then
# Dracut will have already tracked and included it
:;
elif command -v gcc-config >/dev/null 2>&1; then
# On systems with gcc-config (Gentoo, Funtoo, etc.):
# Use the current profile to resolve the appropriate path
s="$(gcc-config -c)"
dracut_install "/usr/lib/gcc/${s%-*}/${s##*-}/libgcc_s.so"*
elif [ "$(echo /usr/lib/libgcc_s.so*)" != "/usr/lib/libgcc_s.so*" ]; then
# Try a simple path first
dracut_install /usr/lib/libgcc_s.so*
elif [ "$(echo /lib*/libgcc_s.so*)" != "/lib*/libgcc_s.so*" ]; then
# SUSE
dracut_install /lib*/libgcc_s.so*
else
# Fallback: Guess the path and include all matches
dracut_install /usr/lib*/gcc/**/libgcc_s.so*
inst_rules 90-zfs.rules 69-vdev.rules 60-zvol.rules
inst_multiple \
zgenhostid \
zfs \
zpool \
mount.zfs \
hostid \
grep \
awk \
tr \
cut \
head ||
{ dfatal "Failed to install essential binaries"; exit 1; }
# Adapted from https://github.com/zbm-dev/zfsbootmenu
if ! ldd "$(command -v zpool)" | grep -qF 'libgcc_s.so'; then
# On systems with gcc-config (Gentoo, Funtoo, etc.), use it to find libgcc_s
if command -v gcc-config >/dev/null; then
inst_simple "/usr/lib/gcc/$(s=$(gcc-config -c); echo "${s%-*}/${s##*-}")/libgcc_s.so.1" ||
{ dfatal "Unable to install libgcc_s.so"; exit 1; }
# Otherwise, use dracut's library installation function to find the right one
elif ! inst_libdir_file "libgcc_s.so*"; then
# If all else fails, just try looking for some gcc arch directory
inst_simple /usr/lib/gcc/*/*/libgcc_s.so* ||
{ dfatal "Unable to install libgcc_s.so"; exit 1; }
fi
fi
# shellcheck disable=SC2050
if [ @LIBFETCH_DYNAMIC@ -gt 0 ]; then
for d in $libdirs; do
[ -e "$d/@LIBFETCH_SONAME@" ] && dracut_install "$d/@LIBFETCH_SONAME@"
done
fi
dracut_install @mounthelperdir@/mount.zfs
dracut_install @udevdir@/vdev_id
dracut_install awk
dracut_install cut
dracut_install tr
dracut_install head
dracut_install @udevdir@/zvol_id
inst_hook cmdline 95 "${moddir}/parse-zfs.sh"
if [ -n "$systemdutildir" ] ; then
inst_script "${moddir}/zfs-generator.sh" "$systemdutildir"/system-generators/dracut-zfs-generator
if [ -n "${systemdutildir}" ]; then
inst_script "${moddir}/zfs-generator.sh" "${systemdutildir}/system-generators/dracut-zfs-generator"
fi
inst_hook pre-mount 90 "${moddir}/zfs-load-key.sh"
inst_hook mount 98 "${moddir}/mount-zfs.sh"
inst_hook cleanup 99 "${moddir}/zfs-needshutdown.sh"
inst_hook shutdown 20 "${moddir}/export-zfs.sh"
inst_simple "${moddir}/zfs-lib.sh" "/lib/dracut-zfs-lib.sh"
if [ -e @sysconfdir@/zfs/zpool.cache ]; then
inst @sysconfdir@/zfs/zpool.cache
type mark_hostonly >/dev/null 2>&1 && mark_hostonly @sysconfdir@/zfs/zpool.cache
fi
inst_script "${moddir}/zfs-lib.sh" "/lib/dracut-zfs-lib.sh"
if [ -e @sysconfdir@/zfs/vdev_id.conf ]; then
inst @sysconfdir@/zfs/vdev_id.conf
type mark_hostonly >/dev/null 2>&1 && mark_hostonly @sysconfdir@/zfs/vdev_id.conf
fi
# -H ensures they are marked host-only
# -o ensures there is no error upon absence of these files
inst_multiple -o -H \
"@sysconfdir@/zfs/zpool.cache" \
"@sysconfdir@/zfs/vdev_id.conf"
# Synchronize initramfs and system hostid
if [ -f @sysconfdir@/hostid ]; then
inst @sysconfdir@/hostid
type mark_hostonly >/dev/null 2>&1 && mark_hostonly @sysconfdir@/hostid
elif HOSTID="$(hostid 2>/dev/null)" && [ "${HOSTID}" != "00000000" ]; then
zgenhostid -o "${initdir}@sysconfdir@/hostid" "${HOSTID}"
type mark_hostonly >/dev/null 2>&1 && mark_hostonly @sysconfdir@/hostid
if ! inst_simple -H @sysconfdir@/hostid; then
if HOSTID="$(hostid 2>/dev/null)" && [ "${HOSTID}" != "00000000" ]; then
zgenhostid -o "${initdir}@sysconfdir@/hostid" "${HOSTID}"
mark_hostonly @sysconfdir@/hostid
fi
fi
if dracut_module_included "systemd"; then
mkdir -p "${initdir}/$systemdsystemunitdir/zfs-import.target.wants"
for _service in "zfs-import-scan.service" "zfs-import-cache.service" ; do
dracut_install "@systemdunitdir@/$_service"
if ! [ -L "${initdir}/$systemdsystemunitdir/zfs-import.target.wants/$_service" ]; then
ln -sf ../$_service "${initdir}/$systemdsystemunitdir/zfs-import.target.wants/$_service"
type mark_hostonly >/dev/null 2>&1 && mark_hostonly "@systemdunitdir@/$_service"
fi
inst_simple "${systemdsystemunitdir}/zfs-import.target"
systemctl -q --root "${initdir}" add-wants initrd.target zfs-import.target
inst_simple "${moddir}/zfs-env-bootfs.service" "${systemdsystemunitdir}/zfs-env-bootfs.service"
systemctl -q --root "${initdir}" add-wants zfs-import.target zfs-env-bootfs.service
for _service in \
"zfs-import-scan.service" \
"zfs-import-cache.service"; do
inst_simple "${systemdsystemunitdir}/${_service}"
systemctl -q --root "${initdir}" add-wants zfs-import.target "${_service}"
# Add user-provided unit overrides
# - /etc/systemd/system/zfs-import-{scan,cache}.service
# - /etc/systemd/system/zfs-import-{scan,cache}.service.d/overrides.conf
# -H ensures they are marked host-only
# -o ensures there is no error upon absence of these files
inst_multiple -o -H \
"${systemdsystemconfdir}/${_service}" \
"${systemdsystemconfdir}/${_service}.d/"*.conf
done
inst "${moddir}"/zfs-env-bootfs.service "${systemdsystemunitdir}"/zfs-env-bootfs.service
ln -s ../zfs-env-bootfs.service "${initdir}/${systemdsystemunitdir}/zfs-import.target.wants"/zfs-env-bootfs.service
type mark_hostonly >/dev/null 2>&1 && mark_hostonly @systemdunitdir@/zfs-env-bootfs.service
dracut_install systemd-ask-password
dracut_install systemd-tty-ask-password-agent
mkdir -p "${initdir}/$systemdsystemunitdir/initrd.target.wants"
dracut_install @systemdunitdir@/zfs-import.target
if ! [ -L "${initdir}/$systemdsystemunitdir/initrd.target.wants"/zfs-import.target ]; then
ln -s ../zfs-import.target "${initdir}/$systemdsystemunitdir/initrd.target.wants"/zfs-import.target
type mark_hostonly >/dev/null 2>&1 && mark_hostonly @systemdunitdir@/zfs-import.target
fi
for _service in zfs-snapshot-bootfs.service zfs-rollback-bootfs.service ; do
inst "${moddir}/$_service" "${systemdsystemunitdir}/$_service"
if ! [ -L "${initdir}/$systemdsystemunitdir/initrd.target.wants/$_service" ]; then
ln -s "../$_service" "${initdir}/$systemdsystemunitdir/initrd.target.wants/$_service"
fi
for _service in \
"zfs-snapshot-bootfs.service" \
"zfs-rollback-bootfs.service"; do
inst_simple "${moddir}/${_service}" "${systemdsystemunitdir}/${_service}"
systemctl -q --root "${initdir}" add-wants initrd.target "${_service}"
done
# There isn't a pkg-config variable for this,
# and dracut doesn't automatically resolve anything this'd be next to
local systemdsystemenvironmentgeneratordir
systemdsystemenvironmentgeneratordir="$(pkg-config --variable=prefix systemd || echo "/usr")/lib/systemd/system-environment-generators"
mkdir -p "${initdir}/${systemdsystemenvironmentgeneratordir}"
inst "${moddir}"/import-opts-generator.sh "${systemdsystemenvironmentgeneratordir}"/zfs-import-opts.sh
inst_simple "${moddir}/import-opts-generator.sh" "${systemdutildir}/system-environment-generators/zfs-import-opts.sh"
fi
}
+83 -49
View File
@@ -3,48 +3,73 @@
. /lib/dracut-zfs-lib.sh
ZFS_DATASET=""
ZFS_POOL=""
case "${root}" in
zfs:*) ;;
*) return ;;
esac
decode_root_args || return 0
GENERATOR_FILE=/run/systemd/generator/sysroot.mount
GENERATOR_EXTENSION=/run/systemd/generator/sysroot.mount.d/zfs-enhancement.conf
if [ -e "$GENERATOR_FILE" ] && [ -e "$GENERATOR_EXTENSION" ] ; then
# If the ZFS sysroot.mount flag exists, the initial RAM disk configured
# it to mount ZFS on root. In that case, we bail early. This flag
# file gets created by the zfs-generator program upon successful run.
info "ZFS: There is a sysroot.mount and zfs-generator has extended it."
info "ZFS: Delegating root mount to sysroot.mount."
# Let us tell the initrd to run on shutdown.
# We have a shutdown hook to run
# because we imported the pool.
if [ -e "$GENERATOR_FILE" ] && [ -e "$GENERATOR_EXTENSION" ]; then
# We're under systemd and dracut-zfs-generator ran to completion.
info "ZFS: Delegating root mount to sysroot.mount at al."
# We now prevent Dracut from running this thing again.
for zfsmounthook in "$hookdir"/mount/*zfs* ; do
if [ -f "$zfsmounthook" ] ; then
rm -f "$zfsmounthook"
fi
done
rm -f "$hookdir"/mount/*zfs*
return
fi
info "ZFS: No sysroot.mount exists or zfs-generator did not extend it."
info "ZFS: Mounting root with the traditional mount-zfs.sh instead."
# ask_for_password tries prompt cmd
#
# Wraps around plymouth ask-for-password and adds fallback to tty password ask
# if plymouth is not present.
ask_for_password() {
tries="$1"
prompt="$2"
cmd="$3"
{
flock -s 9
# Prompt for password with plymouth, if installed and running.
if plymouth --ping 2>/dev/null; then
plymouth ask-for-password \
--prompt "$prompt" --number-of-tries="$tries" | \
eval "$cmd"
ret=$?
else
i=1
while [ "$i" -le "$tries" ]; do
printf "%s [%i/%i]:" "$prompt" "$i" "$tries" >&2
eval "$cmd" && ret=0 && break
ret=$?
i=$((i+1))
printf '\n' >&2
done
unset i
fi
} 9>/.console_lock
[ "$ret" -ne 0 ] && echo "Wrong password" >&2
return "$ret"
}
# Delay until all required block devices are present.
modprobe zfs 2>/dev/null
udevadm settle
ZFS_DATASET=
ZFS_POOL=
if [ "${root}" = "zfs:AUTO" ] ; then
if ! ZFS_DATASET="$(find_bootfs)" ; then
if ! ZFS_DATASET="$(zpool get -Ho value bootfs | grep -m1 -vFx -)"; then
# shellcheck disable=SC2086
zpool import -N -a ${ZPOOL_IMPORT_OPTS}
if ! ZFS_DATASET="$(find_bootfs)" ; then
if ! ZFS_DATASET="$(zpool get -Ho value bootfs | grep -m1 -vFx -)"; then
warn "ZFS: No bootfs attribute found in importable pools."
export_all -F
zpool export -aF
rootok=0
return 1
@@ -53,34 +78,43 @@ if [ "${root}" = "zfs:AUTO" ] ; then
info "ZFS: Using ${ZFS_DATASET} as root."
fi
ZFS_DATASET="${ZFS_DATASET:-${root#zfs:}}"
ZFS_DATASET="${ZFS_DATASET:-${root}}"
ZFS_POOL="${ZFS_DATASET%%/*}"
if import_pool "${ZFS_POOL}" ; then
# Load keys if we can or if we need to
if [ "$(zpool list -H -o feature@encryption "${ZFS_POOL}")" = 'active' ]; then
# if the root dataset has encryption enabled
ENCRYPTIONROOT="$(zfs get -H -o value encryptionroot "${ZFS_DATASET}")"
if ! [ "${ENCRYPTIONROOT}" = "-" ]; then
KEYSTATUS="$(zfs get -H -o value keystatus "${ENCRYPTIONROOT}")"
# if the key needs to be loaded
if [ "$KEYSTATUS" = "unavailable" ]; then
# decrypt them
ask_for_password \
--tries 5 \
--prompt "Encrypted ZFS password for ${ENCRYPTIONROOT}: " \
--cmd "zfs load-key '${ENCRYPTIONROOT}'"
fi
if ! zpool get -Ho value name "${ZFS_POOL}" > /dev/null 2>&1; then
info "ZFS: Importing pool ${ZFS_POOL}..."
# shellcheck disable=SC2086
if ! zpool import -N ${ZPOOL_IMPORT_OPTS} "${ZFS_POOL}"; then
warn "ZFS: Unable to import pool ${ZFS_POOL}"
rootok=0
return 1
fi
fi
# Load keys if we can or if we need to
# TODO: for_relevant_root_children like in zfs-load-key.sh.in
if [ "$(zpool get -Ho value feature@encryption "${ZFS_POOL}")" = 'active' ]; then
# if the root dataset has encryption enabled
ENCRYPTIONROOT="$(zfs get -Ho value encryptionroot "${ZFS_DATASET}")"
if ! [ "${ENCRYPTIONROOT}" = "-" ]; then
KEYSTATUS="$(zfs get -Ho value keystatus "${ENCRYPTIONROOT}")"
# if the key needs to be loaded
if [ "$KEYSTATUS" = "unavailable" ]; then
# decrypt them
ask_for_password \
5 \
"Encrypted ZFS password for ${ENCRYPTIONROOT}: " \
"zfs load-key '${ENCRYPTIONROOT}'"
fi
fi
# Let us tell the initrd to run on shutdown.
# We have a shutdown hook to run
# because we imported the pool.
info "ZFS: Mounting dataset ${ZFS_DATASET}..."
if mount_dataset "${ZFS_DATASET}" ; then
ROOTFS_MOUNTED=yes
return 0
fi
fi
rootok=0
# Let us tell the initrd to run on shutdown.
# We have a shutdown hook to run
# because we imported the pool.
info "ZFS: Mounting dataset ${ZFS_DATASET}..."
if ! mount_dataset "${ZFS_DATASET}"; then
rootok=0
return 1
fi
+17 -45
View File
@@ -1,7 +1,8 @@
#!/bin/sh
# shellcheck disable=SC2034,SC2154
. /lib/dracut-lib.sh
# shellcheck source=zfs-lib.sh.in
. /lib/dracut-zfs-lib.sh
# Let the command line override our host id.
spl_hostid=$(getarg spl_hostid=)
@@ -15,49 +16,20 @@ else
warn "ZFS: Pools may not import correctly."
fi
wait_for_zfs=0
case "${root}" in
""|zfs|zfs:)
# We'll take root unset, root=zfs, or root=zfs:
# No root set, so we want to read the bootfs attribute. We
# can't do that until udev settles so we'll set dummy values
# and hope for the best later on.
root="zfs:AUTO"
rootok=1
wait_for_zfs=1
if decode_root_args; then
if [ "$root" = "zfs:AUTO" ]; then
info "ZFS: Boot dataset autodetected from bootfs=."
else
info "ZFS: Boot dataset is ${root}."
fi
info "ZFS: Enabling autodetection of bootfs after udev settles."
;;
ZFS=*|zfs:*|FILESYSTEM=*)
# root is explicit ZFS root. Parse it now. We can handle
# a root=... param in any of the following formats:
# root=ZFS=rpool/ROOT
# root=zfs:rpool/ROOT
# root=zfs:FILESYSTEM=rpool/ROOT
# root=FILESYSTEM=rpool/ROOT
# root=ZFS=pool+with+space/ROOT+WITH+SPACE (translates to root=ZFS=pool with space/ROOT WITH SPACE)
# Strip down to just the pool/fs
root="${root#zfs:}"
root="${root#FILESYSTEM=}"
root="zfs:${root#ZFS=}"
# switch + with spaces because kernel cmdline does not allow us to quote parameters
root=$(echo "$root" | tr '+' ' ')
rootok=1
wait_for_zfs=1
info "ZFS: Set ${root} as bootfs."
;;
esac
# Make sure Dracut is happy that we have a root and will wait for ZFS
# modules to settle before mounting.
if [ ${wait_for_zfs} -eq 1 ]; then
ln -s /dev/null /dev/root 2>/dev/null
initqueuedir="${hookdir}/initqueue/finished"
test -d "${initqueuedir}" || {
initqueuedir="${hookdir}/initqueue-finished"
}
echo '[ -e /dev/zfs ]' > "${initqueuedir}/zfs.sh"
rootok=1
# Make sure Dracut is happy that we have a root and will wait for ZFS
# modules to settle before mounting.
if [ -n "${wait_for_zfs}" ]; then
ln -s null /dev/root
echo '[ -e /dev/zfs ]' > "${hookdir}/initqueue/finished/zfs.sh"
fi
else
info "ZFS: no ZFS-on-root."
fi
@@ -8,7 +8,7 @@ Before=zfs-import.target
[Service]
Type=oneshot
ExecStart=/bin/sh -c "exec systemctl set-environment BOOTFS=$(@sbindir@/zpool list -H -o bootfs | grep -m1 -v '^-$')"
ExecStart=/bin/sh -c "exec systemctl set-environment BOOTFS=$(@sbindir@/zpool list -H -o bootfs | grep -m1 -vFx -)"
[Install]
WantedBy=zfs-import.target
+5 -25
View File
@@ -1,5 +1,5 @@
#!/bin/sh
# shellcheck disable=SC2016,SC1004
# shellcheck disable=SC2016,SC1004,SC2154
grep -wq debug /proc/cmdline && debug=1
[ -n "$debug" ] && echo "zfs-generator: starting" >> /dev/kmsg
@@ -10,37 +10,17 @@ GENERATOR_DIR="$1"
exit 1
}
[ -f /lib/dracut-lib.sh ] && dracutlib=/lib/dracut-lib.sh
[ -f /usr/lib/dracut/modules.d/99base/dracut-lib.sh ] && dracutlib=/usr/lib/dracut/modules.d/99base/dracut-lib.sh
command -v getarg >/dev/null 2>&1 || {
[ -n "$debug" ] && echo "zfs-generator: loading Dracut library from $dracutlib" >> /dev/kmsg
. "$dracutlib"
}
# shellcheck source=zfs-lib.sh.in
. /lib/dracut-zfs-lib.sh
decode_root_args || exit 0
[ -z "$root" ] && root=$(getarg root=)
[ -z "$rootfstype" ] && rootfstype=$(getarg rootfstype=)
[ -z "$rootflags" ] && rootflags=$(getarg rootflags=)
# If root is not ZFS= or zfs: or rootfstype is not zfs
# then we are not supposed to handle it.
[ "${root##zfs:}" = "${root}" ] &&
[ "${root##ZFS=}" = "${root}" ] &&
[ "$rootfstype" != "zfs" ] &&
exit 0
[ -z "${rootflags}" ] && rootflags=$(getarg rootflags=)
case ",${rootflags}," in
*,zfsutil,*) ;;
,,) rootflags=zfsutil ;;
*) rootflags="zfsutil,${rootflags}" ;;
esac
if [ "${root}" != "zfs:AUTO" ]; then
root="${root##zfs:}"
root="${root##ZFS=}"
fi
[ -n "$debug" ] && echo "zfs-generator: writing extension for sysroot.mount to $GENERATOR_DIR/sysroot.mount.d/zfs-enhancement.conf" >> /dev/kmsg
@@ -89,7 +69,7 @@ else
_zfs_generator_cb() {
dset="${1}"
mpnt="${2}"
unit="sysroot$(echo "$mpnt" | tr '/' '-').mount"
unit="$(systemd-escape --suffix=mount -p "/sysroot${mpnt}")"
{
echo "[Unit]"
+54 -142
View File
@@ -1,74 +1,16 @@
#!/bin/sh
# shellcheck disable=SC2034
command -v getarg >/dev/null || . /lib/dracut-lib.sh
command -v getargbool >/dev/null || {
# Compatibility with older Dracut versions.
# With apologies to the Dracut developers.
getargbool() {
_default="$1"; shift
! _b=$(getarg "$@") && [ -z "$_b" ] && _b="$_default"
if [ -n "$_b" ]; then
[ "$_b" = "0" ] && return 1
[ "$_b" = "no" ] && return 1
[ "$_b" = "off" ] && return 1
fi
return 0
}
}
command -v getarg >/dev/null || . /lib/dracut-lib.sh || . /usr/lib/dracut/modules.d/99base/dracut-lib.sh
OLDIFS="${IFS}"
NEWLINE="
"
TAB=" "
ZPOOL_IMPORT_OPTS=""
if getargbool 0 zfs_force -y zfs.force -y zfsforce ; then
ZPOOL_IMPORT_OPTS=
if getargbool 0 zfs_force -y zfs.force -y zfsforce; then
warn "ZFS: Will force-import pools if necessary."
ZPOOL_IMPORT_OPTS="${ZPOOL_IMPORT_OPTS} -f"
ZPOOL_IMPORT_OPTS=-f
fi
# find_bootfs
# returns the first dataset with the bootfs attribute.
find_bootfs() {
IFS="${NEWLINE}"
for dataset in $(zpool list -H -o bootfs); do
case "${dataset}" in
"" | "-")
continue
;;
"no pools available")
IFS="${OLDIFS}"
return 1
;;
*)
IFS="${OLDIFS}"
echo "${dataset}"
return 0
;;
esac
done
IFS="${OLDIFS}"
return 1
}
# import_pool POOL
# imports the given zfs pool if it isn't imported already.
import_pool() {
pool="${1}"
if ! zpool list -H "${pool}" > /dev/null 2>&1; then
info "ZFS: Importing pool ${pool}..."
# shellcheck disable=SC2086
if ! zpool import -N ${ZPOOL_IMPORT_OPTS} "${pool}" ; then
warn "ZFS: Unable to import pool ${pool}"
return 1
fi
fi
return 0
}
_mount_dataset_cb() {
mount -o zfsutil -t zfs "${1}" "${NEWROOT}${2}"
}
@@ -121,87 +63,57 @@ for_relevant_root_children() {
)
}
# export_all OPTS
# exports all imported zfs pools.
export_all() {
ret=0
IFS="${NEWLINE}"
for pool in $(zpool list -H -o name) ; do
if zpool list -H "${pool}" > /dev/null 2>&1; then
zpool export "${pool}" "$@" || ret=$?
fi
done
IFS="${OLDIFS}"
return ${ret}
}
# ask_for_password
# Parse root=, rootfstype=, return them decoded and normalised to zfs:AUTO for auto, plain dset for explicit
#
# Wraps around plymouth ask-for-password and adds fallback to tty password ask
# if plymouth is not present.
# True if ZFS-on-root, false if we shouldn't
#
# --cmd command
# Command to execute. Required.
# --prompt prompt
# Password prompt. Note that function already adds ':' at the end.
# Recommended.
# --tries n
# How many times repeat command on its failure. Default is 3.
# --ply-[cmd|prompt|tries]
# Command/prompt/tries specific for plymouth password ask only.
# --tty-[cmd|prompt|tries]
# Command/prompt/tries specific for tty password ask only.
# --tty-echo-off
# Turn off input echo before tty command is executed and turn on after.
# It's useful when password is read from stdin.
ask_for_password() {
ply_tries=3
tty_tries=3
while [ "$#" -gt 0 ]; do
case "$1" in
--cmd) ply_cmd="$2"; tty_cmd="$2"; shift;;
--ply-cmd) ply_cmd="$2"; shift;;
--tty-cmd) tty_cmd="$2"; shift;;
--prompt) ply_prompt="$2"; tty_prompt="$2"; shift;;
--ply-prompt) ply_prompt="$2"; shift;;
--tty-prompt) tty_prompt="$2"; shift;;
--tries) ply_tries="$2"; tty_tries="$2"; shift;;
--ply-tries) ply_tries="$2"; shift;;
--tty-tries) tty_tries="$2"; shift;;
--tty-echo-off) tty_echo_off=yes;;
# Supported values:
# root=
# root=zfs
# root=zfs:
# root=zfs:AUTO
#
# root=ZFS=data/set
# root=zfs:data/set
# root=zfs:ZFS=data/set (as a side-effect; allowed but undocumented)
#
# rootfstype=zfs AND root=data/set <=> root=data/set
# rootfstype=zfs AND root= <=> root=zfs:AUTO
#
# '+'es in explicit dataset decoded to ' 's.
decode_root_args() {
if [ -n "$rootfstype" ]; then
[ "$rootfstype" = zfs ]
return
fi
root=$(getarg root=)
rootfstype=$(getarg rootfstype=)
# shellcheck disable=SC2249
case "$root" in
""|zfs|zfs:|zfs:AUTO)
root=zfs:AUTO
rootfstype=zfs
return 0
;;
ZFS=*|zfs:*)
root="${root#zfs:}"
root="${root#ZFS=}"
root=$(echo "$root" | tr '+' ' ')
rootfstype=zfs
return 0
;;
esac
if [ "$rootfstype" = "zfs" ]; then
case "$root" in
"") root=zfs:AUTO ;;
*) root=$(echo "$root" | tr '+' ' ') ;;
esac
shift
done
return 0
fi
{ flock -s 9;
# Prompt for password with plymouth, if installed and running.
if plymouth --ping 2>/dev/null; then
plymouth ask-for-password \
--prompt "$ply_prompt" --number-of-tries="$ply_tries" | \
eval "$ply_cmd"
ret=$?
else
if [ "$tty_echo_off" = yes ]; then
stty_orig="$(stty -g)"
stty -echo
fi
i=1
while [ "$i" -le "$tty_tries" ]; do
[ -n "$tty_prompt" ] && \
printf "%s [%i/%i]:" "$tty_prompt" "$i" "$tty_tries" >&2
eval "$tty_cmd" && ret=0 && break
ret=$?
i=$((i+1))
[ -n "$tty_prompt" ] && printf '\n' >&2
done
unset i
[ "$tty_echo_off" = yes ] && stty "$stty_orig"
fi
} 9>/.console_lock
[ $ret -ne 0 ] && echo "Wrong password" >&2
return $ret
return 1
}
+48 -57
View File
@@ -4,70 +4,61 @@
# only run this on systemd systems, we handle the decrypt in mount-zfs.sh in the mount hook otherwise
[ -e /bin/systemctl ] || [ -e /usr/bin/systemctl ] || return 0
# This script only gets executed on systemd systems, see mount-zfs.sh for non-systemd systems
# shellcheck source=zfs-lib.sh.in
. /lib/dracut-zfs-lib.sh
# import the libs now that we know the pool imported
[ -f /lib/dracut-lib.sh ] && dracutlib=/lib/dracut-lib.sh
[ -f /usr/lib/dracut/modules.d/99base/dracut-lib.sh ] && dracutlib=/usr/lib/dracut/modules.d/99base/dracut-lib.sh
# shellcheck source=./lib-zfs.sh.in
. "$dracutlib"
# load the kernel command line vars
[ -z "$root" ] && root="$(getarg root=)"
# If root is not ZFS= or zfs: or rootfstype is not zfs then we are not supposed to handle it.
[ "${root##zfs:}" = "${root}" ] && [ "${root##ZFS=}" = "${root}" ] && [ "$rootfstype" != "zfs" ] && exit 0
decode_root_args || return 0
# There is a race between the zpool import and the pre-mount hooks, so we wait for a pool to be imported
while [ "$(zpool list -H)" = "" ]; do
systemctl is-failed --quiet zfs-import-cache.service zfs-import-scan.service && exit 1
while ! systemctl is-active --quiet zfs-import.target; do
systemctl is-failed --quiet zfs-import-cache.service zfs-import-scan.service && return 1
sleep 0.1s
done
# run this after import as zfs-import-cache/scan service is confirmed good
# we do not overwrite the ${root} variable, but create a new one, BOOTFS, to hold the dataset
if [ "${root}" = "zfs:AUTO" ] ; then
BOOTFS="$(zpool list -H -o bootfs | awk '$1 != "-" {print; exit}')"
else
BOOTFS="${root##zfs:}"
BOOTFS="${BOOTFS##ZFS=}"
BOOTFS="$root"
if [ "$BOOTFS" = "zfs:AUTO" ]; then
BOOTFS="$(zpool get -Ho value bootfs | grep -m1 -vFx -)"
fi
# if pool encryption is active and the zfs command understands '-o encryption'
if [ "$(zpool list -H -o feature@encryption "${BOOTFS%%/*}")" = 'active' ]; then
# if the root dataset has encryption enabled
ENCRYPTIONROOT="$(zfs get -H -o value encryptionroot "${BOOTFS}")"
if ! [ "${ENCRYPTIONROOT}" = "-" ]; then
KEYSTATUS="$(zfs get -H -o value keystatus "${ENCRYPTIONROOT}")"
# continue only if the key needs to be loaded
[ "$KEYSTATUS" = "unavailable" ] || exit 0
[ "$(zpool get -Ho value feature@encryption "${BOOTFS%%/*}")" = 'active' ] || return 0
KEYLOCATION="$(zfs get -H -o value keylocation "${ENCRYPTIONROOT}")"
case "${KEYLOCATION%%://*}" in
prompt)
for _ in 1 2 3; do
systemd-ask-password --no-tty "Encrypted ZFS password for ${BOOTFS}" | zfs load-key "${ENCRYPTIONROOT}" && break
_load_key_cb() {
dataset="$1"
ENCRYPTIONROOT="$(zfs get -Ho value encryptionroot "${dataset}")"
[ "${ENCRYPTIONROOT}" = "-" ] && return 0
[ "$(zfs get -Ho value keystatus "${ENCRYPTIONROOT}")" = "unavailable" ] || return 0
KEYLOCATION="$(zfs get -Ho value keylocation "${ENCRYPTIONROOT}")"
case "${KEYLOCATION%%://*}" in
prompt)
for _ in 1 2 3; do
systemd-ask-password --no-tty "Encrypted ZFS password for ${dataset}" | zfs load-key "${ENCRYPTIONROOT}" && break
done
;;
http*)
systemctl start network-online.target
zfs load-key "${ENCRYPTIONROOT}"
;;
file)
KEYFILE="${KEYLOCATION#file://}"
[ -r "${KEYFILE}" ] || udevadm settle
[ -r "${KEYFILE}" ] || {
info "ZFS: Waiting for key ${KEYFILE} for ${ENCRYPTIONROOT}..."
for _ in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20; do
sleep 0.5s
[ -r "${KEYFILE}" ] && break
done
;;
http*)
systemctl start network-online.target
zfs load-key "${ENCRYPTIONROOT}"
;;
file)
KEYFILE="${KEYLOCATION#file://}"
[ -r "${KEYFILE}" ] || udevadm settle
[ -r "${KEYFILE}" ] || {
info "Waiting for key ${KEYFILE} for ${ENCRYPTIONROOT}..."
for _ in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20; do
sleep 0.5s
[ -r "${KEYFILE}" ] && break
done
}
[ -r "${KEYFILE}" ] || warn "Key ${KEYFILE} for ${ENCRYPTIONROOT} hasn't appeared. Trying anyway."
zfs load-key "${ENCRYPTIONROOT}"
;;
*)
zfs load-key "${ENCRYPTIONROOT}"
;;
esac
fi
fi
}
[ -r "${KEYFILE}" ] || warn "ZFS: Key ${KEYFILE} for ${ENCRYPTIONROOT} hasn't appeared. Trying anyway."
zfs load-key "${ENCRYPTIONROOT}"
;;
*)
zfs load-key "${ENCRYPTIONROOT}"
;;
esac
}
_load_key_cb "$BOOTFS"
for_relevant_root_children "$BOOTFS" _load_key_cb
+1 -1
View File
@@ -2,7 +2,7 @@
command -v getarg >/dev/null 2>&1 || . /lib/dracut-lib.sh
if zpool list 2>&1 | grep -q 'no pools available' ; then
if [ -z "$(zpool get -Ho value name)" ]; then
info "ZFS: No active pools, no need to export anything."
else
info "ZFS: There is an active pool, will export it."
@@ -1,14 +1,12 @@
[Unit]
Description=Rollback bootfs just before it is mounted
Requisite=zfs-import.target
After=zfs-import.target zfs-snapshot-bootfs.service
After=zfs-import.target dracut-pre-mount.service zfs-snapshot-bootfs.service
Before=dracut-mount.service
DefaultDependencies=no
ConditionKernelCommandLine=bootfs.rollback
[Service]
# ${BOOTFS} should have been set by zfs-env-bootfs.service
Type=oneshot
ExecStartPre=/bin/sh -c 'test -n "${BOOTFS}"'
ExecStart=/bin/sh -c '. /lib/dracut-lib.sh; SNAPNAME="$(getarg bootfs.rollback)"; exec @sbindir@/zfs rollback -Rf "${BOOTFS}@${SNAPNAME:-%v}"'
ExecStart=/bin/sh -c '. /lib/dracut-zfs-lib.sh; decode_root_args || exit; [ "$root" = "zfs:AUTO" ] && root="$BOOTFS"; SNAPNAME="$(getarg bootfs.rollback)"; exec @sbindir@/zfs rollback -Rf "$root@${SNAPNAME:-%v}"'
RemainAfterExit=yes
@@ -1,14 +1,12 @@
[Unit]
Description=Snapshot bootfs just before it is mounted
Requisite=zfs-import.target
After=zfs-import.target
After=zfs-import.target dracut-pre-mount.service
Before=dracut-mount.service
DefaultDependencies=no
ConditionKernelCommandLine=bootfs.snapshot
[Service]
# ${BOOTFS} should have been set by zfs-env-bootfs.service
Type=oneshot
ExecStartPre=/bin/sh -c 'test -n "${BOOTFS}"'
ExecStart=-/bin/sh -c '. /lib/dracut-lib.sh; SNAPNAME="$(getarg bootfs.snapshot)"; exec @sbindir@/zfs snapshot "${BOOTFS}@${SNAPNAME:-%v}"'
ExecStart=-/bin/sh -c '. /lib/dracut-zfs-lib.sh; decode_root_args || exit; [ "$root" = "zfs:AUTO" ] && root="$BOOTFS"; SNAPNAME="$(getarg bootfs.snapshot)"; exec @sbindir@/zfs snapshot "$root@${SNAPNAME:-%v}"'
RemainAfterExit=yes
+38 -213
View File
@@ -1,225 +1,50 @@
How to setup a zfs root filesystem using dracut
-----------------------------------------------
## Basic setup
1. Install `zfs-dracut`
2. Set `mountpoint=/` for your root dataset (for compatibility, `legacy` also works, but is not recommended for new installations):
```sh
zfs set mountpoint=/ pool/dataset
```
3. Either (a) set `bootfs=` on the pool to the dataset:
```sh
zpool set bootfs=pool/dataset pool
```
4. Or (b) append `root=zfs:pool/dataset` to your kernel cmdline.
5. Re-generate your initrd and update it in your boot bundle
1) Install the zfs-dracut package. This package adds a zfs dracut module
to the /usr/share/dracut/modules.d/ directory which allows dracut to
create an initramfs which is zfs aware.
Encrypted datasets have keys loaded automatically or prompted for.
2) Set the bootfs property for the bootable dataset in the pool. Then set
the dataset mountpoint property to '/'.
If the root dataset contains children with `mountpoint=`s of `/etc`, `/bin`, `/lib*`, or `/usr`, they're mounted too.
$ zpool set bootfs=pool/dataset pool
$ zfs set mountpoint=/ pool/dataset
For complete documentation, see `dracut.zfs(7)`.
Alternately, legacy mountpoints can be used by setting the 'root=' option
on the kernel line of your grub.conf/menu.lst configuration file. Then
set the dataset mountpoint property to 'legacy'.
## cmdline
1. `root=` | Root dataset is… |
---------------------------|----------------------------------------------------------|
*(empty)* | the first `bootfs=` after `zpool import -aN` |
`zfs:AUTO`, `zfs:`, `zfs` | *(as above, but overriding other autoselection methods)* |
`ZFS=pool/dataset` | `pool/dataset` |
`zfs:pool/dataset` | *(as above)* |
$ grub.conf/menu.lst: kernel ... root=ZFS=pool/dataset
$ zfs set mountpoint=legacy pool/dataset
All `+`es are replaced with spaces (i.e. to boot from `root pool/data set`, pass `root=zfs:root+pool/data+set`).
3) To set zfs module options put them in /etc/modprobe.d/zfs.conf file.
The complete list of zfs module options is available by running the
_modinfo zfs_ command. Commonly set options include: zfs_arc_min,
zfs_arc_max, zfs_prefetch_disable, and zfs_vdev_max_pending.
The dataset can be at any depth, including being the pool's root dataset (i.e. `root=zfs:pool`).
4) Finally, create your new initramfs by running dracut.
`rootfstype=zfs` is equivalent to `root=zfs:AUTO`, `rootfstype=zfs root=pool/dataset` is equivalent to `root=zfs:pool/dataset`.
$ dracut --force /path/to/initramfs kernel_version
2. `spl_hostid`: passed to `zgenhostid -f`, useful to override the `/etc/hostid` file baked into the initrd.
Kernel Command Line
-------------------
3. `bootfs.snapshot`, `bootfs.snapshot=snapshot-name`: enables `zfs-snapshot-bootfs.service`,
which creates a snapshot `$root_dataset@$(uname -r)` (or, in the second form, `$root_dataset@snapshot-name`)
after pool import but before the rootfs is mounted.
Failure to create the snapshot is noted, but booting continues.
The initramfs' behavior is influenced by the following kernel command line
parameters passed in from the boot loader:
4. `bootfs.rollback`, `bootfs.rollback=snapshot-name`: enables `zfs-snapshot-bootfs.service`,
which `-Rf` rolls back to `$root_dataset@$(uname -r)` (or, in the second form, `$root_dataset@snapshot-name`)
after pool import but before the rootfs is mounted.
Failure to roll back will fall down to the rescue shell.
This has obvious potential for data loss: make sure your persistent data is not below the rootfs and you don't care about any intermediate snapshots.
* `root=...`: If not set, importable pools are searched for a bootfs
attribute. If an explicitly set root is desired, you may use
`root=ZFS:pool/dataset`
5. If both `bootfs.snapshot` and `bootfs.rollback` are set, `bootfs.rollback` is ordered *after* `bootfs.snapshot`.
* `zfs_force=0`: If set to 1, the initramfs will run `zpool import -f` when
attempting to import pools if the required pool isn't automatically imported
by the zfs module. This can save you a trip to a bootcd if hostid has
changed, but is dangerous and can lead to zpool corruption, particularly in
cases where storage is on a shared fabric such as iSCSI where multiple hosts
can access storage devices concurrently. _Please understand the implications
of force-importing a pool before enabling this option!_
* `spl_hostid`: By default, the hostid used by the SPL module is read from
/etc/hostid inside the initramfs. This file is placed there from the host
system when the initramfs is built which effectively ties the ramdisk to the
host which builds it. If a different hostid is desired, one may be set in
this attribute and will override any file present in the ramdisk. The
format should be hex exactly as found in the `/etc/hostid` file, IE
`spl_hostid=0x00bab10c`.
Note that changing the hostid between boots will most likely lead to an
un-importable pool since the last importing hostid won't match. In order
to recover from this, you may use the `zfs_force` option or boot from a
different filesystem and `zpool import -f` then `zpool export` the pool
before rebooting with the new hostid.
* `bootfs.snapshot`: If listed, enables the zfs-snapshot-bootfs service on a Dracut system. The zfs-snapshot-bootfs service simply runs `zfs snapshot $BOOTFS@%v` after the pool has been imported but before the bootfs is mounted. `$BOOTFS` is substituted with the value of the bootfs setting on the pool. `%v` is substituted with the version string of the kernel currently being booted (e.g. 5.6.6-200.fc31.x86\_64). Failure to create the snapshot (e.g. because one with the same name already exists) will be logged, but will not otherwise interrupt the boot process.
It is safe to leave the bootfs.snapshot flag set persistently on your kernel command line so that a new snapshot of your bootfs will be created on every kernel update. If you leave bootfs.snapshot set persistently on your kernel command line, you may find the below script helpful for automatically removing old snapshots of the bootfs along with their associated kernel.
#!/usr/bin/sh
if [[ "$1" == "remove" ]] && grep -q "\bbootfs.snapshot\b" /proc/cmdline; then
zfs destroy $(findmnt -n -o source /)@$2 &> /dev/null
fi
exit 0
To use the above script place it in a plain text file named /etc/kernel/install.d/99-zfs-cleanup.install and mark it executable with the following command:
$ chmod +x /etc/kernel/install.d/99-zfs-cleanup.install
On Red Hat based systems, you can change the value of `installonly_limit` in /etc/dnf/dnf.conf to adjust the number of kernels and their associated snapshots that are kept.
* `bootfs.snapshot=<snapname>`: Is identical to the bootfs.snapshot parameter explained above except that the value substituted for \<snapname\> will be used when creating the snapshot instead of the version string of the kernel currently being booted.
* `bootfs.rollback`: If listed, enables the zfs-rollback-bootfs service on a Dracut system. The zfs-rollback-bootfs service simply runs `zfs rollback -Rf $BOOTFS@%v` after the pool has been imported but before the bootfs is mounted. If the rollback operation fails, the boot process will be interrupted with a Dracut rescue shell. __Use this parameter with caution. Intermediate snapshots of the bootfs will be destroyed!__ TIP: Keep your user data (e.g. /home) on separate file systems (it can be in the same pool though).
* `bootfs.rollback=<snapname>`: Is identical to the bootfs.rollback parameter explained above except that the value substituted for \<snapname\> will be used when rolling back the bootfs instead of the version string of the kernel currently being booted. If you use this form, choose a snapshot that is new enough to contain the needed kernel modules under /lib/modules or use a kernel that has all the needed modules built-in.
How it Works
============
The Dracut module consists of the following files (less Makefile's):
* `module-setup.sh`: Script run by the initramfs builder to create the
ramdisk. Contains instructions on which files are required by the modules
and z* programs. Also triggers inclusion of `/etc/hostid` and the zpool
cache. This file is not included in the initramfs.
* `90-zfs.rules`: udev rules which trigger loading of the ZFS modules at boot.
* `zfs-lib.sh`: Utility functions used by the other files.
* `parse-zfs.sh`: Run early in the initramfs boot process to parse kernel
command line and determine if ZFS is the active root filesystem.
* `mount-zfs.sh`: Run later in initramfs boot process after udev has settled
to mount the root dataset.
* `export-zfs.sh`: Run on shutdown after dracut has restored the initramfs
and pivoted to it, allowing for a clean unmount and export of the ZFS root.
`zfs-lib.sh`
------------
This file provides a few handy functions for working with ZFS. Those
functions are used by the `mount-zfs.sh` and `export-zfs.sh` files.
However, they could be used by any other file as well, as long as the file
sources `/lib/dracut-zfs-lib.sh`.
`module-setup.sh`
-----------------
This file is run by the Dracut script within the live system, not at boot
time. It's not included in the final initramfs. Functions in this script
describe which files are needed by ZFS at boot time.
Currently all the various z* and spl modules are included, a dependency is
asserted on udev-rules, and the various zfs, zpool, etc. helpers are included.
Dracut provides library functions which automatically gather the shared libs
necessary to run each of these binaries, so statically built binaries are
not required.
The zpool and zvol udev rules files are copied from where they are
installed by the ZFS build. __PACKAGERS TAKE NOTE__: If you move
`/etc/udev/rules/60-z*.rules`, you'll need to update this file to match.
Currently this file also includes `/etc/hostid` and `/etc/zfs/zpool.cache`
which means the generated ramdisk is specific to the host system which built
it. If a generic initramfs is required, it may be preferable to omit these
files and specify the `spl_hostid` from the boot loader instead.
`parse-zfs.sh`
--------------
Run during the cmdline phase of the initramfs boot process, this script
performs some basic sanity checks on kernel command line parameters to
determine if booting from ZFS is likely to be what is desired. Dracut
requires this script to adjust the `root` variable if required and to set
`rootok=1` if a mountable root filesystem is available. Unfortunately this
script must run before udev is settled and kernel modules are known to be
loaded, so accessing the zpool and zfs commands is unsafe.
If the root=ZFS... parameter is set on the command line, then it's at least
certain that ZFS is what is desired, though this script is unable to
determine if ZFS is in fact available. This script will alter the `root`
parameter to replace several historical forms of specifying the pool and
dataset name with the canonical form of `zfs:pool/dataset`.
If no root= parameter is set, the best this script can do is guess that
ZFS is desired. At present, no other known filesystems will work with no
root= parameter, though this might possibly interfere with using the
compiled-in default root in the kernel image. It's considered unlikely
that would ever be the case when an initramfs is in use, so this script
sets `root=zfs:AUTO` and hopes for the best.
Once the root=... (or lack thereof) parameter is parsed, a dummy symlink
is created from `/dev/root` -> `/dev/null` to satisfy parts of the Dracut
process which check for presence of a single root device node.
Finally, an initqueue/finished hook is registered which causes the initqueue
phase of Dracut to wait for `/dev/zfs` to become available before attempting
to mount anything.
`mount-zfs.sh`
--------------
This script is run after udev has settled and all tasks in the initqueue
have succeeded. This ensures that `/dev/zfs` is available and that the
various ZFS modules are successfully loaded. As it is now safe to call
zpool and friends, we can proceed to find the bootfs attribute if necessary.
If the root parameter was explicitly set on the command line, no parsing is
necessary. The list of imported pools is checked to see if the desired pool
is already imported. If it's not, and attempt is made to import the pool
explicitly, though no force is attempted. Finally the specified dataset
is mounted on `$NEWROOT`, first using the `-o zfsutil` option to handle
non-legacy mounts, then if that fails, without zfsutil to handle legacy
mount points.
If no root parameter was specified, this script attempts to find a pool with
its bootfs attribute set. First, already-imported pools are scanned and if
an appropriate pool is found, no additional pools are imported. If no pool
with bootfs is found, any additional pools in the system are imported with
`zpool import -N -a`, and the scan for bootfs is tried again. If no bootfs
is found with all pools imported, all pools are re-exported, and boot fails.
Assuming a bootfs is found, an attempt is made to mount it to `$NEWROOT`,
first with, then without the zfsutil option as above.
Ordinarily pools are imported _without_ the force option which may cause
boot to fail if the hostid has changed or a pool has been physically moved
between servers. The `zfs_force` kernel parameter is provided which when
set to `1` causes `zpool import` to be run with the `-f` flag. Forcing pool
import can lead to serious data corruption and loss of pools, so this option
should be used with extreme caution. Note that even with this flag set, if
the required zpool was auto-imported by the kernel module, no additional
`zpool import` commands are run, so nothing is forced.
`export-zfs.sh`
---------------
Normally the zpool containing the root dataset cannot be exported on
shutdown as it is still in use by the init process. To work around this,
Dracut is able to restore the initramfs on shutdown and pivot to it.
All remaining process are then running from a ramdisk, allowing for a
clean unmount and export of the ZFS root. The theory of operation is
described in detail in the [Dracut manual](https://www.kernel.org/pub/linux/utils/boot/dracut/dracut.html#_dracut_on_shutdown).
This script will try to export all remaining zpools after Dracut has
pivoted to the initramfs. If an initial regular export is not successful,
Dracut will call this script once more with the `final` option,
in which case a forceful export is attempted.
Other Dracut modules include similar shutdown scripts and Dracut
invokes these scripts round-robin until they succeed. In particular,
the `90dm` module installs a script which tries to close and remove
all device mapper targets. Thus, if there are ZVOLs containing
dm-crypt volumes or if the zpool itself is backed by a dm-crypt
volume, the shutdown scripts will try to untangle this.
6. `zfs_force`, `zfs.force`, `zfsforce`: add `-f` to all `zpool import` invocations.
May be useful. Use with caution.
+12 -16
View File
@@ -326,30 +326,26 @@ mount_fs()
# Need the _original_ datasets mountpoint!
mountpoint=$(get_fs_value "$fs" mountpoint)
ZFS_CMD="mount -o zfsutil -t zfs"
ZFS_CMD="mount.zfs -o zfsutil"
if [ "$mountpoint" = "legacy" ] || [ "$mountpoint" = "none" ]; then
# Can't use the mountpoint property. Might be one of our
# clones. Check the 'org.zol:mountpoint' property set in
# clone_snap() if that's usable.
mountpoint=$(get_fs_value "$fs" org.zol:mountpoint)
if [ "$mountpoint" = "legacy" ] ||
[ "$mountpoint" = "none" ] ||
[ "$mountpoint" = "-" ]
mountpoint1=$(get_fs_value "$fs" org.zol:mountpoint)
if [ "$mountpoint1" = "legacy" ] ||
[ "$mountpoint1" = "none" ] ||
[ "$mountpoint1" = "-" ]
then
if [ "$fs" != "${ZFS_BOOTFS}" ]; then
# We don't have a proper mountpoint and this
# isn't the root fs.
return 0
else
# Last hail-mary: Hope 'rootmnt' is set!
mountpoint=""
fi
fi
# If it's not a legacy filesystem, it can only be a
# native one...
if [ "$mountpoint" = "legacy" ]; then
ZFS_CMD="mount -t zfs"
ZFS_CMD="mount.zfs"
# Last hail-mary: Hope 'rootmnt' is set!
mountpoint=""
else
mountpoint="$mountpoint1"
fi
fi
@@ -503,7 +499,7 @@ clone_snap()
echo "Error: $ZFS_ERROR"
echo ""
echo "Failed to clone snapshot."
echo "Make sure that the any problems are corrected and then make sure"
echo "Make sure that any problems are corrected and then make sure"
echo "that the dataset '$destfs' exists and is bootable."
shell
else
@@ -915,7 +911,7 @@ mountroot()
echo " not specified on the kernel command line."
echo ""
echo "Manually mount the root filesystem on $rootmnt and then exit."
echo "Hint: Try: mount -o zfsutil -t zfs ${ZFS_RPOOL-rpool}/ROOT/system $rootmnt"
echo "Hint: Try: mount.zfs -o zfsutil ${ZFS_RPOOL-rpool}/ROOT/system $rootmnt"
shell
fi
+11 -3
View File
@@ -496,7 +496,6 @@ zfs_key_config_get_dataset(zfs_key_config_t *config)
if (zhp == NULL) {
pam_syslog(NULL, LOG_ERR, "dataset %s not found",
config->homes_prefix);
zfs_close(zhp);
return (NULL);
}
@@ -508,6 +507,10 @@ zfs_key_config_get_dataset(zfs_key_config_t *config)
return (dsname);
}
if (config->homes_prefix == NULL) {
return (NULL);
}
size_t len = ZFS_MAX_DATASET_NAME_LEN;
size_t total_len = strlen(config->homes_prefix) + 1
+ strlen(config->username);
@@ -711,7 +714,10 @@ pam_sm_open_session(pam_handle_t *pamh, int flags,
return (PAM_SUCCESS);
}
zfs_key_config_t config;
zfs_key_config_load(pamh, &config, argc, argv);
if (zfs_key_config_load(pamh, &config, argc, argv) != 0) {
return (PAM_SESSION_ERR);
}
if (config.uid < 1000) {
zfs_key_config_free(&config);
return (PAM_SUCCESS);
@@ -765,7 +771,9 @@ pam_sm_close_session(pam_handle_t *pamh, int flags,
return (PAM_SUCCESS);
}
zfs_key_config_t config;
zfs_key_config_load(pamh, &config, argc, argv);
if (zfs_key_config_load(pamh, &config, argc, argv) != 0) {
return (PAM_SESSION_ERR);
}
if (config.uid < 1000) {
zfs_key_config_free(&config);
return (PAM_SUCCESS);
+1 -1
View File
@@ -104,7 +104,7 @@ zfs_errno = enum_with_offset(1024, [
)
# compat before we used the enum helper for these values
ZFS_ERR_CHECKPOINT_EXISTS = zfs_errno.ZFS_ERR_CHECKPOINT_EXISTS
assert(ZFS_ERR_CHECKPOINT_EXISTS == 1024)
assert (ZFS_ERR_CHECKPOINT_EXISTS == 1024)
ZFS_ERR_DISCARDING_CHECKPOINT = zfs_errno.ZFS_ERR_DISCARDING_CHECKPOINT
ZFS_ERR_NO_CHECKPOINT = zfs_errno.ZFS_ERR_NO_CHECKPOINT
ZFS_ERR_DEVRM_IN_PROGRESS = zfs_errno.ZFS_ERR_DEVRM_IN_PROGRESS
+3 -2
View File
@@ -1,9 +1,10 @@
include $(top_srcdir)/config/Substfiles.am
include $(top_srcdir)/config/Shellcheck.am
initconf_SCRIPTS = zfs
initconf_DATA = zfs
SUBSTFILES += $(initconf_SCRIPTS)
SUBSTFILES += $(initconf_DATA)
SHELLCHECKSCRIPTS = $(initconf_DATA)
SHELLCHECK_SHELL = sh
SHELLCHECK_IGNORE = ,SC2034
@@ -27,9 +27,6 @@
#include <sys/types.h>
#include <sys/time.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <sys/mman.h>
#include <semaphore.h>
#include <stdbool.h>
#include <unistd.h>
#include <fcntl.h>
@@ -44,25 +41,16 @@
#include <errno.h>
#include <libzfs.h>
/*
* For debugging only.
*
* Free statics with trivial life-times,
* but saved line filenames are replaced with a static string.
*/
#define FREE_STATICS false
#define nitems(arr) (sizeof (arr) / sizeof (*arr))
#define STRCMP ((int(*)(const void *, const void *))&strcmp)
#define PID_T_CMP ((int(*)(const void *, const void *))&pid_t_cmp)
static int
pid_t_cmp(const pid_t *lhs, const pid_t *rhs)
{
/*
* This is always valid, quoth sys_types.h(7posix):
* > blksize_t, pid_t, and ssize_t shall be signed integer types.
*/
return (*lhs - *rhs);
}
#define EXIT_ENOMEM() \
do { \
fprintf(stderr, PROGNAME "[%d]: " \
"not enough memory (L%d)!\n", getpid(), __LINE__); \
_exit(1); \
} while (0)
#define PROGNAME "zfs-mount-generator"
@@ -80,20 +68,11 @@ pid_t_cmp(const pid_t *lhs, const pid_t *rhs)
#define URI_REGEX_S "^\\([A-Za-z][A-Za-z0-9+.\\-]*\\):\\/\\/\\(.*\\)$"
static regex_t uri_regex;
static char *argv0;
static const char *destdir = "/tmp";
static int destdir_fd = -1;
static void *known_pools = NULL; /* tsearch() of C strings */
static struct {
sem_t noauto_not_on_sem;
sem_t noauto_names_sem;
size_t noauto_names_len;
size_t noauto_names_max;
char noauto_names[][NAME_MAX];
} *noauto_files;
static void *noauto_files = NULL; /* tsearch() of C strings */
static char *
@@ -103,8 +82,12 @@ systemd_escape(const char *input, const char *prepend, const char *append)
size_t applen = strlen(append);
size_t prelen = strlen(prepend);
char *ret = malloc(4 * len + prelen + applen + 1);
if (!ret)
EXIT_ENOMEM();
if (!ret) {
fprintf(stderr, PROGNAME "[%d]: "
"out of memory to escape \"%s%s%s\"!\n",
getpid(), prepend, input, append);
return (NULL);
}
memcpy(ret, prepend, prelen);
char *out = ret + prelen;
@@ -166,8 +149,12 @@ systemd_escape_path(char *input, const char *prepend, const char *append)
{
if (strcmp(input, "/") == 0) {
char *ret;
if (asprintf(&ret, "%s-%s", prepend, append) == -1)
EXIT_ENOMEM();
if (asprintf(&ret, "%s-%s", prepend, append) == -1) {
fprintf(stderr, PROGNAME "[%d]: "
"out of memory to escape \"%s%s%s\"!\n",
getpid(), prepend, input, append);
ret = NULL;
}
return (ret);
} else {
/*
@@ -209,6 +196,10 @@ fopenat(int dirfd, const char *pathname, int flags,
static int
line_worker(char *line, const char *cachefile)
{
int ret = 0;
void *tofree_all[8];
void **tofree = tofree_all;
char *toktmp;
/* BEGIN CSTYLED */
const char *dataset = strtok_r(line, "\t", &toktmp);
@@ -240,11 +231,9 @@ line_worker(char *line, const char *cachefile)
if (p_nbmand == NULL) {
fprintf(stderr, PROGNAME "[%d]: %s: not enough tokens!\n",
getpid(), dataset);
return (1);
goto err;
}
strncpy(argv0, dataset, strlen(argv0));
/* Minimal pre-requisites to mount a ZFS dataset */
const char *after = "zfs-import.target";
const char *wants = "zfs-import.target";
@@ -280,28 +269,31 @@ line_worker(char *line, const char *cachefile)
if (strcmp(p_encroot, "-") != 0) {
char *keyloadunit =
char *keyloadunit = *(tofree++) =
systemd_escape(p_encroot, "zfs-load-key@", ".service");
if (keyloadunit == NULL)
goto err;
if (strcmp(dataset, p_encroot) == 0) {
const char *keymountdep = NULL;
bool is_prompt = false;
bool need_network = false;
regmatch_t uri_matches[3];
if (regexec(&uri_regex, p_keyloc,
sizeof (uri_matches) / sizeof (*uri_matches),
uri_matches, 0) == 0) {
nitems(uri_matches), uri_matches, 0) == 0) {
p_keyloc[uri_matches[1].rm_eo] = '\0';
p_keyloc[uri_matches[2].rm_eo] = '\0';
const char *scheme =
&p_keyloc[uri_matches[1].rm_so];
const char *path =
&p_keyloc[uri_matches[2].rm_so];
/*
* Assumes all URI keylocations need
* the mount for their path;
* http://, for example, wouldn't
* (but it'd need network-online.target et al.)
*/
keymountdep = path;
if (strcmp(scheme, "https") == 0 ||
strcmp(scheme, "http") == 0)
need_network = true;
else
keymountdep = path;
} else {
if (strcmp(p_keyloc, "prompt") != 0)
fprintf(stderr, PROGNAME "[%d]: %s: "
@@ -321,7 +313,7 @@ line_worker(char *line, const char *cachefile)
"couldn't open %s under %s: %s\n",
getpid(), dataset, keyloadunit, destdir,
strerror(errno));
return (1);
goto err;
}
fprintf(keyloadunit_f,
@@ -335,20 +327,22 @@ line_worker(char *line, const char *cachefile)
"After=%s\n",
dataset, cachefile, wants, after);
if (need_network)
fprintf(keyloadunit_f,
"Wants=network-online.target\n"
"After=network-online.target\n");
if (p_systemd_requires)
fprintf(keyloadunit_f,
"Requires=%s\n", p_systemd_requires);
if (p_systemd_requiresmountsfor || keymountdep) {
fprintf(keyloadunit_f, "RequiresMountsFor=");
if (p_systemd_requiresmountsfor)
fprintf(keyloadunit_f,
"%s ", p_systemd_requiresmountsfor);
if (keymountdep)
fprintf(keyloadunit_f,
"'%s'", keymountdep);
fprintf(keyloadunit_f, "\n");
}
if (p_systemd_requiresmountsfor)
fprintf(keyloadunit_f,
"RequiresMountsFor=%s\n",
p_systemd_requiresmountsfor);
if (keymountdep)
fprintf(keyloadunit_f,
"RequiresMountsFor='%s'\n", keymountdep);
/* BEGIN CSTYLED */
fprintf(keyloadunit_f,
@@ -393,9 +387,13 @@ line_worker(char *line, const char *cachefile)
if (after[0] == '\0')
after = keyloadunit;
else if (asprintf(&toktmp, "%s %s", after, keyloadunit) != -1)
after = toktmp;
else
EXIT_ENOMEM();
after = *(tofree++) = toktmp;
else {
fprintf(stderr, PROGNAME "[%d]: %s: "
"out of memory to generate after=\"%s %s\"!\n",
getpid(), dataset, after, keyloadunit);
goto err;
}
}
@@ -404,12 +402,12 @@ line_worker(char *line, const char *cachefile)
strcmp(p_systemd_ignore, "off") == 0) {
/* ok */
} else if (strcmp(p_systemd_ignore, "on") == 0)
return (0);
goto end;
else {
fprintf(stderr, PROGNAME "[%d]: %s: "
"invalid org.openzfs.systemd:ignore=%s\n",
getpid(), dataset, p_systemd_ignore);
return (1);
goto err;
}
/* Check for canmount */
@@ -418,21 +416,21 @@ line_worker(char *line, const char *cachefile)
} else if (strcmp(p_canmount, "noauto") == 0)
noauto = true;
else if (strcmp(p_canmount, "off") == 0)
return (0);
goto end;
else {
fprintf(stderr, PROGNAME "[%d]: %s: invalid canmount=%s\n",
getpid(), dataset, p_canmount);
return (1);
goto err;
}
/* Check for legacy and blank mountpoints */
if (strcmp(p_mountpoint, "legacy") == 0 ||
strcmp(p_mountpoint, "none") == 0)
return (0);
goto end;
else if (p_mountpoint[0] != '/') {
fprintf(stderr, PROGNAME "[%d]: %s: invalid mountpoint=%s\n",
getpid(), dataset, p_mountpoint);
return (1);
goto err;
}
/* Escape the mountpoint per systemd policy */
@@ -442,7 +440,7 @@ line_worker(char *line, const char *cachefile)
fprintf(stderr,
PROGNAME "[%d]: %s: abnormal simplified mountpoint: %s\n",
getpid(), dataset, p_mountpoint);
return (1);
goto err;
}
@@ -552,8 +550,7 @@ line_worker(char *line, const char *cachefile)
* files if we're sure they were created by us. (see 5.)
* 2. We handle files differently based on canmount.
* Units with canmount=on always have precedence over noauto.
* This is enforced by the noauto_not_on_sem semaphore,
* which is only unlocked when the last canmount=on process exits.
* This is enforced by processing these units before all others.
* It is important to use p_canmount and not noauto here,
* since we categorise by canmount while other properties,
* e.g. org.openzfs.systemd:wanted-by, also modify noauto.
@@ -561,7 +558,7 @@ line_worker(char *line, const char *cachefile)
* Additionally, we use noauto_files to track the unit file names
* (which are the systemd-escaped mountpoints) of all (exclusively)
* noauto datasets that had a file created.
* 4. If the file to be created is found in the tracking array,
* 4. If the file to be created is found in the tracking tree,
* we do NOT create it.
* 5. If a file exists for a noauto dataset,
* we check whether the file name is in the array.
@@ -571,29 +568,14 @@ line_worker(char *line, const char *cachefile)
* further noauto datasets creating a file for this path again.
*/
{
sem_t *our_sem = (strcmp(p_canmount, "on") == 0) ?
&noauto_files->noauto_names_sem :
&noauto_files->noauto_not_on_sem;
while (sem_wait(our_sem) == -1 && errno == EINTR)
;
}
struct stat stbuf;
bool already_exists = fstatat(destdir_fd, mountfile, &stbuf, 0) == 0;
bool is_known = tfind(mountfile, &noauto_files, STRCMP) != NULL;
bool is_known = false;
for (size_t i = 0; i < noauto_files->noauto_names_len; ++i) {
if (strncmp(
noauto_files->noauto_names[i], mountfile, NAME_MAX) == 0) {
is_known = true;
break;
}
}
*(tofree++) = (void *)mountfile;
if (already_exists) {
if (is_known) {
/* If it's in $noauto_files, we must be noauto too */
/* If it's in noauto_files, we must be noauto too */
/* See 5 */
errno = 0;
@@ -614,43 +596,31 @@ line_worker(char *line, const char *cachefile)
}
/* File exists: skip current dataset */
if (strcmp(p_canmount, "on") == 0)
sem_post(&noauto_files->noauto_names_sem);
return (0);
goto end;
} else {
if (is_known) {
/* See 4 */
if (strcmp(p_canmount, "on") == 0)
sem_post(&noauto_files->noauto_names_sem);
return (0);
goto end;
} else if (strcmp(p_canmount, "noauto") == 0) {
if (noauto_files->noauto_names_len ==
noauto_files->noauto_names_max)
if (tsearch(mountfile, &noauto_files, STRCMP) == NULL)
fprintf(stderr, PROGNAME "[%d]: %s: "
"noauto dataset limit (%zu) reached! "
"Not tracking %s. Please report this to "
"https://github.com/openzfs/zfs\n",
getpid(), dataset,
noauto_files->noauto_names_max, mountfile);
else {
strncpy(noauto_files->noauto_names[
noauto_files->noauto_names_len],
mountfile, NAME_MAX);
++noauto_files->noauto_names_len;
}
"out of memory for noauto datasets! "
"Not tracking %s.\n",
getpid(), dataset, mountfile);
else
/* mountfile escaped to noauto_files */
*(--tofree) = NULL;
}
}
FILE *mountfile_f = fopenat(destdir_fd, mountfile,
O_WRONLY | O_CREAT | O_TRUNC | O_CLOEXEC, "w", 0644);
if (strcmp(p_canmount, "on") == 0)
sem_post(&noauto_files->noauto_names_sem);
if (!mountfile_f) {
fprintf(stderr,
PROGNAME "[%d]: %s: couldn't open %s under %s: %s\n",
getpid(), dataset, mountfile, destdir, strerror(errno));
return (1);
goto err;
}
fprintf(mountfile_f,
@@ -699,12 +669,17 @@ line_worker(char *line, const char *cachefile)
(void) fclose(mountfile_f);
if (!requiredby && !wantedby)
return (0);
goto end;
/* Finally, create the appropriate dependencies */
char *linktgt;
if (asprintf(&linktgt, "../%s", mountfile) == -1)
EXIT_ENOMEM();
if (asprintf(&linktgt, "../%s", mountfile) == -1) {
fprintf(stderr, PROGNAME "[%d]: %s: "
"out of memory for dependents of %s!\n",
getpid(), dataset, mountfile);
goto err;
}
*(tofree++) = linktgt;
char *dependencies[][2] = {
{"wants", wantedby},
@@ -719,8 +694,14 @@ line_worker(char *line, const char *cachefile)
reqby;
reqby = strtok_r(NULL, " ", &toktmp)) {
char *depdir;
if (asprintf(&depdir, "%s.%s", reqby, (*dep)[0]) == -1)
EXIT_ENOMEM();
if (asprintf(
&depdir, "%s.%s", reqby, (*dep)[0]) == -1) {
fprintf(stderr, PROGNAME "[%d]: %s: "
"out of memory for dependent dir name "
"\"%s.%s\"!\n",
getpid(), dataset, reqby, (*dep)[0]);
continue;
}
(void) mkdirat(destdir_fd, depdir, 0755);
int depdir_fd = openat(destdir_fd, depdir,
@@ -746,7 +727,24 @@ line_worker(char *line, const char *cachefile)
}
}
return (0);
end:
if (tofree >= tofree_all + nitems(tofree_all)) {
/*
* This won't happen as-is:
* we've got 8 slots and allocate 4 things at most.
*/
fprintf(stderr,
PROGNAME "[%d]: %s: need to free %zu > %zu!\n",
getpid(), dataset, tofree - tofree_all, nitems(tofree_all));
ret = tofree - tofree_all;
}
while (tofree-- != tofree_all)
free(*tofree);
return (ret);
err:
ret = 1;
goto end;
}
@@ -780,12 +778,11 @@ main(int argc, char **argv)
if (kmfd >= 0) {
(void) dup2(kmfd, STDERR_FILENO);
(void) close(kmfd);
setlinebuf(stderr);
}
}
uint8_t debug = 0;
argv0 = argv[0];
switch (argc) {
case 1:
/* Use default */
@@ -844,33 +841,9 @@ main(int argc, char **argv)
}
}
{
/*
* We could just get a gigabyte here and Not Care,
* but if vm.overcommit_memory=2, then MAP_NORESERVE is ignored
* and we'd try (and likely fail) to rip it out of swap
*/
noauto_files = mmap(NULL, 4 * 1024 * 1024,
PROT_READ | PROT_WRITE,
MAP_SHARED | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
if (noauto_files == MAP_FAILED) {
fprintf(stderr,
PROGNAME "[%d]: couldn't allocate IPC region: %s\n",
getpid(), strerror(errno));
_exit(1);
}
sem_init(&noauto_files->noauto_not_on_sem, true, 0);
sem_init(&noauto_files->noauto_names_sem, true, 1);
noauto_files->noauto_names_len = 0;
/* Works out to 16447ish, *well* enough */
noauto_files->noauto_names_max =
(4 * 1024 * 1024 - sizeof (*noauto_files)) / NAME_MAX;
}
bool debug = false;
char *line = NULL;
size_t linelen = 0;
struct timespec time_start = {};
{
const char *dbgenv = getenv("ZFS_DEBUG");
if (dbgenv)
@@ -879,7 +852,7 @@ main(int argc, char **argv)
FILE *cmdline = fopen("/proc/cmdline", "re");
if (cmdline != NULL) {
if (getline(&line, &linelen, cmdline) >= 0)
debug = strstr(line, "debug") ? 2 : 0;
debug = strstr(line, "debug");
(void) fclose(cmdline);
}
}
@@ -888,19 +861,17 @@ main(int argc, char **argv)
dup2(STDERR_FILENO, STDOUT_FILENO);
}
size_t forked_canmount_on = 0;
size_t forked_canmount_not_on = 0;
size_t canmount_on_pids_len = 128;
pid_t *canmount_on_pids =
malloc(canmount_on_pids_len * sizeof (*canmount_on_pids));
if (canmount_on_pids == NULL)
canmount_on_pids_len = 0;
struct timespec time_start = {};
if (debug)
clock_gettime(CLOCK_MONOTONIC_RAW, &time_start);
ssize_t read;
pid_t pid;
struct line {
char *line;
const char *fname;
struct line *next;
} *lines_canmount_not_on = NULL;
int ret = 0;
struct dirent *cachent;
while ((cachent = readdir(fslist_dir)) != NULL) {
if (strcmp(cachent->d_name, ".") == 0 ||
@@ -916,129 +887,67 @@ main(int argc, char **argv)
continue;
}
const char *filename = FREE_STATICS ? "(elided)" : NULL;
ssize_t read;
while ((read = getline(&line, &linelen, cachefile)) >= 0) {
line[read - 1] = '\0'; /* newline */
switch (pid = fork()) {
case -1:
fprintf(stderr,
PROGNAME "[%d]: couldn't fork for %s: %s\n",
getpid(), line, strerror(errno));
break;
case 0: /* child */
_exit(line_worker(line, cachent->d_name));
default: { /* parent */
char *tmp;
char *dset = strtok_r(line, "\t", &tmp);
strtok_r(NULL, "\t", &tmp);
char *canmount = strtok_r(NULL, "\t", &tmp);
bool canmount_on =
canmount && strncmp(canmount, "on", 2) == 0;
char *canmount = line;
canmount += strcspn(canmount, "\t");
canmount += strspn(canmount, "\t");
canmount += strcspn(canmount, "\t");
canmount += strspn(canmount, "\t");
bool canmount_on = strncmp(canmount, "on", 2) == 0;
if (debug >= 2)
printf(PROGNAME ": forked %d, "
"canmount_on=%d, dataset=%s\n",
(int)pid, canmount_on, dset);
if (canmount_on)
ret |= line_worker(line, cachent->d_name);
else {
if (filename == NULL)
filename =
strdup(cachent->d_name) ?: "(?)";
if (canmount_on &&
forked_canmount_on ==
canmount_on_pids_len) {
size_t new_len =
(canmount_on_pids_len ?: 16) * 2;
void *new_pidlist =
realloc(canmount_on_pids,
new_len *
sizeof (*canmount_on_pids));
if (!new_pidlist) {
fprintf(stderr,
PROGNAME "[%d]: "
"out of memory! "
"Mount ordering may be "
"affected.\n", getpid());
continue;
}
canmount_on_pids = new_pidlist;
canmount_on_pids_len = new_len;
struct line *l = calloc(1, sizeof (*l));
char *nl = strdup(line);
if (l == NULL || nl == NULL) {
fprintf(stderr, PROGNAME "[%d]: "
"out of memory for \"%s\" in %s\n",
getpid(), line, cachent->d_name);
free(l);
free(nl);
continue;
}
if (canmount_on) {
canmount_on_pids[forked_canmount_on] =
pid;
++forked_canmount_on;
} else
++forked_canmount_not_on;
break;
}
l->line = nl;
l->fname = filename;
l->next = lines_canmount_not_on;
lines_canmount_not_on = l;
}
}
(void) fclose(cachefile);
fclose(cachefile);
}
free(line);
if (forked_canmount_on == 0) {
/* No canmount=on processes to finish, so don't deadlock here */
for (size_t i = 0; i < forked_canmount_not_on; ++i)
sem_post(&noauto_files->noauto_not_on_sem);
} else {
/* Likely a no-op, since we got these from a narrow fork loop */
qsort(canmount_on_pids, forked_canmount_on,
sizeof (*canmount_on_pids), PID_T_CMP);
}
while (lines_canmount_not_on) {
struct line *l = lines_canmount_not_on;
lines_canmount_not_on = l->next;
int status, ret = 0;
struct rusage usage;
size_t forked_canmount_on_max = forked_canmount_on;
while ((pid = wait4(-1, &status, 0, &usage)) != -1) {
ret |= WEXITSTATUS(status) | WTERMSIG(status);
if (forked_canmount_on != 0) {
if (bsearch(&pid, canmount_on_pids,
forked_canmount_on_max, sizeof (*canmount_on_pids),
PID_T_CMP))
--forked_canmount_on;
if (forked_canmount_on == 0) {
/*
* All canmount=on processes have finished,
* let all the lower-priority ones finish now
*/
for (size_t i = 0;
i < forked_canmount_not_on; ++i)
sem_post(
&noauto_files->noauto_not_on_sem);
}
ret |= line_worker(l->line, l->fname);
if (FREE_STATICS) {
free(l->line);
free(l);
}
if (debug >= 2)
printf(PROGNAME ": %d done, user=%llu.%06us, "
"system=%llu.%06us, maxrss=%ldB, ex=0x%x\n",
(int)pid,
(unsigned long long) usage.ru_utime.tv_sec,
(unsigned int) usage.ru_utime.tv_usec,
(unsigned long long) usage.ru_stime.tv_sec,
(unsigned int) usage.ru_stime.tv_usec,
usage.ru_maxrss * 1024, status);
}
if (debug) {
struct timespec time_end = {};
clock_gettime(CLOCK_MONOTONIC_RAW, &time_end);
struct rusage usage;
getrusage(RUSAGE_SELF, &usage);
printf(
"\n"
PROGNAME ": self : "
"user=%llu.%06us, system=%llu.%06us, maxrss=%ldB\n",
(unsigned long long) usage.ru_utime.tv_sec,
(unsigned int) usage.ru_utime.tv_usec,
(unsigned long long) usage.ru_stime.tv_sec,
(unsigned int) usage.ru_stime.tv_usec,
usage.ru_maxrss * 1024);
getrusage(RUSAGE_CHILDREN, &usage);
printf(PROGNAME ": children: "
PROGNAME ": "
"user=%llu.%06us, system=%llu.%06us, maxrss=%ldB\n",
(unsigned long long) usage.ru_utime.tv_sec,
(unsigned int) usage.ru_utime.tv_usec,
@@ -1068,7 +977,7 @@ main(int argc, char **argv)
time_init.tv_nsec / 1000000000;
time_init.tv_nsec %= 1000000000;
printf(PROGNAME ": wall : "
printf(PROGNAME ": "
"total=%llu.%09llus = "
"init=%llu.%09llus + real=%llu.%09llus\n",
(unsigned long long) time_init.tv_sec,
@@ -1077,7 +986,15 @@ main(int argc, char **argv)
(unsigned long long) time_start.tv_nsec,
(unsigned long long) time_end.tv_sec,
(unsigned long long) time_end.tv_nsec);
fflush(stdout);
}
if (FREE_STATICS) {
closedir(fslist_dir);
tdestroy(noauto_files, free);
tdestroy(known_pools, free);
regfree(&uri_regex);
}
_exit(ret);
}
+1
View File
@@ -22,3 +22,4 @@ SUBSTFILES += $(systemdpreset_DATA) $(systemdunit_DATA)
install-data-hook:
$(MKDIR_P) "$(DESTDIR)$(systemdunitdir)"
ln -sf /dev/null "$(DESTDIR)$(systemdunitdir)/zfs-import.service"
ln -sf /dev/null "$(DESTDIR)$(systemdunitdir)/zfs-load-key.service"
@@ -5,7 +5,7 @@ DefaultDependencies=no
Requires=systemd-udev-settle.service
After=systemd-udev-settle.service
After=cryptsetup.target
After=multipathd.target
After=multipathd.service
After=systemd-remount-fs.service
Before=zfs-import.target
ConditionFileNotEmpty=@sysconfdir@/zfs/zpool.cache
@@ -5,7 +5,7 @@ DefaultDependencies=no
Requires=systemd-udev-settle.service
After=systemd-udev-settle.service
After=cryptsetup.target
After=multipathd.target
After=multipathd.service
Before=zfs-import.target
ConditionFileNotEmpty=!@sysconfdir@/zfs/zpool.cache
ConditionPathIsDirectory=/sys/module/zfs
+1 -1
View File
@@ -5,7 +5,7 @@ ConditionPathIsDirectory=/sys/module/zfs
[Service]
ExecStart=@sbindir@/zed -F
Restart=on-abort
Restart=always
[Install]
Alias=zed.service
+3 -2
View File
@@ -10,9 +10,10 @@ dist_pkgsysconf_DATA = \
vdev_id.conf.multipath.example \
vdev_id.conf.scsi.example
pkgsysconf_SCRIPTS = \
pkgsysconf_DATA = \
zfs-functions
SUBSTFILES += $(pkgsysconf_SCRIPTS)
SUBSTFILES += $(pkgsysconf_DATA)
SHELLCHECKSCRIPTS = $(pkgsysconf_DATA)
SHELLCHECK_SHELL = dash # local variables
+10 -8
View File
@@ -150,6 +150,7 @@ typedef enum zfs_error {
EZFS_NO_RESILVER_DEFER, /* pool doesn't support resilver_defer */
EZFS_EXPORT_IN_PROGRESS, /* currently exporting the pool */
EZFS_REBUILDING, /* resilvering (sequential reconstrution) */
EZFS_CKSUM, /* insufficient replicas */
EZFS_UNKNOWN
} zfs_error_t;
@@ -257,10 +258,10 @@ extern int zpool_add(zpool_handle_t *, nvlist_t *);
typedef struct splitflags {
/* do not split, but return the config that would be split off */
int dryrun : 1;
unsigned int dryrun : 1;
/* after splitting, import the pool */
int import : 1;
unsigned int import : 1;
int name_flags;
} splitflags_t;
@@ -649,13 +650,13 @@ extern int zfs_rollback(zfs_handle_t *, zfs_handle_t *, boolean_t);
typedef struct renameflags {
/* recursive rename */
int recursive : 1;
unsigned int recursive : 1;
/* don't unmount file systems */
int nounmount : 1;
unsigned int nounmount : 1;
/* force unmount file systems */
int forceunmount : 1;
unsigned int forceunmount : 1;
} renameflags_t;
extern int zfs_rename(zfs_handle_t *, const char *, renameflags_t);
@@ -795,9 +796,10 @@ extern int zfs_receive(libzfs_handle_t *, const char *, nvlist_t *,
recvflags_t *, int, avl_tree_t *);
typedef enum diff_flags {
ZFS_DIFF_PARSEABLE = 0x1,
ZFS_DIFF_TIMESTAMP = 0x2,
ZFS_DIFF_CLASSIFY = 0x4
ZFS_DIFF_PARSEABLE = 1 << 0,
ZFS_DIFF_TIMESTAMP = 1 << 1,
ZFS_DIFF_CLASSIFY = 1 << 2,
ZFS_DIFF_NO_MANGLE = 1 << 3
} diff_flags_t;
extern int zfs_show_diffs(zfs_handle_t *, int, const char *, const char *,
+1
View File
@@ -234,6 +234,7 @@ typedef struct differ_info {
boolean_t scripted;
boolean_t classify;
boolean_t timestamped;
boolean_t no_mangle;
uint64_t shares;
int zerr;
int cleanupfd;
+4 -2
View File
@@ -151,13 +151,15 @@ int zfs_ioctl_fd(int fd, unsigned long request, struct zfs_cmd *zc);
* List of colors to use
*/
#define ANSI_RED "\033[0;31m"
#define ANSI_GREEN "\033[0;32m"
#define ANSI_YELLOW "\033[0;33m"
#define ANSI_BLUE "\033[0;34m"
#define ANSI_RESET "\033[0m"
#define ANSI_BOLD "\033[1m"
void color_start(char *color);
void color_start(const char *color);
void color_end(void);
int printf_color(char *color, char *format, ...);
int printf_color(const char *color, char *format, ...);
/*
* These functions are used by the ZFS libraries and cmd/zpool code, but are
-1
View File
@@ -83,7 +83,6 @@
#define __printf(a, b) __printflike(a, b)
#define barrier() __asm__ __volatile__("": : :"memory")
#define smp_rmb() rmb()
#define ___PASTE(a, b) a##b
#define __PASTE(a, b) ___PASTE(a, b)
+2 -1
View File
@@ -57,7 +57,8 @@ extern uint64_t atomic_cas_64(volatile uint64_t *target, uint64_t cmp,
uint64_t newval);
#endif
#define membar_producer atomic_thread_fence_rel
#define membar_consumer() atomic_thread_fence_acq()
#define membar_producer() atomic_thread_fence_rel()
static __inline uint32_t
atomic_add_32_nv(volatile uint32_t *target, int32_t delta)
+1 -1
View File
@@ -52,7 +52,7 @@
#define ZFS_MODULE_PARAM_CALL_IMPL(parent, name, perm, args, desc) \
SYSCTL_DECL(parent); \
SYSCTL_PROC(parent, OID_AUTO, name, perm | args, desc)
SYSCTL_PROC(parent, OID_AUTO, name, CTLFLAG_MPSAFE | perm | args, desc)
#define ZFS_MODULE_PARAM_CALL(scope_prefix, name_prefix, name, func, _, perm, desc) \
ZFS_MODULE_PARAM_CALL_IMPL(_vfs_ ## scope_prefix, name, perm, func ## _args(name_prefix ## name), desc)
+2
View File
@@ -95,9 +95,11 @@ vn_flush_cached_data(vnode_t *vp, boolean_t sync)
if (vp->v_object->flags & OBJ_MIGHTBEDIRTY) {
#endif
int flags = sync ? OBJPC_SYNC : 0;
vn_lock(vp, LK_SHARED | LK_RETRY);
zfs_vmobject_wlock(vp->v_object);
vm_object_page_clean(vp->v_object, 0, 0, flags);
zfs_vmobject_wunlock(vp->v_object);
VOP_UNLOCK(vp);
}
}
+18 -14
View File
@@ -123,25 +123,29 @@ extern minor_t zfsdev_minor_alloc(void);
#define zn_rlimit_fsize(zp, uio) \
vn_rlimit_fsize(ZTOV(zp), GET_UIO_STRUCT(uio), zfs_uio_td(uio))
#define ZFS_ENTER_ERROR(zfsvfs, error) do { \
ZFS_TEARDOWN_ENTER_READ((zfsvfs), FTAG); \
if (__predict_false((zfsvfs)->z_unmounted)) { \
ZFS_TEARDOWN_EXIT_READ(zfsvfs, FTAG); \
return (error); \
} \
} while (0)
/* Called on entry to each ZFS vnode and vfs operation */
#define ZFS_ENTER(zfsvfs) \
{ \
ZFS_TEARDOWN_ENTER_READ((zfsvfs), FTAG); \
if (__predict_false((zfsvfs)->z_unmounted)) { \
ZFS_TEARDOWN_EXIT_READ(zfsvfs, FTAG); \
return (EIO); \
} \
}
#define ZFS_ENTER(zfsvfs) ZFS_ENTER_ERROR(zfsvfs, EIO)
/* Must be called before exiting the vop */
#define ZFS_EXIT(zfsvfs) ZFS_TEARDOWN_EXIT_READ(zfsvfs, FTAG)
#define ZFS_EXIT(zfsvfs) ZFS_TEARDOWN_EXIT_READ(zfsvfs, FTAG)
#define ZFS_VERIFY_ZP_ERROR(zp, error) do { \
if (__predict_false((zp)->z_sa_hdl == NULL)) { \
ZFS_EXIT((zp)->z_zfsvfs); \
return (error); \
} \
} while (0)
/* Verifies the znode is valid */
#define ZFS_VERIFY_ZP(zp) \
if (__predict_false((zp)->z_sa_hdl == NULL)) { \
ZFS_EXIT((zp)->z_zfsvfs); \
return (EIO); \
} \
#define ZFS_VERIFY_ZP(zp) ZFS_VERIFY_ZP_ERROR(zp, EIO)
/*
* Macros for dealing with dmu_buf_hold
+41 -8
View File
@@ -357,7 +357,11 @@ vdev_lookup_bdev(const char *path, dev_t *dev)
static inline void
bio_set_op_attrs(struct bio *bio, unsigned rw, unsigned flags)
{
#if defined(HAVE_BIO_BI_OPF)
bio->bi_opf = rw | flags;
#else
bio->bi_rw |= rw | flags;
#endif /* HAVE_BIO_BI_OPF */
}
#endif
@@ -495,21 +499,45 @@ blk_queue_discard_granularity(struct request_queue *q, unsigned int dg)
}
/*
* 5.19 API,
* bdev_max_discard_sectors()
*
* 2.6.32 API,
* blk_queue_discard()
*/
static inline boolean_t
bdev_discard_supported(struct block_device *bdev)
{
#if defined(HAVE_BDEV_MAX_DISCARD_SECTORS)
return (!!bdev_max_discard_sectors(bdev));
#elif defined(HAVE_BLK_QUEUE_DISCARD)
return (!!blk_queue_discard(bdev_get_queue(bdev)));
#else
#error "Unsupported kernel"
#endif
}
/*
* 5.19 API,
* bdev_max_secure_erase_sectors()
*
* 4.8 API,
* blk_queue_secure_erase()
*
* 2.6.36 - 4.7 API,
* blk_queue_secdiscard()
*/
static inline int
blk_queue_discard_secure(struct request_queue *q)
static inline boolean_t
bdev_secure_discard_supported(struct block_device *bdev)
{
#if defined(HAVE_BLK_QUEUE_SECURE_ERASE)
return (blk_queue_secure_erase(q));
#if defined(HAVE_BDEV_MAX_SECURE_ERASE_SECTORS)
return (!!bdev_max_secure_erase_sectors(bdev));
#elif defined(HAVE_BLK_QUEUE_SECURE_ERASE)
return (!!blk_queue_secure_erase(bdev_get_queue(bdev)));
#elif defined(HAVE_BLK_QUEUE_SECDISCARD)
return (blk_queue_secdiscard(q));
return (!!blk_queue_secdiscard(bdev_get_queue(bdev)));
#else
return (0);
#error "Unsupported kernel"
#endif
}
@@ -527,7 +555,10 @@ blk_generic_start_io_acct(struct request_queue *q __attribute__((unused)),
struct gendisk *disk __attribute__((unused)),
int rw __attribute__((unused)), struct bio *bio)
{
#if defined(HAVE_DISK_IO_ACCT)
#if defined(HAVE_BDEV_IO_ACCT)
return (bdev_start_io_acct(bio->bi_bdev, bio_sectors(bio),
bio_op(bio), jiffies));
#elif defined(HAVE_DISK_IO_ACCT)
return (disk_start_io_acct(disk, bio_sectors(bio), bio_op(bio)));
#elif defined(HAVE_BIO_IO_ACCT)
return (bio_start_io_acct(bio));
@@ -550,7 +581,9 @@ blk_generic_end_io_acct(struct request_queue *q __attribute__((unused)),
struct gendisk *disk __attribute__((unused)),
int rw __attribute__((unused)), struct bio *bio, unsigned long start_time)
{
#if defined(HAVE_DISK_IO_ACCT)
#if defined(HAVE_BDEV_IO_ACCT)
bdev_end_io_acct(bio->bi_bdev, bio_op(bio), start_time);
#elif defined(HAVE_DISK_IO_ACCT)
disk_end_io_acct(disk, bio_op(bio), start_time);
#elif defined(HAVE_BIO_IO_ACCT)
bio_end_io_acct(bio, start_time);
@@ -35,6 +35,10 @@
#define d_make_root(inode) d_alloc_root(inode)
#endif /* HAVE_D_MAKE_ROOT */
#ifdef HAVE_DENTRY_D_U_ALIASES
#define d_alias d_u.d_alias
#endif
/*
* 2.6.30 API change,
* The const keyword was added to the 'struct dentry_operations' in
@@ -61,4 +65,21 @@ d_clear_d_op(struct dentry *dentry)
DCACHE_OP_REVALIDATE | DCACHE_OP_DELETE);
}
/*
* Walk and invalidate all dentry aliases of an inode
* unless it's a mountpoint
*/
static inline void
zpl_d_drop_aliases(struct inode *inode)
{
struct dentry *dentry;
spin_lock(&inode->i_lock);
hlist_for_each_entry(dentry, &inode->i_dentry, d_alias) {
if (!IS_ROOT(dentry) && !d_mountpoint(dentry) &&
(dentry->d_inode == inode)) {
d_drop(dentry);
}
}
spin_unlock(&inode->i_lock);
}
#endif /* _ZFS_DCACHE_H */
+9 -5
View File
@@ -30,12 +30,16 @@
#include <linux/module.h>
#include <linux/moduleparam.h>
/* Grsecurity kernel API change */
#ifdef MODULE_PARAM_CALL_CONST
/*
* Despite constifying struct kernel_param_ops, some older kernels define a
* `__check_old_set_param()` function in their headers that checks for a
* non-constified `->set()`. This has long been fixed in Linux mainline, but
* since we support older kernels, we workaround it by using a preprocessor
* definition to disable it.
*/
#define __check_old_set_param(_) (0)
typedef const struct kernel_param zfs_kernel_param_t;
#else
typedef struct kernel_param zfs_kernel_param_t;
#endif
#define ZMOD_RW 0644
#define ZMOD_RD 0444
+2
View File
@@ -87,7 +87,9 @@
#if defined(HAVE_KERNEL_FPU_API_HEADER)
#include <asm/fpu/api.h>
#if defined(HAVE_KERNEL_FPU_INTERNAL_HEADER)
#include <asm/fpu/internal.h>
#endif
#if defined(HAVE_KERNEL_FPU_XCR_HEADER)
#include <asm/fpu/xcr.h>
#endif
@@ -115,6 +115,20 @@ fn(struct dentry *dentry, const char *name, void *buffer, size_t size, \
{ \
return (__ ## fn(dentry->d_inode, name, buffer, size)); \
}
/*
* Android API change,
* The xattr_handler->get() callback was changed to take a dentry and inode
* and flags, because the dentry might not be attached to an inode yet.
*/
#elif defined(HAVE_XATTR_GET_DENTRY_INODE_FLAGS)
#define ZPL_XATTR_GET_WRAPPER(fn) \
static int \
fn(const struct xattr_handler *handler, struct dentry *dentry, \
struct inode *inode, const char *name, void *buffer, \
size_t size, int flags) \
{ \
return (__ ## fn(inode, name, buffer, size)); \
}
#else
#error "Unsupported kernel"
#endif
+4
View File
@@ -64,7 +64,11 @@
* }
*/
#ifdef HAVE_REGISTER_SHRINKER_VARARG
#define spl_register_shrinker(x) register_shrinker(x, "zfs-arc-shrinker")
#else
#define spl_register_shrinker(x) register_shrinker(x)
#endif
#define spl_unregister_shrinker(x) unregister_shrinker(x)
/*
+2
View File
@@ -44,7 +44,9 @@
#define zfs_totalhigh_pages totalhigh_pages
#endif
#define membar_consumer() smp_rmb()
#define membar_producer() smp_wmb()
#define physmem zfs_totalram_pages
#define xcopyin(from, to, size) copy_from_user(to, from, size)
+2 -4
View File
@@ -62,7 +62,6 @@ DECLARE_EVENT_CLASS(zfs_ace_class,
__field(boolean_t, z_is_sa)
__field(boolean_t, z_is_mapped)
__field(boolean_t, z_is_ctldir)
__field(boolean_t, z_is_stale)
__field(uint32_t, i_uid)
__field(uint32_t, i_gid)
@@ -95,7 +94,6 @@ DECLARE_EVENT_CLASS(zfs_ace_class,
__entry->z_is_sa = zn->z_is_sa;
__entry->z_is_mapped = zn->z_is_mapped;
__entry->z_is_ctldir = zn->z_is_ctldir;
__entry->z_is_stale = zn->z_is_stale;
__entry->i_uid = KUID_TO_SUID(ZTOI(zn)->i_uid);
__entry->i_gid = KGID_TO_SGID(ZTOI(zn)->i_gid);
@@ -117,7 +115,7 @@ DECLARE_EVENT_CLASS(zfs_ace_class,
"zn_prefetch %u blksz %u seq %u "
"mapcnt %llu size %llu pflags %llu "
"sync_cnt %u mode 0x%x is_sa %d "
"is_mapped %d is_ctldir %d is_stale %d inode { "
"is_mapped %d is_ctldir %d inode { "
"uid %u gid %u ino %lu nlink %u size %lli "
"blkbits %u bytes %u mode 0x%x generation %x } } "
"ace { type %u flags %u access_mask %u } mask_matched %u",
@@ -126,7 +124,7 @@ DECLARE_EVENT_CLASS(zfs_ace_class,
__entry->z_seq, __entry->z_mapcnt, __entry->z_size,
__entry->z_pflags, __entry->z_sync_cnt, __entry->z_mode,
__entry->z_is_sa, __entry->z_is_mapped,
__entry->z_is_ctldir, __entry->z_is_stale, __entry->i_uid,
__entry->z_is_ctldir, __entry->i_uid,
__entry->i_gid, __entry->i_ino, __entry->i_nlink,
__entry->i_size, __entry->i_blkbits,
__entry->i_bytes, __entry->i_mode, __entry->i_generation,
+6 -2
View File
@@ -45,7 +45,8 @@ extern const struct inode_operations zpl_inode_operations;
extern const struct inode_operations zpl_dir_inode_operations;
extern const struct inode_operations zpl_symlink_inode_operations;
extern const struct inode_operations zpl_special_inode_operations;
extern dentry_operations_t zpl_dentry_operations;
/* zpl_file.c */
extern const struct address_space_operations zpl_address_space_operations;
extern const struct file_operations zpl_file_operations;
extern const struct file_operations zpl_dir_file_operations;
@@ -66,11 +67,14 @@ extern int zpl_xattr_security_init(struct inode *ip, struct inode *dip,
#if defined(HAVE_SET_ACL_USERNS)
extern int zpl_set_acl(struct user_namespace *userns, struct inode *ip,
struct posix_acl *acl, int type);
#elif defined(HAVE_SET_ACL_USERNS_DENTRY_ARG2)
extern int zpl_set_acl(struct user_namespace *userns, struct dentry *dentry,
struct posix_acl *acl, int type);
#else
extern int zpl_set_acl(struct inode *ip, struct posix_acl *acl, int type);
#endif /* HAVE_SET_ACL_USERNS */
#endif /* HAVE_SET_ACL */
#if defined(HAVE_GET_ACL_RCU)
#if defined(HAVE_GET_ACL_RCU) || defined(HAVE_GET_INODE_ACL)
extern struct posix_acl *zpl_get_acl(struct inode *ip, int type, bool rcu);
#elif defined(HAVE_GET_ACL)
extern struct posix_acl *zpl_get_acl(struct inode *ip, int type);

Some files were not shown because too many files have changed in this diff Show More