Compare commits

...

83 Commits

Author SHA1 Message Date
Tony Hutter 21bd766133 Tag zfs-2.1.7
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2022-12-01 12:39:45 -08:00
Tony Hutter 7819b12f2c zfs-2.1.7: Use ubuntu-20.04 for zloop and sanity builders
The zfs-2.1.7 branch is still using the older 'python-dev'
package names rather than the newer 'python3-dev' packages that
are required for 'ubuntu-latest'.  Use 'ubuntu-20.04' instead of
'ubuntu-latest' to get around this.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2022-12-01 12:39:45 -08:00
George Amanakis c8d2ab05e1 Fix setting the large_block feature after receiving a snapshot
We are not allowed to dirty a filesystem when done receiving
a snapshot. In this case the flag SPA_FEATURE_LARGE_BLOCKS will
not be set on that filesystem since the filesystem is not on
dp_dirty_datasets, and a subsequent encrypted raw send will fail.
Fix this by checking in dsl_dataset_snapshot_sync_impl() if the feature
needs to be activated and do so if appropriate.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #13699
Closes #13782
2022-12-01 12:39:45 -08:00
Damian Szuberski 2c50512ad2 Make autodetection disable pyzfs for kernel/srpm configurations
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes #13394
Closes #14178
2022-12-01 12:39:44 -08:00
Brooks Davis c4468a70c3 Don't leak packed recieved proprties
When local properties (e.g., from -o and -x) are provided, don't leak
the packed representation of the received properties due to variable
reuse.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Brooks Davis <brooks.davis@sri.com>
Closes #14197
2022-12-01 12:39:44 -08:00
Richard Yao e48aaef89f Fix NULL pointer dereference in dbuf_prefetch_indirect_done()
When ZFS is built with assertions, a prefetch is done on a redacted
blkptr and `dpa->dpa_dnode` is NULL, we will have a NULL pointer
dereference in `dbuf_prefetch_indirect_done()`.

Both Coverity and Clang's Static Analyzer caught this.

Reported-by: Coverity (CID 1524671)
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14210
2022-12-01 12:39:44 -08:00
Richard Yao 0e3abd2994 Lua: Fix bad bitshift in lua_strx2number()
The port of lua to OpenZFS modified lua to use int64_t for numbers
instead of double. As part of this, a function for calculating
exponentiation was replaced with a bit shift. Unfortunately, it did not
handle negative values. Also, it only supported exponents numbers with
7 digits before before overflow. This supports exponents up to 15 digits
before overflow.

Clang's static analyzer reported this as "Result of operation is garbage
or undefined" because the exponent was negative.

Reviewed-by: Damian Szuberski <szuberskidamian@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14204
2022-12-01 12:39:44 -08:00
Damian Szuberski 3d1e808096 Fix clang 13 compilation errors
```
os/linux/zfs/zvol_os.c:1111:3: error: ignoring return value of function
  declared with 'warn_unused_result' attribute [-Werror,-Wunused-result]
                add_disk(zv->zv_zso->zvo_disk);
                ^~~~~~~~ ~~~~~~~~~~~~~~~~~~~~

zpl_xattr.c:1579:1: warning: no previous prototype for function
  'zpl_posix_acl_release_impl' [-Wmissing-prototypes]
```

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes #13551
(cherry picked from commit 9884319666)
2022-12-01 12:39:44 -08:00
наб 108c07c655 Remove final K&R definitions
Clang trunk now warns -Wstrict-prototypes on this, and they're removed
in C2x

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13447
2022-12-01 12:39:44 -08:00
наб 32f7499acf module: zfs: vdev_removal: remove unused num_indirect
Found with -Wunused-but-set-variable on Clang trunk

Fixes: a1d477c24c ("OpenZFS 7614, 9064 - zfs device evacuation/removal")
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13304
2022-12-01 12:39:44 -08:00
наб 670d66e7a0 tests: cmd: draid: remove unused and undocumented -v
Found with -Wunused-but-set-variable on Clang trunk

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13304
2022-12-01 12:39:44 -08:00
наб ad0379bf0e linux: libspl: zone: () -> (void)
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12968
2022-12-01 12:39:44 -08:00
Laura Hild 2662b8e72b Correct multipathd.target to .service
https://github.com/openzfs/zfs/pull/9863 says it "orders
zfs-import-cache.service and zfs-import-scan.service after
multipathd.service" but the commit (79add96) actually
ordered them after .target.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Laura Hild <lsh@jlab.org>
Closes #12709
Closes #14171
2022-12-01 12:39:44 -08:00
Rich Ercolani fa7d572a8a Handle and detect #13709's unlock regression (#14161)
In #13709, as in #11294 before it, it turns out that 63a26454 still had
the same failure mode as when it was first landed as d1d47691, and
fails to unlock certain datasets that formerly worked.

Rather than reverting it again, let's add handling to just throw out
the accounting metadata that failed to unlock when that happens, as
well as a test with a pre-broken pool image to ensure that we never get
bitten by this again.

Fixes: #13709

Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
2022-12-01 12:39:43 -08:00
shodanshok d9de079a4b Fix arc_p aggressive increase
The original ARC paper called for an initial 50/50 MRU/MFU split
and this is accounted in various places where arc_p = arc_c >> 1,
with further adjustment based on ghost lists size/hit. However, in
current code both arc_adapt() and arc_get_data_impl() aggressively
grow arc_p until arc_c is reached, causing unneeded pressure on
MFU and greatly reducing its scan-resistance until ghost list
adjustments kick in.

This patch restores the original behavior of initially having arc_p
as 1/2 of total ARC, without preventing MRU to use up to 100% total
ARC when MFU is empty.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gionatan Danti <g.danti@assyoma.it>
Closes #14137
Closes #14120
2022-12-01 12:39:43 -08:00
Richard Yao 957c3776f2 FreeBSD: Fix out of bounds read in zfs_ioctl_ozfs_to_legacy()
There is an off by 1 error in the check. Fortunately, this function does
not appear to be used in kernel space, despite being compiled as part of
the kernel module. However, it is used in userspace. Callers of
lzc_ioctl_fd() likely will crash if they attempt to use the
unimplemented request number.

This was reported by FreeBSD's coverity scan.

Reported-by: Coverity (CID 1432059)
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Damian Szuberski <szuberskidamian@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14135
2022-12-01 12:39:43 -08:00
Serapheim Dimitropoulos 85537f77a3 Expose zfs_vdev_open_timeout_ms as a tunable
Some of our customers have been occasionally hitting zfs import failures
in Linux because udevd doesn't create the by-id symbolic links in time
for zpool import to use them. The main issue is that the
systemd-udev-settle.service that zfs-import-cache.service and other
services depend on is racy. There is also an openzfs issue filed (see
https://github.com/openzfs/zfs/issues/10891) outlining the problem and
potential solutions.

With the proper solutions being significant in terms of complexity and
the priority of the issue being low for the time being, this patch
exposes `zfs_vdev_open_timeout_ms` as a tunable so people that are
experiencing this issue often can increase it as a workaround.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Don Brady <don.brady@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes #14133
2022-12-01 12:39:43 -08:00
Brooks Davis 5f53a444b3 Remove an unused variable
Clang-16 detects this set-but-unused variable which is assigned and
incremented, but never referenced otherwise.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Brooks Davis <brooks.davis@sri.com>
Closes #14125
2022-12-01 12:39:43 -08:00
Brooks Davis 572bd18c1f Make 1-bit bitfields unsigned
This fixes -Wsingle-bit-bitfield-constant-conversion warning from
clang-16 like:

lib/libzfs/libzfs_dataset.c:4529:19: error: implicit truncation
  from 'int' to a one-bit wide bit-field changes value from
  1 to -1 [-Werror,-Wsingle-bit-bitfield-constant-conversion]
                flags.nounmount = B_TRUE;
				^ ~~~~~~

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Brooks Davis <brooks.davis@sri.com>
Closes #14125
2022-12-01 12:39:43 -08:00
Richard Yao 256b74d0b0 Address warnings about possible division by zero from clangsa
* The complaint in ztest_replay_write() is only possible if something
   went horribly wrong. An assertion will silence this and if it goes
   off, we will know that something is wrong.
 * The complaint in spa_estimate_metaslabs_to_flush() is not impossible,
   but seems very unlikely. We resolve this by passing the value from
   the `MIN()` that does not go to infinity when the variable is zero.

There was a third report from Clang's scan-build, but that was a
definite false positive and disappeared when checked again through
Clang's static analyzer with Z3 refution via CodeChecker.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14124
2022-12-01 12:39:43 -08:00
Allan Jude ac01b876c9 Avoid null pointer dereference in dsl_fs_ss_limit_check()
Check for cr == NULL before dereferencing it in
dsl_enforce_ds_ss_limits() to lookup the zone/jail ID.

Reported-by: Coverity (CID 1210459)
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Closes #14103
2022-12-01 12:39:43 -08:00
Richard Yao e9a8fb17b5 Fix too few arguments to formatting function
CodeQL reported that when the VERIFY3U condition is false, we do not
pass enough arguments to `spl_panic()`. This is because the format
string from `snprintf()` was concatenated into the format string for
`spl_panic()`, which causes us to have an unexpected format specifier.

A CodeQL developer suggested fixing the macro to have a `%s` format
string that takes a stringified RIGHT argument, which would fix this.
However, upon inspection, the VERIFY3U check was never necessary in the
first place, so we remove it in favor of just calling `snprintf()`.

Lastly, it is interesting that every other static analyzer run on the
codebase did not catch this, including some that made an effort to catch
such things. Presumably, all of them relied on header annotations, which
we have not yet done on `spl_panic()`. CodeQL apparently is able to
track the flow of arguments on their way to annotated functions, which
llowed it to catch this when others did not. A future patch that I have
in development should annotate `spl_panic()`, so the others will catch
this too.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14098
2022-12-01 12:39:43 -08:00
Pavel Snajdr 52e658edd7 Remove zpl_revalidate: fix snapshot rollback
Open files, which aren't present in the snapshot, which is being
roll-backed to, need to disappear from the visible VFS image of
the dataset.

Kernel provides d_drop function to drop invalid entry from
the dcache, but inode can be referenced by dentry multiple dentries.

The introduced zpl_d_drop_aliases function walks and invalidates
all aliases of an inode.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Pavel Snajdr <snajpa@snajpa.net>
Closes #9600
Closes #14070
2022-12-01 12:39:42 -08:00
Richard Yao 4c59fde1f5 Fix theoretical use of uninitialized values
Clang's static analyzer complains about this.

In get_configs(), if we have an invalid configuration that has no top
level vdevs, we can read a couple of uninitialized variables. Aborting
upon seeing this would break the userland tools for healthy pools, so we
instead initialize the two variables to 0 to allow the userland tools to
continue functioning for the pools with valid configurations.

In zfs_do_wait(), if no wait activities are enabled, we read an
uninitialized error variable.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14043
2022-12-01 12:39:42 -08:00
Richard Yao 3830858c5c Fix memory leaks in dmu_send()/dmu_send_obj()
If we encounter an EXDEV error when using the redacted snapshots
feature, the memory used by dspp.fromredactsnaps is leaked.

Clang's static analyzer caught this during an experiment in which I had
annotated various headers in an attempt to improve the results of static
analysis.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13973
2022-12-01 12:39:42 -08:00
Richard Yao af2e53f62c Fix possible NULL pointer dereference in sha2_mac_init()
If mechanism->cm_param is NULL, passing mechanism to
PROV_SHA2_GET_DIGEST_LEN() will dereference a NULL pointer.

Coverity reported this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14044
2022-12-01 12:39:42 -08:00
Richard Yao 89c41f3979 set_global_var() should not pass NULL pointers to dlclose()
Both Coverity and Clang's static analyzer caught this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14044
2022-12-01 12:39:42 -08:00
Richard Yao 409c99a1d3 Fix NULL pointer dereference in spa_open_common()
Calling spa_open() will pass a NULL pointer to spa_open_common()'s
config parameter. Under the right circumstances, we will dereference the
config parameter without doing a NULL check.

Clang's static analyzer found this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14044
2022-12-01 12:39:42 -08:00
Richard Yao bbec0e60a8 Fix NULL pointer passed to strlcpy from zap_lookup_impl()
Clang's static analyzer pointed out that whenever zap_lookup_by_dnode()
is called, we have the following stack where strlcpy() is passed a NULL
pointer for realname from zap_lookup_by_dnode():

strlcpy()
zap_lookup_impl()
zap_lookup_norm_by_dnode()
zap_lookup_by_dnode()

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14044
2022-12-01 12:39:42 -08:00
Richard Yao a5f17a94d3 fm_fmri_hc_create() must call va_end() before returning
clang-tidy caught this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14044
2022-12-01 12:39:42 -08:00
Richard Yao 5eaad8bdb5 Fix NULL pointer dereference in zdb
Clang's static analyzer complained that we dereference a NULL pointer in
dump_path() if we return 0 when there is an error.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14044
2022-12-01 12:39:42 -08:00
Richard Yao 4351d18fb0 ZED: Fix uninitialized value reads
Coverity complained about a couple of uninitialized value reads in ZED.

 * zfs_deliver_dle() can pass an uninitialized string to zed_log_msg()
 * An uninitialized sev.sigev_signo is passed to timer_create()

The former would log garbage while the latter is not a real issue, but
we might as well suppress it by initializing the field to 0 for
consistency's sake.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14047
2022-12-01 12:39:41 -08:00
Richard Yao 2453f90350 Fix theoretical array overflow in lua_typename()
Out of the 12 defects in lua that coverity reports, 5 of them involve
`lua_typename()` and out of the dozens of defects in ZFS that lua
reports, 3 of them involve `lua_typename()` due to the ZCP code. Given
all of the uses of `lua_typename()` in the ZCP code, I was surprised
that there were not more. It appears that only 2 were reported because
only 3 called `lua_type()`, which does a defective sanity check that
allows invalid types to be passed.

lua/lua@d4fb848be7 addressed this in
upstream lua 5.3. Unfortunately, we did not get that fix since we use
lua 5.2 and we do not have assertions enabled in lua, so the upstream
solution would not do anything.

While we could adopt the upstream solution and enable assertions, a
simpler solution is to fix the issue by making `lua_typename()` return
`internal_type_error` whenever it is called with an invalid type. This
avoids the array overflow and if we ever see it appear somewhere, we
will know there is a problem with the lua interpreter.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13947
2022-12-01 12:39:41 -08:00
Richard Yao d016ca1a92 Fix potential NULL pointer dereference in lzc_ioctl()
Users are allowed to pass NULL to resultp, but we unconditionally assume
that they never do. When an external user does pass NULL to resultp, we
dereference a NULL pointer.

Clang's static analyzer complained about this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14008
2022-12-01 12:39:41 -08:00
Richard Yao d05f247aec scripts/enum-extract.pl should not hard code perl path
This is a portability issue. The issue had already been fixed for
scripts/cstyle.pl by 2dbf1bf829.
scripts/enum-extract.pl was added to the repository the following year
without this portability fix.

Michael Bishop informed me that this broke his attempt to build ZFS
2.1.6 on NixOS, since he was building manually outside of their package
manager (that usually rewrites the shebangs to NixOS' unusual paths).
NixOS puts all of the paths into $PATH, so scripts that portably rely
on env to find the interpreter still work.

Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14012
2022-12-01 12:39:41 -08:00
Richard Yao fa74250cd3 PAM: Fix unchecked return value from zfs_key_config_load()
9a49c6b782 was intended to fix this issue,
but I had missed the case in pam_sm_open_session(). Clang's static
analyzer had not reported it and I forgot to look for other cases.

Interestingly, GCC gcc-12.1.1_p20220625's static analyzer had caught
this as multiple double-free bugs, since another failure after the
failure in zfs_key_config_load() will cause us to attempt to free the
memory that zfs_key_config_load() was supposed to allocate, but had
cleaned up upon failure.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13978
2022-12-01 12:39:41 -08:00
Richard Yao c562bbefc0 Fix potential NULL pointer dereference in dsl_dataset_promote_check()
If the `list_head()` returns NULL, we dereference it, right before we
check to see if it returned NULL.

We have defined two different pointers that both point to the same
thing, which are `origin_head` and `origin_ds`. Almost everything uses
`origin_ds`, so we switch them to use `origin_ds`.

We also promote `origin_ds` to a const pointer so that the compiler
verifies that nothing modifies it.

Coverity complained about this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13967
2022-12-01 12:39:41 -08:00
Richard Yao d4df36de5d Fix unreachable code in zstreamdump
82226e4f44 was intended to prevent a
warning from being printed in situations where it was inappropriate, but
accidentally disabled it entirely by setting featureflags in the wrong
case statement.

Coverity reported this as dead code.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13946
2022-12-01 12:39:41 -08:00
Richard Yao 531361114b PAM: Fix uninitialized value read
Clang's static analyzer found that config.uid is uninitialized when
zfs_key_config_load() returns an error.

Oddly, this was not included in the unchecked return values that
Coverity found.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13957
2022-12-01 12:39:41 -08:00
Richard Yao e11c4327f1 set_global_var_parse_kv() should pass the pointer from strdup()
A comment says that the caller should free k_out, but the pointer passed
via k_out is not the same pointer we received from strdup(). Instead,
it is a pointer into the region we received from strdup(). The free
function should always be called with the original pointer, so this is
likely a bug.

We solve this by calling `strdup()` a second time and then freeing the
original pointer.

Coverity reported this as a memory leak.

Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13867
2022-12-01 12:39:41 -08:00
Richard Yao fbe150fe5b Call va_end() before return in zpool_standard_error_fmt()
Commit ecd6cf800b63704be73fb264c3f5b6e0dafc068d by marks in OpenSolaris
at Tue Jun 26 07:44:24 2007 -0700 introduced a bug where we fail to call
`va_end()` before returning.

The man page for va_start() says:

"Each invocation of va_start() must be matched by a corresponding
invocation of va_end() in the same function."

Coverity complained about this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Chunwei Chen <david.chen@nutanix.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13904
2022-12-01 12:39:41 -08:00
Richard Yao 1ff8f41851 Fix potential NULL pointer dereference in zfsdle_vdev_online()
Coverity complained about this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Chunwei Chen <david.chen@nutanix.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13903
2022-12-01 12:39:40 -08:00
Richard Yao c6d93d0a80 FreeBSD: Fix uninitialized pointer read in spa_import_rootpool()
The FreeBSD project's coverity scans found this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13923
2022-12-01 12:39:40 -08:00
Richard Yao 9f1691a964 Linux: Fix use-after-free in zfsvfs_create()
Coverity reported that we pass a pointer to zfsvfs to
`dmu_objset_disown()` after freeing zfsvfs in zfsvfs_create_impl() after
a failure in zfsvfs_init().

We have nearly identical duplicate versions of this code for FreeBSD and
Linux, but interestingly, the FreeBSD version of this code differs in
such a way that it does not suffer from this bug. We remove the
difference from the FreeBSD version to fix this bug.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13883
2022-12-01 12:39:40 -08:00
Richard Yao 12b859c970 Fix null pointer dereferences in PAM
Coverity caught these.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13889
2022-12-01 12:39:40 -08:00
наб 39a39b8ab9 Handle ECKSUM as new EZFS_CKSUM ‒ "insufficient replicas"
Add a meaningful error message for ECKSUM to common error messages.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #6805
Closes #13808
Closes #13898
2022-12-01 12:39:40 -08:00
Richard Yao 1d5e569a69 Fix use-after-free bugs in icp code
These were reported by Coverity as "Read from pointer after free" bugs.
Presumably, it did not report it as a use-after-free bug because it does
not understand the inline assembly that implements the atomic
instruction.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13881
2022-12-01 12:39:40 -08:00
Richard Yao 3f380df778 Remove incorrect free() in zfs_get_pci_slots_sys_path()
Coverity found this. We attempted to free tmp, which is a pointer to a
string that should be freed by the caller.

Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13864
2022-12-01 12:39:40 -08:00
Richard Yao b247d47be1 Cleanup: Make memory barrier definitions consistent across kernels
We inherited membar_consumer() and membar_producer() from OpenSolaris,
but we had replaced membar_consumer() with Linux's smp_rmb() in
zfs_ioctl.c. The FreeBSD SPL consequently implemented a shim for the
Linux-only smp_rmb().

We reinstate membar_consumer() in platform independent code and fix the
FreeBSD SPL to implement membar_consumer() in a way analogous to Linux.

Reviewed-by: Konstantin Belousov <kib@FreeBSD.org>
Reviewed-by: Mateusz Guzik <mjguzik@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13843
2022-12-01 12:39:40 -08:00
Richard Yao 792825724b zpool_load_compat() should create strings of length ZFS_MAXPROPLEN
Otherwise, `strlcat()` can overflow them.

Coverity found this.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13866
2022-12-01 12:39:40 -08:00
Alexander Lobakin ab22031d79 icp: fix all !ENDBR objtool warnings in x86 Asm code
Currently, only Blake3 x86 Asm code has signs of being ENDBR-aware.
At least, under certain conditions it includes some header file and
uses some custom macro from there.
Linux has its own NOENDBR since several releases ago. It's defined
in the same <asm/linkage.h>, so currently <sys/asm_linkage.h>
already is provided with it.

Let's unify those two into one %ENDBR macro. At first, check if it's
present already. If so -- use Linux kernel version. Otherwise, try
to go that second way and use %_CET_ENDBR from <cet.h> if available.
If no, fall back to just empty definition.
This fixes a couple more 'relocations to !ENDBR' across the module.
And now that we always have the latest/actual ENDBR definition, use
it at the entrance of the few corresponding functions that objtool
still complains about. This matches the way how it's used in the
upstream x86 core Asm code.

Reviewed-by: Attila Fülöp <attila@fueloep.org>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Closes #14035
2022-12-01 12:39:39 -08:00
Alexander Lobakin 33bc03dea7 icp: fix rodata being marked as text in x86 Asm code
objtool properly complains that it can't decode some of the
instructions from ICP x86 Asm code. As mentioned in the Makefile,
where those object files were excluded from objtool check (but they
can still be visible under IBT and LTO), those are just constants,
not code.
In that case, they must be placed in .rodata, so they won't be
marked as "allocatable, executable" (ax) in EFL headers and this
effectively prevents objtool from trying to decode this data. That
reveals a whole bunch of other issues in ICP Asm code, as previously
objtool was bailing out after that warning message.

Reviewed-by: Attila Fülöp <attila@fueloep.org>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Closes #14035

Conflicts:
	module/Kbuild.in
2022-11-30 10:15:58 -08:00
Alexander Lobakin ee93cbc9d4 icp: properly fix all RETs in x86_64 Asm code
Commit 43569ee374 ("Fix objtool: missing int3 after ret warning")
addressed replacing all `ret`s in x86 asm code to a macro in the
Linux kernel in order to enable SLS. That was done by copying the
upstream macro definitions and fixed objtool complaints.
Since then, several more mitigations were introduced, including
Rethunk. It requires to have a jump to one of the thunks in order
to work, so the RET macro was changed again. And, as ZFS code
didn't use the mainline defition, but copied it, this is currently
missing.

Objtool reminds about it time to time (Clang 16, CONFIG_RETHUNK=y):

fs/zfs/lua/zlua.o: warning: objtool: setjmp+0x25: 'naked' return
 found in RETHUNK build
fs/zfs/lua/zlua.o: warning: objtool: longjmp+0x27: 'naked' return
 found in RETHUNK build

Do it the following way:
* if we're building under Linux, unconditionally include
  <linux/linkage.h> in the related files. It is available in x86
  sources since even pre-2.6 times, so doesn't need any conftests;
* then, if RET macro is available, it will be used directly, so that
  we will always have the version actual to the kernel we build;
* if there's no such macro, we define it as a simple `ret`, as it
  was on pre-SLS times.

This ensures we always have the up-to-date definition with no need
to update it manually, and at the same time is safe for the whole
variety of kernels ZFS module supports.
Then, there's a couple more "naked" rets left in the code, they're
just defined as:

	.byte 0xf3,0xc3

In fact, this is just:

	rep ret

`rep ret` instead of just `ret` seems to mitigate performance issues
on some old AMD processors and most likely makes no sense as of
today.
Anyways, address those rets, so that they will be protected with
Rethunk and SLS. Include <sys/asm_linkage.h> here which now always
has RET definition and replace those constructs with just RET.
This wipes the last couple of places with unpatched rets objtool's
been complaining about.

Reviewed-by: Attila Fülöp <attila@fueloep.org>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Closes #14035
2022-11-30 10:15:58 -08:00
Ryan Moeller 1d9aa838ed libzfs recv: Check if user prop before inheritable
User props trigger an assert in zfs_prop_inheritable(), we must check
if the prop is a user prop first.

Signed-off-by: Ryan Moeller <ryan@iXsystems.com>

Backported as snippit from:
63652e1 Add --enable-asan and --enable-ubsan switches
2022-11-30 10:13:23 -08:00
Damian Szuberski 0f4ee295ba dsl_prop_known_index(): check for invalid prop
Resolve UBSAN array-index-out-of-bounds error in zprop_desc_t.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes #14142
Closes #14147
2022-11-08 10:16:21 -08:00
Ameer Hamza 8c0684d326 zed: Avoid core dump if wholedisk property does not exist
zed aborts and dumps core in vdev_whole_disk_from_config() if
wholedisk property does not exist. make_leaf_vdev() adds the
property but there may be already pools that don't have the
wholedisk in the label.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #14062
2022-11-08 10:10:05 -08:00
Ameer Hamza ca3a675c74 zed: Prevent special vdev to be replaced by hot spare
Special vdevs should not be replaced by a hot spare.
Log vdevs already support this, extending the
functionality for special vdevs.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #14129
2022-11-07 13:36:57 -08:00
Attila Fülöp cd1f023846 Deny receiving into encrypted datasets if the keys are not loaded (#14139)
Commit 68ddc06b61 introduced support
for receiving unencrypted datasets as children of encrypted ones but
unfortunately got the logic upside down. This resulted in failing to
deny receives of incremental sends into encrypted datasets without
their keys loaded. If receiving a filesystem, the receive was done
into a newly created unencrypted child dataset of the target. In
case of volumes the receive made the target volume undeletable since
a dataset was created below it, which we obviously can't handle.
Incremental streams with embedded blocks are affected as well.

We fix the broken logic to properly deny receives in such cases.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes #13598
Closes #14055
Closes #14119
2022-11-04 11:07:29 -07:00
Ryan Moeller b27c7a1457 zil: Relax assertion in zil_parse
Rather than panic debug builds when we fail to parse a whole ZIL, let's
instead improve the logging of errors and continue like in a release
build.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #14116
2022-11-01 12:49:14 -07:00
Mariusz Zaborski 186e39f336 quota: extend quota for dataset
This patch relax the quota limitation for dataset by around 3%.
What this means is that user can write more data then the quota is
set to. However thanks to that we can get more stable bandwidth, in
case when we are overwriting data in-place, and not consuming any
additional space.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Signed-off-by: Mariusz Zaborski <oshogbo@vexillium.org>
Sponsored-by: Zededa Inc.
Sponsored-by: Klara Inc.
Closes #13839
2022-11-01 12:48:37 -07:00
shodanshok 1d2b0563f7 Fix ARC target collapse when zfs_arc_meta_limit_percent=100
Reclaim metadata when arc_available_memory < 0 even if
meta_used is not bigger than arc_meta_limit.

As described in https://github.com/openzfs/zfs/issues/14054 if
zfs_arc_meta_limit_percent=100 then ARC target can collapse to
arc_min due to arc_purge not freeing any metadata.

This patch lets arc_prune to do its work when arc_available_memory
is negative even if meta_used is not bigger than arc_meta_limit,
avoiding ARC target collapse.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gionatan Danti <g.danti@assyoma.it>
Closes #14054 
Closes #14093
2022-11-01 12:48:30 -07:00
vaclavskala 8929355b4c Propagate extent_bytes change to autotrim thread
The autotrim thread only reads zfs_trim_extent_bytes_min and
zfs_trim_extent_bytes_max variable only on thread start.  We
should check for parameter changes during thread execution to
allow parameter changes take effect without needing to disable
then restart the autotrim.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Václav Skála <skala@vshosting.cz>
Closes #14077
2022-11-01 12:48:23 -07:00
Coleman Kane 212ba9bd97 Linux 6.1 compat: change order of sys/mutex.h includes
After Linux 6.1-rc1 came out, the build started failing to build a
couple of the files in the linux spl code due to the mutex_init
redefinition. Moving the sys/mutex.h include to a lower position within
these two files appears to fix the problem.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #14040
2022-11-01 12:44:56 -07:00
Brian Behlendorf 7ce097c874 Linux 6.0 compat: META
Update the META file to reflect compatibility with the 6.0 kernel.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #14091
2022-11-01 12:43:49 -07:00
Alexander 3e767e34bd Linux compat: fix DECLARE_EVENT_CLASS() test when ZFS is built-in
ZFS_LINUX_TRY_COMPILE_HEADER macro doesn't take CONFIG_ZFS=y into
account. As a result, on several latest Linux versions, configure
script marks DECLARE_EVENT_CLASS() available for non-GPL when ZFS
is being built as a module, but marks it unavailable when ZFS is
built-in.
Follow the logic of the neighbor macros and adjust
ZFS_LINUX_TRY_COMPILE_HEADER accordingly, so that it doesn't try
to look for a .ko when ZFS is built-in.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Closes #14006
2022-11-01 12:43:29 -07:00
Christian Schwarz df000276b8 zfs_domount: fix double-disown of dataset / double-free of zfsvfs_t
Before this patch, in zfs_domount, if zfs_root or d_make_root fails, we
leave zfsvfs != NULL. This will lead to execution of the error handling
`if` statement at the `out` label, and hence to a call to
dmu_objset_disown and zfsvfs_free.

However, zfs_umount, which we call upon failure of zfs_root and
d_make_root already does dmu_objset_disown and zfsvfs_free.

I suppose this patch rather adds to the brittleness of this part of the
code base, but I don't want to invest more time in this right now.
To add a regression test, we'd need some kind of fault injection
facility for zfs_root or d_make_root, which doesn't exist right now.
And even then, I think that regression test would be too closely tied
to the implementation.

To repro the double-disown / double-free, do the following:
1. patch zfs_root to always return an error
2. mount a ZFS filesystem

Here's the stack trace you would see then:

  VERIFY3(ds->ds_owner == tag) failed (0000000000000000 == ffff9142361e8000)
  PANIC at dsl_dataset.c:1003:dsl_dataset_disown()
  Showing stack for process 28332
  CPU: 2 PID: 28332 Comm: zpool Tainted: G           O      5.10.103-1.nutanix.el7.x86_64 #1
  Call Trace:
   dump_stack+0x74/0x92
   spl_dumpstack+0x29/0x2b [spl]
   spl_panic+0xd4/0xfc [spl]
   dsl_dataset_disown+0xe9/0x150 [zfs]
   dmu_objset_disown+0xd6/0x150 [zfs]
   zfs_domount+0x17b/0x4b0 [zfs]
   zpl_mount+0x174/0x220 [zfs]
   legacy_get_tree+0x2b/0x50
   vfs_get_tree+0x2a/0xc0
   path_mount+0x2fa/0xa70
   do_mount+0x7c/0xa0
   __x64_sys_mount+0x8b/0xe0
   do_syscall_64+0x38/0x50
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Co-authored-by: Christian Schwarz <christian.schwarz@nutanix.com>
Signed-off-by: Christian Schwarz <christian.schwarz@nutanix.com>
Closes #14025
2022-11-01 12:42:32 -07:00
Richard Yao 7a1b6c51d0 Linux: Remove ZFS_AC_KERNEL_SRC_MODULE_PARAM_CALL_CONST autotools check
On older kernels, the definition for `module_param_call()` typecasts
function pointers to `(void *)`, which triggers -Werror, causing the
check to return false when it should return true.

Fixing this breaks the build process on some older kernels because they
define a `__check_old_set_param()` function in their headers that checks
for a non-constified `->set()`. We workaround that through the c
preprocessor by defining `__check_old_set_param(set)` to `(set)`, which
prevents the build failures.

However, it is now apparent that all kernels that we support have
adopted the GRSecurity change, so there is no need to have an explicit
autotools check for it anymore. We therefore remove the autotools check,
while adding the workaround to our headers for the build time
non-constified `->set()` check done by older kernel headers.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #13984
Closes #14004
2022-11-01 12:42:01 -07:00
George Melikov 4dd9c3b08e CI: bump actions/upload-artifact to v3
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #14018
2022-11-01 12:38:22 -07:00
George Melikov 1bbc09e1f7 CI: bump actions/checkout to v3
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #14018
2022-11-01 12:38:09 -07:00
Serapheim Dimitropoulos 37d5a3e04b Stop ganging due to past vdev write errors
= Problem

While examining a customer's system we noticed unreasonable space
usage from a few snapshots due to gang blocks. Under some further
analysis we discovered that the pool would create gang blocks because
all its disks had non-zero write error counts and they'd be skipped
for normal metaslab allocations due to the following if-clause in
`metaslab_alloc_dva()`:
```
	/*
	 * Avoid writing single-copy data to a failing,
	 * non-redundant vdev, unless we've already tried all
	 * other vdevs.
	 */
	if ((vd->vdev_stat.vs_write_errors > 0 ||
	    vd->vdev_state < VDEV_STATE_HEALTHY) &&
	    d == 0 && !try_hard && vd->vdev_children == 0) {
		metaslab_trace_add(zal, mg, NULL, psize, d,
		    TRACE_VDEV_ERROR, allocator);
		goto next;
	}
```

= Proposed Solution

Get rid of the predicate in the if-clause that checks the past
write errors of the selected vdev. We still try to allocate from
HEALTHY vdevs anyway by checking vdev_state so the past write
errors doesn't seem to help us (quite the opposite - it can cause
issues in long-lived pools like the one from our customer).

= Testing

I first created a pool with 3 vdevs:
```
$ zpool list -v volpool
NAME        SIZE  ALLOC   FREE
volpool    22.5G   117M  22.4G
  xvdb     7.99G  40.2M  7.46G
  xvdc     7.99G  39.1M  7.46G
  xvdd     7.99G  37.8M  7.46G
```

And used `zinject` like so with each one of them:
```
$ sudo zinject -d xvdb -e io -T write -f 0.1 volpool
```

And got the vdevs to the following state:
```
$ zpool status volpool
  pool: volpool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.
...<cropped>..
action: Determine if the device needs to be replaced, and clear the
...<cropped>..
config:

	NAME        STATE     READ WRITE CKSUM
	volpool     ONLINE       0     0     0
	  xvdb      ONLINE       0     1     0
	  xvdc      ONLINE       0     1     0
	  xvdd      ONLINE       0     4     0

```

I also double-checked their write error counters with sdb:
```
sdb> spa volpool | vdev | member vdev_stat.vs_write_errors
(uint64_t)0  # <---- this is the root vdev
(uint64_t)2
(uint64_t)1
(uint64_t)1
```

Then I checked that I the problem was reproduced in my VM as I the
gang count was growing in zdb as I was writting more data:
```
$ sudo zdb volpool | grep gang
        ganged count:              1384

$ sudo zdb volpool | grep gang
        ganged count:              1393

$ sudo zdb volpool | grep gang
        ganged count:              1402

$ sudo zdb volpool | grep gang
        ganged count:              1414
```

Then I updated my bits with this patch and the gang count stayed the
same.

Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes #14003
2022-11-01 12:36:25 -07:00
Serapheim Dimitropoulos 25096e1180 zvol_wait logic may terminate prematurely
Setups that have a lot of zvols may see zvol_wait terminate prematurely
even though the script is still making progress.  For example, we have a
customer that called zvol_wait for ~7100 zvols and by the last iteration
of that script it was still waiting on ~2900. Similarly another one
called zvol_wait for 2200 and by the time the script terminated there
were only 50 left.

This patch adjusts the logic to stay within the outer loop of the script
if we are making any progress whatsoever.

Reviewed-by: George Wilson <gwilson@delphix.com>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Don Brady <don.brady@delphix.com>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes #13998
2022-11-01 12:35:36 -07:00
shodanshok 820edcbf91 Remove ambiguity on demand vs prefetch stats reported by arc_summary
arc_summary currently list prefetch stats as "demand prefetch"
However, a hit/miss can be due to demand or prefetch, not both.
To remove any confusion, this patch removes the "Demand" word
from the affected lines.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Gionatan Danti <g.danti@assyoma.it>
Closes #13985
2022-11-01 12:35:05 -07:00
Serapheim Dimitropoulos 37763ea2a6 Fix panic in dsl_process_sub_livelist for EINTR
= Issue

Recently we hit an assertion panic in `dsl_process_sub_livelist` while
exporting the spa and interrupting `bpobj_iterate_nofree`. In that case
`bpobj_iterate_nofree` stops mid-way returning an EINTR without clearing
the intermediate AVL tree that keeps track of the livelist entries it
has encountered so far. At that point the code has a VERIFY for the
number of elements of the AVL expecting it to be zero (which is not the
case for EINTR).

= Fix

Cleanup any intermediate state before destroying the AVL when
encountering EINTR. Also added a comment documenting the scenario where
the EINTR comes up. There is no need to do anything else for the calles
of `dsl_process_sub_livelist` as they already handle the EINTR case.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes #13939
2022-11-01 12:34:08 -07:00
Mateusz Guzik c8d6a91a99 Bring per_txg_dirty_frees_percent back to 30
The current value causes significant artificial slowdown during mass
parallel file removal, which can be observed both on FreeBSD and Linux
when running real workloads.

Sample results from Linux doing make -j 96 clean after an allyesconfig
modules build:

before: 4.14s user 6.79s system 48% cpu 22.631 total
after:	4.17s user 6.44s system 153% cpu 6.927 total

FreeBSD results in the ticket.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by:	Mateusz Guzik <mjguzik@gmail.com>
Closes #13932
Closes #13938
2022-11-01 12:32:40 -07:00
Akash B 7ac732b8d6 Add options to zfs redundant_metadata property
Currently, additional/extra copies are created for metadata in
addition to the redundancy provided by the pool(mirror/raidz/draid),
due to this 2 times more space is utilized per inode and this decreases
the total number of inodes that can be created in the filesystem. By
setting redundant_metadata to none, no additional copies of metadata
are created, hence can reduce the space consumed by the additional
metadata copies and increase the total number of inodes that can be
created in the filesystem.  Additionally, this can improve file create
performance due to the reduced amount of metadata which needs
to be written.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Dipak Ghosh <dipak.ghosh@hpe.com>
Signed-off-by: Akash B <akash-b@hpe.com>
Closes #13680
2022-11-01 12:25:58 -07:00
Andriy Gapon 04f1983aab FreeBSD: vn_flush_cached_data: observe vnode locking contract
vm_object_page_clean() expects that the associated vnode is locked
as VOP_PUTPAGES() may get called on the vnode.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Andriy Gapon <avg@FreeBSD.org>
Closes #14079
(cherry picked from commit 41133c9794)
2022-10-27 16:14:57 -07:00
Mark Johnston 4e3fecbdfd FreeBSD: Fix a pair of bugs in zfs_fhtovp()
- Add a zfs_exit() call in an error path, otherwise a lock is leaked.
- Remove the fid_gen > 1 check.  That appears to be Linux-specific:
  zfsctl_snapdir_fid() sets fid_gen to 0 or 1 depending on whether the
  snapshot directory is mounted.  On FreeBSD it fails, making snapshot
  dirs inaccessible via NFS.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Andriy Gapon <avg@FreeBSD.org>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Fixes: 43dbf88178 ("FreeBSD: vfsops: use setgen for error case")
Closes #14001
Closes #13974
(cherry picked from commit ed566bf1cd)
2022-10-26 14:59:25 -07:00
samwyc fc1c0053f9 Fix sequential resilver drive failure race condition
This patch handles the race condition on simultaneous failure of
2 drives, which misses the vdev_rebuild_reset_wanted signal in
vdev_rebuild_thread. We retry to catch this inside the
vdev_rebuild_complete_sync function.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Dipak Ghosh <dipak.ghosh@hpe.com>
Reviewed-by: Akash B <akash-b@hpe.com>
Signed-off-by: Samuel Wycliffe J <samwyc@hpe.com>
Closes #14041
Closes #14050
2022-10-21 14:05:06 -07:00
Brian Behlendorf 7795975681 contrib: dracut: zfs-snapshot-bootfs: exit status fix
Correct misplaced `-` is the original backport of #13769.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #13769
2022-10-20 11:37:21 -07:00
gregory-lee-bartholomew 3b935cc3ed contrib: dracut: zfs-{rollback,snapshot}-bootfs: explicit snapname fix
Due to a missing semicolon on the ExecStart line, it wasn't possible
to specify the snapshot name on the bootfs.{rollback,snapshot}
kernel parameters if the boot dataset name was obtained from the
root=zfs:... kernel parameter.

Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gregory Bartholomew <gregory.lee.bartholomew@gmail.com>
Closes #13585
2022-10-20 11:34:59 -07:00
Richard Yao b0bc882395 kcfpool_alloc() should have its argument list marked void
This error occurred when building on Gentoo with debugging enabled:

zfs-kmod-2.1.6/work/zfs-2.1.6/module/icp/core/kcf_sched.c:1277:14:
error: a function declaration without a prototype is deprecated
in all versions of C [-Werror,-Wstrict-prototypes]
  kcfpool_alloc()
               ^
               void
1 error generated.

This function is not present in master.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes #14023
2022-10-12 15:47:39 -07:00
наб 8cf59e97c4 etc: mask zfs-load-key.service
Otherwise, systemd-sysv-generator will generate a service equivalent
that breaks the boot: under systemd this is covered by
zfs-mount-generator

We already do this for zfs-import.service, and other init scripts are
suppressed automatically by the "actual" .service files

Fixes: commit f04b976200 ("Add init script
 to load keys")
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #14010
Closes #14019
2022-10-12 15:29:21 -07:00
Damian Szuberski 4d22befde6 initramfs: use mount.zfs instead of mount
A followup to d7a67402a8

For `mount -t zfs -o opts ds mp` command line
some implementations of `mount(8)`, e. g. Busybox in Debian
work as follows:

```
newfstatat(AT_FDCWD, "ds", 0x7fff826f4ab0, 0) = -1
mount("ds", "mp", "zfs", MS_SILENT, NULL) = 0
```

The logic above skips completely `mount.zfs` and prevents us
from reading filesystem properties and applying mount options.

For comparison, the coreutils `mount(8)` implementation does:

```
openat(AT_FDCWD, "/proc/filesystems", O_RDONLY|O_CLOEXEC) = 3
// figure out that zfs is a `nodev` filesystem and look for a helper
newfstatat(AT_FDCWD, "/sbin/mount.zfs" ...) = 0
execve("/sbin/mount.zfs" ...) = 0
```

Using `mount.zfs` in initramfs would help circumvent deficiencies
of some of `mount(8)` implementations. `mount -t zfs` translates
to `mount.zfs` invocation, except for cases when explicitly disabled
by `-i`.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes #13305
(cherry picked from commit 35d81a75a8)
2022-10-05 17:01:39 -07:00
108 changed files with 762 additions and 309 deletions
+2 -2
View File
@@ -8,7 +8,7 @@ jobs:
checkstyle:
runs-on: ubuntu-20.04
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v3
with:
ref: ${{ github.event.pull_request.head.sha }}
- name: Install dependencies
@@ -43,7 +43,7 @@ jobs:
if: failure() && steps.CheckABI.outcome == 'failure'
run: |
find -name *.abi | tar -cf abi_files.tar -T -
- uses: actions/upload-artifact@v2
- uses: actions/upload-artifact@v3
if: failure() && steps.CheckABI.outcome == 'failure'
with:
name: New ABI files (use only if you're sure about interface changes)
+2 -2
View File
@@ -12,7 +12,7 @@ jobs:
os: [18.04, 20.04]
runs-on: ubuntu-${{ matrix.os }}
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v3
with:
ref: ${{ github.event.pull_request.head.sha }}
- name: Install dependencies
@@ -75,7 +75,7 @@ jobs:
sudo chmod +r $RESULTS_PATH/*
# Replace ':' in dir names, actions/upload-artifact doesn't support it
for f in $(find /var/tmp/test_results -name '*:*'); do mv "$f" "${f//:/__}"; done
- uses: actions/upload-artifact@v2
- uses: actions/upload-artifact@v3
if: failure()
with:
name: Test logs Ubuntu-${{ matrix.os }}
+3 -3
View File
@@ -6,9 +6,9 @@ on:
jobs:
tests:
runs-on: ubuntu-latest
runs-on: ubuntu-20.04
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v3
with:
ref: ${{ github.event.pull_request.head.sha }}
- name: Install dependencies
@@ -71,7 +71,7 @@ jobs:
sudo chmod +r $RESULTS_PATH/*
# Replace ':' in dir names, actions/upload-artifact doesn't support it
for f in $(find /var/tmp/test_results -name '*:*'); do mv "$f" "${f//:/__}"; done
- uses: actions/upload-artifact@v2
- uses: actions/upload-artifact@v3
if: failure()
with:
name: Test logs
+4 -4
View File
@@ -6,11 +6,11 @@ on:
jobs:
tests:
runs-on: ubuntu-latest
runs-on: ubuntu-20.04
env:
TEST_DIR: /var/tmp/zloop
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v3
with:
ref: ${{ github.event.pull_request.head.sha }}
- name: Install dependencies
@@ -50,7 +50,7 @@ jobs:
if: failure()
run: |
sudo chmod +r -R $TEST_DIR/
- uses: actions/upload-artifact@v2
- uses: actions/upload-artifact@v3
if: failure()
with:
name: Logs
@@ -58,7 +58,7 @@ jobs:
/var/tmp/zloop/*/
!/var/tmp/zloop/*/vdev/
if-no-files-found: ignore
- uses: actions/upload-artifact@v2
- uses: actions/upload-artifact@v3
if: failure()
with:
name: Pool files
+2 -2
View File
@@ -1,10 +1,10 @@
Meta: 1
Name: zfs
Branch: 1.0
Version: 2.1.6
Version: 2.1.7
Release: 1
Release-Tags: relext
License: CDDL
Author: OpenZFS
Linux-Maximum: 5.19
Linux-Maximum: 6.0
Linux-Minimum: 3.10
+4 -4
View File
@@ -686,9 +686,9 @@ def section_archits(kstats_dict):
print()
print('Cache hits by data type:')
dt_todo = (('Demand data:', arc_stats['demand_data_hits']),
('Demand prefetch data:', arc_stats['prefetch_data_hits']),
('Prefetch data:', arc_stats['prefetch_data_hits']),
('Demand metadata:', arc_stats['demand_metadata_hits']),
('Demand prefetch metadata:',
('Prefetch metadata:',
arc_stats['prefetch_metadata_hits']))
for title, value in dt_todo:
@@ -697,10 +697,10 @@ def section_archits(kstats_dict):
print()
print('Cache misses by data type:')
dm_todo = (('Demand data:', arc_stats['demand_data_misses']),
('Demand prefetch data:',
('Prefetch data:',
arc_stats['prefetch_data_misses']),
('Demand metadata:', arc_stats['demand_metadata_misses']),
('Demand prefetch metadata:',
('Prefetch metadata:',
arc_stats['prefetch_metadata_misses']))
for title, value in dm_todo:
+1 -1
View File
@@ -2995,7 +2995,7 @@ open_objset(const char *path, void *tag, objset_t **osp)
}
sa_os = *osp;
return (0);
return (err);
}
static void
+1
View File
@@ -599,6 +599,7 @@ fmd_timer_install(fmd_hdl_t *hdl, void *arg, fmd_event_t *ep, hrtime_t delta)
sev.sigev_notify_function = _timer_notify;
sev.sigev_notify_attributes = NULL;
sev.sigev_value.sival_ptr = ftp;
sev.sigev_signo = 0;
timer_create(CLOCK_REALTIME, &sev, &ftp->ft_tid);
timer_settime(ftp->ft_tid, 0, &its, NULL);
+8 -10
View File
@@ -938,14 +938,13 @@ vdev_whole_disk_from_config(zpool_handle_t *zhp, const char *vdev_path)
{
nvlist_t *nvl = NULL;
boolean_t avail_spare, l2cache, log;
uint64_t wholedisk;
uint64_t wholedisk = 0;
nvl = zpool_find_vdev(zhp, vdev_path, &avail_spare, &l2cache, &log);
if (!nvl)
return (0);
verify(nvlist_lookup_uint64(nvl, ZPOOL_CONFIG_WHOLE_DISK,
&wholedisk) == 0);
(void) nvlist_lookup_uint64(nvl, ZPOOL_CONFIG_WHOLE_DISK, &wholedisk);
return (wholedisk);
}
@@ -965,7 +964,7 @@ zfsdle_vdev_online(zpool_handle_t *zhp, void *data)
nvlist_t *tgt;
int error;
char *tmp_devname, devname[MAXPATHLEN];
char *tmp_devname, devname[MAXPATHLEN] = "";
uint64_t guid;
if (nvlist_lookup_uint64(udev_nvl, ZFS_EV_VDEV_GUID, &guid) == 0) {
@@ -984,7 +983,7 @@ zfsdle_vdev_online(zpool_handle_t *zhp, void *data)
if ((tgt = zpool_find_vdev_by_physpath(zhp, devname,
&avail_spare, &l2cache, NULL)) != NULL) {
char *path, fullpath[MAXPATHLEN];
uint64_t wholedisk;
uint64_t wholedisk = 0;
error = nvlist_lookup_string(tgt, ZPOOL_CONFIG_PATH, &path);
if (error) {
@@ -992,10 +991,8 @@ zfsdle_vdev_online(zpool_handle_t *zhp, void *data)
return (0);
}
error = nvlist_lookup_uint64(tgt, ZPOOL_CONFIG_WHOLE_DISK,
(void) nvlist_lookup_uint64(tgt, ZPOOL_CONFIG_WHOLE_DISK,
&wholedisk);
if (error)
wholedisk = 0;
if (wholedisk) {
path = strrchr(path, '/');
@@ -1125,6 +1122,7 @@ zfs_deliver_dle(nvlist_t *nvl)
strlcpy(name, devname, MAXPATHLEN);
zfs_append_partition(name, MAXPATHLEN);
} else {
sprintf(name, "unknown");
zed_log_msg(LOG_INFO, "zfs_deliver_dle: no guid or physpath");
}
@@ -1214,7 +1212,7 @@ zfs_enum_pools(void *arg)
* For now, each agent has its own libzfs instance
*/
int
zfs_slm_init()
zfs_slm_init(void)
{
if ((g_zfshdl = libzfs_init()) == NULL)
return (-1);
@@ -1240,7 +1238,7 @@ zfs_slm_init()
}
void
zfs_slm_fini()
zfs_slm_fini(void)
{
unavailpool_t *pool;
pendingdev_t *device;
+4 -4
View File
@@ -402,7 +402,7 @@ zed_udev_monitor(void *arg)
}
int
zed_disk_event_init()
zed_disk_event_init(void)
{
int fd, fflags;
@@ -438,7 +438,7 @@ zed_disk_event_init()
}
void
zed_disk_event_fini()
zed_disk_event_fini(void)
{
/* cancel monitor thread at recvmsg() */
(void) pthread_cancel(g_mon_tid);
@@ -456,13 +456,13 @@ zed_disk_event_fini()
#include "zed_disk_event.h"
int
zed_disk_event_init()
zed_disk_event_init(void)
{
return (0);
}
void
zed_disk_event_fini()
zed_disk_event_fini(void)
{
}
+1 -1
View File
@@ -8535,7 +8535,7 @@ static int
zfs_do_wait(int argc, char **argv)
{
boolean_t enabled[ZFS_WAIT_NUM_ACTIVITIES];
int error, i;
int error = 0, i;
int c;
/* By default, wait for all types of activity. */
+3 -3
View File
@@ -363,9 +363,6 @@ zstream_do_dump(int argc, char *argv[])
BSWAP_64(drrb->drr_fromguid);
}
featureflags =
DMU_GET_FEATUREFLAGS(drrb->drr_versioninfo);
(void) printf("BEGIN record\n");
(void) printf("\thdrtype = %lld\n",
DMU_GET_STREAM_HDRTYPE(drrb->drr_versioninfo));
@@ -465,6 +462,9 @@ zstream_do_dump(int argc, char *argv[])
BSWAP_64(drro->drr_maxblkid);
}
featureflags =
DMU_GET_FEATUREFLAGS(drrb->drr_versioninfo);
if (featureflags & DMU_BACKUP_FEATURE_RAW &&
drro->drr_bonuslen > drro->drr_raw_bonuslen) {
(void) fprintf(stderr,
+1
View File
@@ -2193,6 +2193,7 @@ ztest_replay_write(void *arg1, void *arg2, boolean_t byteswap)
* but not always, because we also want to verify correct
* behavior when the data was not recently read into cache.
*/
ASSERT(doi.doi_data_block_size);
ASSERT0(offset % doi.doi_data_block_size);
if (ztest_random(4) != 0) {
int prefetch = ztest_random(2) ?
+7
View File
@@ -109,6 +109,13 @@ while [ "$outer_loop" -lt 20 ]; do
exit 0
fi
fi
#
# zvol_count made some progress - let's stay in this loop.
#
if [ "$old_zvols_count" -gt "$zvols_count" ]; then
outer_loop=$((outer_loop - 1))
fi
done
echo "Timed out waiting on zvol links"
+10
View File
@@ -46,6 +46,16 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_PYZFS], [
])
AC_SUBST(DEFINE_PYZFS)
dnl #
dnl # Autodetection disables pyzfs if kernel or srpm config
dnl #
AS_IF([test "x$enable_pyzfs" = xcheck], [
AS_IF([test "x$ZFS_CONFIG" = xkernel -o "x$ZFS_CONFIG" = xsrpm ], [
enable_pyzfs=no
AC_MSG_NOTICE([Disabling pyzfs for kernel/srpm config])
])
])
dnl #
dnl # Python "packaging" (or, failing that, "distlib") module is required to build and install pyzfs
dnl #
+1 -2
View File
@@ -7,8 +7,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_ADD_DISK], [
#include <linux/blkdev.h>
], [
struct gendisk *disk = NULL;
int err = add_disk(disk);
err = err;
int error __attribute__ ((unused)) = add_disk(disk);
])
])
+30
View File
@@ -0,0 +1,30 @@
dnl #
dnl # 3.18 API change
dnl # Dentry aliases are in d_u struct dentry member
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_DENTRY_ALIAS_D_U], [
ZFS_LINUX_TEST_SRC([dentry_alias_d_u], [
#include <linux/fs.h>
#include <linux/dcache.h>
#include <linux/list.h>
], [
struct inode *inode __attribute__ ((unused)) = NULL;
struct dentry *dentry __attribute__ ((unused)) = NULL;
hlist_for_each_entry(dentry, &inode->i_dentry,
d_u.d_alias) {
d_drop(dentry);
}
])
])
AC_DEFUN([ZFS_AC_KERNEL_DENTRY_ALIAS_D_U], [
AC_MSG_CHECKING([whether dentry aliases are in d_u member])
ZFS_LINUX_TEST_RESULT([dentry_alias_d_u], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_DENTRY_D_U_ALIASES, 1,
[dentry aliases are in d_u member])
],[
AC_MSG_RESULT(no)
])
])
-33
View File
@@ -1,33 +0,0 @@
dnl #
dnl # Grsecurity kernel API change
dnl # constified parameters of module_param_call() methods
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_MODULE_PARAM_CALL_CONST], [
ZFS_LINUX_TEST_SRC([module_param_call], [
#include <linux/module.h>
#include <linux/moduleparam.h>
int param_get(char *b, const struct kernel_param *kp)
{
return (0);
}
int param_set(const char *b, const struct kernel_param *kp)
{
return (0);
}
module_param_call(p, param_set, param_get, NULL, 0644);
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_MODULE_PARAM_CALL_CONST], [
AC_MSG_CHECKING([whether module_param_call() is hardened])
ZFS_LINUX_TEST_RESULT([module_param_call], [
AC_MSG_RESULT(yes)
AC_DEFINE(MODULE_PARAM_CALL_CONST, 1,
[hardened module_param_call])
],[
AC_MSG_RESULT(no)
])
])
+13 -6
View File
@@ -93,6 +93,7 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_SRC], [
ZFS_AC_KERNEL_SRC_SETATTR_PREPARE
ZFS_AC_KERNEL_SRC_INSERT_INODE_LOCKED
ZFS_AC_KERNEL_SRC_DENTRY
ZFS_AC_KERNEL_SRC_DENTRY_ALIAS_D_U
ZFS_AC_KERNEL_SRC_TRUNCATE_SETSIZE
ZFS_AC_KERNEL_SRC_SECURITY_INODE
ZFS_AC_KERNEL_SRC_FST_MOUNT
@@ -119,7 +120,6 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_SRC], [
ZFS_AC_KERNEL_SRC_FMODE_T
ZFS_AC_KERNEL_SRC_KUIDGID_T
ZFS_AC_KERNEL_SRC_KUID_HELPERS
ZFS_AC_KERNEL_SRC_MODULE_PARAM_CALL_CONST
ZFS_AC_KERNEL_SRC_RENAME
ZFS_AC_KERNEL_SRC_CURRENT_TIME
ZFS_AC_KERNEL_SRC_USERNS_CAPABILITIES
@@ -210,6 +210,7 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_RESULT], [
ZFS_AC_KERNEL_SETATTR_PREPARE
ZFS_AC_KERNEL_INSERT_INODE_LOCKED
ZFS_AC_KERNEL_DENTRY
ZFS_AC_KERNEL_DENTRY_ALIAS_D_U
ZFS_AC_KERNEL_TRUNCATE_SETSIZE
ZFS_AC_KERNEL_SECURITY_INODE
ZFS_AC_KERNEL_FST_MOUNT
@@ -236,7 +237,6 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_RESULT], [
ZFS_AC_KERNEL_FMODE_T
ZFS_AC_KERNEL_KUIDGID_T
ZFS_AC_KERNEL_KUID_HELPERS
ZFS_AC_KERNEL_MODULE_PARAM_CALL_CONST
ZFS_AC_KERNEL_RENAME
ZFS_AC_KERNEL_CURRENT_TIME
ZFS_AC_KERNEL_USERNS_CAPABILITIES
@@ -934,8 +934,15 @@ dnl # like ZFS_LINUX_TRY_COMPILE, except the contents conftest.h are
dnl # provided via the fifth parameter
dnl #
AC_DEFUN([ZFS_LINUX_TRY_COMPILE_HEADER], [
ZFS_LINUX_COMPILE_IFELSE(
[ZFS_LINUX_TEST_PROGRAM([[$1]], [[$2]], [[ZFS_META_LICENSE]])],
[test -f build/conftest/conftest.ko],
[$3], [$4], [$5])
AS_IF([test "x$enable_linux_builtin" = "xyes"], [
ZFS_LINUX_COMPILE_IFELSE(
[ZFS_LINUX_TEST_PROGRAM([[$1]], [[$2]],
[[ZFS_META_LICENSE]])],
[test -f build/conftest/conftest.o], [$3], [$4], [$5])
], [
ZFS_LINUX_COMPILE_IFELSE(
[ZFS_LINUX_TEST_PROGRAM([[$1]], [[$2]],
[[ZFS_META_LICENSE]])],
[test -f build/conftest/conftest.ko], [$3], [$4], [$5])
])
])
@@ -8,5 +8,5 @@ ConditionKernelCommandLine=bootfs.rollback
[Service]
Type=oneshot
ExecStart=/bin/sh -c '. /lib/dracut-zfs-lib.sh; decode_root_args || exit; [ "$root" = "zfs:AUTO" ] && root="$BOOTFS" SNAPNAME="$(getarg bootfs.rollback)"; exec @sbindir@/zfs rollback -Rf "$root@${SNAPNAME:-%v}"'
ExecStart=/bin/sh -c '. /lib/dracut-zfs-lib.sh; decode_root_args || exit; [ "$root" = "zfs:AUTO" ] && root="$BOOTFS"; SNAPNAME="$(getarg bootfs.rollback)"; exec @sbindir@/zfs rollback -Rf "$root@${SNAPNAME:-%v}"'
RemainAfterExit=yes
@@ -8,5 +8,5 @@ ConditionKernelCommandLine=bootfs.snapshot
[Service]
Type=oneshot
-ExecStart=/bin/sh -c '. /lib/dracut-zfs-lib.sh; decode_root_args || exit; [ "$root" = "zfs:AUTO" ] && root="$BOOTFS" SNAPNAME="$(getarg bootfs.snapshot)"; exec @sbindir@/zfs snapshot "$root@${SNAPNAME:-%v}"'
ExecStart=-/bin/sh -c '. /lib/dracut-zfs-lib.sh; decode_root_args || exit; [ "$root" = "zfs:AUTO" ] && root="$BOOTFS"; SNAPNAME="$(getarg bootfs.snapshot)"; exec @sbindir@/zfs snapshot "$root@${SNAPNAME:-%v}"'
RemainAfterExit=yes
+3 -3
View File
@@ -326,7 +326,7 @@ mount_fs()
# Need the _original_ datasets mountpoint!
mountpoint=$(get_fs_value "$fs" mountpoint)
ZFS_CMD="mount -o zfsutil -t zfs"
ZFS_CMD="mount.zfs -o zfsutil"
if [ "$mountpoint" = "legacy" ] || [ "$mountpoint" = "none" ]; then
# Can't use the mountpoint property. Might be one of our
# clones. Check the 'org.zol:mountpoint' property set in
@@ -349,7 +349,7 @@ mount_fs()
# If it's not a legacy filesystem, it can only be a
# native one...
if [ "$mountpoint" = "legacy" ]; then
ZFS_CMD="mount -t zfs"
ZFS_CMD="mount.zfs"
fi
fi
@@ -915,7 +915,7 @@ mountroot()
echo " not specified on the kernel command line."
echo ""
echo "Manually mount the root filesystem on $rootmnt and then exit."
echo "Hint: Try: mount -o zfsutil -t zfs ${ZFS_RPOOL-rpool}/ROOT/system $rootmnt"
echo "Hint: Try: mount.zfs -o zfsutil ${ZFS_RPOOL-rpool}/ROOT/system $rootmnt"
shell
fi
+11 -3
View File
@@ -496,7 +496,6 @@ zfs_key_config_get_dataset(zfs_key_config_t *config)
if (zhp == NULL) {
pam_syslog(NULL, LOG_ERR, "dataset %s not found",
config->homes_prefix);
zfs_close(zhp);
return (NULL);
}
@@ -508,6 +507,10 @@ zfs_key_config_get_dataset(zfs_key_config_t *config)
return (dsname);
}
if (config->homes_prefix == NULL) {
return (NULL);
}
size_t len = ZFS_MAX_DATASET_NAME_LEN;
size_t total_len = strlen(config->homes_prefix) + 1
+ strlen(config->username);
@@ -711,7 +714,10 @@ pam_sm_open_session(pam_handle_t *pamh, int flags,
return (PAM_SUCCESS);
}
zfs_key_config_t config;
zfs_key_config_load(pamh, &config, argc, argv);
if (zfs_key_config_load(pamh, &config, argc, argv) != 0) {
return (PAM_SESSION_ERR);
}
if (config.uid < 1000) {
zfs_key_config_free(&config);
return (PAM_SUCCESS);
@@ -765,7 +771,9 @@ pam_sm_close_session(pam_handle_t *pamh, int flags,
return (PAM_SUCCESS);
}
zfs_key_config_t config;
zfs_key_config_load(pamh, &config, argc, argv);
if (zfs_key_config_load(pamh, &config, argc, argv) != 0) {
return (PAM_SESSION_ERR);
}
if (config.uid < 1000) {
zfs_key_config_free(&config);
return (PAM_SUCCESS);
+1
View File
@@ -22,3 +22,4 @@ SUBSTFILES += $(systemdpreset_DATA) $(systemdunit_DATA)
install-data-hook:
$(MKDIR_P) "$(DESTDIR)$(systemdunitdir)"
ln -sf /dev/null "$(DESTDIR)$(systemdunitdir)/zfs-import.service"
ln -sf /dev/null "$(DESTDIR)$(systemdunitdir)/zfs-load-key.service"
@@ -5,7 +5,7 @@ DefaultDependencies=no
Requires=systemd-udev-settle.service
After=systemd-udev-settle.service
After=cryptsetup.target
After=multipathd.target
After=multipathd.service
After=systemd-remount-fs.service
Before=zfs-import.target
ConditionFileNotEmpty=@sysconfdir@/zfs/zpool.cache
@@ -5,7 +5,7 @@ DefaultDependencies=no
Requires=systemd-udev-settle.service
After=systemd-udev-settle.service
After=cryptsetup.target
After=multipathd.target
After=multipathd.service
Before=zfs-import.target
ConditionFileNotEmpty=!@sysconfdir@/zfs/zpool.cache
ConditionPathIsDirectory=/sys/module/zfs
+6 -5
View File
@@ -150,6 +150,7 @@ typedef enum zfs_error {
EZFS_NO_RESILVER_DEFER, /* pool doesn't support resilver_defer */
EZFS_EXPORT_IN_PROGRESS, /* currently exporting the pool */
EZFS_REBUILDING, /* resilvering (sequential reconstrution) */
EZFS_CKSUM, /* insufficient replicas */
EZFS_UNKNOWN
} zfs_error_t;
@@ -257,10 +258,10 @@ extern int zpool_add(zpool_handle_t *, nvlist_t *);
typedef struct splitflags {
/* do not split, but return the config that would be split off */
int dryrun : 1;
unsigned int dryrun : 1;
/* after splitting, import the pool */
int import : 1;
unsigned int import : 1;
int name_flags;
} splitflags_t;
@@ -649,13 +650,13 @@ extern int zfs_rollback(zfs_handle_t *, zfs_handle_t *, boolean_t);
typedef struct renameflags {
/* recursive rename */
int recursive : 1;
unsigned int recursive : 1;
/* don't unmount file systems */
int nounmount : 1;
unsigned int nounmount : 1;
/* force unmount file systems */
int forceunmount : 1;
unsigned int forceunmount : 1;
} renameflags_t;
extern int zfs_rename(zfs_handle_t *, const char *, renameflags_t);
-1
View File
@@ -83,7 +83,6 @@
#define __printf(a, b) __printflike(a, b)
#define barrier() __asm__ __volatile__("": : :"memory")
#define smp_rmb() rmb()
#define ___PASTE(a, b) a##b
#define __PASTE(a, b) ___PASTE(a, b)
+2 -1
View File
@@ -57,7 +57,8 @@ extern uint64_t atomic_cas_64(volatile uint64_t *target, uint64_t cmp,
uint64_t newval);
#endif
#define membar_producer atomic_thread_fence_rel
#define membar_consumer() atomic_thread_fence_acq()
#define membar_producer() atomic_thread_fence_rel()
static __inline uint32_t
atomic_add_32_nv(volatile uint32_t *target, int32_t delta)
+2
View File
@@ -95,9 +95,11 @@ vn_flush_cached_data(vnode_t *vp, boolean_t sync)
if (vp->v_object->flags & OBJ_MIGHTBEDIRTY) {
#endif
int flags = sync ? OBJPC_SYNC : 0;
vn_lock(vp, LK_SHARED | LK_RETRY);
zfs_vmobject_wlock(vp->v_object);
vm_object_page_clean(vp->v_object, 0, 0, flags);
zfs_vmobject_wunlock(vp->v_object);
VOP_UNLOCK(vp);
}
}
@@ -61,4 +61,25 @@ d_clear_d_op(struct dentry *dentry)
DCACHE_OP_REVALIDATE | DCACHE_OP_DELETE);
}
/*
* Walk and invalidate all dentry aliases of an inode
* unless it's a mountpoint
*/
static inline void
zpl_d_drop_aliases(struct inode *inode)
{
struct dentry *dentry;
spin_lock(&inode->i_lock);
#ifdef HAVE_DENTRY_D_U_ALIASES
hlist_for_each_entry(dentry, &inode->i_dentry, d_u.d_alias) {
#else
hlist_for_each_entry(dentry, &inode->i_dentry, d_alias) {
#endif
if (!IS_ROOT(dentry) && !d_mountpoint(dentry) &&
(dentry->d_inode == inode)) {
d_drop(dentry);
}
}
spin_unlock(&inode->i_lock);
}
#endif /* _ZFS_DCACHE_H */
+9 -5
View File
@@ -30,12 +30,16 @@
#include <linux/module.h>
#include <linux/moduleparam.h>
/* Grsecurity kernel API change */
#ifdef MODULE_PARAM_CALL_CONST
/*
* Despite constifying struct kernel_param_ops, some older kernels define a
* `__check_old_set_param()` function in their headers that checks for a
* non-constified `->set()`. This has long been fixed in Linux mainline, but
* since we support older kernels, we workaround it by using a preprocessor
* definition to disable it.
*/
#define __check_old_set_param(_) (0)
typedef const struct kernel_param zfs_kernel_param_t;
#else
typedef struct kernel_param zfs_kernel_param_t;
#endif
#define ZMOD_RW 0644
#define ZMOD_RD 0444
+2
View File
@@ -44,7 +44,9 @@
#define zfs_totalhigh_pages totalhigh_pages
#endif
#define membar_consumer() smp_rmb()
#define membar_producer() smp_wmb()
#define physmem zfs_totalram_pages
#define xcopyin(from, to, size) copy_from_user(to, from, size)
+2 -4
View File
@@ -62,7 +62,6 @@ DECLARE_EVENT_CLASS(zfs_ace_class,
__field(boolean_t, z_is_sa)
__field(boolean_t, z_is_mapped)
__field(boolean_t, z_is_ctldir)
__field(boolean_t, z_is_stale)
__field(uint32_t, i_uid)
__field(uint32_t, i_gid)
@@ -95,7 +94,6 @@ DECLARE_EVENT_CLASS(zfs_ace_class,
__entry->z_is_sa = zn->z_is_sa;
__entry->z_is_mapped = zn->z_is_mapped;
__entry->z_is_ctldir = zn->z_is_ctldir;
__entry->z_is_stale = zn->z_is_stale;
__entry->i_uid = KUID_TO_SUID(ZTOI(zn)->i_uid);
__entry->i_gid = KGID_TO_SGID(ZTOI(zn)->i_gid);
@@ -117,7 +115,7 @@ DECLARE_EVENT_CLASS(zfs_ace_class,
"zn_prefetch %u blksz %u seq %u "
"mapcnt %llu size %llu pflags %llu "
"sync_cnt %u mode 0x%x is_sa %d "
"is_mapped %d is_ctldir %d is_stale %d inode { "
"is_mapped %d is_ctldir %d inode { "
"uid %u gid %u ino %lu nlink %u size %lli "
"blkbits %u bytes %u mode 0x%x generation %x } } "
"ace { type %u flags %u access_mask %u } mask_matched %u",
@@ -126,7 +124,7 @@ DECLARE_EVENT_CLASS(zfs_ace_class,
__entry->z_seq, __entry->z_mapcnt, __entry->z_size,
__entry->z_pflags, __entry->z_sync_cnt, __entry->z_mode,
__entry->z_is_sa, __entry->z_is_mapped,
__entry->z_is_ctldir, __entry->z_is_stale, __entry->i_uid,
__entry->z_is_ctldir, __entry->i_uid,
__entry->i_gid, __entry->i_ino, __entry->i_nlink,
__entry->i_size, __entry->i_blkbits,
__entry->i_bytes, __entry->i_mode, __entry->i_generation,
+2 -1
View File
@@ -45,7 +45,8 @@ extern const struct inode_operations zpl_inode_operations;
extern const struct inode_operations zpl_dir_inode_operations;
extern const struct inode_operations zpl_symlink_inode_operations;
extern const struct inode_operations zpl_special_inode_operations;
extern dentry_operations_t zpl_dentry_operations;
/* zpl_file.c */
extern const struct address_space_operations zpl_address_space_operations;
extern const struct file_operations zpl_file_operations;
extern const struct file_operations zpl_dir_file_operations;
+7
View File
@@ -27,6 +27,7 @@
* Copyright (c) 2014 Spectra Logic Corporation, All rights reserved.
* Copyright 2013 Saso Kiselkov. All rights reserved.
* Copyright (c) 2017, Intel Corporation.
* Copyright (c) 2022 Hewlett Packard Enterprise Development LP.
*/
/* Portions Copyright 2010 Robert Milkowski */
@@ -142,6 +143,12 @@ typedef enum dmu_object_byteswap {
#define DMU_OT_IS_DDT(ot) \
((ot) == DMU_OT_DDT_ZAP)
#define DMU_OT_IS_CRITICAL(ot) \
(DMU_OT_IS_METADATA(ot) && \
(ot) != DMU_OT_DNODE && \
(ot) != DMU_OT_DIRECTORY_CONTENTS && \
(ot) != DMU_OT_SA)
/* Note: ztest uses DMU_OT_UINT64_OTHER as a proxy for file blocks */
#define DMU_OT_IS_FILE(ot) \
((ot) == DMU_OT_PLAIN_FILE_CONTENTS || (ot) == DMU_OT_UINT64_OTHER)
+4 -1
View File
@@ -29,6 +29,7 @@
* Copyright (c) 2019 Datto Inc.
* Portions Copyright 2010 Robert Milkowski
* Copyright (c) 2021, Colm Buckley <colm@tuatha.org>
* Copyright (c) 2022 Hewlett Packard Enterprise Development LP.
*/
#ifndef _SYS_FS_ZFS_H
@@ -423,7 +424,9 @@ typedef enum {
typedef enum {
ZFS_REDUNDANT_METADATA_ALL,
ZFS_REDUNDANT_METADATA_MOST
ZFS_REDUNDANT_METADATA_MOST,
ZFS_REDUNDANT_METADATA_SOME,
ZFS_REDUNDANT_METADATA_NONE
} zfs_redundant_metadata_type_t;
typedef enum {
-1
View File
@@ -190,7 +190,6 @@ typedef struct znode {
boolean_t z_is_sa; /* are we native sa? */
boolean_t z_is_mapped; /* are we mmap'ed */
boolean_t z_is_ctldir; /* are we .zfs entry */
boolean_t z_is_stale; /* are we stale due to rollback? */
boolean_t z_suspended; /* extra ref from a suspend? */
uint_t z_blksz; /* block size in bytes */
uint_t z_seq; /* modification sequence number */
+1 -1
View File
@@ -26,7 +26,7 @@
#include <zone.h>
zoneid_t
getzoneid()
getzoneid(void)
{
return (GLOBAL_ZONEID);
}
+4 -4
View File
@@ -59,8 +59,8 @@
static boolean_t zpool_vdev_is_interior(const char *name);
typedef struct prop_flags {
int create:1; /* Validate property on creation */
int import:1; /* Validate property on import */
unsigned int create:1; /* Validate property on creation */
unsigned int import:1; /* Validate property on import */
} prop_flags_t;
/*
@@ -4788,8 +4788,8 @@ zpool_load_compat(const char *compat, boolean_t *features, char *report,
for (uint_t i = 0; i < SPA_FEATURES; i++)
features[i] = B_TRUE;
char err_badfile[1024] = "";
char err_badtoken[1024] = "";
char err_badfile[ZFS_MAXPROPLEN] = "";
char err_badtoken[ZFS_MAXPROPLEN] = "";
/*
* We ignore errors from the directory open()
+2 -2
View File
@@ -4013,8 +4013,8 @@ zfs_setup_cmdline_props(libzfs_handle_t *hdl, zfs_type_t type,
* properties: if we're asked to exclude this kind of
* values we remove them from "recvprops" input nvlist.
*/
if (!zfs_prop_inheritable(prop) &&
!zfs_prop_user(name) && /* can be inherited too */
if (!zfs_prop_user(name) && /* can be inherited too */
!zfs_prop_inheritable(prop) &&
nvlist_exists(recvprops, name))
fnvlist_remove(recvprops, name);
else
+7 -1
View File
@@ -170,6 +170,8 @@ libzfs_error_description(libzfs_handle_t *hdl)
return (dgettext(TEXT_DOMAIN, "I/O error"));
case EZFS_INTR:
return (dgettext(TEXT_DOMAIN, "signal received"));
case EZFS_CKSUM:
return (dgettext(TEXT_DOMAIN, "insufficient replicas"));
case EZFS_ISSPARE:
return (dgettext(TEXT_DOMAIN, "device is reserved as a hot "
"spare"));
@@ -392,6 +394,10 @@ zfs_common_error(libzfs_handle_t *hdl, int error, const char *fmt,
case EINTR:
zfs_verror(hdl, EZFS_INTR, fmt, ap);
return (-1);
case ECKSUM:
zfs_verror(hdl, EZFS_CKSUM, fmt, ap);
return (-1);
}
return (0);
@@ -674,7 +680,7 @@ zpool_standard_error_fmt(libzfs_handle_t *hdl, int error, const char *fmt, ...)
case ENOSPC:
case EDQUOT:
zfs_verror(hdl, EZFS_NOSPC, fmt, ap);
return (-1);
break;
case EAGAIN:
zfs_error_aux(hdl, dgettext(TEXT_DOMAIN,
+11 -8
View File
@@ -233,7 +233,7 @@ lzc_ioctl(zfs_ioc_t ioc, const char *name,
break;
}
}
if (zc.zc_nvlist_dst_filled) {
if (zc.zc_nvlist_dst_filled && resultp != NULL) {
*resultp = fnvlist_unpack((void *)(uintptr_t)zc.zc_nvlist_dst,
zc.zc_nvlist_dst_size);
}
@@ -890,7 +890,8 @@ recv_impl(const char *snapname, nvlist_t *recvdprops, nvlist_t *localprops,
fnvlist_free(outnvl);
} else {
zfs_cmd_t zc = {"\0"};
char *packed = NULL;
char *rp_packed = NULL;
char *lp_packed = NULL;
size_t size;
ASSERT3S(g_refcount, >, 0);
@@ -899,14 +900,14 @@ recv_impl(const char *snapname, nvlist_t *recvdprops, nvlist_t *localprops,
(void) strlcpy(zc.zc_value, snapname, sizeof (zc.zc_value));
if (recvdprops != NULL) {
packed = fnvlist_pack(recvdprops, &size);
zc.zc_nvlist_src = (uint64_t)(uintptr_t)packed;
rp_packed = fnvlist_pack(recvdprops, &size);
zc.zc_nvlist_src = (uint64_t)(uintptr_t)rp_packed;
zc.zc_nvlist_src_size = size;
}
if (localprops != NULL) {
packed = fnvlist_pack(localprops, &size);
zc.zc_nvlist_conf = (uint64_t)(uintptr_t)packed;
lp_packed = fnvlist_pack(localprops, &size);
zc.zc_nvlist_conf = (uint64_t)(uintptr_t)lp_packed;
zc.zc_nvlist_conf_size = size;
}
@@ -941,8 +942,10 @@ recv_impl(const char *snapname, nvlist_t *recvdprops, nvlist_t *localprops,
zc.zc_nvlist_dst_size, errors, KM_SLEEP));
}
if (packed != NULL)
fnvlist_pack_free(packed, size);
if (rp_packed != NULL)
fnvlist_pack_free(rp_packed, size);
if (lp_packed != NULL)
fnvlist_pack_free(lp_packed, size);
free((void *)(uintptr_t)zc.zc_nvlist_dst);
}
+5 -3
View File
@@ -173,12 +173,13 @@ set_global_var_parse_kv(const char *arg, char **k_out, u_longlong_t *v_out)
goto err_free;
}
*k_out = k;
*k_out = strdup(k);
*v_out = val;
free(d);
return (0);
err_free:
free(k);
free(d);
return (err);
}
@@ -227,13 +228,14 @@ set_global_var(char const *arg)
fprintf(stderr, "Failed to open libzpool.so to set global "
"variable\n");
ret = EIO;
goto out_dlclose;
goto out_free;
}
ret = 0;
out_dlclose:
dlclose(zpoolhdl);
out_free:
free(varname);
out_ret:
return (ret);
@@ -265,7 +265,6 @@ zfs_get_pci_slots_sys_path(const char *dev_name)
free(address2);
if (asprintf(&path, "/sys/bus/pci/slots/%s",
ep->d_name) == -1) {
free(tmp);
continue;
}
break;
+2 -3
View File
@@ -467,11 +467,9 @@ get_configs(libpc_handle_t *hdl, pool_list_t *pl, boolean_t active_ok,
uint64_t guid;
uint_t children = 0;
nvlist_t **child = NULL;
uint_t holes;
uint64_t *hole_array, max_id;
uint_t c;
boolean_t isactive;
uint64_t hostid;
nvlist_t *nvl;
boolean_t valid_top_config = B_FALSE;
@@ -479,7 +477,8 @@ get_configs(libpc_handle_t *hdl, pool_list_t *pl, boolean_t active_ok,
goto nomem;
for (pe = pl->pools; pe != NULL; pe = pe->pe_next) {
uint64_t id, max_txg = 0;
uint64_t id, max_txg = 0, hostid = 0;
uint_t holes = 0;
if (nvlist_alloc(&config, NV_UNIQUE_NAME, 0) != 0)
goto nomem;
+8 -1
View File
@@ -1222,6 +1222,13 @@ Ideally, this will be at least the sum of each queue's
.Sy max_active .
.No See Sx ZFS I/O SCHEDULER .
.
.It Sy zfs_vdev_open_timeout_ms Ns = Ns Sy 1000 Pq uint
Timeout value to wait before determining a device is missing
during import.
This is helpful for transient missing paths due
to links being briefly removed and recreated in response to
udev events.
.
.It Sy zfs_vdev_rebuild_max_active Ns = Ns Sy 3 Pq int
Maximum sequential resilver I/O operations active to each device.
.No See Sx ZFS I/O SCHEDULER .
@@ -1651,7 +1658,7 @@ prefetched during a pool traversal, like
.Nm zfs Cm send
or other data crawling operations.
.
.It Sy zfs_per_txg_dirty_frees_percent Ns = Ns Sy 5 Ns % Pq ulong
.It Sy zfs_per_txg_dirty_frees_percent Ns = Ns Sy 30 Ns % Pq ulong
Control percentage of dirtied indirect blocks from frees allowed into one TXG.
After this threshold is crossed, additional frees will wait until the next TXG.
.Sy 0 No disables this throttle.
+15 -3
View File
@@ -36,8 +36,9 @@
.\" Copyright 2018 Nexenta Systems, Inc.
.\" Copyright 2019 Joyent, Inc.
.\" Copyright (c) 2019, Kjeld Schouten-Lebbing
.\" Copyright (c) 2022 Hewlett Packard Enterprise Development LP.
.\"
.Dd May 24, 2021
.Dd July 21, 2022
.Dt ZFSPROPS 7
.Os
.
@@ -1445,7 +1446,7 @@ affects only files created afterward; existing files are unaffected.
.Pp
This property can also be referred to by its shortened column name,
.Sy recsize .
.It Sy redundant_metadata Ns = Ns Sy all Ns | Ns Sy most
.It Sy redundant_metadata Ns = Ns Sy all Ns | Ns Sy most Ns | Ns Sy some Ns | Ns Sy none
Controls what types of metadata are stored redundantly.
ZFS stores an extra copy of metadata, so that if a single block is corrupted,
the amount of user data lost is limited.
@@ -1477,7 +1478,7 @@ When set to
ZFS stores an extra copy of most types of metadata.
This can improve performance of random writes, because less metadata must be
written.
In practice, at worst about 100 blocks
In practice, at worst about 1000 blocks
.Po of
.Sy recordsize
bytes each
@@ -1486,6 +1487,17 @@ of user data can be lost if a single on-disk block is corrupt.
The exact behavior of which metadata blocks are stored redundantly may change in
future releases.
.Pp
When set to
.Sy some ,
ZFS stores an extra copy of only critical metadata.
This can improve file create performance since less metadata needs to be written.
If a single on-disk block is corrupt, at worst a single user file can be lost.
.Pp
When set to
.Sy none ,
ZFS does not store any copies of metadata redundantly.
If a single on-disk block is corrupt, an entire dataset can be lost.
.Pp
The default value is
.Sy all .
.It Sy refquota Ns = Ns Ar size Ns | Ns Sy none
+2 -5
View File
@@ -60,12 +60,9 @@ $(MODULE)-$(CONFIG_X86) += algs/modes/gcm_pclmulqdq.o
$(MODULE)-$(CONFIG_X86) += algs/aes/aes_impl_aesni.o
$(MODULE)-$(CONFIG_X86) += algs/aes/aes_impl_x86-64.o
# Suppress objtool "can't find jump dest instruction at" warnings. They
# are caused by the constants which are defined in the text section of the
# assembly file using .byte instructions (e.g. bswap_mask). The objtool
# utility tries to interpret them as opcodes and obviously fails doing so.
# Suppress objtool "return with modified stack frame" warnings.
OBJECT_FILES_NON_STANDARD_aesni-gcm-x86_64.o := y
OBJECT_FILES_NON_STANDARD_ghash-x86_64.o := y
# Suppress objtool "unsupported stack pointer realignment" warnings. We are
# not using a DRAP register while aligning the stack to a 64 byte boundary.
# See #6950 for the reasoning.
+2
View File
@@ -704,6 +704,7 @@ enc_tab:
ENTRY_NP(aes_encrypt_amd64)
ENDBR
#ifdef GLADMAN_INTERFACE
// Original interface
sub $[4*8], %rsp // gnu/linux/opensolaris binary interface
@@ -809,6 +810,7 @@ dec_tab:
ENTRY_NP(aes_decrypt_amd64)
ENDBR
#ifdef GLADMAN_INTERFACE
// Original interface
sub $[4*8], %rsp // gnu/linux/opensolaris binary interface
+15 -5
View File
@@ -47,6 +47,9 @@
#if defined(__x86_64__) && defined(HAVE_AVX) && \
defined(HAVE_AES) && defined(HAVE_PCLMULQDQ)
#define _ASM
#include <sys/asm_linkage.h>
.extern gcm_avx_can_use_movbe
.text
@@ -56,6 +59,7 @@
.align 32
_aesni_ctr32_ghash_6x:
.cfi_startproc
ENDBR
vmovdqu 32(%r11),%xmm2
subq $6,%rdx
vpxor %xmm4,%xmm4,%xmm4
@@ -363,7 +367,7 @@ _aesni_ctr32_ghash_6x:
vpxor 16+8(%rsp),%xmm8,%xmm8
vpxor %xmm4,%xmm8,%xmm8
.byte 0xf3,0xc3
RET
.cfi_endproc
.size _aesni_ctr32_ghash_6x,.-_aesni_ctr32_ghash_6x
#endif /* ifdef HAVE_MOVBE */
@@ -372,6 +376,7 @@ _aesni_ctr32_ghash_6x:
.align 32
_aesni_ctr32_ghash_no_movbe_6x:
.cfi_startproc
ENDBR
vmovdqu 32(%r11),%xmm2
subq $6,%rdx
vpxor %xmm4,%xmm4,%xmm4
@@ -691,7 +696,7 @@ _aesni_ctr32_ghash_no_movbe_6x:
vpxor 16+8(%rsp),%xmm8,%xmm8
vpxor %xmm4,%xmm8,%xmm8
.byte 0xf3,0xc3
RET
.cfi_endproc
.size _aesni_ctr32_ghash_no_movbe_6x,.-_aesni_ctr32_ghash_no_movbe_6x
@@ -700,6 +705,7 @@ _aesni_ctr32_ghash_no_movbe_6x:
.align 32
aesni_gcm_decrypt:
.cfi_startproc
ENDBR
xorq %r10,%r10
cmpq $0x60,%rdx
jb .Lgcm_dec_abort
@@ -810,13 +816,14 @@ aesni_gcm_decrypt:
.cfi_def_cfa_register %rsp
.Lgcm_dec_abort:
movq %r10,%rax
.byte 0xf3,0xc3
RET
.cfi_endproc
.size aesni_gcm_decrypt,.-aesni_gcm_decrypt
.type _aesni_ctr32_6x,@function
.align 32
_aesni_ctr32_6x:
.cfi_startproc
ENDBR
vmovdqu 0-128(%rcx),%xmm4
vmovdqu 32(%r11),%xmm2
leaq -2(%rbp),%r13 // ICP uses 10,12,14 not 9,11,13 for rounds.
@@ -880,7 +887,7 @@ _aesni_ctr32_6x:
vmovups %xmm14,80(%rsi)
leaq 96(%rsi),%rsi
.byte 0xf3,0xc3
RET
.align 32
.Lhandle_ctr32_2:
vpshufb %xmm0,%xmm1,%xmm6
@@ -911,6 +918,7 @@ _aesni_ctr32_6x:
.align 32
aesni_gcm_encrypt:
.cfi_startproc
ENDBR
xorq %r10,%r10
cmpq $288,%rdx
jb .Lgcm_enc_abort
@@ -1186,7 +1194,7 @@ aesni_gcm_encrypt:
.cfi_def_cfa_register %rsp
.Lgcm_enc_abort:
movq %r10,%rax
.byte 0xf3,0xc3
RET
.cfi_endproc
.size aesni_gcm_encrypt,.-aesni_gcm_encrypt
@@ -1239,6 +1247,7 @@ atomic_toggle_boolean_nv:
RET
.size atomic_toggle_boolean_nv,.-atomic_toggle_boolean_nv
.pushsection .rodata
.align 64
.Lbswap_mask:
.byte 15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0
@@ -1252,6 +1261,7 @@ atomic_toggle_boolean_nv:
.byte 1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
.byte 65,69,83,45,78,73,32,71,67,77,32,109,111,100,117,108,101,32,102,111,114,32,120,56,54,95,54,52,44,32,67,82,89,80,84,79,71,65,77,83,32,98,121,32,60,97,112,112,114,111,64,111,112,101,110,115,115,108,46,111,114,103,62,0
.align 64
.popsection
/* Mark the stack non-executable. */
#if defined(__linux__) && defined(__ELF__)
+13 -3
View File
@@ -97,6 +97,9 @@
#if defined(__x86_64__) && defined(HAVE_AVX) && \
defined(HAVE_AES) && defined(HAVE_PCLMULQDQ)
#define _ASM
#include <sys/asm_linkage.h>
.text
.globl gcm_gmult_clmul
@@ -104,6 +107,7 @@
.align 16
gcm_gmult_clmul:
.cfi_startproc
ENDBR
.L_gmult_clmul:
movdqu (%rdi),%xmm0
movdqa .Lbswap_mask(%rip),%xmm5
@@ -149,7 +153,7 @@ gcm_gmult_clmul:
pxor %xmm1,%xmm0
.byte 102,15,56,0,197
movdqu %xmm0,(%rdi)
.byte 0xf3,0xc3
RET
.cfi_endproc
.size gcm_gmult_clmul,.-gcm_gmult_clmul
@@ -158,6 +162,7 @@ gcm_gmult_clmul:
.align 32
gcm_init_htab_avx:
.cfi_startproc
ENDBR
vzeroupper
vmovdqu (%rsi),%xmm2
@@ -262,7 +267,7 @@ gcm_init_htab_avx:
vmovdqu %xmm5,-16(%rdi)
vzeroupper
.byte 0xf3,0xc3
RET
.cfi_endproc
.size gcm_init_htab_avx,.-gcm_init_htab_avx
@@ -271,6 +276,7 @@ gcm_init_htab_avx:
.align 32
gcm_gmult_avx:
.cfi_startproc
ENDBR
jmp .L_gmult_clmul
.cfi_endproc
.size gcm_gmult_avx,.-gcm_gmult_avx
@@ -279,6 +285,7 @@ gcm_gmult_avx:
.align 32
gcm_ghash_avx:
.cfi_startproc
ENDBR
vzeroupper
vmovdqu (%rdi),%xmm10
@@ -649,9 +656,11 @@ gcm_ghash_avx:
vpshufb %xmm13,%xmm10,%xmm10
vmovdqu %xmm10,(%rdi)
vzeroupper
.byte 0xf3,0xc3
RET
.cfi_endproc
.size gcm_ghash_avx,.-gcm_ghash_avx
.pushsection .rodata
.align 64
.Lbswap_mask:
.byte 15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0
@@ -705,6 +714,7 @@ gcm_ghash_avx:
.byte 71,72,65,83,72,32,102,111,114,32,120,56,54,95,54,52,44,32,67,82,89,80,84,79,71,65,77,83,32,98,121,32,60,97,112,112,114,111,64,111,112,101,110,115,115,108,46,111,114,103,62,0
.align 64
.popsection
/* Mark the stack non-executable. */
#if defined(__linux__) && defined(__ELF__)
+1
View File
@@ -84,6 +84,7 @@ SHA256TransformBlocks(SHA2_CTX *ctx, const void *in, size_t num)
ENTRY_NP(SHA256TransformBlocks)
.cfi_startproc
ENDBR
movq %rsp, %rax
.cfi_def_cfa_register %rax
push %rbx
+1
View File
@@ -85,6 +85,7 @@ SHA512TransformBlocks(SHA2_CTX *ctx, const void *in, size_t num)
ENTRY_NP(SHA512TransformBlocks)
.cfi_startproc
ENDBR
movq %rsp, %rax
.cfi_def_cfa_register %rax
push %rbx
+1 -1
View File
@@ -720,8 +720,8 @@ kcf_remove_mech_provider(char *mech_name, kcf_provider_desc_t *prov_desc)
}
/* free entry */
KCF_PROV_REFRELE(prov_mech->pm_prov_desc);
KCF_PROV_IREFRELE(prov_mech->pm_prov_desc);
KCF_PROV_REFRELE(prov_mech->pm_prov_desc);
kmem_free(prov_mech, sizeof (kcf_prov_mech_desc_t));
}
+1 -1
View File
@@ -171,8 +171,8 @@ kcf_prov_tab_rem_provider(crypto_provider_id_t prov_id)
* at that time.
*/
KCF_PROV_REFRELE(prov_desc);
KCF_PROV_IREFRELE(prov_desc);
KCF_PROV_REFRELE(prov_desc);
return (CRYPTO_SUCCESS);
}
+1 -1
View File
@@ -1274,7 +1274,7 @@ kcf_aop_done(kcf_areq_node_t *areq, int error)
* Allocate the thread pool and initialize all the fields.
*/
static void
kcfpool_alloc()
kcfpool_alloc(void)
{
kcfpool = kmem_alloc(sizeof (kcf_pool_t), KM_SLEEP);
+24 -3
View File
@@ -30,9 +30,29 @@
#include <sys/stack.h>
#include <sys/trap.h>
#if defined(__linux__) && defined(CONFIG_SLS)
#define RET ret; int3
#else
#if defined(_KERNEL) && defined(__linux__)
#include <linux/linkage.h>
#endif
#ifndef ENDBR
#if defined(__ELF__) && defined(__CET__) && defined(__has_include)
/* CSTYLED */
#if __has_include(<cet.h>)
#include <cet.h>
#ifdef _CET_ENDBR
#define ENDBR _CET_ENDBR
#endif /* _CET_ENDBR */
#endif /* <cet.h> */
#endif /* __ELF__ && __CET__ && __has_include */
#endif /* !ENDBR */
#ifndef ENDBR
#define ENDBR
#endif
#ifndef RET
#define RET ret
#endif
@@ -204,6 +224,7 @@ sym1 = sym2
* insert the calls to mcount for profiling. ENTRY_NP is identical, but
* never calls mcount.
*/
#undef ENTRY
#define ENTRY(x) \
.text; \
.align ASM_ENTRY_ALIGN; \
+8 -5
View File
@@ -823,12 +823,15 @@ sha2_mac_init(crypto_ctx_t *ctx, crypto_mechanism_t *mechanism,
*/
if (mechanism->cm_type % 3 == 2) {
if (mechanism->cm_param == NULL ||
mechanism->cm_param_len != sizeof (ulong_t))
ret = CRYPTO_MECHANISM_PARAM_INVALID;
PROV_SHA2_GET_DIGEST_LEN(mechanism,
PROV_SHA2_HMAC_CTX(ctx)->hc_digest_len);
if (PROV_SHA2_HMAC_CTX(ctx)->hc_digest_len > sha_digest_len)
mechanism->cm_param_len != sizeof (ulong_t)) {
ret = CRYPTO_MECHANISM_PARAM_INVALID;
} else {
PROV_SHA2_GET_DIGEST_LEN(mechanism,
PROV_SHA2_HMAC_CTX(ctx)->hc_digest_len);
if (PROV_SHA2_HMAC_CTX(ctx)->hc_digest_len >
sha_digest_len)
ret = CRYPTO_MECHANISM_PARAM_INVALID;
}
}
if (ret != CRYPTO_SUCCESS) {
+2
View File
@@ -251,6 +251,8 @@ LUA_API int lua_type (lua_State *L, int idx) {
LUA_API const char *lua_typename (lua_State *L, int t) {
UNUSED(L);
if (t > 8 || t < 0)
return "internal_type_error";
return ttypename(t);
}
+1 -1
View File
@@ -144,7 +144,7 @@ static lua_Number lua_strx2number (const char *s, char **endptr) {
*endptr = cast(char *, s); /* valid up to here */
ret:
if (neg) r = -r;
return (r * (1 << e));
return ((e >= 0) ? (r * (1ULL << e)) : (r / (1ULL << -e)));
}
#endif
+8 -7
View File
@@ -23,7 +23,15 @@
* Copyright (c) 1992, 2010, Oracle and/or its affiliates. All rights reserved.
*/
#if defined(_KERNEL) && defined(__linux__)
#include <linux/linkage.h>
#endif
#ifndef RET
#define RET ret
#endif
#undef ENTRY
#define ENTRY(x) \
.text; \
.align 8; \
@@ -34,13 +42,6 @@ x:
#define SET_SIZE(x) \
.size x, [.-x]
#if defined(__linux__) && defined(CONFIG_SLS)
#define RET ret; int3
#else
#define RET ret
#endif
/*
* Setjmp and longjmp implement non-local gotos using state vectors
* type label_t.
+1 -1
View File
@@ -250,7 +250,7 @@ spa_import_rootpool(const char *name, bool checkpointrewind)
mutex_exit(&spa_namespace_lock);
fnvlist_free(config);
cmn_err(CE_NOTE, "Can not parse the config for pool '%s'",
pname);
name);
return (error);
}
+1 -1
View File
@@ -319,7 +319,7 @@ zfs_ioctl_legacy_to_ozfs(int request)
int
zfs_ioctl_ozfs_to_legacy(int request)
{
if (request > ZFS_IOC_LAST)
if (request >= ZFS_IOC_LAST)
return (-1);
if (request > ZFS_IOC_PLATFORM) {
+2 -1
View File
@@ -1845,7 +1845,8 @@ zfs_fhtovp(vfs_t *vfsp, fid_t *fidp, int flags, vnode_t **vpp)
return (SET_ERROR(EINVAL));
}
if (fidp->fid_len == LONG_FID_LEN && (fid_gen > 1 || setgen != 0)) {
if (fidp->fid_len == LONG_FID_LEN && setgen != 0) {
ZFS_EXIT(zfsvfs);
dprintf("snapdir fid: fid_gen (%llu) and setgen (%llu)\n",
(u_longlong_t)fid_gen, (u_longlong_t)setgen);
return (SET_ERROR(EINVAL));
+1 -1
View File
@@ -23,9 +23,9 @@
*/
#include <sys/list.h>
#include <sys/mutex.h>
#include <sys/procfs_list.h>
#include <linux/proc_fs.h>
#include <sys/mutex.h>
/*
* A procfs_list is a wrapper around a linked list which implements the seq_file
+4 -1
View File
@@ -56,7 +56,7 @@ static void *zfs_vdev_holder = VDEV_HOLDER;
* device is missing. The missing path may be transient since the links
* can be briefly removed and recreated in response to udev events.
*/
static unsigned zfs_vdev_open_timeout_ms = 1000;
static uint_t zfs_vdev_open_timeout_ms = 1000;
/*
* Size of the "reserved" partition, in blocks.
@@ -1020,3 +1020,6 @@ param_set_max_auto_ashift(const char *buf, zfs_kernel_param_t *kp)
return (0);
}
ZFS_MODULE_PARAM(zfs_vdev, zfs_vdev_, open_timeout_ms, UINT, ZMOD_RW,
"Timeout before determining that a device is missing");
-1
View File
@@ -470,7 +470,6 @@ zfsctl_inode_alloc(zfsvfs_t *zfsvfs, uint64_t id,
zp->z_is_sa = B_FALSE;
zp->z_is_mapped = B_FALSE;
zp->z_is_ctldir = B_TRUE;
zp->z_is_stale = B_FALSE;
zp->z_sa_hdl = NULL;
zp->z_blksz = 0;
zp->z_seq = 0;
+5 -5
View File
@@ -791,9 +791,7 @@ zfsvfs_create(const char *osname, boolean_t readonly, zfsvfs_t **zfvp)
}
error = zfsvfs_create_impl(zfvp, zfsvfs, os);
if (error != 0) {
dmu_objset_disown(os, B_TRUE, zfsvfs);
}
return (error);
}
@@ -833,6 +831,7 @@ zfsvfs_create_impl(zfsvfs_t **zfvp, zfsvfs_t *zfsvfs, objset_t *os)
error = zfsvfs_init(zfsvfs, os);
if (error != 0) {
dmu_objset_disown(os, B_TRUE, zfsvfs);
*zfvp = NULL;
zfsvfs_free(zfsvfs);
return (error);
@@ -1501,7 +1500,6 @@ zfs_domount(struct super_block *sb, zfs_mnt_t *zm, int silent)
sb->s_op = &zpl_super_operations;
sb->s_xattr = zpl_xattr_handlers;
sb->s_export_op = &zpl_export_operations;
sb->s_d_op = &zpl_dentry_operations;
/* Set features for file system. */
zfs_set_fuid_feature(zfsvfs);
@@ -1535,6 +1533,7 @@ zfs_domount(struct super_block *sb, zfs_mnt_t *zm, int silent)
error = zfs_root(zfsvfs, &root_inode);
if (error) {
(void) zfs_umount(sb);
zfsvfs = NULL; /* avoid double-free; first in zfs_umount */
goto out;
}
@@ -1542,6 +1541,7 @@ zfs_domount(struct super_block *sb, zfs_mnt_t *zm, int silent)
sb->s_root = d_make_root(root_inode);
if (sb->s_root == NULL) {
(void) zfs_umount(sb);
zfsvfs = NULL; /* avoid double-free; first in zfs_umount */
error = SET_ERROR(ENOMEM);
goto out;
}
@@ -1858,8 +1858,8 @@ zfs_resume_fs(zfsvfs_t *zfsvfs, dsl_dataset_t *ds)
zp = list_next(&zfsvfs->z_all_znodes, zp)) {
err2 = zfs_rezget(zp);
if (err2) {
zpl_d_drop_aliases(ZTOI(zp));
remove_inode_hash(ZTOI(zp));
zp->z_is_stale = B_TRUE;
}
/* see comment in zfs_suspend_fs() */
-1
View File
@@ -544,7 +544,6 @@ zfs_znode_alloc(zfsvfs_t *zfsvfs, dmu_buf_t *db, int blksz,
zp->z_atime_dirty = B_FALSE;
zp->z_is_mapped = B_FALSE;
zp->z_is_ctldir = B_FALSE;
zp->z_is_stale = B_FALSE;
zp->z_suspended = B_FALSE;
zp->z_sa_hdl = NULL;
zp->z_mapcnt = 0;
-44
View File
@@ -698,46 +698,6 @@ out:
return (error);
}
static int
#ifdef HAVE_D_REVALIDATE_NAMEIDATA
zpl_revalidate(struct dentry *dentry, struct nameidata *nd)
{
unsigned int flags = (nd ? nd->flags : 0);
#else
zpl_revalidate(struct dentry *dentry, unsigned int flags)
{
#endif /* HAVE_D_REVALIDATE_NAMEIDATA */
/* CSTYLED */
zfsvfs_t *zfsvfs = dentry->d_sb->s_fs_info;
int error;
if (flags & LOOKUP_RCU)
return (-ECHILD);
/*
* After a rollback negative dentries created before the rollback
* time must be invalidated. Otherwise they can obscure files which
* are only present in the rolled back dataset.
*/
if (dentry->d_inode == NULL) {
spin_lock(&dentry->d_lock);
error = time_before(dentry->d_time, zfsvfs->z_rollback_time);
spin_unlock(&dentry->d_lock);
if (error)
return (0);
}
/*
* The dentry may reference a stale inode if a mounted file system
* was rolled back to a point in time where the object didn't exist.
*/
if (dentry->d_inode && ITOZ(dentry->d_inode)->z_is_stale)
return (0);
return (1);
}
const struct inode_operations zpl_inode_operations = {
.setattr = zpl_setattr,
.getattr = zpl_getattr,
@@ -826,7 +786,3 @@ const struct inode_operations zpl_special_inode_operations = {
.get_acl = zpl_get_acl,
#endif /* CONFIG_FS_POSIX_ACL */
};
dentry_operations_t zpl_dentry_operations = {
.d_revalidate = zpl_revalidate,
};
+3 -1
View File
@@ -1414,7 +1414,9 @@ zpl_xattr_handler(const char *name)
return (NULL);
}
#if !defined(HAVE_POSIX_ACL_RELEASE) || defined(HAVE_POSIX_ACL_RELEASE_GPL_ONLY)
#if defined(CONFIG_FS_POSIX_ACL) && \
(!defined(HAVE_POSIX_ACL_RELEASE) || \
defined(HAVE_POSIX_ACL_RELEASE_GPL_ONLY))
struct acl_rel_struct {
struct acl_rel_struct *next;
struct posix_acl *acl;
+30 -1
View File
@@ -25,6 +25,7 @@
* Copyright 2016, Joyent, Inc.
* Copyright (c) 2019, Klara Inc.
* Copyright (c) 2019, Allan Jude
* Copyright (c) 2022 Hewlett Packard Enterprise Development LP.
*/
/* Portions Copyright 2010 Robert Milkowski */
@@ -371,6 +372,8 @@ zfs_prop_init(void)
static zprop_index_t redundant_metadata_table[] = {
{ "all", ZFS_REDUNDANT_METADATA_ALL },
{ "most", ZFS_REDUNDANT_METADATA_MOST },
{ "some", ZFS_REDUNDANT_METADATA_SOME },
{ "none", ZFS_REDUNDANT_METADATA_NONE },
{ NULL }
};
@@ -387,7 +390,7 @@ zfs_prop_init(void)
zprop_register_index(ZFS_PROP_REDUNDANT_METADATA, "redundant_metadata",
ZFS_REDUNDANT_METADATA_ALL,
PROP_INHERIT, ZFS_TYPE_FILESYSTEM | ZFS_TYPE_VOLUME,
"all | most", "REDUND_MD",
"all | most | some | none", "REDUND_MD",
redundant_metadata_table);
zprop_register_index(ZFS_PROP_SYNC, "sync", ZFS_SYNC_STANDARD,
PROP_INHERIT, ZFS_TYPE_FILESYSTEM | ZFS_TYPE_VOLUME,
@@ -719,6 +722,8 @@ zfs_prop_init(void)
boolean_t
zfs_prop_delegatable(zfs_prop_t prop)
{
ASSERT3S(prop, >=, 0);
ASSERT3S(prop, <, ZFS_NUM_PROPS);
zprop_desc_t *pd = &zfs_prop_table[prop];
/* The mlslabel property is never delegatable. */
@@ -841,6 +846,8 @@ zfs_prop_valid_for_type(int prop, zfs_type_t types, boolean_t headcheck)
zprop_type_t
zfs_prop_get_type(zfs_prop_t prop)
{
ASSERT3S(prop, >=, 0);
ASSERT3S(prop, <, ZFS_NUM_PROPS);
return (zfs_prop_table[prop].pd_proptype);
}
@@ -850,6 +857,8 @@ zfs_prop_get_type(zfs_prop_t prop)
boolean_t
zfs_prop_readonly(zfs_prop_t prop)
{
ASSERT3S(prop, >=, 0);
ASSERT3S(prop, <, ZFS_NUM_PROPS);
return (zfs_prop_table[prop].pd_attr == PROP_READONLY ||
zfs_prop_table[prop].pd_attr == PROP_ONETIME ||
zfs_prop_table[prop].pd_attr == PROP_ONETIME_DEFAULT);
@@ -861,6 +870,8 @@ zfs_prop_readonly(zfs_prop_t prop)
boolean_t
zfs_prop_visible(zfs_prop_t prop)
{
ASSERT3S(prop, >=, 0);
ASSERT3S(prop, <, ZFS_NUM_PROPS);
return (zfs_prop_table[prop].pd_visible &&
zfs_prop_table[prop].pd_zfs_mod_supported);
}
@@ -871,6 +882,8 @@ zfs_prop_visible(zfs_prop_t prop)
boolean_t
zfs_prop_setonce(zfs_prop_t prop)
{
ASSERT3S(prop, >=, 0);
ASSERT3S(prop, <, ZFS_NUM_PROPS);
return (zfs_prop_table[prop].pd_attr == PROP_ONETIME ||
zfs_prop_table[prop].pd_attr == PROP_ONETIME_DEFAULT);
}
@@ -878,12 +891,16 @@ zfs_prop_setonce(zfs_prop_t prop)
const char *
zfs_prop_default_string(zfs_prop_t prop)
{
ASSERT3S(prop, >=, 0);
ASSERT3S(prop, <, ZFS_NUM_PROPS);
return (zfs_prop_table[prop].pd_strdefault);
}
uint64_t
zfs_prop_default_numeric(zfs_prop_t prop)
{
ASSERT3S(prop, >=, 0);
ASSERT3S(prop, <, ZFS_NUM_PROPS);
return (zfs_prop_table[prop].pd_numdefault);
}
@@ -894,6 +911,8 @@ zfs_prop_default_numeric(zfs_prop_t prop)
const char *
zfs_prop_to_name(zfs_prop_t prop)
{
ASSERT3S(prop, >=, 0);
ASSERT3S(prop, <, ZFS_NUM_PROPS);
return (zfs_prop_table[prop].pd_name);
}
@@ -903,6 +922,8 @@ zfs_prop_to_name(zfs_prop_t prop)
boolean_t
zfs_prop_inheritable(zfs_prop_t prop)
{
ASSERT3S(prop, >=, 0);
ASSERT3S(prop, <, ZFS_NUM_PROPS);
return (zfs_prop_table[prop].pd_attr == PROP_INHERIT ||
zfs_prop_table[prop].pd_attr == PROP_ONETIME);
}
@@ -955,6 +976,8 @@ zfs_prop_valid_keylocation(const char *str, boolean_t encrypted)
const char *
zfs_prop_values(zfs_prop_t prop)
{
ASSERT3S(prop, >=, 0);
ASSERT3S(prop, <, ZFS_NUM_PROPS);
return (zfs_prop_table[prop].pd_values);
}
@@ -966,6 +989,8 @@ zfs_prop_values(zfs_prop_t prop)
int
zfs_prop_is_string(zfs_prop_t prop)
{
ASSERT3S(prop, >=, 0);
ASSERT3S(prop, <, ZFS_NUM_PROPS);
return (zfs_prop_table[prop].pd_proptype == PROP_TYPE_STRING ||
zfs_prop_table[prop].pd_proptype == PROP_TYPE_INDEX);
}
@@ -977,6 +1002,8 @@ zfs_prop_is_string(zfs_prop_t prop)
const char *
zfs_prop_column_name(zfs_prop_t prop)
{
ASSERT3S(prop, >=, 0);
ASSERT3S(prop, <, ZFS_NUM_PROPS);
return (zfs_prop_table[prop].pd_colname);
}
@@ -987,6 +1014,8 @@ zfs_prop_column_name(zfs_prop_t prop)
boolean_t
zfs_prop_align_right(zfs_prop_t prop)
{
ASSERT3S(prop, >=, 0);
ASSERT3S(prop, <, ZFS_NUM_PROPS);
return (zfs_prop_table[prop].pd_rightalign);
}
+4 -3
View File
@@ -4460,7 +4460,7 @@ restart:
* meta buffers. Requests to the upper layers will be made with
* increasingly large scan sizes until the ARC is below the limit.
*/
if (meta_used > arc_meta_limit) {
if (meta_used > arc_meta_limit || arc_available_memory() < 0) {
if (type == ARC_BUFC_DATA) {
type = ARC_BUFC_METADATA;
} else {
@@ -5166,7 +5166,7 @@ arc_adapt(int bytes, arc_state_t *state)
atomic_add_64(&arc_c, (int64_t)bytes);
if (arc_c > arc_c_max)
arc_c = arc_c_max;
else if (state == arc_anon)
else if (state == arc_anon && arc_p < arc_c >> 1)
atomic_add_64(&arc_p, (int64_t)bytes);
if (arc_p > arc_c)
arc_p = arc_c;
@@ -5379,7 +5379,8 @@ arc_get_data_impl(arc_buf_hdr_t *hdr, uint64_t size, void *tag,
if (aggsum_upper_bound(&arc_sums.arcstat_size) < arc_c &&
hdr->b_l1hdr.b_state == arc_anon &&
(zfs_refcount_count(&arc_anon->arcs_size) +
zfs_refcount_count(&arc_mru->arcs_size) > arc_p))
zfs_refcount_count(&arc_mru->arcs_size) > arc_p &&
arc_p < arc_c >> 1))
arc_p = MIN(arc_c, arc_p + size);
}
}
+2 -2
View File
@@ -3300,10 +3300,10 @@ dbuf_prefetch_indirect_done(zio_t *zio, const zbookmark_phys_t *zb,
blkptr_t *bp = ((blkptr_t *)abuf->b_data) +
P2PHASE(nextblkid, 1ULL << dpa->dpa_epbs);
ASSERT(!BP_IS_REDACTED(bp) ||
ASSERT(!BP_IS_REDACTED(bp) || (dpa->dpa_dnode &&
dsl_dataset_feature_is_active(
dpa->dpa_dnode->dn_objset->os_dsl_dataset,
SPA_FEATURE_REDACTED_DATASETS));
SPA_FEATURE_REDACTED_DATASETS)));
if (BP_IS_HOLE(bp) || BP_IS_REDACTED(bp)) {
dbuf_prefetch_fini(dpa, B_TRUE);
} else if (dpa->dpa_curlevel == dpa->dpa_zb.zb_level) {
+17 -6
View File
@@ -28,6 +28,7 @@
* Copyright (c) 2019 Datto Inc.
* Copyright (c) 2019, Klara Inc.
* Copyright (c) 2019, Allan Jude
* Copyright (c) 2022 Hewlett Packard Enterprise Development LP.
*/
#include <sys/dmu.h>
@@ -70,7 +71,7 @@ int zfs_nopwrite_enabled = 1;
* will wait until the next TXG.
* A value of zero will disable this throttle.
*/
unsigned long zfs_per_txg_dirty_frees_percent = 5;
unsigned long zfs_per_txg_dirty_frees_percent = 30;
/*
* Enable/disable forcing txg sync when dirty checking for holes with lseek().
@@ -1988,12 +1989,22 @@ dmu_write_policy(objset_t *os, dnode_t *dn, int level, int wp, zio_prop_t *zp)
ZCHECKSUM_FLAG_EMBEDDED))
checksum = ZIO_CHECKSUM_FLETCHER_4;
if (os->os_redundant_metadata == ZFS_REDUNDANT_METADATA_ALL ||
(os->os_redundant_metadata ==
ZFS_REDUNDANT_METADATA_MOST &&
(level >= zfs_redundant_metadata_most_ditto_level ||
DMU_OT_IS_METADATA(type) || (wp & WP_SPILL))))
switch (os->os_redundant_metadata) {
case ZFS_REDUNDANT_METADATA_ALL:
copies++;
break;
case ZFS_REDUNDANT_METADATA_MOST:
if (level >= zfs_redundant_metadata_most_ditto_level ||
DMU_OT_IS_METADATA(type) || (wp & WP_SPILL))
copies++;
break;
case ZFS_REDUNDANT_METADATA_SOME:
if (DMU_OT_IS_CRITICAL(type))
copies++;
break;
case ZFS_REDUNDANT_METADATA_NONE:
break;
}
} else if (wp & WP_NOFILL) {
ASSERT(level == 0);
+4 -1
View File
@@ -32,6 +32,7 @@
* Copyright (c) 2018, loli10K <ezomori.nozomu@gmail.com>. All rights reserved.
* Copyright (c) 2019, Klara Inc.
* Copyright (c) 2019, Allan Jude
* Copyright (c) 2022 Hewlett Packard Enterprise Development LP.
*/
/* Portions Copyright 2010 Robert Milkowski */
@@ -287,7 +288,9 @@ redundant_metadata_changed_cb(void *arg, uint64_t newval)
* Inheritance and range checking should have been done by now.
*/
ASSERT(newval == ZFS_REDUNDANT_METADATA_ALL ||
newval == ZFS_REDUNDANT_METADATA_MOST);
newval == ZFS_REDUNDANT_METADATA_MOST ||
newval == ZFS_REDUNDANT_METADATA_SOME ||
newval == ZFS_REDUNDANT_METADATA_NONE);
os->os_redundant_metadata = newval;
}
+1 -1
View File
@@ -602,7 +602,7 @@ dmu_recv_begin_check(void *arg, dmu_tx_t *tx)
* so add the DS_HOLD_FLAG_DECRYPT flag only if we are dealing
* with a dataset we may encrypt.
*/
if (drba->drba_dcp != NULL &&
if (drba->drba_dcp == NULL ||
drba->drba_dcp->cp_crypt != ZIO_CRYPT_OFF) {
dsflags |= DS_HOLD_FLAG_DECRYPT;
}
+8
View File
@@ -2716,6 +2716,10 @@ dmu_send_obj(const char *pool, uint64_t tosnap, uint64_t fromsnap,
dspp.numfromredactsnaps = NUM_SNAPS_NOT_REDACTED;
err = dmu_send_impl(&dspp);
}
if (dspp.fromredactsnaps)
kmem_free(dspp.fromredactsnaps,
dspp.numfromredactsnaps * sizeof (uint64_t));
dsl_dataset_rele(dspp.to_ds, FTAG);
return (err);
}
@@ -2924,6 +2928,10 @@ dmu_send(const char *tosnap, const char *fromsnap, boolean_t embedok,
/* dmu_send_impl will call dsl_pool_rele for us. */
err = dmu_send_impl(&dspp);
} else {
if (dspp.fromredactsnaps)
kmem_free(dspp.fromredactsnaps,
dspp.numfromredactsnaps *
sizeof (uint64_t));
dsl_pool_rele(dspp.dp, FTAG);
}
} else {
+19 -4
View File
@@ -2672,6 +2672,7 @@ spa_do_crypt_objset_mac_abd(boolean_t generate, spa_t *spa, uint64_t dsobj,
objset_phys_t *osp = buf;
uint8_t portable_mac[ZIO_OBJSET_MAC_LEN];
uint8_t local_mac[ZIO_OBJSET_MAC_LEN];
const uint8_t zeroed_mac[ZIO_OBJSET_MAC_LEN] = {0};
/* look up the key from the spa's keystore */
ret = spa_keystore_lookup_key(spa, dsobj, FTAG, &dck);
@@ -2694,10 +2695,24 @@ spa_do_crypt_objset_mac_abd(boolean_t generate, spa_t *spa, uint64_t dsobj,
return (0);
}
if (bcmp(portable_mac, osp->os_portable_mac, ZIO_OBJSET_MAC_LEN) != 0 ||
bcmp(local_mac, osp->os_local_mac, ZIO_OBJSET_MAC_LEN) != 0) {
abd_return_buf(abd, buf, datalen);
return (SET_ERROR(ECKSUM));
if (memcmp(portable_mac, osp->os_portable_mac,
ZIO_OBJSET_MAC_LEN) != 0 ||
memcmp(local_mac, osp->os_local_mac, ZIO_OBJSET_MAC_LEN) != 0) {
/*
* If the MAC is zeroed out, we failed to decrypt it.
* This should only arise, at least on Linux,
* if we hit edge case handling for useraccounting, since we
* shouldn't get here without bailing out on error earlier
* otherwise.
*
* So if we're in that case, we can just fall through and
* special-casing noticing that it's zero will handle it
* elsewhere, since we can just regenerate it.
*/
if (memcmp(local_mac, zeroed_mac, ZIO_OBJSET_MAC_LEN) != 0) {
abd_return_buf(abd, buf, datalen);
return (SET_ERROR(ECKSUM));
}
}
abd_return_buf(abd, buf, datalen);
+18 -5
View File
@@ -1747,6 +1747,21 @@ dsl_dataset_snapshot_sync_impl(dsl_dataset_t *ds, const char *snapname,
}
}
/*
* We are not allowed to dirty a filesystem when done receiving
* a snapshot. In this case the flag SPA_FEATURE_LARGE_BLOCKS will
* not be set and a subsequent encrypted raw send will fail. Hence
* activate this feature if needed here.
*/
for (spa_feature_t f = 0; f < SPA_FEATURES; f++) {
if (zfeature_active(f, ds->ds_feature_activation[f]) &&
!(zfeature_active(f, ds->ds_feature[f]))) {
dsl_dataset_activate_feature(dsobj, f,
ds->ds_feature_activation[f], tx);
ds->ds_feature[f] = ds->ds_feature_activation[f];
}
}
ASSERT3U(ds->ds_prev != 0, ==,
dsl_dataset_phys(ds)->ds_prev_snap_obj != 0);
if (ds->ds_prev) {
@@ -3307,7 +3322,6 @@ dsl_dataset_promote_check(void *arg, dmu_tx_t *tx)
dsl_pool_t *dp = dmu_tx_pool(tx);
dsl_dataset_t *hds;
struct promotenode *snap;
dsl_dataset_t *origin_ds, *origin_head;
int err;
uint64_t unused;
uint64_t ss_mv_cnt;
@@ -3327,12 +3341,11 @@ dsl_dataset_promote_check(void *arg, dmu_tx_t *tx)
}
snap = list_head(&ddpa->shared_snaps);
origin_head = snap->ds;
if (snap == NULL) {
err = SET_ERROR(ENOENT);
goto out;
}
origin_ds = snap->ds;
dsl_dataset_t *const origin_ds = snap->ds;
/*
* Encrypted clones share a DSL Crypto Key with their origin's dsl dir.
@@ -3428,10 +3441,10 @@ dsl_dataset_promote_check(void *arg, dmu_tx_t *tx)
* Check that bookmarks that are being transferred don't have
* name conflicts.
*/
for (dsl_bookmark_node_t *dbn = avl_first(&origin_head->ds_bookmarks);
for (dsl_bookmark_node_t *dbn = avl_first(&origin_ds->ds_bookmarks);
dbn != NULL && dbn->dbn_phys.zbm_creation_txg <=
dsl_dataset_phys(origin_ds)->ds_creation_txg;
dbn = AVL_NEXT(&origin_head->ds_bookmarks, dbn)) {
dbn = AVL_NEXT(&origin_ds->ds_bookmarks, dbn)) {
if (strlen(dbn->dbn_name) >= max_snap_len) {
err = SET_ERROR(ENAMETOOLONG);
goto out;
+6 -1
View File
@@ -1028,8 +1028,13 @@ dsl_process_sub_livelist(bpobj_t *bpobj, bplist_t *to_free, zthr_t *t,
.t = t
};
int err = bpobj_iterate_nofree(bpobj, dsl_livelist_iterate, &arg, size);
VERIFY(err != 0 || avl_numnodes(&avl) == 0);
VERIFY0(avl_numnodes(&avl));
void *cookie = NULL;
livelist_entry_t *le = NULL;
while ((le = avl_destroy_nodes(&avl, &cookie)) != NULL) {
kmem_free(le, sizeof (livelist_entry_t));
}
avl_destroy(&avl);
return (err);
}
+23 -14
View File
@@ -809,6 +809,18 @@ dsl_fs_ss_limit_check(dsl_dir_t *dd, uint64_t delta, zfs_prop_t prop,
ASSERT(prop == ZFS_PROP_FILESYSTEM_LIMIT ||
prop == ZFS_PROP_SNAPSHOT_LIMIT);
if (prop == ZFS_PROP_SNAPSHOT_LIMIT) {
/*
* We don't enforce the limit for temporary snapshots. This is
* indicated by a NULL cred_t argument.
*/
if (cr == NULL)
return (0);
count_prop = DD_FIELD_SNAPSHOT_COUNT;
} else {
count_prop = DD_FIELD_FILESYSTEM_COUNT;
}
/*
* If we're allowed to change the limit, don't enforce the limit
* e.g. this can happen if a snapshot is taken by an administrative
@@ -828,19 +840,6 @@ dsl_fs_ss_limit_check(dsl_dir_t *dd, uint64_t delta, zfs_prop_t prop,
if (delta == 0)
return (0);
if (prop == ZFS_PROP_SNAPSHOT_LIMIT) {
/*
* We don't enforce the limit for temporary snapshots. This is
* indicated by a NULL cred_t argument.
*/
if (cr == NULL)
return (0);
count_prop = DD_FIELD_SNAPSHOT_COUNT;
} else {
count_prop = DD_FIELD_FILESYSTEM_COUNT;
}
/*
* If an ancestor has been provided, stop checking the limit once we
* hit that dir. We need this during rename so that we don't overcount
@@ -1262,6 +1261,7 @@ dsl_dir_tempreserve_impl(dsl_dir_t *dd, uint64_t asize, boolean_t netfree,
uint64_t quota;
struct tempreserve *tr;
int retval;
uint64_t ext_quota;
uint64_t ref_rsrv;
top_of_function:
@@ -1337,7 +1337,16 @@ top_of_function:
* on-disk is over quota and there are no pending changes
* or deferred frees (which may free up space for us).
*/
if (used_on_disk + est_inflight >= quota) {
ext_quota = quota >> 5;
if (quota == UINT64_MAX)
ext_quota = 0;
if (used_on_disk >= quota) {
/* Quota exceeded */
mutex_exit(&dd->dd_lock);
DMU_TX_STAT_BUMP(dmu_tx_quota);
return (retval);
} else if (used_on_disk + est_inflight >= quota + ext_quota) {
if (est_inflight > 0 || used_on_disk < quota) {
retval = SET_ERROR(ERESTART);
} else {
+93
View File
@@ -23,6 +23,7 @@
* Copyright (c) 2012, 2015 by Delphix. All rights reserved.
* Copyright (c) 2013 Martin Matuska. All rights reserved.
* Copyright 2019 Joyent, Inc.
* Copyright (c) 2022 Hewlett Packard Enterprise Development LP.
*/
#include <sys/zfs_context.h>
@@ -41,6 +42,7 @@
#define ZPROP_INHERIT_SUFFIX "$inherit"
#define ZPROP_RECVD_SUFFIX "$recvd"
#define ZPROP_IUV_SUFFIX "$iuv"
static int
dodefault(zfs_prop_t prop, int intsz, int numints, void *buf)
@@ -69,6 +71,17 @@ dodefault(zfs_prop_t prop, int intsz, int numints, void *buf)
return (0);
}
static int
dsl_prop_known_index(zfs_prop_t prop, uint64_t value)
{
const char *str = NULL;
if (prop != ZPROP_CONT && prop != ZPROP_INVAL &&
zfs_prop_get_type(prop) == PROP_TYPE_INDEX)
return (!zfs_prop_index_to_string(prop, value, &str));
return (-1);
}
int
dsl_prop_get_dd(dsl_dir_t *dd, const char *propname,
int intsz, int numints, void *buf, char *setpoint, boolean_t snapshot)
@@ -81,6 +94,7 @@ dsl_prop_get_dd(dsl_dir_t *dd, const char *propname,
boolean_t inheriting = B_FALSE;
char *inheritstr;
char *recvdstr;
char *iuvstr;
ASSERT(dsl_pool_config_held(dd->dd_pool));
@@ -91,6 +105,7 @@ dsl_prop_get_dd(dsl_dir_t *dd, const char *propname,
inheritable = (prop == ZPROP_INVAL || zfs_prop_inheritable(prop));
inheritstr = kmem_asprintf("%s%s", propname, ZPROP_INHERIT_SUFFIX);
recvdstr = kmem_asprintf("%s%s", propname, ZPROP_RECVD_SUFFIX);
iuvstr = kmem_asprintf("%s%s", propname, ZPROP_IUV_SUFFIX);
/*
* Note: dd may become NULL, therefore we shouldn't dereference it
@@ -105,6 +120,18 @@ dsl_prop_get_dd(dsl_dir_t *dd, const char *propname,
inheriting = B_TRUE;
}
/* Check for a iuv value. */
err = zap_lookup(mos, dsl_dir_phys(dd)->dd_props_zapobj,
iuvstr, intsz, numints, buf);
if (dsl_prop_known_index(zfs_name_to_prop(propname),
*(uint64_t *)buf) != 1)
err = ENOENT;
if (err != ENOENT) {
if (setpoint != NULL && err == 0)
dsl_dir_name(dd, setpoint);
break;
}
/* Check for a local value. */
err = zap_lookup(mos, dsl_dir_phys(dd)->dd_props_zapobj,
propname, intsz, numints, buf);
@@ -155,6 +182,7 @@ dsl_prop_get_dd(dsl_dir_t *dd, const char *propname,
kmem_strfree(inheritstr);
kmem_strfree(recvdstr);
kmem_strfree(iuvstr);
return (err);
}
@@ -647,6 +675,45 @@ dsl_prop_changed_notify(dsl_pool_t *dp, uint64_t ddobj,
dsl_dir_rele(dd, FTAG);
}
/*
* For newer values in zfs index type properties, we add a new key
* propname$iuv (iuv = Ignore Unknown Values) to the properties zap object
* to store the new property value and store the default value in the
* existing prop key. So that the propname$iuv key is ignored by the older zfs
* versions and the default property value from the existing prop key is
* used.
*/
static void
dsl_prop_set_iuv(objset_t *mos, uint64_t zapobj, const char *propname,
int intsz, int numints, const void *value, dmu_tx_t *tx)
{
char *iuvstr = kmem_asprintf("%s%s", propname, ZPROP_IUV_SUFFIX);
boolean_t iuv = B_FALSE;
zfs_prop_t prop = zfs_name_to_prop(propname);
switch (prop) {
case ZFS_PROP_REDUNDANT_METADATA:
if (*(uint64_t *)value == ZFS_REDUNDANT_METADATA_SOME ||
*(uint64_t *)value == ZFS_REDUNDANT_METADATA_NONE)
iuv = B_TRUE;
break;
default:
break;
}
if (iuv) {
VERIFY0(zap_update(mos, zapobj, iuvstr, intsz, numints,
value, tx));
uint64_t val = zfs_prop_default_numeric(prop);
VERIFY0(zap_update(mos, zapobj, propname, intsz, numints,
&val, tx));
} else {
zap_remove(mos, zapobj, iuvstr, tx);
}
kmem_strfree(iuvstr);
}
void
dsl_prop_set_sync_impl(dsl_dataset_t *ds, const char *propname,
zprop_source_t source, int intsz, int numints, const void *value,
@@ -659,6 +726,7 @@ dsl_prop_set_sync_impl(dsl_dataset_t *ds, const char *propname,
const char *valstr = NULL;
char *inheritstr;
char *recvdstr;
char *iuvstr;
char *tbuf = NULL;
int err;
uint64_t version = spa_version(ds->ds_dir->dd_pool->dp_spa);
@@ -692,6 +760,7 @@ dsl_prop_set_sync_impl(dsl_dataset_t *ds, const char *propname,
inheritstr = kmem_asprintf("%s%s", propname, ZPROP_INHERIT_SUFFIX);
recvdstr = kmem_asprintf("%s%s", propname, ZPROP_RECVD_SUFFIX);
iuvstr = kmem_asprintf("%s%s", propname, ZPROP_IUV_SUFFIX);
switch ((int)source) {
case ZPROP_SRC_NONE:
@@ -709,11 +778,14 @@ dsl_prop_set_sync_impl(dsl_dataset_t *ds, const char *propname,
/*
* remove propname$inherit
* set propname -> value
* set propname$iuv -> new property value
*/
err = zap_remove(mos, zapobj, inheritstr, tx);
ASSERT(err == 0 || err == ENOENT);
VERIFY0(zap_update(mos, zapobj, propname,
intsz, numints, value, tx));
(void) dsl_prop_set_iuv(mos, zapobj, propname, intsz,
numints, value, tx);
break;
case ZPROP_SRC_INHERITED:
/*
@@ -723,6 +795,8 @@ dsl_prop_set_sync_impl(dsl_dataset_t *ds, const char *propname,
*/
err = zap_remove(mos, zapobj, propname, tx);
ASSERT(err == 0 || err == ENOENT);
err = zap_remove(mos, zapobj, iuvstr, tx);
ASSERT(err == 0 || err == ENOENT);
if (version >= SPA_VERSION_RECVD_PROPS &&
dsl_prop_get_int_ds(ds, ZPROP_HAS_RECVD, &dummy) == 0) {
dummy = 0;
@@ -763,6 +837,7 @@ dsl_prop_set_sync_impl(dsl_dataset_t *ds, const char *propname,
kmem_strfree(inheritstr);
kmem_strfree(recvdstr);
kmem_strfree(iuvstr);
/*
* If we are left with an empty snap zap we can destroy it.
@@ -1012,6 +1087,14 @@ dsl_prop_get_all_impl(objset_t *mos, uint64_t propobj,
propname = za.za_name;
source = setpoint;
/* Skip if iuv entries are preset. */
valstr = kmem_asprintf("%s%s", propname,
ZPROP_IUV_SUFFIX);
err = zap_contains(mos, propobj, valstr);
kmem_strfree(valstr);
if (err == 0)
continue;
} else if (strcmp(suffix, ZPROP_INHERIT_SUFFIX) == 0) {
/* Skip explicitly inherited entries. */
continue;
@@ -1044,6 +1127,16 @@ dsl_prop_get_all_impl(objset_t *mos, uint64_t propobj,
source = ((flags & DSL_PROP_GET_INHERITING) ?
setpoint : ZPROP_SOURCE_VAL_RECVD);
} else if (strcmp(suffix, ZPROP_IUV_SUFFIX) == 0) {
(void) strlcpy(buf, za.za_name,
MIN(sizeof (buf), suffix - za.za_name + 1));
propname = buf;
source = setpoint;
prop = zfs_name_to_prop(propname);
if (dsl_prop_known_index(prop,
za.za_first_integer) != 1)
continue;
} else {
/*
* For backward compatibility, skip suffixes we don't
+1
View File
@@ -953,6 +953,7 @@ fm_fmri_hc_create(nvlist_t *fmri, int version, const nvlist_t *auth,
}
atomic_inc_64(
&erpt_kstat_data.fmri_set_failed.value.ui64);
va_end(ap);
return;
}
}
+2 -3
View File
@@ -5207,12 +5207,11 @@ top:
ASSERT(mg->mg_initialized);
/*
* Avoid writing single-copy data to a failing,
* Avoid writing single-copy data to an unhealthy,
* non-redundant vdev, unless we've already tried all
* other vdevs.
*/
if ((vd->vdev_stat.vs_write_errors > 0 ||
vd->vdev_state < VDEV_STATE_HEALTHY) &&
if (vd->vdev_state < VDEV_STATE_HEALTHY &&
d == 0 && !try_hard && vd->vdev_children == 0) {
metaslab_trace_add(zal, mg, NULL, psize, d,
TRACE_VDEV_ERROR, allocator);
+5 -3
View File
@@ -5230,7 +5230,7 @@ spa_open_common(const char *pool, spa_t **spapp, void *tag, nvlist_t *nvpolicy,
* If we've recovered the pool, pass back any information we
* gathered while doing the load.
*/
if (state == SPA_LOAD_RECOVER) {
if (state == SPA_LOAD_RECOVER && config != NULL) {
fnvlist_add_nvlist(*config, ZPOOL_CONFIG_LOAD_INFO,
spa->spa_load_info);
}
@@ -6772,10 +6772,12 @@ spa_vdev_attach(spa_t *spa, uint64_t guid, nvlist_t *nvroot, int replacing,
return (spa_vdev_exit(spa, newrootvd, txg, error));
/*
* Spares can't replace logs
* log, dedup and special vdevs should not be replaced by spares.
*/
if (oldvd->vdev_top->vdev_islog && newvd->vdev_isspare)
if ((oldvd->vdev_top->vdev_alloc_bias != VDEV_BIAS_NONE ||
oldvd->vdev_top->vdev_islog) && newvd->vdev_isspare) {
return (spa_vdev_exit(spa, newrootvd, txg, ENOTSUP));
}
/*
* A dRAID spare can only replace a child of its parent dRAID vdev.
+2 -1
View File
@@ -690,7 +690,8 @@ spa_estimate_metaslabs_to_flush(spa_t *spa)
* based on the incoming rate until we exceed it.
*/
if (available_blocks >= 0 && available_txgs >= 0) {
uint64_t skip_txgs = MIN(available_txgs + 1,
uint64_t skip_txgs = (incoming == 0) ?
available_txgs + 1 : MIN(available_txgs + 1,
(available_blocks / incoming) + 1);
available_blocks -= (skip_txgs * incoming);
available_txgs -= skip_txgs;
+13 -1
View File
@@ -22,6 +22,7 @@
*
* Copyright (c) 2018, Intel Corporation.
* Copyright (c) 2020 by Lawrence Livermore National Security, LLC.
* Copyright (c) 2022 Hewlett Packard Enterprise Development LP.
*/
#include <sys/vdev_impl.h>
@@ -134,6 +135,7 @@ int zfs_rebuild_scrub_enabled = 1;
* For vdev_rebuild_initiate_sync() and vdev_rebuild_reset_sync().
*/
static void vdev_rebuild_thread(void *arg);
static void vdev_rebuild_reset_sync(void *arg, dmu_tx_t *tx);
/*
* Clear the per-vdev rebuild bytes value for a vdev tree.
@@ -307,6 +309,17 @@ vdev_rebuild_complete_sync(void *arg, dmu_tx_t *tx)
vdev_rebuild_phys_t *vrp = &vr->vr_rebuild_phys;
mutex_enter(&vd->vdev_rebuild_lock);
/*
* Handle a second device failure if it occurs after all rebuild I/O
* has completed but before this sync task has been executed.
*/
if (vd->vdev_rebuild_reset_wanted) {
mutex_exit(&vd->vdev_rebuild_lock);
vdev_rebuild_reset_sync(arg, tx);
return;
}
vrp->vrp_rebuild_state = VDEV_REBUILD_COMPLETE;
vrp->vrp_end_time = gethrestime_sec();
@@ -760,7 +773,6 @@ vdev_rebuild_thread(void *arg)
ASSERT(vd->vdev_rebuilding);
ASSERT(spa_feature_is_active(spa, SPA_FEATURE_DEVICE_REBUILD));
ASSERT3B(vd->vdev_rebuild_cancel_wanted, ==, B_FALSE);
ASSERT3B(vd->vdev_rebuild_reset_wanted, ==, B_FALSE);
vdev_rebuild_t *vr = &vd->vdev_rebuild_config;
vdev_rebuild_phys_t *vrp = &vr->vr_rebuild_phys;
-3
View File
@@ -2056,7 +2056,6 @@ spa_vdev_remove_top_check(vdev_t *vd)
* and not be raidz or draid.
*/
vdev_t *rvd = spa->spa_root_vdev;
int num_indirect = 0;
for (uint64_t id = 0; id < rvd->vdev_children; id++) {
vdev_t *cvd = rvd->vdev_child[id];
@@ -2072,8 +2071,6 @@ spa_vdev_remove_top_check(vdev_t *vd)
if (cvd->vdev_ashift != 0 &&
cvd->vdev_alloc_bias == VDEV_BIAS_NONE)
ASSERT3U(cvd->vdev_ashift, ==, spa->spa_max_ashift);
if (cvd->vdev_ops == &vdev_indirect_ops)
num_indirect++;
if (!vdev_is_concrete(cvd))
continue;
if (vdev_get_nparity(cvd) != 0)
+2 -3
View File
@@ -1188,12 +1188,11 @@ vdev_autotrim_thread(void *arg)
mutex_exit(&vd->vdev_autotrim_lock);
spa_config_enter(spa, SCL_CONFIG, FTAG, RW_READER);
uint64_t extent_bytes_max = zfs_trim_extent_bytes_max;
uint64_t extent_bytes_min = zfs_trim_extent_bytes_min;
while (!vdev_autotrim_should_stop(vd)) {
int txgs_per_trim = MAX(zfs_trim_txg_batch, 1);
boolean_t issued_trim = B_FALSE;
uint64_t extent_bytes_max = zfs_trim_extent_bytes_max;
uint64_t extent_bytes_min = zfs_trim_extent_bytes_min;
/*
* All of the metaslabs are divided in to groups of size
+4 -2
View File
@@ -988,8 +988,10 @@ zap_lookup_impl(zap_t *zap, const char *name,
} else {
*(uint64_t *)buf =
MZE_PHYS(zap, mze)->mze_value;
(void) strlcpy(realname,
MZE_PHYS(zap, mze)->mze_name, rn_len);
if (realname != NULL)
(void) strlcpy(realname,
MZE_PHYS(zap, mze)->mze_name,
rn_len);
if (ncp) {
*ncp = mzap_normalization_conflict(zap,
zn, mze);
+3 -3
View File
@@ -277,9 +277,9 @@ zcp_table_to_nvlist(lua_State *state, int index, int depth)
}
break;
case LUA_TNUMBER:
VERIFY3U(sizeof (buf), >,
snprintf(buf, sizeof (buf), "%lld",
(longlong_t)lua_tonumber(state, -2)));
(void) snprintf(buf, sizeof (buf), "%lld",
(longlong_t)lua_tonumber(state, -2));
key = buf;
if (saw_str_could_collide) {
key_could_collide = B_TRUE;
-2
View File
@@ -253,7 +253,6 @@ void
zfs_ereport_clear(spa_t *spa, vdev_t *vd)
{
uint64_t vdev_guid, pool_guid;
int cnt = 0;
ASSERT(vd != NULL || spa != NULL);
if (vd == NULL) {
@@ -277,7 +276,6 @@ zfs_ereport_clear(spa_t *spa, vdev_t *vd)
avl_remove(&recent_events_tree, entry);
list_remove(&recent_events_list, entry);
kmem_free(entry, sizeof (*entry));
cnt++;
}
}
+1 -1
View File
@@ -7398,7 +7398,7 @@ zfsdev_get_state_impl(minor_t minor, enum zfsdev_state_type which)
for (zs = zfsdev_state_list; zs != NULL; zs = zs->zs_next) {
if (zs->zs_minor == minor) {
smp_rmb();
membar_consumer();
switch (which) {
case ZST_ONEXIT:
return (zs->zs_onexit);
+11 -5
View File
@@ -397,8 +397,18 @@ zil_parse(zilog_t *zilog, zil_parse_blk_func_t *parse_blk_func,
error = zil_read_log_block(zilog, decrypt, &blk, &next_blk,
lrbuf, &end);
if (error != 0)
if (error != 0) {
if (claimed) {
char name[ZFS_MAX_DATASET_NAME_LEN];
dmu_objset_name(zilog->zl_os, name);
cmn_err(CE_WARN, "ZFS read log block error %d, "
"dataset %s, seq 0x%llx\n", error, name,
(u_longlong_t)blk_seq);
}
break;
}
for (lrp = lrbuf; lrp < end; lrp += reclen) {
lr_t *lr = (lr_t *)lrp;
@@ -422,10 +432,6 @@ done:
zilog->zl_parse_blk_count = blk_count;
zilog->zl_parse_lr_count = lr_count;
ASSERT(!claimed || !(zh->zh_flags & ZIL_CLAIM_LR_SEQ_VALID) ||
(max_blk_seq == claim_blk_seq && max_lr_seq == claim_lr_seq) ||
(decrypt && error == EIO));
zil_bp_tree_fini(zilog);
zio_buf_free(lrbuf, SPA_OLD_MAXBLOCKSIZE);
+1 -1
View File
@@ -1,4 +1,4 @@
#!/usr/bin/perl -w
#!/usr/bin/env perl
my $usage = <<EOT;
usage: config-enum enum [file ...]
+2 -2
View File
@@ -828,7 +828,7 @@ tests = ['recv_dedup', 'recv_dedup_encrypted_zvol', 'rsend_001_pos',
'send_realloc_encrypted_files', 'send_spill_block', 'send_holds',
'send_hole_birth', 'send_mixed_raw', 'send-wR_encrypted_zvol',
'send_partial_dataset', 'send_invalid', 'send_doall',
'send_raw_spill_block', 'send_raw_ashift']
'send_raw_spill_block', 'send_raw_ashift', 'send_raw_large_blocks']
tags = ['functional', 'rsend']
[tests/functional/scrub_mirror]
@@ -892,7 +892,7 @@ tests = [
'userquota_007_pos', 'userquota_008_pos', 'userquota_009_pos',
'userquota_010_pos', 'userquota_011_pos', 'userquota_012_neg',
'userspace_001_pos', 'userspace_002_pos', 'userspace_encrypted',
'userspace_send_encrypted']
'userspace_send_encrypted', 'userspace_encrypted_13709']
tags = ['functional', 'userquota']
[tests/functional/vdev_zaps]
+2 -5
View File
@@ -1318,14 +1318,11 @@ static int
draid_merge(int argc, char *argv[])
{
char filename[MAXPATHLEN];
int c, error, total_merged = 0, verbose = 0;
int c, error, total_merged = 0;
nvlist_t *allcfgs;
while ((c = getopt(argc, argv, ":v")) != -1) {
while ((c = getopt(argc, argv, ":")) != -1) {
switch (c) {
case 'v':
verbose++;
break;
case ':':
(void) fprintf(stderr,
"missing argument for '%c' option\n", optopt);
+2 -1
View File
@@ -11,6 +11,7 @@
#
# Copyright (c) 2012, 2016, Delphix. All rights reserved.
# Copyright (c) 2022 Hewlett Packard Enterprise Development LP.
#
. $STF_SUITE/include/libtest.shlib
@@ -27,7 +28,7 @@ typeset -a canmount_prop_vals=('on' 'off' 'noauto')
typeset -a copies_prop_vals=('1' '2' '3')
typeset -a logbias_prop_vals=('latency' 'throughput')
typeset -a primarycache_prop_vals=('all' 'none' 'metadata')
typeset -a redundant_metadata_prop_vals=('all' 'most')
typeset -a redundant_metadata_prop_vals=('all' 'most' 'some' 'none')
typeset -a secondarycache_prop_vals=('all' 'none' 'metadata')
typeset -a snapdir_prop_vals=('hidden' 'visible')
typeset -a sync_prop_vals=('standard' 'always' 'disabled')

Some files were not shown because too many files have changed in this diff Show More