zfs_acl_node_alloc allocates an uninitialized data buffer, but upstack
zfs_acl_chmod only partially initializes it. KMSAN reported that this
memory remained uninitialized at the point when it was read by
lzjb_compress, which suggests a possible kernel memory disclosure bug.
The full KMSAN warning may be found in the PR.
https://github.com/openzfs/zfs/pull/16511
Signed-off-by: Alan Somers <asomers@gmail.com>
Sponsored by: Axcient
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Follow up to 99495ba6ab which
accidentally introduce this regression.
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Quartz <yyhran@163.com>
Closes#15907
... instead of list_head() + list_remove(). On FreeBSD the list
functions are not inlined, so in addition to more compact code
this also saves another function call.
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes#14955
Building with Clang on Linux generates a warning that err could be
uninitialized if mnt_ns is a NULL pointer. However, mnt_ns should never
be NULL, so there is no need to put this behind an if statement. Taking
it outside of the if statement means that the possibility of err being
uninitialized goes from being always zero in a way that the compiler
could not realize to a way that is always zero in a way that the
compiler can realize.
Sponsored-By: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Youzhong Yang <yyang@mathworks.com>
Signed-off-by: Richard Yao <richard.yao@klarasystems.com>
Closes#14738
Linux kernel 6.3 changed a bunch of APIs to use the dedicated idmap
type for mounts (struct mnt_idmap), we need to detect these changes
and make zfs work with the new APIs.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Youzhong Yang <yyang@mathworks.com>
Closes#14682
I recently gained the ability to run Clang's static analyzer on the
linux kernel modules via a few hacks. This extended coverage to code
that was previously missed since Clang's static analyzer only looked at
code that we built in userspace. Running it against the Linux kernel
modules built from my local branch produced a total of 72 reports
against my local branch. Of those, 50 were reports of logic errors and
22 were reports of dead code. Since we already had cleaned up all of
the previous dead code reports, I felt it would be a good next step to
clean up these dead code reports. Clang did a further breakdown of the
dead code reports into:
Dead assignment 15
Dead increment 2
Dead nested assignment 5
The benefit of cleaning these up, especially in the case of dead nested
assignment, is that they can expose places where our error handling is
incorrect. A number of them were fairly straight forward. However
several were not:
In vdev_disk_physio_completion(), not only were we not using the return
value from the static function vdev_disk_dio_put(), but nothing used it,
so I changed it to return void and removed the existing (void) cast in
the other area where we call it in addition to no longer storing it to a
stack value.
In FSE_createDTable(), the function is dead code. Its helper function
FSE_freeDTable() is also dead code, as are the CPP definitions in
`module/zstd/include/zstd_compat_wrapper.h`. We just delete it all.
In zfs_zevent_wait(), we have an optimization opportunity. cv_wait_sig()
returns 0 if there are waiting signals and 1 if there are none. The
Linux SPL version literally returns `signal_pending(current) ? 0 : 1)`
and FreeBSD implements the same semantics, we can just do
`!cv_wait_sig()` in place of `signal_pending(current)` to avoid
unnecessarily calling it again.
zfs_setattr() on FreeBSD version did not have error handling issue
because the code was removed entirely from FreeBSD version. The error is
from updating the attribute directory's files. After some thought, I
decided to propapage errors on it to userspace.
In zfs_secpolicy_tmp_snapshot(), we ignore a lack of permission from the
first check in favor of checking three other permissions. I assume this
is intentional.
In zfs_create_fs(), the return value of zap_update() was not checked
despite setting an important version number. I see no backward
compatibility reason to permit failures, so we add an assertion to catch
failures. Interestingly, Linux is still using ASSERT(error == 0) from
OpenSolaris while FreeBSD has switched to the improved ASSERT0(error)
from illumos, although illumos has yet to adopt it here. ASSERT(error ==
0) was used on Linux while ASSERT0(error) was used on FreeBSD since the
entire file needs conversion and that should be the subject of
another patch.
dnode_move()'s issue was caused by us not having implemented
POINTER_IS_VALID() on Linux. We have a stub in
`include/os/linux/spl/sys/kmem_cache.h` for it, when it really should be
in `include/os/linux/spl/sys/kmem.h` to be consistent with
Illumos/OpenSolaris. FreeBSD put both `POINTER_IS_VALID()` and
`POINTER_INVALIDATE()` in `include/os/freebsd/spl/sys/kmem.h`, so we
copy what it did.
Whenever a report was in platform-specific code, I checked the FreeBSD
version to see if it also applied to FreeBSD, but it was only relevant a
few times.
Lastly, the patch that enabled Clang's static analyzer to be run on the
Linux kernel modules needs more work before it can be put into a PR. I
plan to do that in the future as part of the on-going static analysis
work that I am doing.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes#14380
In zfs_zaccess_dataset_check(), we have the following subexpression:
(!IS_DEVVP(ZTOV(zp)) ||
(IS_DEVVP(ZTOV(zp)) && (v4_mode & WRITE_MASK_ATTRS)))
When !IS_DEVVP(ZTOV(zp)) is false, IS_DEVVP(ZTOV(zp)) is true under the
law of the excluded middle since we are not doing pseudoboolean alegbra.
Therefore doing:
(IS_DEVVP(ZTOV(zp)) && (v4_mode & WRITE_MASK_ATTRS))
Is unnecessary and we can just do:
(v4_mode & WRITE_MASK_ATTRS)
The Linux 5.16.14 kernel's coccicheck caught this. The semantic
patch that caught it was:
./scripts/coccinelle/misc/excluded_middle.cocci
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes#14372
There was the series from me a year ago which fixed most of the
callback vs implementation prototype mismatches. It was based on
running the CFI-enabled kernel (in permissive mode -- warning
instead of panic) and performing a full ZTS cycle, and then fixing
all of the problems caught by CFI.
Now, Clang 16-dev has new warning flag, -Wcast-function-type-strict,
which detect such mismatches at compile-time. It allows to find the
remaining issues missed by the first series.
There are only two of them left: one for the
secpolicy_vnode_setattr() callback and one for taskq_dispatch().
The fix is easy, since they are not used anywhere else.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Closes#14207
Linux 5.17 commit torvalds/linux@5dfbfe71e enables "the idmapping
infrastructure to support idmapped mounts of filesystems mounted
with an idmapping". Update the OpenZFS accordingly to improve the
idmapped mount support.
This pull request contains the following changes:
- xattr setter functions are fixed to take mnt_ns argument. Without
this, cp -p would fail for an idmapped mount in a user namespace.
- idmap_util is enhanced/fixed for its use in a user ns context.
- One test case added to test idmapped mount in a user ns.
Reviewed-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Youzhong Yang <yyang@mathworks.com>
Closes#14097
Avoid assuming that a pointer can fit in a uint64_t and use uintptr_t
instead.
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Brooks Davis <brooks.davis@sri.com>
Closes#14131
Adds support for idmapped mounts. Supported as of Linux 5.12 this
functionality allows user and group IDs to be remapped without changing
their state on disk. This can be useful for portable home directories
and a variety of container related use cases.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Youzhong Yang <yyang@mathworks.com>
Closes#12923Closes#13671
Coverity complains about possible bugs involving referencing NULL return
values and division by zero. The division by zero bugs require that a
block pointer be corrupt, either from in-memory corruption, or on-disk
corruption. The NULL return value complaints are only bugs if
assumptions that we make about the state of data structures are wrong.
Some seem impossible to be wrong and thus are false positives, while
others are hard to analyze.
Rather than dismiss these as false positives by assuming we know better,
we add defensive assertions to let us know when our assumptions are
wrong.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes#13972
Coverity found a number of places where we either do MAX(unsigned, 0) or
do assertions that a unsigned variable is >= 0. These do nothing, so
let us drop them all.
It also found a spot where we do `if (unsigned >= 0 && ...)`. Let us
also drop the unsigned >= 0 check.
Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes#13871
Replace ZFS_ENTER and ZFS_VERIFY_ZP, which have hidden returns, with
functions that return error code. The reason we want to do this is
because hidden returns are not obvious and had caused some missing fail
path unwinding.
This patch changes the common, linux, and freebsd parts. Also fixes
fail path unwinding in zfs_fsync, zpl_fsync, zpl_xattr_{list,get,set}, and
zfs_lookup().
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes#13831
Bypass check of ZFS aces if the ACL is trivial. When an ACL is
trivial its permissions are represented by the mode without any
loss of information. In this case, it is safe to convert the
access request into equivalent mode and then pass desired mask
and inode to generic_permission(). This has the added benefit
of also checking whether entries in a POSIX ACL on the file grant
the desired access.
This commit also skips the ACL check on looking up the xattr dir
since such restrictions don't exist in Linux kernel and it makes
xattr lookup behavior inconsistent between SA and file-based
xattrs. We also don't want to perform a POSIX ACL check while
looking up the POSIX ACL if for some reason it is located in
the xattr dir rather than an SA.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Andrew Walker <awalker@ixsystems.com>
Closes#13237
bcopy() has a confusing argument order and is actually a move, not a
copy; they're all deprecated since POSIX.1-2001 and removed in -2008,
and we shim them out to mem*() on Linux anyway
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes#12996
Unfortunately macOS has obj-C keyword "fallthrough" in the OS headers.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Damian Szuberski <szuberskidamian@gmail.com>
Signed-off-by: Jorgen Lundman <lundman@lundman.net>
Closes#13097
Evaluated every variable that lives in .data (and globals in .rodata)
in the kernel modules, and constified/eliminated/localised them
appropriately. This means that all read-only data is now actually
read-only data, and, if possible, at file scope. A lot of previously-
global-symbols became inlinable (and inlined!) constants. Probably
not in a big Wowee Performance Moment, but hey.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes#12899
As of the Linux 5.9 kernel a fallthrough macro has been added which
should be used to anotate all intentional fallthrough paths. Once
all of the kernel code paths have been updated to use fallthrough
the -Wimplicit-fallthrough option will because the default. To
avoid warnings in the OpenZFS code base when this happens apply
the fallthrough macro.
Additional reading: https://lwn.net/Articles/794944/
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#12441
ZFS_READONLY represents the "DOS R/O" attribute.
When that flag is set, we should behave as if write access
were not granted by anything in the ACL. In particular:
We _must_ allow writes after opening the file r/w, then
setting the DOS R/O attribute, and writing some more.
(Similar to how you can write after fchmod(fd, 0444).)
Restore these semantics which were lost on FreeBSD when refactoring
zfs_write. To my knowledge Linux does not actually expose this flag,
but we'll need it to eventually so I've added the supporting checks.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes#11693
The acltype property is currently hidden on FreeBSD and does not
reflect the NFSv4 style ZFS ACLs used on the platform. This makes it
difficult to observe that a pool imported from FreeBSD on Linux has a
different type of ACL that is being ignored, and vice versa.
Add an nfsv4 acltype and expose the property on FreeBSD.
Make the default acltype nfsv4 on FreeBSD.
Setting acltype to an unhanded style is treated the same as setting
it to off. The ACLs will not be removed, but they will be ignored.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes#10520
Prefer acltype=off|posix, retaining the old names as aliases.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes#10918
In absence of inheriting entry for owner@, group@, or everyone@,
zfs_acl_chmod() is called to set these. This can cause confusion for Samba
admins who do not expect these entries to appear on newly created files and
directories once they have been stripped from from the parent directory.
When aclmode is set to "restricted", chmod is prevented on non-trivial ACLs.
It is not a stretch to assume that in this case the administrator does not want
ZFS to add the missing special entries. Add check for this aclmode, and if an
inherited entry is present skip zfs_acl_chmod().
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andrew Walker <awalker@ixsystems.com>
Closes#10748
Mark functions used only in the same translation unit as static. This
only includes functions that do not have a prototype in a header file
either.
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Closes#10470
reflect delete permissions for ACLs
Authored by: Kevin Crowe <kevin.crowe@nexenta.com>
Reviewed by: Gordon Ross <gwr@nexenta.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Approved by: Richard Lowe <richlowe@richlowe.net>
Ported-by: Paul B. Henson <henson@acm.org>
Porting Notes:
* Only comments are updated
OpenZFS-issue: https://www.illumos.org/issues/6765
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/da412744bcCloses#10266
- and some additional considerations
Authored by: Kevin Crowe <kevin.crowe@nexenta.com>
Reviewed by: Gordon Ross <gwr@nexenta.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Approved by: Richard Lowe <richlowe@richlowe.net>
Ported-by: Paul B. Henson <henson@acm.org>
OpenZFS-issue: https://www.illumos.org/issues/6762
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/1eb4e906ecCloses#10266
with aclmode=passthrough
Authored by: Albert Lee <trisk@nexenta.com>
Reviewed by: Gordon Ross <gwr@nexenta.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Approved by: Richard Lowe <richlowe@richlowe.net>
Ported-by: Paul B. Henson <henson@acm.org>
OpenZFS-issue: https://www.illumos.org/issues/6764
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/de0f1ddb59Closes#10266
Porting notes:
* Updated zfs_acl_chmod to take 'boolean_t isdir' as first parameter
rather than 'zfsvfs_t *zfsvfs'
* zfs man pages changes mixed between zfs and new zfsprops man pages
Reviewed by: Aram Hvrneanu <aram@nexenta.com>
Reviewed by: Gordon Ross <gwr@nexenta.com>
Reviewed by: Robert Gordon <rbg@openrbg.com>
Reviewed by: Mark.Maybee@oracle.com
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Approved by: Garrett D'Amore <garrett@nexenta.com>
Ported-by: Paul B. Henson <henson@acm.org>
OpenZFS-issue: https://www.illumos.org/issues/742
OpenZFS-issue: https://www.illumos.org/issues/664
OpenZFS-issue: https://www.illumos.org/issues/279
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/a3c49ce110Closes#10266
The quota functions are common to all implementations and can be
moved to common code. As a simplification they were moved to the
Linux platform code in the initial refactoring.
Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@ixsystems.com>
Closes#9710
Change many of the znops routines to take a znode rather
than an inode so that zfs_replay code can be largely shared
and in the future the much of the znops code may be shared.
Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes#9708
Provide a common zfs_file_* interface which can be implemented on all
platforms to perform normal file access from either the kernel module
or the libzpool library.
This allows all non-portable vnode_t usage in the common code to be
replaced by the new portable zfs_file_t. The associated vnode and
kobj compatibility functions, types, and macros have been removed
from the SPL. Moving forward, vnodes should only be used in platform
specific code when provided by the native operating system.
Reviewed-by: Sean Eric Fagan <sef@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes#9556
Some of the znode fields are different and functions
consuming an inode don't exist on FreeBSD.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes#9536
The FreeBSD implementation can fail, allow this function to
fail and add the required error handling for Linux.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes#9541
It's mostly a noop on ZoL and it conflicts with platforms that
support dtrace. Remove this header to resolve the conflict.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Signed-off-by: Matt Macy <mmacy@FreeBSD.org>
Closes#9497
Move platform specific Linux source under module/os/linux/
and update the build system accordingly. Additional code
restructuring will follow to make the common code fully
portable.
Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Macy <mmacy@FreeBSD.org>
Closes#9206