Commit Graph

17 Commits

Author SHA1 Message Date
Gvozden Neskovic
ee36c709c3 Performance optimization of AVL tree comparator functions
perf: 2.75x faster ddt_entry_compare()
    The first 256 bits of ddt_key_t are a block checksum, which is expected
to be close to random data. Hence, on average, the comparison only needs to
look at the first few bytes of the keys. To reduce the number of conditional
jump instructions, the result is computed as: sign(memcmp(k1, k2)).

The sign of an integer 'a' can be obtained as: `(0 < a) - (a < 0)` := {-1, 0, 1},
which is computed efficiently. Synthetic performance evaluation of the
original and new algorithms over 1G random keys on a 2.6GHz Intel(R) Xeon(R)
CPU E5-2660 v3:

old	6.85789 s
new	2.49089 s

perf: 2.8x faster vdev_queue_offset_compare() and vdev_queue_timestamp_compare()
    Compute the result directly instead of using conditionals

perf: zfs_range_compare()
    Speedup between 1.1x - 2.5x, depending on compiler version and
optimization level.

perf: spa_error_entry_compare()
    `bcmp()` is not suitable for comparator use. Use `memcmp()` instead.

perf: 2.8x faster metaslab_compare() and metaslab_rangesize_compare()
perf: 2.8x faster zil_bp_compare()
perf: 2.8x faster mze_compare()
perf: faster dbuf_compare()
perf: faster compares in spa_misc
perf: 2.8x faster layout_hash_compare()
perf: 2.8x faster space_reftree_compare()
perf: libzfs: faster avl tree comparators
perf: guid_compare()
perf: dsl_deadlist_compare()
perf: perm_set_compare()
perf: 2x faster range_tree_seg_compare()
perf: faster unique_compare()
perf: faster vdev_cache _compare()
perf: faster vdev_uberblock_compare()
perf: faster fuid _compare()
perf: faster zfs_znode_hold_compare()
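
The branch-free comparator pattern described in this commit can be sketched as follows. This is a minimal illustration, not the code from the patch: `example_key_t`, `sign_of()`, `example_key_compare()` and `example_u64_compare()` are hypothetical names, and the 32-byte key only stands in for the checksum portion of a key such as ddt_key_t.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 256-bit checksum key; not the real ddt_key_t. */
typedef struct example_key {
	uint8_t ek_cksum[32];
} example_key_t;

/* Branch-free sign of an integer: yields exactly -1, 0, or 1. */
static inline int
sign_of(int a)
{
	return ((0 < a) - (a < 0));
}

/*
 * AVL-style comparator for byte keys. memcmp() only promises a negative,
 * zero, or positive result, so it is normalized to -1, 0, or 1.
 */
static int
example_key_compare(const void *x1, const void *x2)
{
	const example_key_t *k1 = x1;
	const example_key_t *k2 = x2;

	assert(k1 != NULL && k2 != NULL);
	return (sign_of(memcmp(k1->ek_cksum, k2->ek_cksum,
	    sizeof (k1->ek_cksum))));
}

/*
 * Comparator for unsigned 64-bit values such as offsets or timestamps.
 * Comparing instead of subtracting avoids truncation and overflow when
 * the result is narrowed to int.
 */
static inline int
example_u64_compare(uint64_t a, uint64_t b)
{
	return ((a > b) - (a < b));
}
```

Both helpers return a value in {-1, 0, 1} without an if/else chain, which is the property the commit message relies on. A bcmp()-style result (zero or nonzero only) could not be used this way, because it carries no ordering information.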

Signed-off-by: Gvozden Neskovic <neskovic@gmail.com>
Signed-off-by: Richard Elling <richard.elling@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #5033
2016-08-31 14:35:34 -07:00
Nikolay Borisov
2c6abf15ff Remove znode's z_uid/z_gid member
Remove the duplicate z_uid/z_gid members, which are also held in the
generic vfs inode struct. This is done by first removing the members
from struct znode and then using the KUID_TO_SUID/KGID_TO_SGID
macros to access the respective member from struct inode. In cases
where the uid/gids are being marshalled from/to disk, use the newly
introduced zfs_(uid|gid)_(read|write) functions to properly
save the uids rather than the internal kernel representation.

Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #4685
Issue #227
2016-07-25 13:21:49 -07:00
Chunwei Chen
100a91aa3e Fix NFS credential
Commit f74b821 caused a regression where creating a file through NFS
always creates a file owned by root. This is because the patch enables the KSID
code in zfs_acl_ids_create, which uses the euid and egid of the current
process. However, on Linux, we should use fsuid and fsgid for file operations,
which is the original behaviour. So we revert this part of the code.

The patch also enables secpolicy_vnode_*. Since they are also used in file
operations, we change them to use fsuid and fsgid.

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4772
Closes #4758
2016-06-21 09:58:37 -07:00
Brian Behlendorf
f74b821a66 Add zfs allow and zfs unallow support
ZFS allows for specific permissions to be delegated to normal users
with the `zfs allow` and `zfs unallow` commands.  In addition, non-
privileged users should be able to run all of the following commands:

  * zpool [list | iostat | status | get]
  * zfs [list | get]

Historically this functionality was not available on Linux.  In order
to add it the secpolicy_* functions needed to be implemented and mapped
to the equivalent Linux capability.  Only then could the permissions on
`/dev/zfs` be relaxed and the internal ZFS permission checks used.

Even with this change some limitations remain.  Under Linux only the
root user is allowed to modify the namespace (unless it's a private
namespace).  This means the mount, mountpoint, canmount, unmount,
and remount delegations cannot be supported with the existing code.  It
may be possible to add this functionality in the future.

This functionality was validated with the cli_user and delegation test
cases from the ZFS Test Suite.  These tests exhaustively verify each
of the supported permissions which can be delegated and ensure only
an authorized user can perform it.

Two minor bug fixes were required for test-running.py.  First, the
Timer() object cannot be safely created in a `try:` block when there
is an unconditional `finally` block which references it.  Second,
when running as a normal user also check for scripts using
both the .ksh and .sh suffixes.

Finally, existing users who are simulating delegations by setting
group permissions on the /dev/zfs device should revert that
customization when updating to a version with this change.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #362 
Closes #434 
Closes #4100
Closes #4394 
Closes #4410 
Closes #4487
2016-06-07 09:16:52 -07:00
George Wilson
a117a6d66e Illumos #3522
3522 zfs module should not allow uninitialized variables
Reviewed by: Sebastien Roy <seb@delphix.com>
Reviewed by: Adam Leventhal <ahl@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Garrett D'Amore <garrett@damore.org>

References:
  https://www.illumos.org/issues/3522
  illumos/illumos-gate@d5285cae91

Ported-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>

Porting notes:

1. ZFSOnLinux had already addressed many of these issues because of
   its use of -Wall. However, the manner in which they were addressed
   differed. The illumos fixes replace the ones previously made in
   ZFSOnLinux to reduce code differences.

2. Part of the upstream patch made a small change to arc.c that might
   address zfsonlinux/zfs#1334.

3. The initialization of aclsize in zfs_log_create() differs because
   vsecp is a NULL pointer on ZFSOnLinux.

4. The changes to zfs_register_callbacks() were dropped because it
   has diverged and needs to be resynced.
2013-10-30 14:51:27 -07:00
Brian Behlendorf
5484965ab6 Drop HAVE_XVATTR macros
When I began work on the Posix layer it immediately became clear to
me that to integrate cleanly with the Linux VFS certain Solaris
specific things would have to go.  One of these things was to eliminate
as many Solaris specific types from the ZPL layer as possible.  They
would be replaced with their Linux equivalents.  This would not only
be good for performance, but for the general readability and health of
the code.  The Solaris and Linux VFS are different beasts and should
be treated as such.  Most of the code remains common for constructing
transactions and such, but there are subtle and important differences
which need to be respected.

This policy went quite far for certain types such as the vnode_t,
and it initially seemed to be working out well for the vattr_t.  There
was a relatively small amount of related xvattr_t code I was forced to
comment out with HAVE_XVATTR.  But it didn't look that hard to come
back soon and replace it all with a native Linux type.

However, after going down this path with xvattr some distance it
became clear that this code was woven into the ZPL more deeply than I
thought.  In particular its hooks went very deep into the ZPL replay code
and replacing it would not be as easy as I originally thought.

Rather than continue pursuing replacing and removing this code I've
taken a step back and reevaluated things.  This commit reverts many of
my previous commits which removed xvattr related code.  It restores
much of the code to its original upstream state and now relies on
improved xvattr_t support in the zfs package itself.

The result of this is that much of the code which I had commented
out, which accidentally broke things like replay, is now back in
place and working.  However, there may be a small performance
impact for getattr/setattr operations because they now require
a translation from native Linux to Solaris types.  For now that's
a price I'm willing to pay.  Once everything is completely functional
we can revisit the issue of removing the vattr_t/xvattr_t types.

Closes #111
2011-03-02 11:44:34 -08:00
Brian Behlendorf
037849f854 Use provided uid/gid for setattr
When changing the uid/gid of a file via zfs_setattr() use the
Posix id passed in iattr->ia_uid/gid.  While the zfs_fuid_create()
code already had the fuid support disabled for Linux it was
returning the uid/gid from the credential.  With this change
the 'chown' command which relies on setattr is now working
properly.

Also remove a little stray white space which was in front of the
zfs_update_inode() call at the end of zfs_setattr().
2011-02-17 14:23:48 -08:00
Brian Behlendorf
3558fd73b5 Prototype/structure update for Linux
I apologize in advance that too many things ended up in this commit.
While it could be separated into a whole series of commits, teasing
that all apart now would take considerable time and I'm not sure
there's much merit in it.  As such I'll just summarize the intent
of the changes which are all (or partly) in this commit.  Broadly
the intent is to remove as much Solaris specific code as possible
and replace it with native Linux equivalents.  More specifically:

1) Replace all instances of zfsvfs_t with zfs_sb_t.  While the
type is largely the same calling it private super block data
rather than a zfsvfs is more consistent with how Linux names
this.  While non-critical it makes the code easier to read when
you're thinking in Linux friendly VFS terms.

2) Replace vnode_t with struct inode.  The Linux VFS doesn't have
the notion of a vnode and there's absolutely no good reason to
create one.  There are in fact several good reasons to remove it.
It just adds overhead on Linux if we were to manage one, it
complicates the code, and it will likely lead to bugs since there's
a good chance it will be out of date.  The code has been updated
to remove all need for this type.

3) Replace all vtype_t's with umode types.  Along with this shift
all uses of types to mode bits.  The Solaris code would pass a
vtype which is redundant with the Linux mode.  Just update all the
code to use the Linux mode macros and remove this redundancy.

4) Remove use of vn_* helpers and replace where needed with
inode helpers.  The big example here is creating iput_async to
replace vn_rele_async.  Other vn helpers will be addressed as
needed but they should not be emulated.  They are a Solaris VFS'ism
and should simply be replaced with Linux equivalents.

5) Update znode alloc/free code.  Under Linux it's common to
embed the inode specific data with the inode itself.  This removes
the need for an extra memory allocation.  In zfs this information
is called a znode and it now embeds the inode with it.  Allocators
have been updated accordingly.

6) Minimal integration with the vfs flags for setting up the
super block and handling mount options has been added.  This
code will need to be refined but functionally it's all there.
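
The embedding described in item 5 can be illustrated with a small userspace sketch. It is not the kernel code: `struct example_inode`, `struct example_znode`, `example_itoz()`, `example_znode_alloc()` and `example_znode_free()` are hypothetical names, and calloc() merely stands in for whatever allocator the real code uses. The point is that one allocation holds both objects, and the private data is recovered from an inode pointer by subtracting the member offset (the same idea as the kernel's container_of()).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's struct inode. */
struct example_inode {
	uint64_t i_ino;
	unsigned int i_mode;
};

/* Hypothetical znode: filesystem-private data with the inode embedded. */
struct example_znode {
	uint64_t z_id;
	struct example_inode z_inode;	/* embedded, not a pointer */
};

/* Recover the enclosing znode from a pointer to its embedded inode. */
static struct example_znode *
example_itoz(struct example_inode *ip)
{
	assert(ip != NULL);
	return ((struct example_znode *)
	    ((char *)ip - offsetof(struct example_znode, z_inode)));
}

/* A single allocation covers both the znode and its inode. */
static struct example_inode *
example_znode_alloc(uint64_t id)
{
	struct example_znode *zp = calloc(1, sizeof (*zp));

	if (zp == NULL)
		return (NULL);
	zp->z_id = id;
	zp->z_inode.i_ino = id;
	return (&zp->z_inode);
}

/* Free the whole object, starting from the embedded inode pointer. */
static void
example_znode_free(struct example_inode *ip)
{
	free(example_itoz(ip));
}
```

Code that is handed only an inode pointer can call `example_itoz(ip)` to reach the private fields, so no second allocation or back-pointer is needed.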

This will be the first and last of these too-large-to-review commits.
2011-02-10 09:27:21 -08:00
Brian Behlendorf
a405c8a665 ACL related changes
A small collection of ACL related changes related to not
supporting fuid mapping.  This whole area will need to be
closely investigated.
2011-02-10 09:26:26 -08:00
Brian Behlendorf
60101509ee Add linux kernel disk support
Native Linux vdev disk interfaces

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2010-08-31 13:41:57 -07:00
Brian Behlendorf
c65aa5b2b9 Fix gcc missing parenthesis warnings
Gcc -Wall warn: 'missing parenthesis'

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2010-08-31 08:38:35 -07:00
Brian Behlendorf
572e285762 Update to onnv_147
This is the last official OpenSolaris tag before the public
development tree was closed.
2010-08-26 14:24:34 -07:00
Brian Behlendorf
428870ff73 Update core ZFS code from build 121 to build 141. 2010-05-28 13:45:14 -07:00
Brian Behlendorf
45d1cae3b8 Rebase master to b121 2009-08-18 11:43:27 -07:00
Brian Behlendorf
9babb37438 Rebase master to b117 2009-07-02 15:44:48 -07:00
Brian Behlendorf
fb5f0bc833 Rebase master to b105 2009-01-15 13:59:39 -08:00
Brian Behlendorf
172bb4bd5e Move the world out of /zfs/ and separate out module build tree 2008-12-11 11:08:09 -08:00