mirror_zfs/module/zfs
Brian Behlendorf 82a37189aa Implement SA based xattrs
The current ZFS implementation stores xattrs on disk using a hidden
directory.  In this directory a file name represents the xattr name
and the file contents are the xattr binary data.  This approach is
very flexible and allows for arbitrarily large xattrs.  However,
it also suffers from a significant performance penalty.  Accessing
a single xattr can require up to three disk seeks.

  1) Lookup the dnode object.
  2) Lookup the dnode's xattr directory object.
  3) Lookup the xattr object in the directory.

To avoid this performance penalty Linux filesystems such as ext3
and xfs try to store the xattr as part of the inode on disk.  When
the xattr is too large to store in the inode then a single external
block is allocated for it.  In practice most xattrs are small
and this approach works well.

The addition of System Attributes (SA) to zfs provides us a clean
way to make this optimization.  When the dataset property 'xattr=sa'
is set then xattrs will be preferentially stored as System Attributes.
This allows tiny xattrs (~100 bytes) to be stored with the dnode and
up to 64k of xattrs to be stored in the spill block.  If additional
xattr space is required, which is unlikely under Linux, they will be
stored using the traditional directory approach.
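
The three storage tiers described above can be sketched as a small
decision function.  This is only an illustration of the policy as the
commit message states it, not code from the ZFS tree: the enum, the
function name, and both constants are hypothetical, and the "~100 bytes"
figure is approximate.

```c
#include <stddef.h>

/* Hypothetical names; not taken from the ZFS sources. */
enum xattr_home {
	XATTR_IN_DNODE,		/* stored with the dnode */
	XATTR_IN_SPILL,		/* stored in the spill block */
	XATTR_IN_DIR		/* traditional hidden directory */
};

/* Thresholds as quoted in the commit message (the first is approximate). */
#define	APPROX_DNODE_XATTR_BYTES	100u
#define	MAX_SPILL_XATTR_BYTES		(64u * 1024u)

/*
 * Pick a home for a file's xattrs.  total_bytes is assumed to be the
 * packed size of all name/value pairs on the file, including names
 * and any encoding overhead.
 */
enum xattr_home
xattr_home_for(int sa_enabled, size_t total_bytes)
{
	if (!sa_enabled)
		return (XATTR_IN_DIR);	/* xattr=sa is not set */
	if (total_bytes <= APPROX_DNODE_XATTR_BYTES)
		return (XATTR_IN_DNODE);
	if (total_bytes <= MAX_SPILL_XATTR_BYTES)
		return (XATTR_IN_SPILL);
	return (XATTR_IN_DIR);		/* overflow to the directory */
}
```

Under these assumptions a single 65536-byte value plus its name already
exceeds the 64k tier, so it would fall through to the directory.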

This optimization results in roughly a 3x performance improvement
when accessing xattrs which brings zfs roughly to parity with ext4
and xfs (see table below).  When multiple xattrs are stored per-file
the performance improvements are even greater because all of the
xattrs stored in the spill block will be cached.
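
The "roughly 3x" figure can be checked against the zfs-dir and zfs-sa
columns of the table below.  The sketch copies those two columns for
value sizes up to 16384 bytes (the 65536-byte row is left out because
the SA case overflowed to directories) and computes the per-row and
mean speedup.  The struct and function names are hypothetical.

```c
#include <stddef.h>

/* One table row: value size and elapsed seconds for each ZFS layout. */
struct xattr_timing {
	size_t bytes;
	double dir_secs;	/* zfs-dir column */
	double sa_secs;		/* zfs-sa column */
};

/* Timings copied from the benchmark table, 1..16384 bytes. */
const struct xattr_timing setxattr_rows[] = {
	{ 1, 21.50, 4.57 }, { 32, 21.98, 4.60 }, { 256, 21.36, 5.92 },
	{ 1024, 22.83, 8.45 }, { 4096, 22.52, 10.73 }, { 16384, 34.30, 19.20 },
};
const struct xattr_timing getxattr_rows[] = {
	{ 1, 6.29, 2.43 }, { 32, 6.78, 2.48 }, { 256, 6.22, 3.14 },
	{ 1024, 6.24, 3.27 }, { 4096, 6.90, 3.94 }, { 16384, 145.90, 7.55 },
};

/* Speedup of SA based xattrs over directory based xattrs for one row. */
double
sa_speedup(const struct xattr_timing *row)
{
	return (row->dir_secs / row->sa_secs);
}

/* Arithmetic mean of the per-row speedups; 0.0 for an empty array. */
double
mean_sa_speedup(const struct xattr_timing *rows, size_t nrows)
{
	double sum = 0.0;
	size_t i;

	for (i = 0; i < nrows; i++)
		sum += sa_speedup(&rows[i]);
	return (nrows > 0 ? sum / (double)nrows : 0.0);
}
```

For setxattr the mean over these six rows comes out to about 3.3x.  For
getxattr the per-row speedup is between about 1.8x and 2.7x up to 4096
bytes and jumps to about 19x at 16384 bytes.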

However, by default SA based xattrs are disabled in the Linux port
to maximize compatibility with other implementations.  If you do
enable SA based xattrs then they will not be visible on platforms
which do not support this feature.

----------------------------------------------------------------------
   Time in seconds to get/set one xattr of N bytes on 100,000 files
------+--------------------------------+------------------------------
      |            setxattr            |            getxattr
bytes |  ext4     xfs zfs-dir  zfs-sa  |  ext4     xfs zfs-dir  zfs-sa
------+--------------------------------+------------------------------
1     |  2.33   31.88   21.50    4.57  |  2.35    2.64    6.29    2.43
32    |  2.79   30.68   21.98    4.60  |  2.44    2.59    6.78    2.48
256   |  3.25   31.99   21.36    5.92  |  2.32    2.71    6.22    3.14
1024  |  3.30   32.61   22.83    8.45  |  2.40    2.79    6.24    3.27
4096  |  3.57  317.46   22.52   10.73  |  2.78   28.62    6.90    3.94
16384 |   n/a 2342.39   34.30   19.20  |   n/a   45.44  145.90    7.55
65536 |   n/a 2941.39  128.15  131.32* |   n/a  141.92  256.85  262.12*

Legend:
* ext4      - Stock RHEL6.1 ext4 mounted with '-o user_xattr'.
* xfs       - Stock RHEL6.1 xfs mounted with default options.
* zfs-dir   - Directory based xattrs only.
* zfs-sa    - Prefer SAs but spill into directories as needed, a
              trailing * indicates overflow into directories occurred.

NOTE: Ext4 supports 4096 bytes of xattr name/value pairs per file.
NOTE: XFS and ZFS have no limit on xattr name/value pairs per file.
NOTE: Linux limits individual name/value pairs to 65536 bytes.
NOTE: All setxattr/getxattr calls were done after dropping the cache.
NOTE: All tests were run against a single hard drive.
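
The commit message does not say which tool produced these timings.  As
a rough sketch of the operation being measured, the function below sets
one xattr of N bytes on a file and reads it back using the Linux
setxattr(2) and getxattr(2) calls.  The function name and return
convention are hypothetical, and a real benchmark would also loop over
100,000 files, time each phase, and drop the cache between phases.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/xattr.h>

/*
 * Set one xattr of nbytes (> 0) on path, then read it back.
 * Returns 1 if the value round-trips, 0 if the file system refused
 * the set (for example, no user xattr support), and -1 on allocation
 * failure or if the value read back does not match.
 */
int
xattr_roundtrip(const char *path, const char *name, size_t nbytes)
{
	char *in = malloc(nbytes);
	char *out = malloc(nbytes);
	ssize_t got;
	int rc = -1;

	if (in == NULL || out == NULL)
		goto done;
	memset(in, 'x', nbytes);

	if (setxattr(path, name, in, nbytes, 0) != 0) {
		rc = 0;		/* not supported or not permitted here */
		goto done;
	}

	got = getxattr(path, name, out, nbytes);
	if (got == (ssize_t)nbytes && memcmp(in, out, nbytes) == 0)
		rc = 1;
done:
	free(in);
	free(out);
	return (rc);
}
```

Called as xattr_roundtrip(path, "user.test", 256), this exercises the
256-byte row of the table for a single file; the "user." prefix is the
namespace unprivileged processes may write to on Linux.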

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #443
2011-11-28 15:45:51 -08:00
arc.c Add L2ARC tunables 2011-07-08 12:44:11 -07:00
bplist.c Fix gcc missing parenthesis warnings 2010-08-31 08:38:35 -07:00
bpobj.c Update to onnv_147 2010-08-26 14:24:34 -07:00
dbuf.c Illumos #764: panic in zfs:dbuf_sync_list 2011-08-01 12:09:11 -07:00
ddt_zap.c Fix stack ddt_zap_lookup() 2011-05-31 12:17:27 -07:00
ddt.c Fix stack ddt_class_contains() 2011-05-31 12:17:27 -07:00
dmu_diff.c Update to onnv_147 2010-08-26 14:24:34 -07:00
dmu_object.c Add linux kernel module support 2010-08-31 13:41:58 -07:00
dmu_objset.c Merge branch 'zpl' 2011-02-18 09:31:25 -08:00
dmu_send.c Illumos #755: dmu_recv_stream builds incomplete guid_to_ds_map 2011-10-18 11:18:14 -07:00
dmu_traverse.c Revert "Fix stack traverse_visitbp()" 2011-05-31 12:17:27 -07:00
dmu_tx.c Add linux kernel module support 2010-08-31 13:41:58 -07:00
dmu_zfetch.c Add missing ZFS tunables 2011-05-04 10:02:37 -07:00
dmu.c Suppress large kmem_alloc() warning 2011-02-10 09:27:22 -08:00
dnode_sync.c Fix dbuf eviction assertion 2010-08-31 08:38:45 -07:00
dnode.c Improve meta data performance 2011-11-03 10:19:21 -07:00
dsl_dataset.c Illumos #1092: zfs refratio property 2011-08-01 12:09:11 -07:00
dsl_deadlist.c Update core ZFS code from build 121 to build 141. 2010-05-28 13:45:14 -07:00
dsl_deleg.c Add linux kernel module support 2010-08-31 13:41:58 -07:00
dsl_dir.c Add linux kernel module support 2010-08-31 13:41:58 -07:00
dsl_pool.c Add missing ZFS tunables 2011-05-04 10:02:37 -07:00
dsl_prop.c Fix enum compiler warning 2011-02-23 12:52:51 -08:00
dsl_scan.c Add missing ZFS tunables 2011-05-04 10:02:37 -07:00
dsl_synctask.c Add linux kernel module support 2010-08-31 13:41:58 -07:00
fm.c Add missing ZFS tunables 2011-05-04 10:02:37 -07:00
gzip.c Fix zmod.h usage in userspace 2010-08-31 08:38:46 -07:00
lzjb.c Fix stack lzjb 2010-08-31 08:38:49 -07:00
Makefile.in Implemented NFS export_operations. 2011-04-29 12:36:13 -07:00
metaslab.c Illumos #1051: zfs should handle imbalanced luns 2011-08-01 12:09:11 -07:00
refcount.c Fix gcc uninitialized variable warnings 2010-08-31 08:38:43 -07:00
rrwlock.c Enable rrwlock.c compilation 2010-12-07 16:05:25 -08:00
sa.c Implement SA based xattrs 2011-11-28 15:45:51 -08:00
sha256.c Add linux sha2 support 2010-08-31 13:41:59 -07:00
spa_boot.c Add linux kernel module support 2010-08-31 13:41:58 -07:00
spa_config.c Prototype/structure update for Linux 2011-02-10 09:27:21 -08:00
spa_errlog.c Add linux kernel module support 2010-08-31 13:41:58 -07:00
spa_history.c Add linux kernel module support 2010-08-31 13:41:58 -07:00
spa_misc.c Illumos #1051: zfs should handle imbalanced luns 2011-08-01 12:09:11 -07:00
spa.c Change sun.com URLs to zfsonlinux.org 2011-10-24 09:52:21 -07:00
space_map.c Update core ZFS code from build 121 to build 141. 2010-05-28 13:45:14 -07:00
txg.c Illumos #1313: Integer overflow in txg_delay() 2011-08-01 12:09:43 -07:00
uberblock.c Update core ZFS code from build 121 to build 141. 2010-05-28 13:45:14 -07:00
unique.c Fix gcc ident pragma warnings 2010-08-27 15:34:02 -07:00
vdev_cache.c Illumos #175: zfs vdev cache consumes excessive memory 2011-08-01 12:09:11 -07:00
vdev_disk.c Linux 2.6.37 compat, WRITE_FLUSH_FUA 2011-06-17 14:37:26 -07:00
vdev_file.c Prototype/structure update for Linux 2011-02-10 09:27:21 -08:00
vdev_label.c Fix gcc uninitialized variable warnings 2010-08-31 08:38:43 -07:00
vdev_mirror.c Fix gcc c90 compliance warnings 2010-08-27 15:28:32 -07:00
vdev_missing.c Update core ZFS code from build 121 to build 141. 2010-05-28 13:45:14 -07:00
vdev_queue.c Add missing ZFS tunables 2011-05-04 10:02:37 -07:00
vdev_raidz.c Fix variables named current 2010-08-31 08:38:44 -07:00
vdev_root.c Fix gcc c90 compliance warnings 2010-08-27 15:28:32 -07:00
vdev.c Add missing ZFS tunables 2011-05-04 10:02:37 -07:00
zap_leaf.c Fix gcc uninitialized variable warnings 2010-08-31 08:38:43 -07:00
zap_micro.c Export symbols for the full ZAP API 2011-09-27 16:12:36 -07:00
zap.c Fix rw_init() usage 2010-08-31 08:38:46 -07:00
zfs_acl.c Linux compat 2.6.39: mount_nodev() 2011-07-01 13:36:39 -07:00
zfs_byteswap.c Add linux kernel module support 2010-08-31 13:41:58 -07:00
zfs_debug.c Update core ZFS code from build 121 to build 141. 2010-05-28 13:45:14 -07:00
zfs_dir.c Linux compat 2.6.39: mount_nodev() 2011-07-01 13:36:39 -07:00
zfs_fm.c Initial zio delay timing 2010-10-12 14:55:02 -07:00
zfs_fuid.c Drop HAVE_XVATTR macros 2011-03-02 11:44:34 -08:00
zfs_ioctl.c Suppress kmem_alloc() warning in zfs_prop_set_special() 2011-09-15 20:26:51 -07:00
zfs_log.c Drop HAVE_XVATTR macros 2011-03-02 11:44:34 -08:00
zfs_onexit.c Add linux kernel device support 2010-08-31 13:41:50 -07:00
zfs_replay.c Use Linux ATTR_ versions 2011-03-03 11:29:15 -08:00
zfs_rlock.c Range lock performance improvements 2011-03-08 12:44:06 -08:00
zfs_sa.c Implement SA based xattrs 2011-11-28 15:45:51 -08:00
zfs_vfsops.c Implement SA based xattrs 2011-11-28 15:45:51 -08:00
zfs_vnops.c Fix a race condition in zfs_getattr_fast() 2011-11-03 10:13:09 -07:00
zfs_znode.c Implement SA based xattrs 2011-11-28 15:45:51 -08:00
zil.c Illumos #883: ZIL reuse during remount corruption 2011-08-01 12:09:11 -07:00
zio_checksum.c Update core ZFS code from build 121 to build 141. 2010-05-28 13:45:14 -07:00
zio_compress.c Update core ZFS code from build 121 to build 141. 2010-05-28 13:45:14 -07:00
zio_inject.c Add missing ZFS tunables 2011-05-04 10:02:37 -07:00
zio.c Improve meta data performance 2011-11-03 10:19:21 -07:00
zle.c Update core ZFS code from build 121 to build 141. 2010-05-28 13:45:14 -07:00
zpl_export.c Linux compat 2.6.39: mount_nodev() 2011-07-01 13:36:39 -07:00
zpl_file.c Linux 3.1 compat, fops->fsync() 2011-11-10 10:03:08 -08:00
zpl_inode.c Set mtime on symbolic links 2011-10-18 15:49:31 -07:00
zpl_super.c Linux compat 2.6.39: mount_nodev() 2011-07-01 13:36:39 -07:00
zpl_xattr.c Implement SA based xattrs 2011-11-28 15:45:51 -08:00
zrlock.c Export ZFS symbols needed by Lustre. 2010-09-17 16:24:15 -07:00
zvol.c Fix 'zfs set volsize=N pool/dataset' 2011-05-02 08:54:40 -07:00