Implementation of AVX2 optimized Fletcher-4

New functionality:
- Preserves existing scalar implementation.
- Adds AVX2 optimized Fletcher-4 computation.
- Selects the fastest routine at module load by running a benchmark.
- Test case for Fletcher-4 added to ztest.
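
For reference, the scalar computation that the AVX2 path accelerates can be
sketched as below. This is a minimal illustration, assuming native byte order
and a buffer that is a whole number of 32-bit words; the names
`fletcher4_sums_t` and `fletcher4_scalar_sketch` are hypothetical and are not
the identifiers used in this commit.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the four running 64-bit sums. */
typedef struct {
	uint64_t a, b, c, d;
} fletcher4_sums_t;

/*
 * Scalar Fletcher-4 sketch over an array of 32-bit words.
 * Each word is added to a, then each sum is folded into the next one.
 * All sums wrap modulo 2^64 through ordinary unsigned overflow.
 */
static void
fletcher4_scalar_sketch(const uint32_t *words, size_t nwords,
    fletcher4_sums_t *out)
{
	uint64_t a = 0, b = 0, c = 0, d = 0;

	for (size_t i = 0; i < nwords; i++) {
		a += words[i];
		b += a;
		c += b;
		d += c;
	}

	out->a = a;
	out->b = b;
	out->c = c;
	out->d = d;
}
```

Because every sum depends on the previous one, a vectorized version typically
keeps several independent lanes of sums and combines them at the end.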

New zcommon module parameters:
-  zfs_fletcher_4_impl (str): selects the implementation to use.
    "fastest" - use the fastest version available
    "cycle"   - cycle through all available implementations for ztest
    "scalar"  - use the original version
    "avx2"    - new AVX2 implementation if available

Performance comparison (Intel i7 CPU, 1MB data buffers):
- Scalar:  4216 MB/s
- AVX2:   14499 MB/s (about 3.4x the scalar rate)

Read `/sys/module/zcommon/parameters/zfs_fletcher_4_impl` to get the
list of supported values. An implementation that is not supported on
the system is not shown. The currently selected option is enclosed
in `[]`.
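
A tool that reads this parameter could pick out the active choice with a
helper like the one below. This is only a sketch: it assumes the file content
is a single line of names with exactly one of them wrapped in `[]` (for
example `fastest [scalar] avx2`), and `selected_impl_sketch` is a hypothetical
name, not part of this commit.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the name found between the first '[' and the following ']' of
 * `list` into `out`. Returns 0 on success, or -1 if no bracketed name
 * exists or if it does not fit in `outlen` bytes with its terminator.
 */
static int
selected_impl_sketch(const char *list, char *out, size_t outlen)
{
	const char *lb = strchr(list, '[');
	const char *rb;
	size_t len;

	if (lb == NULL)
		return (-1);

	rb = strchr(lb + 1, ']');
	if (rb == NULL)
		return (-1);

	len = (size_t)(rb - (lb + 1));
	if (len + 1 > outlen)
		return (-1);

	memcpy(out, lb + 1, len);
	out[len] = '\0';
	return (0);
}
```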

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4330
Jinshan Xiong
2015-12-09 15:34:16 -08:00
committed by Brian Behlendorf
parent 8fbbc6b4cf
commit 1eeb4562a7
12 changed files with 589 additions and 70 deletions
@@ -22,6 +22,7 @@ KERNEL_C = \
zfs_comutil.c \
zfs_deleg.c \
zfs_fletcher.c \
zfs_fletcher_intel.c \
zfs_namecheck.c \
zfs_prop.c \
zfs_uio.c \
@@ -40,6 +40,7 @@
#include <sys/utsname.h>
#include <sys/time.h>
#include <sys/systeminfo.h>
#include <zfs_fletcher.h>
/*
* Emulation of kernel services in userland.
@@ -1236,12 +1237,15 @@ kernel_init(int mode)
spa_init(mode);
fletcher_4_init();
tsd_create(&rrw_tsd_key, rrw_tsd_destroy);
}
void
kernel_fini(void)
{
fletcher_4_fini();
spa_fini();
system_taskq_fini();