mirror of https://git.proxmox.com/git/mirror_zfs.git (synced 2026-05-22 02:27:36 +03:00)
Implementation of SSE optimized Fletcher-4
Builds off of 1eeb4562 (Implementation of AVX2 optimized Fletcher-4)
This commit adds another implementation of the Fletcher-4 algorithm.
It is automatically selected at module load if it benchmarks higher
than all other available implementations.
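For context, the scalar computation that the SIMD variants accelerate can be sketched as below. This is a minimal illustration, not the code from this commit: the names fletcher4_sums_t, bswap32 and fletcher4_scalar are hypothetical, and the byteswap flag stands in for what is presumably a separate byteswap routine in the real module.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Four running 64-bit sums (hypothetical type name for this sketch). */
typedef struct {
	uint64_t a, b, c, d;
} fletcher4_sums_t;

/* Reverse the byte order of one 32-bit word. */
static uint32_t
bswap32(uint32_t w)
{
	return ((w >> 24) | ((w >> 8) & 0x0000ff00u) |
	    ((w << 8) & 0x00ff0000u) | (w << 24));
}

/*
 * Scalar Fletcher-4 over a buffer of 32-bit words.
 * When byteswap is nonzero, each word is byte-reversed before it is added.
 */
static fletcher4_sums_t
fletcher4_scalar(const void *buf, size_t len, int byteswap)
{
	fletcher4_sums_t s = { 0, 0, 0, 0 };
	const unsigned char *p = buf;

	/* The input must be a whole number of 32-bit words. */
	assert(len % sizeof (uint32_t) == 0);

	for (size_t i = 0; i < len; i += sizeof (uint32_t)) {
		uint32_t w;

		memcpy(&w, p + i, sizeof (w));	/* alignment-safe load */
		if (byteswap)
			w = bswap32(w);
		s.a += w;
		s.b += s.a;
		s.c += s.b;
		s.d += s.c;
	}
	return (s);
}
```

Each sum depends on the one before it, which is why vectorized versions have to process several independent lanes and combine them at the end.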
The module benchmark was also amended to measure the performance of
the byteswapped version of Fletcher-4 as well as the non-byteswapped
version. The average performance of the two is used to select the
fastest implementation available on the host system.
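The selection rule described above can be sketched as follows. This is an illustration only: impl_bench_t, its fields and pick_fastest are hypothetical names, and the real benchmark code may measure and store its results differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One benchmark result per implementation (hypothetical layout). */
typedef struct {
	const char *name;	/* selector name, e.g. "scalar" */
	uint64_t native_bw;	/* measured throughput, native byte order */
	uint64_t byteswap_bw;	/* measured throughput, byteswapped input */
} impl_bench_t;

/*
 * Return the index of the implementation with the highest average of
 * its native and byteswap results. Comparing the sums is equivalent to
 * comparing the averages, so the division by two is skipped.
 * Ties keep the earlier entry.
 */
static size_t
pick_fastest(const impl_bench_t *r, size_t n)
{
	size_t best = 0;

	assert(n > 0);
	for (size_t i = 1; i < n; i++) {
		uint64_t cur = r[i].native_bw + r[i].byteswap_bw;
		uint64_t top = r[best].native_bw + r[best].byteswap_bw;

		if (cur > top)
			best = i;
	}
	return (best);
}
```

Averaging the two results keeps an implementation that is fast only in the native case from being chosen over one that does well in both.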
Adds two new values to an existing zcommon module parameter:
- zfs_fletcher_4_impl (str)
    "sse2"  - new SSE2 implementation if available
    "ssse3" - new SSSE3 implementation if available
Signed-off-by: Tyler J. Stachecki <stachecki.tyler@gmail.com>
Signed-off-by: Gvozden Neskovic <neskovic@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4789
Committed by Brian Behlendorf
parent dfbc86309f
commit 35a76a0366
@@ -838,11 +838,15 @@ Default value: \fB67,108,864\fR.
 .RS 12n
 Select a fletcher 4 implementation.
 .sp
-Supported selectors are: \fBfastest\fR, \fBscalar\fR, and \fBavx2\fR when
-AVX2 is supported by the processor. If multiple implementations of fletcher 4
-are available the \fBfastest\fR will be chosen using a micro benchmark.
-Selecting \fBscalar\fR results in the original CPU based calculation being
-used, \fBavx2\fR uses the AVX2 vector instructions to compute a fletcher 4.
+Supported selectors are: \fBfastest\fR, \fBscalar\fR, \fBsse2\fR, \fBssse3\fR,
+and \fBavx2\fR. All of the selectors except \fBfastest\fR and \fBscalar\fR
+require instruction set extensions to be available and will only appear if ZFS
+detects that they are present at runtime. If multiple implementations of
+fletcher 4 are available, the \fBfastest\fR will be chosen using a micro
+benchmark. Selecting \fBscalar\fR results in the original CPU based calculation
+being used. Selecting any option other than \fBfastest\fR and \fBscalar\fR
+results in vector instructions from the respective CPU instruction set being
+used.
 .sp
 Default value: \fBfastest\fR.
 .RE
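The availability rule in the new man page text (every selector except fastest and scalar needs a CPU extension detected at runtime) can be sketched as a small lookup. This is an assumption-laden illustration: the CPU_* flags, the selectors table and selector_available are hypothetical names, and the caller is assumed to supply the detected feature bits; the module's actual detection and parameter handling are not shown in this diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical feature bits, assumed to be filled in by CPU detection. */
enum {
	CPU_SSE2 = 1 << 0,
	CPU_SSSE3 = 1 << 1,
	CPU_AVX2 = 1 << 2
};

/* Selector names from the man page, with the extension each one needs. */
static const struct {
	const char *name;
	unsigned need;
} selectors[] = {
	{ "fastest", 0 },
	{ "scalar", 0 },
	{ "sse2", CPU_SSE2 },
	{ "ssse3", CPU_SSSE3 },
	{ "avx2", CPU_AVX2 },
};

/*
 * Return true if name is a known selector and the extension it needs
 * is present in cpu_features. Unknown names are rejected.
 */
static bool
selector_available(const char *name, unsigned cpu_features)
{
	assert(name != NULL);
	for (size_t i = 0; i < sizeof (selectors) / sizeof (selectors[0]); i++) {
		if (strcmp(name, selectors[i].name) == 0)
			return ((selectors[i].need & cpu_features) ==
			    selectors[i].need);
	}
	return (false);
}
```

With this shape, fastest and scalar are accepted on any CPU because they require no feature bits.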