Implementation of SSE optimized Fletcher-4

Builds off of 1eeb4562 (Implementation of AVX2 optimized Fletcher-4).
This commit adds SSE2 and SSSE3 implementations of the Fletcher-4
algorithm. Each is automatically selected at module load if it
benchmarks higher than all other available implementations.

The module benchmark was also amended to measure the performance of
the byteswapped version of Fletcher-4 as well as the non-byteswapped
version. The average performance of the two is used to select the
fastest implementation available on the host system.

Adds two new values to an existing zcommon module parameter:
- zfs_fletcher_4_impl (str)
    "sse2"    - new SSE2 implementation if available
    "ssse3"   - new SSSE3 implementation if available

Signed-off-by: Tyler J. Stachecki <stachecki.tyler@gmail.com>
Signed-off-by: Gvozden Neskovic <neskovic@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4789
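
The selection rule described in the message can be sketched roughly as follows. This is an illustration only, not the code in this commit: impl_bench_t, its fields, and select_fastest() are hypothetical names, and the throughput figures are assumed to have been measured already.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical benchmark record; not the structure used by the module. */
typedef struct impl_bench {
	const char *name;	/* e.g. "scalar", "sse2", "ssse3" */
	uint64_t native_bw;	/* measured non-byteswapped throughput */
	uint64_t byteswap_bw;	/* measured byteswapped throughput */
} impl_bench_t;

/*
 * Return the index of the entry with the highest average throughput.
 * Comparing sums is equivalent to comparing averages of two values.
 * On a tie the earlier entry is kept.
 */
static size_t
select_fastest(const impl_bench_t *b, size_t n)
{
	size_t best = 0;

	assert(n > 0);
	for (size_t i = 1; i < n; i++) {
		if (b[i].native_bw + b[i].byteswap_bw >
		    b[best].native_bw + b[best].byteswap_bw)
			best = i;
	}
	return (best);
}
```

Averaging the two figures means an implementation that is fast only in the native case does not automatically win over one that is reasonably fast in both.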
2026-05-22 02:27:36 +03:00 · 2016-06-23 23:32:40 -04:00
parent dfbc86309f
commit 35a76a0366
6 changed files with 243 additions and 5 deletions
@@ -61,6 +61,14 @@ typedef struct fletcher_4_func {
 	const char *name;
 } fletcher_4_ops_t;

+#if defined(HAVE_SSE2)
+extern const fletcher_4_ops_t fletcher_4_sse2_ops;
+#endif
+
+#if defined(HAVE_SSE2) && defined(HAVE_SSSE3)
+extern const fletcher_4_ops_t fletcher_4_ssse3_ops;
+#endif
+
 #if defined(HAVE_AVX) && defined(HAVE_AVX2)
 extern const fletcher_4_ops_t fletcher_4_avx2_ops;
 #endif