Add superscalar fletcher4

This is the Fletcher4 algorithm implemented in pure C, but with multiple independent sets of counters, using algorithms identical to those used for SSE/NEON and AVX2. This allows for faster execution on cores with strong superscalar capabilities but weak SIMD capabilities.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Romain Dolbeau <romain.dolbeau@atos.net>
Closes #5317
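To illustrate the multiple-counter idea, here is a minimal sketch in plain C of a two-lane Fletcher4 compared against the ordinary single-chain version. It is not the code from this patch: the names `f4_sum_t`, `f4_scalar` and `f4_two_lane` are hypothetical, the input is assumed to be an even number of 32-bit words, and the final recombination step is derived for this sketch from the closed form of the Fletcher4 sums.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type for this sketch; not the ZFS zio_cksum_t. */
typedef struct {
	uint64_t a, b, c, d;
} f4_sum_t;

/* Reference version: a single dependent chain of four running sums. */
f4_sum_t
f4_scalar(const uint32_t *w, size_t n)
{
	f4_sum_t s = { 0, 0, 0, 0 };

	for (size_t i = 0; i < n; i++) {
		s.a += w[i];
		s.b += s.a;
		s.c += s.b;
		s.d += s.c;
	}
	return (s);
}

/*
 * Two-lane version: even-indexed words feed lane 0, odd-indexed words
 * feed lane 1. The two chains do not depend on each other, so a
 * superscalar core can execute them in parallel.
 * Assumption of this sketch: n is even.
 */
f4_sum_t
f4_two_lane(const uint32_t *w, size_t n)
{
	uint64_t a0 = 0, b0 = 0, c0 = 0, d0 = 0;
	uint64_t a1 = 0, b1 = 0, c1 = 0, d1 = 0;
	f4_sum_t s;

	assert(n % 2 == 0);

	for (size_t i = 0; i < n; i += 2) {
		a0 += w[i];	a1 += w[i + 1];
		b0 += a0;	b1 += a1;
		c0 += b0;	c1 += b1;
		d0 += c0;	d1 += c1;
	}

	/*
	 * Recombine the per-lane sums into the sequential result.
	 * All arithmetic wraps modulo 2^64, as in the scalar version.
	 */
	s.a = a0 + a1;
	s.b = 2 * b0 + 2 * b1 - a1;
	s.c = 4 * c0 - b0 + 4 * c1 - 3 * b1;
	s.d = 8 * d0 - 4 * c0 + 8 * d1 - 8 * c1 + b1;
	return (s);
}
```

The design trade-off is that the inner loop carries twice as many accumulators in exchange for two shorter dependency chains, with a small fixed cost at the end to fold the lanes back together. A four-lane variant would follow the same pattern with different recombination coefficients.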
2026-05-24 19:28:53 +03:00 · 2016-11-04 18:53:03 +01:00
parent ace1eae84c
commit 7f3194932d
8 changed files with 405 additions and 2 deletions
@@ -164,6 +164,8 @@ static fletcher_4_ops_t fletcher_4_fastest_impl = {

 static const fletcher_4_ops_t *fletcher_4_impls[] = {
 	&fletcher_4_scalar_ops,
+	&fletcher_4_superscalar_ops,
+	&fletcher_4_superscalar4_ops,
 #if defined(HAVE_SSE2)
 	&fletcher_4_sse2_ops,
 #endif