zcommon: add specialized versions of cityhash4

Specializing cityhash4 on 32-bit architectures can reduce the size
of stack frames as well as instruction count. This is a tiny but
useful optimization, since some callers invoke it frequently.

For the 1/2/3/4-argument versions, the stack usage (in bytes)
on some 32-bit architectures is as follows:

- x86: 32, 32, 32, 40
- arm-v7a: 20, 20, 28, 36
- riscv: 0, 0, 0, 16
- power: 16, 16, 16, 32
- mipsel: 8, 8, 8, 24

Each actual argument (even one passed as 0) adds roughly the same
number of generated multiplication instructions:

- x86: 9, 12, 15, 18
- arm-v7a: 6, 8, 10, 12
- riscv / power: 12, 18, 20, 24
- mipsel: 9, 12, 15, 19
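
As a rough illustration of where the savings come from, the sketch
below hand-folds the 1-argument case: with w1, w3 and w4 fixed at 0,
the products a, c and d vanish and only the three multiplications
inside cityhash_helper() remain. The constants, rotate() and the full
body of cityhash_helper() are assumptions based on the public CityHash
reference (this diff shows only the helper's tail), and
cityhash1_folded() and check_folding() are hypothetical names used
only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, taken from the CityHash reference constants k1 and k2. */
#define	HASH_K1	0xb492b66fbe98f273ULL
#define	HASH_K2	0x9ae16a3b2f90404fULL

/* Assumed helper: 64-bit rotate right; shift must be in 1..63. */
static inline uint64_t
rotate(uint64_t s, int shift)
{
	return ((s >> shift) | (s << (64 - shift)));
}

/* Assumed helper body: three 64-bit multiplications, always executed. */
static inline uint64_t
cityhash_helper(uint64_t u, uint64_t v, uint64_t mul)
{
	uint64_t a = (u ^ v) * mul;
	a ^= (a >> 47);
	uint64_t b = (v ^ a) * mul;
	b ^= (b >> 47);
	b *= mul;
	return (b);
}

/* Generic body as in this commit: three more multiplications (a, c, d). */
static inline uint64_t
cityhash_impl(uint64_t w1, uint64_t w2, uint64_t w3, uint64_t w4)
{
	uint64_t mul = HASH_K2 + 64;
	uint64_t a = w1 * HASH_K1;
	uint64_t b = w2;
	uint64_t c = w4 * mul;
	uint64_t d = w3 * HASH_K2;
	return (cityhash_helper(rotate(a + b, 43) + rotate(c, 30) + d,
	    a + rotate(b + HASH_K2, 18) + c, mul));
}

/*
 * Hypothetical hand-folded form of cityhash_impl(0, w, 0, 0):
 * a = c = d = 0, so no multiplication is left outside the helper.
 */
static uint64_t
cityhash1_folded(uint64_t w)
{
	return (cityhash_helper(rotate(w, 43), rotate(w + HASH_K2, 18),
	    HASH_K2 + 64));
}

/* Checks that the folded form agrees with the generic body for one input. */
static void
check_folding(uint64_t w)
{
	assert(cityhash1_folded(w) == cityhash_impl(0, w, 0, 0));
}
```

At the source level this leaves 3, 4, 5 and 6 64-bit multiplications
for the 1/2/3/4-argument versions. The x86 figures above are
consistent with three 32-bit multiply instructions per 64-bit product,
though the exact lowering depends on the compiler and target.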

On 64-bit architectures the trends are similar, but both stack sizes
and instruction counts are much smaller, so the savings are negligible.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Closes #16131
Closes #16483
Shengqi Chen
2024-09-07 21:55:03 +08:00
committed by Brian Behlendorf
parent 1c35206124
commit 0ae4460c61
3 changed files with 56 additions and 2 deletions
+31 -2
@@ -49,8 +49,8 @@ cityhash_helper(uint64_t u, uint64_t v, uint64_t mul)
 	return (b);
 }
-uint64_t
-cityhash4(uint64_t w1, uint64_t w2, uint64_t w3, uint64_t w4)
+static inline uint64_t
+cityhash_impl(uint64_t w1, uint64_t w2, uint64_t w3, uint64_t w4)
 {
 	uint64_t mul = HASH_K2 + 64;
 	uint64_t a = w1 * HASH_K1;
@@ -59,9 +59,38 @@ cityhash4(uint64_t w1, uint64_t w2, uint64_t w3, uint64_t w4)
 	uint64_t d = w3 * HASH_K2;
 	return (cityhash_helper(rotate(a + b, 43) + rotate(c, 30) + d,
 	    a + rotate(b + HASH_K2, 18) + c, mul));
 }
+/*
+ * Passing w as the 2nd argument could save one 64-bit multiplication.
+ */
+uint64_t
+cityhash1(uint64_t w)
+{
+	return (cityhash_impl(0, w, 0, 0));
+}
+uint64_t
+cityhash2(uint64_t w1, uint64_t w2)
+{
+	return (cityhash_impl(w1, w2, 0, 0));
+}
+uint64_t
+cityhash3(uint64_t w1, uint64_t w2, uint64_t w3)
+{
+	return (cityhash_impl(w1, w2, w3, 0));
+}
+uint64_t
+cityhash4(uint64_t w1, uint64_t w2, uint64_t w3, uint64_t w4)
+{
+	return (cityhash_impl(w1, w2, w3, w4));
+}
 #if defined(_KERNEL)
+EXPORT_SYMBOL(cityhash1);
+EXPORT_SYMBOL(cityhash2);
+EXPORT_SYMBOL(cityhash3);
 EXPORT_SYMBOL(cityhash4);
 #endif