mirror_zfs/module/Makefile.bsd
Tino Reichardt 985c33b132
Introduce BLAKE3 checksums as an OpenZFS feature
This commit adds BLAKE3 checksums to OpenZFS, it has similar
performance to Edon-R, but without the caveats around the latter.

Homepage of BLAKE3: https://github.com/BLAKE3-team/BLAKE3
Wikipedia: https://en.wikipedia.org/wiki/BLAKE_(hash_function)#BLAKE3

Short description of Wikipedia:

  BLAKE3 is a cryptographic hash function based on Bao and BLAKE2,
  created by Jack O'Connor, Jean-Philippe Aumasson, Samuel Neves, and
  Zooko Wilcox-O'Hearn. It was announced on January 9, 2020, at Real
  World Crypto. BLAKE3 is a single algorithm with many desirable
  features (parallelism, XOF, KDF, PRF and MAC), in contrast to BLAKE
  and BLAKE2, which are algorithm families with multiple variants.
  BLAKE3 has a binary tree structure, so it supports a practically
  unlimited degree of parallelism (both SIMD and multithreading) given
  enough input. The official Rust and C implementations are
  dual-licensed as public domain (CC0) and the Apache License.

Along with adding the BLAKE3 hash into the OpenZFS infrastructure a
new benchmarking file called chksum_bench was introduced.  When read
it reports the speed of the available checksum functions.

On Linux: cat /proc/spl/kstat/zfs/chksum_bench
On FreeBSD: sysctl kstat.zfs.misc.chksum_bench

This is an example output of an i3-1005G1 test system with Debian 11:

implementation      1k      4k     16k     64k    256k      1m      4m
edonr-generic     1196    1602    1761    1749    1762    1759    1751
skein-generic      546     591     608     615     619     612     616
sha256-generic     240     300     316     314     304     285     276
sha512-generic     353     441     467     476     472     467     426
blake3-generic     308     313     313     313     312     313     312
blake3-sse2        402    1289    1423    1446    1432    1458    1413
blake3-sse41       427    1470    1625    1704    1679    1607    1629
blake3-avx2        428    1920    3095    3343    3356    3318    3204
blake3-avx512      473    2687    4905    5836    5844    5643    5374

Output on Debian 5.10.0-10-amd64 system: (Ryzen 7 5800X)

implementation      1k      4k     16k     64k    256k      1m      4m
edonr-generic     1840    2458    2665    2719    2711    2723    2693
skein-generic      870     966     996     992    1003    1005    1009
sha256-generic     415     442     453     455     457     457     457
sha512-generic     608     690     711     718     719     720     721
blake3-generic     301     313     311     309     309     310     310
blake3-sse2        343    1865    2124    2188    2180    2181    2186
blake3-sse41       364    2091    2396    2509    2463    2482    2488
blake3-avx2        365    2590    4399    4971    4915    4802    4764

Output on Debian 5.10.0-9-powerpc64le system: (POWER 9)

implementation      1k      4k     16k     64k    256k      1m      4m
edonr-generic     1213    1703    1889    1918    1957    1902    1907
skein-generic      434     492     520     522     511     525     525
sha256-generic     167     183     187     188     188     187     188
sha512-generic     186     216     222     221     225     224     224
blake3-generic     153     152     154     153     151     153     153
blake3-sse2        391    1170    1366    1406    1428    1426    1414
blake3-sse41       352    1049    1212    1174    1262    1258    1259

Output on Debian 5.10.0-11-arm64 system: (Pi400)

implementation      1k      4k     16k     64k    256k      1m      4m
edonr-generic      487     603     629     639     643     641     641
skein-generic      271     299     303     308     309     309     307
sha256-generic     117     127     128     130     130     129     130
sha512-generic     145     165     170     172     173     174     175
blake3-generic      81      29      71      89      89      89      89
blake3-sse2        112     323     368     379     380     371     374
blake3-sse41       101     315     357     368     369     364     360

Structurally, the new code is mainly split into these parts:
- 1x cross platform generic c variant: blake3_generic.c
- 4x assembly for X86-64 (SSE2, SSE4.1, AVX2, AVX512)
- 2x assembly for ARMv8 (NEON converted from SSE2)
- 2x assembly for PPC64-LE (POWER8 converted from SSE2)
- one file for switching between the implementations

Note the PPC64 assembly requires the VSX instruction set and the
kfpu_begin() / kfpu_end() calls on PowerPC were updated accordingly.

Reviewed-by: Felix Dörre <felix@dogcraft.de>
Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Co-authored-by: Rich Ercolani <rincebrain@gmail.com>
Closes #10058
Closes #12918
2022-06-08 15:55:57 -07:00

448 lines
9.3 KiB
Makefile

.if !defined(WITH_CTF)
WITH_CTF=1
.endif
.include <bsd.sys.mk>
SRCDIR=${.CURDIR}
INCDIR=${.CURDIR:H}/include
KMOD= openzfs
.PATH: ${SRCDIR}/avl \
${SRCDIR}/icp/algs/blake3 \
${SRCDIR}/icp/asm-aarch64/blake3 \
${SRCDIR}/icp/asm-ppc64/blake3 \
${SRCDIR}/icp/asm-x86_64/blake3 \
${SRCDIR}/lua \
${SRCDIR}/nvpair \
${SRCDIR}/icp/algs/edonr \
${SRCDIR}/os/freebsd/spl \
${SRCDIR}/os/freebsd/zfs \
${SRCDIR}/unicode \
${SRCDIR}/zcommon \
${SRCDIR}/zfs \
${SRCDIR}/zstd \
${SRCDIR}/zstd/lib/common \
${SRCDIR}/zstd/lib/compress \
${SRCDIR}/zstd/lib/decompress
CFLAGS+= -I${.OBJDIR:H}/include
CFLAGS+= -I${INCDIR}
CFLAGS+= -I${INCDIR}/os/freebsd
CFLAGS+= -I${INCDIR}/os/freebsd/spl
CFLAGS+= -I${INCDIR}/os/freebsd/zfs
CFLAGS+= -I${SRCDIR}/zstd/include
CFLAGS+= -I${SRCDIR}/icp/include
CFLAGS+= -include ${INCDIR}/os/freebsd/spl/sys/ccompile.h
CFLAGS+= -D__KERNEL__ -DFREEBSD_NAMECACHE -DBUILDING_ZFS -D__BSD_VISIBLE=1 \
-DHAVE_UIO_ZEROCOPY -DWITHOUT_NETDUMP -D__KERNEL -D_SYS_CONDVAR_H_ \
-D_SYS_VMEM_H_ -DKDTRACE_HOOKS -DSMP -DCOMPAT_FREEBSD11
.if ${MACHINE_ARCH} == "amd64"
CFLAGS+= -D__x86_64 -DHAVE_SSE2 -DHAVE_SSSE3 -DHAVE_SSE4_1 -DHAVE_SSE4_2 \
-DHAVE_AVX -DHAVE_AVX2 -DHAVE_AVX512F -DHAVE_AVX512VL
.endif
.if defined(WITH_DEBUG) && ${WITH_DEBUG} == "true"
CFLAGS+= -DZFS_DEBUG -g
.if defined(WITH_INVARIANTS) && ${WITH_INVARIANTS} == "true"
CFLAGS+= -DINVARIANTS -DWITNESS -DOPENSOLARIS_WITNESS
.endif
.if defined(WITH_O0) && ${WITH_O0} == "true"
CFLAGS+= -O0
.endif
.else
CFLAGS += -DNDEBUG
.endif
.if defined(WITH_VFS_DEBUG) && ${WITH_VFS_DEBUG} == "true"
# kernel must also be built with this option for this to work
CFLAGS+= -DDEBUG_VFS_LOCKS
.endif
.if defined(WITH_GCOV) && ${WITH_GCOV} == "true"
CFLAGS+= -fprofile-arcs -ftest-coverage
.endif
DEBUG_FLAGS=-g
.if ${MACHINE_ARCH} == "i386" || ${MACHINE_ARCH} == "powerpc" || \
${MACHINE_ARCH} == "arm"
CFLAGS+= -DBITS_PER_LONG=32
.else
CFLAGS+= -DBITS_PER_LONG=64
.endif
SRCS= vnode_if.h device_if.h bus_if.h
#avl
SRCS+= avl.c
# icp
SRCS+= edonr.c
#icp/algs/blake3
SRCS+= blake3.c \
blake3_generic.c \
blake3_impl.c \
blake3_x86-64.c
#icp/asm-aarch64/blake3
SRCS+= b3_aarch64_sse2.S \
b3_aarch64_sse41.S
#icp/asm-ppc64/blake3
SRCS+= b3_ppc64le_sse2.S \
b3_ppc64le_sse41.S
#icp/asm-x86_64/blake3
SRCS+= blake3_avx2.S \
blake3_avx512.S \
blake3_sse2.S \
blake3_sse41.S
#lua
SRCS+= lapi.c \
lauxlib.c \
lbaselib.c \
lcode.c \
lcompat.c \
lcorolib.c \
lctype.c \
ldebug.c \
ldo.c \
lfunc.c \
lgc.c \
llex.c \
lmem.c \
lobject.c \
lopcodes.c \
lparser.c \
lstate.c \
lstring.c \
lstrlib.c \
ltable.c \
ltablib.c \
ltm.c \
lvm.c \
lzio.c
#nvpair
SRCS+= nvpair.c \
fnvpair.c \
nvpair_alloc_spl.c \
nvpair_alloc_fixed.c
#os/freebsd/spl
SRCS+= acl_common.c \
callb.c \
list.c \
sha256c.c \
sha512c.c \
spl_acl.c \
spl_cmn_err.c \
spl_dtrace.c \
spl_kmem.c \
spl_kstat.c \
spl_misc.c \
spl_policy.c \
spl_procfs_list.c \
spl_string.c \
spl_sunddi.c \
spl_sysevent.c \
spl_taskq.c \
spl_uio.c \
spl_vfs.c \
spl_vm.c \
spl_zlib.c \
spl_zone.c
.if ${MACHINE_ARCH} == "i386" || ${MACHINE_ARCH} == "powerpc" || \
${MACHINE_ARCH} == "arm"
SRCS+= spl_atomic.c
.endif
#os/freebsd/zfs
SRCS+= abd_os.c \
arc_os.c \
crypto_os.c \
dmu_os.c \
hkdf.c \
kmod_core.c \
spa_os.c \
sysctl_os.c \
vdev_file.c \
vdev_geom.c \
vdev_label_os.c \
zfs_acl.c \
zfs_ctldir.c \
zfs_debug.c \
zfs_dir.c \
zfs_ioctl_compat.c \
zfs_ioctl_os.c \
zfs_racct.c \
zfs_vfsops.c \
zfs_vnops_os.c \
zfs_znode.c \
zio_crypt.c \
zvol_os.c
#unicode
SRCS+= uconv.c \
u8_textprep.c
#zcommon
SRCS+= zfeature_common.c \
zfs_comutil.c \
zfs_deleg.c \
zfs_fletcher.c \
zfs_fletcher_avx512.c \
zfs_fletcher_intel.c \
zfs_fletcher_sse.c \
zfs_fletcher_superscalar.c \
zfs_fletcher_superscalar4.c \
zfs_namecheck.c \
zfs_prop.c \
zpool_prop.c \
zprop_common.c
#zfs
SRCS+= abd.c \
aggsum.c \
arc.c \
blake3_zfs.c \
blkptr.c \
bplist.c \
bpobj.c \
btree.c \
cityhash.c \
dbuf.c \
dbuf_stats.c \
bptree.c \
bqueue.c \
dataset_kstats.c \
ddt.c \
ddt_zap.c \
dmu.c \
dmu_diff.c \
dmu_object.c \
dmu_objset.c \
dmu_recv.c \
dmu_redact.c \
dmu_send.c \
dmu_traverse.c \
dmu_tx.c \
dmu_zfetch.c \
dnode.c \
dnode_sync.c \
dsl_dataset.c \
dsl_deadlist.c \
dsl_deleg.c \
dsl_bookmark.c \
dsl_dir.c \
dsl_crypt.c \
dsl_destroy.c \
dsl_pool.c \
dsl_prop.c \
dsl_scan.c \
dsl_synctask.c \
dsl_userhold.c \
edonr_zfs.c \
fm.c \
gzip.c \
lzjb.c \
lz4.c \
lz4_zfs.c \
metaslab.c \
mmp.c \
multilist.c \
objlist.c \
pathname.c \
range_tree.c \
refcount.c \
rrwlock.c \
sa.c \
sha256.c \
skein_zfs.c \
spa.c \
spa_boot.c \
spa_checkpoint.c \
spa_config.c \
spa_errlog.c \
spa_history.c \
spa_log_spacemap.c \
spa_misc.c \
spa_stats.c \
space_map.c \
space_reftree.c \
txg.c \
uberblock.c \
unique.c \
vdev.c \
vdev_cache.c \
vdev_draid.c \
vdev_draid_rand.c \
vdev_indirect.c \
vdev_indirect_births.c \
vdev_indirect_mapping.c \
vdev_initialize.c \
vdev_label.c \
vdev_mirror.c \
vdev_missing.c \
vdev_queue.c \
vdev_raidz.c \
vdev_raidz_math.c \
vdev_raidz_math_scalar.c \
vdev_raidz_math_avx2.c \
vdev_raidz_math_avx512bw.c \
vdev_raidz_math_avx512f.c \
vdev_raidz_math_sse2.c \
vdev_raidz_math_ssse3.c \
vdev_rebuild.c \
vdev_removal.c \
vdev_root.c \
vdev_trim.c \
zap.c \
zap_leaf.c \
zap_micro.c \
zcp.c \
zcp_get.c \
zcp_global.c \
zcp_iter.c \
zcp_set.c \
zcp_synctask.c \
zfeature.c \
zfs_byteswap.c \
zfs_chksum.c \
zfs_file_os.c \
zfs_fm.c \
zfs_fuid.c \
zfs_ioctl.c \
zfs_log.c \
zfs_onexit.c \
zfs_quota.c \
zfs_ratelimit.c \
zfs_replay.c \
zfs_rlock.c \
zfs_sa.c \
zfs_vnops.c \
zil.c \
zio.c \
zio_checksum.c \
zio_compress.c \
zio_inject.c \
zle.c \
zrlock.c \
zthr.c \
zvol.c
#zstd
SRCS+= zfs_zstd.c \
entropy_common.c \
error_private.c \
fse_decompress.c \
pool.c \
zstd_common.c \
fse_compress.c \
hist.c \
huf_compress.c \
zstd_compress.c \
zstd_compress_literals.c \
zstd_compress_sequences.c \
zstd_compress_superblock.c \
zstd_double_fast.c \
zstd_fast.c \
zstd_lazy.c \
zstd_ldm.c \
zstd_opt.c \
huf_decompress.c \
zstd_ddict.c \
zstd_decompress.c \
zstd_decompress_block.c
beforeinstall:
.if ${MK_DEBUG_FILES} != "no"
mtree -eu \
-f /etc/mtree/BSD.debug.dist \
-p ${DESTDIR}/usr/lib
.endif
.include <bsd.kmod.mk>
CFLAGS.gcc+= -Wno-pointer-to-int-cast
CFLAGS.lapi.c= -Wno-cast-qual
CFLAGS.lcompat.c= -Wno-cast-qual
CFLAGS.lobject.c= -Wno-cast-qual
CFLAGS.ltable.c= -Wno-cast-qual
CFLAGS.lvm.c= -Wno-cast-qual
CFLAGS.nvpair.c= -DHAVE_RPC_TYPES -Wno-cast-qual
CFLAGS.spl_string.c= -Wno-cast-qual
CFLAGS.spl_vm.c= -Wno-cast-qual
CFLAGS.spl_zlib.c= -Wno-cast-qual
CFLAGS.abd.c= -Wno-cast-qual
CFLAGS.zfs_log.c= -Wno-cast-qual
CFLAGS.zfs_vnops_os.c= -Wno-pointer-arith
CFLAGS.u8_textprep.c= -Wno-cast-qual
CFLAGS.zfs_fletcher.c= -Wno-cast-qual -Wno-pointer-arith
CFLAGS.zfs_fletcher_intel.c= -Wno-cast-qual -Wno-pointer-arith
CFLAGS.zfs_fletcher_sse.c= -Wno-cast-qual -Wno-pointer-arith
CFLAGS.zfs_fletcher_avx512.c= -Wno-cast-qual -Wno-pointer-arith
CFLAGS.zprop_common.c= -Wno-cast-qual
CFLAGS.ddt.c= -Wno-cast-qual
CFLAGS.dmu.c= -Wno-cast-qual
CFLAGS.dmu_traverse.c= -Wno-cast-qual
CFLAGS.dsl_dir.c= -Wno-cast-qual
CFLAGS.dsl_deadlist.c= -Wno-cast-qual
CFLAGS.dsl_prop.c= -Wno-cast-qual
CFLAGS.edonr.c=-Wno-cast-qual
CFLAGS.fm.c= -Wno-cast-qual
CFLAGS.lz4_zfs.c= -Wno-cast-qual
CFLAGS.spa.c= -Wno-cast-qual
CFLAGS.spa_misc.c= -Wno-cast-qual
CFLAGS.sysctl_os.c= -include ../zfs_config.h
CFLAGS.vdev_draid.c= -Wno-cast-qual
CFLAGS.vdev_raidz.c= -Wno-cast-qual
CFLAGS.vdev_raidz_math.c= -Wno-cast-qual
CFLAGS.vdev_raidz_math_scalar.c= -Wno-cast-qual
CFLAGS.vdev_raidz_math_avx2.c= -Wno-cast-qual -Wno-duplicate-decl-specifier
CFLAGS.vdev_raidz_math_avx512f.c= -Wno-cast-qual -Wno-duplicate-decl-specifier
CFLAGS.vdev_raidz_math_sse2.c= -Wno-cast-qual -Wno-duplicate-decl-specifier
CFLAGS.zap_leaf.c= -Wno-cast-qual
CFLAGS.zap_micro.c= -Wno-cast-qual
CFLAGS.zcp.c= -Wno-cast-qual
CFLAGS.zfs_fm.c= -Wno-cast-qual
CFLAGS.zfs_ioctl.c= -Wno-cast-qual
CFLAGS.zil.c= -Wno-cast-qual
CFLAGS.zio.c= -Wno-cast-qual
CFLAGS.zrlock.c= -Wno-cast-qual
CFLAGS.zfs_zstd.c= -Wno-cast-qual -Wno-pointer-arith
CFLAGS.entropy_common.c= -fno-tree-vectorize -U__BMI__
CFLAGS.error_private.c= -fno-tree-vectorize -U__BMI__
CFLAGS.fse_decompress.c= -fno-tree-vectorize -U__BMI__
CFLAGS.pool.c= -fno-tree-vectorize -U__BMI__
CFLAGS.xxhash.c= -fno-tree-vectorize -U__BMI__
CFLAGS.zstd_common.c= -fno-tree-vectorize -U__BMI__
CFLAGS.fse_compress.c= -fno-tree-vectorize -U__BMI__
CFLAGS.hist.c= -fno-tree-vectorize -U__BMI__
CFLAGS.huf_compress.c= -fno-tree-vectorize -U__BMI__
CFLAGS.zstd_compress.c= -fno-tree-vectorize -U__BMI__
CFLAGS.zstd_compress_literals.c= -fno-tree-vectorize -U__BMI__
CFLAGS.zstd_compress_sequences.c= -fno-tree-vectorize -U__BMI__
CFLAGS.zstd_compress_superblock.c= -fno-tree-vectorize -U__BMI__
CFLAGS.zstd_double_fast.c= -fno-tree-vectorize -U__BMI__
CFLAGS.zstd_fast.c= -fno-tree-vectorize -U__BMI__
CFLAGS.zstd_lazy.c= -fno-tree-vectorize -U__BMI__
CFLAGS.zstd_ldm.c= -fno-tree-vectorize -U__BMI__
CFLAGS.zstd_opt.c= -fno-tree-vectorize -U__BMI__
CFLAGS.huf_decompress.c= -fno-tree-vectorize -U__BMI__
CFLAGS.zstd_ddict.c= -fno-tree-vectorize -U__BMI__
CFLAGS.zstd_decompress.c= -fno-tree-vectorize -U__BMI__
CFLAGS.zstd_decompress_block.c= -fno-tree-vectorize -U__BMI__