mirror_zfs/module
Alexander Motin 9dcdee7889
Optimize microzaps
Microzap on-disk format does not include a hash tree, expecting one to
be built in RAM during mzap_open().  The built tree is linked to DMU
user buffer, freed when original DMU buffer is dropped from cache. I've
found that workloads accessing many large directories and having active
eviction from DMU cache spend significant amount of time building and
then destroying the trees.  I've also found that for each 64 byte mzap
element additional 64 byte tree element is allocated, that is a waste
of memory and CPU caches.

Improve memory efficiency of the hash tree by switching from AVL-tree
to B-tree.  It allows to save 24 bytes per element just on pointers.
Save 32 bits on mze_hash by storing only upper 32 bits since lower 32
bits are always zero for microzaps.  Save 16 bits on mze_chunkid, since
microzap can never have so many elements.  Respectively with the 16 bits
there can be no more than 16 bits of collision differentiators.  As
result, struct mzap_ent now drops from 48 (rounded to 64) to 8 bytes.

Tune B-trees for small data.  Reduce BTREE_CORE_ELEMS from 128 to 126
to allow struct zfs_btree_core in case of 8 byte elements to pack into
2KB instead of 4KB.  Aside of the microzaps it should also help 32bit
range trees.  Allow custom B-tree leaf size to reduce memmove() time.

Split zap_name_alloc() into zap_name_alloc() and zap_name_init_str().
It allows to not waste time allocating/freeing memory when processing
multiple names in a loop during mzap_open().

Together on a pool with 10K directories of 1800 files each and DMU
cache limited to 128MB this reduces time of `find . -name zzz` by 41%
from 7.63s to 4.47s, and saves additional ~30% of CPU time on the DMU
cache reclamation.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes #14039
2022-10-20 11:57:15 -07:00
..
avl Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
icp crypto_get_ptrs() should always write to *out_data_2 2022-10-19 17:10:56 -07:00
lua Fix theoretical array overflow in lua_typename() 2022-10-14 13:41:56 -07:00
nvpair Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
os Support idmapped mount 2022-10-19 11:17:09 -07:00
unicode Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zcommon Add options to zfs redundant_metadata property 2022-10-19 17:07:51 -07:00
zfs Optimize microzaps 2022-10-20 11:57:15 -07:00
zstd Cleanup: Specify unsignedness on things that should not be signed 2022-09-27 16:42:41 -07:00
.gitignore FreeBSD: Ignore symlink to i386 includes 2022-08-02 16:34:23 -07:00
Kbuild.in Cleanup dead spa_boot code 2022-09-13 16:40:10 -07:00
Makefile.bsd Cleanup dead spa_boot code 2022-09-13 16:40:10 -07:00
Makefile.in autoconf: use include directives instead of recursing down lib 2022-05-10 10:18:11 -07:00