mirror_zfs

mirror of https://git.proxmox.com/git/mirror_zfs.git synced 2026-03-22 08:51:30 +03:00

Author	SHA1	Message	Date
Matthew Ahrens	d9eea113f8	It is not necessary to zero struct dbuf_hold_impl_data Under a workload which makes heavy use of `dbuf_hold()`, I noticed that a considerable amount of time was spent in `dbuf_hold_impl()`, due to its call to `kmem_zalloc(sizeof (struct dbuf_hold_impl_data) * DBUF_HOLD_IMPL_MAX_DEPTH)`, which is around 2KiB. This structure is used as a stack, to limit the size of the C stack as dbuf_hold() calls itself recursively. We make a recursive call to hold the parent's dbuf when the requested dbuf is not found. The vast majority of the time, the parent or grandparent indirect dbuf is cached, so the number of recursive calls is very low. However, we initialize this entire array for every call to dbuf_hold(). To improve performance, this commit changes `dbuf_hold()` to use `kmem_alloc()` instead of `kmem_zalloc()`. __dbuf_hold_impl_init is changed to initialize all members of the struct before they are used. I observed ~5% performance improvement on a workload which creates many files. Signed-off-by: Matthew Ahrens <mahrens@delphix.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4974	2016-08-16 15:27:17 -07:00
Gvozden Neskovic	f0c26069bd	zdb: fencepost error at zdb_cb.zcb_embedded_histogram[][] Erroneous access detected by gcc UndefinedBehaviorSanitizer: `zdb.c:2424:7: runtime error: index 112 out of bounds for type 'uint64_t [112]'` Fix: increase histogram size by 1 to accommodate all possible sizes. Signed-off-by: Gvozden Neskovic <neskovic@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4934 Issue #4883	2016-08-16 14:53:09 -07:00
Gvozden Neskovic	fc897b24b2	Rework of fletcher_4 module - Benchmark memory block is increased to 128kiB to reflect real block sizes more accurately. Measurements include all three stages needed for checksum generation, i.e. `init()/compute()/fini()`. The inner loop is repeated multiple times to offset overhead of time function. - Fastest implementation selects native and byteswap methods independently in benchmark. To support this new function pointers `init_byteswap()/fini_byteswap()` are introduced. - Implementation mutex lock is replaced by atomic variable. - To save time, benchmark is not executed in userspace. Instead, highest supported implementation is used for fastest. Default userspace selector is still 'cycle'. - `fletcher_4_native/byteswap()` methods use incremental methods to finish calculation if data size is not multiple of vector stride (currently 64B). - Added `fletcher_4_native_varsize()` special purpose method for use when buffer size is not known in advance. The method does not enforce 4B alignment on buffer size, and will ignore last (size % 4) bytes of the data buffer. - Benchmark `kstat` is changed to match the one of vdev_raidz. It now shows throughput for all supported implementations (in B/s), native and byteswap, as well as the code [fastest] is running. Example of `fletcher_4_bench` running on `Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60GHz`: implementation native byteswap scalar 4768120823 3426105750 sse2 7947841777 4318964249 ssse3 7951922722 6112191941 avx2 13269714358 11043200912 fastest avx2 avx2 Example of `fletcher_4_bench` running on `Intel(R) Xeon Phi(TM) CPU 7210 @ 1.30GHz`: implementation native byteswap scalar 1291115967 1031555336 sse2 2539571138 1280970926 ssse3 2537778746 1080016762 avx2 4950749767 1078493449 avx512f 9581379998 4010029046 fastest avx512f avx512f Signed-off-by: Gvozden Neskovic <neskovic@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4952	2016-08-16 14:11:55 -07:00
Gvozden Neskovic	70b258fc96	Fletcher4 implementation using avx512f instruction set Algorithm runs 8 parallel sums, consuming 8x uint32_t elements per loop iteration. Size alignment of main fletcher4 methods is adjusted accordingly. New implementation is called 'avx512f'. Note: byteswap method can be implemented more efficiently when avx512bw hardware becomes available. Currently, it is ~ 2x slower than native method. Table shows result of full (native) fletcher4 calculation for different buffer size: fletcher4 4KB 16KB 64KB 128KB 256KB 1MB 16MB -------------------------------------------------------------------- [scalar] 1213 1228 1231 1231 1225 1200 1160 [sse2] 2374 2442 2459 2456 2462 2250 2220 [avx2] 4288 4753 4871 4893 4900 4050 3882 [avx512f] 5975 8445 9196 9221 9262 6307 5620 Signed-off-by: Gvozden Neskovic <neskovic@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #4952	2016-08-16 14:11:14 -07:00
Gvozden Neskovic	32ffaa3de5	Add support for AVX-512 family of instruction sets This patch adds compiler and runtime tests (user and kernel) for following instruction sets: avx512f, avx512cd, avx512er, avx512pf, avx512bw, avx512dq, avx512vl, avx512ifma, avx512vbmi. note: Linux support for AVX-512F (Foundation) instruction set started with linux v3.15 Signed-off-by: Gvozden Neskovic <neskovic@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #4952	2016-08-16 14:10:33 -07:00
Rich Ercolani	6d836e6f8b	Add tunable to ignore hole_birth Adds a module option which disables the hole_birth optimization which has been responsible for several recent bugs, including issue #4050. Original-patch: https://gist.github.com/pcd1193182/2c0cd47211f3aee623958b4698836c48 Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4833	2016-08-15 09:52:56 -07:00
GeLiXin	e35c5a8265	Fix incorrect pool state after import Import a raidz pool which has a vdev with a bad label, zpool status shows the right state of the dev, but the wrong state of the pool. The pool state should be DEGRADED, not ONLINE. We examine the label in vdev_validate while in spa_load_impl, the bad label can be detected but doesn't propagate its state to the parent. There are other chances to propagate state in the following vdev_load if we failed to load DTL, but our pool is raidz1 which can tolerate a faulted disk. So we lost the last chance to correct the pool state. Propagate the leaf vdev's state to parent if its label was corrupted, as is done elsewhere in vdev_validate. Signed-off-by: GeLiXin <ge.lixin@zte.com.cn> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Don Brady <don.brady@intel.com> Closes #4948	2016-08-12 13:46:51 -07:00
Hans Rosenfeld	fb390aafc8	OpenZFS 5997 - FRU field not set during pool creation and never updated Authored by: Hans Rosenfeld <hans.rosenfeld@nexenta.com> Reviewed by: Dan Fields <dan.fields@nexenta.com> Reviewed by: Josef Sipek <josef.sipek@nexenta.com> Reviewed by: Richard Elling <richard.elling@gmail.com> Reviewed by: George Wilson <george.wilson@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Signed-off-by: Don Brady <don.brady@intel.com> Ported-by: Brian Behlendorf <behlendorf1@llnl.gov> OpenZFS-issue: https://www.illumos.org/issues/5997 OpenZFS-commit: https://github.com/openzfs/openzfs/commit/1437283 Porting Notes: In addition to the OpenZFS changes this patch realigns the events with those found in OpenZFS. Events which would be logged as sysevents on illumos have been been mapped to the 'sysevent' class for Linux. In addition, several subclass names have been changed to match what is used in OpenZFS. In all cases this means a '.' was changed to an '_' in the subclass. The scripts provided by ZoL have been updated, however users which provide scripts for any of the following events will need to rename them based on the new subclass names. ereport.fs.zfs.config.sync sysevent.fs.zfs.config_sync ereport.fs.zfs.zpool.destroy sysevent.fs.zfs.pool_destroy ereport.fs.zfs.zpool.reguid sysevent.fs.zfs.pool_reguid ereport.fs.zfs.vdev.remove sysevent.fs.zfs.vdev_remove ereport.fs.zfs.vdev.clear sysevent.fs.zfs.vdev_clear ereport.fs.zfs.vdev.check sysevent.fs.zfs.vdev_check ereport.fs.zfs.vdev.spare sysevent.fs.zfs.vdev_spare ereport.fs.zfs.vdev.autoexpand sysevent.fs.zfs.vdev_autoexpand ereport.fs.zfs.resilver.start sysevent.fs.zfs.resilver_start ereport.fs.zfs.resilver.finish sysevent.fs.zfs.resilver_finish ereport.fs.zfs.scrub.start sysevent.fs.zfs.scrub_start ereport.fs.zfs.scrub.finish sysevent.fs.zfs.scrub_finish ereport.fs.zfs.bootfs.vdev.attach sysevent.fs.zfs.bootfs_vdev_attach	2016-08-12 13:06:48 -07:00
luozhengzheng	834f1e426c	Fix a typo in ZIL write handling comment The following comment in zil.h * WR_COPIED: * If we know we'll immediately be committing the * transaction (FSYNC or FDSYNC), then we allocate a larger * log record here for the data and copy the data in. The word "the" should be "then". Signed-off-by: luozhengzheng <luo.zhengzheng@zte.com.cn> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4961	2016-08-12 10:30:16 -07:00
Jason Zaman	a3600a106d	icp: mark asm files with noexec stack If there is no explicit note in the .S files, the obj file will mark it as requiring an executable stack. This is unneeded and causes issues on hardened systems. More info: https://wiki.gentoo.org/wiki/Hardened/GNU_stack_quickstart Signed-off-by: Jason Zaman <jason@perfinion.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4947 Closes #4962	2016-08-12 09:51:26 -07:00
Jason Zaman	a9947ce771	icp: add no_const for PaX Compat The constify plugin will automatically constify a class of types that contain only function pointers. The icp structs fail to build if this is enabled with the following error. The no_const attribute makes the plugin skip those structs. module/icp/spi/kcf_spi.c: In function ‘copy_ops_vector_v1’: module/icp/spi/kcf_spi.c:61:16: error: assignment of read-only location ‘dst_ops->cou.cou_v1.co_control_ops’ ((dst)->ops) = *((src)->ops); ^ module/icp/spi/kcf_spi.c:74:2: note: in expansion of macro ‘KCF_SPI_COPY_OPS’ KCF_SPI_COPY_OPS(src_ops, dst_ops, co_control_ops); ^ Signed-off-by: Jason Zaman <jason@perfinion.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4947 Closes #4962	2016-08-12 09:51:19 -07:00
Brian Behlendorf	6eb73b0046	Reorder HAVE_BIO_RW_* checks The HAVE_BIO_RW_* #ifdef's must appear before REQ_* #ifdef's in the bio_is_flush() and bio_is_discard() macros. Linux 2.6.32 era kernels defined both of values and the HAVE_BIO_RW_* must be used in this case. This resulted in a panic in zconfig test 5. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Closes #4951 Closes #4959	2016-08-12 09:17:40 -07:00
Matthew Ahrens	169ab07cc8	OpenZFS 7263 - deeply nested nvlist can overflow stack nvlist_pack() and nvlist_unpack are implemented recursively, which can cause the stack to overflow with a deeply nested nvlist; i.e. an nvlist which contains an nvlist, which contains an nvlist, which... Unprivileged users can pass an nvlist to the kernel via certain ioctls on /dev/zfs, which the kernel will unpack without additional permission checking or validation. Therefore, an unprivileged user can cause the kernel's stack to overflow and panic. Ideally, these functions would be implemented non-recursively. As a quick fix, this patch limits the depth of the recursion and returns an error when attempting to pack and unpack a deeply-nested nvlist. Signed-off-by: Adam Leventhal <ahl@delphix.com> Signed-off-by: George Wilson <george.wilson@delphix.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Ported-by: Prakash Surya <prakash.surya@delphix.com> OpenZFS-issue: https://www.illumos.org/issues/7263 OpenZFS-commit: https://github.com/openzfs/openzfs/commit/0511d6d -	2016-08-11 15:58:03 -07:00
Chunwei Chen	b320dd91a9	Fix infinite loop when zdb -R with d flag Also print decompress progress to stderr so it wouldn't pollute raw output with r flag. Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4956	2016-08-11 15:21:32 -07:00
Chen Haiquan	d9c97ec08b	Use file_dentry and file_inode wrappers Fix bugs due to kernel change in torvalds/linux@4bacc9c923 ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay"). This problem crashes system when use zfs as a layer of overlayfs. Signed-off-by: Chen Haiquan <oc@yunify.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4914 Closes #4935	2016-08-11 12:06:37 -07:00
GeLiXin	d5884c3453	Fix indefinite article The indefinite article before nvlist should be "an", not "a". We have 27 "an nvlist" and 7 "a nvlist" in our comment, they should stay the same as we are such a strict filesystem. Signed-off-by: GeLiXin <ge.lixin@zte.com.cn> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4941	2016-08-11 11:23:49 -07:00
Brian Behlendorf	e5fe9ddeec	Remove custom root pool import code Non-Linux OpenZFS implementations require additional support to be used a root pool. This code should simply be removed to avoid confusion and improve readability. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Closes #4951	2016-08-11 11:19:34 -07:00
Brian Behlendorf	cf41432c70	Linux 4.8 compat: Fix removal of bio->bi_rw member All users of bio->bi_rw have been replaced with compatibility wrappers. This allows the kernel specific logic to be abstracted away, and for each of the supported cases to be documented with the wrapper. The updated interfaces are as follows: * void blk_queue_set_write_cache(struct request_queue , bool, bool) boolean_t bio_is_flush(struct bio ) boolean_t bio_is_fua(struct bio ) boolean_t bio_is_discard(struct bio ) boolean_t bio_is_secure_erase(struct bio ) VDEV_WRITE_FLUSH_FUA Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Closes #4951	2016-08-11 11:19:34 -07:00
Gvozden Neskovic	689f093ebc	Build user-space with different gcc optimization levels This fix resolves warnings reported during compiling of user-space libraries with different gcc optimization levels. Tested with gcc versions: 4.9.2 (Debian), and 6.1.1 (Fedora). The patch enables use of following opt levels: O0, O1, O2, O3, Og, Os, Ofast. List of warnings: [GCC 4.9.2 -Os] libzfs_sendrecv.c:3726:26: error: 'clp' may be used uninitialized in this function [-Werror=maybe-uninitialized] [GCC 4.9.2 -Og] fs_fletcher.c:323:26: error: 'idx' may be used uninitialized in this function [-Werror=maybe-uninitialized] dsl_dataset.c:1290:12: error: 'atp' may be used uninitialized in this function [-Werror=maybe-uninitialized] [GCC 4.9.2 -Ofast] u8_textprep.c:1310:9: error: 'tc[3ul]' may be used uninitialized in this function [-Werror=maybe-uninitialized] u8_textprep.c:177:23: error: 'u8t[0ul]' may be used uninitialized in this function [-Werror=maybe-uninitialized] dsl_dataset.c:2089:37: error: ‘hds’ may be used uninitialized in this function [-Werror=maybe-uninitialized] dsl_dataset.c:3216:2: error: ‘ds’ may be used uninitialized in this function [-Werror=maybe-uninitialized] dsl_dataset.c:1591:2: error: ‘ds’ may be used uninitialized in this function [-Werror=maybe-uninitialized] dsl_dataset.c:3341:2: error: ‘ds’ may be used uninitialized in this function [-Werror=maybe-uninitialized] vdev_raidz.c:1153:8: error: 'dcount[2]' may be used uninitialized in this function [-Werror=maybe-uninitialized] vdev_raidz.c:1167:17: error: 'dst[2]' may be used uninitialized in this function [-Werror=maybe-uninitialized] kernel.c:1005:2: error: ‘resid’ may be used uninitialized in this function [-Werror=maybe-uninitialized] libzfs_dataset.c:2826:8: error: ‘val’ may be used uninitialized in this function [-Werror=maybe-uninitialized] libzfs_dataset.c:3056:35: error: ‘val’ may be used uninitialized in this function [-Werror=maybe-uninitialized] libzfs_dataset.c:1584:13: error: ‘val’ may be used uninitialized in this function [-Werror=maybe-uninitialized] libzfs_dataset.c:3056:35: error: ‘val’ may be used uninitialized in this function [-Werror=maybe-uninitialized] libzfs_dataset.c:1792:66: error: ‘val’ may be used uninitialized in this function [-Werror=maybe-uninitialized] libzfs_dataset.c:3986:35: error: ‘val’ may be used uninitialized in this function [-Werror=maybe-uninitialized] [GCC 6.1.1] Resolved in PR #4907 Signed-off-by: Gvozden Neskovic <neskovic@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4937	2016-08-09 14:40:35 -07:00
Chunwei Chen	afb6c031e8	Linux 4.7 compat: fix zpl_get_acl returns invalid acl pointer Starting from Linux 4.7, get_acl will set acl cache pointer to temporary sentinel value before calling i_op->get_acl. Therefore we can't compare against ACL_NOT_CACHED and return. Since from Linux 3.14, get_acl already check the cache for us, so we disable this in zpl_get_acl. Linux 4.7 also does set_cached_acl for us so we disable it in zpl_get_acl. Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4944 Closes #4946	2016-08-09 10:03:04 -07:00
GeLiXin	88c4c7a067	Fix call zfs_get_name() with invalid parameter zfs_get_name() expects a parameter of type zfs_handle_t zhp , but gets an invalid parameter type of zfs_handle_t *zhp actually in libzfs_dataset_cmp(), which may trigger a coredump if called. libzfs_dataset_cmp() working normally so far, just because all the callers only give datasets of type ZFS_TYPE_FILESYSTEM to it, we compared their mountpoint and return, luckily. Signed-off-by: GeLiXin <ge.lixin@zte.com.cn> Signed-off-by: Tim Chase <tim@chase2k.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4919	2016-08-08 12:24:57 -07:00
Brian Behlendorf	4b908d3220	Linux 4.8 compat: posix_acl_valid() The posix_acl_valid() function has been updated to require a user namespace. Filesystem callers should normally provide the user_ns from the super block associcated with the ACL; the zpl_posix_acl_valid() wrapper has been added for this purpose. See https://github.com/torvalds/linux/commit/0d4d717f for complete details. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com> Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Closes #4922	2016-08-08 11:46:40 -07:00
Brian Behlendorf	e85a6396b0	Retire HAVE_CURRENT_UMASK and HAVE_POSIX_ACL_CACHING Remove ZFS_AC_KERNEL_CURRENT_UMASK and ZFS_AC_KERNEL_POSIX_ACL_CACHING configure checks, all supported kernel provide this functionality. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com> Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Closes #4922	2016-08-08 11:46:32 -07:00
Nikolay Borisov	64aefee1b8	Fix interaction between userns uid/gid and SA * When the uid/gid change is handled in zfs_setattr we want to actually adjust the user passed uid to a KUID and write that to disk. * In trace points use the i_uid member without doing translation, since it has already been performed. * Use kuid in zfs_aclset_common Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com> Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4928	2016-08-08 10:47:43 -07:00
Nikolay Borisov	938cfeb0f2	Linux 4.8 compat: new s_user_ns member of struct super_block Kernel 4.8 paved the way to enabling mounting a file system inside a non-init user namespace. To facilitate this a s_user_ns member was added holding the userns in which the filesystem's instance was mounted. This enables doing the uid/gid translation relative to this particular username space and not the default init_user_ns. Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com> Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4928	2016-08-08 10:47:22 -07:00
Gaurav Kumar	cf2731e65b	arc_meta_limit should be updated when arc_max is changed. When arc_max is increased, arc_meta_limit will not be updated to 3/4 of the new arc_c_max value. This was done originally to preserve any existing maximum value. This turned out to be counter intuitive to users and this fix changes that behavior. If zfs_arc_meta_limit is non-default, it will be picked up later in the ARC tuning function. Signed-off-by: Gaurav Kumar <gaurav.kumar@nutanix.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4893	2016-08-02 13:43:36 -07:00
Brian Behlendorf	f3c9cac143	Fix gcc -Warray-bounds check for dump_object() in zdb As of gcc 6.1.1 20160621 (Red Hat 6.1.1-3) an array bounds warnings is detected in the zdb the dump_object() function. The analysis is correct but difficult to interpret because this is implemented as a macro. Rework the ZDB_OT_NAME in to a function and remove the case detected by gcc which is a side effect of the DMU_OT_IS_VALID() macro. zdb.c: In function ‘dump_object’: zdb.c:1931:288: error: array subscript is outside array bounds [-Werror=array-bounds] Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Gvozden Neskovic <neskovic@gmail.com> Closes #4907	2016-08-02 13:14:47 -07:00
Brian Behlendorf	efe7978d89	Fix gcc self-comparison warning As of gcc 6.1.1 20160621 (Red Hat 6.1.1-3) a self-comparison is detected by gcc in metaslab_alloc(). Resolve the warning by passing a physical size of 0 to BP_SET_BIRTH() as it done by other callers. module/zfs/metaslab.c: In function ‘metaslab_alloc’: module/zfs/metaslab.c:2575:184: error: self-comparison always evaluates to true [-Werror=tautological-compare] Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Gvozden Neskovic <neskovic@gmail.com> Issue #4907	2016-08-02 13:14:18 -07:00
Chunwei Chen	5b1bc1a1d8	Set proper dependency for string replacement targets A lot of string replacement target don't have dependency or incorrect dependency. We setup proper dependency by pattern rules. Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4908	2016-08-02 10:28:29 -07:00
Brian Behlendorf	b64e02e580	Add `make lint` target Add a `make lint` target which maps to a cppcheck target. As with the shellcheck target it will only run when cppcheck is installed. This allows a `make lint` build check to be incrementally added to the automated testing for distribution which provide cppcheck. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4915	2016-08-02 09:25:43 -07:00
Tony Hutter	4eb0db42d3	Fix possible VDEV stats array overflow Fix a possible VDEV statistics array overflow when ZIOs with ZIO_PRIORITY_NOW complete. Signed-off-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #4883 Closes #4917	2016-08-02 08:45:24 -07:00
liaoyuxiangqin	e24e62a948	Fix memory leak in function add_config() Config of a hot spare or l2cache device will leak memory in function add_config(). At the start of this function, when dealing with a config which belongs to a hot spare not currently in use or a l2cache device the config should be freed. Signed-off-by: liaoyuxiangqin <guo.yong33@zte.com.cn> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4910	2016-08-01 12:49:03 -07:00
Gvozden Neskovic	df053d67a9	ztest: memory leaks reported by AddressSanitizer Leaks reported by using AddressSanitizer, GCC 6.1.0 Direct leak of 4097 byte(s) in 1 object(s) allocated from: #1 0x414f73 in process_options cmd/ztest/ztest.c:721 Direct leak of 5440 byte(s) in 17 object(s) allocated from: #1 0x41bfd5 in umem_alloc ../../lib/libspl/include/umem.h:88 #2 0x41bfd5 in ztest_zap_parallel cmd/ztest/ztest.c:4659 #3 0x4163a8 in ztest_execute cmd/ztest/ztest.c:5907 Signed-off-by: Gvozden Neskovic <neskovic@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4896	2016-07-29 15:34:12 -07:00
Gvozden Neskovic	78867a0a0a	libzfs_import.c: Uninitialized pointer read In zpool_find_import_scan: Reads an uninitialized pointer or its target Coverity #150966 Found by static analysis with CoverityScan 0.8.5 Signed-off-by: Gvozden Neskovic <neskovic@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4897	2016-07-29 15:34:12 -07:00
Colin Ian King	b264d9b3e5	libzfs: Fix missing va_end call on ENOSPC and EDQUOT cases The switch statement in function zfs_standard_error_fmt for the ENOSPC and EDQUOT cases returns immediately and unlike all other cases in the switch this does not perform the va_end call. Perform a break which ends up calling va_end rather than returning immediately. Found by static analysis with CoverityScan 0.8.5 Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4900	2016-07-29 15:34:12 -07:00
Nikolay Borisov	ba2fe6affb	Move assignment of i_blkbits field Currently i_blkbits is always set to SPA_MINBLOCKSHIFT every time zfs_inode_update_impl is called. Since this value never changes move its assignment to at inode creation time. Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4906	2016-07-29 15:34:12 -07:00
Nikolay Borisov	e334e828a6	Unify license of icp module with the rest of zfs The newly added icp module uses a hardcoded value of CDDL for the license, however in local development one might want to change that to something else in order to facilitate compiling against lock debugging enabled kernel. All modules of the zfs use the ZFS_META_LICNSE string which is replaced with the value held in the META file. One can modify the value in the META file once and then rerun the configure to have all modules' licenses changed. Change the icp module license string to be ZFS_META_LICENSE so that it falls under the same paradigm. Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4905	2016-07-29 15:34:12 -07:00
heary-cao	9f3d1407dc	Fix zfs_allow_log_destroy() NULL dereference In zfs_ioc_log_history() function the tsd_set() function is called with NULL which causes the zfs_allow_log_destroy() to be run. In this case the passed value will be NULL. This is normally entirely safe because strfree() maps directly to kfree() which may be passed a NULL. However, since alternate implementations of strfree() may not handle this gracefully add a check for NULL. Observed under an embedded Linux 2.6.32.41 kernel running the automated testing while running the ZFS Test Suite. Signed-off-by: caoxuewen <cao.xuewen@zte.com.cn> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4872	2016-07-29 15:34:12 -07:00
Chunwei Chen	3b86aeb295	Linux 4.8 compat: REQ_OP and bio_set_op_attrs() New REQ_OP_* definitions have been introduced to separate the WRITE, READ, and DISCARD operations from the flags. This included changing the encoding of bi_rw. It places REQ_OP_* in high order bits and other stuff in low order bits. This encoding is done through the new helper function bio_set_op_attrs. For complete details refer to: https://github.com/torvalds/linux/commit/f215082 https://github.com/torvalds/linux/commit/4e1b2d5 Signed-off-by: Tim Chase <tim@chase2k.com> Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4892 Closes #4899	2016-07-29 14:48:19 -07:00
Brian Behlendorf	76e5f6fe10	Linux 4.8 compat: REQ_PREFLUSH The REQ_FLUSH flag was renamed REQ_PREFLUSH to avoid confusion with REQ_OP_FLUSH. See https://github.com/torvalds/linux/commit/28a8f0d3 for complete details. Signed-off-by: Tim Chase <tim@chase2k.com> Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #4892 Issue #4899	2016-07-29 14:48:09 -07:00
Brian Behlendorf	bbb1b6cea7	Linux 4.8 compat: submit_bio() The rw argument has been removed from submit_bio/submit_bio_wait. Callers are now expected to set bio->bi_rw instead of passing it in. See https://github.com/torvalds/linux/commit/4e49ea4a for complete details. Signed-off-by: Tim Chase <tim@chase2k.com> Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #4892 Issue #4899	2016-07-29 14:48:00 -07:00
Richard Yao	f26b4b3c8a	txg visibility code should not execute under tc_open_lock The memory allocation and locking in `spa_txg_history_*()` can potentially block txg_hold_open for arbitrarily long periods of time. Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #4333	2016-07-27 14:11:13 -07:00
Brian Behlendorf	fcf64f45d9	Fix zdb crash with 4K-only devices Here's the problem - on 4K native devices in userland on Linux using O_DIRECT, buffers must be 4K aligned or I/O will fail with EINVAL, causing zdb (and others) to coredump. Since userland probably doesn't need optimized buffer caches, we just force 4K alignment on everything. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Gvozden Neskovic <neskovic@gmail.com> Closes #4479	2016-07-27 13:38:46 -07:00
Brian Behlendorf	a0cacb760a	Enable history test cases Updated test case history_001_pos.ksh so it can run in tree. The original test case assumed /usr/sbin/zfs and /usr/sbin/zpool were the only valid locations for these utilities. The same modification has already been made too history_common.kshlib. The only other failing test case was history_010_pos and that was the result of the ":linux" suffix not being appended when checking the long output in the test case. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4882	2016-07-27 13:38:46 -07:00
Colin Ian King	bf18fd89f9	void integer overflow on computation of refquota_slack DMU_MAX_ACCESS should be cast to a uint64_t otherwise the multiplication of DMU_MAX_ACCESS with spa_asize_inflation will be 32 bit and may lead to an overflow. Currently DMU_MAX_ACCESS is 64 * 1024 * 1024, so spa_asize_inflation being 64 or more will lead to an overflow. Found by static analysis with CoverityScan 0.8.5 CID 150942 (#1 of 1): Unintentional integer overflow (OVERFLOW_BEFORE_WIDEN) overflow_before_widen: Potentially overflowing expression 67108864 * spa_asize_inflation with type int (32 bits, signed) is evaluated using 32-bit arithmetic, and then used in a context that expects an expression of type uint64_t (64 bits, unsigned). Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4889	2016-07-27 13:38:46 -07:00
Brian Behlendorf	8a39abaafa	Multi-thread 'zpool import' for blkid Commit `519129f` added support to multi-thread 'zpool import' for the case where block devices are scanned for under /dev/. This commit generalizes that logic and applies it to the case where device names are acquired from libblkid. The zpool_find_import_scan() and zpool_find_import_blkid() functions create an AVL tree containing each device name. Each entry in this tree is dispatched to a taskq where the function zpool_open_func() validates the device by opening it and reading the label. This may result in additional entries being added to the tree and those device paths being verified. This is largely how the upstream OpenZFS code behaves but due to significant differences the non-Linux code has been dropped for readability. Additionally, this code makes use of taskqs and kmutexs which are normally not available to the command line tools. Special care has been taken to allow their use in the import functions. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Olaf Faaland <faaland1@llnl.gov> Closes #4794	2016-07-27 13:38:46 -07:00
Gvozden Neskovic	a64f903b06	Fixes for issues found with cppcheck tool The patch fixes small number of errors/false positives reported by `cppcheck`, static analysis tool for C/C++. cppcheck 1.72 $ cppcheck . --force --quiet [cmd/zfs/zfs_main.c:4444]: (error) Possible null pointer dereference: who_perm [cmd/zfs/zfs_main.c:4445]: (error) Possible null pointer dereference: who_perm [cmd/zfs/zfs_main.c:4446]: (error) Possible null pointer dereference: who_perm [cmd/zpool/zpool_iter.c:317]: (error) Uninitialized variable: nvroot [cmd/zpool/zpool_vdev.c:1526]: (error) Memory leak: child [lib/libefi/rdwr_efi.c:1118]: (error) Memory leak: efi_label [lib/libuutil/uu_misc.c:207]: (error) va_list 'args' was opened but not closed by va_end(). [lib/libzfs/libzfs_import.c:1554]: (error) Dangerous usage of 'diskname' (strncpy doesn't always null-terminate it). [lib/libzfs/libzfs_sendrecv.c:3279]: (error) Dereferencing 'cp' after it is deallocated / released [tests/zfs-tests/cmd/file_write/file_write.c:154]: (error) Possible null pointer dereference: operation [tests/zfs-tests/cmd/randfree_file/randfree_file.c:90]: (error) Memory leak: buf [cmd/zinject/zinject.c:1068]: (error) Uninitialized variable: dataset [module/icp/io/sha2_mod.c:698]: (error) Uninitialized variable: blocks_per_int64 Signed-off-by: Gvozden Neskovic <neskovic@gmail.com> Signed-off-by: Chunwei Chen <david.chen@osnexus.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1392	2016-07-27 13:31:22 -07:00
Tim Chase	25458cbef9	Limit the amount of dnode metadata in the ARC Metadata-intensive workloads can cause the ARC to become permanently filled with dnode_t objects as they're pinned by the VFS layer. Subsequent data-intensive workloads may only benefit from about 25% of the potential ARC (arc_c_max - arc_meta_limit). In order to help track metadata usage more precisely, the other_size metadata arcstat has replaced with dbuf_size, dnode_size and bonus_size. The new zfs_arc_dnode_limit tunable, which defaults to 10% of zfs_arc_meta_limit, defines the minimum number of bytes which is desirable to be consumed by dnodes. Attempts to evict non-metadata will trigger async prune tasks if the space used by dnodes exceeds this limit. The new zfs_arc_dnode_reduce_percent tunable specifies the amount by which the excess dnode space is attempted to be pruned as a percentage of the amount by which zfs_arc_dnode_limit is being exceeded. By default, it tries to unpin 10% of the dnodes. The problem of dnode metadata pinning was observed with the following testing procedure (in this example, zfs_arc_max is set to 4GiB): - Create a large number of small files until arc_meta_used exceeds arc_meta_limit (3GiB with default tuning) and arc_prune starts increasing. - Create a 3GiB file with dd. Observe arc_mata_used. It will still be around 3GiB. - Repeatedly read the 3GiB file and observe arc_meta_limit as before. It will continue to stay around 3GiB. With this modification, space for the 3GiB file is gradually made available as subsequent demands on the ARC are made. The previous behavior can be restored by setting zfs_arc_dnode_limit to the same value as the zfs_arc_meta_limit. Signed-off-by: Tim Chase <tim@chase2k.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #4345 Issue #4512 Issue #4773 Closes #4858	2016-07-25 15:26:38 -07:00
Tim Chase	e6603b7c1f	Fix sync behavior for disk vdevs Prior to `b39c22b`, which was first generally available in the 0.6.5 release as `b39c22b`, ZoL never actually submitted synchronous read or write requests to the Linux block layer. This means the vdev_disk_dio_is_sync() function had always returned false and, therefore, the completion in dio_request_t.dr_comp was never actually used. In `b39c22b`, synchronous ZIO operations were translated to synchronous BIO requests in vdev_disk_io_start(). The follow-on commits `5592404` and `aa159af` fixed several problems introduced by `b39c22b`. In particular, `5592404` introduced the new flag parameter "wait" to __vdev_disk_physio() but under ZoL, since vdev_disk_physio() is never actually used, the wait flag was always zero so the new code had no effect other than to cause a bug in the use of the dio_request_t.dr_comp which was fixed by `aa159af`. The original rationale for introducing synchronous operations in `b39c22b` was to hurry certains requests through the BIO layer which would have otherwise been subject to its unplug timer which would increase the latency. This behavior of the unplug timer, however, went away during the transition of the plug/unplug system between kernels 2.6.32 and 2.6.39. To handle the unplug timer behavior on 2.6.32-2.6.35 kernels the BIO_RW_UNPLUG flag is used as a hint to suppress the plugging behavior. For kernels 2.6.36-2.6.38, the REQ_UNPLUG macro will be available and ise used for the same purpose. Signed-off-by: Tim Chase <tim@chase2k.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #4858	2016-07-25 14:24:47 -07:00
Brian Behlendorf	273ff9b5cc	Fix uninitialized variable in avl_add() Silence the following warning when compiling with gcc 5.4.0. Specifically gcc (Ubuntu 5.4.0-6ubuntu1~16.04.1) 5.4.0 20160609. module/avl/avl.c: In function ‘avl_add’: module/avl/avl.c:647:2: warning: ‘where’ may be used uninitialized in this function [-Wmaybe-uninitialized] avl_insert(tree, new_node, where); Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2016-07-25 14:21:34 -07:00

... 2 3 4 5 6 ...

2442 Commits