mirror_zfs

mirror of https://git.proxmox.com/git/mirror_zfs.git synced 2025-10-26 18:05:04 +03:00

Author	SHA1	Message	Date
Turbo Fredriksson	2a34db1bdb	Base init scripts for SYSV systems * Based on the init scripts included with Debian GNU/Linux, then take code from the already existing ones, trying to merge them into one set of scripts that will work for 'everyone' for better maintainability. * Add configurable variables to control the workings of the init scripts: * ZFS_INITRD_PRE_MOUNTROOT_SLEEP Set a sleep time before we load the module (used primarily by initrd scripts to allow for slower media (such as USB devices etc) to be available before we load the zfs module). * ZFS_INITRD_POST_MODPROBE_SLEEP Set a timed sleep in the initrd after the zfs module is loaded. * ZFS_INITRD_ADDITIONAL_DATASETS To allow for mounting additional datasets in the initrd. Primarily used in initrd scripts to allow for when filesystems needed to boot (such as /usr, /opt, /var etc) aren't directly under the root dataset. * ZFS_POOL_EXCEPTIONS Exclude pools from being imported (in the initrd and/or init scripts). * ZFS_DKMS_ENABLE_DEBUG, ZFS_DKMS_ENABLE_DEBUG_DMU_TX, ZFS_DKMS_DISABLE_STRIP Set to control how dkms should build the dkms packages. * ZPOOL_IMPORT_PATH Set path(s) where "zpool import" should import pools from. This was previously the job of "USE_DISK_BY_ID" (which is still used for backwards compatibility) but was renamed to allow for better control of import path(s). * If old USE_DISK_BY_ID is set, but not new ZPOOL_IMPORT_PATH, then we set ZPOOL_IMPORT_PATH to sane defaults just to be on the safe side. * ZED_ARGS To allow for local options to zed without having to change the init script. * The import function, do_import(), imports pools by name instead of '-a' for better control of pools to import and from where. * If USE_DISK_BY_ID is set (for backwards compatibility), but isn't 'yes' then ignore it. * If pool(s) aren't found with a simple "zpool import" (seen it happen), try looking for them in /dev/disk/by-id (if it exists). 
Any duplicates (pools found with both commands) are filtered out. * If we have found extra pool(s) this way, we must force USE_DISK_BY_ID so that the first, simple "zpool import $pool" is able to find it. * Fallback on importing the pool using the cache file (if it exists) only if 'simple' import (either with ZPOOL_IMPORT_PATH or the 'built in' defaults) didn't work. * The export function, do_export(), will export all pools imported, EXCEPT the root pool (if there is one). * ZED script from the Debian GNU/Linux packages added. * Refreshed ZED init script from behlendorf@5e7a660 to be portable so it may be used on both LSB and Redhat style systems. * If there are no pool(s) imported and zed successfully shut down, we will unload the zfs modules. * The function library file for the ZoL init script is installed as /etc/init.d/zfs-functions. * The four init scripts, the /etc/{defaults,sysconfig,conf.d}/zfs config file as well as the common function library are tagged as '%config(noreplace)' in the rpm rules file to make sure they are not replaced automatically if locally modified. * Pitfalls and workarounds: * If we're running from init, remove stale /etc/dfs/sharetab before importing pools in the zfs-import init script. * On Debian GNU/Linux, there's a 'sendsigs' script that will kill basically everything quite early in the shutdown phase and zed is/should be stopped much later than that. We don't want zed to be among the ones killed, so add the zed pid to list of pids for 'sendsigs' to ignore. * CentOS uses echo_success() and echo_failure() to print out status of command. These in turn use "echo -n \0xx[etc]" to move cursor and choose colour etc. This doesn't work with the modified IFS variable we need to use in zfs-import for some reason, so work around that when we define zfs_log_{end,failure}_msg() for RedHat and derivative distributions. * All scripts pass ShellCheck (with one false positive in do_mount()). 
Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed by: Richard Yao <ryao@gentoo.org> Reviewed by: Chris Dunlap <cdunlap@llnl.gov> Closes #2974 Closes #2107	2015-05-28 14:14:53 -07:00
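The USE_DISK_BY_ID fallback and the ZFS_POOL_EXCEPTIONS filter described in the commit above can be sketched roughly as follows. This is an illustration only, not the shipped init script: the function names, the default path list, and the assumption that ZFS_POOL_EXCEPTIONS is a space-separated list are all hypothetical; only the variable names come from the commit message.

```shell
#!/bin/sh
# Sketch only: function names and default paths are hypothetical.

# Derive ZPOOL_IMPORT_PATH from the legacy USE_DISK_BY_ID setting.
# USE_DISK_BY_ID is honoured only when it is exactly 'yes' and
# ZPOOL_IMPORT_PATH has not been set explicitly.
resolve_import_path() {
    if [ -z "${ZPOOL_IMPORT_PATH:-}" ] && [ "${USE_DISK_BY_ID:-}" = "yes" ]; then
        # Assumed default; the real scripts may use a different list.
        ZPOOL_IMPORT_PATH="/dev/disk/by-vdev:/dev/disk/by-id"
    fi
    printf '%s\n' "${ZPOOL_IMPORT_PATH:-}"
}

# Succeed when pool name $1 appears in ZFS_POOL_EXCEPTIONS
# (assumed here to be a space-separated list).
pool_is_excepted() {
    for excepted in ${ZFS_POOL_EXCEPTIONS:-}; do
        [ "$excepted" = "$1" ] && return 0
    done
    return 1
}

# Print, one per line, the pools from the argument list that
# are not excluded and would therefore be imported by name.
filter_pools() {
    for pool in "$@"; do
        pool_is_excepted "$pool" || printf '%s\n' "$pool"
    done
}
```

With ZFS_POOL_EXCEPTIONS="scratch", calling `filter_pools tank scratch` would print only `tank`, which a caller could then pass to "zpool import" one pool at a time.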
Turbo Fredriksson	01fcbec52d	The mount helper mount.zfs MUST be in /sbin (not '$sbindir'). Commit `60e9f69` added the --with-mounthelperdir option for Gentoo and in the process accidentally modified the default installation location. For security reasons mount(8) expects it to only be installed under /sbin. Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #3426	2015-05-18 16:54:36 -07:00
Tim Chase	e48533383b	Linux 2.6.36 compat, use REQ_FAILFAST_MASK and remove pre-2.6.36 support Commit `f4af6bb783` added support for REQ_FAILFAST_MASK, but the new autoconf test didn't use the same preprocessor macro name as the code did. The effect is that FAILFAST mode has not been enabled for ZoL in any post-2.6.35 kernel. Retire the HAVE_BIO_RW_FAILFAST interface used in pre-2.6.28 kernels. Raise an error condition if the FAILFAST interface can't be detected. Signed-off-by: Tim Chase <tim@onlight.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #3386	2015-05-11 15:07:00 -07:00
Brian Behlendorf	fade6b00b6	Add RHEL style kmod packages Provide a Redhat specific spl-kmod.spec file which uses the old style kmods (not kmods2) packaging. By using the provided kmodtool script packages can be built which support weak modules. This allows for the kernel to be updated without having to rebuild the SPL kernel modules. Packages for RHEL/Centos/SL/TOSS which use this spec file can be built as follows: $ ./configure --with-spec=redhat $ make rpms Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2015-03-27 14:42:04 -07:00
Brian Behlendorf	ee2ca1db28	Add RHEL style kmod packages Provide a Redhat specific zfs-kmod.spec file which uses the old style kmods (not kmods2) packaging. By using the provided kmodtool script packages can be built which support weak modules. This allows for the kernel to be updated without having to rebuild the ZFS kernel modules. Packages for RHEL/Centos/SL/TOSS which use this spec file can be built as follows: $ ./configure --with-spec=redhat $ make rpms Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2015-03-27 14:41:48 -07:00
Brian Behlendorf	d820d2e9cf	Remove rpm/fedora directory Originally it was thought that custom spec files might be required for Fedora. Happily that has turned out not to be the case. Since this directory just contains symlinks to the generic spec files it can be removed. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2015-03-27 14:30:58 -07:00
Brian Behlendorf	72998c2c9d	Remove rpm/fedora directory Originally it was thought that custom spec files might be required for Fedora. Happily that has turned out not to be the case. Since this directory just contains symlinks to the generic spec files it can be removed. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2015-03-27 14:22:38 -07:00
Tim Chase	abb642b9a9	Set HAVE_FS_STRUCT_SPINLOCK correctly when CONFIG_FRAME_WARN==1024 If kernel lock debugging is enabled, the fs_struct structure exceeds the typical 1024 byte limit of CONFIG_FRAME_WARN and isn't enabled when it otherwise should be. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tim Chase <tim@chase2k.com> Closes #440	2015-03-24 13:25:25 -07:00
Bill McGonigle	e023409500	Linux 4.0 compat: bdi_setup_and_register() __must_check Explicitly disable the unused variable warnings by setting __attribute__((unused)) for bdi_setup_and_register(). This is required because the function is defined with the __must_check attribute. Signed-off-by: Bill McGonigle <bill-github.com-public1@bfccomputing.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #3141	2015-03-16 10:56:26 -07:00
Brian Behlendorf	8c45def24a	Linux 4.0 compat: bdi_setup_and_register() The 'capabilities' argument which was passed to bdi_setup_and_register() has been removed. File systems should no longer pass BDI_CAP_MAP_COPY. For our purposes this means there are now three different interfaces which must be handled. A zpl_bdi_setup_and_register() wrapper function has been introduced to provide a single interface to the ZPL code. * 2.6.32 - 2.6.33, bdi_setup_and_register() is not exported. * 2.6.34 - 3.19, bdi_setup_and_register() takes 3 arguments. * 4.0 - x.y, bdi_setup_and_register() takes 2 arguments. I've also taken this opportunity to remove HAVE_BDI because kernels older than 2.6.32 are no longer supported. All kernels newer than this will have one of the above interfaces. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Chunwei Chen <tuxoko@gmail.com> Closes #3128	2015-03-03 10:49:45 -08:00
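The three kernel ranges listed in the commit above can be restated as a small lookup. Note that this is only a reading aid: the build presumably selects the interface with configure-time checks rather than by comparing version strings, and the function name and output labels below are hypothetical.

```shell
#!/bin/sh
# Sketch only: classifies a kernel version string into the three
# bdi_setup_and_register() ranges named in the commit message.
# The function name and the printed labels are hypothetical.
bdi_interface_for() {
    version=$1
    major=${version%%.*}
    rest=${version#*.}
    minor=${rest%%.*}
    minor=${minor%%[!0-9]*}      # drop suffixes such as "-rc1"
    patch=${rest#*.}
    patch=${patch%%[!0-9]*}      # drop suffixes such as "-573.el6"

    if [ "$major" -ge 4 ]; then
        echo "2-args"            # 4.0 and newer
    elif [ "$major" -eq 3 ]; then
        echo "3-args"            # 3.0 through 3.19
    elif [ "$major" -eq 2 ] && [ "${minor:-0}" -eq 6 ] && [ "${patch:-0}" -ge 34 ]; then
        echo "3-args"            # 2.6.34 and later 2.6.x
    elif [ "$major" -eq 2 ] && [ "${minor:-0}" -eq 6 ] && [ "${patch:-0}" -ge 32 ]; then
        echo "not-exported"      # 2.6.32 - 2.6.33
    else
        echo "unsupported"       # older than 2.6.32
    fi
}
```

For example, `bdi_interface_for "$(uname -r)"` would report which of the three cases a running kernel falls into.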
Brian Behlendorf	5f920fbee1	Retire MUTEX_OWNER checks To minimize the size of a kmutex_t a MUTEX_OWNER check was added. It allowed the kmutex_t wrapper to leverage the mutex owner which was already stored in the mutex for certain kernel configurations. The upside to this was that it reduced the size of the kmutex_t wrapper structure by the size of a task_struct pointer (4/8 bytes). The downside was that two mutex implementations needed to be maintained. Depending on your exact kernel configuration the correct one would be selected. Over the years this solution worked but it could be fragile since it depended heavily on assumed kernel mutex implementation details. For example the SPL_AC_MUTEX_OWNER_TASK_STRUCT configure check needed to be added when the kernel changed how the owner was stored. It also made the code more complicated than it needed to be. Therefore, in the name of simplicity and portability this optimization is being retired. It will slightly increase the memory requirements for a kmutex_t but only very slightly. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tim Chase <tim@chase2k.com> Issue #435	2015-03-03 10:13:33 -08:00
Jörg Thalheim	534759fad3	Linux 3.19 compat: file_inode was added The struct access f->f_dentry->d_inode was replaced by the accessor function file_inode(f). Signed-off-by: Joerg Thalheim <joerg@higgsboson.tk> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #3084	2015-02-10 11:24:51 -08:00
Ned Bass	4e30e68caf	Don't use AC_LANG_SOURCE for conftest.h source Using AC_LANG_SOURCE with some versions of autoconf is problematic if the given source is to be written to a header file. Such versions assume the contents are to be written to conftest.c and generate shell code to that effect. The contents of the test program to detect support for Linux tracepoints were consequently malformed (containing the source for conftest.h) so the build system incorrectly disabled tracepoints support. Fix this in ZFS_LINUX_TRY_COMPILE_HEADER by passing the header source directly to ZFS_LINUX_COMPILE_IFELSE. Signed-off-by: Ned Bass <bass6@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #2953	2015-01-06 16:53:30 -08:00
Brian Behlendorf	8d9a23e82c	Retire legacy debugging infrastructure When the SPL was originally written Linux tracepoints were still in their infancy. Therefore, an entire debugging subsystem was added to facilitate tracing which served us well for many years. Now that Linux tracepoints have matured they provide all the functionality of the previous tracing subsystem. Rather than maintain parallel functionality it makes sense to fully adopt tracepoints. Therefore, this patch retires the legacy debugging infrastructure. See zfsonlinux/zfs@bc9f413 for the tracepoint changes. Signed-off-by: Ned Bass <bass6@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #408	2014-11-19 10:35:07 -08:00
Prakash Surya	0b39b9f96f	Swap DTRACE_PROBE* with Linux tracepoints This patch leverages Linux tracepoints from within the ZFS on Linux code base. It also refactors the debug code to bring it back in sync with Illumos. The information exported via tracepoints can be used for a variety of reasons (e.g. debugging, tuning, general exploration/understanding, etc). It is advantageous to use Linux tracepoints as the mechanism to export this kind of information (as opposed to something else) for a number of reasons: * A number of external tools can make use of our tracepoints "automatically" (e.g. perf, systemtap) * Tracepoints are designed to be extremely cheap when disabled * It's one of the "accepted" ways to export this kind of information; many other kernel subsystems use tracepoints too. Unfortunately, though, there are a few caveats as well: * Linux tracepoints appear to only be available to GPL licensed modules due to the way certain kernel functions are exported. Thus, to actually make use of the tracepoints introduced by this patch, one might have to patch and re-compile the kernel; exporting the necessary functions to non-GPL modules. * Prior to upstream kernel version v3.14-rc6-30-g66cc69e, Linux tracepoints are not available for unsigned kernel modules (tracepoints will get disabled due to the module's 'F' taint). Thus, one either has to sign the zfs kernel module prior to loading it, or use a kernel versioned v3.14-rc6-30-g66cc69e or newer. 
Assuming the above two requirements are satisfied, let's look at an example of how this patch can be used and what information it exposes (all commands run as 'root'): # list all zfs tracepoints available $ ls /sys/kernel/debug/tracing/events/zfs enable filter zfs_arc__delete zfs_arc__evict zfs_arc__hit zfs_arc__miss zfs_l2arc__evict zfs_l2arc__hit zfs_l2arc__iodone zfs_l2arc__miss zfs_l2arc__read zfs_l2arc__write zfs_new_state__mfu zfs_new_state__mru # enable all zfs tracepoints, clear the tracepoint ring buffer $ echo 1 > /sys/kernel/debug/tracing/events/zfs/enable $ echo 0 > /sys/kernel/debug/tracing/trace # import zpool called 'tank', inspect tracepoint data (each line was # truncated, they're too long for a commit message otherwise) $ zpool import tank $ cat /sys/kernel/debug/tracing/trace | head -n35 # tracer: nop # # entries-in-buffer/entries-written: 1219/1219 #P:8 # # _-----=> irqs-off # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / delay # TASK-PID CPU# |||| TIMESTAMP FUNCTION # | | | |||| | | lt-zpool-30132 [003] .... 91344.200050: zfs_arc__miss: hdr... z_rd_int/0-30156 [003] .... 91344.200611: zfs_new_state__mru... lt-zpool-30132 [003] .... 91344.201173: zfs_arc__miss: hdr... z_rd_int/1-30157 [003] .... 91344.201756: zfs_new_state__mru... lt-zpool-30132 [003] .... 91344.201795: zfs_arc__miss: hdr... z_rd_int/2-30158 [003] .... 91344.202099: zfs_new_state__mru... lt-zpool-30132 [003] .... 91344.202126: zfs_arc__hit: hdr ... lt-zpool-30132 [003] .... 91344.202130: zfs_arc__hit: hdr ... lt-zpool-30132 [003] .... 91344.202134: zfs_arc__hit: hdr ... lt-zpool-30132 [003] .... 91344.202146: zfs_arc__miss: hdr... z_rd_int/3-30159 [003] .... 91344.202457: zfs_new_state__mru... lt-zpool-30132 [003] .... 91344.202484: zfs_arc__miss: hdr... z_rd_int/4-30160 [003] .... 91344.202866: zfs_new_state__mru... lt-zpool-30132 [003] .... 91344.202891: zfs_arc__hit: hdr ... 
lt-zpool-30132 [001] .... 91344.203034: zfs_arc__miss: hdr... z_rd_iss/1-30149 [001] .... 91344.203749: zfs_new_state__mru... lt-zpool-30132 [001] .... 91344.203789: zfs_arc__hit: hdr ... lt-zpool-30132 [001] .... 91344.203878: zfs_arc__miss: hdr... z_rd_iss/3-30151 [001] .... 91344.204315: zfs_new_state__mru... lt-zpool-30132 [001] .... 91344.204332: zfs_arc__hit: hdr ... lt-zpool-30132 [001] .... 91344.204337: zfs_arc__hit: hdr ... lt-zpool-30132 [001] .... 91344.204352: zfs_arc__hit: hdr ... lt-zpool-30132 [001] .... 91344.204356: zfs_arc__hit: hdr ... lt-zpool-30132 [001] .... 91344.204360: zfs_arc__hit: hdr ... To highlight the kind of detailed information that is being exported using this infrastructure, I've taken the first tracepoint line from the output above and reformatted it such that it fits in 80 columns: lt-zpool-30132 [003] .... 91344.200050: zfs_arc__miss: hdr { dva 0x1:0x40082 birth 15491 cksum0 0x163edbff3a flags 0x640 datacnt 1 type 1 size 2048 spa 3133524293419867460 state_type 0 access 0 mru_hits 0 mru_ghost_hits 0 mfu_hits 0 mfu_ghost_hits 0 l2_hits 0 refcount 1 } bp { dva0 0x1:0x40082 dva1 0x1:0x3000e5 dva2 0x1:0x5a006e cksum 0x163edbff3a:0x75af30b3dd6:0x1499263ff5f2b:0x288bd118815e00 lsize 2048 } zb { objset 0 object 0 level -1 blkid 0 } For the specific tracepoint shown here, 'zfs_arc__miss', data is exported detailing the arc_buf_hdr_t (hdr), blkptr_t (bp), and zbookmark_t (zb) that caused the ARC miss (down to the exact DVA!). This kind of precise and detailed information can be extremely valuable when trying to answer certain kinds of questions. For anybody unfamiliar but looking to build on this, I found the XFS source code along with the following three web links to be extremely helpful: * http://lwn.net/Articles/379903/ * http://lwn.net/Articles/381064/ * http://lwn.net/Articles/383362/ I should also note the more "boring" aspects of this patch: * The ZFS_LINUX_COMPILE_IFELSE autoconf macro was modified to support a sixth parameter. 
This parameter is used to populate the contents of the new conftest.h file. If no sixth parameter is provided, conftest.h will be empty. * The ZFS_LINUX_TRY_COMPILE_HEADER autoconf macro was introduced. This macro is nearly identical to the ZFS_LINUX_TRY_COMPILE macro, except it has support for a fifth option that is then passed as the sixth parameter to ZFS_LINUX_COMPILE_IFELSE. These autoconf changes were needed to test the availability of the Linux tracepoint macros. Due to the odd nature of the Linux tracepoint macro API, a separate ".h" must be created (the path and filename is used internally by the kernel's define_trace.h file). * The HAVE_DECLARE_EVENT_CLASS autoconf macro was introduced. This is to determine if we can safely enable the Linux tracepoint functionality. We need to selectively disable the tracepoint code due to the kernel exporting certain functions as GPL only. Without this check, the build process will fail at link time. In addition, the SET_ERROR macro was modified into a tracepoint as well. To do this, the 'sdt.h' file was moved into the 'include/sys' directory and now contains a userspace portion and a kernel space portion. The dprintf and zfs_dbgmsg* interfaces are now implemented as tracepoints as well. Signed-off-by: Prakash Surya <surya1@llnl.gov> Signed-off-by: Ned Bass <bass6@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2014-11-17 11:13:55 -08:00
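As a small follow-on to the trace output shown in the commit above, the ARC hit and miss events can be tallied with a filter like the one below. This is a sketch under stated assumptions: the function name is hypothetical, and it relies only on the event names (zfs_arc__hit, zfs_arc__miss) and the line layout visible in that sample output.

```shell
#!/bin/sh
# Sketch only: counts ARC hit and miss tracepoint events in ftrace
# text read from stdin. The function name is hypothetical.
arc_hit_miss_summary() {
    awk '
        # The leading space keeps zfs_l2arc__hit/miss from matching.
        / zfs_arc__hit:/  { hits++ }
        / zfs_arc__miss:/ { misses++ }
        END { printf "hits=%d misses=%d\n", hits, misses }
    '
}
```

A possible invocation, run as root with the tracepoints enabled, would be `arc_hit_miss_summary < /sys/kernel/debug/tracing/trace`.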
Marcel Wysocki	7f118e836e	Add config/compile to config/.gitignore This file may be added by automake and therefore should be added to config/.gitignore. For the full list of possible auxiliary programs see the full automake documentation. http://www.gnu.org/software/automake/manual/automake.html#Auxiliary-Programs Signed-off-by: Marcel Wysocki <maci.stgn@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2014-10-31 16:26:44 -07:00
Marcel Wysocki	11662bf969	Add config/compile to config/.gitignore This file may be added by automake and therefore should be added to config/.gitignore. For the full list of possible auxiliary programs see the full automake documentation. http://www.gnu.org/software/automake/manual/automake.html#Auxiliary-Programs Signed-off-by: Marcel Wysocki <maci.stgn@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #2848	2014-10-31 16:25:34 -07:00
Richard Yao	b76707027c	Make systemd-modules-load.service file directory configurable Installing outside of the prefix is not permissible under Gentoo Prefix. The package manager will cause the installation process to fail if/when it sees this. We could handle this by disabling systemd support on prefix because systemd does not check these paths, but the Gentoo Council decided that small files such as these should be installed. That means disabling systemd support on prefix is not an acceptable workaround. As a consequence, we need some way of controlling the directory into which these files are installed. Making this configurable increases our compliance with the freedesktop.org specification, which allows these files to be installed into /etc/modules-load.d: http://www.freedesktop.org/software/systemd/man/modules-load.d.html Signed-off-by: Richard Yao <richard.yao@clusterhq.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #2641	2014-10-28 09:41:14 -07:00
Richard Yao	60e9f69c97	Make directory into which mount.zfs is installed configurable Installing outside of the prefix is not permissible under Gentoo Prefix. The package manager will cause the installation process to fail if/when it sees this. I could script a workaround inside the ebuild, but it seemed to make more sense to make this more configurable. Signed-off-by: Richard Yao <richard.yao@clusterhq.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #2641	2014-10-28 09:40:59 -07:00
Richard Yao	d8d7826721	Search /usr/local/src for SPL Object Directory Since we changed the default location for the kernel headers to respect --prefix in the SPL, we must search that location to prevent user builds from breaking. Signed-off-by: Richard Yao <richard.yao@clusterhq.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #2641	2014-10-28 09:37:23 -07:00
Brian Behlendorf	dcf91382b9	Remove vfs_fsync() wrapper The vfs_fsync() function has been available since Linux 2.6.29. There is no longer a need to maintain this compatibility code. However, the HAVE_2ARGS_VFS_FSYNC check was left in place since that change occurred after 2.6.32. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2014-10-17 15:11:52 -07:00
Brian Behlendorf	599662c538	Remove kern_path() wrapper The kern_path() function has been available since Linux 2.6.28. There is no longer a need to maintain this compatibility code. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2014-10-17 15:11:52 -07:00
Brian Behlendorf	3d5392cefa	Remove kvasprintf() wrapper The kvasprintf() function has been available since Linux 2.6.22. There is no longer a need to maintain this compatibility code. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2014-10-17 15:11:52 -07:00
Brian Behlendorf	0fac9c9e6d	Remove proc_handler() wrapper As of Linux 2.6.32 the proc handlers were updated to expect only five arguments. Therefore there is no longer a need to maintain this compatibility code and this infrastructure can be simplified. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2014-10-17 15:11:52 -07:00
Brian Behlendorf	e03119e86f	Update put_task_struct() comments Update the comments to correctly reflect when this interface was added. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2014-10-17 15:11:51 -07:00
Brian Behlendorf	68a829b29d	Remove credential configure checks. The groups_search() function was never exported by a mainline kernel therefore we drop this compatibility code and always provide our own implementation. Additionally, the cred_t structure has been available since 2.6.29 so there is no longer a need to maintain compatibility code. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2014-10-17 15:11:51 -07:00
Brian Behlendorf	e39174ed56	Add vfs_unlink() and vfs_rename() comments Just for consistency with the other autoconf checks a small comment block was added before these checks. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2014-10-17 15:11:51 -07:00
Brian Behlendorf	137af025f6	Remove set_fs_pwd() configure check This function has never been exported by any mainline kernel and was only briefly available under RHEL5. Therefore this check is being removed and the code updated to always use the wrapper function. The next step will be to eliminate all this code. If ZFS were updated not to assume that its pwd was / there would be no need for this. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2014-10-17 15:11:51 -07:00
Brian Behlendorf	3c49a16989	Remove user_path_dir() wrapper The user_path_dir() function has been available since Linux 2.6.27. There is no longer a need to maintain this compatibility code. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2014-10-17 15:11:51 -07:00
Brian Behlendorf	44778f4110	Remove kallsyms_lookup_name() wrapper After the removal of get_vmalloc_info(), the unused global memory variables, and the optional dcache/icache shrinkers there is no longer a need for the kallsyms compatibility code. This allows us to eliminate another brittle area of the code by removing the kernel upcall this functionality depended on for older kernels. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2014-10-17 15:11:51 -07:00
Brian Behlendorf	89a461e70c	Remove shrink_{i,d}node_cache() wrappers This is optional functionality which may or may not be useful to ZFS when using older kernels. It is never a hard requirement. Therefore this functionality is being removed from the SPL and a simpler slimmed down version will be added to ZFS. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2014-10-17 15:11:51 -07:00
Brian Behlendorf	8bbbe46f86	Remove global memory variables Platforms such as Illumos and FreeBSD have historically provided global variables which summarize the memory state of a system. Linux on the other hand doesn't expose any of this information to kernel modules and uses entirely different mechanisms for memory management. In order to simplify the original ZFS port to Linux these global variables were emulated by the SPL for the benefit of ZFS. As ZoL has matured over the years it has moved steadily away from these interfaces and now no longer depends on them at all. Therefore, this patch completely removes the global variables availrmem, minfree, desfree, lotsfree, needfree, swapfs_minfree, and swapfs_reserve. This greatly simplifies the memory management code and eliminates a common area of confusion. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2014-10-17 15:11:51 -07:00
Brian Behlendorf	e1310afae3	Remove get_vmalloc_info() wrapper The get_vmalloc_info() function was used to back the vmem_size() function. This was always problematic and resulted in brittle code because the kernel never provided a clean interface for modules. However, it turns out that the only caller of this function in ZFS uses it to determine the total virtual address space size. This can be determined easily without get_vmalloc_info() so vmem_size() has been updated to take this approach which allows us to shed the get_vmalloc_info() dependency. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2014-10-17 15:11:51 -07:00
Brian Behlendorf	50e41ab1e1	Remove on_each_cpu() wrapper The on_each_cpu() function has been available since Linux 2.6.27. There is no longer a need to maintain this compatibility code. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2014-10-17 15:11:51 -07:00
Brian Behlendorf	b652d169b0	Remove mutex_lock_nested() wrapper The mutex_lock_nested() function has been available since Linux 2.6.18. There is no longer a need to maintain this compatibility code. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2014-10-17 15:11:51 -07:00
Brian Behlendorf	2bc5666f53	Remove i_mutex() configure check The inode structure has used i_mutex as its internal locking primitive since 2.6.16. The compatibility code to check for the previous semaphore primitive has been removed. However, the wrapper function itself is being kept because it's entirely possible this primitive will change again to allow finer grained locking. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2014-10-17 15:11:51 -07:00
Brian Behlendorf	9f36cace41	Remove kmalloc_node() compatibility code The kmalloc_node() function has been available since Linux 2.6.12. There is no longer a need to maintain this compatibility code. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2014-10-17 15:11:51 -07:00
Brian Behlendorf	d227e114ed	Remove linux/uaccess.h header check The uaccess header has been available in the same location since Linux 2.6.18. There is no longer a need to maintain this compatibility code. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2014-10-17 15:11:51 -07:00
Brian Behlendorf	e5b65e3179	Remove uintptr_t typedef The uintptr_t typedef has been available since Linux 2.6.24. There is no longer a need to maintain this compatibility code. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2014-10-17 15:11:50 -07:00
Brian Behlendorf	ff0582cb39	Remove atomic64_xchg() wrappers The atomic64_xchg() and atomic64_cmpxchg() functions have been available since Linux 2.6.24. There is no longer a need to maintain this compatibility code. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2014-10-17 15:11:50 -07:00
Brian Behlendorf	82f2f1a3af	Simplify the time compatibility wrappers Many of the time functions had grown overly complex in order to handle kernel compatibility issues. However, as of Linux 2.6.26 all the required functionality is available. This allows us to retire numerous configure checks and greatly simplify the time compatibility wrappers. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2014-10-17 15:11:50 -07:00
Brian Behlendorf	87f8055a91	Map highbit64() to fls64() The fls64() function has been available since Linux 2.6.16 and it should be used to implement highbit64(). This allows us to provide an optimized implementation and simplify the code. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2014-10-17 15:11:50 -07:00
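For readers unfamiliar with what highbit64() and fls64() compute, the following loop gives the same result for non-negative inputs: the 1-based position of the most significant set bit, or 0 when no bit is set. It is a plain reference loop for illustration, not the optimized kernel implementation the commit refers to, and the shell function simply borrows the highbit64 name.

```shell
#!/bin/sh
# Sketch only: reference semantics of "find last set" for
# non-negative integers, written as a simple shift loop.
highbit64() {
    value=$1
    bit=0
    while [ "$value" -gt 0 ]; do
        value=$((value >> 1))    # discard the lowest bit
        bit=$((bit + 1))         # count how many shifts were needed
    done
    echo "$bit"
}
```

For instance, `highbit64 256` reports the ninth bit, since 256 is 1 followed by eight zero bits.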
Brian Behlendorf	9c91800d19	Remove CTL_UNNUMBERED sysctl interface Support for the CTL_UNNUMBERED sysctl interface was removed in Linux 2.6.19. There is no longer any reason to maintain this compatibility code. There also isn't any reason to keep around the CTL_NAME macro and helpers so they have been retired. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2014-10-17 15:11:50 -07:00
Brian Behlendorf	b38bf6a4e3	Remove register_sysctl() compatibility code The register_sysctl() interface has been stable since Linux 2.6.21. There is no longer a need to maintain compatibility code. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2014-10-17 15:11:50 -07:00
Brian Behlendorf	bb4dee3df2	Remove utsname() wrapper There is no longer a need to wrap this because utsname() is provided by the kernel and can be called directly. This will require a small change in the ZFS code because utsname is expected to be a global structure and not a function. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2014-10-17 15:11:41 -07:00
Brian Behlendorf	a80d69caf0	Remove adaptive mutex implementation Since the Linux 2.6.29 kernel all mutexes have been adaptive mutexes. There is no longer any point in keeping this code so it is being removed to simplify the code. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2014-10-17 15:07:28 -07:00
Brian Behlendorf	3a92530563	Update code to use misc_register()/misc_deregister() When the SPL was originally written it was designed to use the device_create() and device_destroy() functions. Unfortunately, these functions changed considerably over the years making them difficult to rely on. As it turns out a better choice would have been to use the misc_register()/misc_deregister() functions. This interface for registering character devices has remained stable, is simple, and provides everything we need. Therefore the code has been reworked to use this interface. The higher level ZFS code has always depended on these same interfaces so this is also a step towards minimizing our kernel dependencies. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2014-10-17 15:07:28 -07:00
Brian Behlendorf	6203295438	Make license compatibility checks consistent Apply the license specified in the META file to ensure the compatibility checks are all performed consistently. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2014-10-17 15:07:28 -07:00
Brian Behlendorf	e33045ee98	Make license compatibility checks consistent Apply the license specified in the META file to ensure the compatibility checks are all performed consistently. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #2757	2014-10-17 14:58:38 -07:00
Brian Behlendorf	9ad656b2d0	Retire HAVE_IOCTL_* configure checks The HAVE_IOCTL_* configure checks were originally added for compatibility with an ancient version of glibc. This support and additional complexity is no longer needed and is therefore being removed. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Closes #585	2014-08-28 07:45:54 -07:00
Richard Yao	ec18fe3ce8	Cleanup vn_rename() and vn_remove() zfsonlinux/spl#bcb15891ab394e11615eee08bba1fd85ac32e158 implemented Linux 3.6+ support by adding duplicate vn_rename and vn_remove functions. The new ones were cleaner, but the duplicate functions made the codebase less maintainable. This adds some compatibility shims that allow us to retire the older vn_rename and vn_remove in favor of the new ones on old kernels. The result is a net 143 line reduction in lines of code and a cleaner codebase. Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #370	2014-08-13 16:25:44 -07:00
Ned Bass	2fc44f66ec	Linux 3.17 compat: remove wait_on_bit action function Linux kernel 3.17 removes the action function argument from wait_on_bit(). Add autoconf test and compatibility macro to support the new interface. The former "wait_on_bit" interface required an 'action' function to be provided which does the actual waiting. There were over 20 such functions in the kernel, many of them identical, though most cases can be satisfied by one of just two functions: one which uses io_schedule() and one which just uses schedule(). This API change was made to consolidate all of those redundant wait functions. References: torvalds/linux@7431620 Signed-off-by: Ned Bass <bass6@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #378	2014-08-11 14:17:00 -07:00
Brian Behlendorf	1139491da7	Revert "Disable GCCs aggressive loop optimization" This reverts commit `0f62f3f9ab`. Signed-off-by: Tim Chase <tim@chase2k.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #2010	2014-07-22 09:56:55 -07:00
Turbo Fredriksson	2ee4e7da90	Accept udev and dracut paths specified by ./configure There are two common locations where udev and dracut components are commonly installed. When building packages using the 'make rpm|deb' targets check those common locations and pass them to rpmbuild. For non-standard configurations these values can be provided by the following configure options: --with-udevdir=DIR install udev helpers [default=check] --with-udevruledir=DIR install udev rules [[UDEVDIR/rules.d]] --with-dracutdir=DIR install dracut helpers [default=check] When rebuilding using the source packages the per-distribution default values specified in the spec file will be used. This is the preferred way to build packages for a distribution but the ability to override the defaults is provided as a convenience. Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #2310 Closes #1680	2014-06-11 16:32:57 -07:00
Turbo Fredriksson	0f629346bb	Set LANG to a reasonable default (C) Set LANG=C before calling 'rpmbuild' to avoid rpmbuild failing on the translated date string in the changelog. Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes: zfsonlinux/spl#306	2014-06-10 16:46:21 -07:00
Turbo Fredriksson	1e929b97ac	Set LANG to a reasonable default (C) Set LANG=C before calling 'rpmbuild' to avoid rpmbuild failing on the translated date string in the changelog. Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #306	2014-06-10 14:50:11 -07:00
Turbo Fredriksson	69c7bdb6e7	Accept kernel source dir(s) specified by ./configure This adds ability to set the location of the kernel via defines when building from the spec files. This is useful when building against a kernel installed in a non-standard location. Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1874	2014-06-05 13:46:49 -07:00
Turbo Fredriksson	c9b5cc8c00	Move the libraries into separate packages From day one the various ZFS libraries should have been placed in their own sub-packages. Primarily this allows for multiple major versions of the libraries to be concurrently installed. It also facilitates a smaller build environment by minimizing the required dependencies. The specific changes required to split the libraries from the utilities are as follows: * libzpool2, libnvpair1, libuutil1, and libzfs2 packages were added and contain the versioned shared libraries. The Fedora packaging guidelines discourage providing static libraries so they are not included in the packages. http://fedoraproject.org/wiki/Packaging:Guidelines#Packaging_Static_Libraries * The zfs-devel package was renamed libzfs2-devel and the new package obsoletes the old zfs-devel package. This package includes all the required headers for the libzpool2, libnvpair1, libuutil1, and libzfs2 libraries and their respective unversioned shared libraries. This package should eventually be split into individual lib-devel packages but it will still take some work to cleanly separate them. Therefore the libzfs2-devel package provides the expected lib-devel packages so all the proper dependencies can still be created. http://fedoraproject.org/wiki/Packaging:Guidelines#Devel_Packages * Moved '/sbin/ldconfig' execution from the zfs package to each of the new library packages as described by the packaging guidelines. http://fedoraproject.org/wiki/Packaging:Guidelines#Shared_Libraries * The /usr/share/doc/ files were moved into the libzfs2-devel package. * Updated config/deb.am to be aware of the packaging changes. This ensures that 'deb-utils' make target converts all the resulting packages generated by the 'rpm-utils' target. Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes: #2329 Closes: #2341 Issue: #2145	2014-06-02 13:43:20 -07:00
Brian Behlendorf	79aada6105	Restrict release number to META version When creating packages in a git repository the release number can be automatically set by 'git describe'. This normally works well but if your repository has newer tags which match the form NAME-VERSION* the release may be incorrectly calculated. To prevent this the match pattern has been restricted to the contents of the META file, NAME-VERSION. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2014-05-29 19:31:50 -07:00
Brian Behlendorf	c4f38ddd80	Restrict release number to META version When creating packages in a git repository the release number can be automatically set by 'git describe'. This normally works well but if your repository has newer tags which match the form NAME-VERSION* the release may be incorrectly calculated. To prevent this the match pattern has been restricted to NAME-VERSION. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2014-05-29 19:08:03 -07:00
Brian Behlendorf	a073aeb060	Add KMC_SLAB cache type For small objects the Linux slab allocator has several advantages over its counterpart in the SPL. These include: 1) It is more memory-efficient and packs objects more tightly. 2) It is continually tuned to maximize performance. Therefore it makes sense to layer the SPL's slab allocator on top of the Linux slab allocator. This allows us to leverage the advantages above while preserving the Illumos semantics we depend on. However, there are some things we need to be careful of: 1) The Linux slab allocator was never designed to work well with large objects. Because the SPL slab must still handle this use case a cut off limit was added to transition from Linux slab backed objects to kmem or vmem backed slabs. spl_kmem_cache_slab_limit - Objects less than or equal to this size in bytes will be backed by the Linux slab. By default this value is zero which disables the Linux slab functionality. Reasonable values for this cut off limit are in the range of 4096-16386 bytes. spl_kmem_cache_kmem_limit - Objects less than or equal to this size in bytes will be backed by a kmem slab. Objects over this size will be vmem backed instead. This value defaults to 1/8 a page, or 512 bytes on an x86_64 architecture. 2) Be aware that using the Linux slab may inadvertently introduce new deadlocks. Care has been taken previously to ensure that all allocations which occur in the write path use GFP_NOIO. However, there may be internal allocations performed in the Linux slab which do not honor these flags. If this is the case a deadlock may occur. The path forward is definitely to start relying on the Linux slab. But for that to happen we need to start building confidence that there aren't any unexpected surprises lurking for us. And ideally we need to move completely away from using the SPL's slab for large memory allocations. This patch is a first step.
NOTES: 1) The KMC_NOMAGAZINE flag was leveraged to support the Linux slab backed caches but it is not supported for kmem/vmem backed caches. 2) Regardless of the spl_kmem_cache_*_limit settings a cache may be explicitly set to a given type by passing the KMC_KMEM, KMC_VMEM, or KMC_SLAB flags during cache creation. 3) The constructors, destructors, and reclaim callbacks are all functional and will be called regardless of the cache type. 4) KMC_SLAB caches will not appear in /proc/spl/kmem/slab due to the issues involved in presenting correct object accounting. Instead they will appear in /proc/slabinfo under the same names. 5) Several kmem SPLAT tests needed to be fixed because they relied incorrectly on internal kmem slab accounting. With the updated test cases all the SPLAT tests pass as expected. 6) An autoconf test was added to ensure that the __GFP_COMP flag was correctly added to the default flags used when allocating a slab. This is required to ensure all pages in higher order slabs are properly refcounted, see `ae16ed9`. 7) When using the SLUB allocator there is no need to attempt to set the __GFP_COMP flag. This has been the default behavior for the SLUB since Linux 2.6.25. 8) When using the SLUB it may be desirable to set the slub_nomerge kernel parameter to prevent caches from being merged. Original-patch-by: DHE <git@dehacked.net> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Prakash Surya <surya1@llnl.gov> Signed-off-by: Tim Chase <tim@chase2k.com> Signed-off-by: DHE <git@dehacked.net> Signed-off-by: Chunwei Chen <tuxoko@gmail.com> Closes #356	2014-05-22 10:28:01 -07:00
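The size cut-offs described in the KMC_SLAB commit above can be sketched as a small selection function. This is only an illustration of the stated rules, not the SPL code: the enum, the function name `pick_backing`, and its parameters are hypothetical, and the explicit KMC_KMEM/KMC_VMEM/KMC_SLAB override from note 2 is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the three backing types named in the commit. */
enum cache_backing { BACKING_LINUX_SLAB, BACKING_KMEM, BACKING_VMEM };

/*
 * Choose a backing store from the object size and the two limits.
 * A slab_limit of zero disables the Linux slab path, which the commit
 * message gives as the default.
 */
static enum cache_backing
pick_backing(size_t obj_size, size_t slab_limit, size_t kmem_limit)
{
	assert(obj_size > 0);

	if (slab_limit != 0 && obj_size <= slab_limit)
		return (BACKING_LINUX_SLAB);
	if (obj_size <= kmem_limit)
		return (BACKING_KMEM);
	return (BACKING_VMEM);
}
```

With the stated defaults (slab limit 0, kmem limit 512 on x86_64) a 256-byte object would be kmem backed, and anything over 512 bytes would be vmem backed.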
Chunwei Chen	ad3412efd7	Linux 3.15: vfs_rename() added a flags argument Detect the updated vfs_rename() interface and call it with an extra flags argument. References: torvalds/linux@520c8b1 Signed-off-by: Chunwei Chen <tuxoko@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #355	2014-05-07 13:38:17 -07:00
Richard Yao	3b4f425a5a	Refactor inode_owner_or_capable() autotools check We need inode_owner_or_capable() for ZFS file attributes in addition to xattrs, so it should go into its own file. This moves it into its own file and changes it to be more comprehensive. It will now fail if no known good API is detected. Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #1691	2014-05-01 10:06:49 -07:00
Chunwei Chen	b761912b34	Linux 3.14 compat: rq_for_each_segment in dmu_req_copy rq_for_each_segment changed from taking bio_vec * to taking bio_vec. We provide rq_for_each_segment4 which takes both. Signed-off-by: Chunwei Chen <tuxoko@gmail.com> Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #2124	2014-04-10 14:28:51 -07:00
Chunwei Chen	d4541210f3	Linux 3.14 compat: Immutable biovec changes in vdev_disk.c bi_sector, bi_size and bi_idx are moved from bio to bio->bi_iter. This patch creates BIO_BI_*(bio) macros to hide the differences. Signed-off-by: Chunwei Chen <tuxoko@gmail.com> Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #2124	2014-04-10 14:28:38 -07:00
Chunwei Chen	408ec0d2e1	Linux 3.14 compat: posix_acl_{create,chmod} posix_acl_{create,chmod} is changed to __posix_acl_{create,chmod} Signed-off-by: Chunwei Chen <tuxoko@gmail.com> Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #2124	2014-04-10 14:27:03 -07:00
Chris Dunlap	518eba1492	Replace check for _POSIX_MEMLOCK w/ HAVE_MLOCKALL zed supports a '-M' cmdline opt to lock all pages in memory via mlockall(). The _POSIX_MEMLOCK define is checked to determine whether this function is supported. The current test assumes mlockall() is supported if _POSIX_MEMLOCK is non-zero. However, this test is insufficient according to mlock(2) and sysconf(3). If _POSIX_MEMLOCK is -1, mlockall() is not supported; but if _POSIX_MEMLOCK is 0, availability must be checked at runtime. This commit adds an autoconf check for mlockall() to user.m4. The zed code block for mlockall() is now guarded with a test for HAVE_MLOCKALL. If defined, mlockall() will be called and its runtime availability checked via its return value. Signed-off-by: Chris Dunlap <cdunlap@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #2	2014-04-02 13:10:08 -07:00
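The guard described in the HAVE_MLOCKALL commit above might look roughly like the following. This is a sketch, not the zed source: the function name `lock_all_pages` is hypothetical, and HAVE_MLOCKALL is defined by hand here as a stand-in for what the autoconf check would place in the generated config header.

```c
#include <errno.h>
#include <sys/mman.h>

/* Stand-in for the define the autoconf check would generate (assumption). */
#ifndef HAVE_MLOCKALL
#define HAVE_MLOCKALL 1
#endif

/*
 * Try to lock all current and future pages in memory.
 * Returns 0 on success and -1 on failure with errno set, so the caller
 * learns at runtime whether locking actually worked.
 */
static int
lock_all_pages(void)
{
#ifdef HAVE_MLOCKALL
	if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
		return (-1);	/* e.g. ENOMEM, EPERM or ENOSYS */
	return (0);
#else
	errno = ENOSYS;		/* not available at build time */
	return (-1);
#endif
}
```

The compile-time test only decides whether the call can be built at all. The return value is what reports runtime availability.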
Chris Dunlap	07917db990	Add defs for makefile installation dir vars Add macro definitions to AM_CPPFLAGS to propagate makefile installation directory variables for libexecdir, runstatedir, sbindir, and sysconfdir. https://www.gnu.org/software/autoconf/manual/autoconf-2.69/html_node/Installation-Directory-Variables.html A corollary is that you should not use these variables except in makefiles. For instance, instead of trying to evaluate datadir in configure and hard-coding it in makefiles using e.g., 'AC_DEFINE_UNQUOTED([DATADIR], ["$datadir"], [Data directory.])', you should add -DDATADIR='$(datadir)' to your makefile's definition of CPPFLAGS (AM_CPPFLAGS if you are also using Automake). The runstatedir directory is for "installing data files which the programs modify while they run, that pertain to one specific machine, and which need not persist longer than the execution of the program". https://www.gnu.org/prep/standards/html_node/Directory-Variables.html It will be defined by autoconf 2.70 or later, and default to "$(localstatedir)/run". http://git.savannah.gnu.org/gitweb/?p=autoconf.git;a=commit;h=a197431414088a417b407b9b20583b2e8f7363bd Signed-off-by: Chris Dunlap <cdunlap@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #2	2014-03-31 16:11:13 -07:00
Richard Yao	1de1488fdc	Linux 3.13 compat: Handle __must_check bdi_setup_and_register torvalds/linux@8077c0d983 added a __must_check to the bdi_setup_and_register(), which caused our autotools check to break. zfsonlinux/zfs@729210564a was intended to correct that, but it depended on -Wno-unused-result, which is unrecognized in older GCC versions. That commit has been reverted in favor of a solution that does not require -Wno-unused-result. Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #2102 Closes #2135	2014-03-24 11:10:06 -07:00
Richard Yao	6b6b8d1041	Revert "Properly ignore bdi_setup_and_register return value" Older GCC versions do not obey -Wno-unused-result. This reverts commit `729210564a` in favor of a solution that does not require -Wno-unused-result. Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #1906	2014-03-24 11:08:55 -07:00
Chunwei Chen	a15dac42df	config: compile test rather than run test When testing compiler flags, we only need to do compile test. Otherwise, configure will fail with "configure: error: cannot run test program while cross compiling" when cross compiling. Signed-off-by: Chunwei Chen <tuxoko@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #2191	2014-03-20 12:11:57 -07:00
Ralf Ertzinger	881f45c6a8	Add systemd unit files for ZFS startup This adds systemd unit files replacing the functionality offered by the SysV init script found in etc/init.d. It has been developed and tested on Fedora 19, Fedora 20 and openSuSE 13.1. Four unit files and one target are offered. zfs-import-cache.service: Import pools from /etc/zfs/zpool.cache. This unit will wait for udev to settle. zfs-import-scan.service: Import pools by scanning /dev/disk/by-id for zvols. This unit will only run if /etc/zfs/zpool.cache is not present. This unit will wait for udev to settle. zfs-mount.service: Mount ZFS native filesystems. It contains a dependency to be loaded before local-fs.target. zfs-share.service: Share NFS/SMB filesystems. This unit contains a dependency that will cause it to be restarted whenever the smb or nfs-server unit is restarted, restoring the shares added. zfs.target: This target pulls in the other units in order to start ZFS. It's the only unit that can be enabled/disabled, all other services are static and pulled in by dependencies. It will honour zfs=off and zfs=no options on the kernel command line. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #2108	2014-02-05 12:25:30 -08:00
Brian Behlendorf	0f62f3f9ab	Disable GCCs aggressive loop optimization GCC >= 4.8's aggressive loop optimization breaks some of the iterators over the dn_blkptr[] pseudo-array in dnode_phys. Since dn_blkptr[] is defined as a single-element array, GCC believes an iterator can only access index 0 and will unroll the loop into a single iteration. One way to resolve the issue would be to cast the array to a pointer and fix all the iterators that might break. The only loop where it is known to cause a problem is this loop in dmu_objset_write_ready(): for (i = 0; i < dnp->dn_nblkptr; i++) bp->blk_fill += dnp->dn_blkptr[i].blk_fill; In the common case where dn_nblkptr is 3, the loop is only executed a single time and "i" is equal to 1 following the loop. The specific breakage caused by this problem is that the blk_fill of root block pointers wouldn't be set properly when more than one blkptr is in use (when no indirect blocks are needed). The simple reproducing sequence is: zpool create tank /tank.img zdb -ddddd tank 0 Notice that "fill=31", however, there are two L0 indirect blocks with "F=31" and "F=5". The fill count should be 36 rather than 31. This problem causes an assert to be hit in a simple "zdb tank" when built with --enable-debug. However, this approach was not taken because we need to be absolutely sure we catch all instances of this unwanted optimization. Therefore, the build system has been updated to detect if GCC supports the aggressive loop optimization. If it does the optimization will be explicitly disabled using the -fno-aggressive-loop-optimizations option. Original-fix-by: Tim Chase <tim@chase2k.com> Signed-off-by: Tim Chase <tim@chase2k.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #2010 Closes #2051	2014-01-14 13:55:58 -08:00
Matthew Thode	11b9ec23b9	Add full SELinux support Four new dataset properties have been added to support SELinux. They are 'context', 'fscontext', 'defcontext' and 'rootcontext' which map directly to the context options described in mount(8). When one of these properties is set to something other than 'none', that string will be passed verbatim as a mount option for the given context when the filesystem is mounted. For example, if you wanted the rootcontext for a filesystem to be set to 'system_u:object_r:fs_t' you would set the property as follows: $ zfs set rootcontext="system_u:object_r:fs_t" storage-pool/media This will ensure the filesystem is automatically mounted with that rootcontext. It is equivalent to manually specifying the rootcontext with the -o option like this: $ zfs mount -o rootcontext=system_u:object_r:fs_t storage-pool/media By default all four contexts are set to 'none'. Further information on SELinux contexts is detailed in mount(8) and selinux(8) man pages. Signed-off-by: Matthew Thode <prometheanfire@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Richard Yao <ryao@gentoo.org> Closes #1504	2013-12-19 10:37:31 -08:00
Richard Yao	729210564a	Properly ignore bdi_setup_and_register return value This broke compilation against Linux 3.13 and GCC 4.7.3. Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1906	2013-12-04 14:53:45 -08:00
Richard Yao	50a0749eba	Linux 3.13 compat: Pass NULL for new delegated inode argument This check was originally added for SLES10, `a093c6a`, to check for a 'struct vfsmount *' argument which they added. However, since SLES10 is based on a 2.6.16 kernel which is no longer supported this functionality was dropped. The checks were refactored to support Linux 3.13 without concern for historical versions. Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #312	2013-12-02 10:37:49 -08:00
DHE	fd23663000	Fix typos in commit `b83e3e48c9` There's a missing semicolon and equals sign in the first hunk of this commit in config/kernel-bdi.m4. This results in the test always failing. The effects were noticed when rrdtool, a tool which modifies files by mmap() and msync(), would have data never get saved to disk in spite of the files working while the mounted filesystem remains mounted. Signed-off-by: DHE <git@dehacked.net> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Richard Yao <ryao@gentoo.org> Closes #1889	2013-11-20 15:24:39 -08:00
Richard Yao	c3d9c0df3e	Linux 3.12 compat: New shrinker API torvalds/linux@24f7c6 introduced a new shrinker API while torvalds/linux@a0b021 dropped support for the old shrinker API. This patch adds support for the new shrinker API by wrapping the old one with the new one. This change also reorganizes the autotools checks on the shrinker API such that the configure script will fail early if an unknown API is encountered in the future. Support for the set_shrinker() API which was used by Linux 2.6.22 and older has been dropped. As a general rule compatibility is only maintained back to Linux 2.6.26. Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes zfsonlinux/zfs#1732 Closes zfsonlinux/zfs#1822 Closes #293 Closes #307	2013-11-06 13:23:40 -08:00
Ned Bass	184c687387	Emulate illumos interface cv_timedwait_hires() Needed for Illumos #3582. This interface is supposed to support a variable-resolution timeout with nanosecond granularity. This implementation rounds up to microsecond resolution, as nanosecond-precision timing is rarely needed for real-world performance tuning and may incur unnecessary busy-waiting. usleep_range() is used if available, otherwise udelay() or msleep() are used depending on the length of the delay interval. Add flags from sys/callo.h as these are used to control the behavior of cv_timedwait_hires(). Specifically, CALLOUT_FLAG_ABSOLUTE Normally, the expiration passed to the timeout API functions is an expiration interval. If this flag is specified, then it is interpreted as the expiration time itself. CALLOUT_FLAG_ROUNDUP Roundup the expiration time to the next resolution boundary. If this flag is not specified, the expiration time is rounded down. References: https://www.illumos.org/issues/3582 illumos/illumos-gate@0689f76 Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #304	2013-11-04 09:49:24 -08:00
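The two flag semantics described in the cv_timedwait_hires() commit above can be illustrated with plain integer arithmetic. This is a sketch under assumptions: the flag values and the function `expiration_ns` are hypothetical placeholders, not the definitions from sys/callo.h or the SPL.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder flag bits for this sketch only (not the sys/callo.h values). */
#define SKETCH_FLAG_ABSOLUTE	0x1
#define SKETCH_FLAG_ROUNDUP	0x2

/*
 * Compute an absolute expiration time in nanoseconds.
 *   now - current time
 *   tim - an interval, or the expiration time itself with ABSOLUTE set
 *   res - resolution boundary to round to
 * Without ROUNDUP the result is rounded down to the boundary.
 */
static int64_t
expiration_ns(int64_t now, int64_t tim, int64_t res, int flags)
{
	int64_t expire, rem;

	assert(res > 0 && now >= 0 && tim >= 0);

	expire = (flags & SKETCH_FLAG_ABSOLUTE) ? tim : now + tim;
	rem = expire % res;
	if (rem == 0)
		return (expire);	/* already on a boundary */
	if (flags & SKETCH_FLAG_ROUNDUP)
		return (expire - rem + res);
	return (expire - rem);
}
```

For example, with `now` at 1000, an interval of 2500 and a resolution of 1000, the raw expiration of 3500 becomes 3000 by default and 4000 with the round-up flag.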
Massimo Maggi	023699cd62	Posix ACL Support This change adds support for Posix ACLs by storing them as an xattr which is common practice for many Linux file systems. Since the Posix ACL is stored as an xattr it will not overwrite any existing ZFS/NFSv4 ACLs which may have been set. The Posix ACL will also be non-functional on other platforms although it may be visible as an xattr if that platform understands SA based xattrs. By default Posix ACLs are disabled but they may be enabled with the new 'acltype=noacl|posixacl' property. Set the property to 'posixacl' to enable them. If ZFS/NFSv4 ACL support is ever added an appropriate acltype will be added. This change passes the POSIX Test Suite cleanly with the exception of xacl/00.t test 45 which is incorrect for Linux (Ext4 fails too). http://www.tuxera.com/community/posix-test-suite/ Signed-off-by: Massimo Maggi <me@massimo-maggi.eu> Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #170	2013-10-29 14:54:26 -07:00
Richard Yao	1db7b9be75	Fix libblkid support libblkid support is dormant because the autotools check is broken and libblkid identifies ZFS vdevs as "zfs_member", not "zfs". We fix that with a few changes: First, we fix the libblkid autotools check to do a few things: 1. Make a 64MB file, which is the minimum size ZFS permits. 2. Make 4 fake uberblock entries to make libblkid's check succeed. 3. Return 0 upon success to make autotools use the success case. 4. Include stdlib.h to avoid implicit declaration of free(). 5. Check for "zfs_member", not "zfs" 6. Make --with-blkid disable autotools check (avoids Gentoo sandbox violation) 7. Pass '-lblkid' correctly using LIBS not LDFLAGS. Second, we change the libblkid support to scan for "zfs_member", not "zfs". This makes --with-blkid work on Gentoo. Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #1751	2013-10-10 16:56:51 -07:00
Richard Yao	b83e3e48c9	Stop runtime pointer modifications in autotools checks `c38367c73f` was meant to eliminate runtime function pointer modifications in autotools checks because they were prone to false negatives on kernels hardened by the PaX project. Unfortunately, I missed the xattr_handler and super_block->s_bdi autotools checks. Recent changes to PaX constified xattr_handler->get/set, which led me to discover this oversight. Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1433	2013-09-13 13:30:55 -07:00
Richard Yao	0f37d0c8be	Linux 3.11 compat: fops->iterate() Commit torvalds/linux@2233f31aad replaced ->readdir() with ->iterate() in struct file_operations. All filesystems must now use the new ->iterate method. To handle this the code was reworked to use the new ->iterate interface. Care was taken to keep the majority of changes confined to the ZPL layer which is already Linux specific. However, minor changes were required to the common zfs_readdir() function. Compatibility with older kernels was accomplished by adding versions of the trivial dir_emit* helper functions. Also the various _readdir() functions were reworked in to wrappers which create a dir_context structure to pass to the new _iterate() functions. Unfortunately, the new dir_emit* functions prevent us from passing a private pointer to the filldir function. The xattr directory code leveraged this ability through zfs_readdir() to generate the list of xattr names. Since we can no longer use zfs_readdir() a simplified zpl_xattr_readdir() function was added to perform the same task. Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1653 Issue #1591	2013-08-15 16:19:07 -07:00
Richard Yao	f7fd6ddd96	Linux 3.8 compat: Use kuid_t/kgid_t when required When CONFIG_UIDGID_STRICT_TYPE_CHECKS is enabled uid_t/gid_t are replaced by kuid_t/kgid_t, which are structures instead of integral types. This causes any code that uses an integral type to fail to build. The User Namespace functionality introduced in Linux 3.8 requires CONFIG_UIDGID_STRICT_TYPE_CHECKS, so we could not build against any kernel that supported it. We resolve this by converting between the new kuid_t/kgid_t structures and the original uid_t/gid_t types. Original-patch-by: DHE Rewrite-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #260	2013-08-09 10:09:29 -07:00
Brian Behlendorf	dba1d70566	Fix arc_adapt() spinning in iterate_supers_type() The iterate_supers_type() function which was introduced in the 3.0 kernel was supposed to provide a safe way to call an arbitrary function on all super blocks of a specific type. Unfortunately, because a list_head was used a bug was introduced which made it possible for iterate_supers_type() to get stuck spinning on a super block which was just deactivated. This can occur because when the list head is removed from the fs_supers list it is reinitialized to point to itself. If the iterate_supers_type() function happened to be processing the removed list_head it will get stuck spinning on that list_head. The bug was fixed in the 3.3 kernel by converting the list_head to an hlist_node. However, to resolve the issue for existing 3.0 - 3.2 kernels we detect when a list_head is used. Then to prevent the spinning from occurring the .next pointer is set to the fs_supers list_head which ensures the iterate_supers_type() function will always terminate. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1045 Closes #861 Closes #790	2013-07-17 09:28:06 -07:00
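The spinning failure and the workaround described in the iterate_supers_type() commit above can be reproduced in user space with a minimal doubly linked list. This is an illustrative sketch only: the helpers below imitate the kernel's list_head behavior, and `steps_to_head` with its step limit is a hypothetical device to make the endless walk observable.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

static void
list_init(struct list_head *h)
{
	h->next = h->prev = h;
}

static void
list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink the node and reinitialize it to point to itself. */
static void
list_del_init(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

/*
 * Walk forward from pos until head is reached. Returns the number of
 * steps taken, or -1 if the walk did not reach head within limit steps.
 */
static int
steps_to_head(struct list_head *pos, struct list_head *head, int limit)
{
	int steps = 0;

	assert(limit > 0);
	while (pos != head) {
		if (steps++ >= limit)
			return (-1);	/* would spin forever */
		pos = pos->next;
	}
	return (steps);
}
```

A walker positioned on a node that has just been removed never reaches the list head, because the node's next pointer refers back to itself. Pointing that next pointer at the list head, as the commit describes, lets the walk terminate after one step.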
Chris Dunlop	a1d9543a39	3.10 API change: block_device_operations->release() returns void Linux kernel commit torvalds/linux@db2a144 changed the return type of block_device_operations->release() to void. Detect the expected prototype and defined our callout accordingly. Signed-off-by: Chris Dunlop <chris@onthe.net.au> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1494	2013-07-08 15:41:57 -07:00
Yuxuan Shui	1ddf9722dc	Linux 3.10 compat: replace PDE()->data with PDE_DATA() Linux kernel commit torvalds/linux@d9dda78b renamed PDE() to PDE_DATA(). To handle this detect the prefered interface and define a PDE_DATA() wrapper for consistency. Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com> Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #257	2013-07-08 15:14:21 -07:00
Yuxuan Shui	c02ab72fb9	Linux 3.10 compat: struct vmalloc_info moved Linux kernel commmit torvalds/linux@db3808c1 moved the vmalloc_info structure from a private to a public header. Now that it's available for kernel modules use it. Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #257	2013-07-08 15:09:20 -07:00
Li Dongyang	802e7b5feb	Add SEEK_DATA/SEEK_HOLE to lseek()/llseek() The approach taken was the rework zfs_holey() as little as possible and then just wrap the code as needed to ensure correct locking and error handling. Tested with xfstests 285 and 286. All tests pass except for 7-9 of 285 which try to reserve blocks first via fallocate(2) and fail because fallocate(2) is not yet supported. Note that the filp->f_lock spinlock did not exist prior to Linux 2.6.30, but we avoid the need for autotools check by virtue of the fact that SEEK_DATA/SEEK_HOLE support was not added until Linux 3.1. An autoconf check was added for lseek_execute() which is currently a private function but the expectation is that it will be exported perhaps as early as Linux 3.11. Reviewed-by: Richard Laager <rlaager@wiktel.com> Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1384	2013-07-02 09:24:43 -07:00
Carlos Alberto Lopez Perez	5165473737	Ensure --with-spl-timeout waits for spl_config.h and symvers The previous code was only waiting for the symver file. But the postinst target of the DKMS script for SPL will not only create the symvers file, but also the header spl_config.h. If we are waiting in the configure script of ZFS for the SPL symvers file, then we also need to wait for spl_config.h. Otherwise the configure script will abort because the spl_config.h is not yet available. On top of that, the function ZFS_AC_SPL_MODULE_SYMVERS is moved to the end of the function ZFS_AC_SPL to allow both checks share the with-spl-timeout parameter. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1431	2013-05-02 15:40:44 -07:00
Brian Behlendorf	e013670550	Set RPM_DEFINE_COMMON options When the kmod packaging was introduced the ability to pass the --enable-debug and --enable-dmu-tx options from configure all the way through to `make rpm|deb` was accidentally lost. Update ZFS_AC_RPM to explicitly set RPM_DEFINE_COMMON with these rpmbuild defines. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #1402	2013-04-24 16:18:55 -07:00
Turbo Fredriksson	1a33036df9	Add --bump=0 to alien Preserve the release field when creating Debian packages. The --keep-version option was not used because it results in a failure when the git '<commit>_<hash>' syntax is used for the release. The '_' is a valid character for RPM packages but not for DEBs. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Issue #1402 Issue #928	2013-04-24 16:18:53 -07:00
Turbo Fredriksson	d012ba3832	Support .nogitrelease file When building a custom release in a git tree provide the ability to prevent the release field from being overwritten by the `git describe` output. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #1402	2013-04-24 16:18:49 -07:00
Turbo Fredriksson	16253cff43	Add --bump=0 to alien Preserve the release field when creating Debian packages. The --keep-version option was not used because it results in a failure when the git '<commit>_<hash>' syntax is used for the release. The '_' is a valid character for RPM packages but not for DEBs. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Turbo Fredriksson <turbo@bayour.com> Issue zfsonlinux/zfs#1402 Issue zfsonlinux/zfs#928	2013-04-24 16:18:11 -07:00
Turbo Fredriksson	2c21370746	Support .nogitrelease file When building a custom release in a git tree provide the ability to prevent the release field from being overwritten by the `git describe` output. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue zfsonlinux/zfs#1402	2013-04-24 16:18:03 -07:00
Brian Behlendorf	d17eeafbf0	Replace the ZFS_AC_META perl dependency with awk The only remaining perl dependency is part of the ZFS_AC_META macro. By eliminating this and replacing it with awk we can avoid the need to pull in perl to rebuild the packages. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #1380	2013-04-02 16:05:45 -07:00
Brian Behlendorf	7fd629d430	Replace the SPL_AC_META perl dependency with awk The only remaining perl dependency is part of the SPL_AC_META macro. By eliminating this and replacing it with awk we can avoid the need to pull in perl to rebuild the packages. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue zfsonlinux/zfs#1380	2013-04-02 16:04:19 -07:00
Jan Engelhardt	83918aebe5	build: do not call boilerplate ourself Rationale see section 3.5 "Using `autoreconf' to Update `configure' Scripts" of the autoconf manual. Signed-off-by: Jan Engelhardt <jengelh@inai.de> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-04-02 11:08:46 -07:00
Jan Engelhardt	a9e86ac4fd	gitignore: anchor entries at their respective directory .ko is specific to module, .m4 to config, etc. Signed-off-by: Jan Engelhardt <jengelh@inai.de> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-04-02 11:07:52 -07:00
Jan Engelhardt	92c4ea38c9	build: use CPPFLAGS -D and -I are preprocessor flags, so should preferably be in the appropriate variable. Signed-off-by: Jan Engelhardt <jengelh@inai.de> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-04-02 11:07:11 -07:00
Jan Engelhardt	7a8a639390	build: resolve orthographic and other grammatical errors Signed-off-by: Jan Engelhardt <jengelh@inai.de> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-04-02 11:06:38 -07:00
Jan Engelhardt	8c39262945	build: do not call boilerplate ourself Rationale see section 3.5 "Using `autoreconf' to Update `configure' Scripts" of the autoconf manual. http://www.gnu.org/software/autoconf/manual/autoconf-2.67/html_node/autoreconf-Invocation.html Signed-off-by: Jan Engelhardt <jengelh@inai.de> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-04-02 10:55:20 -07:00
Jan Engelhardt	ea0fcfc875	gitignore: anchor entries at their respective directory .ko is specific to module, .m4 to config, etc. Signed-off-by: Jan Engelhardt <jengelh@inai.de> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-04-02 10:50:17 -07:00
Jan Engelhardt	ecf76e3676	build: use CPPFLAGS -D and -I are preprocessor flags, so should preferably be in the appropriate variable. Signed-off-by: Jan Engelhardt <jengelh@inai.de> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-04-02 10:48:26 -07:00
Jan Engelhardt	4e95cc99b0	build: resolve orthographic and other grammatical errors Signed-off-by: Jan Engelhardt <jengelh@inai.de> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-04-02 10:44:52 -07:00
Brian Behlendorf	c14183adca	Use 'git describe' for working builds When building from an arbitrary commit in the git tree it's useful for the resulting packages to be uniquely identifiable. Therefore, the build system has been updated to detect if you're compiling in a git tree. If you are building in a git tree and there are commits after the last annotated tag, then the <id>-<hash> component of 'git describe' will be used to overwrite the 'Release:' field in the META file. The only tricky part is ensuring that the 'make dist' tarball is built using the correct release. A dist-hook was added to the top level make file to rewrite the META file using the correct release. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #195 Issue #111	2013-03-22 15:00:55 -07:00
Brian Behlendorf	f6fb7651a0	Use 'git describe' for working builds When building from an arbitrary commit in the git tree it's useful for the resulting packages to be uniquely identifiable. Therefore, the build system has been updated to detect if you're compiling in a git tree. If you are building in a git tree and there are commits after the last annotated tag, then the <id>-<hash> component of 'git describe' will be used to overwrite the 'Release:' field in the META file. The only tricky part is ensuring that the 'make dist' tarball is built using the correct release. A dist-hook was added to the top level make file to rewrite the META file using the correct release. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-03-22 14:52:01 -07:00
Brian Behlendorf	f3757573a6	Refresh RPM packaging Refresh the existing RPM packaging to conform to the 'Fedora Packaging Guidelines'. This includes adopting the kmods2 packaging standard which is used for kmods distributed by rpmfusion for Fedora/RHEL. http://fedoraproject.org/wiki/Packaging:Guidelines http://rpmfusion.org/Packaging/KernelModules/Kmods2 While the spec files have been entirely rewritten, from a user perspective the only major changes are: * The Fedora packages now have a build dependency on the rpmfusion repositories. The generic kmod packages also have a new dependency on kmodtool-1.22 but it is bundled with the source rpm so no additional packages are needed. * The kernel binary module packages have been renamed from zfs-modules-* to kmod-zfs-* as specified by kmods2. * There is now a common kmod-zfs-devel-* package in addition to the per-kernel devel packages. The common package contains the development headers while the per-kernel package contains kernel specific build products. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1341	2013-03-18 15:33:17 -07:00
Brian Behlendorf	493972c896	Refresh RPM packaging Refresh the existing RPM packaging to conform to the 'Fedora Packaging Guidelines'. This includes adopting the kmods2 packaging standard which is used for kmods distributed by rpmfusion for Fedora/RHEL. http://fedoraproject.org/wiki/Packaging:Guidelines http://rpmfusion.org/Packaging/KernelModules/Kmods2 While the spec files have been entirely rewritten, from a user perspective the only major changes are: * The Fedora packages now have a build dependency on the rpmfusion repositories. The generic kmod packages also have a new dependency on kmodtool-1.22 but it is bundled with the source rpm so no additional packages are needed. * The kernel binary module packages have been renamed from spl-modules-* to kmod-spl-* as specified by kmods2. * There is now a common kmod-spl-devel-* package in addition to the per-kernel devel packages. The common package contains the development headers while the per-kernel package contains kernel specific build products. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #222	2013-03-18 15:31:54 -07:00
Richard Yao	8274ed5988	Drop support for 3 argument version of set_fs_pwd This was a suggestion that Brian Behlendorf made when reviewing an early pull request for Linux 3.9 support. This commit was made intentionally easy to revert should we ever have a reason to reintroduce support for older kernels. Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-03-14 10:43:31 -07:00
Richard Yao	a54718cfe0	Linux 3.9 compat: set_fs_root takes const struct path * torvalds/linux@dcf787f391 enforces const-correctness in passing struct path *. Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-03-14 10:43:29 -07:00
Richard Yao	2a305c34c8	Linux 3.9 compat: vfs_getattr takes two arguments The function prototype of vfs_getattr previously took struct vfsmount * and struct dentry * as arguments. These would always be defined together in a struct path. torvalds/linux@3dadecce20 modified vfs_getattr to take struct path as an argument instead. Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-03-14 10:43:26 -07:00
Richard Yao	10087fe1fa	Linux 3.9 compat: Include linux/sched/rt.h Linux 3.9 reorganized sched.h, splitting it into numerous files. torvalds/linux@8bd75c77b7 moved MAX_PRIO and MAX_RT_PRIO to linux/sched/rt.h. Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-03-14 10:43:19 -07:00
Brian Behlendorf	9b2af9a097	Configure --with-spl{-obj} auto-detect cleanup Because the install location for the spl/zfs-devel headers was changed we need to refresh the auto-detect code. Note that for packaging which already explicitly calls --with-spl{-obj} nothing has changed. The updated code is now structured like that in ZFS_AC_KERNEL and should be cleaner and easier to maintain. In addition, it's stricter about detecting a valid source and object directory. It requires: * The source directory contains the file 'spl.release' * The object directory contains the file 'spl_config.h' * The following paths will be checked. Notice the /var/lib/ and /usr/src paths require that the spl and zfs version be matched. This is done to prevent accidentally mixing releases. dnl # 1) /var/lib/dkms/spl/<version>/build dnl # 2) /usr/src/spl-<version>/<kernel-version> dnl # 3) /usr/src/spl-<version> dnl # 4) ../spl dnl # 5) /usr/src/kernels/<kernel-version> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-03-13 13:42:16 -07:00
Brian Behlendorf	0da31cd6ca	Remove ARCH packaging The kernel modules are now available in the Arch User Repository (AUR) via zfs. Since their packaging is maintained and superior to ours it is being removed from the tree. https://wiki.archlinux.org/index.php/ZFS Now that various distributions are picking up the packages we should eventually be able to remove most of this infrastructure. Packaging belongs with the distributions not upstream. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-03-06 15:46:41 -08:00
Brian Behlendorf	ffb21118ad	Add --with-dracutdir configure option The standard dracut directory has moved from /usr/share/dracut to /usr/lib/dracut. To ensure the dracut modules get installed in the correct location provide a --with-dracutdir configure option to set the path. The default install location has been updated to /usr/lib/dracut which is used by more current versions of Fedora. However, this default is overridden by the RPM packaging for consistency. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-03-06 15:46:41 -08:00
Brian Behlendorf	5f0a4b0847	Remove ARCH packaging The kernel modules are now available in the Arch User Repository (AUR) via zfs. Since their packaging is maintained and superior to ours it is being removed from the tree. https://wiki.archlinux.org/index.php/ZFS Now that various distributions are picking up the packages we should eventually be able to remove most of this infrastructure. Packaging belongs with the distributions not upstream. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-03-04 19:09:34 -08:00
Richard Yao	c38367c73f	Eliminate runtime function pointer mods in autotools checks PaX/GrSecurity patched kernels implement a dialect of C that relies on a GCC plugin for enforcement. A basic idea in this dialect is that function pointers in structures should not change during runtime. This causes code that modifies function pointers at runtime to fail to compile in many instances. The autotools checks rely on whether or not small test cases compile against a given kernel. Some autotools checks assume some default case if other cases fail. When one of these autotools checks tests a PaX/GrSecurity patched kernel by modifying a function pointer at runtime, the default case will be used. Early detection of such situations is possible by relying on compiler warnings, which are compiler errors when --enable-debug is used. Unfortunately, very few people build ZFS with --enable-debug. The more common situation is that these issues manifest themselves as runtime failures in the form of NULL pointer exceptions. Previous patches that addressed such issues with PaX/GrSecurity compatibility largely relied on rewriting autotools checks to avoid runtime function pointer modification or the addition of PaX/GrSecurity specific checks. This patch takes the previous work to its logical conclusion by eliminating the use of runtime function pointer modification. This permits the removal of PaX-specific autotools checks in favor of ones that work across all supported kernels. This should resolve issues that were reported to occur with PaX/GrSecurity-patched Linux 3.7.5 kernels on Gentoo Linux. https://bugs.gentoo.org/show_bug.cgi?id=457176 We should be able to prevent future regressions in PaX/GrSecurity compatibility by ensuring that all changes to ZFSOnLinux avoid runtime function pointer modification. 
At the same time, this does not solve the issue of silent failures triggering default cases in the autotools check, which is what permitted these regressions to become runtime failures in the first place. This will need to be addressed in a future patch. Reported-by: Marcin Mirosław <bug@mejor.pl> Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1300	2013-03-04 08:49:17 -08:00
Etienne Dechamps	d9b0ebbe82	Remove the bio_empty_barrier() check. To determine whether the kernel is capable of handling empty barrier BIOs, we check for the presence of the bio_empty_barrier() macro, which was introduced in 2.6.24. If this macro is defined, then we can flush disk vdevs; if it isn't, then flushing is disabled. Unfortunately, the bio_empty_barrier() macro was removed in 2.6.37, even though the kernel is still capable of handling empty barrier BIOs. As a result, flushing is effectively disabled on kernels >= 2.6.37, meaning that starting from this kernel version, zfs doesn't use barriers to guarantee on-disk data consistency. This is quite bad and can lead to potential data corruption on power failures. This patch fixes the issue by removing the configure check for bio_empty_barrier(), as we don't support kernels <= 2.6.24 anymore. Thanks to Richard Kojedzinszky for catching this nasty bug. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1318	2013-02-24 10:22:34 -08:00
Etienne Dechamps	d75af3c0eb	Use -Werror for all kernel configure tests. As a matter of fact, we're already using -Werror for most tests because of a bug in kernel-bio-empty-barrier.m4 which sets -Werror without reverting it afterwards. This meant that all tests which ran after this one were using -Werror. This patch simply makes it clear that we're using -Werror and makes the code more readable and more predictable. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1317	2013-02-24 10:20:28 -08:00
Richard Yao	a0625691b3	Fix HAVE_MUTEX_OWNER_TASK_STRUCT autotools check on PPC64 The HAVE_MUTEX_OWNER_TASK_STRUCT check fails on PPC64 with the following error: error: 'current' undeclared (first use in this function) We include linux/sched.h to ensure that current is available. Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-02-05 15:36:03 -08:00
Brian Behlendorf	dd3678fc29	Fix atomic64_* autoconf checks The SPL_AC_ATOMIC_SPINLOCK, SPL_AC_TYPE_ATOMIC64_CMPXCHG, and SPL_AC_TYPE_ATOMIC64_XCHG were all directly including the 'asm/atomic.h' header. As of Linux 3.4 this header was removed which results in a build failure. The right thing to do is include 'linux/atomic.h' however we can't safely do this because it doesn't exist in 2.6.26 kernels. Therefore, we include 'linux/fs.h' which in turn includes the correct atomic header regardless of the kernel version. When these incorrect APIs are used in ZFS the following build failure results. arc.c:791:80: warning: '__ret' may be used uninitialized in this function [-Wuninitialized] arc.c:791:1875: error: call to '__cmpxchg_wrong_size' declared with attribute error: Bad argument size for cmpxchg Since this is all Linux 2.6.24 compatibility code there's an argument to be made that it should be removed because kernels this old are not supported. However, because we're so close to a release I'm going to leave it in place for now. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes zfsonlinux/zfs#814 Closes zfsonlinux/zfs#1254	2013-02-05 10:05:46 -08:00
Brian Behlendorf	de081a2ab4	Check for KALLSYMS Check at ./configure time that the kernel was built with kallsyms support. If the kernel doesn't have CONFIG_KALLSYMS defined the modules will still compile cleanly but will not be loadable. So we really want to catch this early during ./configure. Note that we do not require CONFIG_KALLSYMS_ALL but it may be safely defined. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #6	2013-01-29 16:35:23 -08:00
Brian Behlendorf	79c6e4c445	Remove NPTL_GUARD_WITHIN_STACK Commit `4b2f65b253` increased the user space stack by 4x to resolve certain stack overflows. As such it no longer makes sense to worry about a single extra page which might or might not be part of the process stack. There is now ample headroom for normal usage. By eliminating this configure check we are also resolving the following segfault which intentionally occurs at configure time and may be logged in dmesg. conftest[22156]: segfault at 7fbf18a47e48 ip 00000000004007fe sp 00007fbf18a4be50 error 6 in conftest[400000+1000] Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-01-29 10:58:20 -08:00
Brian Behlendorf	2b7ab9d4d9	Linux 2.6.26 compat, lookup_bdev() It's doubtful many people were impacted by this but commit `6c28567` accidentally broke ZFS builds for 2.6.26 and earlier kernels. This commit depends on the lookup_bdev() function which exists in 2.6.26 but wasn't exported until 2.6.27. The availability of the function isn't critical so a wrapper is introduced which returns ERR_PTR(-ENOTSUP) when the function isn't defined. This will have the effect of causing zvol_is_zvol() to always fail for 2.6.26 kernels. This in turn means vdevs will always get opened concurrently which is good for normal usage. This will only become an issue if you're using a zvol as a vdev in another pool, in which case you really should be using a newer kernel anyway. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1205	2013-01-28 15:35:00 -08:00
Brian Behlendorf	ee93035378	Use sb->s_d_op default dentry operations As of Linux 2.6.37 the right way to register custom dentry operations is to use the super block's ->s_d_op field. For older kernels they should be registered as part of the lookup operation. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1223	2013-01-18 15:04:23 -08:00
Brian Behlendorf	84dd1f4f15	Remove spl_invalidate_inodes() This functionality is no longer required by ZFS, see commit zfsonlinux/zfs@7b3e34ba5a. Since there are no other consumers, and because it adds additional autoconf complexity which must be maintained the spl_invalidate_inodes() function has been removed. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue zfsonlinux/zfs#795	2013-01-17 11:40:47 -08:00
Ned Bass	f1a05fa114	Fix false ENOENT on snapshot control dentries Lookups in the snapshot control directory for an existing snapshot fail with ENOENT if an earlier lookup failed before the snapshot was created. This is because the earlier lookup causes a negative dentry to be cached which is never invalidated. The bug can be reproduced as follows (the second ls should succeed): $ ls /tank/.zfs/snapshot/s ls: cannot access /tank/.zfs/snapshot/s: No such file or directory $ zfs snap tank@s $ ls /tank/.zfs/snapshot/s ls: cannot access /tank/.zfs/snapshot/s: No such file or directory To remedy this, always invalidate cached dentries in the snapshot control directory. Since these entries never exist on disk there is no significant performance penalty for the extra lookups. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1192	2013-01-16 16:28:54 -08:00
Brian Behlendorf	e191b54ecf	Only use gcc -Wunused-but-set-variable when available Certain versions of gcc generate an 'unrecognized command line option' error message when -Wunused-but-set-variable is used unconditionally. This in turn can cause several of the autoconf tests to misdetect an interface. Now, the use of -Wunused-but-set-variable in the autoconf tests was introduced by commit `b9c59ec8` to address a gcc 4.6 compatibility problem. So we really only need to pass this option for versions of gcc which are known to support it. Therefore, the tests have been updated to use the result of the existing ZFS_AC_CONFIG_ALWAYS_NO_UNUSED_BUT_SET_VARIABLE which determines if gcc supports this option. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1004	2013-01-10 16:09:39 -08:00
Brian Behlendorf	42b3ce622f	Check for ZLIB_INFLATE and ZLIB_DEFLATE Check at ./configure time that the kernel was built with zlib support enabled. This support may either be configured as a module or builtin to the kernel. But if it's missing the build will fail so it's best to catch this early. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes zfsonlinux/zfs#582	2013-01-09 16:40:25 -08:00
Brian Behlendorf	050cd84e62	Linux compat 3.7.1, on_each_cpu() Some kernels require that we include the 'linux/irqflags.h' header for the SPL_AC_3ARGS_ON_EACH_CPU check. Otherwise, the functions local_irq_enable()/local_irq_disable() will not be defined and the prototype will be misdetected as the four argument version. This change actually includes 'linux/interrupt.h' which in turn includes 'linux/irqflags.h' to be as generic as possible. Additionally, passing NULL as the function can result in a gcc error because the on_each_cpu() macro executes it unconditionally. To make the test more robust we pass the dummy function on_each_cpu_func(). Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #204	2013-01-09 10:28:28 -08:00
Brian Behlendorf	1c7b3eaf87	RHEL 6.4 compat, fallocate() In the upstream kernel the FALLOC_FL_PUNCH_HOLE #define was introduced after the fallocate() function was moved from the inode_operations to the file_operations structure. Therefore, the SPL code assumed that if FALLOC_FL_PUNCH_HOLE was defined it was safe to use f_ops->fallocate(). Unfortunately, the RHEL6.4 kernel has only backported the FALLOC_FL_PUNCH_HOLE #define and not the fallocate() change. To address this compatibility issue the spl_filp_fallocate() helper function was added to properly detect which interface is available. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-01-08 09:53:13 -08:00
Brian Behlendorf	8780c53961	Update SAs when an inode is dirtied Revert the portion of commit `d3aa3ea` which always resulted in the SAs being updated when an mmap()'ed file was closed. That change accidentally resulted in unexpected ctime updates which upset tools like git. That was always a horrible hack and I'm happy it will never make it into a tagged release. The right fix is something I initially resisted doing because I was worried about the additional overhead. However, in hindsight the overhead isn't as bad as I feared. This patch implements the sops->dirty_inode() callback which is unsurprisingly called when an inode is dirtied. We leverage this callback to keep the znode SAs strictly in sync with the inode. However, for now we're going to go slowly to avoid introducing any new unexpected issues by only updating the atime, mtime, and ctime. This will cover the callpath of most concern to us. ->filemap_page_mkwrite->file_update_time->update_time-> mark_inode_dirty_sync->__mark_inode_dirty->dirty_inode Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #764 Closes #1140	2012-12-14 12:18:54 -08:00
Brian Behlendorf	eb0be2ed46	Removed SPL_AC_3ARGS_INIT_WORK check All consumers of the kernel delayed work queues have been shifted over to rely on the taskq implementation. This compatibility code can now be removed. Any new callers which need this functionality should use the taskq interfaces for delayed work items. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-12-12 09:57:10 -08:00
Brian Behlendorf	56a517ae3a	Verify --with-linux source directory exists Previously this check was only performed when ./configure was attempting to autodetect your kernel source directory. But we should also handle the case where --with-linux was provided and is obviously wrong. This way we catch the error before invoking make and compiling the source with incorrect autoconf results. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes zfsonlinux/spl#162	2012-11-29 15:08:35 -08:00
Brian Behlendorf	251677e98f	Verify --with-linux source directory exists Previously this check was only performed when ./configure was attempting to autodetect your kernel source directory. But we should also handle the case where --with-linux was provided and is obviously wrong. This way we catch the error before invoking make and compiling the source with incorrect autoconf results. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #162	2012-11-29 15:05:54 -08:00
Brian Behlendorf	2404b01499	Improve AF hard disk detection Use the bdev_physical_block_size() interface to determine the minimum write size which can be issued without incurring a read-modify-write operation. This is used to set the ashift correctly to prevent a performance penalty when using AF hard disks. Unfortunately, this interface isn't entirely reliable because it's not uncommon for disks to misreport this value. For this reason you may still need to manually set your ashift with: zpool create -o ashift=12 ... The solution to this in the upstream Illumos source was to add a white list of known offending drives. Maintaining such a list will be a burden, but it still may be worth doing if we can detect a large number of these drives. This should be considered as future work. Reported-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #916	2012-11-15 11:06:14 -08:00
Brian Behlendorf	1e0c2c2ccf	Linux 3.7 compat, __clear_close_on_exec() removed Commit torvalds/linux@b8318b0 moved the __clear_close_on_exec() function out of include/linux/fdtable.h and into fs/file.c making it unavailable to the SPL. Now as it turns out we only used this function to tear down some test infrastructure for the vn_getf()/vn_releasef() SPLAT regression tests. Rather than implement even more autoconf compatibility code to handle this we just remove the test case. This also allows us to drop three existing autoconf tests. This does mean the SPLAT tests will no longer verify these functions but historically they have never been a problem. And if we feel we absolutely need this test coverage I'm sure a more portable version of the test case could be added. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #183	2012-10-18 13:36:44 -07:00
Yuxuan Shui	bcb15891ab	Linux 3.6 compat, kern_path_locked() added The kern_path_parent() function was removed from Linux 3.6 because it was observed that all the callers just want the parent dentry. The simpler kern_path_locked() function replaces kern_path_parent() and does the lookup while holding the ->i_mutex lock. This is good news for the vn implementation because it removes the need for us to handle the locking. However, it makes it harder to implement a single readable vn_remove()/vn_rename() function which is usually what we prefer. Therefore, we implement a new version of vn_remove()/vn_rename() for Linux 3.6 and newer kernels. This allows us to leave the existing working implementation untouched, and to add a simpler version for newer kernels. Long term I would very much like to see all of the vn code removed since what this code enabled is generally frowned upon in the kernel. But that can't happen until we either abandon the zpool.cache file or implement alternate infrastructure to update it correctly in user space. Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com> Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #154	2012-10-14 16:26:21 -07:00
Richard Yao	95f5c63b47	Linux 3.6 compat, iops->mkdir() Use .mkdir instead of .create in 3.3 compatibility check. Linux 3.6 modifies inode_operations->create's function prototype. This causes an autotools Linux 3.3 compatibility check for a function prototype change in create, mkdir and mknod to fail. Since mkdir and mknod are unchanged, we modify the check to examine mkdir instead. Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #873	2012-10-14 15:29:26 -07:00
Yuxuan Shui	558ef6d080	Linux 3.6 compat, iops->create() As of Linux commit ebfc3b49a7ac25920cb5be5445f602e51d2ea559 the struct nameidata is no longer passed to iops->create. Instead only the result of (inamedata->flags & LOOKUP_EXCL) is passed. ZFS, like almost all Linux filesystems, never made use of this so only the prototype needs to be wrapped for compatibility. Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #873	2012-10-14 14:42:25 -07:00
Yuxuan Shui	8f195a908f	Linux 3.6 compat, iops->lookup() As of Linux commit 00cd8dd3bf95f2cc8435b4cac01d9995635c6d0b the struct nameidata is no longer passed to iops->lookup. Instead only the inamedata->flags are passed. ZFS, like almost all Linux filesystems, never made use of this so only the prototype needs to be wrapped for compatibility. Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #873	2012-10-14 13:06:54 -07:00
Yuxuan Shui	3c20361075	Linux 3.6 compat, sget() As of Linux commit 9249e17fe094d853d1ef7475dd559a2cc7e23d42 the mount flags are now passed to sget() so they can be used when initializing a new superblock. ZFS never uses sget() in this fashion so we can simply pass a zero and add a zpl_sget() compatibility wrapper. Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #873	2012-10-14 13:06:48 -07:00
Etienne Dechamps	bbdc6ae495	Add interface for file hole punching. This adds an interface to "punch holes" (deallocate space) in VFS files. The interface is identical to the Solaris VOP_SPACE interface. This interface is necessary for TRIM support on file vdevs. This is implemented using Linux fallocate(FALLOC_FL_PUNCH_HOLE), which was introduced in 2.6.38. For a brief time before 2.6.38 this was done using the truncate_range inode operation, which was quickly deprecated. This patch only supports FALLOC_FL_PUNCH_HOLE. This adds support for the truncate_range() inode operation to VOP_SPACE() for file hole punching. This API is deprecated and removed in 3.5, so it's only useful for old kernels. On tmpfs, the truncate_range() inode operation translates to shmem_truncate_range(). Unfortunately, this function expects the end offset to be inclusive and aligned to the end of a page. If it is not, the kernel will stop with a BUG_ON(). This patch fixes the issue by adapting to the constraints set forth by shmem_truncate_range(). Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #168	2012-10-04 16:22:07 -07:00
Brian Behlendorf	6d1d976b2c	Modify vdev_elevator_switch() to use elevator_change() As of Linux 2.6.36 an elevator_change() interface was added. This commit updates vdev_elevator_switch() to use this interface when available, otherwise it falls back to the usermodehelper method. Original-patch-by: foobarz <sysop@xeon.(none)> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #906	2012-10-03 13:31:44 -07:00
Cyril Plisko	393b44c711	Implement .commit_metadata hook for NFS export In order to implement synchronous NFS metadata semantics ZFS needs to provide the .commit_metadata hook. All it takes there is to make sure changes are committed to ZIL. Fortunately zfs_fsync() does just that, so simply calling it from zpl_commit_metadata() does the trick. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #969	2012-10-03 10:49:45 -07:00
Brian Behlendorf	cda4db408c	Revert "Improve AF hard disk detection" This reverts commit `395350c85d` which accidentally introduced issue #955. Pools using AF drives which were originally created with a sector size of 512 bytes will now be correctly detected to have physical sector size of 4096. This is desirable for a new pool, however for an existing pool abruptly changing the sector size causes problems. For this reason, this change is being reverted until the additional logic can be added to detect the existing pool case. Existing pools must use the ashift size stored in the label regardless of what the disk reports. This is critical for compatibility. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #955	2012-09-11 16:33:49 -07:00
Brian Behlendorf	395350c85d	Improve AF hard disk detection Use the bdev_physical_block_size() interface to determine the minimum write size which can be issued without incurring a read-modify-write operation. This is used to set the ashift correctly to prevent a performance penalty when using AF hard disks. Unfortunately, this interface isn't entirely reliable because it's not uncommon for disks to misreport this value. For this reason you may still need to manually set your ashift with: zpool create -o ashift=12 ... The solution to this in the upstream Illumos source was to add a white list of known offending drives. Maintaining such a list will be a burden, but it still may be worth doing if we can detect a large number of these drives. This should be considered as future work. Reported-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #916	2012-09-04 15:35:32 -07:00
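The relationship between a reported physical block size and the ashift value can be sketched as follows. The ashift is the base-2 logarithm of the sector size, so 4096-byte sectors correspond to ashift=12 as in the zpool command above. The helper name and the 512-byte lower bound are assumptions made for this illustration and are not taken from the commit.

```python
MIN_ASHIFT = 9  # assumed lower bound for this sketch: 512-byte sectors


def ashift_for(physical_block_size):
    """Smallest shift such that (1 << shift) >= the reported block size."""
    if physical_block_size <= 0:
        raise ValueError("block size must be positive")
    shift = (physical_block_size - 1).bit_length()  # ceil(log2(size))
    return max(shift, MIN_ASHIFT)
```

A disk that misreports 512 bytes would yield 9 here, which is why a manual override can still be necessary.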
Brian Behlendorf	bc03e07a7c	Revert "Detect kernels that honor gfp flags passed to vmalloc()" This reverts commit `36811b4430`. Which is no longer required because there is now SPL code in place to safely handle the deadlocks the kernel patch was designed to address. Therefore we can unconditionally use vmalloc() and drop all the PF_MEMALLOC code. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-08-27 12:00:55 -07:00
Prakash Surya	587045a638	Remove SPL_LINUX_CONFIG autoconf macro Since removing the check for CONFIG_PREEMPT, there are no consumers of the SPL_LINUX_CONFIG macro. As such, there is no reason to keep it around. Signed-off-by: Prakash Surya <surya1@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #164	2012-08-27 11:58:37 -07:00
Prakash Surya	f86373f5b2	Remove autoconf check for CONFIG_PREEMPT The autoconf macro which failed if CONFIG_PREEMPT was set in the kernel config was removed. With the inclusion of a few previous patches targeting support for preempt enabled kernels, it is now safe to run with this kernel config option enabled. Signed-off-by: Prakash Surya <surya1@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #83	2012-08-27 11:54:41 -07:00
Prakash Surya	e3a4360702	Revert "Make CONFIG_PREEMPT Fatal" This reverts commit `7731d46b69`. Signed-off-by: Prakash Surya <surya1@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-08-27 11:52:53 -07:00
Brian Behlendorf	ca8b5af89d	Remove autotools products Remove all of the generated autotools products from the repository and update the .gitignore files accordingly. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #718	2012-08-27 11:47:44 -07:00
Brian Behlendorf	c638e9ad04	Remove autotools products Remove all of the generated autotools products from the repository and update the .gitignore files accordingly. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue zfsonlinux/zfs#718	2012-08-27 11:46:23 -07:00
Richard Yao	074e72953c	Check kernel source directory for SPL ZFS fails to build when SPL is built into the kernel unless --with-spl=/path/to/kernel/sources is specified. We fall back to the kernel sources directory when SPL is not found elsewhere to resolve that. Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closed #896	2012-08-26 13:49:09 -07:00
Massimo Maggi	52cd92022e	Fix snapshot automounting with GrSecurity constify plugin. ./configure erroneously detects absence of dops->d_automount when built against a GrSecurity patched kernel. Summarized error message found in config.log: checking whether dops->d_automount() exists ... In function 'main': ... error: constified variable 'dops' cannot be local The "dops" variable cannot be a local variable, so it's moved to the global scope. This test also fails if the prototype of the dops->d_automount function pointer is changed. Signed-off-by: Massimo Maggi <massimo@mmmm.it> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Closes #884	2012-08-24 08:56:38 -07:00
Prakash Surya	26e08952e6	Support building a zfs-modules-dkms sub package This commit adds support for building a zfs-modules-dkms sub package built around Dynamic Kernel Module Support. This is to allow building packages using the DKMS infrastructure which is intended to ease the burden of kernel version changes, upgrades, etc. By default zfs-modules-dkms-* sub package will be built as part of the 'make rpm' target. Alternately, you can build only the DKMS module package using the 'make rpm-dkms' target. Examples: # To build packaged binaries as well as a dkms packages $ ./configure && make rpm # To build only the packaged binary utilities and dkms packages $ ./configure && make rpm-utils rpm-dkms Note: Only the RHEL 5/6, CHAOS 5, and Fedora distributions are supported for building the dkms sub package. Signed-off-by: Prakash Surya <surya1@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #535	2012-08-08 15:21:01 -07:00
Prakash Surya	5085d55817	Add '--with-spl-timeout' option When checking for the SPL Module.symvers file, a timeout can now be passed in which will pause the configure step while it waits for this file to be generated. By default, the configure behavior is unchanged as a timeout of 0 is used. If a positive number of seconds is passed, configure will wait that number of seconds for the Module.symvers file before moving on. The main motivation for this change was to support parallel execution of './configure && make' for the SPL and ZFS packages in preparation of supporting DKMS based packages. Signed-off-by: Prakash Surya <surya1@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-08-08 15:20:55 -07:00
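The waiting behavior described for '--with-spl-timeout' amounts to polling for a file until a deadline passes. The sketch below models that idea in userspace; the function name, the poll interval, and the return convention are assumptions, and the real check lives in the configure script rather than in Python.

```python
import os
import time


def wait_for_file(path, timeout_seconds=0, poll_interval=1.0):
    """Return True once path exists, or False when the timeout expires.

    With the default timeout of 0 the file is checked exactly once,
    mirroring the unchanged default behavior described above.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        if os.path.exists(path):
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        time.sleep(min(poll_interval, remaining))  # never sleep past the deadline
```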
Prakash Surya	d83d25c2f8	Support building a spl-modules-dkms sub package This commit adds support for building a spl-modules-dkms sub package built around Dynamic Kernel Module Support. This is to allow building packages using the DKMS infrastructure which is intended to ease the burden of kernel version changes, upgrades, etc. By default spl-modules-dkms-* sub package will be built as part of the 'make rpm' target. Alternately, you can build only the DKMS module package using the 'make rpm-dkms' target. Examples: # To build packaged binaries as well as a dkms packages $ ./configure && make rpm # To build only the packaged binary utilities and dkms packages $ ./configure && make rpm-utils rpm-dkms Note: Only the RHEL 5/6, CHAOS 5, and Fedora distributions are supported for building the dkms sub package. Signed-off-by: Prakash Surya <surya1@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue zfsonlinux/zfs#535	2012-08-08 13:49:40 -07:00
Etienne Dechamps	ee5fd0bb80	Set zvol discard_granularity to the volblocksize. Currently, zvols have a discard granularity set to 0, which suggests to the upper layer that discard requests of arbitrarily small size and alignment can be made efficiently. In practice however, ZFS does not handle unaligned discard requests efficiently: indeed, it is unable to free a part of a block. It will write zeros to the specified range instead, which is both useless and inefficient (see dnode_free_range). With this patch, zvol block devices expose volblocksize as their discard granularity, so the upper layer is aware that it's not supposed to send discard requests smaller than volblocksize. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #862	2012-08-07 14:55:31 -07:00
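To see why the granularity matters, consider which part of a discard request covers whole blocks and can therefore actually be freed. The following sketch computes that sub-range; it is an illustration of the arithmetic only, with a hypothetical helper name, and does not reproduce the zvol code.

```python
def aligned_discard_range(offset, length, volblocksize):
    """Return (start, end) of the whole blocks inside [offset, offset + length).

    Returns None when the request does not cover any complete block, in
    which case no space could be freed.
    """
    start = -(-offset // volblocksize) * volblocksize  # round start up
    end = ((offset + length) // volblocksize) * volblocksize  # round end down
    if end <= start:
        return None
    return start, end
```

With an 8192-byte volblocksize, an 8192-byte request starting at offset 4096 straddles two blocks and covers neither completely, so nothing can be freed.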
Etienne Dechamps	476ff5a4da	Handle any invalidate_inodes_check prototype. In the comments of commit `723aa3b0c2`, mmatuska reported that the test for invalidate_inodes_check() is broken if invalidate_inodes() takes two arguments. This patch fixes the issue by resorting to another approach for detecting invalidate_inodes_check(): it simply checks if invalidate_inodes is defined as a macro. If it is, then it concludes that invalidate_inodes_check() is available. This will continue to work even if the prototype of invalidate_inodes_check() changes over time. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #148	2012-08-06 11:39:49 -07:00
Etienne Dechamps	723aa3b0c2	When checking for symbol exports, try compiling. This patch adds a new autoconf function: SPL_LINUX_TRY_COMPILE_SYMBOL. This new function does the following: - Call LINUX_TRY_COMPILE with the specified parameters. - If unsuccessful, return false. - If successful and we're configuring with --enable-linux-builtin, return true. - Else, call CHECK_SYMBOL_EXPORT with the specified parameters and return the result. All calls to CHECK_SYMBOL_EXPORT are converted to LINUX_TRY_COMPILE_SYMBOL so that the tests work even when configuring for builtin on a kernel which doesn't have loadable module support, or hasn't been built yet. The only exceptions are: - AC_GET_VMALLOC_INFO, because we don't even have a public header to include in the test case, but that's okay considering this symbol can be ignored just fine. - SPL_AC_DEVICE_CREATE, which is legacy API for 2.6.18 kernels. Since kernels this old are no longer supported it should arguably just be removed entirely from the build system. Note that we're also checking for the correct prototype with an actual call, which was not the case with CHECK_SYMBOL_EXPORT. However, for "complicated" test cases like with multiple symbol versions (e.g. vfs_fsync), we stick with the original behavior and only check for the function's existence. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue zfsonlinux/zfs#851	2012-07-26 15:12:35 -07:00
Etienne Dechamps	df7cc5bc71	Fake modpost stage for LINUX_COMPILE. Currently, when building a test case, we're compiling an entire Linux module from beginning to end. This includes the MODPOST stage, which generates a "conftest.mod.c" file with some boilerplate module declaration code. This poses a problem when configuring for built-in on kernels which have loadable module support disabled. In this case conftest.mod.c is referencing disabled code, resulting in a compilation failure, thus breaking the tests. This patch fixes the issue by faking the modpost stage when the --enable-linux-builtin option is provided. It does so by forcing the modpost command to be /bin/true, and using an empty conftest.mod.c file. The test module still compiles fine, although the result isn't loadable, but we don't really care at this point. Note it is important to preserve the modpost stage when building out of tree. This allows for the possibility of configure checks to leverage this phase to identify GPL-only symbols. Signed-off-by: Prakash Surya <surya1@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue zfsonlinux/zfs#851	2012-07-26 15:12:10 -07:00
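The decision sequence listed for SPL_LINUX_TRY_COMPILE_SYMBOL can be written out as a short model. The real macro is autoconf/m4; the Python function below only mirrors the control flow, and its name and callable parameters are hypothetical stand-ins for LINUX_TRY_COMPILE and CHECK_SYMBOL_EXPORT.

```python
def try_compile_symbol(compiles, builtin, symbol_exported):
    """Model of the described check.

    compiles        -- callable, stands in for the compile test
    builtin         -- True when configuring with --enable-linux-builtin
    symbol_exported -- callable, stands in for the export check
    """
    if not compiles():
        return False  # the test case failed to compile
    if builtin:
        return True  # built-in: the export check is skipped entirely
    return symbol_exported()  # otherwise defer to the export check
```

Passing the checks as callables makes the short-circuit visible: in the built-in case the export check is never invoked, which is what lets the test work on a kernel that has not been built yet.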
Etienne Dechamps	0408008b33	Make configure builtin-aware. This patch adds a new option to configure: --enable-linux-builtin. When this option is used, the following happens: - Compilation of kernel modules is disabled. - A failure to find UTS_RELEASE is followed by a suggestion to run "make prepare" on the kernel source tree. This patch also adds a new test which tries to compile an empty module as a basic toolchain sanity test. If it fails and the option was specified, the error is followed by a suggestion to run "make scripts" on the kernel source tree. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue zfsonlinux/zfs#851	2012-07-26 14:55:20 -07:00
Etienne Dechamps	016432fbeb	Don't build packages that haven't been selected. Currently, when configure --with-config is used, selective compilation is only effective for the simple "make" case. Package builders (e.g. make rpm) still build everything (utils and modules). This patch fixes that. This patch also drops the duplicate rpm-modules build target. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Prakash Surya <surya1@llnl.gov> Issue zfsonlinux/zfs#851	2012-07-26 14:54:32 -07:00
Etienne Dechamps	705741827a	When checking for symbol exports, try compiling. This patch adds a new autoconf function: ZFS_LINUX_TRY_COMPILE_SYMBOL. This new function does the following: - Call LINUX_TRY_COMPILE with the specified parameters. - If unsuccessful, return false. - If successful and we're configuring with --enable-linux-builtin, return true. - Else, call CHECK_SYMBOL_EXPORT with the specified parameters and return the result. All calls to CHECK_SYMBOL_EXPORT are converted to LINUX_TRY_COMPILE_SYMBOL so that the tests work even when configuring for builtin on a kernel which doesn't have loadable module support, or hasn't been built yet. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #851	2012-07-26 13:42:57 -07:00
Etienne Dechamps	fc88a6dda9	Fake modpost stage for LINUX_COMPILE. Currently, when building a test case, we're compiling an entire Linux module from beginning to end. This includes the MODPOST stage, which generates a "conftest.mod.c" file with some boilerplate module declaration code. This poses a problem when configuring for built-in on kernels which have loadable module support disabled. In this case conftest.mod.c is referencing disabled code, resulting in a compilation failure, thus breaking the tests. This patch fixes the issue by faking the modpost stage when the --enable-linux-builtin option is provided. It does so by forcing the modpost command to be /bin/true, and using an empty conftest.mod.c file. The test module still compiles fine, although the result isn't loadable, but we don't really care at this point. Note it is important to preserve the modpost stage when building out of tree. The ZFS_AC_KERNEL_BLK_END_REQUEST, ZFS_AC_KERNEL_BLK_QUEUE_FLUSH, and ZFS_AC_KERNEL_BLK_RQ_BYTES configure checks all depend on it to identify GPL-only symbols. Signed-off-by: Prakash Surya <surya1@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #851	2012-07-26 13:41:02 -07:00
Etienne Dechamps	319a99a3d4	Make configure builtin-aware. This patch adds a new option to configure: --enable-linux-builtin. When this option is used, the following happens: - Compilation of kernel modules is disabled. - A failure to find UTS_RELEASE is followed by a suggestion to run "make prepare" on the kernel source tree. This patch also adds a new test which tries to compile an empty module as a basic toolchain sanity test. If it fails and the option was specified, the error is followed by a suggestion to run "make scripts" on the kernel source tree. Signed-off-by: Prakash Surya <surya1@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #851	2012-07-26 13:40:18 -07:00
Etienne Dechamps	b2c5198b19	Don't build packages that haven't been selected. Currently, when configure --with-config is used, selective compilation is only effective for the simple "make" case. Package builders (e.g. make rpm) still build everything (utils and modules). This patch fixes that. Signed-off-by: Prakash Surya <surya1@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #851	2012-07-26 13:39:37 -07:00
Richard Yao	739a1a82e0	Linux 3.5 compat, end_writeback() changed to clear_inode() The end_writeback() function was changed by moving the call to inode_sync_wait() earlier into evict(). This effectively changes the ordering of the sync but it does not impact the details of the zfs implementation. However, as part of this change end_writeback() was renamed to clear_inode() to reflect the new semantics. This change does impact us and clear_inode() now maps to end_writeback() for kernels prior to 3.5. Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #784	2012-07-23 12:29:36 -07:00
Richard Yao	ea1fdf46e2	Linux 3.5 compat, iops->truncate_range() removed The vmtruncate_range() support has been removed from the kernel in favor of using the fallocate method in the file_operations table. Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #784	2012-07-23 12:29:32 -07:00
Richard Yao	756c3e5a9c	Linux 3.5 compat, eops->encode_fh() takes inodes The export_operations member ->encode_fh() has been updated to take both the child and parent inodes. This interface used to take the child dentry and a bool describing if the parent is needed. NOTE: While updating this code I noticed that we do not currently cleanly handle the case where we're passed a connectable parent. This code should be audited to make sure we're doing the right thing. Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #784	2012-07-23 12:29:23 -07:00
Richard Yao	ed3fc80048	Fix NULL pointer dereference on PaX/GRSecurity patched Linux 3.3 and later kernels Support for PaX/GRSecurity patched kernels was developed against Linux 3.2. Unfortunately, an autotools check introduced for a Linux 3.3 API fails on PaX/GRSecurity patched kernels. This causes the module to be built against the Linux 3.2 ABI, which results in a NULL pointer dereference at runtime. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Closes #794 Closes #809	2012-07-20 12:31:45 -07:00
Richard Yao	0a6b03d3b8	Fix build failures on PaX/GRSecurity patched kernels Gentoo Hardened kernels include the PaX/GRSecurity patches. They use a dialect of C that relies on a GCC plugin. In particular, struct file_operations has been marked do_const in the PaX/GRSecurity dialect, which causes GCC to consider all instances of it as const. This caused failures in the autotools checks and the ZFS source code. To address this, we modify the autotools checks to take into account differences between the PaX C dialect and the regular C dialect. We also modify struct zfs_acl's z_ops member to be a pointer to a function pointer table. Lastly, we modify zpl_put_link() to address a PaX change to the function prototype of nd_get_link(). This avoids compiler errors in the PaX/GRSecurity dialect. Note that the change in zpl_put_link() causes a warning that becomes a build failure when debugging is enabled. Fixing that warning requires ryao/spl@5ca50ef459. Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #484	2012-07-17 09:22:43 -07:00
Etienne Dechamps	b5a28807cd	Move partition scanning from userspace to module. Currently, zpool online -e (dynamic vdev expansion) doesn't work on whole disks because we're invoking ioctl(BLKRRPART) from userspace while ZFS still has a partition open on the disk, which results in EBUSY. This patch moves the BLKRRPART invocation from the zpool utility to the module. Specifically, this is done just before opening the device in vdev_disk_open() which is called inside vdev_reopen(). This requires jumping through some hoops to get to the disk device from the partition device, and to make sure we can still open the partition after the BLKRRPART call. Note that this new code path is triggered on dynamic vdev expansion only; other actions, like creating a new pool, are unchanged and still call BLKRRPART from userspace. This change also depends on API changes which are available in 2.6.37 and later kernels. The build system has been updated to detect this, but there is no compatibility mode for older kernels. This means that online expansion will NOT be available in older kernels. However, it will still be possible to expand the vdev offline. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #808	2012-07-17 09:17:31 -07:00
Richard Yao	36811b4430	Detect kernels that honor gfp flags passed to vmalloc() zfsonlinux/spl@2092cf68d8 used PF_MEMALLOC to workaround a bug in the Linux kernel where allocations did not honor the gfp flags passed to vmalloc(). Unfortunately, PF_MEMALLOC has the side effect of permitting allocations to allocate pages outside of ZONE_NORMAL. This has been observed to result in the depletion of ZONE_DMA32. A kernel patch is available in the Gentoo bug tracker for this issue. https://bugs.gentoo.org/show_bug.cgi?id=416685 This negates any benefit PF_MEMALLOC provides, so we introduce an autotools check to disable the use of PF_MEMALLOC on systems with patched kernels. Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #126	2012-07-11 11:44:27 -07:00
Richard Yao	e0093fea58	Linux 3.4 compat, __clear_close_on_exec replaces FD_CLR torvalds/linux@1dce27c5aa introduced __clear_close_on_exec() as a replacement for FD_CLR. Further commits appear to have removed FD_CLR from the Linux source tree. This causes the following failure: error: implicit declaration of function '__FD_CLR' [-Werror=implicit-function-declaration] To correct this we update the code to use the current __clear_close_on_exec() interface for readability. Then we introduce an autotools check to determine if __clear_close_on_exec() is available. If it isn't then we define some compatibility logic which used the older FD_CLR() interface. Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #124	2012-06-13 16:18:51 -07:00
Richard Yao	6a0936babc	Linux 3.4 compat, d_make_root() replaces d_alloc_root() torvalds/linux@adc0e91ab1 introduced d_make_root() as a replacement for d_alloc_root(). Further commits appear to have removed d_alloc_root() from the Linux source tree. This causes the following failure: error: implicit declaration of function 'd_alloc_root' [-Werror=implicit-function-declaration] To correct this we update the code to use the current d_make_root() interface for readability. Then we introduce an autotools check to determine if d_make_root() is available. If it isn't then we define some compatibility logic which uses the older d_alloc_root() interface. Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #776	2012-06-11 10:04:49 -07:00
Ned Bass	cac1f230e0	Improve CONFIG_DEBUG_LOCK_ALLOC error message The configure script error message for kernels built with CONFIG_DEBUG_LOCK_ALLOC may give the impression that the issue is strictly with license compliance. To avoid confusion add some words indicating that the linking stage will fail if the build continues. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #773	2012-06-11 09:28:04 -07:00
Brian Behlendorf	e5b8562277	Extend CONFIG_DEBUG_LOCK_ALLOC check The CONFIG_DEBUG_LOCK_ALLOC check at configure time was added to detect when mutex_lock() is defined as a GPL-only symbol. However, the check as written only inferred this from this configuration setting, it never actually checked. This change introduces that missing check to prevent false positives. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-06-01 08:51:56 -07:00
Brian Behlendorf	b39d3b9f7b	Linux 3.3 compat, iops->create()/mkdir()/mknod() The mode argument of iops->create()/mkdir()/mknod() was changed from an 'int' to a 'umode_t'. To prevent a compiler warning an autoconf check was added to detect the API change and then correctly set a zpl_umode_t typedef. There is no functional change. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #701	2012-04-30 12:52:38 -07:00
Brian Behlendorf	f47e1351db	Fix executable permissions Caught by lint, this permission change was accidentally introduced by commit `42cb3819f1`. Restore the correct permissions and while I'm at it add a missing whack-bang to config/ltmain.sh. lint: executable-not-elf-or-script: zpool_main.c zfs_main.c Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #620	2012-03-26 11:52:44 -07:00
Brian Behlendorf	1c5de20ae2	Add --enable-debug-dmu-tx configure option Allow rigorous (and expensive) tx validation to be enabled/disabled independently from the standard zfs debugging. When enabled these checks ensure that all txs are constructed properly and that a dbuf is never dirtied without taking the correct tx hold. This checking is particularly helpful when adding new dmu consumers like Lustre. However, for established consumers such as the zpl with no known outstanding tx construction problems this is just overhead. --enable-debug-dmu-tx / --disable-debug-dmu-tx - Enable/disable validation of each tx as it is constructed. By default validation is disabled due to performance concerns. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-03-23 12:25:17 -07:00
Brian Behlendorf	ebe7e575ea	Add .zfs control directory Add support for the .zfs control directory. This was accomplished by leveraging as much of the existing ZFS infrastructure as possible and updating it for Linux as required. The bulk of the core functionality is now all there with the following limitations. *) The .zfs/snapshot directory automount support requires a 2.6.37 or newer kernel. The exception is RHEL6.2 which has backported the d_automount patches. *) Creating/destroying/renaming snapshots with mkdir/rmdir/mv in the .zfs/snapshot directory works as expected. However, this functionality is only available to root until zfs delegations are finished. * mkdir - create a snapshot * rmdir - destroy a snapshot * mv - rename a snapshot The following issues are known deficiencies, but we expect them to be addressed by future commits. *) Add automount support for kernels older than 2.6.37. This should be possible using follow_link() which is what Linux did before. *) Accessing the .zfs/snapshot directory via NFS is not yet possible. The majority of the ground work for this is complete. However, finishing this work will require resolving some lingering integration issues with the Linux NFS kernel server. *) The .zfs/shares directory exists but no further smb functionality has yet been implemented. Contributions-by: Rohan Puri <rohan.puri15@gmail.com> Contributions-by: Andrew Barnes <barnes333@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #173	2012-03-22 13:03:47 -07:00
Brian Behlendorf	a3a69b74cd	Fix distribution detection Improve the distribution detection by moving the tests for distribution specific files first. The Ubuntu and Debian checks are left for last because they are the least likely to be unique. This is particularly true in the case of Debian since so many distributions are based on Debian. Since this is currently only used to identify the correct packaging method for this system the result in many instances is simply cosmetic. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-03-05 10:38:38 -08:00
Richard Yao	76c2b24c61	Fix distribution detection Improve the distribution detection by moving the tests for distribution specific files first. The Ubuntu and Debian checks are left for last because they are the least likely to be unique. This is particularly true in the case of Debian since so many distributions are based on Debian. Since this is currently only used to identify the correct packaging method for this system the result in many instances is simply cosmetic. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-03-05 10:38:27 -08:00
Brian Behlendorf	3c208a5480	Cleanly support debug packages Allow a source rpm to be rebuilt with debugging enabled. This avoids the need to have to manually modify the spec file. By default debugging is still largely disabled. To enable specific debugging features use the following options with rpmbuild. '--with debug' - Enables ASSERTs '--with debug-log' - Enables the internal debug log '--with debug-kmem' - Enables basic memory accounting '--with debug-kmem-tracking' - Enables detailed memory tracking # For example: $ rpmbuild --rebuild --with debug spl-modules-0.6.0-rc6.src.rpm Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-02-27 14:24:22 -08:00
Brian Behlendorf	4b787d75c8	Cleanly support debug packages Allow a source rpm to be rebuilt with debugging enabled. This avoids the need to have to manually modify the spec file. By default debugging is still largely disabled. To enable specific debugging features use the following options with rpmbuild. '--with debug' - Enables ASSERTs # For example: $ rpmbuild --rebuild --with debug zfs-modules-0.6.0-rc6.src.rpm Additionally, ZFS_CONFIG has been added to zfs_config.h for packages which build against these headers. This is critical to ensure both zfs and the dependent package are using the same prototype and structure definitions. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-02-27 14:08:17 -08:00
Etienne Dechamps	30930fba21	Add support for DISCARD to ZVOLs. DISCARD (REQ_DISCARD, BLKDISCARD) is useful for thin provisioning. It allows ZVOL clients to discard (unmap, trim) block ranges from a ZVOL, thus optimizing disk space usage by allowing a ZVOL to shrink instead of just grow. We can't use zfs_space() or zfs_freesp() here, since these functions only work on regular files, not volumes. Fortunately we can use the low-level function dmu_free_long_range() which does exactly what we want. Currently the discard operation is not added to the log. That's not a big deal since losing discard requests cannot result in data corruption. It would however result in disk space usage higher than it should be. Thus adding log support to zvol_discard() is probably a good idea for a future improvement. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-02-09 16:19:38 -08:00
Etienne Dechamps	cb2d19010d	Support the fallocate() file operation. Currently only the (FALLOC_FL_PUNCH_HOLE) flag combination is supported, since it's the only one that matches the behavior of zfs_space(). This makes it pretty much useless in its current form, but it's a start. To support other flag combinations we would need to modify zfs_space() to make it more flexible, or emulate the desired functionality in zpl_fallocate(). Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #334	2012-02-09 16:19:32 -08:00
Etienne Dechamps	34037afe24	Improve ZVOL queue behavior. The Linux block device queue subsystem exposes a number of configurable settings described in Linux block/blk-settings.c. The defaults for these settings are tuned for hard drives, and are not optimized for ZVOLs. Proper configuration of these options would allow upper layers (I/O scheduler) to take better decisions about write merging and ordering. Detailed rationale: - max_hw_sectors is set to unlimited (UINT_MAX). zvol_write() is able to handle writes of any size, so there's no reason to impose a limit. Let the upper layer decide. - max_segments and max_segment_size are set to unlimited. zvol_write() will copy the requests' contents into a dbuf anyway, so the number and size of the segments are irrelevant. Let the upper layer decide. - physical_block_size and io_opt are set to the ZVOL's block size. This has the potential to somewhat alleviate issue #361 for ZVOLs, by warning the upper layers that writes smaller than the volume's block size will be slow. - The NONROT flag is set to indicate this isn't a rotational device. Although the backing zpool might be composed of rotational devices, the resulting ZVOL often doesn't exhibit the same behavior due to the COW mechanisms used by ZFS. Setting this flag will prevent upper layers from making useless decisions (such as reordering writes) based on incorrect assumptions about the behavior of the ZVOL. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-02-07 16:23:06 -08:00
Etienne Dechamps	b18019d2d8	Fix synchronicity for ZVOLs. zvol_write() assumes that the write request must be written to stable storage if rq_is_sync() is true. Unfortunately, this assumption is incorrect. Indeed, "sync" does not mean what we think it means in the context of the Linux block layer. This is well explained in linux/fs.h: WRITE: A normal async write. Device will be plugged. WRITE_SYNC: Synchronous write. Identical to WRITE, but passes down the hint that someone will be waiting on this IO shortly. WRITE_FLUSH: Like WRITE_SYNC but with preceding cache flush. WRITE_FUA: Like WRITE_SYNC but data is guaranteed to be on non-volatile media on completion. In other words, SYNC does not mean that the write must be on stable storage on completion. It just means that someone is waiting on us to complete the write request. Thus triggering a ZIL commit for each SYNC write request on a ZVOL is unnecessary and harmful for performance. To make matters worse, ZVOL users have no way to express that they actually want data to be written to stable storage, which means the ZIL is broken for ZVOLs. The request for stable storage is expressed by the FUA flag, so we must commit the ZIL after the write if the FUA flag is set. In addition, we must commit the ZIL before the write if the FLUSH flag is set. Also, we must inform the block layer that we actually support FLUSH and FUA. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-02-07 16:23:06 -08:00
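The commit ordering rules in the commit above (FLUSH commits before the write, FUA commits after it, SYNC alone commits nothing) can be sketched as a small model. This is an illustrative Python simulation under stated assumptions, not the actual zvol_write() code: the function name `zil_commits_for_request` and its boolean parameters are hypothetical stand-ins for the block-layer request flags.

```python
def zil_commits_for_request(sync, flush, fua):
    """Return the ordered steps for one write request (hypothetical model)."""
    steps = []
    if flush:
        # FLUSH: earlier writes must reach stable storage before this write.
        steps.append("zil_commit")
    steps.append("write")
    if fua:
        # FUA: this write must be on stable storage when it completes.
        steps.append("zil_commit")
    # `sync` is only a hint that someone is waiting on the request,
    # so by itself it never adds a ZIL commit.
    return steps


if __name__ == "__main__":
    print(zil_commits_for_request(sync=True, flush=False, fua=False))
    print(zil_commits_for_request(sync=False, flush=True, fua=True))
```

Under this model a plain SYNC write costs no ZIL commit, which is the performance point the commit message makes.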
Brian Behlendorf	47621f3d76	Linux 3.3 compat, sops->show_options() The second argument of sops->show_options() was changed from a 'struct vfsmount *' to a 'struct dentry *'. Add an autoconf check to detect the API change and then conditionally define the expected interface. In either case we are only interested in the zfs_sb_t. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #549	2012-02-03 10:02:01 -08:00
Brian Behlendorf	4b2220f0b9	Add --enable-debug-log configure option Until now the notion of an internal debug logging infrastructure was conflated with enabling ASSERT()s. This patch clarifies things by cleanly breaking the two subsystems apart. The result of this is the following behavior. --enable-debug - Enable/disable code wrapped in ASSERT()s. --disable-debug ASSERT()s are used to check invariants and are never required for correct operation. They are disabled by default because they may impact performance. --enable-debug-log - Enable/disable the debug log infrastructure. --disable-debug-log This infrastructure allows the spl code and its consumers to log messages to an in-kernel log. The granularity of the logging can be controlled by a debug mask. By default the mask disables most debug messages resulting in a negligible performance impact. Because of this the debug log is enabled by default. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-02-02 11:27:54 -08:00
Brian Behlendorf	b40a77aefc	Add the release component to headers When the original build system code was added the release component was accidentally omitted from the development header install path. This patch adds the missing path component so it's always clear exactly what release you're compiling against. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-01-18 12:19:47 -08:00
Brian Behlendorf	0b14b9f327	Run SPL_AC_PACMAN only if $VENDOR is "arch" Unfortunately, Arch's package manager `pacman` shares its name with a popular arcade video game. Thus, in order to refrain from executing the video game when we mean to execute the package manager, SPL_AC_PACMAN is now only run when $VENDOR is determined to be "arch". Signed-off-by: Prakash Surya <surya1@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes zfsonlinux/zfs#517	2012-01-13 09:08:12 -08:00
Prakash Surya	58d956b085	Run ZFS_AC_PACMAN only if $VENDOR is "arch" Unfortunately, Arch's package manager `pacman` shares its name with a popular arcade video game. Thus, in order to refrain from executing the video game when we mean to execute the package manager, ZFS_AC_PACMAN is now only run when $VENDOR is determined to be "arch". Signed-off-by: Prakash Surya <surya1@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #517	2012-01-13 09:03:11 -08:00
Brian Behlendorf	166dd49de0	Linux 3.2 compat, security_inode_init_security() The security_inode_init_security() API has been changed to include a filesystem specific callback to write security extended attributes. This was done to support the initialization of multiple LSM xattrs and the EVM xattr. This change updates the code to use the new API when it's available. Otherwise it falls back to the previous implementation. In addition, the ZFS_AC_KERNEL_6ARGS_SECURITY_INODE_INIT_SECURITY autoconf test has been made more rigorous by passing the expected types. This is done to ensure we always properly detect the correct form of the security_inode_init_security() API. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #516	2012-01-12 15:06:39 -08:00
Darik Horn	588d900433	Linux 3.2 compat: rw_semaphore.wait_lock is raw The wait_lock member of the rw_semaphore struct became a raw_spinlock_t in Linux 3.2 at torvalds/linux@ddb6c9b58a. Wrap spin_lock_* function calls in a new spl_rwsem_* interface to ensure type safety if raw_spinlock_t becomes architecture specific, and to satisfy these compiler warnings: warning: passing argument 1 of ‘spinlock_check’ from incompatible pointer type [enabled by default] note: expected ‘struct spinlock_t *’ but argument is of type ‘struct raw_spinlock_t *’ Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes: #76 Closes: zfsonlinux/zfs#463	2012-01-11 16:28:05 -08:00
Brian Behlendorf	ab26409db7	Linux 3.1 compat, super_block->s_shrink The Linux 3.1 kernel has introduced the concept of per-filesystem shrinkers which are directly associated with a super block. Prior to this change there was one shared global shrinker. The zfs code relied on being able to call the global shrinker when the arc_meta_limit was exceeded. This would cause the VFS to drop references on a fraction of the dentries in the dcache. The ARC could then safely reclaim the memory used by these entries and honor the arc_meta_limit. Unfortunately, when per-filesystem shrinkers were added the old interfaces were made unavailable. This change adds support to use the new per-filesystem shrinker interface so we can continue to honor the arc_meta_limit. The major benefit of the new interface is that we can now target only the zfs filesystem for dentry and inode pruning. Thus we can minimize any impact on the caching of other filesystems. In the context of making this change several other important issues related to managing the ARC were addressed, they include: * The dnlc_reduce_cache() function which was called by the ARC to drop dentries for the Posix layer was replaced with a generic zfs_prune_t callback. The ZPL layer now registers a callback to drop these dentries removing a layering violation which dates back to the Solaris code. This callback can also be used by other ARC consumers such as Lustre. arc_add_prune_callback() arc_remove_prune_callback() * The arc_reduce_dnlc_percent module option has been changed to arc_meta_prune for clarity. The dnlc functions are specific to Solaris's VFS and have already been largely eliminated. The replacement tunable now represents the number of bytes the prune callback will request when invoked. * Less aggressively invoke the prune callback. We used to call this whenever we exceeded the arc_meta_limit, however that's not strictly correct since it results in overzealous reclaim of dentries and inodes. 
It is now only called once the arc_meta_limit is exceeded and every effort has been made to evict other data from the ARC cache. * More promptly manage exceeding the arc_meta_limit. When reading meta data in to the cache if a buffer was unable to be recycled notify the arc_reclaim thread to invoke the required prune. * Added arcstat_prune kstat which is incremented when the ARC is forced to request that a consumer prune its cache. Remember this will only occur when the ARC has no other choice. If it can evict buffers safely without invoking the prune callback it will. * This change is also expected to resolve the unexpected collapses of the ARC cache. This would occur because when just the arc_meta_limit was exceeded, reclaim pressure would be exerted on the arc_c value via arc_shrink(). This effectively shrunk the entire cache when really we just needed to reclaim meta data. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #466 Closes #292	2012-01-11 11:46:02 -08:00
Brian Behlendorf	5f6c14b1ed	Proxmox VE kernel compat, invalidate_inodes() The Proxmox VE kernel contains a patch which renames the function invalidate_inodes() to invalidate_inodes_check(). In the process it adds a 'check' argument and a '#define invalidate_inodes(x)' compatibility wrapper for legacy callers. Therefore, if either of these functions are exported invalidate_inodes() can be safely used. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #58	2011-12-21 14:29:45 -08:00
Prakash Surya	8eaa020b46	Move Arch Linux's VENDOR check above Ubuntu's If the lsb-release package is installed on an Arch Linux distribution, the configure step will incorrectly detect the running distribution as Ubuntu. This is a result of both distributions providing an /etc/lsb-release file, and the Ubuntu VENDOR check being performed first. Since the Arch Linux test checks for a file more specific to the Arch Linux distribution, moving Arch Linux's VENDOR check above Ubuntu's check provides a quick and easy solution. Signed-off-by: Prakash Surya <surya1@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2011-12-19 12:05:10 -08:00
Prakash Surya	cd2817f8a6	Move Arch Linux's VENDOR check above Ubuntu's If the lsb-release package is installed on an Arch Linux distribution, the configure step will incorrectly detect the running distribution as Ubuntu. This is a result of both distributions providing an /etc/lsb-release file, and the Ubuntu VENDOR check being performed first. Since the Arch Linux test checks for a file more specific to the Arch Linux distribution, moving Arch Linux's VENDOR check above Ubuntu's check provides a quick and easy solution. Signed-off-by: Prakash Surya <surya1@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #72	2011-12-19 12:03:40 -08:00
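The ordering problem described in the two commits above can be illustrated with a short sketch: when two vendors share a marker file, the more specific check has to run first. This is a hypothetical Python model, not the configure macro itself. The function name `detect_vendor` is invented here, and `/etc/arch-release` is an assumed marker file, since the commit message does not name the file the Arch check looks for.

```python
def detect_vendor(existing_files):
    """Return the first vendor whose marker file exists (hypothetical model)."""
    # Order matters: the Arch check uses a more specific file, so it must
    # come before the Ubuntu check, which relies on the shared lsb-release.
    checks = [
        ("arch", "/etc/arch-release"),   # assumed marker file
        ("ubuntu", "/etc/lsb-release"),  # also present on Arch with lsb-release
    ]
    for vendor, marker in checks:
        if marker in existing_files:
            return vendor
    return "unknown"


if __name__ == "__main__":
    # An Arch system with the lsb-release package installed has both files.
    print(detect_vendor({"/etc/arch-release", "/etc/lsb-release"}))
```

Swapping the two entries in `checks` reproduces the misdetection the commits fix.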
Darik Horn	28eb9213d8	Linux 3.2 compat: set_nlink() Directly changing inode->i_nlink is deprecated in Linux 3.2 by commit SHA: bfe8684869601dacfcb2cd69ef8cfd9045f62170 Use the new set_nlink() kernel function instead. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes: #462	2011-12-16 20:02:52 -08:00
Prakash Surya	6ba3b44614	Add make rule for building Arch Linux packages Added the necessary build infrastructure for building packages compatible with the Arch Linux distribution. As such, one can now run: $ ./configure $ make pkg # Alternatively, one can run 'make arch' as well on the Arch Linux machine to create two binary packages compatible with the pacman package manager, one for the zfs userland utilities and another for the zfs kernel modules. The new packages can then be installed by running: # pacman -U $package.pkg.tar.xz In addition, source-only packages suitable for an Arch Linux chroot environment or remote builder can also be built using the 'sarch' make rule. NOTE: Since the source dist tarball is created on the fly from the head of the build tree, its MD5 hash signature will be continually in flux. As a result, the md5sum variable was intentionally omitted from the PKGBUILD files, and the '--skipinteg' makepkg option is used. This may or may not have any serious security implications, as the source tarball is not being downloaded from an outside source. Signed-off-by: Prakash Surya <surya1@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #491	2011-12-14 19:14:23 -08:00
Prakash Surya	c2dceb5cd5	Add make rule for building Arch Linux packages Added the necessary build infrastructure for building packages compatible with the Arch Linux distribution. As such, one can now run: $ ./configure $ make pkg # Alternatively, one can run 'make arch' as well on an Arch Linux machine to create two binary packages compatible with the pacman package manager, one for the spl userland utilities and another for the spl kernel modules. The new packages can then be installed by running: # pacman -U $package.pkg.tar.xz In addition, source-only packages suitable for an Arch Linux chroot environment or remote builder can also be built using the 'sarch' make rule. NOTE: Since the source dist tarball is created on the fly from the head of the build tree, its MD5 hash signature will be continually in flux. As a result, the md5sum variable was intentionally omitted from the PKGBUILD files, and the '--skipinteg' makepkg option is used. This may or may not have any serious security implications, as the source tarball is not being downloaded from an outside source. Signed-off-by: Prakash Surya <surya1@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes: #68	2011-12-14 16:44:10 -08:00
Prakash Surya	b9c59ec83a	Fix configure tests to play nice with GCC 4.6 As of GCC 4.6, specific kernel 2.6.32 header files do not compile cleanly without warnings. One specific example of this is the arch/x86/include/asm/percpu.h file. Thus, a few of the configure tests were getting hung up on this and the '-Wno-unused-but-set-variable' compile option had to be introduced. Signed-off-by: Prakash Surya <surya1@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #459	2011-11-29 16:14:25 -08:00
Prakash Surya	e89236fd28	In autoconf v2.68, AC_LANG_PROGRAM must be quoted This change updates the AC_LANG_PROGRAM autoconf macro invocations to be wrapped in quotes. As of autoconf version 2.68, the quotes are necessary to prevent warnings from appearing. Specifically, the autoconf v2.68 Forward Porting Notes specifies: It is important to note that you need to ensure that the call to AC_LANG_SOURCE is quoted and not expanded, otherwise that will cause the warning to appear nonetheless. Finally, because of the additional quoting we can drop the extra quotes used by the ZFS_AC_CONFIG_USER_STACK_GUARD autoconf check. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #464	2011-11-28 11:16:33 -08:00
Brian Behlendorf	adcd70bd1a	Linux 3.1 compat, fops->fsync() The Linux 3.1 kernel updated the fops->fsync() callback yet again. They now pass the requested range and delegate the responsibility for calling filemap_write_and_wait_range() to the callback. In addition i_mutex is no longer held by the caller and the callback is responsible for taking the lock if required. This commit updates the code to provide a zpl_fsync() function for the updated API. Implementations for the previous two APIs are also maintained for compatibility. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #445	2011-11-10 10:03:08 -08:00
Brian Behlendorf	0d0b523728	Linux 3.1 compat, vfs_fsync() Preferentially use the vfs_fsync() function. This function was initially introduced in 2.6.29 and took three arguments. As of 2.6.35 the dentry argument was dropped from the function. For older kernels fall back to using file_fsync() which also took three arguments including the dentry. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #52	2011-11-09 19:36:21 -08:00
Brian Behlendorf	12ff95ff57	Linux 3.1 compat, kern_path_parent() Prior to Linux 3.1 the kern_path_parent symbol was exported for use by kernel modules. As of Linux 3.1 it is no longer easily available. To handle this case the spl will now dynamically look up the address of the missing symbol at module load time. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #52	2011-11-09 16:51:25 -08:00
Brian Behlendorf	8c19f5b407	Suppress packaging warning Only under Ubuntu Lucid the rpm packaging step mistakenly adds the following files twice to the package because of the /lib naming convention. This is harmless but results in a warning which the buildbot flags as a failure. Suppress this warning. warning: File listed twice: /lib/udev/rules.d warning: File listed twice: /lib/udev/rules.d/60-zpool.rules warning: File listed twice: /lib/udev/rules.d/60-zvol.rules warning: File listed twice: /lib/udev/rules.d/90-zfs.rules warning: File listed twice: /lib/udev/sas_switch_id warning: File listed twice: /lib/udev/zpool_id warning: File listed twice: /lib/udev/zvol_id Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2011-11-08 11:32:04 -08:00
Brian Behlendorf	5547c2f1bf	Simplify BDI integration Update the code to use the bdi_setup_and_register() helper to simplify the bdi integration code. The updated code now just registers the bdi during mount and destroys it during unmount. The only complication is that for 2.6.32 - 2.6.33 kernels the helper wasn't available so in these cases the zfs code must provide it. Luckily the bdi_setup_and_register() function is trivial. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #367	2011-11-08 10:19:03 -08:00
Brian Behlendorf	97fd6a07c2	Fix HAVE_FS_STRUCT_SPINLOCK check for gcc-4.1.2 Older versions of gcc (gcc-4.1.2) will treat an 'incompatible pointer type' as a warning instead of an error. This results in HAVE_FS_STRUCT_SPINLOCK being defined incorrectly. This failure mode was observed when using a RHEL6 2.6.32 based kernel under RHEL5.5 which contains the old version of gcc. To resolve the issue the warning is explicitly promoted to an error. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2011-09-19 13:45:08 -07:00
Prakash Surya	8366cd6a83	Convert 'if' statements to AS_IF in kernel.m4 The 'if' statements found in kernel.m4 were converted to use the portable alternative provided by autoconf, the AS_IF macro. Signed-off-by: Prakash Surya <surya1@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2011-09-06 13:20:48 -07:00
Prakash Surya	2984e0bb0c	Fix minor autoconf error message inconsistencies A few of the autoconf error messages were inconsistent with the rest of the build system. To be specific, the inconsistencies addressed by this commit are the following: * The second line of the error message for the CONFIG_PREEMPT check was missing its third asterisk. * A few of the error messages were prefixed by two tabs, whereas the majority of error messages are only prefixed by a single tab. Signed-off-by: Prakash Surya <surya1@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2011-09-06 13:20:04 -07:00
Brian Behlendorf	9c4f40b894	Buildbot suppression rules The warnings listed in the suppression file will be suppressed and not flagged during regular buildbot builds. These warnings are expected, harmless, and can obscure real issues unless they are suppressed. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2011-08-19 16:26:06 -07:00
Brian Behlendorf	ddd052aa83	Improve HAVE_EVICT_INODE check The hardened gentoo kernel defines all of the super block operation callbacks as const. This prevents the autoconf test from assigning the callback and results in a false negative. By moving the assignment in to the declaration we can avoid this issue and get a correct result for this patched kernel. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #296	2011-08-08 16:42:09 -07:00
Kyle Fuller	12d06bac9b	Move udev rules from /etc/udev to /lib/udev This change moves the default install location for the zfs udev rules from /etc/udev/ to /lib/udev/. The correct convention is for rules provided by a package to be installed in /lib/udev/. The /etc/udev/ directory is reserved for custom rules or local overrides. Additionally, this patch cleans up some abuse of the bindir install location by adding udevdir and udevruledir install directories. This allows us to revert to the default bin install location. The udev install directories can be set with the following new options. --with-udevdir=DIR install udev helpers [EPREFIX/lib/udev] --with-udevruledir=DIR install udev rules [UDEVDIR/rules.d] Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #356	2011-08-08 16:21:10 -07:00
Brian Behlendorf	76659dc110	Add backing_device_info per-filesystem For a long time now the kernel has been moving away from using the pdflush daemon to write 'old' dirty pages to disk. The primary reason for this is because the pdflush daemon is single threaded and can be a limiting factor for performance. Since pdflush sequentially walks the dirty inode list for each super block any delay in processing can slow down dirty page writeback for all filesystems. The replacement for pdflush is called bdi (backing device info). The bdi system involves creating a per-filesystem control structure each with its own private sets of queues to manage writeback. The advantage is greater parallelism which improves performance and prevents a single filesystem from slowing writeback to the others. For a long time both systems co-existed in the kernel so it wasn't strictly required to implement the bdi scheme. However, as of Linux 2.6.36 kernels the pdflush functionality has been retired. Since ZFS already bypasses the page cache for most I/O this is only an issue for mmap(2) writes which must go through the page cache. Even then adding this missing support for newer kernels was overlooked because there are other mechanisms which can trigger writeback. However, there is one critical case where not implementing the bdi functionality can cause problems. If an application handles a page fault it can enter the balance_dirty_pages() callpath. This will result in the application hanging until the number of dirty pages in the system drops below the dirty ratio. Without a registered backing_device_info for the filesystem the dirty pages will not get written out. Thus the application will hang. As mentioned above this was less of an issue with older kernels because pdflush would eventually write out the dirty pages. This change adds a backing_device_info structure to the zfs_sb_t which is already allocated per-super block. 
It is then registered when the filesystem is mounted and unregistered on unmount. It will not be registered for mounted snapshots which are read-only. This change will result in a flush-<pool> thread being dynamically created and destroyed per-mounted filesystem for writeback. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #174	2011-08-04 13:37:38 -07:00
Brian Behlendorf	0da7869690	Fix the configure CONFIG_* option detection The latest kernels no longer define AUTOCONF_INCLUDED which was being used to detect the new style autoconf.h kernel configure options. This results in the CONFIG_* checks always failing incorrectly for newer kernels. The fix for this is a simplification of the testing method. Rather than attempting to explicitly include the renamed config header, it is simpler to unconditionally include <linux/module.h> which must pick up the correctly named header. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #320	2011-07-22 15:07:16 -07:00
Brian Behlendorf	c064bdee95	Fix the configure CONFIG_* option detection The latest kernels no longer define AUTOCONF_INCLUDED which was being used to detect the new style autoconf.h kernel configure options. This results in the CONFIG_* checks always failing incorrectly for newer kernels. The fix for this is a simplification of the testing method. Rather than attempting to explicitly include the renamed config header, it is simpler to unconditionally include <linux/module.h> which must pick up the correctly named header. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #320	2011-07-22 15:07:03 -07:00
Kyle Fuller	615ab66d18	Provide a rc.d script for archlinux Unlike most other Linux distributions archlinux installs its init scripts in /etc/rc.d instead of /etc/init.d. This commit provides an archlinux rc.d script for zfs and extends the build infrastructure to ensure it gets installed in the correct place. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #322	2011-07-11 14:12:23 -07:00
Brian Behlendorf	2cf7f52bc4	Linux compat 2.6.39: mount_nodev() The .get_sb callback has been replaced by a .mount callback in the file_system_type structure. When using the new interface the caller must now use the mount_nodev() helper. Unfortunately, the new interface no longer passes the vfsmount down to the zfs layers. This poses a problem for the existing implementation because we currently save this pointer in the super block for later use. It provides our only entry point in to the namespace layer for manipulating certain mount options. This needed to be done originally to allow commands like 'zfs set atime=off tank' to work properly. It also allowed me to keep more of the original Solaris code unmodified. Under Solaris there is a 1-to-1 mapping between a mount point and a file system so this is a fairly natural thing to do. However, under Linux there may be multiple entries in the namespace which reference the same filesystem. Thus keeping a back reference from the filesystem to the namespace is complicated. Rather than introduce some ugly hack to get the vfsmount and continue as before, I'm leveraging this API change to update the ZFS code to do things in a more natural way for Linux. This has the upside that it resolves the compatibility issue for the long term and fixes several other minor bugs which have been reported. This commit updates the code to remove this vfsmount back reference entirely. All modifications to filesystem mount options are now passed in to the kernel via a '-o remount'. This is the expected Linux mechanism and allows the namespace to properly handle any options which apply to it before passing them on to the file system itself. Aside from fixing the compatibility issue, removing the vfsmount has had the benefit of simplifying the code. This change, while fairly involved, has turned out nicely. Closes #246 Closes #217 Closes #187 Closes #248 Closes #231	2011-07-01 13:36:39 -07:00
Brian Behlendorf	5c03efc379	Linux compat 2.6.39: security_inode_init_security() The security_inode_init_security() function now takes an additional qstr argument which must be passed in from the dentry if available. Passing a NULL is safe when no qstr is available; the relevant security checks will just be skipped. Closes #246 Closes #217 Closes #187	2011-07-01 12:40:08 -07:00
Brian Behlendorf	bd2f5ac97f	Avoid 'rpm -q' bug for 'make pkg' RPM version 4.9.0 has been observed to generate extra debug messages in certain cases. These debug messages prevent us from cleanly acquiring the architecture. This is clearly an upstream RPM bug which will get fixed. But until then a safe solution is to pipe the result through 'tail -1' to just grab the architecture bit we care about. Example 'rpm -qp spl-0.6.0-rc4.src.rpm --qf %{arch}' output: Freeing read locks for locker 0x166: 28031/47480843735008 Freeing read locks for locker 0x168: 28031/47480843735008 x86_64	2011-07-01 12:39:25 -07:00
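The `tail -1` workaround in the commit above amounts to keeping only the last line of the command output. A minimal Python sketch of the same idea follows, using the sample output quoted in the commit message. The helper name `last_line` is hypothetical, and the build system itself does this with a shell pipe rather than Python.

```python
def last_line(output):
    """Return the final line of `output`, or an empty string if there is none."""
    lines = output.splitlines()
    return lines[-1] if lines else ""


# Sample output copied from the commit message: two stray debug lines
# followed by the architecture string that 'make pkg' actually needs.
sample = (
    "Freeing read locks for locker 0x166: 28031/47480843735008\n"
    "Freeing read locks for locker 0x168: 28031/47480843735008\n"
    "x86_64\n"
)

arch = last_line(sample)

if __name__ == "__main__":
    print(arch)
```

The approach relies on the debug messages always preceding the queried value, which matches the sample output shown in the commit.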
Prasad Joshi	b312979252	Tear down and flush the mmap region The inode eviction should unmap the pages associated with the inode. These pages should also be flushed to disk to avoid data loss. Therefore, use truncate_setsize() in evict_inode() to release the pagecache. The API truncate_setsize() was added in the 2.6.35 kernel. To ensure compatibility with older kernels, the patch defines its own truncate_setsize function. Signed-off-by: Prasad Joshi <pjoshi@stec-inc.com> Closes #255	2011-06-27 09:59:19 -07:00
Brian Behlendorf	86fd39f354	Linux 2.6.39 compat, mutex owner Prior to Linux 2.6.39 when CONFIG_DEBUG_MUTEXES was defined the kernel stored a thread_info pointer as the mutex owner. From this you could get the pointer of the current task_struct to compare with get_current(). As of Linux 2.6.39 this behavior has changed and now the mutex stores a pointer to the task_struct. This commit detects the type of pointer stored in the mutex and adjusts the mutex_owner() and mutex_owned() functions to perform the correct comparison.	2011-06-24 13:00:08 -07:00
Brian Behlendorf	a55bcaad18	Linux 3.0: Shrinker compatibility Update the wrapper macros for the memory shrinker to handle this 4th API change. The callback function now takes a shrink_control structure. This is certainly a step in the right direction but it's annoying to have to accommodate yet another version of the API.	2011-06-21 14:02:39 -07:00
Brian Behlendorf	a32661a6c9	Avoid 'rpm -q' bug for 'make pkg' RPM version 4.9.0 has been observed to generate extra debug messages in certain cases. These debug messages prevent us from cleanly acquiring the architecture. This is clearly an upstream RPM bug which will get fixed. But until then a safe solution is to pipe the result through 'tail -1' to just grab the architecture bit we care about. Example 'rpm -qp spl-0.6.0-rc4.src.rpm --qf %{arch}' output: Freeing read locks for locker 0x166: 28031/47480843735008 Freeing read locks for locker 0x168: 28031/47480843735008 x86_64	2011-06-16 11:49:38 -07:00
Brian Behlendorf	2e08aedba4	Always check -Wno-unused-but-set-variable gcc support The previous commit `8a7e1ceefa` wasn't quite right. This check applies to both the user and kernel space build and as such we must make sure it runs regardless of what the --with-config option is set to. For example, if --with-config=kernel then the autoconf test does not run and we generate build warnings when compiling the kernel packages.	2011-06-14 16:40:35 -07:00
Brian Behlendorf	8a7e1ceefa	Check for -Wno-unused-but-set-variable gcc support Gcc versions 4.3.2 and earlier do not support the compiler flag -Wno-unused-but-set-variable. This can lead to build failures on older Linux platforms such as Debian Lenny. Since this is an optional build argument this change adds a new autoconf check for the option. If it is supported by the installed version of gcc then it is used, otherwise it is omitted. See commits `12c1acde76` and `79713039a2` for the reason the -Wno-unused-but-set-variable option was originally added.	2011-06-14 14:43:22 -07:00
Alexey Shvetsov	d9bfe0f57a	Fix distribution detection for gentoo This may also fix detection on other distros, since Ubuntu is not the only one that provides /etc/lsb-release. Closes #244	2011-05-14 08:54:48 -07:00
Brian Behlendorf	712f8bd87b	Add Gentoo/Lunar/Redhat Init Scripts Every distribution has slightly different requirements for their init scripts. Because of this the zfs package contains several init scripts for various distributions. These scripts have been contributed by, and are supported by, the larger zfs community. Init scripts for Gentoo/Lunar/Redhat have been contributed by: Gentoo - devsk <devsku@gmail.com> Lunar - Jean-Michel Bruenn <jean.bruenn@ip-minds.de> Redhat - Fajar A. Nugraha <list@fajar.net>	2011-05-02 15:59:13 -07:00
Brian Behlendorf	df554c148e	Fix 'zfs set volsize=N pool/dataset' This change fixes a kernel panic which would occur when resizing a dataset which was not open. The objset_t stored in the zvol_state_t will be set to NULL when the block device is closed. To avoid this issue we pass the correct objset_t as the third arg. The code has also been updated to correctly notify the kernel when the block device capacity changes. For 2.6.28 and newer kernels the capacity change will be immediately detected. For earlier kernels the capacity change will be detected when the device is next opened. This is a known limitation of older kernels. Online ext3 resize test case passes on 2.6.28+ kernels: $ dd if=/dev/zero of=/tmp/zvol bs=1M count=1 seek=1023 $ zpool create tank /tmp/zvol $ zfs create -V 500M tank/zd0 $ mkfs.ext3 /dev/zd0 $ mkdir /mnt/zd0 $ mount /dev/zd0 /mnt/zd0 $ df -h /mnt/zd0 $ zfs set volsize=800M tank/zd0 $ resize2fs /dev/zd0 $ df -h /mnt/zd0 Original-patch-by: Fajar A. Nugraha <github@fajar.net> Closes #68 Closes #84	2011-05-02 08:54:40 -07:00
Gunnar Beutner	055656d4f4	Implemented NFS export_operations. Implemented the required NFS operations for exporting ZFS datasets using the in-kernel NFS daemon.	2011-04-29 12:36:13 -07:00
Darik Horn	ad35b6a6e9	Remove the gawk dependency. This reverts commit `1814251453`. Demote the gawk call back to awk and ensure that stderr is attached. GNU gawk tolerates a missing stderr handle, but many utilities do not, which could be why a regular awk call was inexplicably failing on some systems. Use argv[0] instead of sh_path for consistency internally and with other Linux drivers. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2011-04-21 09:41:09 -07:00
Brian Behlendorf	3dfc591ac4	Linux 2.6.39 compat, zlib_deflate_workspacesize() The function zlib_deflate_workspacesize() now takes 2 arguments. This was done to avoid always having to allocate the maximum size workspace (268K). The caller can now specify the windowBits and memLevel compression parameters to get a smaller workspace. For our purposes we introduce a spl_zlib_deflate_workspacesize() wrapper which accepts both arguments. When the two argument version of zlib_deflate_workspacesize() is available the arguments are passed through. When it's not we assume the worst case and a maximally sized workspace is used.	2011-04-20 14:39:15 -07:00
Brian Behlendorf	b1cbc4610c	Linux 2.6.39 compat, kern_path_parent() The path_lookup() function has been renamed to kern_path_parent() and the flags argument has been removed. The only behavior now offered is that of LOOKUP_PARENT. The spl already always passed this flag so dropping the flag does not impact us.	2011-04-20 12:30:17 -07:00
Brian Behlendorf	12c1acde76	Set -Wno-unused-but-set-variable globally As of gcc-4.6 the option -Wunused-but-set-variable is enabled by default. While this is a useful warning there are numerous places in the ZFS code where a variable is set and then only checked in an ASSERT(). To avoid having to update every instance of this in the code we now set -Wno-unused-but-set-variable to suppress the warning. Additionally, when building with --enable-debug and -Werror set these warnings also become fatal. We can reevaluate the suppression of these errors at a later time if it becomes an issue. For now we are basically just reverting to the previous gcc behavior.	2011-04-19 10:44:10 -07:00
Brian Behlendorf	79713039a2	Fix gcc configure warnings Newer versions of gcc are getting smart enough to detect the sloppy syntax used for the autoconf tests. It is now generating warnings for unused/undeclared variables. Newer versions of gcc even have the -Wunused-but-set-variable option set by default. This isn't a problem except when -Werror is set and they get promoted to an error. In this case the autoconf test will return an incorrect result which will result in a build failure later on. To handle this I'm tightening up many of the autoconf tests to explicitly mark variables as unused to suppress the gcc warning. Remember, all of the autoconf code can never actually be run, we just want to get a clean build error to detect which APIs are available. Never using a variable is absolutely fine for this. Closes #176	2011-04-19 10:10:47 -07:00
Brian Behlendorf	03318641af	Fix gcc configure warnings Newer versions of gcc are getting smart enough to detect the sloppy syntax used for the autoconf tests. It is now generating warnings for unused/undeclared variables. Newer versions of gcc even have the -Wunused-but-set-variable option set by default. This isn't a problem except when -Werror is set and they get promoted to an error. In this case the autoconf test will return an incorrect result which will result in a build failure later on. To handle this I'm tightening up many of the autoconf tests to explicitly mark variables as unused to suppress the gcc warning. Remember, all of the autoconf code can never actually be run, we just want to get a clean build error to detect which APIs are available. Never using a variable is absolutely fine for this.	2011-04-19 09:41:41 -07:00
Brian Behlendorf	9b0f9079d2	Linux 2.6.39 compat, invalidate_inodes() To resolve a potential filesystem corruption issue a second argument was added to invalidate_inodes(). This argument controls whether dirty inodes are dropped or treated as busy when invalidating a super block. When only the legacy API is available the second argument will be dropped for compatibility.	2011-04-19 09:08:08 -07:00
Brian Behlendorf	e76f4bf11d	Add dnlc_reduce_cache() support Provide the dnlc_reduce_cache() function which attempts to prune cached entries from the dcache and icache. After the entries are pruned any slabs which they may have been using are reaped. Note the API takes a reclaim percentage but we don't have easy access to the total number of cache entries to calculate the reclaim count. However, in practice this doesn't need to be exactly correct. We simply need to reclaim some useful fraction (but not all) of the cache. The caller can determine if more needs to be done.	2011-04-06 20:06:03 -07:00
Brian Behlendorf	bdf4328b04	Linux 2.6.28 compat, insert_inode_locked() Added insert_inode_locked() helper function, prior to this most callers used insert_inode_hash(). The older method doesn't check for collisions in the inode_hashtable but it is still acceptable for use. Fallback to using insert_inode_hash() when insert_inode_locked() is unavailable.	2011-03-22 12:15:54 -07:00
Manuel Amador (Rudd-O)	ae26d0465a	Add dracut support To simplify the process of using zfs as your root filesystem a zfs-dracut sub-package has been added. This sub-package adds a zfs dracut module which allows your initramfs to be rebuilt with zfs support. The process for doing this is still complicated but there is clearly interest from the community in getting this working well and documented. This should help lay some of the groundwork. Longer term these changes should be pushed into the upstream dracut package. Once that occurs this sub-package will no longer be required for new systems, however we may want to conditionally build this package in the future for systems running older dracut versions. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2011-03-17 16:52:04 -07:00
Brian Behlendorf	01c0e61da0	Add init scripts To support automatically mounting your zfs filesystem on boot a basic init script is needed. Unfortunately, every distribution has their own idea of the _right_ way to do things. Rather than write one very complicated portable init script, which would invariably be replaced by the distribution's own anyway, I have instead added support to provide multiple distribution specific init scripts. The correct init script for your distribution will be selected by ZFS_AC_DEFAULT_PACKAGE which will set DEFAULT_INIT_SCRIPT. During 'make install' the correct script for your system will be installed from zfs/etc/init.d/zfs.DEFAULT_INIT_SCRIPT to the usual /etc/init.d/zfs location. Currently, there is zfs.fedora and a more generic zfs.lsb init script. Hopefully, the distribution maintainers who know best how they want their init scripts to function will feed back their approved versions to be included in the project. This change does not consider upstart jobs but I'm not at all opposed to adding that sort of thing.	2011-03-17 16:51:54 -07:00
Brian Behlendorf	a60b1c0a8e	Make Missing Modules.symvers Fatal Detect early on in configure if the Modules.symvers file is missing. Without this file there will be build failures later and it's best to catch this early and provide a useful error. In this case the most likely problem is the kernel-devel packages are not installed. It may also be possible that they are using an unbuilt custom kernel in which case they must build the kernel first. Closes #127	2011-03-07 13:09:20 -08:00
Brian Behlendorf	912fd84d13	Make Missing Modules.symvers Fatal Detect early on in configure if the Modules.symvers file is missing. Without this file there will be build failures later and it's best to catch this early and provide a useful error. In this case the most likely problem is the kernel-devel packages are not installed. It may also be possible that they are using an unbuilt custom kernel in which case they must build the kernel first.	2011-03-07 13:09:01 -08:00
Brian Behlendorf	15805c7711	Make CONFIG_PREEMPT Fatal Until support is added for preemptible kernels, detect this at configure time and make it fatal. Otherwise, it is possible to have a successful build and kernel modules with flaky behavior.	2011-03-07 12:09:02 -08:00

543 Commits