mirror_zfs

mirror of https://git.proxmox.com/git/mirror_zfs.git synced 2026-04-17 08:54:52 +03:00

Author	SHA1	Message	Date
Brian Behlendorf	bd2f5ac97f	Avoid 'rpm -q' bug for 'make pkg' RPM version 4.9.0 has been observed to generate extra debug messages in certain cases. These debug messages prevent us from cleanly acquiring the architecture. This is clearly an upstream RPM bug which will get fixed. But until then a safe solution is to pipe the result through 'tail -1' to just grab the architecture bit we care about. Example 'rpm -qp spl-0.6.0-rc4.src.rpm --qf %{arch}' output: Freeing read locks for locker 0x166: 28031/47480843735008 Freeing read locks for locker 0x168: 28031/47480843735008 x86_64	2011-07-01 12:39:25 -07:00
Brian Behlendorf	e2e7aa2df8	Add ZFS specific mmap() checks Under Linux the VFS handles virtually all of the mmap() access checks. Filesystem specific checks are left to be handled in the .mmap() hook and normally there aren't any. However, ZFS provides a few attributes which can influence the mmap behavior and should be honored. Note, currently the code to modify these attributes has not been implemented under Linux. * ZFS_IMMUTABLE \| ZFS_READONLY \| ZFS_APPENDONLY: when any of these attributes are set a file may not be mmapped with write access. * ZFS_AV_QUARANTINED: when set a file may not be mmapped with read or exec access. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2011-07-01 12:23:46 -07:00
Brian Behlendorf	f0b2486034	Remove unused MMAP functions The following functions were required for the OpenSolaris mmap implementation. Because the Linux VFS does most of the heavy lifting for us they are not required and are being removed to keep the code clean and easy to understand. * zfs_null_putapage() * zfs_frlock() * zfs_no_putpage() Signed-off-by: Brian Behlendorf <behlendorf@llnl.gov>	2011-07-01 12:22:57 -07:00
Prasad Joshi	dde471ef5a	MMAP Optimization Enable zfs_getpage, zfs_fillpage, zfs_putpage, zfs_putapage functions. The functions have been modified to make them Linux friendly. ZFS uses these functions to read/write the mmapped pages. Using them from readpage/writepage results in clear code. The patch also adds readpages and writepages interface functions to read/write list of pages in one function call. The code change handles the first mmap optimization mentioned on https://github.com/behlendorf/zfs/issues/225 Signed-off-by: Prasad Joshi <pjoshi@stec-inc.com> Signed-off-by: Brian Behlendorf <behlendorf@llnl.gov> Issue #255	2011-07-01 12:22:52 -07:00
Brian Behlendorf	2a005961a4	Ensure all block devices are available These days most disk drivers will probe for devices asynchronously. This means it's possible that when your zfs init script runs all the required block devices may not yet have been discovered. The result is the pool may fail to cleanly import at boot time. This is particularly common when you have a large number of devices. The fix is for the init script to block until udev settles and we are no longer detecting new devices. Once the system has settled the zfs modules can be loaded and the pool will be automatically imported.	2011-06-30 14:45:33 -07:00
Prasad Joshi	218b8eafbd	Use truncate_setsize in zfs_setattr According to Linux kernel commit 2c27c65e, using truncate_setsize in setattr simplifies the code. Therefore, the patch replaces the call to vmtruncate() with truncate_setsize(). zfs_setattr uses zfs_freesp to free the disk space belonging to the file. As truncate_setsize may release the page cache and flush the dirty data to disk, it must be called before zfs_freesp. Suggested-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Prasad Joshi <pjoshi@stec-inc.com> Closes #255	2011-06-27 09:59:52 -07:00
Prasad Joshi	b312979252	Tear down and flush the mmap region The inode eviction should unmap the pages associated with the inode. These pages should also be flushed to disk to avoid data loss. Therefore, use truncate_setsize() in evict_inode() to release the pagecache. The API truncate_setsize() was added in the 2.6.35 kernel. To ensure compatibility with older kernels, the patch defines its own truncate_setsize function. Signed-off-by: Prasad Joshi <pjoshi@stec-inc.com> Closes #255	2011-06-27 09:59:19 -07:00
Ned A. Bass	560bcf9d14	Multipath device manageability improvements Update udev helper scripts to deal with device-mapper devices created by multipathd. These enhancements are targeted at a particular storage network topology under evaluation at LLNL consisting of two SAS switches providing redundant connectivity between multiple server nodes and disk enclosures. The key to making these systems manageable is to create shortnames for each disk that convey its physical location in a drawer. In a direct-attached topology we infer a disk's enclosure from the PCI bus number and HBA port number in the by-path name provided by udev. In a switched topology, however, multiple drawers are accessed via a single HBA port. We therefore resort to assigning drawer identifiers based on which switch port a drive's enclosure is connected to. This information is available from sysfs. Add options to zpool_layout to generate an /etc/zfs/zdev.conf using symbolic links in /dev/disk/by-id of the form <label>-<UUID>-switch-port:<X>-slot:<Y>. <label> is a string that depends on the subsystem that created the link and defaults to "dm-uuid-mpath" (this prefix is used by multipathd). <UUID> is a unique identifier for the disk typically obtained from the scsi_id program, and <X> and <Y> denote the switch port and disk slot numbers, respectively. Add a callout script sas_switch_id for use by multipathd to help create symlinks of the form described above. Update zpool_id and the udev zpool rules file to handle both multipath devices and conventional drives.	2011-06-23 10:46:06 -07:00
Brian Behlendorf	7e7baecaa3	Linux 3.0 compat, shrinker compatibility To accommodate the updated Linux 3.0 shrinker API the spl shrinker compatibility code was updated. Unfortunately, this couldn't be done cleanly without slightly adjusting the compat API. See spl commit `a55bcaad18`. This commit updates the ZFS code to use the slightly modified API. You must use the latest SPL if you're building ZFS.	2011-06-21 14:36:39 -07:00
Gunnar Beutner	b00131d43c	Fix unlink/xattr deadlock The problem here is that prune_icache() tries to evict/delete both the xattr directory inode as well as at least one xattr inode contained in that directory. Here's what happens: 1. File is created. 2. xattr is created for that file (behind the scenes a xattr directory and a file in that xattr directory are created) 3. File is deleted. 4. Both the xattr directory inode and at least one xattr inode from that directory are evicted by prune_icache(); prune_icache() acquires a lock on both inodes before it calls ->evict() on the inodes When the xattr directory inode is evicted zfs_zinactive attempts to delete the xattr files contained in that directory. While enumerating these files zfs_zget() is called to obtain a reference to the xattr file znode - which tries to lock the xattr inode. However that very same xattr inode was already locked by prune_icache() further up the call stack, thus leading to a deadlock. This can be reliably reproduced like this: $ touch test $ attr -s a -V b test $ rm test $ echo 3 > /proc/sys/vm/drop_caches This patch fixes the deadlock by moving the zfs_purgedir() call to zfs_unlinked_drain(). Instead zfs_rmnode() now checks whether the xattr dir is empty and leaves the xattr dir in the unlinked set if it finds any xattrs. To ensure zfs_unlinked_drain() never accesses a stale super block zfsvfs_teardown() has been updated to block until the iput taskq has been drained. This avoids a potential race where a file with an xattr directory is removed and the file system is immediately unmounted. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #266	2011-06-20 13:47:03 -07:00
Gunnar Beutner	6f0cf71e0d	Removed erroneous zfs_inode_destroy() calls from zfs_rmnode(). iput_final() already calls zpl_inode_destroy() -> zfs_inode_destroy() for us after zfs_zinactive(), thus making sure that the inode is properly cleaned up. The zfs_inode_destroy() calls in zfs_rmnode() would lead to a double-free. Fixes #282	2011-06-20 10:30:17 -07:00
Christian Kohlschütter	df30f56639	Add "ashift" property to zpool create Some disks with internal sectors larger than 512 bytes (e.g., 4k) can suffer from bad write performance when ashift is not configured correctly. This is caused by the disk not reporting its actual sector size, but a sector size of 512 bytes. The drive may behave this way for compatibility reasons. For example, the WDC WD20EARS disks are known to exhibit this behavior. When creating a zpool, ZFS takes that wrong sector size and sets the "ashift" property accordingly (to 9: 1<<9=512), whereas it should be set to 12 for 4k sectors (1<<12=4096). This patch allows an administrator to manually specify the known correct ashift size at 'zpool create' time. This can significantly improve performance in certain cases. However, it will have an impact on your total pool capacity. See the updated ashift property description in the zpool.8 man page for additional details. Valid values for the ashift property range from 9 to 17 (512B-128KB). Additionally, you may set the ashift to 0 if you wish to auto-detect the sector size based on what the disk reports; this is the default behavior. The most common ashift values are 9 and 12. Example: zpool create -o ashift=12 tank raidz2 sda sdb sdc sdd Closes #280 Original-patch-by: Richard Laager <rlaager@wiktel.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2011-06-17 16:35:49 -07:00
Brian Behlendorf	96801d2906	Linux 2.6.37 compat, WRITE_FLUSH_FUA The WRITE_FLUSH, WRITE_FUA, and WRITE_FLUSH_FUA flags have been introduced as a replacement for WRITE_BARRIER. This was done to allow richer semantics to be expressed to the block layer. It is the block layer's responsibility to choose the correct way to implement these semantics. This change simply updates the bio's to use the new kernel API which should be absolutely safe. However, since ZFS depends entirely on this working as designed for correctness we do want to be careful. Closes #281	2011-06-17 14:37:26 -07:00
Brian Behlendorf	db97f88646	Update rpm/deb packages to be FHS compliant This change is the first step towards updating the default rpm/deb packages to be FHS compliant. It accomplishes this by passing the following options to ./configure to ensure the zfs build products are installed in FHS compliant locations. ./configure --prefix=/ --bindir=/lib/udev \ --libexecdir=/usr/libexec --datadir=/usr/share The core zfs utilities (zfs, zpool, zdb) are now installed in /sbin, the core libraries in /lib, and the udev helpers (zpool_id, zvol_id) are in /lib/udev with the other udev helpers. The remaining files in the zfs package remain in their previous locations under /usr.	2011-06-17 13:36:16 -07:00
Darik Horn	b772aedeec	Autogen refresh. Run autogen.sh using the same autotools versions as upstream: * autoconf-2.63 * automake-1.11.1 * libtool-2.2.6b	2011-06-17 13:24:44 -07:00
Brian Behlendorf	47a2455fbc	Use datadir not datarootdir for dracut The zfs dracut modules should be installed under the --datadir not --datarootdir path. This was just an oversight in the original Makefile.am. After this change %{_datadir} can now be set safely in the zfs.spec file. The 'make install' location is now consistent with the location expected by the spec file.	2011-06-17 13:22:19 -07:00
Darik Horn	b9f27ee765	Fix autoconf variable substitution in udev rules. Change the variable substitution in the udev rule templates according to the method described in the Autoconf manual; Chapter 4.7.2: Installation Directory Variables. The udev rules are improperly generated if the bindir parameter overrides the prefix parameter during configure. For example: # ./configure --prefix=/usr/local --bindir=/opt/zfs/bin The udev helper is installed as /opt/zfs/bin/zpool_id, but the corresponding udev rule has a different path: # /usr/local/etc/udev/rules.d/60-zpool.rules ENV{DEVTYPE}=="disk", IMPORT{program}="/usr/local/bin/zpool_id -d %p" The @bindir@ variable expands to "${exec_prefix}/bin", so it cannot be used instead of @prefix@ directly. This also applies to the zvol_id helper. Closes #283.	2011-06-17 10:11:29 -07:00
Brian Behlendorf	e130330a87	Handle /etc/mtab -> /proc/mounts symlink Under Fedora 15 /etc/mtab is now a symlink to /proc/mounts by default. When /etc/mtab is a symlink the mount.zfs helper should not update it. There was code in place to handle this case but it used stat() which traverses the link and then issues the stat on /proc/mounts. We need to use lstat() to prevent the link traversal and instead stat /etc/mtab. Closes #270	2011-06-14 16:48:38 -07:00
Brian Behlendorf	2e08aedba4	Always check -Wno-unused-but-set-variable gcc support The previous commit `8a7e1ceefa` wasn't quite right. This check applies to both the user and kernel space build and as such we must make sure it runs regardless of what the --with-config option is set to. For example, if --with-config=kernel then the autoconf test does not run and we generate build warnings when compiling the kernel packages.	2011-06-14 16:40:35 -07:00
Brian Behlendorf	8a7e1ceefa	Check for -Wno-unused-but-set-variable gcc support Gcc versions 4.3.2 and earlier do not support the compiler flag -Wno-unused-but-set-variable. This can lead to build failures on older Linux platforms such as Debian Lenny. Since this is an optional build argument this change adds a new autoconf check for the option. If it is supported by the installed version of gcc then it is used, otherwise it is omitted. See commits `12c1acde76` and `79713039a2` for the reason the -Wno-unused-but-set-variable option was originally added.	2011-06-14 14:43:22 -07:00
Brian Behlendorf	10715a0187	Add default stack checking When your kernel is built with kernel stack tracing enabled and you have the debugfs filesystem mounted, the zfs.sh script will clear the worst observed kernel stack depth on module load and check the worst case usage on module removal. If the stack depth ever exceeds 7000 bytes the full stack will be printed for debugging. This is dangerously close to overrunning the default 8k stack. This additional advisory debugging is particularly valuable when running the regression tests on a kernel built with 16k stacks. In this case, almost no matter how bad the stack overrun is you will be able to get a clean stack trace for debugging. Since the worst case stack usage can be highly variable it's helpful to always check the worst case usage.	2011-06-13 13:50:21 -07:00
Brian Behlendorf	da88a7fbe8	Pass -f option for import If a pool was not cleanly exported passing the -f flag may be required at 'zpool import' time. Since this test is simply validating that the pool can be successfully imported in the absence of the cache file always pass the -f to ensure it succeeds. This failure was observed under RHEL6.1.	2011-06-10 11:21:31 -07:00
Brian Behlendorf	1b9d8c340f	Fix 'zfs send -D' segfault Sending pools with dedup results in a segfault due to a Solaris portability issue. Under Solaris the pipe(2) library call creates a bidirectional data channel. Unfortunately, on Linux the pipe(2) call creates a unidirectional data channel. The fix is to use the socketpair(2) function to create the expected bidirectional channel. Seth Heeren did the original legwork on this issue for zfs-fuse. We finally just rediscovered the same portability issue and dfurphy was able to point me at the original issue for the fix. Closes #268	2011-06-09 13:58:48 -07:00
Brian Behlendorf	cbc6fab65c	Sanitize zpios-sanity.sh environment Just like zconfig.sh the zpios-sanity.sh tests should run in a sanitized environment. This ensures they never conflict with an installed /etc/zfs/zpool.cache file. This commit additionally improves the -c cleanup option. It now removes the modules stack if loaded and destroys relevant md devices. This behavior is now identical to zconfig.sh.	2011-06-03 15:08:49 -07:00
Brian Behlendorf	608860b6d0	Delay before destroying loopback devices Generally I don't approve of just adding an arbitrary delay to avoid a problem but in this case I'm going to let it slide. We may need to delay briefly after 'zpool destroy' returns to ensure the loopback devices are closed. If they aren't closed then losetup -d will not be able to destroy them. Unfortunately, there's no easy state to check so we'll have to make do with a simple delay.	2011-06-03 14:38:25 -07:00
Brian Behlendorf	36391312af	Always unload zpios.ko on exit We should always unload zpios.ko on exit. This ensures that subsequent calls to 'zfs.sh -u' from other utilities will be able to unload the module stack and properly cleanup. This is important for the --cleanup option which can be passed to zconfig.sh and zfault.sh.	2011-06-02 10:25:35 -07:00
Brian Behlendorf	2ea9dc40f8	Fix zpios-sanity.sh return code The zpios-sanity.sh script should return failure when any of the individual zpios.sh tests fail. The previous code would always return success suppressing real failures.	2011-06-02 10:13:15 -07:00
Brian Behlendorf	e95b3bdcbb	Fix stack ddt_class_contains() Stack usage for ddt_class_contains() reduced from 524 bytes to 68 bytes. This large stack allocation significantly contributed to the likelihood of a stack overflow when scrubbing/resilvering dedup pools.	2011-05-31 12:17:27 -07:00
Brian Behlendorf	5b8c7bbcea	Fix stack ddt_zap_lookup() Stack usage for ddt_zap_lookup() reduced from 368 bytes to 120 bytes. This large stack allocation significantly contributed to the likelihood of a stack overflow when scrubbing/resilvering dedup pools.	2011-05-31 12:17:27 -07:00
Brian Behlendorf	c7f8f831a4	Revert "Fix stack traverse_visitbp()" This abomination is no longer required because the zio's issued during this recursive call path will now be handled asynchronously by the taskq thread pool. This reverts commit `6656bf5621`.	2011-05-31 12:17:27 -07:00
Brian Behlendorf	2fac4c2a74	Make tgx_sync_thread zio's async The majority of the recursive operations performed by the dsl are done either in the context of the tgx_sync_thread or during pool import. It is these recursive operations which contribute greatly to the stack depth. When this recursion is coupled with a synchronous I/O in the same context overflow becomes possible. Previously to handle this case I have focused on keeping the individual stack frames as light as possible. This is a good idea as long as it can be done in a way which doesn't overly complicate the code. However, there is a better solution. If we treat all zio's issued by the tgx_sync_thread as async then we can use the tgx_sync_thread stack for the recursive parts, and the zio_* threads for the I/O parts. This effectively doubles our available stack space with the only drawback being a small delay to schedule the I/O. However, in practice the scheduling time is so much smaller than the actual I/O time that this isn't an issue. Another benefit of making the zio async is that the zio pipeline is now parallel. That should mean for CPU intensive pipelines such as compression or dedup performance may be improved. With this change in place the worst case stack usage observed so far is 6902 bytes. This is still higher than I'd like but significantly improved. Additional changes to specific functions should improve this further. This change allows us to revert commit `6656bf5` which did some horrible things to the recursive traverse_visitbp() callpath in the name of saving stack.	2011-05-31 12:17:27 -07:00
Brian Behlendorf	f74fae8b30	Fix 4K sector support Yesterday I ran across a 3TB drive which exposed 4K sectors to Linux. While I thought I had gotten this support correct it turns out there were 2 subtle bugs which prevented it from working. sudo ./cmd/zpool/zpool create -f large-sector /dev/sda cannot create 'large-sector': one or more devices is currently unavailable 1) The first issue was that it was possible that bdev_capacity() would return the number of 512 byte sectors rather than the number of 4096 byte sectors. Internally, certain Linux functions only operate with 512 byte sectors so you need to be careful. To avoid any confusion in the future I've updated bdev_capacity() to simply return the device (or partition) capacity in bytes. The higher levels of ZFS want the value in bytes anyway so this is cleaner. 2) When creating a bio the ->bi_sector count must always be expressed in 512 byte sectors. The existing code would scale the byte offset by the logical sector size. Until now this was always 512 so it never caused problems. Trying a 4K sector drive clearly exposed the issue. The problem has been fixed by hard coding the 512 byte sector which is exactly what the bio code does internally. With these changes I'm now able to create ZFS pools using 4K sector drives. No issues were observed during fairly extensive testing. This is also a low risk change if you're using 512b sector devices because none of the logic changes. Closes #256	2011-05-27 11:38:53 -07:00
Brian Behlendorf	2b8cad6159	Use vmem_alloc() for zfs_ioc_userspace_many() The default buffer size when requesting multiple quota entries is 100 times the zfs_useracct_t size. In practice this works out to exactly 27200 bytes. Since this will be a short lived buffer in a non-performance critical path it is preferable to vmem_alloc() the needed memory.	2011-05-20 14:23:18 -07:00
Brian Behlendorf	4804b739e1	Default to internal 'zfs userspace' implementation We will never bring over the pyzfs.py helper script from Solaris to Linux. Instead the missing functionality will be directly integrated in to the zfs commands and libraries. To avoid confusion remove the warning about the missing pyzfs.py utility and simply use the default internal support. The Illumos developers are of the same mind and have proposed an initial patch to do this which has been integrated in to the 'allow' development branch. After some additional testing this code can be merged in to master as the right long term solution.	2011-05-20 10:25:41 -07:00
Brian Behlendorf	f01b360e67	Pass caller's credential in zfsdev_ioctl() Initially when zfsdev_ioctl() was ported to Linux we didn't have any credential support implemented. So at the time we simply passed NULL which wasn't much of a problem since most of the secpolicy code was disabled. However, one exception is quota handling which does require the credential. Now that proper credentials are supported we can safely start passing the caller's credential. This is also an initial step towards fully implementing the zfs secpolicy.	2011-05-20 10:12:25 -07:00
Brian Behlendorf	3fd70ee6b0	Fix 'negative objects to delete' warning Normally when the arc_shrinker_func() function is called the return value should be: >=0 - To indicate the number of freeable objects in the cache, or -1 - To indicate this cache should be skipped However, when the shrinker callback is called with 'nr_to_scan' equal to zero, the caller simply wants the number of freeable objects in the cache and we must never return -1. This patch reorders the first two conditionals in arc_shrinker_func() to ensure this behavior. This patch also now explicitly casts arc_size and arc_c_min to signed int64_t types so MAX(x, 0) works as expected. As unsigned types we would never see a negative value which defeated the purpose of the MAX() lower bound and broke the shrinker logic. Finally, when nr_to_scan is non-zero we explicitly prevent all reclaim below arc_c_min. This is done to prevent the Linux page cache from completely crowding out the ARC. This limit is tunable and some experimentation is likely going to be required to set it exactly right. For now we're sticking with the OpenSolaris defaults. Closes #218 Closes #243	2011-05-18 10:29:22 -07:00
Alexey Shvetsov	d9bfe0f57a	Fix distribution detection for gentoo Also this may fix other distros because some of them also provide /etc/lsb-release not only ubuntu. Closes #244	2011-05-14 08:54:48 -07:00
Brian Behlendorf	e814770f2e	Update synchronous open zfs_close() comment The comment in zfs_close() pertaining to decrementing the synchronous open count needs to be updated for Linux. The code was already updated to be correct, but the comment was missed and is now misleading. Under Linux the zfs_close() hook is only called once when the final reference is dropped. This differs from Solaris where zfs_close() is called for each close. Closes #237	2011-05-13 08:20:06 -07:00
Alexey Shvetsov	6f582dc708	Remove root 'ls' after mount workaround This workaround was introduced to work around issue #164. This issue was fixed by commit `5f35b19` so the workaround can be safely dropped from both the zfs.fedora and zfs.gentoo init scripts.	2011-05-12 15:01:35 -07:00
Alexey Shvetsov	06abcdd3f4	Fix zfs.gentoo init script logic * Fix zfs.ko module check * Check 'zfs umount -a' return value	2011-05-12 14:45:57 -07:00
Alexey Shvetsov	04c22478a7	Make zfs.gentoo init script more gentoo style. * Improved compatibility with openrc * Removed LOCKFILE * Improved checksystem() function * Remove /etc/mtab check for / * General cleanup	2011-05-12 14:42:43 -07:00
Brian Behlendorf	c91d229809	Merge pull request #235 from nedbass/rdev Don't store rdev in SA for FIFOs and sockets	2011-05-09 16:41:28 -07:00
Ned A. Bass	aa6d8c1086	Don't store rdev in SA for FIFOs and sockets Update the handling of named pipes and sockets to be consistent with other platforms with regard to the rdev attribute. While all ZFS implementations store the rdev for device files in a system attribute (SA), this is not the case for FIFOs and sockets. Indeed, Linux always passes rdev=0 to mknod() for FIFOs and sockets, so the value is not needed. Add an ASSERT that rdev==0 for FIFOs and sockets to detect if the expected behavior ever changes. Closes #216	2011-05-09 13:35:07 -07:00
Brian Behlendorf	21ade34764	Disable direct reclaim for z_wr_* threads The direct reclaim path in the z_wr_* threads must be disabled to ensure forward progress is always maintained for txg processing. This ensures that a txg will never get stuck waiting on itself because it entered the following memory reclaim callpath. ->prune_icache()->dispose_list()->zpl_clear_inode()->zfs_inactive() ->dmu_tx_assign()->dmu_tx_wait()->tgx_wait_open() It would be preferable to target this exact code path but the kernel offers no way to do this without custom patches. To avoid this we are forced to disable all reclaim for these threads. It should not be necessary to do this for other z_* threads because they will not hold a txg open. Closes #232	2011-05-06 15:26:26 -07:00
Brian Behlendorf	3117dd0b90	Handle NULL in nfsd .fsync() hook How nfsd handles .fsync() has been changed a couple of times in the recent kernels. But basically there are three cases we need to consider. Linux 2.6.12 - 2.6.33 * The .fsync() hook takes 3 arguments * The nfsd will call .fsync() with a NULL file struct pointer. Linux 2.6.34 * The .fsync() hook takes 3 arguments * The nfsd no longer calls .fsync() but instead uses sync_inode() Linux 2.6.35 - 2.6.x * The .fsync() hook takes 2 arguments * The nfsd no longer calls .fsync() but instead uses sync_inode() For once it looks like we've gotten lucky. The first two cases can actually be collapsed in to one if we stop using the file struct pointer entirely. Since the dentry is still passed in both cases this is possible. The last case can then be safely handled by unconditionally using the dentry in the file struct pointer now that we know the nfsd caller has been removed. Closes #230	2011-05-06 12:33:45 -07:00
Brian Behlendorf	6ee44e32be	Fix awk usage The zpool_id and zpool_layout helper scripts have been updated to use the more common /usr/bin/awk symlink. On Fedora/Redhat systems there are both /bin/awk and /usr/bin/awk symlinks to your installed version of awk. On Debian/Ubuntu systems only the /usr/bin/awk symlink exists. Additionally, add the '\<' token to the beginning of the regex pattern to prevent partial matches. This pattern only appears to work with gawk despite the mawk man page claiming to support this extended regex. Thus you will need to have gawk installed to use these optional helper scripts. A comment has been added to the script to reflect this reality.	2011-05-06 10:16:04 -07:00
Brian Behlendorf	34b84cb831	Use vmem_alloc() for zfs_ioc_pool_get_history() The default buffer size when requesting history is 128k. This is far too large for a kmem_alloc() so instead use the slower vmem_alloc(). This path has no performance concerns and the buffer is immediately freed after its contents are copied to the user space buffer.	2011-05-06 09:59:52 -07:00
Brian Behlendorf	3613204cd7	Allow mounting of read-only snapshots With the addition of the mount helper we accidentally regressed the ability to manually mount snapshots. This commit updates the mount helper to expect the possibility of a ZFS_TYPE_SNAPSHOT. All snapshots will be automatically treated as 'legacy' type mounts so they can be mounted manually.	2011-05-05 10:13:38 -07:00
Brian Behlendorf	c409e4647f	Add missing ZFS tunables This commit adds module options for all existing zfs tunables. Ideally the average user should never need to modify any of these values. However, in practice sometimes you do need to tweak these values for one reason or another. In those cases it's nice not to have to resort to rebuilding from source. All tunables are visible to modinfo and the list is as follows: $ modinfo module/zfs/zfs.ko filename: module/zfs/zfs.ko license: CDDL author: Sun Microsystems/Oracle, Lawrence Livermore National Laboratory description: ZFS srcversion: 8EAB1D71DACE05B5AA61567 depends: spl,znvpair,zcommon,zunicode,zavl vermagic: 2.6.32-131.0.5.el6.x86_64 SMP mod_unload modversions parm: zvol_major:Major number for zvol device (uint) parm: zvol_threads:Number of threads for zvol device (uint) parm: zio_injection_enabled:Enable fault injection (int) parm: zio_bulk_flags:Additional flags to pass to bulk buffers (int) parm: zio_delay_max:Max zio millisec delay before posting event (int) parm: zio_requeue_io_start_cut_in_line:Prioritize requeued I/O (bool) parm: zil_replay_disable:Disable intent logging replay (int) parm: zfs_nocacheflush:Disable cache flushes (bool) parm: zfs_read_chunk_size:Bytes to read per chunk (long) parm: zfs_vdev_max_pending:Max pending per-vdev I/Os (int) parm: zfs_vdev_min_pending:Min pending per-vdev I/Os (int) parm: zfs_vdev_aggregation_limit:Max vdev I/O aggregation size (int) parm: zfs_vdev_time_shift:Deadline time shift for vdev I/O (int) parm: zfs_vdev_ramp_rate:Exponential I/O issue ramp-up rate (int) parm: zfs_vdev_read_gap_limit:Aggregate read I/O over gap (int) parm: zfs_vdev_write_gap_limit:Aggregate write I/O over gap (int) parm: zfs_vdev_scheduler:I/O scheduler (charp) parm: zfs_vdev_cache_max:Inflate reads small than max (int) parm: zfs_vdev_cache_size:Total size of the per-disk cache (int) parm: zfs_vdev_cache_bshift:Shift size to inflate reads too (int) parm: zfs_scrub_limit:Max scrub/resilver I/O per leaf vdev (int) parm: zfs_recover:Set to attempt to recover from fatal errors (int) parm: spa_config_path:SPA config file (/etc/zfs/zpool.cache) (charp) parm: zfs_zevent_len_max:Max event queue length (int) parm: zfs_zevent_cols:Max event column width (int) parm: zfs_zevent_console:Log events to the console (int) parm: zfs_top_maxinflight:Max I/Os per top-level (int) parm: zfs_resilver_delay:Number of ticks to delay resilver (int) parm: zfs_scrub_delay:Number of ticks to delay scrub (int) parm: zfs_scan_idle:Idle window in clock ticks (int) parm: zfs_scan_min_time_ms:Min millisecs to scrub per txg (int) parm: zfs_free_min_time_ms:Min millisecs to free per txg (int) parm: zfs_resilver_min_time_ms:Min millisecs to resilver per txg (int) parm: zfs_no_scrub_io:Set to disable scrub I/O (bool) parm: zfs_no_scrub_prefetch:Set to disable scrub prefetching (bool) parm: zfs_txg_timeout:Max seconds worth of delta per txg (int) parm: zfs_no_write_throttle:Disable write throttling (int) parm: zfs_write_limit_shift:log2(fraction of memory) per txg (int) parm: zfs_txg_synctime_ms:Target milliseconds between tgx sync (int) parm: zfs_write_limit_min:Min tgx write limit (ulong) parm: zfs_write_limit_max:Max tgx write limit (ulong) parm: zfs_write_limit_inflated:Inflated tgx write limit (ulong) parm: zfs_write_limit_override:Override tgx write limit (ulong) parm: zfs_prefetch_disable:Disable all ZFS prefetching (int) parm: zfetch_max_streams:Max number of streams per zfetch (uint) parm: zfetch_min_sec_reap:Min time before stream reclaim (uint) parm: zfetch_block_cap:Max number of blocks to fetch at a time (uint) parm: zfetch_array_rd_sz:Number of bytes in a array_read (ulong) parm: zfs_pd_blks_max:Max number of blocks to prefetch (int) parm: zfs_dedup_prefetch:Enable prefetching dedup-ed blks (int) parm: zfs_arc_min:Min arc size (ulong) parm: zfs_arc_max:Max arc size (ulong) parm: zfs_arc_meta_limit:Meta limit for arc size (ulong) parm: zfs_arc_reduce_dnlc_percent:Meta reclaim percentage (int) parm: zfs_arc_grow_retry:Seconds before growing arc size (int) parm: zfs_arc_shrink_shift:log2(fraction of arc to reclaim) (int) parm: zfs_arc_p_min_shift:arc_c shift to calc min/max arc_p (int)	2011-05-04 10:02:37 -07:00
Brian Behlendorf	8db77dd7ed	Prep zfs-0.6.0-rc4 tag Create the fourth 0.6.0 release candidate tag (rc4).	2011-05-03 10:29:05 -07:00
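
The ashift arithmetic quoted in commit df30f56639 above (1<<9=512, 1<<12=4096, valid values 9 to 17) can be illustrated with a short sketch. This is a Python illustration of the arithmetic only, not code from ZFS; the function and constant names are hypothetical, and the special value 0 (auto-detect) is deliberately left out.

```python
# Illustrative sketch only: these names are hypothetical and not part of ZFS.
ASHIFT_MIN = 9    # 1 << 9  = 512 bytes
ASHIFT_MAX = 17   # 1 << 17 = 131072 bytes (128KB)


def sector_size_for_ashift(ashift: int) -> int:
    """Return the sector size in bytes implied by an explicit ashift value."""
    if not ASHIFT_MIN <= ashift <= ASHIFT_MAX:
        raise ValueError(f"ashift must be in [{ASHIFT_MIN}, {ASHIFT_MAX}]")
    return 1 << ashift


def ashift_for_sector_size(size: int) -> int:
    """Return the ashift whose sector size equals `size` (a power of two)."""
    if size <= 0 or size & (size - 1):
        raise ValueError("sector size must be a positive power of two")
    ashift = size.bit_length() - 1   # log2 of a power of two
    if not ASHIFT_MIN <= ashift <= ASHIFT_MAX:
        raise ValueError("sector size outside the range described in the commit")
    return ashift


if __name__ == "__main__":
    for a in (9, 12, 17):
        print(a, sector_size_for_ashift(a))
```

A drive with 4096 byte physical sectors that reports 512 byte sectors would be given ashift 9 by auto-detection, which is the mismatch that `zpool create -o ashift=12` is meant to correct.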


1974 Commits
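
Commit e130330a87 in the list above turns on the difference between stat() and lstat(): stat() follows a symlink and reports on its target, while lstat() reports on the link itself. The sketch below shows that difference in Python on a POSIX system. It is not the mount.zfs code; the helper names are hypothetical, and the temporary files named `mtab` and `mounts` merely stand in for /etc/mtab and /proc/mounts.

```python
import os
import stat
import tempfile


def is_symlink(path: str) -> bool:
    """True if `path` itself is a symlink; lstat() does not follow the link."""
    try:
        return stat.S_ISLNK(os.lstat(path).st_mode)
    except FileNotFoundError:
        return False


def demo() -> tuple:
    """Build a link in a temp dir and compare what stat() and lstat() report."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "mounts")   # stand-in for /proc/mounts
        link = os.path.join(tmp, "mtab")       # stand-in for /etc/mtab
        with open(target, "w"):
            pass
        os.symlink(target, link)
        # stat() traverses the link, so it reports a regular file.
        seen_by_stat = stat.S_ISLNK(os.stat(link).st_mode)
        # lstat() stops at the link, so the symlink is detected.
        seen_by_lstat = is_symlink(link)
        return seen_by_stat, seen_by_lstat


if __name__ == "__main__":
    print(demo())
```

A check built on stat() never sees the link, which matches the behaviour the commit message describes.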