mirror_zfs

mirror of https://git.proxmox.com/git/mirror_zfs.git synced 2026-03-22 08:51:30 +03:00

Author	SHA1	Message	Date
Turbo Fredriksson	0bc7a7a754	Don't specifically open /etc/mtab - it is done in libzfs_init() a few lines further down and we can share the open file handle. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #1498	2013-08-15 10:19:38 -07:00
Turbo Fredriksson	f9e459d143	Use setmntent() OR fopen() For the same reasons it's used in libzfs_init(), this was just overlooked because zinject gets minimal use. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #1498	2013-08-15 10:09:09 -07:00
Yuri Pankov	105afebb15	Illumos #3098 zfs userspace/groupspace fail 3098 zfs userspace/groupspace fail without saying why when run as non-root Reviewed by: Eric Schrock <eric.schrock@delphix.com> Approved by: Richard Lowe <richlowe@richlowe.net> References: https://www.illumos.org/issues/3098 illumos/illumos-gate@70f56fa693 Ported-by: Tim Chase <tim@chase2k.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1596	2013-08-14 09:28:34 -07:00
Brian Behlendorf	cd72af9c68	Fix 'zpool list -H' error code Due to an uninitialized variable it was possible for the command 'zpool list -H' to return a non-zero error when there are no pools. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1605	2013-07-23 12:39:05 -07:00
Christer Ekholm	da91c90154	Add missing -v to usage help for zpool list. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-07-22 13:39:01 -07:00
Dmitry Khasanov	131cc95ca7	Add FreeBSD 'zpool labelclear' command The FreeBSD implementation of zfs adds the 'zpool labelclear' command. Since this functionality is helpful and straight forward to add it is being included in ZoL. References: freebsd/freebsd@119a041dc9 Ported-by: Dmitry Khasanov <pik4ez@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1126	2013-07-09 15:58:05 -07:00
Tim Chase	5021058756	zdb: enhancement - Display SA xattrs. If the znode has SA xattrs, display them following the other standard attributes. The format used is similar to that used when listing the contents of a ZAP. It is as follows: $ zdb -vvv <pool>/<dataset> <object> ... SA xattrs: <size> bytes, <number> entries <name1> = <value1> <name2> = <value2> ... Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1581	2013-07-09 13:52:28 -07:00
Mike Leddy	5d3dc3fb72	Avoid abort() in vn_rdwr(): libzpool/kernel.c Make sure that buffer is aligned to 512 bytes on linux so that pread call combined with O_DIRECT does not return EINVAL. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1570	2013-07-09 11:56:43 -07:00
Craig Loomis	50fe577d1f	Explicitly flush output at end of each zevent For "zpool events -f" flush stdout to ensure the last zevent is always printed immediately. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1568	2013-07-08 17:01:16 -07:00
Brian Behlendorf	c76955eaa5	Fix parse_dataset error handling A mount failure was accidentally introduced by commit `0c1171d` which reworked the parse_dataset() function to read pool names from devices. The error case where a label is read from the device but the pool name/value pair doesn't exist was not handled properly. In this case we should fall back to the previous behavior. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1560	2013-07-03 09:20:52 -07:00
George Wilson	294f68063b	Illumos #3498 panic in arc_read() 3498 panic in arc_read(): !refcount_is_zero(&pbuf->b_hdr->b_refcnt) Reviewed by: Adam Leventhal <ahl@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Richard Lowe <richlowe@richlowe.net> References: illumos/illumos-gate@1b912ec710 https://www.illumos.org/issues/3498 Ported-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1249	2013-07-02 13:34:31 -07:00
Richard Yao	3db3ff4a78	Use MAXPATHLEN instead of sizeof in snprintf This silences a GCC 4.8.0 warning by fixing a programming error caught by static analysis: ../../cmd/ztest/ztest.c: In function ‘ztest_vdev_aux_add_remove’: ../../cmd/ztest/ztest.c:2584:33: error: argument to ‘sizeof’ in ‘snprintf’ call is the same expression as the destination; did you mean to provide an explicit length? [-Werror=sizeof-pointer-memaccess] (void) snprintf(path, sizeof (path), ztest_aux_template, ^ Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1480	2013-07-02 10:39:24 -07:00
Cyril Plisko	64d7b6cf75	Override default SPA config location via environment When using zdb with non-default SPA config file it is not convenient to add -U <non-default-config-file-path> all the time. This commit introduces support for setting/overriding SPA config location via environment variable 'SPA_CONFIG_PATH'. If -U flag is specified in the command line it will override any other value as usual. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1545	2013-07-01 13:25:44 -07:00
Cyril Plisko	20c17b96c9	Add absent \n at the end of the help text line Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #1545	2013-06-28 11:26:41 -07:00
Brian Behlendorf	0c1171dcb5	Allow fetching the pool from the device at mount To simplify integration with the xfstests test suite the mount.zfs helper has been extended. When passed a block device (/dev/sdX) to mount, instead of a pool/dataset, the pool name will be read from any existing zfs label and used. This allows you to mount the root dataset of a zfs filesystem by specifing any of the member vdevs. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-06-26 15:20:09 -07:00
George Wilson	e51be06697	Illumos #3552 , #3564 3552 condensing one space map burns 3 seconds of CPU in spa_sync() thread 3564 spa_sync() spends 5-10% of its time in metaslab_sync() (when not condensing) Reviewed by: Adam Leventhal <ahl@delphix.com> Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Richard Lowe <richlowe@richlowe.net> References: illumos/illumos-gate@16a4a80742 https://www.illumos.org/issues/3552 https://www.illumos.org/issues/3564 Ported-by: Tim Chase <tim@chase2k.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1513	2013-06-19 16:22:39 -07:00
Madhav Suresh	c99c90015e	Illumos #3006 3006 VERIFY[S,U,P] and ASSERT[S,U,P] frequently check if first argument is zero Reviewed by Matt Ahrens <matthew.ahrens@delphix.com> Reviewed by George Wilson <george.wilson@delphix.com> Approved by Eric Schrock <eric.schrock@delphix.com> References: illumos/illumos-gate@fb09f5aad4 https://illumos.org/issues/3006 Requires: zfsonlinux/spl@1c6d149feb Ported-by: Tim Chase <tim@chase2k.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1509	2013-06-19 15:14:10 -07:00
Christ Schlacta	fb02fabf9b	Modified arcstat.py to run on linux * Modified kstat_update() to read arcstats from proc. * Fix shebang. * Added Makefile.am entries for arcstat.py Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1506	2013-06-18 15:43:15 -07:00
Christ Schlacta	7634cd54db	Added arcstat.py from FreeNAS Original source: http://support.freenas.org/browser/nanobsd/Files/usr/local/bin/arcstat.py Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #1506	2013-06-18 15:30:08 -07:00
Ned Bass	da29fe63f0	Don't leak mount flags into kernel When calling mount(), care must be taken to avoid passing in flags that are used only by the user space utilities. Otherwise we may stomp on flags that are reserved for other purposes in the kernel. In particular, openSUSE 12.3 kernels have added a new MS_RICHACL super-block flag whose value conflicts with our MS_COMMENT flag. This causes incorrect behavior such as the umask being ignored. The MS_COMMENT flag essentially serves as a placeholder in the option_map data structure of zfs_mount.c, but its value is never used. Therefore we can avoid the conflict by defining it to 0. The MS_USERS, MS_OWNER, and MS_GROUP flags also conflict with reserved flags in the kernel. While this is not known to have caused any problems, it is nevertheless incorrect. For the purposes of the mount.zfs helper, the "users", "owner", and "group" options just serve as hints to set additional implied options. Therefore we now define their associated mount flags in terms of the options that they imply rather than giving them unique values. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1457	2013-06-18 15:30:08 -07:00
George Wilson	5853fe790d	Illumos #3306 , #3321 3306 zdb should be able to issue reads in parallel 3321 'zpool reopen' command should be documented in the man page and help Reviewed by: Adam Leventhal <ahl@delphix.com> Reviewed by: Matt Ahrens <matthew.ahrens@delphix.com> Reviewed by: Christopher Siden <chris.siden@delphix.com> Approved by: Garrett D'Amore <garrett@damore.org> References: illumos/illumos-gate@31d7e8fa33 https://www.illumos.org/issues/3306 https://www.illumos.org/issues/3321 The vdev_file.c implementation in this patch diverges significantly from the upstream version. For consistenty with the vdev_disk.c code the upstream version leverages the Illumos bio interfaces. This makes sense for Illumos but not for ZoL for two reasons. 1) The vdev_disk.c code in ZoL has been rewritten to use the Linux block device interfaces which differ significantly from those in Illumos. Therefore, updating the vdev_file.c to use the Illumos interfaces doesn't get you consistency with vdev_disk.c. 2) Using the upstream patch as is would requiring implementing compatibility code for those Solaris block device interfaces in user and kernel space. That additional complexity could lead to confusion and doesn't buy us anything. For these reasons I've opted to simply move the existing vn_rdwr() as is in to the taskq function. This has the advantage of being low risk and easy to understand. Moving the vn_rdwr() function in to its own taskq thread also neatly avoids the possibility of a stack overflow. Finally, because of the additional work which is being handled by the free taskq the number of threads has been increased. The thread count under Illumos defaults to 100 but was decreased to 2 in commit 08d08e due to contention. We increase it to 8 until the contention can be address by porting Illumos #3581. Ported-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1354	2013-05-03 16:53:52 -07:00
Brian Behlendorf	937210a54b	Fix zinject list handlers The zfs_fd must be opened before calling print_all_handlers() or the ioctl() cannot be used to the zfs control device. This brings the zinject code back in sync with the Illumos implementation. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-05-01 17:05:58 -07:00
George.Wilson	cc92e9d0c3	3246 ZFS I/O deadman thread Reviewed by: Matt Ahrens <matthew.ahrens@delphix.com> Reviewed by: Eric Schrock <eric.schrock@delphix.com> Reviewed by: Christopher Siden <chris.siden@delphix.com> Approved by: Garrett D'Amore <garrett@damore.org> NOTES: This patch has been reworked from the original in the following ways to accomidate Linux ZFS implementation ) Usage of the cyclic interface was replaced by the delayed taskq interface. This avoids the need to implement new compatibility code and allows us to rely on the existing taskq implementation. ) An extern for zfs_txg_synctime_ms was added to sys/dsl_pool.h because declaring externs in source files as was done in the original patch is just plain wrong. ) Instead of panicing the system when the deadman triggers a zevent describing the blocked vdev and the first pending I/O is posted. If the panic behavior is desired Linux provides other generic methods to panic the system when threads are observed to hang. ) For reference, to delay zios by 30 seconds for testing you can use zinject as follows: 'zinject -d <vdev> -D30 <pool>' References: illumos/illumos-gate@283b84606b https://www.illumos.org/issues/3246 Ported-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1396	2013-05-01 17:05:52 -07:00
Chris Dunlop	254255f735	Trivial spelling fix Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1411	2013-04-19 15:43:16 -07:00
Ned Bass	92db59ca3b	Refresh links to web site A few files still refer to @behlendorf's private fork on github. Use the primary web site URL instead. Two typos are also corrected. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-03-06 15:46:41 -08:00
Richard Yao	b01615d5ac	Constify structures containing function pointers The PaX team modified the kernel's modpost to report writeable function pointers as section mismatches because they are potential exploit targets. We could ignore the warnings, but their presence can obscure actual issues. Proper const correctness can also catch programming mistakes. Building the kernel modules against a PaX/GrSecurity patched Linux 3.4.2 kernel reports 133 section mismatches prior to this patch. This patch eliminates 130 of them. The quantity of writeable function pointers eliminated by constifying each structure is as follows: vdev_opts_t 52 zil_replay_func_t 24 zio_compress_info_t 24 zio_checksum_info_t 9 space_map_ops_t 7 arc_byteswap_func_t 5 The remaining 3 writeable function pointers cannot be addressed by this patch. 2 of them are in zpl_fs_type. The kernel's sget function requires that this be non-const. The final writeable function pointer is created by SPL_SHRINKER_DECLARE. The kernel's set_shrinker() and remove_shrinker() functions also require that this be non-const. Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1300	2013-03-04 08:49:32 -08:00
Brian Behlendorf	8128bd89fb	Fix hot spares The issue with hot spares in ZoL is because it opens all leaf vdevs exclusively (O_EXCL). On Linux, exclusive opens cause subsequent exclusive opens to fail with EBUSY. This could be resolved by not opening any of the devices exclusively, which is what Illumos does, but the additional protection offered by exclusive opens is desirable. It cleanly prevents you from accidentally adding an in-use non-ZFS device to your pool. To fix this we very slightly relaxed the usage of O_EXCL in the following ways. 1) Functions which open the device but only read had the O_EXCL flag removed and were updated to use O_RDONLY. 2) A common holder was added to the vdev disk code. This allow the ZFS code to internally open the device multiple times but non-ZFS callers may not. 3) An exception was added to make_disks() for hot spare when creating partition tables. For hot spare devices which are already opened exclusively we skip creating the partition table because this must already have been done when the disk was originally added as a hot spare. Additional minor changes include fixing check_in_use() to use a partition instead of a slice suffix. And is_spare() was moved above make_disks() to avoid adding a forward reference. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #250	2013-03-01 13:31:02 -08:00
Tim Connors	c5b247f335	-x shouldn't warn about old on-disk format or unavailable features `zpool status -x` should only flag errors or where the pool is unavailable. If it imported fine but isn't using the latest features available in the code, that's not an error. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1319	2013-02-28 09:17:09 -08:00
Brian Behlendorf	f52b31eaf0	Honor 80 character limit in 'zpool status' This is a minor nit, but the second line of the 'action' message when you need to upgrade your pool to support feature flags exceeds the standard 80 character limit. Fix it by moving the word 'feature' on to the third line. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-01-31 11:06:57 -08:00
Brian Behlendorf	dbf763b39b	Retire zpool_id infrastructure In the interest of maintaining only one udev helper to give vdevs user friendly names, the zpool_id and zpool_layout infrastructure is being retired. They are superseded by vdev_id which incorporates all the previous functionality. Documentation for the new vdev_id(8) helper and its configuration file, vdev_id.conf(5), can be found in their respective man pages. Several useful example files are installed under /etc/zfs/. /etc/zfs/vdev_id.conf.alias.example /etc/zfs/vdev_id.conf.multipath.example /etc/zfs/vdev_id.conf.sas_direct.example /etc/zfs/vdev_id.conf.sas_switch.example Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #981	2013-01-29 12:23:17 -08:00
Ned Bass	ba43f4565a	vdev_id: improve keyword parsing flexibility The vdev_id udev helper strictly requires configuration file keywords to always be anchored at the beginning of the line and to be followed by a space character. However, users may prefer to use indentation or tab delimitation. Improve flexibility by simply requiring a keyword to be the first field on the line. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1239	2013-01-25 13:44:32 -08:00
Christopher Siden	e6f7d01502	Illumos #3397 , #3398 3397 zdb <pool> <objnum> output is too verbose 3398 zdb can't dump feature flags zap objects Reviewed by: Matt Ahrens <matthew.ahrens@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: Eric Schrock <eric.schrock@delphix.com> Reviewed by: Richard Lowe <richlowe@richlowe.net> Approved by: Dan McDonald <danmcd@nexenta.com> References: illumos/illumos-gate@e690fb27a7 https://www.illumos.org/issues/3397 https://www.illumos.org/issues/3398 Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-01-11 16:42:50 -08:00
Yuri Pankov	5990da81a7	Illumos #1884 , #3028 , #3048 , #3049 , #3060 , #3061 , #3093 1884 Empty "used" field for zfs *space commands 3028 zfs {group,user}space -n prints (null) instead of numeric GID/UID 3048 zfs {user,group}space [-s\|-S] is broken 3049 zfs {user,group}space -t doesn't really filter the results 3060 zfs {user,group}space -H output isn't tab-delimited 3061 zfs {user,group}space -o doesn't use specified fields order 3093 zfs {user,group}space's -i is noop Reviewed by: Garry Mills <gary_mills@fastmail.fm> Reviewed by: Eric Schrock <eric.schrock@delphix.com> Approved by: Richard Lowe <richlowe@richlowe.net> References: illumos/illumos-gate@89f5d17b06 illumos changeset: 13803:b5e49d71ff0e https://www.illumos.org/issues/1884 https://www.illumos.org/issues/3028 https://www.illumos.org/issues/3048 https://www.illumos.org/issues/3049 https://www.illumos.org/issues/3060 https://www.illumos.org/issues/3061 https://www.illumos.org/issues/3093 Ported-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1194	2013-01-11 09:17:19 -08:00
Yuri Pankov	240245896a	Illumos #1377 `zpool status -D' should tell if there are no DDT entries 1337 `zpool status -D' should tell if there are no DDT entries Reviewed by: Eric Schrock <eric.schrock@delphix.com> Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com> Reviewed by: George Wilson <gwilson@zfsmail.com> Approved by: Albert Lee <trisk@nexenta.com> References: illumos/illumos-gate@ce72e614c1 illumos changeset: 13432:d1ad8d106d64 https://www.illumos.org/issues/1337 Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-01-11 09:17:13 -08:00
Brian Behlendorf	a1e147eef8	Add /sbin/fsck.zfs helper A fsck helper to accomidate distributions that expect to be able to execute a fsck on all filesystem types. Currently this script does nothing but it could be extended to act as a compatibility wrapper for 'zpool scrub'. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #964	2013-01-09 16:54:58 -08:00
Brian Behlendorf	87bdc45ccb	Report realpath() canonicalization error Rather than just reporting the failure include the passed mount point and error number. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1153	2013-01-09 16:54:58 -08:00
George Wilson	1eb5bfa3dc	Illumos #3145 , #3212 3145 single-copy arc 3212 ztest: race condition between vdev_online() and spa_vdev_remove() Reviewed by: Matt Ahrens <matthew.ahrens@delphix.com> Reviewed by: Adam Leventhal <ahl@delphix.com> Reviewed by: Eric Schrock <eric.schrock@delphix.com> Reviewed by: Justin T. Gibbs <gibbs@scsiguy.com> Approved by: Eric Schrock <eric.schrock@delphix.com> References: illumos-gate/commit/9253d63df408bb48584e0b1abfcc24ef2472382e illumos changeset: 13840:97fd5cdf328a https://www.illumos.org/issues/3145 https://www.illumos.org/issues/3212 Ported-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #989 Closes #1137	2013-01-08 10:35:44 -08:00
George Wilson	ea0b2538cd	Illumos #3349 : zpool upgrade -V bumps the on disk version number 3349 zpool upgrade -V bumps the on disk version number, but leaves the in core version Reviewed by: Adam Leventhal <ahl@delphix.com> Reviewed by: Christopher Siden <chris.siden@delphix.com> Reviewed by: Matt Ahrens <matthew.ahrens@delphix.com> Reviewed by: Richard Lowe <richlowe@richlowe.net> Approved by: Dan McDonald <danmcd@nexenta.com> References: illumos/illumos-gate@25345e4666 https://www.illumos.org/issues/3349 Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-01-08 10:35:43 -08:00
Matthew Ahrens	29809a6cba	Illumos #3086 : unnecessarily setting DS_FLAG_INCONSISTENT on async 3086 unnecessarily setting DS_FLAG_INCONSISTENT on async destroyed datasets Reviewed by: Christopher Siden <chris.siden@delphix.com> Approved by: Eric Schrock <Eric.Schrock@delphix.com> References: illumos/illumos-gate@ce636f8b38 illumos changeset: 13776:cd512c80fd75 https://www.illumos.org/issues/3086 Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-01-08 10:35:43 -08:00
Christopher Siden	b9b24bb4ca	Illumos #2762 : zpool command should have better support for feature flags 2762 zpool command should have better support for feature flags Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Approved by: Eric Schrock <Eric.Schrock@delphix.com> References: illumos/illumos-gate@57221772c3 https://www.illumos.org/issues/2762 Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-01-08 10:35:43 -08:00
George Wilson	3bc7e0fb0f	Illumos #3090 and #3102 3090 vdev_reopen() during reguid causes vdev to be treated as corrupt 3102 vdev_uberblock_load() and vdev_validate() may read the wrong label Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Christopher Siden <chris.siden@delphix.com> Reviewed by: Garrett D'Amore <garrett@damore.org> Approved by: Eric Schrock <Eric.Schrock@delphix.com> References: illumos/illumos-gate@dfbb943217 illumos changeset: 13777:b1e53580146d https://www.illumos.org/issues/3090 https://www.illumos.org/issues/3102 Ported-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #939	2013-01-08 10:35:42 -08:00
Brian Behlendorf	5ac0c30a94	Revert "Temporarily disable the reguid test." This reverts commit `d135245791`. Since feature flags have now been merged we can apply the real upstream fix from Illumos. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #997	2013-01-08 10:35:42 -08:00
Christopher Siden	9ae529ec5d	Illumos #2619 and #2747 2619 asynchronous destruction of ZFS file systems 2747 SPA versioning with zfs feature flags Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed by: George Wilson <gwilson@delphix.com> Reviewed by: Richard Lowe <richlowe@richlowe.net> Reviewed by: Dan Kruchinin <dan.kruchinin@gmail.com> Approved by: Eric Schrock <Eric.Schrock@delphix.com> References: illumos/illumos-gate@53089ab7c8 illumos/illumos-gate@ad135b5d64 illumos changeset: 13700:2889e2596bd6 https://www.illumos.org/issues/2619 https://www.illumos.org/issues/2747 NOTE: The grub specific changes were not ported. This change must be made to the Linux grub packages. Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>	2013-01-08 10:35:35 -08:00
Will Rouesnel	462ee8e3f3	Allow fake mounts to succeed on non-legacy filesystems. mountall in Debian depends on being able to pass the -f parameter to mount, which specifies a fake mount and just updates the mtab. Currently mount.zfs will fail such a request if it is not passed with -o zfsutil. This patch allows a fake mount on a non-legacy filesystem to succeed in the same manner as a -o remount does, thus enabling mountall to work correctly. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1167	2013-01-07 11:30:27 -08:00
Ned Bass	2957f38d78	vdev_id support for device link aliases Add a vdev_id feature to map device names based on already defined udev device links. To increase the odds that vdev_id will run after the rules it depends on, increase the vdev.rules rule number from 60 to 69. With this change, vdev_id now provides functionality analogous to zpool_id and zpool_layout, paving the way to retire those tools. A defined alias takes precedence over a topology-derived name, but the two naming methods can otherwise coexist. For example, one might name drives in a JBOD with the sas_direct topology while naming an internal L2ARC device with an alias. For example, the following lines in vdev_id.conf will result in the creation of links /dev/disk/by-vdev/{d1,d2}, each pointing to the same target as the device link specified in the third field. # by-vdev # name fully qualified or base name of device link alias d1 /dev/disk/by-id/wwn-0x5000c5002de3b9ca alias d2 wwn-0x5000c5002def789e Also perform some minor vdev_id cleanup, such as removal of the unused -s command line option. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #981	2012-12-03 14:04:47 -08:00
Cyril Plisko	4588bf5701	Make zpool attach -o ashift=... actually work Commit `df83110856` missed update to getopt() call, while delivering all the rest. This commit adds "o" to getopt(). Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #566	2012-11-30 13:50:26 -08:00
Cyril Plisko	38b344d22a	vdev_id fails to handle complex device topologies While expanding positional parameters shell requires non-single digits to be enclosed in braces. When the SAS topology is non-trivial the number of positional parameters generated internally by vdev_id script (using set -- ...) easily crosses single digit limit and vdev_id fails to generate links. Signed-off-by: Ned Bass <bass6@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1119	2012-11-29 13:07:47 -08:00
Ned Bass	a6ef9522ea	Make vdev_id POSIX sh compatible Full bash may not be available in all environments where udev helpers run, such as in an initial ramdisk. To avoid breakage in this case, remove use of bash-specific features such as variable arrays and the `declare' keyword from the vdev_id script. Signed-off-by: Ned Bass <bass6@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #870	2012-11-27 14:23:22 -08:00
nordaux	33364b15d3	mount.zfs: canonicalize mount point for mtab Canonicalize the mount point passed to the mount.zfs helper. This way a clean path is always added to mtab which ensures the umount can properly locate and remove the entry. Test case: $ mkdir /mnt/foo $ mount -t zfs zpool/foo /mnt/../mnt/foo//// $ umount /mnt/foo $ cat /etc/mtab \| grep zpool/foo zpool/foo /mnt/../mnt/foo//// zfs rw 0 0 Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #573	2012-11-15 14:28:52 -08:00
Cyril Plisko	df83110856	Add "-o ashift" to zpool add and zpool attach When adding devices to an existing pool "ashift" property is auto-detected. However, if this property was overridden at the pool creation time (i.e. zpool create -o ashift=12 tank ...) this may not be what the user wants. This commit lets the user specify the value of "ashift" property to be used with newly added drives. For example, zpool add -o ashift=12 tank disk1 zpool attach -o ashift=12 tank disk1 disk2 Signed-off-by: Cyril Plisko <cyril.plisko@mountall.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #566	2012-11-15 11:06:14 -08:00
George Wilson	32a9872bba	Illumos #2671 : zpool import should not fail if vdev ashift has increased Reviewed by: Adam Leventhal <ahl@delphix.com> Reviewed by: Eric Schrock <eric.schrock@delphix.com> Reviewed by: Richard Elling <richard.elling@richardelling.com> Reviewed by: Gordon Ross <gwr@nexenta.com> Reviewed by: Garrett D'Amore <garrett@damore.org> Approved by: Richard Lowe <richlowe@richlowe.net> Refererces to Illumos issue: https://www.illumos.org/issues/2671 This patch has been slightly modified from the upstream Illumos version. In the upstream implementation a warning message is logged to the console. To prevent pointless console noise this notification is now posted as a "ereport.fs.zfs.vdev.bad_ashift" event. The event indicates a non-optimial (but entirely safe) ashift value was used to create the pool. Depending on your workload this may impact pool performance. Unfortunately, the only way to correct the issue is to recreate the pool with a new ashift. NOTE: The unrelated fix to the comment in zpool_main.c appears in the upstream commit and was preserved for consistnecy. Ported-by: Cyril Plisko <cyril.plisko@mountall.com> Reworked-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #955	2012-11-15 11:05:59 -08:00
Brian Behlendorf	e8fd45a0f9	Add ddt_object_count() error handling The interface for the ddt_zap_count() function assumes it can never fail. However, internally ddt_zap_count() is implemented with zap_count() which can potentially fail. Now because there was no way to return the error to the caller a VERIFY was used to ensure this case never happens. Unfortunately, it has been observed that pools can be damaged in such a way that zap_count() fails. The result is that the pool can not be imported without hitting the VERIFY and crashing the system. This patch reworks ddt_object_count() so the error can be safely caught and returned to the caller. This allows a pool which has be damaged in this way to be safely rewound for import. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #910	2012-10-29 08:57:45 -07:00
Brian Behlendorf	eac4720465	Allow 'zpool replace' to use short device names The 'zpool replace' command would fail when given a short name because unlike on other platforms the short name cannot be deterministically expanded to a single path. Multiple path prefixes must be checked and in addition the partition suffix for whole disks is determined by the prefix. To handle this complexity a zfs_strcmp_pathname() function was added which takes either a short or fully qualified device name. Short names will be expanded using the prefixes in the default import search path, or the ZPOOL_IMPORT_PATH environment variable if it's defined. All posible expansions are then compared against the comparison path. Care is taken to strip redundant slashes to ensure legitimate matches are not missed. In the context of this work the existing zfs_resolve_shortname() function was extended to consider the ZPOOL_IMPORT_PATH when set. The zfs_append_partition() interface was also simplified to take only a single buffer. The vast majority of these changes rework existing Linux specific code which was originally written to accomidate udev. However, there is some minimal cleanup which removes Illumos specific code. This was done to improve readability but the basic flow and intent of the upstream code was maintained. These changes are the logical conclusion of the previos work to adjust the 'zpool import' search behavior, see commit 44867b6a. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #544 Closes #976	2012-10-22 08:45:58 -07:00
Brian Behlendorf	26099167e6	Disable ztest deadman timer The ztest deadman timer has been causing false positives in the testing VMs. To make it easier to spot possible regressions I'm disabling this timer. The buildbot test infrastructure will still mark ztest instances which take to long to complete as failures. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #1018	2012-10-14 19:35:09 -07:00
Brian Behlendorf	ae380cfa76	Realpath arg 2 must be a minimum of PATH_MAX The realpath(3) function expects that when a buffer is passed for the 'resolved_path' that it be at least PATH_MAX in length. If it's not a buffer overflow may occur. Therefore the passed buffer size is changed from MAXNAMELEN to MAXPATHLEN. We also take this opertunity to dynamically allocate the buffer to keep it off the stack. warning: call to '__realpath_chk_warn' declared with attribute warning: second argument of realpath must be either NULL or at least PATH_MAX bytes long buffer [enabled by default] Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-10-04 13:19:10 -07:00
Brian Behlendorf	5be98cfe2f	Verify the return value for warn_unused_result functions Under Linux the following functions are flagged with the attribute warn_unused_result, this triggers a warning when ever they are used without checking the return value. To handle this case we check the result VERIFY(). It's better to detect this immediately on failure rather than segfault farther down in the function. ../../cmd/ztest/ztest.c:6033:2: warning: ignoring return value of 'asprintf', declared with attribute warn_unused_result [-Wunused-result] ../../cmd/ztest/ztest.c:739:3: warning: ignoring return value of 'realpath', declared with attribute warn_unused_result [-Wunused-result] Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-10-04 13:19:10 -07:00
Brian Behlendorf	facbbe4366	Replace tempnam() with mkstemp() The use of tempnam() is racy and it should be avoided in favor of mkstemp(). According to the Linux tempnam(3) man page. "Although tempnam() generates names that are difficult to guess, it is nevertheless possible that between the time that tempnam() returns a pathname, and the time that the program opens it, another program might create that pathname using open(2), or create it as a symbolic link. This can lead to security holes. To avoid such possibilities, use the open(2) O_EXCL flag to open the pathname. Or better yet, use mkstemp(3) or tmpfile(3)." This issue was flagged by gcc. ztest.o: In function `setup_data_fd': cmd/ztest/ztest.c:5822: warning: the use of `tempnam' is dangerous, better use `mkstemp' Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-10-04 13:19:10 -07:00
Brian Behlendorf	483106eb71	Minimize ztest stack frame size To ensure ztest behaves as similarly as possible to the kernel implementation of ZFS we attempt to honor the kernel stack limits. This includes keeping the individual stack frame sizes under 1K in size. We currently use gcc to detect and enforce this limit. Therefore to get this building cleanly with full debugging enabled the stack usage in the following functions has been reduced by moving the buffer to the heap. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-10-04 13:19:09 -07:00
Etienne Dechamps	9d81146b01	Use dynamic file descriptor numbers in ztest. Currently, ztest expects to get 3 and 4 as the file descriptors for data and random files, respectively. This is quite fragile and breaks easily if ztest is run with these file descriptors already opened (e.g. in a complex shell script). This patch fixes the issue by removing the assumptions on the file descriptor numbers that open() returns. For the random file (/dev/urandom), the new code doesn't rely on a shared file descriptor; instead, it reopens the file in the child. For the data file, the new code writes the file descriptor number into a "ZTEST_FD_DATA" environment variable so that it can be recovered after the execv() call. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-10-04 13:19:09 -07:00
Christopher Siden	22257dc0d5	Fix mmap() usage in ztest. illumos/illumos-gate@ad135b5d64 Illumos changeset: 13700:2889e2596bd6 Note that this is only a partial port of the aforementioned Illumos changeset. Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed by: George Wilson <gwilson@delphix.com> Reviewed by: Richard Lowe <richlowe@richlowe.net> Reviewed by: Dan Kruchinin <dan.kruchinin@gmail.com> Approved by: Eric Schrock <Eric.Schrock@delphix.com> Ported to zfsonlinux by: Etienne Dechamps <etienne.dechamps@ovh.net>	2012-10-04 13:19:09 -07:00
Chris Siden	c242c188fd	Illumos #1950 : ztest backwards compatibility testing option. illumos/illumos-gate@420dfc9585 Illumos changeset: 13571:a5771a96228c 1950 ztest backwards compatibility testing option Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Adam Leventhal <ahl@delphix.com> Reviewed by: Matt Ahrens <matt@delphix.com> Reviewed by: Richard Lowe <richlowe@richlowe.net> Reviewed by: Robert Mustacchi <rm@joyent.com> Approved by: Eric Schrock <eric.schrock@delphix.com> Ported-by: Etienne Dechamps <etienne.dechamps@ovh.net> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-10-04 13:18:53 -07:00
Etienne Dechamps	d135245791	Temporarily disable the reguid test. Currently, ztest fails with the following error: error: Pool 'ztest' has encountered an uncorrectable I/O failure and the failure mode property for this pool is set to panic. We know how to fix it (see issue #939), but it may take some time before we get around to merging the fix, which has some heavy dependencies. In the mean time, it is not ideal to be unable to use ztest just because of a small isolated issue, so this patch works around the problem by disabling the reguid test. This is just a temporary hack to keep ztest usable. The reguid test will be enabled again when the proper fix is merged. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #997	2012-10-03 13:59:02 -07:00
Etienne Dechamps	6aec1cd5a6	Fix ztest vdev file paths. Currently, in several instances (but not all), ztest generates vdev file paths using a statement similar to this: snprintf(path, sizeof (path), ztest_dev_template, ...); This worked fine until `40b84e7aec`, which changed path to be a pointer to the heap instead of an array allocated on the stack. Before this change, sizeof(path) would return the size of the array; now, it returns the size of the pointer instead. As a result, the aforementioned sprintf statement uses the wrong size and truncates the vdev file path to the first 4 or 8 bytes (depending on the architecture). Typically, with default settings, the file path will become "/tmp/zt" instead of "/test/ztest.XXX". This issue only exists in ztest_vdev_attach_detach() and ztest_fault_inject(), which explains why ztest doesn't fail right away. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #989	2012-10-03 13:32:48 -07:00
Etienne Dechamps	0aebd4f9e3	Create threads in detached state in userspace. Currently, thread_create(), when called in userspace, creates a joinable (i.e. not detached thread). This is the pthread default. Unfortunately, this does not reproduce kthreads behavior (kthreads are always detached). In addition, this contradicts the original Solaris code which creates userspace threads in detached mode. These joinable threads are never joined, which leads to a leakage of pthread thread objects ("zombie threads"). This in turn results in excessive ressource consumption, and possible ressource exhaustion in extreme cases (e.g. long ztest runs). This patch fixes the issue by creating userspace threads in detached mode. The only exception is ztest worker threads which are meant to be joinable. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #989	2012-10-03 13:32:48 -07:00
Bill Pijewski	37abac6d55	Illumos #2703 : add mechanism to report ZFS send progress Reviewed by: Matt Ahrens <matt@delphix.com> Reviewed by: Robert Mustacchi <rm@joyent.com> Reviewed by: Richard Lowe <richlowe@richlowe.net> Approved by: Eric Schrock <Eric.Schrock@delphix.com> References: https://www.illumos.org/issues/2703 Ported by: Martin Matuska <martin@matuska.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-09-19 13:39:06 -07:00
Chris Siden	1bd201e70d	Illumos #1948 : zpool list should show more detailed pool info Reviewed by: Adam Leventhal <ahl@delphix.com> Reviewed by: Matt Ahrens <matt@delphix.com> Reviewed by: Eric Schrock <eric.schrock@delphix.com> Reviewed by: Richard Lowe <richlowe@richlowe.net> Reviewed by: Albert Lee <trisk@nexenta.com> Reviewed by: Dan McDonald <danmcd@nexenta.com> Reviewed by: Garrett D'Amore <garrett@damore.org> Approved by: Eric Schrock <eric.schrock@delphix.com> References: https://www.illumos.org/issues/1948 Ported by: Martin Matuska <martin@matuska.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #685	2012-09-19 13:39:05 -07:00
Richard Lowe	dd4769adc0	Illumos #2088 zdb could use a reasonable manual page Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com> Reviewed by: Garrett D'Amore <garrett@damore.org> Reviewed by: George Wilson <gwilson@zfsmail.com> Reviewed by: Steve Gonczi <gonczi@comcast.net> Reviewed by: Richard Elling <richard.elling@richardelling.com> Approved by: Garrett D'Amore <garrett@damore.org> References: https://www.illumos.org/issues/2088 Ported by: Cyril Plisko <cyril.plisko@mountall.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #682	2012-09-18 09:09:13 -07:00
Brian Behlendorf	44867b6d6e	Improve `zpool import` search behavior The goal of this change is to make 'zpool import' prefer to use the peristent /dev/mapper or /dev/disk/by-* paths. These are far preferable to the devices in /dev/ whos names are not persistent and are determined by the order in which a device is detected. This patch improves things by changing the default search path from just to the top level /dev/ directory to (in order): /dev/disk/by-vdev - Custom rules, use first if they exist /dev/disk/zpool - Custom rules, use first if they exist /dev/mapper - Use multipath devices before components /dev/disk/by-uuid - Single unique entry and persistent /dev/disk/by-id - May be multiple entries and persistent /dev/disk/by-path - Encodes physical location and persistent /dev/disk/by-label - Custom persistent labels /dev - UNSAFE device names will change The default search path can be overriden by setting the ZPOOL_IMPORT_PATH environment variable. This must be a colon delimited list of paths which are searched for vdevs. If the 'zpool import -d' option is specified only those listed paths will be searched. Finally, when multiple paths to the same device are found. If one of the paths is an exact match for the path used last time to import the pool it will be used. When there are no exact matches the prefered path will be determined by the provided search order. This means you can still import a pool and force specific names by providing the -d <path> option. And the prefered names will persist as long as those paths exist on your system. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #965	2012-09-17 13:49:07 -07:00
Cyril Plisko	8e8e7f35b7	Fix zdb printf format string for ZIL data blocks Without this fix the zdb printouts of ZIL data blocks look full of FF due to printf() handling its arguments as int by default. Here is the output before the fix TX_WRITE len 4136, txg 1093817, seq 149231 foid 4242, offset 0, length f68 G FFFFFF8EFFFFFF87FFFFFF91FFFFFFCC 1c FFFFFFAFFFFFFFC9FFFFFFBAZ FFFFFFC3 And the same after the fix TX_WRITE len 4136, txg 1093817, seq 149231 foid 4242, offset 0, length f68 G 8E8791CC 1cAFC9BAZ C3 Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #962	2012-09-13 09:02:12 -07:00
Cyril Plisko	af909a1089	Illumos #3064 : usr/src/cmd/zpool/zpool_main.c misspells "successful" Reviewed by: Andrew Stormont <Andrew.Stormont@nexenta.com> Reviewed by: Kartik Mistry <kartik.mistry@gmail.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> References: https://www.illumos.org/issues/3064 Signed-off-by: Cyril Plisko <cyril.plisko@mountall.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-09-11 10:19:50 -07:00
Etienne Dechamps	b815ff9a8f	Silence "setting dataset to sync always" message in ztest. ztest outputs a message when testing sync=always no matter what the verbosity level is. There is no point outputting this message for low verbosity levels. With this patch the message is only displayed at verbosity level 5 or above. The result is less output pollution. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #951	2012-09-10 10:55:44 -07:00
Brian Behlendorf	1ecc6d1265	Add zstreamdump .gitignore When zstreamdump was merged in commit `b79fc3f` we failed to add the needed .gitignore file. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-09-06 14:23:11 -07:00
Etienne Dechamps	ba7dbeb22e	Add libnvpair to mount_zfs dependencies Commit `e6f290535c` added libzpool to the mount_zfs dependencies. This brought in the nvpair symbols which are used by libzpool. To resolve this include the libnvpair library for mount_zfs even though mount_zfs doesn't directly require any of these symbols. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #926	2012-09-02 15:36:09 -07:00
Martin Matuska	b79fc3fea9	Add zstreamdump(8) command to examine ZFS send streams. Obtained from: illumos-gate revision 11935:538c866aaac6 Source: ssh://anonhg@hg.illumos.org/illumos-gate Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #905	2012-09-02 14:54:27 -07:00
Etienne Dechamps	e6f290535c	Fix mount_zfs dependency on libzpool. mount_zfs depends on libzpool for zfs_prop_written since `330d06f90d`. Unfortunately, the Makefile for mount_zfs has not been modified to reflect this. As a result, libtool doesn't know about the dependency, which may result in the wrong libzpool being used during the build (e.g. the libzpool from the system instead of the libzpool from the build directory). This patch adds the dependency to fix the issue. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Fixes #909.	2012-08-30 16:06:46 -07:00
Brian Behlendorf	ca8b5af89d	Remove autotools products Remove all of the generated autotools products from the repository and update the .gitignore files accordingly. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #718	2012-08-27 11:47:44 -07:00
Christopher Siden	e956d65106	Illumos #1796 , #2871 , #2903 , #2957 1796 "ZFS HOLD" should not be used when doing "ZFS SEND" from a read-only pool 2871 support for __ZFS_POOL_RESTRICT used by ZFS test suite 2903 zfs destroy -d does not work 2957 zfs destroy -R/r sometimes fails when removing defer-destroyed snapshot Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Approved by: Eric Schrock <Eric.Schrock@delphix.com> References: https://www.illumos.org/issues/1796 https://www.illumos.org/issues/2871 https://www.illumos.org/issues/2903 https://www.illumos.org/issues/2957 Ported by: Martin Matuska <martin@matuska.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-08-23 10:40:02 -07:00
Eric Schrock	db49968e5c	Illumos #2635 : 'zfs rename -f' to perform force unmount Reviewed by: Matt Ahrens <matt@delphix.com> Reviewed by: George Wilson <George.Wilson@delphix.com> Reviewed by: Bill Pijewski <wdp@joyent.com> Reviewed by: Richard Elling <richard.elling@richardelling.com> Approved by: Richard Lowe <richlowe@richlowe.net> References: https://www.illumos.org/issues/2635 Ported by: Martin Matuska <martin@matuska.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #717	2012-08-23 10:39:43 -07:00
Andrew Stormont	e346ec25af	Illumos #1936 : add support for "-t <datatype>" argument to zfs get Reviewed by: Kartik Mistry <kartik@nexenta.com> Reviewed by: Dan McDonald <danmcd@nexenta.com> Reviewed by: Richard Elling <richard.elling@gmail.com> Reviewed by: Garrett D'Amore <garrett@damore.org> Approved by: Richard Lowe <richlowe@richlowe.net> References: https://www.illumos.org/issues/1936 Ported by: Martin Matuska <martin@matuska.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #681	2012-08-23 10:35:59 -07:00
Alexander Eremin	684e8c0643	Illumos #1726 : Removal of pyzfs broke delegation for volumes Reviewed by: Andrew Stormont <andyjstormont@googlemail.com> Reviewed by: Garrett D'Amore <garrett@nexenta.com> Reviewed by: Richard Lowe <richlowe@richlowe.net> Reviewed by: Albert Lee <trisk@nexenta.com> Approved by: Garrett D'Amore <garrett@nexenta.com> References: https://www.illumos.org/issues/1726 Ported by: Martin Matuska <martin@matuska.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-08-23 10:35:37 -07:00
Alexander Eremin	79e722432f	Illumos #1977 : zfs allow arguments not parsed correctly after pyzfs removal Reviewed by: Garrett D'Amore <garrett.damore@gmail.com> Reviewed by: Albert Lee <trisk@nexenta.com> Approved by: Richard Lowe <richlowe@richlowe.net> References: https://www.illumos.org/issues/1977 Ported by: Martin Matuska <martin@matuska.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-08-23 10:21:04 -07:00
Dan McDonald	d96eb2b153	Illumos #1693 : persistent 'comment' field for a zpool Reviewed by: George Wilson <gwilson@zfsmail.com> Reviewed by: Eric Schrock <eric.schrock@delphix.com> Approved by: Richard Lowe <richlowe@richlowe.net> References: https://www.illumos.org/issues/1693 Ported by: Martin Matuska <martin@matuska.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #678	2012-08-08 11:49:37 -07:00
Etienne Dechamps	ee5fd0bb80	Set zvol discard_granularity to the volblocksize. Currently, zvols have a discard granularity set to 0, which suggests to the upper layer that discard requests of arbirarily small size and alignment can be made efficiently. In practice however, ZFS does not handle unaligned discard requests efficiently: indeed, it is unable to free a part of a block. It will write zeros to the specified range instead, which is both useless and inefficient (see dnode_free_range). With this patch, zvol block devices expose volblocksize as their discard granularity, so the upper layer is aware that it's not supposed to send discard requests smaller than volblocksize. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #862	2012-08-07 14:55:31 -07:00
Matthew Ahrens	330d06f90d	Illumos #1644 , #1645 , #1646 , #1647 , #1708 1644 add ZFS "clones" property 1645 add ZFS "written" and "written@..." properties 1646 "zfs send" should estimate size of stream 1647 "zfs destroy" should determine space reclaimed by destroying multiple snapshots 1708 adjust size of zpool history data References: https://www.illumos.org/issues/1644 https://www.illumos.org/issues/1645 https://www.illumos.org/issues/1646 https://www.illumos.org/issues/1647 https://www.illumos.org/issues/1708 This commit modifies the user to kernel space ioctl ABI. Extra care should be taken when updating to ensure both the kernel modules and utilities are updated. This change has reordered all of the new ioctl()s to the end of the list. This should help minimize this issue in the future. Reviewed by: Richard Lowe <richlowe@richlowe.net> Reviewed by: George Wilson <gwilson@zfsmail.com> Reviewed by: Albert Lee <trisk@opensolaris.org> Approved by: Garrett D'Amore <garret@nexenta.com> Ported by: Martin Matuska <martin@matuska.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #826 Closes #664	2012-07-31 09:25:30 -07:00
Brian Behlendorf	c171ea71bb	Allow '-o remount' for non-legacy datasets This is done for compatibility with existing Linux infrastructure. In particular, when using zfs as a root filesystem there are init scripts which as part of shutdown remount root read-only. Also, the new systemd infrastructure being used by Fedora expects to be able to remount a file system read-write. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #847	2012-07-30 15:58:02 -07:00
Richard Yao	739a1a82e0	Linux 3.5 compat, end_writeback() changed to clear_inode() The end_writeback() function was changed by moving the call to inode_sync_wait() earlier in to evict(). This effecitvely changes the ordering of the sync but it does not impact the details of the zfs implementation. However, as part of this change end_writeback() was renamed to clear_inode() to reflect the new semantics. This change does impact us and clear_inode() now maps to end_writeback() for kernels prior to 3.5. Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #784	2012-07-23 12:29:36 -07:00
Richard Yao	ea1fdf46e2	Linux 3.5 compat, iops->truncate_range() removed The vmtruncate_range() support has been removed from the kernel in favor of using the fallocate method in the file_operations table. Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #784	2012-07-23 12:29:32 -07:00
Richard Yao	756c3e5a9c	Linux 3.5 compat, eops->encode_fh() takes inodes The export_operations member ->encode_fh() has been updated to take both the child and parent inodes. This interface used to take the child dentry and a bool describing if the parent is needed. NOTE: While updating this code I noticed that we do not currently cleanly handle the case where we're passed a connectable parent. This code should be audited to make sure we're doing the right thing. Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #784	2012-07-23 12:29:23 -07:00
Etienne Dechamps	b5a28807cd	Move partition scanning from userspace to module. Currently, zpool online -e (dynamic vdev expansion) doesn't work on whole disks because we're invoking ioctl(BLKRRPART) from userspace while ZFS still has a partition open on the disk, which results in EBUSY. This patch moves the BLKRRPART invocation from the zpool utility to the module. Specifically, this is done just before opening the device in vdev_disk_open() which is called inside vdev_reopen(). This requires jumping through some hoops to get to the disk device from the partition device, and to make sure we can still open the partition after the BLKRRPART call. Note that this new code path is triggered on dynamic vdev expansion only; other actions, like creating a new pool, are unchanged and still call BLKRRPART from userspace. This change also depends on API changes which are available in 2.6.37 and latter kernels. The build system has been updated to detect this, but there is no compatibility mode for older kernels. This means that online expansion will NOT be available in older kernels. However, it will still be possible to expand the vdev offline. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #808	2012-07-17 09:17:31 -07:00
Garrett D'Amore	3541dc6d02	Illumos #1748 : desire support for reguid in zfs Reviewed by: George Wilson <gwilson@zfsmail.com> Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com> Reviewed by: Alexander Eremin <alexander.eremin@nexenta.com> Reviewed by: Alexander Stetsenko <ams@nexenta.com> Approved by: Richard Lowe <richlowe@richlowe.net> References: https://www.illumos.org/issues/1748 This commit modifies the user to kernel space ioctl ABI. Extra care should be taken when updating to ensure both the kernel modules and utilities are updated. If only the user space component is updated both the 'zpool events' command and the 'zpool reguid' command will not work until the kernel modules are updated. Ported by: Martin Matuska <martin@matuska.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #665	2012-07-11 13:08:56 -07:00
Pawel Jakub Dawidek	0cee24064a	Speed up 'zfs list -t snapshot -o name -s name' FreeBSD #xxx: Dramatically optimize listing snapshots when user requests only snapshot names and wants to sort them by name, ie. when executes: # zfs list -t snapshot -o name -s name Because only name is needed we don't have to read all snapshot properties. Below you can find how long does it take to list 34509 snapshots from a single disk pool before and after this change with cold and warm cache: before: # time zfs list -t snapshot -o name -s name > /dev/null cold cache: 525s warm cache: 218s after: # time zfs list -t snapshot -o name -s name > /dev/null cold cache: 1.7s warm cache: 1.1s NOTE: This patch only appears in FreeBSD. If/when Illumos picks up the change we may want to drop this patch and adopt their version. However, for now this addresses a real issue. Ported-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #450	2012-06-14 09:49:04 -07:00
Brian Behlendorf	fe2fc8f6d3	Workaround for failing zvol_id This is not a proper fix. It is just a workaround for the stack smashing detected by gcc in zvol_id. We simply disable the gcc stack protector for now when building the zvol_id udev helper. Once the root cause is resolved this patch should be reverted. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issues #569	2012-06-13 11:25:17 -07:00
Daniel Verite	92e91da208	Include <locale.h> to avoid error: 'LC_ALL' undeclared. When compiling ZFS with CFLAGS=-O0 it will trigger the following error. Resolve the issue by properly including locale.h. ../../cmd/mount_zfs/mount_zfs.c: In function 'main': ../../cmd/mount_zfs/mount_zfs.c:318:2: warning: implicit declaration of function 'setlocale' [-Wimplicit-function-declaration] ../../cmd/mount_zfs/mount_zfs.c:318:19: error: 'LC_ALL' undeclared (first use in this function) Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #724	2012-06-11 10:21:40 -07:00
Richard Yao	6a0936babc	Linux 3.4 compat, d_make_root() replaces d_alloc_root() torvalds/linux@adc0e91ab1 introduced introduced d_make_root() as a replacement for d_alloc_root(). Further commits appear to have removed d_alloc_root() from the Linux source tree. This causes the following failure: error: implicit declaration of function 'd_alloc_root' [-Werror=implicit-function-declaration] To correct this we update the code to use the current d_make_root() interface for readability. Then we introduce an autotools check to determine if d_make_root() is available. If it isn't then we define some compatibility logic which used the older d_alloc_root() interface. Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #776	2012-06-11 10:04:49 -07:00
Ned A. Bass	821b683436	Add vdev_id for JBOD-friendly udev aliases vdev_id parses the file /etc/zfs/vdev_id.conf to map a physical path in a storage topology to a channel name. The channel name is combined with a disk enclosure slot number to create an alias that reflects the physical location of the drive. This is particularly helpful when it comes to tasks like replacing failed drives. Slot numbers may also be re-mapped in case the default numbering is unsatisfactory. The drive aliases will be created as symbolic links in /dev/disk/by-vdev. The only currently supported topologies are sas_direct and sas_switch: o sas_direct - a channel is uniquely identified by a PCI slot and a HBA port o sas_switch - a channel is uniquely identified by a SAS switch port A multipath mode is supported in which dm-mpath devices are handled by examining the first running component disk, as reported by 'multipath -l'. In multipath mode the configuration file should contain a channel definition with the same name for each path to a given enclosure. vdev_id can replace the existing zpool_id script on systems where the storage topology conforms to sas_direct or sas_switch. The script could be extended to support other topologies as well. The advantage of vdev_id is that it is driven by a single static input file that can be shared across multiple nodes having a common storage toplogy. zpool_id, on the other hand, requires a unique /etc/zfs/zdev.conf per node and a separate slot-mapping file. However, zpool_id provides the flexibility of using any device names that show up in /dev/disk/by-path, so it may still be needed on some systems. vdev_id's functionality subsumes that of the sas_switch_id script, and it is unlikely that anyone is using it, so sas_switch_id is removed. Finally, /dev/disk/by-vdev is added to the list of directories that 'zpool import' will scan. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #713	2012-06-01 08:55:14 -07:00
Brian Behlendorf	b39d3b9f7b	Linux 3.3 compat, iops->create()/mkdir()/mknod() The mode argument of iops->create()/mkdir()/mknod() was changed from an 'int' to a 'umode_t'. To prevent a compiler warning an autoconf check was added to detect the API change and then correctly set a zpl_umode_t typedef. There is no functional change. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #701	2012-04-30 12:52:38 -07:00
Richard Lowe	ad60af8e1b	Illumos #2067 : uninitialized variables in zfs(1M) may make snapshots undestroyable Reviewed by: Joshua M. Clulow <josh@sysmgr.org> Reviewed by: Milan Jurik <milan.jurik@xylab.cz> Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com> Reviewed by: Garrett D'Amore <garrett@damore.org> Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed by: Steve Gonczi <gonczi@comcast.net> Approved by: Garrett D'Amore <garrett@damore.org> References: https://www.illumos.org/issues/2067 Ported by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-04-27 15:17:23 -07:00
Frederik Wessels	95bcd51ecc	Illumos #1946 : incorrect formatting when listing output of multiple pools with zpool iostat -v Reviewed by: Richard Elling <richard.elling@richardelling.com> Reviewed by: Joshua M. Clulow <josh@sysmgr.org> Approved by: Richard Lowe <richlowe@richlowe.net> Reference to Illumos issue: https://www.illumos.org/issues/1946 Ported by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-04-27 15:15:37 -07:00
Mike Harsch	187632dcef	Illumos #952 : separate intent logs should be obvious in 'zpool iostat' output Reviewed by: Adam Leventhal <ahl@delphix.com> Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed by: Eric Schrock <eric.schrock@delphix.com> Reviewed by: Dan McDonald <danmcd@nexenta.com> Reviewed by: Garrett D'Amore <garrett@nexenta.com> Approved by: Eric Schrock <eric.schrock@delphix.com> Refererce to Illumos issue: https://www.illumos.org/issues/952 Ported-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #607	2012-04-27 15:12:43 -07:00
P.SCH	cf81b00a73	ZFS list snapshot property alias Add support for the `zfs list -t snap` alias which is available under Oracle Solaris 11. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #640	2012-04-11 12:02:46 -07:00
P.SCH	10b75496bb	ZFS snapshot alias For consistency, and because it's handy, add the 'zfs snap' alias which was introduced by Oracle Solaris 11. This includes an update to the man page to reflect all the available alias (snap, umount, and recv). Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #640	2012-04-11 12:02:31 -07:00
Richard Yao	847de12271	Print human readable error message for ENOENT A cryptic error code is printed when mounting a legacy dataset to a non-existent mountpoint. This patch changes this behavior to print "mount point '%s' does not exist", which is similar to the error message printed when mounting procfs. The single quotes were added to be consistent with the existing EBUSY error message, which is the only difference between this error message and the one that is printed when the same condition occurs when mounting procfs. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #633	2012-04-03 10:24:34 -07:00
Craig Sanders	9fc60702c6	Remove hard-coded 80 column output When stdout is detected to be a tty use the number of columns specified by the terminal. If that fails fall back to a default 80 column width. In the non-tty case allow for 999 column lines. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-03-27 15:01:08 -07:00
Brian Behlendorf	f47e1351db	Fix executable permissions Caught by lint, this permission change was accidentally introduced by commit `42cb3819f1`. Restore the correct permissions and while I'm at it add a missing whack-bang to config/ltmain.sh. lint: executable-not-elf-or-script: zpool_main.c zfs_main.c Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #620	2012-03-26 11:52:44 -07:00
Brian Behlendorf	1c5de20ae2	Add --enable-debug-dmu-tx configure option Allow rigorous (and expensive) tx validation to be enabled/disabled indepentantly from the standard zfs debugging. When enabled these checks ensure that all txs are constructed properly and that a dbuf is never dirtied without taking the correct tx hold. This checking is particularly helpful when adding new dmu consumers like Lustre. However, for established consumers such as the zpl with no known outstanding tx construction problems this is just overhead. --enable-debug-dmu-tx - Enable/disable validation of each tx as --disable-debug-dmu-tx it is constructed. By default validation is disabled due to performance concerns. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-03-23 12:25:17 -07:00
Brian Behlendorf	ebe7e575ea	Add .zfs control directory Add support for the .zfs control directory. This was accomplished by leveraging as much of the existing ZFS infrastructure as posible and updating it for Linux as required. The bulk of the core functionality is now all there with the following limitations. ) The .zfs/snapshot directory automount support requires a 2.6.37 or newer kernel. The exception is RHEL6.2 which has backported the d_automount patches. ) Creating/destroying/renaming snapshots with mkdir/rmdir/mv in the .zfs/snapshot directory works as expected. However, this functionality is only available to root until zfs delegations are finished. * mkdir - create a snapshot * rmdir - destroy a snapshot * mv - rename a snapshot The following issues are known defeciences, but we expect them to be addressed by future commits. ) Add automount support for kernels older the 2.6.37. This should be possible using follow_link() which is what Linux did before. ) Accessing the .zfs/snapshot directory via NFS is not yet possible. The majority of the ground work for this is complete. However, finishing this work will require resolving some lingering integration issues with the Linux NFS kernel server. *) The .zfs/shares directory exists but no futher smb functionality has yet been implemented. Contributions-by: Rohan Puri <rohan.puri15@gmail.com> Contributiobs-by: Andrew Barnes <barnes333@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #173	2012-03-22 13:03:47 -07:00
Gregor Kopka	42cb3819f1	Use stderr for 'no pools/datasets available' error The 'zfs list' and 'zpool list' commands output the message 'no datasets/pools available' to stdout. This should go to stderr and only the available datasets/pools should go to stdout. Returning nothing to stdout is expected behavior when there is nothing to list. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #581	2012-03-15 10:24:00 -07:00
Brian Behlendorf	4b787d75c8	Cleanly support debug packages Allow a source rpm to be rebuilt with debugging enabled. This avoids the need to have to manually modify the spec file. By default debugging is still largely disabled. To enable specific debugging features use the following options with rpmbuild. '--with debug' - Enables ASSERTs # For example: $ rpmbuild --rebuild --with debug zfs-modules-0.6.0-rc6.src.rpm Additionally, ZFS_CONFIG has been added to zfs_config.h for packages which build against these headers. This is critical to ensure both zfs and the dependant package are using the same prototype and structure definitions. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-02-27 14:08:17 -08:00
Ned Bass	3a4f6caf08	Return success from check_slice() if device doesn't exist When creating a new pool, make_root_vdev() calls check_in_use() to ensure that none of the consituent disks are in use. If the disk contains a valid vdev label it is read to retrieve the list of its child vdevs and these are checked recursively. However, the partitions stored in the vdev label my no longer exist, for example if the partition table has since been altered. In any such case we would want the pool creation to proceed, so this change removes the check from check_slice() that returns an error if the device doesn't exist. As an added assurance, the Solaris implementation also returns sucess on ENOENT. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-02-27 08:52:38 -08:00
Etienne Dechamps	30930fba21	Add support for DISCARD to ZVOLs. DISCARD (REQ_DISCARD, BLKDISCARD) is useful for thin provisioning. It allows ZVOL clients to discard (unmap, trim) block ranges from a ZVOL, thus optimizing disk space usage by allowing a ZVOL to shrink instead of just grow. We can't use zfs_space() or zfs_freesp() here, since these functions only work on regular files, not volumes. Fortunately we can use the low-level function dmu_free_long_range() which does exactly what we want. Currently the discard operation is not added to the log. That's not a big deal since losing discard requests cannot result in data corruption. It would however result in disk space usage higher than it should be. Thus adding log support to zvol_discard() is probably a good idea for a future improvement. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-02-09 16:19:38 -08:00
Etienne Dechamps	cb2d19010d	Support the fallocate() file operation. Currently only the (FALLOC_FL_PUNCH_HOLE) flag combination is supported, since it's the only one that matches the behavior of zfs_space(). This makes it pretty much useless in its current form, but it's a start. To support other flag combinations we would need to modify zfs_space() to make it more flexible, or emulate the desired functionality in zpl_fallocate(). Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #334	2012-02-09 16:19:32 -08:00
Etienne Dechamps	34037afe24	Improve ZVOL queue behavior. The Linux block device queue subsystem exposes a number of configurable settings described in Linux block/blk-settings.c. The defaults for these settings are tuned for hard drives, and are not optimized for ZVOLs. Proper configuration of these options would allow upper layers (I/O scheduler) to take better decisions about write merging and ordering. Detailed rationale: - max_hw_sectors is set to unlimited (UINT_MAX). zvol_write() is able to handle writes of any size, so there's no reason to impose a limit. Let the upper layer decide. - max_segments and max_segment_size are set to unlimited. zvol_write() will copy the requests' contents into a dbuf anyway, so the number and size of the segments are irrelevant. Let the upper layer decide. - physical_block_size and io_opt are set to the ZVOL's block size. This has the potential to somewhat alleviate issue #361 for ZVOLs, by warning the upper layers that writes smaller than the volume's block size will be slow. - The NONROT flag is set to indicate this isn't a rotational device. Although the backing zpool might be composed of rotational devices, the resulting ZVOL often doesn't exhibit the same behavior due to the COW mechanisms used by ZFS. Setting this flag will prevent upper layers from making useless decisions (such as reordering writes) based on incorrect assumptions about the behavior of the ZVOL. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-02-07 16:23:06 -08:00
Etienne Dechamps	b18019d2d8	Fix synchronicity for ZVOLs. zvol_write() assumes that the write request must be written to stable storage if rq_is_sync() is true. Unfortunately, this assumption is incorrect. Indeed, "sync" does not mean what we think it means in the context of the Linux block layer. This is well explained in linux/fs.h: WRITE: A normal async write. Device will be plugged. WRITE_SYNC: Synchronous write. Identical to WRITE, but passes down the hint that someone will be waiting on this IO shortly. WRITE_FLUSH: Like WRITE_SYNC but with preceding cache flush. WRITE_FUA: Like WRITE_SYNC but data is guaranteed to be on non-volatile media on completion. In other words, SYNC does not mean that the write must be on stable storage on completion. It just means that someone is waiting on us to complete the write request. Thus triggering a ZIL commit for each SYNC write request on a ZVOL is unnecessary and harmful for performance. To make matters worse, ZVOL users have no way to express that they actually want data to be written to stable storage, which means the ZIL is broken for ZVOLs. The request for stable storage is expressed by the FUA flag, so we must commit the ZIL after the write if the FUA flag is set. In addition, we must commit the ZIL before the write if the FLUSH flag is set. Also, we must inform the block layer that we actually support FLUSH and FUA. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-02-07 16:23:06 -08:00
Brian Behlendorf	47621f3d76	Linux 3.3 compat, sops->show_options() The second argument of sops->show_options() was changed from a 'struct vfsmount ' to a 'struct dentry '. Add an autoconf check to detect the API change and then conditionally define the expected interface. In either case we are only interested in the zfs_sb_t. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #549	2012-02-03 10:02:01 -08:00
Darik Horn	750562833f	Combine libraries: spl, avl, efi, share, unicode. These libraries, which are an artifact of the ZoL development process, conflict with packages that are already in distribution: * libspl: SPL Programming Language * libavl: AVL for Linux * libefi: GRUB And these libraries are potential conflicts: * libshare: the Linux Mount Manager * libunicode: Perl and Python Recompose these five ZoL components into the four libraries that are conventionally provided by Solaris and FreeBSD systems: + libnvpair + libuutil + libzpool + libzfs This change resolves the name conflict, makes ZoL more compatible with existing software that uses autotools to detect ZFS, and allows pkg-zfs to better reflect the official Debian kFreeBSD packaging. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes: #430	2012-01-17 15:19:50 -08:00
Suman Chakravartula	e18be9a637	Add overlay(-O) mount option support Linux supports mounting over non-empty directories by default. In Solaris this is not the case and -O option is required for zfs mount to mount a zfs filesystem over a non-empty directory. For compatibility, I've added support for -O option to mount zfs filesystems over non-empty directories if the user wants to, just like in Solaris. I've defined MS_OVERLAY to record it in the flags variable if the -O option is supplied. The flags variable passes through a few functions and its checked before performing the empty directory check in zfs_mount function. If -O is given, the check is not performed. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #473	2012-01-12 15:49:38 -08:00
Darik Horn	b97f368d04	Avoid using awk in the zpool_id script. Some implementations of `awk` incorrectly parse the \< and \> regex symbols, so use a `while read` loop and regular globbing instead. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes: #259	2012-01-11 11:56:56 -08:00
Brian Behlendorf	ab26409db7	Linux 3.1 compat, super_block->s_shrink The Linux 3.1 kernel has introduced the concept of per-filesystem shrinkers which are directly assoicated with a super block. Prior to this change there was one shared global shrinker. The zfs code relied on being able to call the global shrinker when the arc_meta_limit was exceeded. This would cause the VFS to drop references on a fraction of the dentries in the dcache. The ARC could then safely reclaim the memory used by these entries and honor the arc_meta_limit. Unfortunately, when per-filesystem shrinkers were added the old interfaces were made unavailable. This change adds support to use the new per-filesystem shrinker interface so we can continue to honor the arc_meta_limit. The major benefit of the new interface is that we can now target only the zfs filesystem for dentry and inode pruning. Thus we can minimize any impact on the caching of other filesystems. In the context of making this change several other important issues related to managing the ARC were addressed, they include: * The dnlc_reduce_cache() function which was called by the ARC to drop dentries for the Posix layer was replaced with a generic zfs_prune_t callback. The ZPL layer now registers a callback to drop these dentries removing a layering violation which dates back to the Solaris code. This callback can also be used by other ARC consumers such as Lustre. arc_add_prune_callback() arc_remove_prune_callback() * The arc_reduce_dnlc_percent module option has been changed to arc_meta_prune for clarity. The dnlc functions are specific to Solaris's VFS and have already been largely eliminated already. The replacement tunable now represents the number of bytes the prune callback will request when invoked. * Less aggressively invoke the prune callback. We used to call this whenever we exceeded the arc_meta_limit however that's not strictly correct since it results in over zeleous reclaim of dentries and inodes. It is now only called once the arc_meta_limit is exceeded and every effort has been made to evict other data from the ARC cache. * More promptly manage exceeding the arc_meta_limit. When reading meta data in to the cache if a buffer was unable to be recycled notify the arc_reclaim thread to invoke the required prune. * Added arcstat_prune kstat which is incremented when the ARC is forced to request that a consumer prune its cache. Remember this will only occur when the ARC has no other choice. If it can evict buffers safely without invoking the prune callback it will. * This change is also expected to resolve the unexpect collapses of the ARC cache. This would occur because when exceeded just the arc_meta_limit reclaim presure would be excerted on the arc_c value via arc_shrink(). This effectively shrunk the entire cache when really we just needed to reclaim meta data. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #466 Closes #292	2012-01-11 11:46:02 -08:00
Darik Horn	afd7da0ce7	Add LIBSELINUX to mount_zfs_LDFLAGS. Regenerating the autotools configuration on Debian and Ubuntu systems causes compilation to fail with this error message: cmd/mount_zfs/../../cmd/mount_zfs/mount_zfs.c:403: undefined reference to `is_selinux_enabled' In the automake template, set "mount_zfs_LDFLAGS = ... $(LIBSELINUX)" so that the /sbin/mount.zfs utility is linked to libselinux. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2011-12-16 20:04:42 -08:00
Darik Horn	28eb9213d8	Linux 3.2 compat: set_nlink() Directly changing inode->i_nlink is deprecated in Linux 3.2 by commit SHA: bfe8684869601dacfcb2cd69ef8cfd9045f62170 Use the new set_nlink() kernel function instead. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes: #462	2011-12-16 20:02:52 -08:00
Prakash Surya	6ba3b44614	Add make rule for building Arch Linux packages Added the necessary build infrastructure for building packages compatible with the Arch Linux distribution. As such, one can now run: $ ./configure $ make pkg # Alternatively, one can run 'make arch' as well on the Arch Linux machine to create two binary packages compatible with the pacman package manager, one for the zfs userland utilities and another for the zfs kernel modules. The new packages can then be installed by running: # pacman -U $package.pkg.tar.xz In addition, source-only packages suitable for an Arch Linux chroot environment or remote builder can also be build using the 'sarch' make rule. NOTE: Since the source dist tarball is created on the fly from the head of the build tree, it's MD5 hash signature will be continually influx. As a result, the md5sum variable was intentionally omitted from the PKGBUILD files, and the '--skipinteg' makepkg option is used. This may or may not have any serious security implications, as the source tarball is not being downloaded from an outside source. Signed-off-by: Prakash Surya <surya1@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #491	2011-12-14 19:14:23 -08:00
Darik Horn	db7c1771da	Demote the whackbang in the zpool_id script. The zpool_id script is posixly correct and does not use bash features, so change its whackbang from /bin/bash to /bin/sh. Debian policy also stipulates that system scripts be dash compatible. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2011-12-05 09:48:18 -08:00
Darik Horn	87193e2b61	Demote egrep to grep in the zpool_id script. Direct invocation of GNU egrep is deprecated by its man page, and the its argument in the zpool_id script is not an extended expression. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2011-12-05 09:48:01 -08:00
Darik Horn	04bf5ecc1f	Quote variables in the zpool_id script. For consistency and safety, quote all variables in the zpool_id script. This accomodates a `-c CONFIG` parameter value with whitespace in the path name. Also fix a typo in the usage synopsis for `-h`. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #439	2011-12-05 09:47:03 -08:00
Darik Horn	9c8254f6f9	Support path_id changes in udev 174. The /lib/udev/path_id helper became a builtin command in the udev 174 release, so test whether path_id is external in the zpool_id script. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes: #429	2011-12-05 09:46:48 -08:00
Brian Behlendorf	5547c2f1bf	Simplify BDI integration Update the code to use the bdi_setup_and_register() helper to simplify the bdi integration code. The updated code now just registers the bdi during mount and destroys it during unmount. The only complication is that for 2.6.32 - 2.6.33 kernels the helper wasn't available so in these cases the zfs code must provide it. Luckily the bdi_setup_and_register() function is trivial. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #367	2011-11-08 10:19:03 -08:00
Darik Horn	3cee2262a6	Change sun.com URLs to zfsonlinux.org ZFS contains error messages that point to the defunct www.sun.com domain, which is currently offline. Change these error messages to use the zfsonlinux.org mirror instead. This commit depends on: zfsonlinux/zfsonlinux.github.com@8e10ead3dc Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2011-10-24 09:52:21 -07:00
Brian Behlendorf	86f35f34f4	Export symbols for the VFS API Export all symbols already marked extern in the zfs_vfsops.h header. Several non-static symbols have also been added to the header and exportewd. This allows external modules to more easily create and manipulate properly created ZFS filesystem type datasets. Rename zfsvfs_teardown() to zfs_sb_teardown and export it. This is done simply for consistency with the rest of the code base. All other zfsvfs_* functions have already been renamed. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2011-10-11 10:25:59 -07:00
Brian Behlendorf	c70602f1ea	Fix uninitialized varible in zfs_do_userspace() When compiling under Debian Lenny with gcc version 4.3.2 (Debian 4.3.2-1.1) the following warning occurs. To quiet the warning initialize 'error' to zero. Newer versions of gcc correctly determine that this uninitialized varible is impossible because ZFS_NUM_USERQUOTA_PROPS is known to be greater than zero. cmd/zfs/zfs_main.c:2377: warning: "error" may be used uninitialized in this function Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2011-09-27 16:56:38 -07:00
Brian Behlendorf	4c069d3494	Fixed uninitialized variable This warning was accidentally introduced by commit b7936d5c2337bc976ac831c1c38de563844c36b. The fix is to simply initialize the variable to ZFS_DELEG_WHO_UNKNOWN. cmd/zfs/zfs_main.c:4460:25: warning: 'who_type' may be used uninitialized in this function Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2011-08-19 16:26:06 -07:00
Brian Behlendorf	29b35200a7	Fix missing format arguments These warnings were accidentally introduced by commit b7936d5c2337bc976ac831c1c38de563844c36b. The fix is to simply add the missing format specifier. cmd/zfs/zfs_main.c:4565: warning: format not a string literal and no format arguments Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2011-08-19 15:16:34 -07:00
Brian Behlendorf	b740d602bd	Disable zfs /etc/mtab updates Completely disable the zfs binary from attempting to directly update /etc/mtab. The Linux port relies entirely on the mount.zfs helper to safely update /etc/mtab. If we left the /etc/mtab updates to the zfs binary then they could race with concurrent non-zfs mounts. Routing everything through the system mount command ensures the /etc/mtab updates are locked properly. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #329	2011-08-19 11:53:01 -07:00
Brian Behlendorf	de0a1c099b	Autogen refresh for udev changes Run autogen.sh using the same autotools versions as upstream: * autoconf-2.63 * automake-1.11.1 * libtool-2.2.6b	2011-08-08 16:30:27 -07:00
Kyle Fuller	12d06bac9b	Move udev rules from /etc/udev to /lib/udev This change moves the default install location for the zfs udev rules from /etc/udev/ to /lib/udev/. The correct convention is for rules provided by a package to be installed in /lib/udev/. The /etc/udev/ directory is reserved for custom rules or local overrides. Additionally, this patch cleans up some abuse of the bindir install location by adding a udevdir and udevruledir install directories. This allows us to revert to the default bin install location. The udev install directories can be set with the following new options. --with-udevdir=DIR install udev helpers [EPREFIX/lib/udev] --with-udevruledir=DIR install udev rules [UDEVDIR/rules.d] Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #356	2011-08-08 16:21:10 -07:00
Brian Behlendorf	76659dc110	Add backing_device_info per-filesystem For a long time now the kernel has been moving away from using the pdflush daemon to write 'old' dirty pages to disk. The primary reason for this is because the pdflush daemon is single threaded and can be a limiting factor for performance. Since pdflush sequentially walks the dirty inode list for each super block any delay in processing can slow down dirty page writeback for all filesystems. The replacement for pdflush is called bdi (backing device info). The bdi system involves creating a per-filesystem control structure each with its own private sets of queues to manage writeback. The advantage is greater parallelism which improves performance and prevents a single filesystem from slowing writeback to the others. For a long time both systems co-existed in the kernel so it wasn't strictly required to implement the bdi scheme. However, as of Linux 2.6.36 kernels the pdflush functionality has been retired. Since ZFS already bypasses the page cache for most I/O this is only an issue for mmap(2) writes which must go through the page cache. Even then adding this missing support for newer kernels was overlooked because there are other mechanisms which can trigger writeback. However, there is one critical case where not implementing the bdi functionality can cause problems. If an application handles a page fault it can enter the balance_dirty_pages() callpath. This will result in the application hanging until the number of dirty pages in the system drops below the dirty ratio. Without a registered backing_device_info for the filesystem the dirty pages will not get written out. Thus the application will hang. As mentioned above this was less of an issue with older kernels because pdflush would eventually write out the dirty pages. This change adds a backing_device_info structure to the zfs_sb_t which is already allocated per-super block. It is then registered when the filesystem mounted and unregistered on unmount. It will not be registered for mounted snapshots which are read-only. This change will result in flush-<pool> thread being dynamically created and destroyed per-mounted filesystem for writeback. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #174	2011-08-04 13:37:38 -07:00
Alexander Stetsenko	0b7936d5c2	Illumos #278 : get rid zfs of python and pyzfs dependencies Remove all python and pyzfs dependencies for consistency and to ensure full functionality even in a mimimalist environment. Reviewed by: gordon.w.ross@gmail.com Reviewed by: trisk@opensolaris.org Reviewed by: alexander.r.eremin@gmail.com Reviewed by: jerry.jelinek@joyent.com Approved by: garrett@nexenta.com References to Illumos issue and patch: - https://www.illumos.org/issues/278 - https://github.com/illumos/illumos-gate/commit/1af68beac3 Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #340 Issue #160 Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2011-08-01 12:09:36 -07:00
Eric Schrock	3e31d2b080	Illumos #883 : ZIL reuse during remount corruption Moving the zil_free() cleanup to zil_close() prevents this problem from occurring in the first place. There is a very good description of the issue and fix in Illumus #883. Reviewed by: Matt Ahrens <Matt.Ahrens@delphix.com> Reviewed by: Adam Leventhal <Adam.Leventhal@delphix.com> Reviewed by: Albert Lee <trisk@nexenta.com> Reviewed by: Gordon Ross <gwr@nexenta.com> Reviewed by: Garrett D'Amore <garrett@nexenta.com> Reivewed by: Dan McDonald <danmcd@nexenta.com> Approved by: Gordon Ross <gwr@nexenta.com> References to Illumos issue and patch: - https://www.illumos.org/issues/883 - https://github.com/illumos/illumos-gate/commit/c9ba2a43cb Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #340	2011-08-01 12:09:11 -07:00
George Wilson	6d974228ef	Illumos #1051 : zfs should handle imbalanced luns Today zfs tries to allocate blocks evenly across all devices. This means when devices are imbalanced zfs will use lots of CPU searching for space on devices which tend to be pretty full. It should instead fail quickly on the full LUNs and move onto devices which have more availability. Reviewed by: Eric Schrock <Eric.Schrock@delphix.com> Reviewed by: Matt Ahrens <Matt.Ahrens@delphix.com> Reviewed by: Adam Leventhal <Adam.Leventhal@delphix.com> Reviewed by: Albert Lee <trisk@nexenta.com> Reviewed by: Gordon Ross <gwr@nexenta.com> Approved by: Garrett D'Amore <garrett@nexenta.com> References to Illumos issue and patch: - https://www.illumos.org/issues/510 - https://github.com/illumos/illumos-gate/commit/5ead3ed965 Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #340	2011-08-01 12:09:11 -07:00
Shampavman	bb939d1085	Illumos #510 : 'zfs get' enhancement - mountpoint as an argument The 'zfs get' command should be able to deal with mountpoint as an argument. It already works with 'zfs list' command: # zfs list /export/home/estibi NAME USED AVAIL REFER MOUNTPOINT rpool/export/home/estibi 1.14G 3.86G 1.14G /export/home/estibi but it fails with 'zfs get': # zfs get all /export/home/estibi cannot open '/export/home/estibi': invalid dataset name Reviewed by: Eric Schrock <eric.schrock@delphix.com> Reviewed by: Deano <deano@rattie.demon.co.uk> Reviewed by: Garrett D'Amore <garrett@nexenta.com> Approved by: Garrett D'Amore <garrett@nexenta.com> References to Illumos issue and patch: - https://www.illumos.org/issues/510 - https://github.com/illumos/illumos-gate/commit/5ead3ed965 Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #340	2011-08-01 12:09:11 -07:00
Kyle Fuller	615ab66d18	Provide a rc.d script for archlinux Unlike most other Linux distributions archlinux installs its init scripts in /etc/rc.d insead of /etc/init.d. This commit provides an archlinux rc.d script for zfs and extends the build infrastructure to ensure it get's installed in the correct place. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #322	2011-07-11 14:12:23 -07:00
Brian Behlendorf	e0f86c9862	Update 'zfs send' documentation The -D and -p options were missing from the manpage. This commit adds documentation for these features. Closes #311	2011-07-08 12:16:09 -07:00
Brian Behlendorf	341b5f1d4c	Update ztest paths Unfortunately, ztest is hard coded to export the zdb utility to be installed in a certain location. When the packaging was updated to install zdb in /sbin/ ztest was broken. To fix this I'm updating ztest to check both common install paths.	2011-07-06 12:30:09 -07:00
Prasad Joshi	5a52105925	Use consistent error message in zpool sub-command The zpool sub-commands like iostat, list, and status should display consistent message when a given pool is unavailable or no pool is present. This change unifies the default behavior as follows: root@prasad:~# ./zpool list 1 2 no pools available no pools available root@prasad:~# ./zpool iostat 1 2 no pools available no pools available root@prasad:~# ./zpool status 1 2 no pools available no pools available root@prasad:~# ./zpool list tan 1 2 cannot open 'tan': no such pool root@prasad:~# ./zpool iostat tan 1 2 cannot open 'tan': no such pool root@prasad:~# ./zpool status tan 1 2 cannot open 'tan': no such pool Reported-by: Rajshree Thorat <rthorat@stec-inc.com> Signed-off-by: Prasad Joshi <pjoshi@stec-inc.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #306	2011-07-04 19:57:01 -07:00
Brian Behlendorf	2cf7f52bc4	Linux compat 2.6.39: mount_nodev() The .get_sb callback has been replaced by a .mount callback in the file_system_type structure. When using the new interface the caller must now use the mount_nodev() helper. Unfortunately, the new interface no longer passes the vfsmount down to the zfs layers. This poses a problem for the existing implementation because we currently save this pointer in the super block for latter use. It provides our only entry point in to the namespace layer for manipulating certain mount options. This needed to be done originally to allow commands like 'zfs set atime=off tank' to work properly. It also allowed me to keep more of the original Solaris code unmodified. Under Solaris there is a 1-to-1 mapping between a mount point and a file system so this is a fairly natural thing to do. However, under Linux they many be multiple entries in the namespace which reference the same filesystem. Thus keeping a back reference from the filesystem to the namespace is complicated. Rather than introduce some ugly hack to get the vfsmount and continue as before. I'm leveraging this API change to update the ZFS code to do things in a more natural way for Linux. This has the upside that is resolves the compatibility issue for the long term and fixes several other minor bugs which have been reported. This commit updates the code to remove this vfsmount back reference entirely. All modifications to filesystem mount options are now passed in to the kernel via a '-o remount'. This is the expected Linux mechanism and allows the namespace to properly handle any options which apply to it before passing them on to the file system itself. Aside from fixing the compatibility issue, removing the vfsmount has had the benefit of simplifying the code. This change which fairly involved has turned out nicely. Closes #246 Closes #217 Closes #187 Closes #248 Closes #231	2011-07-01 13:36:39 -07:00
Brian Behlendorf	5c03efc379	Linux compat 2.6.39: security_inode_init_security() The security_inode_init_security() function now takes an additional qstr argument which must be passed in from the dentry if available. Passing a NULL is safe when no qstr is available the relevant security checks will just be skipped. Closes #246 Closes #217 Closes #187	2011-07-01 12:40:08 -07:00
Prasad Joshi	b312979252	Tear down and flush the mmap region The inode eviction should unmap the pages associated with the inode. These pages should also be flushed to disk to avoid the data loss. Therefore, use truncate_setsize() in evict_inode() to release the pagecache. The API truncate_setsize() was added in 2.6.35 kernel. To ensure compatibility with the old kernel, the patch defines its own truncate_setsize function. Signed-off-by: Prasad Joshi <pjoshi@stec-inc.com> Closes #255	2011-06-27 09:59:19 -07:00
Ned A. Bass	560bcf9d14	Multipath device manageability improvements Update udev helper scripts to deal with device-mapper devices created by multipathd. These enhancements are targeted at a particular storage network topology under evaluation at LLNL consisting of two SAS switches providing redundant connectivity between multiple server nodes and disk enclosures. The key to making these systems manageable is to create shortnames for each disk that conveys its physical location in a drawer. In a direct-attached topology we infer a disk's enclosure from the PCI bus number and HBA port number in the by-path name provided by udev. In a switched topology, however, multiple drawers are accessed via a single HBA port. We therefore resort to assigning drawer identifiers based on which switch port a drive's enclosure is connected to. This information is available from sysfs. Add options to zpool_layout to generate an /etc/zfs/zdev.conf using symbolic links in /dev/disk/by-id of the form <label>-<UUID>-switch-port:<X>-slot:<Y>. <label> is a string that depends on the subsystem that created the link and defaults to "dm-uuid-mpath" (this prefix is used by multipathd). <UUID> is a unique identifier for the disk typically obtained from the scsi_id program, and <X> and <Y> denote the switch port and disk slot numbers, respectively. Add a callout script sas_switch_id for use by multipathd to help create symlinks of the form described above. Update zpool_id and the udev zpool rules file to handle both multipath devices and conventional drives.	2011-06-23 10:46:06 -07:00
Christian Kohlschütter	df30f56639	Add "ashift" property to zpool create Some disks with internal sectors larger than 512 bytes (e.g., 4k) can suffer from bad write performance when ashift is not configured correctly. This is caused by the disk not reporting its actual sector size, but a sector size of 512 bytes. The drive may behave this way for compatibility reasons. For example, the WDC WD20EARS disks are known to exhibit this behavior. When creating a zpool, ZFS takes that wrong sector size and sets the "ashift" property accordingly (to 9: 1<<9=512), whereas it should be set to 12 for 4k sectors (1<<12=4096). This patch allows an adminstrator to manual specify the known correct ashift size at 'zpool create' time. This can significantly improve performance in certain cases. However, it will have an impact on your total pool capacity. See the updated ashift property description in the zpool.8 man page for additional details. Valid values for the ashift property range from 9 to 17 (512B-128KB). Additionally, you may set the ashift to 0 if you wish to auto-detect the sector size based on what the disk reports, this is the default behavior. The most common ashift values are 9 and 12. Example: zpool create -o ashift=12 tank raidz2 sda sdb sdc sdd Closes #280 Original-patch-by: Richard Laager <rlaager@wiktel.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2011-06-17 16:35:49 -07:00
Brian Behlendorf	e130330a87	Handle /etc/mtab -> /proc/mounts symlink Under Fedora 15 /etc/mtab is now a symlink to /proc/mounts by default. When /etc/mtab is a symlink the mount.zfs helper should not update it. There was code in place to handle this case but it used stat() which traverses the link and then issues the stat on /proc/mounts. We need to use lstat() to prevent the link traversal and instead stat /etc/mtab. Closes #270	2011-06-14 16:48:38 -07:00
Brian Behlendorf	2e08aedba4	Always check -Wno-unused-but-set-variable gcc support The previous commit `8a7e1ceefa` wasn't quite right. This check applies to both the user and kernel space build and as such we must make sure it runs regardless of what the --with-config option is set too. For example, if --with-config=kernel then the autoconf test does not run and we generate build warnings when compiling the kernel packages.	2011-06-14 16:40:35 -07:00
Brian Behlendorf	8a7e1ceefa	Check for -Wno-unused-but-set-variable gcc support Gcc versions 4.3.2 and earlier do not support the compiler flag -Wno-unused-but-set-variable. This can lead to build failures on older Linux platforms such as Debian Lenny. Since this is an optional build argument this changes add a new autoconf check for the option. If it is supported by the installed version of gcc then it is used otherwise it is omited. See commit's `12c1acde76` and `79713039a2` for the reason the -Wno-unused-but-set-variable options was originally added.	2011-06-14 14:43:22 -07:00
Brian Behlendorf	4804b739e1	Default to internal 'zfs userspace' implementation We will never bring over the pyzfs.py helper script from Solaris to Linux. Instead the missing functionality will be directly integrated in to the zfs commands and libraries. To avoid confusion remove the warning about the missing pyzfs.py utility and simply use the default internal support. The Illumous developers are of the same mind and have proposed an initial patch to do this which has been integrated in to the 'allow' development branch. After some additional testing this code can be merged in to master as the right long term solution.	2011-05-20 10:25:41 -07:00
Brian Behlendorf	6ee44e32be	Fix awk usage The zpool_id and zpool_layout helper scripts have been updated to use the more common /usr/bin/awk symlink. On Fedora/Redhat systems there are both /bin/awk and /usr/bin/awk symlinks to your installed version of awk. On Debian/Ubuntu systems only the /usr/bin/awk symlink exists. Additionally, add the '\<' token to the beginning of the regex pattern to prevent partial matches. This pattern only appears to work with gawk despite the mawk man page claiming to support this extended regex. Thus you will need to have gawk installed to use these optional helper scripts. A comment has been added to the script to reflect this reality.	2011-05-06 10:16:04 -07:00
Brian Behlendorf	3613204cd7	Allow mounting of read-only snapshots With the addition of the mount helper we accidentally regressed the ability to manually mount snapshots. This commit updates the mount helper to expect the possibility of a ZFS_TYPE_SNAPSHOT. All snapshot will be automatically treated as 'legacy' type mounts so they can be mounted manually.	2011-05-05 10:13:38 -07:00
Brian Behlendorf	df554c148e	Fix 'zfs set volsize=N pool/dataset' This change fixes a kernel panic which would occur when resizing a dataset which was not open. The objset_t stored in the zvol_state_t will be set to NULL when the block device is closed. To avoid this issue we pass the correct objset_t as the third arg. The code has also been updated to correctly notify the kernel when the block device capacity changes. For 2.6.28 and newer kernels the capacity change will be immediately detected. For earlier kernels the capacity change will be detected when the device is next opened. This is a known limitation of older kernels. Online ext3 resize test case passes on 2.6.28+ kernels: $ dd if=/dev/zero of=/tmp/zvol bs=1M count=1 seek=1023 $ zpool create tank /tmp/zvol $ zfs create -V 500M tank/zd0 $ mkfs.ext3 /dev/zd0 $ mkdir /mnt/zd0 $ mount /dev/zd0 /mnt/zd0 $ df -h /mnt/zd0 $ zfs set volsize=800M tank/zd0 $ resize2fs /dev/zd0 $ df -h /mnt/zd0 Original-patch-by: Fajar A. Nugraha <github@fajar.net> Closes #68 Closes #84	2011-05-02 08:54:40 -07:00
Gunnar Beutner	055656d4f4	Implemented NFS export_operations. Implemented the required NFS operations for exporting ZFS datasets using the in-kernel NFS daemon.	2011-04-29 12:36:13 -07:00
Darik Horn	492b8e9e7b	Use gethostid in the Linux convention. Disable the gethostid() override for Solaris behavior because Linux systems implement the POSIX standard in a way that allows a negative result. Mask the gethostid() result to the lower four bytes, like coreutils does in /usr/bin/hostid, to prevent junk bits or sign-extension on systems that have an eight byte long type. This can cause a spurious hostid mismatch that prevents zpool import on 64-bit systems.	2011-04-25 10:36:17 -05:00
Brian Behlendorf	12c1acde76	Set -Wno-unused-but-set-variable globally As of gcc-4.6 the option -Wunused-but-set-variable is enabled by default. While this is a useful warning there are numerous places in the ZFS code when a variable is set and then only checked in an ASSERT(). To avoid having to update every instance of this in the code we now set -Wno-unused-but-set-variable to suppress the warning. Additionally, when building with --enable-debug and -Werror set these warning also become fatal. We can reevaluate the suppression of these error at a later time if it becomes an issue. For now we are basically just reverting to the previous gcc behavior.	2011-04-19 10:44:10 -07:00
Brian Behlendorf	03514b0110	Fix gcc compiler warning, parse_option() When compiling ZFS in user space gcc-4.6.0 correctly identifies the variable 'value' as being set but never used. This generates a warning and a build failure when using --enable-debug. Once again this is correct but I'm reluctant to remove 'value' because we are breaking the string in to name/value pairs. While it is not used now there's a good chance it will be soon and I'd rather not have to reinvent this. To suppress the warning with just as a VERIFY(). This was observed under Fedora 15. cmd/mount_zfs/mount_zfs.c: In function ‘parse_option’: cmd/mount_zfs/mount_zfs.c:112:21: error: variable ‘value’ set but not used [-Werror=unused-but-set-variable]	2011-04-19 09:04:51 -07:00
Manuel Amador (Rudd-O)	8610b52bd4	Added .gitignore for mount.zfs and zvol_id Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2011-04-07 10:28:38 -07:00
Ned Bass	fa417e57a6	Call udevadm trigger more safely Some udev hooks are not designed to be idempotent, so calling udevadm trigger outside of the distribution's initialization scripts can have unexpected (and potentially dangerous) side effects. For example, the system time may change or devices may appear multiple times. See Ubuntu launchpad bug 320200 and this mailing list post for more details: https://lists.ubuntu.com/archives/ubuntu-devel/2009-January/027260.html To avoid these problems we call udevadm trigger with --action=change --subsystem-match=block. The first argument tells udev just to refresh devices, and make sure everything's as it should be. The second argument limits the scope to block devices, so devices belonging to other subsystems cannot be affected. This doesn't fix the problem on older udev implementations that don't provide udevadm but instead have udevtrigger as a standalone program. In this case the above options aren't available so there's no way to call call udevtrigger safely. But we can live with that since this issue only exists in optional test and helper scripts, and most zfs-on-linux users are running newer systems anyways.	2011-04-05 13:00:51 -07:00
Fajar A. Nugraha	a5729f7b22	Fixes to enable zvol symlink creation This commit fixes issue on https://github.com/behlendorf/zfs/issues/#issue/172 Changes: - update BLKZNAME to use _IOR instead of _IO. Kernel 2.6.32 allows read parameters (copy_to_user) with _IO, while newer kernels (tested Archlinux's 2.6.37 kernel) enforces _IOR (which is correct) - fix return code and message on error Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2011-03-24 11:48:18 -07:00
Brian Behlendorf	bdf4328b04	Linux 2.6.28 compat, insert_inode_locked() Added insert_inode_locked() helper function, prior to this most callers used insert_inode_hash(). The older method doesn't check for collisions in the inode_hashtable but it still acceptible for use. Fallback to using insert_inode_hash() when insert_inode_locked() is unavailable.	2011-03-22 12:15:54 -07:00
Brian Behlendorf	f47c42e214	Merge branch 'dracut'	2011-03-22 12:13:04 -07:00
Brian Behlendorf	716895b161	Fix 'LDFLAGS=-Wl,--as-needed' build error Compiling with 'LDFLAGS=-Wl,--as-needed' exposed the fact that there were some library linking problems introduced by mount_zfs. In particular, the libzfs library does use nvpair symbols, and mount_zfs contains no dependencies on libzpool. Closes #161 Closes #162	2011-03-18 14:47:19 -07:00
Brian Behlendorf	ec49a5f0ec	Fix getcwd() warning New versions glibc declare getcwd() with the warn_unused_result attribute. This results in a warning because the updated mount helper was not checking this return value. This issue was fixed by checking the return type and in the case of an error simply returning the passed dataset. One possible, but unlikely, error would be having your cwd directory unlinked while the mount command was running. cmd/mount_zfs/mount_zfs.c: In function ‘parse_dataset’: cmd/mount_zfs/mount_zfs.c:223:2: error: ignoring return value of ‘getcwd’, declared with attribute warn_unused_result	2011-03-18 13:54:49 -07:00
Brian Behlendorf	01c0e61da0	Add init scripts To support automatically mounting your zfs on filesystem on boot a basic init script is needed. Unfortunately, every distribution has their own idea of the _right_ way to do things. Rather than write one very complicated portable init script, which would be invariably replaced by the distributions own anyway. I have instead added support to provide multiple distribution specific init scripts. The correct init script for your distribution will be selected by ZFS_AC_DEFAULT_PACKAGE which will set DEFAULT_INIT_SCRIPT. During 'make install' the correct script for your system will be installed from zfs/etc/init.d/zfs.DEFAULT_INIT_SCRIPT to the usual /etc/init.d/zfs location. Currently, there is zfs.fedora and a more generic zfs.lsb init script. Hopefully, the distribution maintainers who know best how they want their init scripts to function will feedback their approved versions to be included in the project. This change does not consider upstart jobs but I'm not at all opposed to add that sort of thing.	2011-03-17 16:51:54 -07:00
Brian Behlendorf	3aff775555	Strip 'zfsutil,remount' from /etc/mtab When updating /etc/mtab we should be careful and strip certain options. In particular, we need to strip 'zfsutil' because if we don't the mount utility will helpfull provide it to the mount helper when we issue mount(8) again. This subverts the check that the caller is zfs(8) and not mount(8).	2011-03-15 13:33:29 -07:00
Brian Behlendorf	093aa69286	Always allow '-o remount,ro' Allow the mount(8) utility to always operate on all datasets when remounting them read-only. This critical for rc.sysinit/umountroot which remounts the root filesystem read-only during shutdown to ensure everything is correctly flushed to disk. Fix minor typo, the check to set zfsutil should use the bitwise '&'. I must have accidentally hit the adjacent '*' and obviously neither the compiler or my code review caught this. Fix it now.	2011-03-15 13:33:29 -07:00
Brian Behlendorf	a6cba65cca	Check for trailing '/' in mount.zfs When run with a root '/' cwd the mount.zfs helper would strip not only the '/' but also the next character from the dataset name. For example, '/tank' was changed to 'ank' instead of just 'tank'. Originally, this was done for the '/tmp' cwd case where we needed to strip the '/' following the cwd. For example '/tmp/tank' needed to remove the '/tmp' cwd plus 1 character for the '/'. This change fixes the problem by checking the cwd and if it ends in a '/' it does not strip and extra character. Otherwise it will strip the next character. I believe this should only ever be true for the root directory. Closes #148	2011-03-10 12:58:44 -08:00
Brian Behlendorf	d53368f675	Fix mount helper Several issues related to strange mount/umount behavior were reported and this commit should address most of them. The original idea was to put in place a zfs mount helper (mount.zfs). This helper is used to enforce 'legacy' mount behavior, and perform any extra mount argument processing (selinux, zfsutil, etc). This helper wasn't ready for the 0.6.0-rc1 release but with this change it's functional but needs to extensively tested. This change addresses the following open issues. Closes #101 Closes #107 Closes #113 Closes #115 Closes #119	2011-03-09 15:26:48 -08:00
Brian Behlendorf	321a498b95	Add xvattr support With the removal of the minimal xvattr support from the spl this support needs to be replaced in the zfs package. This is fairly easily accomplished by directly adding portions of the sys/vnode.h header from OpenSolaris. These xvattr additions have been placed in the sys/xvattr.h header file and included as needed where simply a sys/vnode.h was included before. In additon to the xvattr types and helper macros two functions were also included. The xva_init() and xva_getxoptattr() functions were included as static inline functions in xvattr.h. They are simple enough and it was simpler to place them here rather than in their own .c file.	2011-03-02 11:43:50 -08:00
Fajar A. Nugraha	4c0d8e50b9	Use udev to create /dev/zvol/[dataset_name] links This commit allows zvols with names longer than 32 characters, which fixes issue on https://github.com/behlendorf/zfs/issues/#issue/102. Changes include: - use /dev/zd* device names for zvol, where * is the device minor (include/sys/fs/zfs.h, module/zfs/zvol.c). - add BLKZNAME ioctl to get dataset name from userland (include/sys/fs/zfs.h, module/zfs/zvol.c, cmd/zvol_id). - add udev rule to create /dev/zvol/[dataset_name] and the legacy /dev/[dataset_name] symlink. For partitions on zvol, it will create /dev/zvol/[dataset_name]-part* (etc/udev/rules.d/60-zvol.rules, cmd/zvol_id). Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2011-02-25 09:43:19 -08:00
Brian Behlendorf	718d77f622	Fix uninitialized variable It was possible for rc to be unitialized in the parse_options() function which triggered a compiler warning. Ensure rc is always initialized.	2011-02-23 12:57:25 -08:00
Brian Behlendorf	45066d1f20	Linux 2.6.38 compat, blkdev_get_by_path() The open_bdev_exclusive() function has been replaced (again) by the more generic blkdev_get_by_path() function. Additionally, the counterpart function close_bdev_exclusive() has been replaced by blkdev_put(). Because these functions are more generic versions of the functions they replaced the compatibility macro must add the FMODE_EXCL mask to ensure they are exclusive. Closes #114	2011-02-23 12:29:38 -08:00
Brian Behlendorf	2c395def27	Linux 2.6.36 compat, sops->evict_inode() The new prefered inteface for evicting an inode from the inode cache is the ->evict_inode() callback. It replaces both the ->delete_inode() and ->clear_inode() callbacks which were previously used for this.	2011-02-11 13:47:51 -08:00
Brian Behlendorf	7268e1bec8	Linux 2.6.35 compat, fops->fsync() The fsync() callback in the file_operations structure used to take 3 arguments. The callback now only takes 2 arguments because the dentry argument was determined to be unused by all consumers. To handle this a compatibility prototype was added to ensure the right prototype is used. Our implementation never used the dentry argument either so it's just a matter of using the right prototype.	2011-02-11 09:05:51 -08:00
Brian Behlendorf	777d4af891	Linux 2.6.35 compat, const struct xattr_handler The const keyword was added to the 'struct xattr_handler' in the generic Linux super_block structure. To handle this we define an appropriate xattr_handler_t typedef which can be used. This was the preferred solution because it keeps the code clean and readable.	2011-02-10 16:29:00 -08:00
Brian Behlendorf	afffb5cd10	MS_DIRSYNC and MS_REC compat It turns out that older versions of the glibc headers do not properly define MS_DIRSYNC despite it being explicitly mentioned in the man pages. They instead call it S_WRITE, so for system where this is not correct defined map MS_DIRSYNC to S_WRITE. At the time of this commit both Ubuntu Lucid, and Debian Squeeze both use the out of date glibc headers. As for MS_REC this field is also not available in the older headers. Since there is no obvious mapping in this case we simply disable the recursive mount option which used it.	2011-02-10 12:14:57 -08:00
Brian Behlendorf	1ac0ea38a5	Add missing -ldl linker option The inclusion on dlsym(), dlopen(), and dlclose() symbols require us to link against the dl library. Be careful to add the flag to both the libzfs library and the commands which depend on the library.	2011-02-10 11:05:44 -08:00
Brian Behlendorf	b4ead57cfb	Remove HAVE_ZPL from commands and libraries Thanks to the previous few commits we can now build all of the user space commands and libraries with support for the zpl.	2011-02-04 16:14:34 -08:00
Brian Behlendorf	9a616b5d17	Documentation updates Minor Linux specific documentation updates to the comments and man pages.	2011-02-04 16:14:34 -08:00
Brian Behlendorf	c5d915f423	Minimal libshare infrastructure ZFS even under Solaris does not strictly require libshare to be available. The current implementation attempts to dlopen() the library to access the needed symbols. If this fails libshare support is simply disabled. This means that on Linux we only need the most minimal libshare implementation. In fact just enough to prevent the build from failing. Longer term we can decide if we want to implement a libshare library like Solaris. At best this would be an abstraction layer between ZFS and NFS/SMB. Alternately, we can drop libshare entirely and directly integrate ZFS with Linux's NFS/SMB. Finally the bare bones user-libshare.m4 test was dropped. If we do decide to implement libshare at some point it will surely be as part of this package so the check is not needed.	2011-02-04 16:14:29 -08:00
Brian Behlendorf	3fb1fcdea1	Add 'zfs mount' support By design the zfs utility is supposed to handle mounting and unmounting a zfs filesystem. We could allow zfs to do this directly. There are system calls available to mount/umount a filesystem. And there are library calls available to manipulate /etc/mtab. But there are a couple very good reasons not to take this appraoch... for now. Instead of directly calling the system and library calls to (u)mount the filesystem we fork and exec a (u)mount process. The principle reason for this is to delegate the responsibility for locking and updating /etc/mtab to (u)mount(8). This ensures maximum portability and ensures the right locking scheme for your version of (u)mount will be used. If we didn't do this we would have to resort to an autoconf test to determine what locking mechanism is used. The downside to using mount(8) instead of mount(2) is that we lose the exact errno which was returned by the kernel. The return code from mount(8) provides some insight in to what went wrong but it not quite as good. For the moment this is translated as a best guess in to a errno for the higher layers of zfs. In the long term a shared library called libmount is under development which provides a common API to address the locking and errno issues. Once the standard mount utility has been updated to use this library we can then leverage it. Until then this is the only safe solution. http://www.kernel.org/pub/linux/utils/util-linux/libmount-docs/index.html	2011-02-04 16:11:58 -08:00
Brian Behlendorf	95c4cae39f	Disable umount.zfs helper For the moment, the only advantage in registering a umount helper would be to automatically unshare a zfs filesystem. Since under Linux this would be unexpected (but nice) behavior there is no harm in disabling it. This is desirable because the 'zfs unmount' path invokes the system umount. This is done to ensure correct mtab locking but has the side effect that the umount.zfs helper would be called if it exists. By default this helper calls back in to zfs to do the unmount on Solaris which we don't want under Linux. Once libmount is available and we have a safe way to correctly lock and update the /etc/mtab file we can reconsider the need for a umount helper. Using libmount is the prefered solution.	2011-01-28 12:47:57 -08:00
Brian Behlendorf	3b8cfee8af	Enable mount.zfs helper While not strictly required to mount a zfs filesystem using a mount helper has certain advantages. First, we need it if we want to honor the mount behavior as found on Solaris. As part of the mount we need to validate that the dataset has the legacy mount property set if we are using 'mount' instead of 'zfs mount'. Secondly, by using a mount helper we can automatically load the zpl kernel module. This way you can just issue a 'mount' or 'zfs mount' and it will just work. Finally, it gives us common hook in user space to add any zfs specific mount options we might want. At the moment we don't have any but now the infrastructure is at least in place.	2011-01-28 12:47:57 -08:00
Brian Behlendorf	b3259b6a2b	Autoconf selinux support If libselinux is detected on your system at configure time link against it. This allows us to use a library call to detect if selinux is enabled and if it is to pass the mount option: "context=\"system_u:object_r:file_t:s0" For now this is required because none of the existing selinux policies are aware of the zfs filesystem type. Because of this they do not properly enable xattr based labeling even though zfs supports all of the required hooks. Until distro's add zfs as a known xattr friendly fs type we must use mntpoint labeling. Alternately, end users could modify their existing selinux policy with a little guidance.	2011-01-28 12:45:19 -08:00
Brian Behlendorf	149e873ab1	Fix minor compiler warnings These compiler warnings were introduced when code which was previously #ifdef'ed out by HAVE_ZPL was re-added for use by the posix layer. All of the following changes should be obviously correct and will cause no semantic changes.	2011-01-06 15:04:28 -08:00
Ricardo M. Correia	8d4e8140ef	Fix block device-related issues in zdb. Specifically, this fixes the two following errors in zdb when a pool is composed of block devices: 1) 'Value too large for defined data type' when running 'zdb <dataset>'. 2) 'character device required' when running 'zdb -l <block-device>'. Signed-off-by: Ricardo M. Correia <ricardo.correia@oracle.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2010-12-14 09:52:46 -08:00
Ned Bass	3ee56c292b	Make rollbacks fail gracefully Support for rolling back datasets require a functional ZPL, which we currently do not have. The zfs command does not check for ZPL support before attempting a rollback, and in preparation for rolling back a zvol it removes the minor node of the device. To prevent the zvol device node from disappearing after a failed rollback operation, this change wraps the zfs_do_rollback() function in an #ifdef HAVE_ZPL and returns ENOSYS in the absence of a ZPL. This is consistent with the behavior of other ZPL dependent commands such as mount. The orginal error message observed with this bug was rather confusing: internal error: Unknown error 524 Aborted This was because zfs_ioc_rollback() returns ENOTSUP if we don't HAVE_ZPL, but Linux actually has no such error code. It should instead return EOPNOTSUPP, as that is how ENOTSUP is defined in user space. With that we would have gotten the somewhat more helpful message cannot rollback 'tank/fish': unsupported version This is rather a moot point with the above changes since we will no longer make that ioctl call without a ZPL. But, this change updates the error code just in case. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2010-11-08 14:03:36 -08:00
Ned Bass	d877ac6bfe	Fix intermittent 'zpool add' failures Creating whole-disk vdevs can intermittently fail if a udev-managed symlink to the disk partition is already in place. To avoid this, we now remove any such symlink before partitioning the disk. This makes zpool_label_disk_wait() truly wait for the new link to show up instead of returning if it finds an old link still in place. Otherwise there is a window between when udev deletes and recreates the link during which access attempts will fail with ENOENT. Also, clean up a comment about waiting for udev to create symlinks. It no longer needs to describe the special cases for the link names, since that is now handled in a separate helper function. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2010-10-22 12:38:58 -07:00
Ned Bass	4682b8c14e	Remove solaris-specific code from make_leaf_vdev() Portability between Solaris and Linux isn't really an issue for us anymore, and removing sections like this one helps simplify the code. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2010-10-22 12:25:58 -07:00
Ned Bass	79e7242a91	Add helper functions for manipulating device names This change adds two helper functions for working with vdev names and paths. zfs_resolve_shortname() resolves a shorthand vdev name to an absolute path of a file in /dev, /dev/disk/by-id, /dev/disk/by-label, /dev/disk/by-path, /dev/disk/by-uuid, /dev/disk/zpool. This was previously done only in the function is_shorthand_path(), but we need a general helper function to implement shorthand names for additional zpool subcommands like remove. is_shorthand_path() is accordingly updated to call the helper function. There is a minor change in the way zfs_resolve_shortname() tests if a file exists. is_shorthand_path() effectively used open() and stat64() to test for file existence, since its scope includes testing if a device is a whole disk and collecting file status information. zfs_resolve_shortname(), on the other hand, only uses access() to test for existence and leaves it to the caller to perform any additional file operations. This seemed like the most general and lightweight approach, and still preserves the semantics of is_shorthand_path(). zfs_append_partition() appends a partition suffix to a device path. This should be used to generate the name of a whole disk as it is stored in the vdev label. The user-visible names of whole disks do not contain the partition information, while the name in the vdev label does. The code was lifted from the function make_disks(), which now just calls the helper function. Again, having a helper function to do this supports general handling of shorthand names in the user interface. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2010-10-22 12:25:30 -07:00
Brian Behlendorf	2959d94a0a	Add FAILFAST support ZFS works best when it is notified as soon as possible when a device failure occurs. This allows it to immediately start any recovery actions which may be needed. In theory Linux supports a flag which can be set on bio's called FAILFAST which provides this quick notification by disabling the retry logic in the lower scsi layers. That's the theory at least. In practice is turns out that while the flag exists you oddly have to set it with the BIO_RW_AHEAD flag. And even when it's set it you may get retries in the low level drivers decides that's the right behavior, or if you don't get the right error codes reported to the scsi midlayer. Unfortunately, without additional kernels patchs there's not much which can be done to improve this. Basically, this just means that it may take 2-3 minutes before a ZFS is notified properly that a device has failed. This can be improved and I suspect I'll be submitting patches upstream to handle this.	2010-10-12 14:55:02 -07:00
Brian Behlendorf	c5343ba71b	Fix 'zpool events' formatting for awk To make the 'zpool events' output simple to parse with awk the extra newline after embedded nvlists has been dropped. This allows the entire event to be parsed as a single whitespace seperated record. The -H option has been added to operate in scripted mode. For the 'zpool events' command this means don't print the header. The usage of -H is consistent with scripted mode for other zpool commands.	2010-10-12 14:55:01 -07:00
Ned Bass	5c1bad0013	Fix undersized buffer in is_shorthand_path() The string array 'char dirs[5][8]' was too small to accomodate the terminating NUL character in "by-label". This change adds the needed additional byte. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2010-10-12 14:47:39 -07:00
Brian Behlendorf	a5b4d63582	Add [-m map] option to zpool_layout By default the zpool_layout command would always use the slot number assigned by Linux when generating the zdev.conf file. This is a reasonable default there are cases when it makes sense to remap the slot id assigned by Linux using your own custom mapping. This commit adds support to zpool_layout to provide a custom slot mapping file. The file contains in the first column the Linux slot it and in the second column the custom slot mapping. By passing this map file with '-m map' to zpool_config the mapping will be applied when generating zdev.conf. Additionally, two sample mapping have been added which reflect different ways to map the slots in the dragon drawers.	2010-09-17 11:02:19 -07:00
Brian Behlendorf	6283f55ea1	Support custom build directories and move includes One of the neat tricks an autoconf style project is capable of is allow configurion/building in a directory other than the source directory. The major advantage to this is that you can build the project various different ways while making changes in a single source tree. For example, this project is designed to work on various different Linux distributions each of which work slightly differently. This means that changes need to verified on each of those supported distributions perferably before the change is committed to the public git repo. Using nfs and custom build directories makes this much easier. I now have a single source tree in nfs mounted on several different systems each running a supported distribution. When I make a change to the source base I suspect may break things I can concurrently build from the same source on all the systems each in their own subdirectory. wget -c http://github.com/downloads/behlendorf/zfs/zfs-x.y.z.tar.gz tar -xzf zfs-x.y.z.tar.gz cd zfs-x-y-z ------------------------- run concurrently ---------------------- <ubuntu system> <fedora system> <debian system> <rhel6 system> mkdir ubuntu mkdir fedora mkdir debian mkdir rhel6 cd ubuntu cd fedora cd debian cd rhel6 ../configure ../configure ../configure ../configure make make make make make check make check make check make check This change also moves many of the include headers from individual incude/sys directories under the modules directory in to a single top level include directory. This has the advantage of making the build rules cleaner and logically it makes a bit more sense.	2010-09-08 12:38:56 -07:00
Brian Behlendorf	e70e591c51	Add initial autoconf products Add the initial products from autogen.sh. These products will be updated incrementally after this point as development occurs. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2010-08-31 13:42:02 -07:00
Brian Behlendorf	0e8d1b2d8b	Add linux ztest support Minor changes to ztest for this environment. These including updating ztest to run in the local development tree, as well as relocating some local variables in this function to the heap. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2010-08-31 13:42:02 -07:00
Brian Behlendorf	302ef1517e	Add linux zpios support Linux kernel implementation of PIOS test app. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2010-08-31 13:42:01 -07:00
Brian Behlendorf	9b020fd97a	Add linux user util support This topic branch contains required changes to the user space utilities to allow them to integrate cleanly with Linux. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2010-08-31 13:42:01 -07:00
Brian Behlendorf	d603ed6c27	Add linux user disk support This topic branch contains all the changes needed to integrate the user side zfs tools with Linux style devices. Primarily this includes fixing up the Solaris libefi library to be Linux friendly, and integrating with the libblkid library which is provided by e2fsprogs. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2010-08-31 13:42:00 -07:00
Brian Behlendorf	266852767f	Add linux events This topic branch leverages the Solaris style FMA call points in ZFS to create a user space visible event notification system under Linux. This new system is called zevent and it unifies all previous Solaris style ereports and sysevent notifications. Under this Linux specific scheme when a sysevent or ereport event occurs an nvlist describing the event is created which looks almost exactly like a Solaris ereport. These events are queued up in the kernel when they occur and conditionally logged to the console. It is then up to a user space application to consume the events and do whatever it likes with them. To make this possible the existing /dev/zfs ABI has been extended with two new ioctls which behave as follows. * ZFS_IOC_EVENTS_NEXT Get the next pending event. The kernel will keep track of the last event consumed by the file descriptor and provide the next one if available. If no new events are available the ioctl() will block waiting for the next event. This ioctl may also be called in a non-blocking mode by setting zc.zc_guid = ZEVENT_NONBLOCK. In the non-blocking case if no events are available ENOENT will be returned. It is possible that ESHUTDOWN will be returned if the ioctl() is called while module unloading is in progress. And finally ENOMEM may occur if the provided nvlist buffer is not large enough to contain the entire event. * ZFS_IOC_EVENTS_CLEAR Clear are events queued by the kernel. The kernel will keep a fairly large number of recent events queued, use this ioctl to clear the in kernel list. This will effect all user space processes consuming events. The zpool command has been extended to use this events ABI with the 'events' subcommand. You may run 'zpool events -v' to output a verbose log of all recent events. This is very similar to the Solaris 'fmdump -ev' command with the key difference being it also includes what would be considered sysevents under Solaris. You may also run in follow mode with the '-f' option. To clear the in kernel event queue use the '-c' option. $ sudo cmd/zpool/zpool events -fv TIME CLASS May 13 2010 16:31:15.777711000 ereport.fs.zfs.config.sync class = "ereport.fs.zfs.config.sync" ena = 0x40982b7897700001 detector = (embedded nvlist) version = 0x0 scheme = "zfs" pool = 0xed976600de75dfa6 (end detector) time = 0x4bec8bc3 0x2e5aed98 pool = "zpios" pool_guid = 0xed976600de75dfa6 pool_context = 0x0 While the 'zpool events' command is handy for interactive debugging it is not expected to be the primary consumer of zevents. This ABI was primarily added to facilitate the addition of a user space monitoring daemon. This daemon would consume all events posted by the kernel and based on the type of event perform an action. For most events simply forwarding them on to syslog is likely enough. But this interface also cleanly allows for more sophisticated actions to be taken such as generating an email for a failed drive. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2010-08-31 13:41:36 -07:00
Brian Behlendorf	c9c0d073da	Add build system Add autoconf style build infrastructure to the ZFS tree. This includes autogen.sh, configure.ac, m4 macros, some scripts/*, and makefiles for all the core ZFS components.	2010-08-31 13:41:27 -07:00
Brian Behlendorf	40b84e7aec	Fix stack ztest While ztest does run in user space we run it with the same stack restrictions it would have in kernel space. This ensures that any stack related issues which would be hit in the kernel can be caught and debugged in user space instead. This patch is a first pass to limit the stack usage of every ztest function to 1024 bytes. Subsequent updates can further reduce this. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2010-08-31 08:38:50 -07:00
Brian Behlendorf	1e33ac1e26	Fix Solaris thread dependency by using pthreads This is a portability change which removes the dependence of the Solaris thread library. All locations where Solaris thread API was used before have been replaced with equivilant Solaris kernel style thread calls. In user space the kernel style threading API is implemented in term of the portable pthreads library. This includes all threads, mutexs, condition variables, reader/writer locks, and taskqs. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2010-08-31 08:38:47 -07:00
Brian Behlendorf	235db0acea	Fix deadcode Remove deadcode. It's possible the code should be in use somewhere, but as the source code is laid out it currently is not. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2010-08-31 08:38:44 -07:00
Ricardo M. Correia	090ff0929e	Fix commit callbacks The upstream commit cb code had a few bugs: 1) The arguments of the list_move_tail() call in txg_dispatch_callbacks() were reversed by mistake. This caused the commit callbacks to not be called at all. 2) ztest had a bug in ztest_dmu_commit_callbacks() where "error" was not initialized correctly. This seems to have caused the test to always take the simulated error code path, which made ztest unable to detect whether commit cbs were being called for transactions that successfuly complete. 3) ztest had another bug in ztest_dmu_commit_callbacks() where the commit cb threshold was not being compared correctly. 4) The commit cb taskq was using 'max_ncpus * 2' as the maxalloc argument of taskq_create(), which could have caused unnecessary delays in the txg sync thread. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2010-08-31 08:38:44 -07:00
Brian Behlendorf	d4ed667343	Fix gcc uninitialized variable warnings Gcc -Wall warn: 'uninitialized variable' Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2010-08-31 08:38:43 -07:00
Brian Behlendorf	1fde1e3720	Fix gcc unused variable warnings Gcc -Wall warn: 'unused variable' Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2010-08-31 08:38:43 -07:00
Brian Behlendorf	c65aa5b2b9	Fix gcc missing parenthesis warnings Gcc -Wall warn: 'missing parenthesis' Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2010-08-31 08:38:35 -07:00
Brian Behlendorf	e75c13c353	Fix gcc missing case warnings Gcc ASSERT() missing cases are impossible Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2010-08-27 15:34:03 -07:00
Brian Behlendorf	2598c0012d	Fix gcc missing braces warnings Resolve compiler warnings concerning missing braces. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2010-08-27 15:34:03 -07:00
Brian Behlendorf	0bc8fd7884	Fix gcc invalid prototype warnings Gcc -Wall warn: 'invalid prototype' Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2010-08-27 15:34:03 -07:00
Ricardo M. Correia	e5dc681a50	Fix gcc ident pragma warnings Remove all ident pragmas which are unknown to gcc. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2010-08-27 15:34:02 -07:00
Brian Behlendorf	0e5b68e015	Fix gcc fortify source warnings Resolve issues uncovered by -D_FORTIFY_SOURCE=2, the default redhat macro's file adds this option to the cflags. This causes warnings of the following type designed to keep the developer honest: warning: ignoring return value of 'foo', declared with attribute warn_unused_result The short term fix is to wrap these calls in VERIFY() to check the return code. The code was already assusing these would never fail. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2010-08-27 15:34:02 -07:00
Brian Behlendorf	b8864a233c	Fix gcc cast warnings Gcc -Wall warn: 'lacks a cast' Gcc -Wall warn: 'comparison between pointer and integer' Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2010-08-27 15:33:32 -07:00
Brian Behlendorf	d6320ddb78	Fix gcc c90 compliance warnings Fix non-c90 compliant code, for the most part these changes simply deal with where a particular variable is declared. Under c90 it must alway be done at the very start of a block. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2010-08-27 15:28:32 -07:00
Ricardo M. Correia	c5b3a7bbcc	Fix gcc 64-bit constant warnings Add 'ull' suffix to 64-bit constants. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2010-08-26 15:18:01 -07:00
Brian Behlendorf	572e285762	Update to onnv_147 This is the last official OpenSolaris tag before the public development tree was closed.	2010-08-26 14:24:34 -07:00
Brian Behlendorf	428870ff73	Update core ZFS code from build 121 to build 141.	2010-05-28 13:45:14 -07:00
Brian Behlendorf	4cd8e49a69	Add .gitignore files to exclude build products	2010-01-08 11:35:17 -08:00
Brian Behlendorf	45d1cae3b8	Rebase master to b121	2009-08-18 11:43:27 -07:00
Brian Behlendorf	9babb37438	Rebase master to b117	2009-07-02 15:44:48 -07:00
Brian Behlendorf	d164b20935	Rebase master to b108	2009-02-18 12:51:31 -08:00
Brian Behlendorf	fb5f0bc833	Rebase master to b105	2009-01-15 13:59:39 -08:00
Brian Behlendorf	36b849fa51	Remove zdump, it's an unrelateds command which I added simply due to the z* command convention	2009-01-05 11:10:13 -08:00
Brian Behlendorf	172bb4bd5e	Move the world out of /zfs/ and seperate out module build tree	2008-12-11 11:08:09 -08:00

... 8 9 10 11 12 ...

679 Commits