mirror_zfs

mirror of https://git.proxmox.com/git/mirror_zfs.git synced 2026-03-22 08:51:30 +03:00

Author	SHA1	Message	Date
Brian Behlendorf	6d1d976b2c	Modify vdev_elevator_switch() to use elevator_change() As of Linux 2.6.36 an elevator_change() interface was added. This commit updates vdev_elevator_switch() to use this interface when available, otherwise it falls back to the usermodehelper method. Original-patch-by: foobarz <sysop@xeon.(none)> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #906	2012-10-03 13:31:44 -07:00
Etienne Dechamps	2f342404c1	Force 4K blocksize when testing ext2 on zvol. Currently, mkfs.ext2 on zconfig.sh zvols tries to use an 8K blocksize, probably because by default zvol exposes an optimal I/O size of 8K. Unfortunately, an ext2 blocksize of 8K is not supported by the kernel, so the resulting filesystem is unmountable. This patch fixes the issue by making sure the blocksize is 4K. We have to use -F to force it else mkfs.ext2 won't allow us to use a blocksize smaller than the optimal I/O size. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #979	2012-10-03 10:52:51 -07:00
Cyril Plisko	393b44c711	Implement .commit_metadata hook for NFS export In order to implement synchronous NFS metadata semantics ZFS needs to provide the .commit_metadata hook. All it takes there is to make sure changes are committed to ZIL. Fortunately zfs_fsync() does just that, so simply calling it from zpl_commit_metadata() does the trick. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #969	2012-10-03 10:49:45 -07:00
Chris Wedgwood	23a61ccc1b	zvol_probe should return NULL when the device isn't found. Previously we returned ERR_PTR(-ENOENT) which the rest of the kernel doesn't expect and as such we can oops. Signed-off-by: Chris Wedgwood <cw@f00f.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #949 Closes #931 Closes #789 Closes #743 Closes #730	2012-10-03 10:39:12 -07:00
Bill Pijewski	37abac6d55	Illumos #2703 : add mechanism to report ZFS send progress Reviewed by: Matt Ahrens <matt@delphix.com> Reviewed by: Robert Mustacchi <rm@joyent.com> Reviewed by: Richard Lowe <richlowe@richlowe.net> Approved by: Eric Schrock <Eric.Schrock@delphix.com> References: https://www.illumos.org/issues/2703 Ported by: Martin Matuska <martin@matuska.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-09-19 13:39:06 -07:00
Chris Siden	1bd201e70d	Illumos #1948 : zpool list should show more detailed pool info Reviewed by: Adam Leventhal <ahl@delphix.com> Reviewed by: Matt Ahrens <matt@delphix.com> Reviewed by: Eric Schrock <eric.schrock@delphix.com> Reviewed by: Richard Lowe <richlowe@richlowe.net> Reviewed by: Albert Lee <trisk@nexenta.com> Reviewed by: Dan McDonald <danmcd@nexenta.com> Reviewed by: Garrett D'Amore <garrett@damore.org> Approved by: Eric Schrock <eric.schrock@delphix.com> References: https://www.illumos.org/issues/1948 Ported by: Martin Matuska <martin@matuska.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #685	2012-09-19 13:39:05 -07:00
Brian Behlendorf	95fd8c9a7f	Switch KM_SLEEP to KM_PUSHPAGE This warning indicates the incorrect use of KM_SLEEP in a call path which must use KM_PUSHPAGE to avoid deadlocking in direct reclaim. See commit `b8d06fca08` for additional details. SPL: Fixing allocation for task txg_sync (6093) which used GFP flags 0x297bda7c with PF_NOFS set Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #973	2012-09-19 11:52:36 -07:00
Brian Behlendorf	0a2f7b3662	Seg fault 'zpool import -d /dev/disk/by-id -a' Introduced by commit `44867b6d6e`. We should of course check to ensure best isn't NULL before attempting to dereference it. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #974	2012-09-18 12:33:37 -07:00
Brian Behlendorf	211204bed3	zfs-0.6.0-rc11	2012-09-18 11:30:24 -07:00
Brian Behlendorf	a6c6839a88	SPL 0.6.0-rc11	2012-09-18 11:28:57 -07:00
Richard Lowe	dd4769adc0	Illumos #2088 zdb could use a reasonable manual page Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com> Reviewed by: Garrett D'Amore <garrett@damore.org> Reviewed by: George Wilson <gwilson@zfsmail.com> Reviewed by: Steve Gonczi <gonczi@comcast.net> Reviewed by: Richard Elling <richard.elling@richardelling.com> Approved by: Garrett D'Amore <garrett@damore.org> References: https://www.illumos.org/issues/2088 Ported by: Cyril Plisko <cyril.plisko@mountall.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #682	2012-09-18 09:09:13 -07:00
Brian Behlendorf	44867b6d6e	Improve `zpool import` search behavior The goal of this change is to make 'zpool import' prefer to use the persistent /dev/mapper or /dev/disk/by-* paths. These are far preferable to the devices in /dev/ whose names are not persistent and are determined by the order in which a device is detected. This patch improves things by changing the default search path from just the top level /dev/ directory to (in order): /dev/disk/by-vdev - Custom rules, use first if they exist /dev/disk/zpool - Custom rules, use first if they exist /dev/mapper - Use multipath devices before components /dev/disk/by-uuid - Single unique entry and persistent /dev/disk/by-id - May be multiple entries and persistent /dev/disk/by-path - Encodes physical location and persistent /dev/disk/by-label - Custom persistent labels /dev - UNSAFE device names will change The default search path can be overridden by setting the ZPOOL_IMPORT_PATH environment variable. This must be a colon delimited list of paths which are searched for vdevs. If the 'zpool import -d' option is specified only those listed paths will be searched. Finally, when multiple paths to the same device are found, if one of the paths is an exact match for the path used last time to import the pool it will be used. When there are no exact matches the preferred path will be determined by the provided search order. This means you can still import a pool and force specific names by providing the -d <path> option. And the preferred names will persist as long as those paths exist on your system. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #965	2012-09-17 13:49:07 -07:00
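The search-order rule in the commit above can be illustrated with a rough sketch. This is not the code from the commit: `dir_rank()` and `search_dirs` are hypothetical names, and the sketch assumes a candidate path is ranked by the position of its parent directory in the default list (lower is preferred).

```c
#include <stddef.h>
#include <string.h>

/* Default search order as listed in the commit message. */
static const char *const search_dirs[] = {
	"/dev/disk/by-vdev",
	"/dev/disk/zpool",
	"/dev/mapper",
	"/dev/disk/by-uuid",
	"/dev/disk/by-id",
	"/dev/disk/by-path",
	"/dev/disk/by-label",
	"/dev",
};

#define	NDIRS	(sizeof (search_dirs) / sizeof (search_dirs[0]))

/*
 * Hypothetical helper: return the position of path's parent directory
 * in search_dirs (lower is preferred), or NDIRS if it is not listed.
 */
static size_t
dir_rank(const char *path)
{
	size_t i;

	for (i = 0; i < NDIRS; i++) {
		size_t n = strlen(search_dirs[i]);

		/* Match only when search_dirs[i] is the direct parent. */
		if (strncmp(path, search_dirs[i], n) == 0 &&
		    path[n] == '/' && path[n + 1] != '\0' &&
		    strchr(path + n + 1, '/') == NULL)
			return (i);
	}
	return (NDIRS);
}
```

Under these assumptions `/dev/mapper/mpatha` ranks ahead of `/dev/sda`. Per the commit message, an exact match with the path used for the previous import would be checked before falling back to this ranking.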
Brian Behlendorf	ba367276d8	Switch KM_SLEEP to KM_PUSHPAGE This warning indicates the incorrect use of KM_SLEEP in a call path which must use KM_PUSHPAGE to avoid deadlocking in direct reclaim. See commit `b8d06fca08` for additional details. SPL: Fixing allocation for task txg_sync (6093) which used GFP flags 0x297bda7c with PF_NOFS set Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #917	2012-09-17 11:22:23 -07:00
Cyril Plisko	49d39798f2	ZFS replay transaction error 5 When zfs_replay_write() replays TX_WRITE records from ZIL it calls zpl_write_common() to perform the actual write. zpl_write_common() returns the number of bytes written (similar to write() system call) or a (negative) error. However, the code expects the positive return value to be a residual counter. Thus when zpl_write_common() successfully completes it is mistakenly considered to be a partial write and the error code is delivered further. At this point the ZIL processing is aborted with the famous "ZFS replay transaction error 5" error message given to the message buffer. The fix is to compare the zpl_write_common() return value with the buffer size and flag an error only when they disagree. Signed-off-by: Cyril Plisko <cyril.plisko@mountall.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #933	2012-09-17 11:06:58 -07:00
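The return-value check described in the commit above might look roughly like the following. This is a sketch only: `replay_write_status()` is a hypothetical helper, not a function from the ZFS source, and it assumes the write routine returns either a byte count or a negative errno value.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical helper: turn a write()-style return value into an error
 * code. A negative value is a negated errno; a byte count that differs
 * from the requested length is treated as an I/O error.
 */
static int
replay_write_status(ssize_t written, size_t length)
{
	if (written < 0)
		return ((int)-written);
	if ((size_t)written != length)
		return (EIO);
	return (0);
}
```

With this shape, a complete write of `length` bytes yields 0 instead of being mistaken for a residual count.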
Brian Behlendorf	8312c6df55	Clear PG_writeback for sync I/O error case Commit `2b2861362f` accidentally introduced this issue by only conditionally registering the commit callback in the async case. The error handling code for the dmu_tx_assign() failure case relied on there always being a registered commit callback to clear the PG_writeback bit. Since that is no longer strictly true for the synchronous case we must explicitly invoke the callback. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #961	2012-09-14 15:53:47 -07:00
Cyril Plisko	8e8e7f35b7	Fix zdb printf format string for ZIL data blocks Without this fix the zdb printouts of ZIL data blocks look full of FF due to printf() handling its arguments as int by default. Here is the output before the fix TX_WRITE len 4136, txg 1093817, seq 149231 foid 4242, offset 0, length f68 G FFFFFF8EFFFFFF87FFFFFF91FFFFFFCC 1c FFFFFFAFFFFFFFC9FFFFFFBAZ FFFFFFC3 And the same after the fix TX_WRITE len 4136, txg 1093817, seq 149231 foid 4242, offset 0, length f68 G 8E8791CC 1cAFC9BAZ C3 Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #962	2012-09-13 09:02:12 -07:00
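The "full of FF" output in the commit above is what sign extension produces when a signed `char` with the high bit set is promoted to `int` for a variadic call. A minimal sketch of the usual remedy follows. `hex_bytes()` is a hypothetical helper, not the zdb code, and the actual fix in the commit may differ.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format len bytes from buf as uppercase hex into out (NUL-terminated).
 * out must hold at least 2 * len + 1 bytes. Each byte is converted to
 * unsigned char first, so values >= 0x80 are not sign-extended when
 * they are promoted for the variadic sprintf() call.
 */
static void
hex_bytes(const char *buf, size_t len, char *out)
{
	size_t i;

	for (i = 0; i < len; i++)
		sprintf(out + 2 * i, "%02X", (unsigned char)buf[i]);
	out[2 * len] = '\0';
}
```

Without the cast, on platforms where `char` is signed, a byte such as 0x8E is passed as a negative `int` and `%X` renders it as `FFFFFF8E`.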
Brian Behlendorf	5915791096	Move iput() after zfs_inode_update() When replaying an unlink/remove operation via zfs_rmdir() the object being removed will be instantiated by a call to zfs_dirent_lock(). This means that there is a single reference protecting the object. Right before the call to zfs_inode_update() this reference is dropped which may cause the object to be destroyed. This will result in a NULL dereference as shown by the stack trace in issue #782. This likely isn't an issue during normal operation because there is always an additional reference held on the object by the VFS. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #782	2012-09-12 14:22:52 -07:00
Brian Behlendorf	3050c9314f	Switch KM_SLEEP to KM_PUSHPAGE Under certain circumstances the following functions may be called in a context where KM_SLEEP is unsafe and can result in a deadlocked system. To avoid this problem the unconditional KM_SLEEPs are converted to KM_PUSHPAGEs. This will prevent them from attempting to initiate any I/O during direct reclaim. This change was originally part of `cd5ca4b` but was reverted by `330fe01`. It always should have had its own commit for exactly this reason. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-09-12 12:27:09 -07:00
Brian Behlendorf	9b51f21841	Remove TQ_SLEEP -> KM_SLEEP mapping When the taskq code was originally written it seemed like a good idea to simply map TQ_SLEEP to KM_SLEEP. Unfortunately, this assumed that the TQ_* flags would never conflict with any of the Linux GFP_* flags. When adding the TQ_PUSHPAGE support in commit `cd5ca4b` this invariant was accidentally broken. Therefore to support TQ_PUSHPAGE, which is needed for Linux, and prevent any further confusion I have removed this direct mapping. The TQ_SLEEP, TQ_NOSLEEP, and TQ_PUSHPAGE are no longer defined in terms of their KM_* counterparts. Instead a simple mapping function is introduced to convert TQ_* -> KM_* where needed. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #171	2012-09-12 11:41:42 -07:00
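The idea in the commit above, keeping the TQ_* flags in their own namespace and converting explicitly, can be sketched as below. Everything here is an assumption made for illustration: the numeric values are placeholders, `tq_to_km()` is a hypothetical name, and the precedence between flags is a guess rather than the SPL behavior.

```c
/* Placeholder values, chosen only so the two flag sets do not overlap. */
#define	TQ_SLEEP	0x01
#define	TQ_NOSLEEP	0x02
#define	TQ_PUSHPAGE	0x04

#define	KM_SLEEP	0x10
#define	KM_NOSLEEP	0x20
#define	KM_PUSHPAGE	0x40

/*
 * Hypothetical mapping function: translate taskq flags to allocation
 * flags instead of defining one set in terms of the other.
 */
static int
tq_to_km(unsigned int tq_flags)
{
	if (tq_flags & TQ_NOSLEEP)
		return (KM_NOSLEEP);
	if (tq_flags & TQ_PUSHPAGE)
		return (KM_PUSHPAGE);
	return (KM_SLEEP);
}
```

Because the two sets are independent, adding a new TQ_* bit cannot silently collide with an allocator flag.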
Brian Behlendorf	330fe010e4	Revert "Switch KM_SLEEP to KM_PUSHPAGE" This reverts commit `cd5ca4b2f8` due to conflicts in the higher TQ_ bits which caused incorrect behavior. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-09-12 10:07:48 -07:00
Brian Behlendorf	cda4db408c	Revert "Improve AF hard disk detection" This reverts commit `395350c85d` which accidentally introduced issue #955. Pools using AF drives which were originally created with a sector size of 512 bytes will now be correctly detected to have physical sector size of 4096. This is desirable for a new pool, however for an existing pool abruptly changing the sector size causes problems. For this reason, this change is being reverted until the additional logic can be added to detect the existing pool case. Existing pools must use the ashift size stored in the label regardless of what the disk reports. This is critical for compatibility. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #955	2012-09-11 16:33:49 -07:00
Cyril Plisko	27ccd4147b	Avoid running exportfs on each zfs/zpool command invocation Delay executing exportfs command until its results are actually required. Signed-off-by: Cyril Plisko <cyril.plisko@mountall.com> Signed-off-by: Gunnar Beutner <gunnar@beutner.name> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-09-11 10:21:49 -07:00
Cyril Plisko	af909a1089	Illumos #3064 : usr/src/cmd/zpool/zpool_main.c misspells "successful" Reviewed by: Andrew Stormont <Andrew.Stormont@nexenta.com> Reviewed by: Kartik Mistry <kartik.mistry@gmail.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> References: https://www.illumos.org/issues/3064 Signed-off-by: Cyril Plisko <cyril.plisko@mountall.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-09-11 10:19:50 -07:00
Chris Dunlop	fff276419e	Remove autotools products spl_config.h.in is a generated file: remove and .gitignore it Signed-off-by: Chris Dunlop <chris@onthe.net.au> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-09-11 10:15:13 -07:00
Chris Dunlop	dd87332f47	Remove autotools products spl_config.h.in is a generated file: remove and .gitignore it Signed-off-by: Chris Dunlop <chris@onthe.net.au> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-09-11 10:12:47 -07:00
Etienne Dechamps	b815ff9a8f	Silence "setting dataset to sync always" message in ztest. ztest outputs a message when testing sync=always no matter what the verbosity level is. There is no point outputting this message for low verbosity levels. With this patch the message is only displayed at verbosity level 5 or above. The result is less output pollution. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #951	2012-09-10 10:55:44 -07:00
Brian Behlendorf	4ca9a43644	Remove zvol device node The 'zfs destroy' changes in `330d06f` disrupted how zvol devices get removed on ZoL. However, it basically boils down to the fact that we are no longer reliably calling zvol_remove_minor() via zfs_ioc_destroy_snaps(). Therefore we add the missing call and handle things similarly to the existing zfs_unmount_snap() case. Ideally we would check if this is of type DMU_OST_ZFS or DMU_OST_ZVOL and just do the right thing as in zfs_ioc_destroy(). However, it looks like it would be fairly expensive to get the type, and it's harmless to simply attempt the umount and minor removal. This is also an issue in the latest FreeBSD and Illumos code. It was being tracked under the following issue, and we may want to refresh our code when they settle on what they want to do about it upstream. https://www.illumos.org/issues/3170 Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #903	2012-09-10 10:25:08 -07:00
Brian Behlendorf	3c60f5054c	Debug cv_destroy() with mutex held There still appears to be a race in the condition variables where ->cv_mutex is set after we are woken from the cv_destroy wait queue. This might be possible when cv_destroy() is called immediately after cv_broadcast(). We had some troubles with this previously but there may still be a small race, see commit `d599e4f`. The following patch closes one small race and improves the ASSERTs such that they log the offending value. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> zfsonlinux/zfs#943	2012-09-10 10:23:26 -07:00
Brian Behlendorf	95331f4437	Set KMC_NOEMERGENCY for zlib workspaces The workspace required by zlib to perform compression is roughly 512KB (order-7). These allocations are so large that we should never attempt to directly kmalloc an emergency object for them. It is far preferable to asynchronously vmalloc an additional slab in case it's needed. Then simply block waiting for an existing object to be released or for the new slab to be allocated. This can be accomplished by disabling emergency slab objects by passing the KMC_NOEMERGENCY flag at slab creation time. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> zfsonlinux/zfs#917	2012-09-07 14:36:26 -07:00
Brian Behlendorf	cb5c2acebb	Add KMC_NOEMERGENCY slab flag Provide a flag to disable the use of emergency objects for a specific kmem cache. There may be instances where under no circumstances should you kmalloc() an emergency object. For example, when your cache contains very large objects (>128k). Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-09-07 14:27:03 -07:00
Brian Behlendorf	1ecc6d1265	Add zstreamdump .gitignore When zstreamdump was merged in commit `b79fc3f` we failed to add the needed .gitignore file. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-09-06 14:23:11 -07:00
Cyril Plisko	04f9432d3b	Make ZFS filesystem id persistent across different machines Use ZFS dataset fsid guid as a unique file system id, similar to what is done on Illumos/OpenSolaris. Signed-off-by: Cyril Plisko <cyril.plisko@mountall.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #888	2012-09-06 12:47:11 -07:00
Etienne Dechamps	4b2f65b253	Increase the stack space in userspace. In `1e33ac1e26`, the maximum stack size for userspace tools was set to 8k to mimic the available kernel stack size. Unfortunately, due to differences in how the stack is used in userspace vs kernel space, spurious stack overflows could occur in userspace tools due to the limited stack size. This is especially true in ztest when debugging is enabled. This patch multiplies the userspace stack size by 4, which fixes the stack overflow issues. This comes at the price of not being able to catch stack size issues in userspace, but the previous solution proved unreliable anyway. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Fixes #934.	2012-09-06 11:59:59 -07:00
Brian Behlendorf	ebcfc8a534	Disable page allocation warnings for ARC buffers Buffers for the ARC are normally backed by the SPL virtual slab. However, if memory is low, AND no slab objects are available, AND a new slab cannot be quickly constructed a new emergency object will be directly allocated. These objects can be as large as order 5 on a system with 4k pages. And because they are allocated with KM_PUSHPAGE, to avoid a potential deadlock, they are not allowed to initiate I/O to satisfy the allocation. This can result in the occasional allocation failure. However, since these allocations are allowed to block and perform operations such as memory compaction they will eventually succeed. Since this is not unexpected (just unlikely) behavior this patch disables the warning for the allocation failure. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #465	2012-09-06 11:53:08 -07:00
Michael Martin	fc24f7c887	Fix missing vdev names in zpool status output Commit `858219c` makes more sense down below in the 'if (verbose)' section of the code. Initially, buf and path will never point to the same location. Once 'path = buf' is set on a raidz vdev, the code may drop into the verbose section depending on the verbose flag. In here, using a tmpbuf makes sense since now 'buf == path'. This issue does not occur in the upstream Solaris code because their implementations of snprintf() allow for buf and path to be the same address. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #57	2012-09-05 22:09:12 -07:00
Brian Behlendorf	cafa9709f3	Switch KM_SLEEP to KM_PUSHPAGE This warning indicates the incorrect use of KM_SLEEP in a call path which must use KM_PUSHPAGE to avoid deadlocking in direct reclaim. See commit `b8d06fca08` for additional details. SPL: Fixing allocation for task txg_sync (6093) which used GFP flags 0x297bda7c with PF_NOFS set Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #917	2012-09-05 08:44:58 -07:00
Brian Behlendorf	0ef0ff546e	Switch KM_SLEEP to KM_PUSHPAGE This warning indicates the incorrect use of KM_SLEEP in a call path which must use KM_PUSHPAGE to avoid deadlocking in direct reclaim. See commit `b8d06fca08` for additional details. SPL: Fixing allocation for task txg_sync (6093) which used GFP flags 0x297bda7c with PF_NOFS set Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #917	2012-09-04 16:00:06 -07:00
Brian Behlendorf	395350c85d	Improve AF hard disk detection Use the bdev_physical_block_size() interface to determine the minimum write size which can be issued without incurring a read-modify-write operation. This is used to set the ashift correctly to prevent a performance penalty when using AF hard disks. Unfortunately, this interface isn't entirely reliable because it's not uncommon for disks to misreport this value. For this reason you may still need to manually set your ashift with: zpool create -o ashift=12 ... The solution to this in the upstream Illumos source was to add a white list of known offending drives. Maintaining such a list will be a burden, but it still may be worth doing if we can detect a large number of these drives. This should be considered as future work. Reported-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #916	2012-09-04 15:35:32 -07:00
Brian Behlendorf	594b4dd82a	Switch KM_SLEEP to KM_PUSHPAGE This warning indicates the incorrect use of KM_SLEEP in a call path which must use KM_PUSHPAGE to avoid deadlocking in direct reclaim. See commit `b8d06fca08` for additional details. SPL: Fixing allocation for task txg_sync (6093) which used GFP flags 0x297bda7c with PF_NOFS set Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #917	2012-09-04 08:41:12 -07:00
Etienne Dechamps	ba7dbeb22e	Add libnvpair to mount_zfs dependencies Commit `e6f290535c` added libzpool to the mount_zfs dependencies. This brought in the nvpair symbols which are used by libzpool. To resolve this include the libnvpair library for mount_zfs even though mount_zfs doesn't directly require any of these symbols. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #926	2012-09-02 15:36:09 -07:00
Martin Matuska	b79fc3fea9	Add zstreamdump(8) command to examine ZFS send streams. Obtained from: illumos-gate revision 11935:538c866aaac6 Source: ssh://anonhg@hg.illumos.org/illumos-gate Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #905	2012-09-02 14:54:27 -07:00
Etienne Dechamps	ac8ca67a88	Add DKIOCTRIM for TRIM support. See dechamps/zfs@cc6cd40ad7 for details. This harmless addition was merged to simplify testing the ZFS TRIM support patches. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #167	2012-09-02 14:22:01 -07:00
Chris Dunlop	20a083cbe2	Switch KM_SLEEP to KM_PUSHPAGE This warning indicates the incorrect use of KM_SLEEP in a call path which must use KM_PUSHPAGE to avoid deadlocking in direct reclaim. See commit `b8d06fca08` for additional details. SPL: Fixing allocation for task txg_sync (6093) which used GFP flags 0x297bda7c with PF_NOFS set Signed-off-by: Chris Dunlop <chris@onthe.net.au> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #917	2012-09-02 10:15:49 -07:00
Brian Behlendorf	b404a3f07f	Switch KM_SLEEP to KM_PUSHPAGE This warning indicates the incorrect use of KM_SLEEP in a call path which must use KM_PUSHPAGE to avoid deadlocking in direct reclaim. See commit `b8d06fca08` for additional details. SPL: Fixing allocation for task txg_sync (6093) which used GFP flags 0x297bda7c with PF_NOFS set Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #917	2012-08-31 17:39:29 -07:00
Brian Behlendorf	46b3945d5d	Suppress task_hash_table_init() large allocation warning When various kernel debugging options are enabled this allocation may be larger than usual as shown by the following warning. It is in no way harmful so we suppress the warning. SPL: large kmem_alloc(40960, 0x80d0) at tsd_hash_table_init:358 (76495/76495) Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #93	2012-08-30 21:02:52 -07:00
Brian Behlendorf	2b2861362f	Clear PG_writeback after zil_commit() for sync I/O When writing via ->writepage() the writeback bit was always cleared as part of the txg commit callback. However, when the I/O is also being written synchronously to the zil we can immediately clear this bit. There is no need to wait for the subsequent TXG sync since the data is already safe on stable storage. This has been observed to reduce the msync(2) delay from up to 5 seconds down to 10s of milliseconds. One case which is expected to benefit from this is the intermittent samba hangs described in issue #700. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #700 Closes #907	2012-08-30 20:16:28 -07:00
Etienne Dechamps	e6f290535c	Fix mount_zfs dependency on libzpool. mount_zfs depends on libzpool for zfs_prop_written since `330d06f90d`. Unfortunately, the Makefile for mount_zfs has not been modified to reflect this. As a result, libtool doesn't know about the dependency, which may result in the wrong libzpool being used during the build (e.g. the libzpool from the system instead of the libzpool from the build directory). This patch adds the dependency to fix the issue. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Fixes #909.	2012-08-30 16:06:46 -07:00
Brian Behlendorf	efcd0ca32d	Enhance SPLAT kmem:slab_overcommit test After the emergency slab objects were merged I started observing timeout failures in the kmem:slab_overcommit test. These were due to the inefficient way the slab_overcommit reclaim function was implemented, and due to the additional cost of potentially allocating tens of thousands of emergency objects and tracking them on a single list. This patch addresses the first concern by enhancing the test case to track all of the allocated objects as a linked list. This allows for a cleaner version of the reclaim function to simply release SPLAT_KMEM_OBJ_RECLAIM objects. Since this touches some common code all the tests which share these data structures were also updated. After making these changes slab_overcommit is reliably passing. However, there is certainly additional cleanup which could be done here. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-08-30 15:49:00 -07:00
Richard Yao	b8d06fca08	Switch KM_SLEEP to KM_PUSHPAGE Differences between how paging is done on Solaris and Linux can cause deadlocks if KM_SLEEP is used in any of the following contexts. * The txg_sync thread * The zvol write/discard threads * The zpl_putpage() VFS callback This is because KM_SLEEP will allow for direct reclaim which may result in the VM calling back in to the filesystem or block layer to write out pages. If a lock is held over this operation the potential exists to deadlock the system. To ensure forward progress all memory allocations in these contexts must use KM_PUSHPAGE which disables performing any I/O to accomplish the memory allocation. Previously, this behavior was achieved by setting PF_MEMALLOC on the thread. However, that resulted in unexpected side effects such as the exhaustion of pages in ZONE_DMA. This approach touches more of the zfs code, but it is more consistent with the right way to handle these cases under Linux. This patch lays the ground work for being able to safely revert the following commits which used PF_MEMALLOC: `21ade34` Disable direct reclaim for z_wr_* threads `cfc9a5c` Fix zpl_writepage() deadlock `eec8164` Fix ASSERTION(!dsl_pool_sync_context(tx->tx_pool)) Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #726	2012-08-27 12:01:37 -07:00
Brian Behlendorf	991fc1d7ae	mzap_upgrade() must use kmem_alloc() These allocations in mzap_upgrade() used to be kmem_alloc() but were changed to vmem_alloc() due to the size of the allocation. However, since it turns out this function may be called in the context of the txg_sync thread they must be changed back to use a kmem_alloc() to ensure the KM_PUSHPAGE flag is honored. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-08-27 12:01:37 -07:00


5357 Commits