mirror_zfs

mirror of https://git.proxmox.com/git/mirror_zfs.git synced 2026-01-25 10:12:13 +03:00

Author	SHA1	Message	Date
Brian Behlendorf	26099167e6	Disable ztest deadman timer The ztest deadman timer has been causing false positives in the testing VMs. To make it easier to spot possible regressions I'm disabling this timer. The buildbot test infrastructure will still mark ztest instances which take too long to complete as failures. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #1018	2012-10-14 19:35:09 -07:00
Brian Behlendorf	ee7913b644	Merge branch 'linux-3.6' This branch adds the required compatibility code to support the Linux 3.6 kernel. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #873	2012-10-14 16:32:17 -07:00
Richard Yao	95f5c63b47	Linux 3.6 compat, iops->mkdir() Use .mkdir instead of .create in 3.3 compatibility check. Linux 3.6 modifies inode_operations->create's function prototype. This causes an autotools Linux 3.3 compatibility check for a function prototype change in create, mkdir and mknod to fail. Since mkdir and mknod are unchanged, we modify the check to examine mkdir instead. Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #873	2012-10-14 15:29:26 -07:00
Yuxuan Shui	558ef6d080	Linux 3.6 compat, iops->create() As of Linux commit ebfc3b49a7ac25920cb5be5445f602e51d2ea559 the struct nameidata is no longer passed to iops->create. Instead only the result of (nameidata->flags & LOOKUP_EXCL) is passed. ZFS, like almost all Linux filesystems, never made use of this, so only the prototype needs to be wrapped for compatibility. Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #873	2012-10-14 14:42:25 -07:00
Yuxuan Shui	8f195a908f	Linux 3.6 compat, iops->lookup() As of Linux commit 00cd8dd3bf95f2cc8435b4cac01d9995635c6d0b the struct nameidata is no longer passed to iops->lookup. Instead only the nameidata->flags are passed. ZFS, like almost all Linux filesystems, never made use of this, so only the prototype needs to be wrapped for compatibility. Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #873	2012-10-14 13:06:54 -07:00
Yuxuan Shui	3c20361075	Linux 3.6 compat, sget() As of Linux commit 9249e17fe094d853d1ef7475dd559a2cc7e23d42 the mount flags are now passed to sget() so they can be used when initializing a new superblock. ZFS never uses sget() in this fashion so we can simply pass a zero and add a zpl_sget() compatibility wrapper. Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #873	2012-10-14 13:06:48 -07:00
Yuxuan Shui	af26c4d4ab	Linux 3.6 compat, sops->write_super() removed The .write_super callback was removed from the super_operations structure by Linux commit f0cd2dbb6cf387c11f87265462e370bb5469299e. All file systems are now expected to self manage writing any dirty state associated with their super block. ZFS never made use of this callback so it can simply be removed from the super_operations structure. Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #873	2012-10-14 11:33:56 -07:00
Etienne Dechamps	a5c20e2a0a	Don't ashift-align vdev read requests. Currently, the size of read and write requests on vdevs is aligned according to the vdev's ashift, allocating a new ZIO buffer and padding if need be. This makes sense for write requests to prevent read/modify/write if the write happens to be smaller than the device's internal block size. For reads however, the rationale is less clear. It seems that the original code aligns reads because, on Solaris, device drivers will outright refuse unaligned requests. We don't have that issue on Linux. Indeed, Linux block devices are able to accept requests of any size, and take care of alignment issues themselves. As a result, there's no point in enforcing alignment for read requests on Linux. This is a nice optimization opportunity for two reasons: - We remove a memory allocation in a heavily-used code path; - The request gets aligned in the lowest layer possible, which shrinks the path that the additional, useless padding data has to travel. For example, when using 4k-sector drives that lie about their sector size, using 512b read requests instead of 4k means that there will be less data traveling down the ATA/SCSI interface, even though the drive actually reads 4k from the platter. The only exception is raidz, because raidz needs to read the whole allocated block for parity. This patch removes alignment enforcement for read requests, except on raidz. Note that we also remove an assertion that checks that we're aligning a top-level vdev I/O, because that's not the case anymore for repair writes that result from failed reads. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1022	2012-10-12 12:01:56 -07:00
Richard Yao	b68503fb30	Remove vmem_size() consumers There are currently three vmem_size() consumers all of which are part of the ARC implementation. However, since the expected behavior of the Linux and Solaris virtual memory subsystems is so different the behavior in each of these instances needs to be reevaluated. * arc_evict_needed() - This is actually dead code. Arena support was never added to the SPL and zio_arena is always NULL. This support isn't needed so we simply remove this dead code. * arc_memory_throttle() - On Solaris where virtual memory constitutes almost all of the address space we can reasonably expect there to be a fairly large amount free. However, on Linux by default we only have about 100MB total and that's heavily used by the ARC. So the expectation on Linux is that this will usually be a small value. Therefore we remove the vmem_size() check for i386 systems because the expectation is that it will be less than the zfs_write_limit_max. * arc_init() - Here vmem_size() is used to initially size the ARC. Since the ARC is currently backed by the virtual address space it makes sense to use this as a limit on the ARC for 32-bit systems. This code can be removed when the ARC is backed by the page cache. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #831	2012-10-12 10:03:03 -07:00
Brian Behlendorf	87d98efe9e	Fix zfs_txg_timeout module parameter Allow the zfs_txg_timeout variable to be dynamically tuned at run time. By pulling it down out of the variable declaration it will be evaluated each time through the loop. The zfs_txg_timeout variable is now declared extern in the common sys/txg.h header rather than locally in dsl_scan.c. This prevents potential type mismatches if the global variable needs to be used elsewhere. Move the module_param() code into the same source file where zfs_txg_timeout is declared. This is the most logical location. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-10-11 15:07:09 -07:00
Richard Yao	7df05a4266	Fix zfs_write_limit_max integer size mismatch on 32-bit systems Commit `c409e4647f` introduced a number of module parameters. This required several types to be changed to accommodate the required module parameters Linux macros. Unfortunately, arc.c contained its own extern definition of the zfs_write_limit_max variable and its type was not updated to be consistent with its dsl_pool.c counterpart. If the variable had been properly marked extern in a common header, then gcc would have generated a warning and this would not have slipped through. The result of this was that the ARC unconditionally expected zfs_write_limit_max to be 64-bit. Unfortunately, the largest size integer module parameter that Linux supports is unsigned long, which varies in size depending on the host system's native word size. The effect was that on 32-bit systems, ARC incorrectly performed 64-bit operations on a 32-bit value by reading the neighboring 32 bits as the upper 32 bits of the 64-bit value. We correct that by changing the extern declaration to use the unsigned long type and move these extern definitions into the common arc.h header. This should make ARC correctly treat zfs_write_limit_max as a 32-bit value on 32-bit systems. Reported-by: Jorgen Lundman <lundman@lundman.net> Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #749	2012-10-11 11:09:25 -07:00
Cyril Plisko	15fd274973	Make zfs_immediate_write_sz a module parameter zfs_immediate_write_sz variable is a tunable, but lacks proper module_param() instrumentation. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1032	2012-10-11 11:09:21 -07:00
Cyril Plisko	5b7e5b5ab9	txg is spelled as tgx in places Term 'transaction group' is commonly abbreviated as txg in ZFS sources. There are some places (Linux specific MODULE_PARAM_DESC() macros) where it is incorrectly spelled as 'tgx'. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1030	2012-10-11 09:19:08 -07:00
KORN Andras	c8f259182d	zfs.8: add missing info about dedup, mlslabel These sections were missing from the `zfs.8` man page. Add them. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1026	2012-10-09 09:54:16 -07:00
Massimo Maggi	beb999445a	Switch KM_SLEEP to KM_PUSHPAGE Prevent snapshot_check from initiating I/O during memory allocation. Signed-off-by: Massimo Maggi <massimo@mmmm.it> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1023	2012-10-08 10:19:05 -07:00
Brian Behlendorf	7bd04f2d7d	Set default zvol elevator to noop It doesn't make sense for a zvol to use the default system I/O scheduler because it is a virtual device. Therefore, we change the default scheduler to 'noop' for zvols provided that the elevator_change() function is available. This interface has been available since Linux 2.6.36 and appears in the RHEL 6.x kernels. We deliberately do not implement the method for older kernels because it was racy and could result in system crashes. It's better to simply manually tune the scheduler for these kernels. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1017	2012-10-05 12:39:59 -07:00
Etienne Dechamps	089fa91bc5	Align DISCARD requests on zvols. Currently, when processing DISCARD requests, zvol_discard() calls dmu_free_long_range() with the precise offset and size of the request. Unfortunately, this is not optimal for requests that are not aligned to the zvol block boundaries. Indeed, in the case of an unaligned range, dnode_free_range() will zero out the unaligned parts. Not only is this useless since we are not freeing any space by doing so, it is also slow because it translates to a read-modify-write operation. This patch fixes the issue by rounding up the discard start offset to the next volume block boundary, and rounding down the discard end offset to the previous volume block boundary. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1010	2012-10-04 16:01:44 -07:00
Brian Behlendorf	31ab194297	Merge branch 'illumos-ztest' This branch is a port of the ztest backwards compatibility testing option. It includes the original upstream Illumos patch plus several followup patches to address concerns in the original change. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-10-04 14:39:18 -07:00
Brian Behlendorf	ae380cfa76	Realpath arg 2 must be a minimum of PATH_MAX The realpath(3) function expects that when a buffer is passed for the 'resolved_path' that it be at least PATH_MAX in length. If it's not, a buffer overflow may occur. Therefore the passed buffer size is changed from MAXNAMELEN to MAXPATHLEN. We also take this opportunity to dynamically allocate the buffer to keep it off the stack. warning: call to '__realpath_chk_warn' declared with attribute warning: second argument of realpath must be either NULL or at least PATH_MAX bytes long buffer [enabled by default] Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-10-04 13:19:10 -07:00
Brian Behlendorf	5be98cfe2f	Verify the return value for warn_unused_result functions Under Linux the following functions are flagged with the attribute warn_unused_result, which triggers a warning whenever they are used without checking the return value. To handle this case we check the result with VERIFY(). It's better to detect this immediately on failure rather than segfault farther down in the function. ../../cmd/ztest/ztest.c:6033:2: warning: ignoring return value of 'asprintf', declared with attribute warn_unused_result [-Wunused-result] ../../cmd/ztest/ztest.c:739:3: warning: ignoring return value of 'realpath', declared with attribute warn_unused_result [-Wunused-result] Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-10-04 13:19:10 -07:00
Brian Behlendorf	facbbe4366	Replace tempnam() with mkstemp() The use of tempnam() is racy and it should be avoided in favor of mkstemp(). According to the Linux tempnam(3) man page: "Although tempnam() generates names that are difficult to guess, it is nevertheless possible that between the time that tempnam() returns a pathname, and the time that the program opens it, another program might create that pathname using open(2), or create it as a symbolic link. This can lead to security holes. To avoid such possibilities, use the open(2) O_EXCL flag to open the pathname. Or better yet, use mkstemp(3) or tmpfile(3)." This issue was flagged by gcc. ztest.o: In function `setup_data_fd': cmd/ztest/ztest.c:5822: warning: the use of `tempnam' is dangerous, better use `mkstemp' Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-10-04 13:19:10 -07:00
Brian Behlendorf	483106eb71	Minimize ztest stack frame size To ensure ztest behaves as similarly as possible to the kernel implementation of ZFS we attempt to honor the kernel stack limits. This includes keeping the individual stack frame sizes under 1K in size. We currently use gcc to detect and enforce this limit. Therefore to get this building cleanly with full debugging enabled the stack usage in the following functions has been reduced by moving the buffer to the heap. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-10-04 13:19:09 -07:00
Etienne Dechamps	9d81146b01	Use dynamic file descriptor numbers in ztest. Currently, ztest expects to get 3 and 4 as the file descriptors for data and random files, respectively. This is quite fragile and breaks easily if ztest is run with these file descriptors already opened (e.g. in a complex shell script). This patch fixes the issue by removing the assumptions on the file descriptor numbers that open() returns. For the random file (/dev/urandom), the new code doesn't rely on a shared file descriptor; instead, it reopens the file in the child. For the data file, the new code writes the file descriptor number into a "ZTEST_FD_DATA" environment variable so that it can be recovered after the execv() call. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-10-04 13:19:09 -07:00
Christopher Siden	22257dc0d5	Fix mmap() usage in ztest. illumos/illumos-gate@ad135b5d64 Illumos changeset: 13700:2889e2596bd6 Note that this is only a partial port of the aforementioned Illumos changeset. Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed by: George Wilson <gwilson@delphix.com> Reviewed by: Richard Lowe <richlowe@richlowe.net> Reviewed by: Dan Kruchinin <dan.kruchinin@gmail.com> Approved by: Eric Schrock <Eric.Schrock@delphix.com> Ported to zfsonlinux by: Etienne Dechamps <etienne.dechamps@ovh.net>	2012-10-04 13:19:09 -07:00
Chris Siden	c242c188fd	Illumos #1950 : ztest backwards compatibility testing option. illumos/illumos-gate@420dfc9585 Illumos changeset: 13571:a5771a96228c 1950 ztest backwards compatibility testing option Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Adam Leventhal <ahl@delphix.com> Reviewed by: Matt Ahrens <matt@delphix.com> Reviewed by: Richard Lowe <richlowe@richlowe.net> Reviewed by: Robert Mustacchi <rm@joyent.com> Approved by: Eric Schrock <eric.schrock@delphix.com> Ported-by: Etienne Dechamps <etienne.dechamps@ovh.net> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-10-04 13:18:53 -07:00
Chris Dunlop	d75d6f294e	Switch KM_SLEEP to KM_PUSHPAGE This warning indicates the incorrect use of KM_SLEEP in a call path which must use KM_PUSHPAGE to avoid deadlocking in direct reclaim. See commit `b8d06fc` for additional details. SPL: Fixing allocation for task txg_sync (6093) which used GFP flags 0x297bda7c with PF_NOFS set Signed-off-by: Chris Dunlop <chris@onthe.net.au> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #1002	2012-10-04 10:44:09 -07:00
Matthew Ahrens	04434775b7	Illumos #3100 : zvol rename fails with EBUSY when dirty. illumos/illumos-gate@2e2c135528 Illumos changeset: 13780:6da32a929222 3100 zvol rename fails with EBUSY when dirty Reviewed by: Christopher Siden <chris.siden@delphix.com> Reviewed by: Adam H. Leventhal <ahl@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Garrett D'Amore <garrett@damore.org> Approved by: Eric Schrock <eric.schrock@delphix.com> Ported-by: Etienne Dechamps <etienne.dechamps@ovh.net> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #995	2012-10-03 13:59:02 -07:00
Richard Lowe	0677cb6f52	Illumos #2399 : zfs manual page does not document use of "zfs diff". illumos/illumos-gate@3b8be6bf4f Illumos changeset: 13773:00c2a08cf1bb 2399 zfs manual page does not document use of "zfs diff" Reviewed by: Joshua M. Clulow <josh@sysmgr.org> Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com> Reviewed by: Dan McDonald <danmcd@nexenta.com> Approved by: Robert Mustacchi <rm@joyent.com> Ported-by: Etienne Dechamps <etienne.dechamps@ovh.net> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #940	2012-10-03 13:59:02 -07:00
George Wilson	65947351e7	Illumos #3129 , #3130 illumos/illumos-gate@d6afdce20f Illumos changeset: 13794:7c5e0e746b2c 3129 'zpool reopen' restarts resilvers 3130 ztest failure: Assertion failed: 0 == dmu_objset_destroy(name, B_FALSE) (0x0 == 0x10) Reviewed by: Eric Schrock <eric.schrock@delphix.com> Reviewed by: Matt Ahrens <matthew.ahrens@delphix.com> Reviewed by: Christopher Siden <chris.siden@delphix.com> Reviewed by: Adam Leventhal <ahl@delphix.com> Approved by: Dan McDonald <danmcd@nexenta.com> References: https://www.illumos.org/issues/3129 https://www.illumos.org/issues/3130 Ported by: Etienne Dechamps <etienne.dechamps@ovh.net> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #994	2012-10-03 13:59:02 -07:00
Etienne Dechamps	d135245791	Temporarily disable the reguid test. Currently, ztest fails with the following error: error: Pool 'ztest' has encountered an uncorrectable I/O failure and the failure mode property for this pool is set to panic. We know how to fix it (see issue #939), but it may take some time before we get around to merging the fix, which has some heavy dependencies. In the meantime, it is not ideal to be unable to use ztest just because of a small isolated issue, so this patch works around the problem by disabling the reguid test. This is just a temporary hack to keep ztest usable. The reguid test will be enabled again when the proper fix is merged. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #997	2012-10-03 13:59:02 -07:00
Etienne Dechamps	6aec1cd5a6	Fix ztest vdev file paths. Currently, in several instances (but not all), ztest generates vdev file paths using a statement similar to this: snprintf(path, sizeof (path), ztest_dev_template, ...); This worked fine until `40b84e7aec`, which changed path to be a pointer to the heap instead of an array allocated on the stack. Before this change, sizeof(path) would return the size of the array; now, it returns the size of the pointer instead. As a result, the aforementioned snprintf statement uses the wrong size and truncates the vdev file path to the first 4 or 8 bytes (depending on the architecture). Typically, with default settings, the file path will become "/tmp/zt" instead of "/tmp/ztest.XXX". This issue only exists in ztest_vdev_attach_detach() and ztest_fault_inject(), which explains why ztest doesn't fail right away. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #989	2012-10-03 13:32:48 -07:00
Etienne Dechamps	274091c074	Fix VOP_CLOSE() in userspace. Currently, for unknown reasons, VOP_CLOSE() is a no-op in userspace. This causes file descriptor leaks. This is especially problematic with long ztest runs, since zpool.cache is opened repeatedly and never closed, resulting in resource exhaustion (EMFILE errors). This patch fixes the issue by making VOP_CLOSE() do what it is supposed to do. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #989	2012-10-03 13:32:48 -07:00
Etienne Dechamps	0aebd4f9e3	Create threads in detached state in userspace. Currently, thread_create(), when called in userspace, creates a joinable (i.e. not detached) thread. This is the pthread default. Unfortunately, this does not reproduce kthreads behavior (kthreads are always detached). In addition, this contradicts the original Solaris code which creates userspace threads in detached mode. These joinable threads are never joined, which leads to a leakage of pthread thread objects ("zombie threads"). This in turn results in excessive resource consumption, and possible resource exhaustion in extreme cases (e.g. long ztest runs). This patch fixes the issue by creating userspace threads in detached mode. The only exception is ztest worker threads which are meant to be joinable. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #989	2012-10-03 13:32:48 -07:00
Brian Behlendorf	6d1d976b2c	Modify vdev_elevator_switch() to use elevator_change() As of Linux 2.6.36 an elevator_change() interface was added. This commit updates vdev_elevator_switch() to use this interface when available, otherwise it falls back to the usermodehelper method. Original-patch-by: foobarz <sysop@xeon.(none)> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #906	2012-10-03 13:31:44 -07:00
Etienne Dechamps	2f342404c1	Force 4K blocksize when testing ext2 on zvol. Currently, mkfs.ext2 on zconfig.sh zvols tries to use an 8K blocksize, probably because by default zvol exposes an optimal I/O size of 8K. Unfortunately, an ext2 blocksize of 8K is not supported by the kernel, so the resulting filesystem is unmountable. This patch fixes the issue by making sure the blocksize is 4K. We have to use -F to force it, else mkfs.ext2 won't allow us to use a blocksize smaller than the optimal I/O size. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #979	2012-10-03 10:52:51 -07:00
Cyril Plisko	393b44c711	Implement .commit_metadata hook for NFS export In order to implement synchronous NFS metadata semantics ZFS needs to provide the .commit_metadata hook. All it takes there is to make sure changes are committed to ZIL. Fortunately zfs_fsync() does just that, so simply calling it from zpl_commit_metadata() does the trick. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #969	2012-10-03 10:49:45 -07:00
Chris Wedgwood	23a61ccc1b	zvol_probe should return NULL when the device isn't found. Previously we returned ERR_PTR(-ENOENT) which the rest of the kernel doesn't expect and as such we can oops. Signed-off-by: Chris Wedgwood <cw@f00f.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #949 Closes #931 Closes #789 Closes #743 Closes #730	2012-10-03 10:39:12 -07:00
Bill Pijewski	37abac6d55	Illumos #2703 : add mechanism to report ZFS send progress Reviewed by: Matt Ahrens <matt@delphix.com> Reviewed by: Robert Mustacchi <rm@joyent.com> Reviewed by: Richard Lowe <richlowe@richlowe.net> Approved by: Eric Schrock <Eric.Schrock@delphix.com> References: https://www.illumos.org/issues/2703 Ported by: Martin Matuska <martin@matuska.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-09-19 13:39:06 -07:00
Chris Siden	1bd201e70d	Illumos #1948 : zpool list should show more detailed pool info Reviewed by: Adam Leventhal <ahl@delphix.com> Reviewed by: Matt Ahrens <matt@delphix.com> Reviewed by: Eric Schrock <eric.schrock@delphix.com> Reviewed by: Richard Lowe <richlowe@richlowe.net> Reviewed by: Albert Lee <trisk@nexenta.com> Reviewed by: Dan McDonald <danmcd@nexenta.com> Reviewed by: Garrett D'Amore <garrett@damore.org> Approved by: Eric Schrock <eric.schrock@delphix.com> References: https://www.illumos.org/issues/1948 Ported by: Martin Matuska <martin@matuska.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #685	2012-09-19 13:39:05 -07:00
Brian Behlendorf	95fd8c9a7f	Switch KM_SLEEP to KM_PUSHPAGE This warning indicates the incorrect use of KM_SLEEP in a call path which must use KM_PUSHPAGE to avoid deadlocking in direct reclaim. See commit `b8d06fca08` for additional details. SPL: Fixing allocation for task txg_sync (6093) which used GFP flags 0x297bda7c with PF_NOFS set Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #973	2012-09-19 11:52:36 -07:00
Brian Behlendorf	0a2f7b3662	Seg fault 'zpool import -d /dev/disk/by-id -a' Introduced by commit `44867b6d6e`. We should of course check to ensure best isn't NULL before attempting to dereference it. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #974	2012-09-18 12:33:37 -07:00
Brian Behlendorf	211204bed3	zfs-0.6.0-rc11	2012-09-18 11:30:24 -07:00
Richard Lowe	dd4769adc0	Illumos #2088 zdb could use a reasonable manual page Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com> Reviewed by: Garrett D'Amore <garrett@damore.org> Reviewed by: George Wilson <gwilson@zfsmail.com> Reviewed by: Steve Gonczi <gonczi@comcast.net> Reviewed by: Richard Elling <richard.elling@richardelling.com> Approved by: Garrett D'Amore <garrett@damore.org> References: https://www.illumos.org/issues/2088 Ported by: Cyril Plisko <cyril.plisko@mountall.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #682	2012-09-18 09:09:13 -07:00
Brian Behlendorf	44867b6d6e	Improve `zpool import` search behavior The goal of this change is to make 'zpool import' prefer to use the persistent /dev/mapper or /dev/disk/by-* paths. These are far preferable to the devices in /dev/ whose names are not persistent and are determined by the order in which a device is detected. This patch improves things by changing the default search path from just the top level /dev/ directory to (in order): /dev/disk/by-vdev - Custom rules, use first if they exist /dev/disk/zpool - Custom rules, use first if they exist /dev/mapper - Use multipath devices before components /dev/disk/by-uuid - Single unique entry and persistent /dev/disk/by-id - May be multiple entries and persistent /dev/disk/by-path - Encodes physical location and persistent /dev/disk/by-label - Custom persistent labels /dev - UNSAFE device names will change The default search path can be overridden by setting the ZPOOL_IMPORT_PATH environment variable. This must be a colon delimited list of paths which are searched for vdevs. If the 'zpool import -d' option is specified only those listed paths will be searched. Finally, when multiple paths to the same device are found, if one of the paths is an exact match for the path used last time to import the pool it will be used. When there are no exact matches the preferred path will be determined by the provided search order. This means you can still import a pool and force specific names by providing the -d <path> option. And the preferred names will persist as long as those paths exist on your system. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #965	2012-09-17 13:49:07 -07:00
Brian Behlendorf	ba367276d8	Switch KM_SLEEP to KM_PUSHPAGE This warning indicates the incorrect use of KM_SLEEP in a call path which must use KM_PUSHPAGE to avoid deadlocking in direct reclaim. See commit `b8d06fca08` for additional details. SPL: Fixing allocation for task txg_sync (6093) which used GFP flags 0x297bda7c with PF_NOFS set Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #917	2012-09-17 11:22:23 -07:00
Cyril Plisko	49d39798f2	ZFS replay transaction error 5 When zfs_replay_write() replays TX_WRITE records from ZIL it calls zpl_write_common() to perform the actual write. zpl_write_common() returns the number of bytes written (similar to write() system call) or a (negative) error. However, the code expects the positive return value to be a residual counter. Thus when zpl_write_common() successfully completes it is mistakenly considered to be a partial write and the error code delivered further. At this point the ZIL processing is aborted with famous "ZFS replay transaction error 5" error message given to the message buffer. The fix is to compare the zpl_write_common() return value with the buffer size and flag error only when they disagree. Signed-off-by: Cyril Plisko <cyril.plisko@mountall.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #933	2012-09-17 11:06:58 -07:00
Brian Behlendorf	8312c6df55	Clear PG_writeback for sync I/O error case Commit `2b2861362f` accidentally introduced this issue by only conditionally registering the commit callback in the async case. The error handling code for the dmu_tx_assign() failure case relied on there always being a registered commit callback to clear the PG_writeback bit. Since that is no longer strictly true for the synchronous case we must explicitly invoke the callback. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #961	2012-09-14 15:53:47 -07:00
Cyril Plisko	8e8e7f35b7	Fix zdb printf format string for ZIL data blocks Without this fix the zdb printouts of ZIL data blocks look full of FF due to printf() handling its arguments as int by default. Here is the output before the fix TX_WRITE len 4136, txg 1093817, seq 149231 foid 4242, offset 0, length f68 G FFFFFF8EFFFFFF87FFFFFF91FFFFFFCC 1c FFFFFFAFFFFFFFC9FFFFFFBAZ FFFFFFC3 And the same after the fix TX_WRITE len 4136, txg 1093817, seq 149231 foid 4242, offset 0, length f68 G 8E8791CC 1cAFC9BAZ C3 Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #962	2012-09-13 09:02:12 -07:00
Brian Behlendorf	5915791096	Move iput() after zfs_inode_update() When replaying an unlink/remove operation via zfs_rmdir() the object being removed will be instantiated by a call to zfs_dirent_lock(). This means that there is a single reference protecting the object. Right before the call to zfs_inode_update() this reference is dropped which may cause the object to be destroyed. This will result in a NULL dereference as shown by the stack trace in issue #782. This likely isn't an issue during normal operation because there is always an additional reference held on the object by the VFS. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #782	2012-09-12 14:22:52 -07:00
Brian Behlendorf	cda4db408c	Revert "Improve AF hard disk detection" This reverts commit `395350c85d` which accidentally introduced issue #955. Pools using AF drives which were originally created with a sector size of 512 bytes will now be correctly detected to have physical sector size of 4096. This is desirable for a new pool, however for an existing pool abruptly changing the sector size causes problems. For this reason, this change is being reverted until the additional logic can be added to detect the existing pool case. Existing pools must use the ashift size stored in the label regardless of what the disk reports. This is critical for compatibility. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #955	2012-09-11 16:33:49 -07:00


1892 Commits
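
The rounding described in commit 089fa91bc5 ("Align DISCARD requests on zvols") can be sketched as plain arithmetic. This is an illustration only, written in Python rather than taken from the zvol code: the function name `align_discard` and its parameters are hypothetical, and the real zvol_discard() may handle edge cases differently.

```python
from typing import Tuple


def align_discard(offset: int, length: int, volblocksize: int) -> Tuple[int, int]:
    """Shrink the byte range [offset, offset + length) to whole volume blocks.

    Returns (start, size). A size of 0 means the request covers no complete
    block, so there is nothing worth freeing.
    Hypothetical helper for illustration; not the actual zvol_discard() logic.
    """
    end = offset + length
    # Round the start offset up to the next block boundary.
    start = ((offset + volblocksize - 1) // volblocksize) * volblocksize
    # Round the end offset down to the previous block boundary.
    end = (end // volblocksize) * volblocksize
    if start >= end:
        return start, 0
    return start, end - start
```

With an assumed 8K volume block size, a request at offset 1000 of length 20000 shrinks to the single full block starting at 8192, while a request that lies entirely inside one block yields a size of 0 and can be skipped.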