mirror_zfs

mirror of https://git.proxmox.com/git/mirror_zfs.git synced 2024-11-17 10:01:01 +03:00

Author	SHA1	Message	Date
Tim Haley	7b8518cb8d	Illumos #xxx: zdb -vvv broken after zfs diff integration References to Illumos issue and patch: - https://github.com/illumos/illumos-gate/commit/163eb7ff Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #340	2011-08-01 12:09:02 -07:00
Brian Behlendorf	22872ff5da	Use zfs_mknode() to create dataset root Long, long, long ago when the effort to port ZFS was begun the zfs_create_fs() function was heavily modified to remove all of its VFS dependencies. This allowed Lustre to use the dataset without us having to spend the time porting all the required VFS code. Fast-forward several years and we now have all the VFS code in place but are still relying on the modified zfs_create_fs(). This isn't required anymore and we can now use zfs_mknode() to create the root znode for the filesystem. This commit reverts the contents of zfs_create_fs() to largely match the upstream OpenSolaris code. There have been minor modifications to accomidate the Linux VFS but that is all. This code fixes issue #116 by bootstraping enough of the VFS data structures so we can rely on zfs_mknode() to create the root directory. This ensures it is created properly with support for system attributes. Previously it wasn't which is why it behaved differently that all other directories when modified. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #116	2011-07-20 19:52:26 -07:00
Gunnar Beutner	3c9609b322	Renamed HAVE_SHARE ifdefs to HAVE_SMB_SHARE. The remaining code that is guarded by HAVE_SHARE ifdefs is related to the .zfs/shares functionality which is currently not available on Linux. On Solaris the .zfs/shares directory can be used to set permissions for SMB shares. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2011-07-06 09:20:28 -07:00
Rohan Puri	a89c3e0bd5	Support mandatory locks (nbmand) The Linux kernel already has support for mandatory locking. This change just replaces the Solaris mandatory locking calls with the Linux equivilants. In fact, it looks like this code could be removed entirely because this checking is already done generically in the Linux VFS. However, for now we'll leave it in place even if it is redundant just in case we missed something. The original patch to update the code to support mandatory locking was done by Rohan Puri. This patch is an updated version which is compatible with the previous mount option handling changes. Original-Patch-by: Rohan Puri <rohan.puri15@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #222 Closes #253	2011-07-01 13:39:40 -07:00
Brian Behlendorf	5c03efc379	Linux compat 2.6.39: security_inode_init_security() The security_inode_init_security() function now takes an additional qstr argument which must be passed in from the dentry if available. Passing a NULL is safe when no qstr is available the relevant security checks will just be skipped. Closes #246 Closes #217 Closes #187	2011-07-01 12:40:08 -07:00
Brian Behlendorf	f0b2486034	Remove unused MMAP functions The following functions were required for the OpenSolaris mmap implementation. Because the Linux VFS does most the most heavy lifting for us they are not required and are being removed to keep the code clean and easy to understand. * zfs_null_putapage() * zfs_frlock() * zfs_no_putpage() Signed-off-by: Brian Behlendorf <behlendorf@llnl.gov>	2011-07-01 12:22:57 -07:00
Ned A. Bass	aa6d8c1086	Don't store rdev in SA for FIFOs and sockets Update the handling of named pipes and sockets to be consistent with other platforms with regard to the rdev attribute. While all ZFS ipmlementations store the rdev for device files in a system attribute (SA), this is not the case for FIFOs and sockets. Indeed, Linux always passes rdev=0 to mknod() for FIFOs and sockets, so the value is not needed. Add an ASSERT that rdev==0 for FIFOs and sockets to detect if the expected behavior ever changes. Closes #216	2011-05-09 13:35:07 -07:00
Brian Behlendorf	5f35b19007	Fully update inode when created When a new znode/inode pair is created both the znode and the inode should be immediately updated to the correct values. This was done for the znode and for most of the values in the inode, but not all of them. This normally wasn't a problem because most subsequent operations would cause the inode to be immediately updated. This change ensures the inode is now fully updated before it is inserted in to the inode hash. Closes #116 Closes #146 Closes #164	2011-05-02 14:04:19 -07:00
Gunnar Beutner	36df284366	Fixed a use-after-free bug in zfs_zget(). Fixed a bug where zfs_zget could access a stale znode pointer when the inode had already been removed from the inode cache via iput -> iput_final -> ... -> zfs_zinactive but the corresponding SA handle was still alive. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #180	2011-04-21 13:48:01 -07:00
Brian Behlendorf	c85b224faf	Call d_instantiate before unlocking inode Under Linux a dentry referencing an inode must be instantiated before the inode is unlocked. To accomplish this without overly modifing the core ZFS code the dentry it passed via the vattr_t. There are cases such as replay when a dentry is not available. In which case it is obviously not initialized at inode creation time, if a dentry is needed it will be spliced as when required via d_lookup().	2011-04-07 09:51:57 -07:00
Brian Behlendorf	d6bd8eaae4	Fix evict() deadlock Now that KM_SLEEP is not defined as GFP_NOFS there is the possibility of synchronous reclaim deadlocks. These deadlocks never existed in the original OpenSolaris code because all memory reclaim on Solaris is done asyncronously. Linux does both synchronous (direct) and asynchronous (indirect) reclaim. This commit addresses a deadlock caused by inode eviction. A KM_SLEEP allocation may trigger direct memory reclaim and shrink the inode cache. This can occur while a mutex in the array of ZFS_OBJ_HOLD mutexes is held. Through the ->shrink_icache_memory()->evict()->zfs_inactive()-> zfs_zinactive() call path the same mutex may be reacquired resulting in a deadlock. To avoid this deadlock the process must not reacquire the mutex when it is already holding it. This is a reasonable fix for now but longer term the ZFS_OBJ_HOLD mutex locking should be reevaluated. This infrastructure already prevents us from ever using the Linux lock dependency analysis tools, and it may limit scalability.	2011-03-22 12:14:55 -07:00
Brian Behlendorf	691f6ac4c2	Use KM_PUSHPAGE instead of KM_SLEEP It used to be the case that all KM_SLEEP allocations were GFS_NOFS. Unfortunately this often resulted in the kernel being unable to reclaim the ARC, inode, and dentry caches in a timely manor. The fix was to make KM_SLEEP a GFP_KERNEL allocation in the SPL. However, this increases the posibility of deadlocking the system on a zfs write thread. If a zfs write thread attempts to perform an allocation it may trigger synchronous reclaim. This reclaim may attempt to flush dirty data/inode to disk to free memory. Unforunately, this write cannot finish because the write thread which would handle it is holding the previous transaction open. Deadlock. To avoid this all allocations in the zfs write thread path must use KM_PUSHPAGE which prohibits synchronous reclaim for that thread. In this way forward progress in ensured. The risk with this change is I missed updating an allocation for the write threads leaving an increased posibility of deadlock. If any deadlocks remain they will be unlikely but we'll have to make sure they all get fixed.	2011-03-22 12:14:55 -07:00
Brian Behlendorf	adf2e8778e	Fix O_APPEND Corruption Due to an uninitialized variable files opened with O_APPEND may overwrite the start of the file rather than append to it. This was introduced accidentally when I removed the Solaris vnodes. The zfs_range_lock_writer() function used to key off zf->z_vnode to determine if a znode_t was for a zvol of zpl object. With the removal of vnodes this was replaced by the flag zp->z_is_zvol. This flag was used to control the append behavior for range locks. Unfortunately, this value was never properly initialized after the vnode removal. However, because most of memory is usually zeros it happened to be set correctly most of the time making the bug appear racy. Properly initializing zp->z_is_zvol to zero completely resolves the problem with O_APPEND. Closes #126	2011-03-09 13:31:00 -08:00
Brian Behlendorf	5484965ab6	Drop HAVE_XVATTR macros When I began work on the Posix layer it immediately became clear to me that to integrate cleanly with the Linux VFS certain Solaris specific things would have to go. One of these things was to elimate as many Solaris specific types from the ZPL layer as possible. They would be replaced with their Linux equivalents. This would not only be good for performance, but for the general readability and health of the code. The Solaris and Linux VFS are different beasts and should be treated as such. Most of the code remains common for constructing transactions and such, but there are subtle and important differenced which need to be repsected. This policy went quite for for certain types such as the vnode_t, and it initially seemed to be working out well for the vattr_t. There was a relatively small amount of related xvattr_t code I was forced to comment out with HAVE_XVATTR. But it didn't look that hard to come back soon and replace it all with a native Linux type. However, after going doing this path with xvattr some distance it clear that this code was woven in the ZPL more deeply than I thought. In particular its hooks went very deep in to the ZPL replay code and replacing it would not be as easy as I originally thought. Rather than continue persuing replacing and removing this code I've taken a step back and reevaluted things. This commit reverts many of my previous commits which removed xvattr related code. It restores much of the code to its original upstream state and now relies on improved xvattr_t support in the zfs package itself. The result of this is that much of the code which I had commented out, which accidentally broke things like replay, is now back in place and working. However, there may be a small performance impact for getattr/setattr operations because they now require a translation from native Linux to Solaris types. For now that's a price I'm willing to pay. Once everything is completely functional we can revisting the issue of removing the vattr_t/xvattr_t types. Closes #111	2011-03-02 11:44:34 -08:00
Brian Behlendorf	dc1d7665c5	Remove rdev packing Remove custom code to pack/unpack dev_t's. Under Linux all dev_t's are an unsigned 32-bit value even on 64-bit platforms. The lower 20 bits are used for the minor number and the upper 12 for the major number. This means if your importing a pool from Solaris you may get strange major/minor numbers. But it doesn't really matter because even if we add compatibility code to translate the encoded Solaris major/minor they won't do you any good under Linux. You will still need to recreate the dev_t with a major/minor which maps to reserved major numbers used under Linux. Dropping this code also resolves 32-bit builds by removing the offending 32-bit compatibility code.	2011-02-23 15:13:03 -08:00
Brian Behlendorf	d8fd10545b	Fix FIFO and socket handling Under Linux when creating a fifo or socket type device in the ZFS filesystem it's critical that the rdev is stored in a SA. This was already being correctly done for character and block devices, but that logic needed to be extended to include FIFOs and sockets. This patch takes care of device creation but a follow on patch may still be required to verify that the dev_t is being correctly packed/unpacked from the SA.	2011-02-16 09:51:44 -08:00
Brian Behlendorf	3558fd73b5	Prototype/structure update for Linux I appologize in advance why to many things ended up in this commit. When it could be seperated in to a whole series of commits teasing that all apart now would take considerable time and I'm not sure there's much merrit in it. As such I'll just summerize the intent of the changes which are all (or partly) in this commit. Broadly the intent is to remove as much Solaris specific code as possible and replace it with native Linux equivilants. More specifically: 1) Replace all instances of zfsvfs_t with zfs_sb_t. While the type is largely the same calling it private super block data rather than a zfsvfs is more consistent with how Linux names this. While non critical it makes the code easier to read when your thinking in Linux friendly VFS terms. 2) Replace vnode_t with struct inode. The Linux VFS doesn't have the notion of a vnode and there's absolutely no good reason to create one. There are in fact several good reasons to remove it. It just adds overhead on Linux if we were to manage one, it conplicates the code, and it likely will lead to bugs so there's a good change it will be out of date. The code has been updated to remove all need for this type. 3) Replace all vtype_t's with umode types. Along with this shift all uses of types to mode bits. The Solaris code would pass a vtype which is redundant with the Linux mode. Just update all the code to use the Linux mode macros and remove this redundancy. 4) Remove using of vn_* helpers and replace where needed with inode helpers. The big example here is creating iput_aync to replace vn_rele_async. Other vn helpers will be addressed as needed but they should be be emulated. They are a Solaris VFS'ism and should simply be replaced with Linux equivilants. 5) Update znode alloc/free code. Under Linux it's common to embed the inode specific data with the inode itself. This removes the need for an extra memory allocation. In zfs this information is called a znode and it now embeds the inode with it. Allocators have been updated accordingly. 6) Minimal integration with the vfs flags for setting up the super block and handling mount options has been added this code will need to be refined but functionally it's all there. This will be the first and last of these to large to review commits.	2011-02-10 09:27:21 -08:00
Brian Behlendorf	eb28321e2d	Create a root znode without VFS dependencies For portability reasons it's handy to be able to create a root znode and basic filesystem components without requiring the full cooperation of the VFS. We are committing to this to simply the filesystem creations code.	2011-02-10 09:27:21 -08:00
Brian Behlendorf	4b3f12ecd5	Remove Solaris VFS Hooks The ZFS code is being restructured to act as a library and a stand alone module. This allows us to leverage most of the existing code with minimal modification. It also means we need to drop the Solaris vfs/vnode functions they will be replaced by Linux equivilants and updated to be Linux friendly.	2011-02-10 09:27:20 -08:00
Brian Behlendorf	960e08fe3e	VFS: Add zfs_inode_update() helper For the moment we have left ZFS unchanged and it updates many values as part of the znode. However, some of these values should be set in the inode. For the moment this is handled by adding a function called zfs_inode_update() which updates the inode based on the znode. This is considered a workaround until we can systematically go through the ZFS code and have it directly update the inode. At which point zfs_update_inode() can be dropped entirely. Keeping two copies of the same data isn't only inefficient it's a breeding ground for bugs.	2011-02-10 09:27:20 -08:00
Brian Behlendorf	7304b6e50f	VFS: Integrate zfs_znode_alloc() Under Linux the convention for filesystem specific data structure is to embed it along with the generic vfs data structure. This differs significantly from Solaris. Since we want to integrates as cleanly with the Linux VFS as possible. This changes modifies zfs_znode_alloc() to allocate a znode with an embedded inode for use with the generic VFS. This is done by calling iget_locked() which will allocate a new inode if needed by calling sb->alloc_inode(). This function allocates enough memory for a znode_t by returns a pointer to the inode structure for Linux's VFS. This function is also responsible for setting the callback znode->z_set_ops_inodes() which is used to register the correct handlers for the inode.	2011-02-10 09:27:20 -08:00
Brian Behlendorf	10c6047ea5	Enable zfs_znode compilation Basic compilation of the bulk of zfs_znode.c has been enabled. After much consideration it was decided to convert the existing vnode based interfaces to more friendly Linux interfaces. The following commits will systematically replace update the requiter interfaces. There are of course pros and cons to this decision. Pros: * This simplifies intergration with Linux in the long term. There is no longer any need to manage vnodes which are a foreign concept to the Linux VFS. * Improved long term maintainability. * Minor performance improvements by removing vnode overhead. Cons: * Added work in the short term to modify multiple ZFS interfaces. * Harder to pull in changes if we ever see any new code from Solaris. * Mixed Solaris and Linux interfaces in some ZFS code.	2011-02-10 09:27:20 -08:00
Brian Behlendorf	9ee7fac531	VFS: Wrap with HAVE_SHARE Certain NFS/SMB share functionality is not yet in place. These functions used to be wrapped with the generic HAVE_ZPL to prevent them from being compiled. I still don't want them compiled but I'm working toward eliminating the use of HAVE_ZPL. So I'm just renaming the wrapper here to HAVE_SHARE. They still won't be compiled until all the share issues are worked through. Share support is the last missing piece from zfs_ioctl.c.	2011-02-10 09:21:43 -08:00
Brian Behlendorf	5649246dd3	Remove znode move functionality Unlike Solaris the Linux implementation embeds the inode in the znode, and has no use for a vnode. So while it's true that fragmention of the znode cache may occur it should not be worse than any of the other Linux FS inode caches. Until proven that this is a problem it's just added complexity we don't need.	2011-02-10 09:21:42 -08:00
Brian Behlendorf	f30484afc3	Conserve stack in zfs_mkdir() Move the sa_attrs array from the stack to the heap to minimize stack space usage.	2011-02-10 09:21:42 -08:00
Brian Behlendorf	149e873ab1	Fix minor compiler warnings These compiler warnings were introduced when code which was previously #ifdef'ed out by HAVE_ZPL was re-added for use by the posix layer. All of the following changes should be obviously correct and will cause no semantic changes.	2011-01-06 15:04:28 -08:00
Brian Behlendorf	c28b227942	Add linux kernel module support Setup linux kernel module support, this includes: - zfs context for kernel/user - kernel module build system integration - kernel module macros - kernel module symbol export - kernel module options Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2010-08-31 13:41:58 -07:00
Brian Behlendorf	60101509ee	Add linux kernel disk support Native Linux vdev disk interfaces Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2010-08-31 13:41:57 -07:00
Brian Behlendorf	753972fccf	Fix dbuf_dirty_record_t leaks Fix two leaks with dbuf_dirty_record_t Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2010-08-31 08:38:44 -07:00
Brian Behlendorf	d6320ddb78	Fix gcc c90 compliance warnings Fix non-c90 compliant code, for the most part these changes simply deal with where a particular variable is declared. Under c90 it must alway be done at the very start of a block. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2010-08-27 15:28:32 -07:00
Brian Behlendorf	572e285762	Update to onnv_147 This is the last official OpenSolaris tag before the public development tree was closed.	2010-08-26 14:24:34 -07:00
Brian Behlendorf	428870ff73	Update core ZFS code from build 121 to build 141.	2010-05-28 13:45:14 -07:00
Brian Behlendorf	45d1cae3b8	Rebase master to b121	2009-08-18 11:43:27 -07:00
Brian Behlendorf	9babb37438	Rebase master to b117	2009-07-02 15:44:48 -07:00
Brian Behlendorf	d164b20935	Rebase master to b108	2009-02-18 12:51:31 -08:00
Brian Behlendorf	fb5f0bc833	Rebase master to b105	2009-01-15 13:59:39 -08:00
Brian Behlendorf	172bb4bd5e	Move the world out of /zfs/ and seperate out module build tree	2008-12-11 11:08:09 -08:00

1 2 3

137 Commits