mirror_zfs

mirror of https://git.proxmox.com/git/mirror_zfs.git synced 2024-12-27 11:29:36 +03:00

Author	SHA1	Message	Date
Etienne Dechamps	34037afe24	Improve ZVOL queue behavior. The Linux block device queue subsystem exposes a number of configurable settings described in Linux block/blk-settings.c. The defaults for these settings are tuned for hard drives, and are not optimized for ZVOLs. Proper configuration of these options would allow upper layers (I/O scheduler) to take better decisions about write merging and ordering. Detailed rationale: - max_hw_sectors is set to unlimited (UINT_MAX). zvol_write() is able to handle writes of any size, so there's no reason to impose a limit. Let the upper layer decide. - max_segments and max_segment_size are set to unlimited. zvol_write() will copy the requests' contents into a dbuf anyway, so the number and size of the segments are irrelevant. Let the upper layer decide. - physical_block_size and io_opt are set to the ZVOL's block size. This has the potential to somewhat alleviate issue #361 for ZVOLs, by warning the upper layers that writes smaller than the volume's block size will be slow. - The NONROT flag is set to indicate this isn't a rotational device. Although the backing zpool might be composed of rotational devices, the resulting ZVOL often doesn't exhibit the same behavior due to the COW mechanisms used by ZFS. Setting this flag will prevent upper layers from making useless decisions (such as reordering writes) based on incorrect assumptions about the behavior of the ZVOL. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-02-07 16:23:06 -08:00
Etienne Dechamps	b18019d2d8	Fix synchronicity for ZVOLs. zvol_write() assumes that the write request must be written to stable storage if rq_is_sync() is true. Unfortunately, this assumption is incorrect. Indeed, "sync" does not mean what we think it means in the context of the Linux block layer. This is well explained in linux/fs.h: WRITE: A normal async write. Device will be plugged. WRITE_SYNC: Synchronous write. Identical to WRITE, but passes down the hint that someone will be waiting on this IO shortly. WRITE_FLUSH: Like WRITE_SYNC but with preceding cache flush. WRITE_FUA: Like WRITE_SYNC but data is guaranteed to be on non-volatile media on completion. In other words, SYNC does not mean that the write must be on stable storage on completion. It just means that someone is waiting on us to complete the write request. Thus triggering a ZIL commit for each SYNC write request on a ZVOL is unnecessary and harmful for performance. To make matters worse, ZVOL users have no way to express that they actually want data to be written to stable storage, which means the ZIL is broken for ZVOLs. The request for stable storage is expressed by the FUA flag, so we must commit the ZIL after the write if the FUA flag is set. In addition, we must commit the ZIL before the write if the FLUSH flag is set. Also, we must inform the block layer that we actually support FLUSH and FUA. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-02-07 16:23:06 -08:00
Etienne Dechamps	56c34bac44	Support "sync=always" for ZVOLs. Currently the "sync=always" property works for regular ZFS datasets, but not for ZVOLs. This patch remedies that. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Fixes #374.	2012-02-07 16:23:06 -08:00
Brian Behlendorf	30a9524e45	Set zvol_major/zvol_threads permissions The zvol_major and zvol_threads module options were being created with 0 permission bits. This prevented them from being listed in the /sys/module/zfs/parameters/ directory, although they were visible in `modinfo zfs`. This patch fixes the issue by updating the permission bits to 0444. For the moment these options must be read-only because they are used during module initialization. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #392	2011-12-07 09:27:50 -08:00
Brian Behlendorf	df554c148e	Fix 'zfs set volsize=N pool/dataset' This change fixes a kernel panic which would occur when resizing a dataset which was not open. The objset_t stored in the zvol_state_t will be set to NULL when the block device is closed. To avoid this issue we pass the correct objset_t as the third arg. The code has also been updated to correctly notify the kernel when the block device capacity changes. For 2.6.28 and newer kernels the capacity change will be immediately detected. For earlier kernels the capacity change will be detected when the device is next opened. This is a known limitation of older kernels. Online ext3 resize test case passes on 2.6.28+ kernels: $ dd if=/dev/zero of=/tmp/zvol bs=1M count=1 seek=1023 $ zpool create tank /tmp/zvol $ zfs create -V 500M tank/zd0 $ mkfs.ext3 /dev/zd0 $ mkdir /mnt/zd0 $ mount /dev/zd0 /mnt/zd0 $ df -h /mnt/zd0 $ zfs set volsize=800M tank/zd0 $ resize2fs /dev/zd0 $ df -h /mnt/zd0 Original-patch-by: Fajar A. Nugraha <github@fajar.net> Closes #68 Closes #84	2011-05-02 08:54:40 -07:00
Fajar A. Nugraha	4c0d8e50b9	Use udev to create /dev/zvol/[dataset_name] links This commit allows zvols with names longer than 32 characters, which fixes the issue reported at https://github.com/behlendorf/zfs/issues/#issue/102. Changes include: - use /dev/zd* device names for zvols, where * is the device minor (include/sys/fs/zfs.h, module/zfs/zvol.c). - add BLKZNAME ioctl to get dataset name from userland (include/sys/fs/zfs.h, module/zfs/zvol.c, cmd/zvol_id). - add udev rule to create /dev/zvol/[dataset_name] and the legacy /dev/[dataset_name] symlink. For partitions on a zvol, it will create /dev/zvol/[dataset_name]-part* (etc/udev/rules.d/60-zvol.rules, cmd/zvol_id). Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2011-02-25 09:43:19 -08:00
Brian Behlendorf	61e909608d	Linux 2.6.x compat, blkdev_compat.h For legacy reasons the zvol.c and vdev_disk.c Linux compatibility code ended up in sys/blkdev.h and sys/vdev_disk.h headers. While there are worse places for this code to live it should be in a linux/blkdev_compat.h header. This change moves this block device Linux compatibility code into the linux/blkdev_compat.h header and updates all the correct #include locations. This is not a functional change or bug fix, it is just code cleanup.	2011-02-23 12:29:38 -08:00
Brian Behlendorf	d567444809	Create minors for all zvols It was noticed that when you have zvols in multiple datasets not all of the zvol devices are created at module load time. Fajarnugraha did the leg work to identify that the root cause of this bug is a non-zero return value from zvol_create_minors_cb(). Returning a non-zero value from the dmu_objset_find_spa() callback function results in aborting processing the remaining children in a dataset. Since we want to ensure that the callback is run on all children regardless of error, simply unconditionally return zero from zvol_create_minors_cb(). This callback function is solely used for this purpose so suppressing the error is safe. Closes #96	2011-02-16 09:50:06 -08:00
Brian Behlendorf	3c4988c83e	Add zp->z_is_zvol flag A new flag is required for the zfs_rlock code to determine if it is operating on a zvol or a zpl dataset. This used to be keyed off the zp->z_vnode, which was a hack to begin with, but with the removal of vnodes we needed a dedicated flag.	2011-02-10 09:27:21 -08:00
Ned Bass	b1c5821375	Fix panic mounting unformatted zvol On some older kernels, i.e. 2.6.18, zvol_ioctl_by_inode() may get passed a NULL file pointer if the user tries to mount a zvol without a filesystem on it. This change adds checks to prevent a null pointer dereference. Closes #73. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2010-10-29 14:46:33 -07:00
Brian Behlendorf	60101509ee	Add linux kernel disk support Native Linux vdev disk interfaces Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2010-08-31 13:41:57 -07:00
Brian Behlendorf	0aa61e8427	Remove zvol.c when updating in update-zfs.sh Linux version available.	2009-11-15 16:20:01 -08:00
Brian Behlendorf	9babb37438	Rebase master to b117	2009-07-02 15:44:48 -07:00
Brian Behlendorf	d164b20935	Rebase master to b108	2009-02-18 12:51:31 -08:00
Brian Behlendorf	fb5f0bc833	Rebase master to b105	2009-01-15 13:59:39 -08:00
Brian Behlendorf	172bb4bd5e	Move the world out of /zfs/ and separate out module build tree	2008-12-11 11:08:09 -08:00
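
The FLUSH/FUA handling described in commit b18019d2d8, together with the "sync=always" support from 56c34bac44, can be sketched as a small user-space model. This is only an illustration of the ordering rules stated in those commit messages, not the code in module/zfs/zvol.c: the Request class, the plan_write() function, and the step names are hypothetical, and treating "sync=always" as "commit the ZIL after every write" is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    """Hypothetical stand-in for a block-layer write request."""
    sync: bool = False   # hint only: someone will be waiting on this I/O
    flush: bool = False  # cache flush requested before this write
    fua: bool = False    # data must be on non-volatile media on completion


def plan_write(req: Request, sync_always: bool = False) -> List[str]:
    """Return the ordered steps this model takes for one ZVOL write."""
    steps = []
    if req.flush:
        # FLUSH: commit the ZIL before the write.
        steps.append("zil_commit")
    steps.append("write")
    if req.fua or sync_always:
        # FUA (or, by assumption, sync=always): commit the ZIL after the write.
        steps.append("zil_commit")
    # The SYNC hint on its own deliberately triggers no ZIL commit.
    return steps


if __name__ == "__main__":
    print(plan_write(Request(sync=True)))
    print(plan_write(Request(flush=True, fua=True)))
```

In this model a request carrying only the SYNC hint yields a plain write, which matches the commit's point that SYNC does not imply stable storage.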

166 Commits