Commit Graph

59 Commits

Author SHA1 Message Date
Brian Behlendorf
ba6a24026c Remove ZFC_IOC_*_MINOR ioctl()s
Early versions of ZFS coordinated the creation and destruction
of device minors from userspace.  This was inherently racy and
in late 2009 these ioctl()s were removed leaving everything up
to the kernel.  This significantly simplified the code.

However, we never picked up these changes in ZoL since we'd
already significantly adjusted this code for Linux.  This patch
aims to rectify that by finally removing ZFC_IOC_*_MINOR ioctl()s
and moving all the functionality down in to the kernel.  Since
this cleanup will change the kernel/user ABI it's being done
in the same tag as the previous libzfs_core ABI changes.  This
will minimize, but not eliminate, the disruption to end users.

Once merged ZoL, Illumos, and FreeBSD will basically be back
in sync in regards to handling ZVOLs in the common code.  While
each platform must have its own custom zvol.c implemenation the
interfaces provided are consistent.

NOTES:

1) This patch introduces one subtle change in behavior which
   could not be easily avoided.  Prior to this change callers
   of 'zfs create -V ...' were guaranteed that upon exit the
   /dev/zvol/ block device link would be created or an error
   returned.  That's no longer the case.  The utilities will no
   longer block waiting for the symlink to be created.  Callers
   are now responsible for blocking, this is why a 'udev_wait'
   call was added to the 'label' function in scripts/common.sh.

2) The read-only behavior of a ZVOL now solely depends on if
   the ZVOL_RDONLY bit is set in zv->zv_flags.  The redundant
   policy setting in the gendisk structure was removed.  This
   both simplifies the code and allows us to safely leverage
   set_disk_ro() to issue a KOBJ_CHANGE uevent.  See the
   comment in the code for futher details on this.

3) Because __zvol_create_minor() and zvol_alloc() may now be
   called in a sync task they must use KM_PUSHPAGE.

References:
  illumos/illumos-gate@681d9761e8

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ned Bass <bass6@llnl.gov>
Signed-off-by: Tim Chase <tim@chase2k.com>
Closes #1969
2013-12-16 09:15:57 -08:00
Will Andrews
d09f25dc66 Illumos #3744
3744 zfs shouldn't ignore errors unmounting snapshots
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Christopher Siden <christopher.siden@delphix.com>

References:
  https://www.illumos.org/issues/3744
  illumos/illumos-gate@fc7a6e3fef

Ported-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #1775

Porting notes:

1. There is no clear way to distinguish between a failure when we
   tried to unmount the snapdir of a zvol (which does not exist)
   and the failure when we try to unmount a snapdir of a dataset,
   so the changes to zfs_unmount_snap() were dropped in favor of
   an altered Linux function that unconditionally returns 0.
2013-11-04 10:55:25 -08:00
Massimo Maggi
023699cd62 Posix ACL Support
This change adds support for Posix ACLs by storing them as an xattr
which is common practice for many Linux file systems.  Since the
Posix ACL is stored as an xattr it will not overwrite any existing
ZFS/NFSv4 ACLs which may have been set.  The Posix ACL will also
be non-functional on other platforms although it may be visible
as an xattr if that platform understands SA based xattrs.

By default Posix ACLs are disabled but they may be enabled with
the new 'aclmode=noacl|posixacl' property.  Set the property to
'posixacl' to enable them.  If ZFS/NFSv4 ACL support is ever added
an appropriate acltype will be added.

This change passes the POSIX Test Suite cleanly with the exception
of xacl/00.t test 45 which is incorrect for Linux (Ext4 fails too).

  http://www.tuxera.com/community/posix-test-suite/

Signed-off-by: Massimo Maggi <me@massimo-maggi.eu>
Signed-off-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #170
2013-10-29 14:54:26 -07:00
Matthew Ahrens
13fe019870 Illumos #3464
3464 zfs synctask code needs restructuring
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Adam Leventhal <ahl@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Christopher Siden <christopher.siden@delphix.com>
Approved by: Garrett D'Amore <garrett@damore.org>

References:
  https://www.illumos.org/issues/3464
  illumos/illumos-gate@3b2aab1880

Ported-by: Tim Chase <tim@chase2k.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #1495
2013-09-04 16:01:24 -07:00
Matthew Ahrens
6f1ffb0665 Illumos #2882, #2883, #2900
2882 implement libzfs_core
2883 changing "canmount" property to "on" should not always remount dataset
2900 "zfs snapshot" should be able to create multiple, arbitrary snapshots at once

Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Chris Siden <christopher.siden@delphix.com>
Reviewed by: Garrett D'Amore <garrett@damore.org>
Reviewed by: Bill Pijewski <wdp@joyent.com>
Reviewed by: Dan Kruchinin <dan.kruchinin@gmail.com>
Approved by: Eric Schrock <Eric.Schrock@delphix.com>

References:
  https://www.illumos.org/issues/2882
  https://www.illumos.org/issues/2883
  https://www.illumos.org/issues/2900
  illumos/illumos-gate@4445fffbbb

Ported-by: Tim Chase <tim@chase2k.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #1293

Porting notes:

WARNING: This patch changes the user/kernel ABI.  That means that
the zfs/zpool utilities built from master are NOT compatible with
the 0.6.2 kernel modules.  Ensure you load the matching kernel
modules from master after updating the utilities.  Otherwise the
zfs/zpool commands will be unable to interact with your pool and
you will see errors similar to the following:

  $ zpool list
  failed to read pool configuration: bad address
  no pools available

  $ zfs list
  no datasets available

Add zvol minor device creation to the new zfs_snapshot_nvl function.

Remove the logging of the "release" operation in
dsl_dataset_user_release_sync().  The logging caused a null dereference
because ds->ds_dir is zeroed in dsl_dataset_destroy_sync() and the
logging functions try to get the ds name via the dsl_dataset_name()
function. I've got no idea why this particular code would have worked
in Illumos.  This code has subsequently been completely reworked in
Illumos commit 3b2aab1 (3464 zfs synctask code needs restructuring).

Squash some "may be used uninitialized" warning/erorrs.

Fix some printf format warnings for %lld and %llu.

Apply a few spa_writeable() changes that were made to Illumos in
illumos/illumos-gate.git@cd1c8b8 as part of the 3112, 3113, 3114 and
3115 fixes.

Add a missing call to fnvlist_free(nvl) in log_internal() that was added
in Illumos to fix issue 3085 but couldn't be ported to ZoL at the time
(zfsonlinux/zfs@9e11c73) because it depended on future work.
2013-09-04 15:49:00 -07:00
George.Wilson
cc92e9d0c3 3246 ZFS I/O deadman thread
Reviewed by: Matt Ahrens <matthew.ahrens@delphix.com>
Reviewed by: Eric Schrock <eric.schrock@delphix.com>
Reviewed by: Christopher Siden <chris.siden@delphix.com>
Approved by: Garrett D'Amore <garrett@damore.org>

NOTES: This patch has been reworked from the original in the
following ways to accomidate Linux ZFS implementation

*) Usage of the cyclic interface was replaced by the delayed taskq
   interface.  This avoids the need to implement new compatibility
   code and allows us to rely on the existing taskq implementation.

*) An extern for zfs_txg_synctime_ms was added to sys/dsl_pool.h
   because declaring externs in source files as was done in the
   original patch is just plain wrong.

*) Instead of panicing the system when the deadman triggers a
   zevent describing the blocked vdev and the first pending I/O
   is posted.  If the panic behavior is desired Linux provides
   other generic methods to panic the system when threads are
   observed to hang.

*) For reference, to delay zios by 30 seconds for testing you can
   use zinject as follows: 'zinject -d <vdev> -D30 <pool>'

References:
  illumos/illumos-gate@283b84606b
  https://www.illumos.org/issues/3246

Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #1396
2013-05-01 17:05:52 -07:00
Eric Dillmann
0b4d1b5853 Add snapdev=[hidden|visible] dataset property
The new snapdev dataset property may be set to control the
visibility of zvol snapshot devices.  By default this value
is set to 'hidden' which will prevent zvol snapshots from
appearing under /dev/zvol/ and /dev/<dataset>/.  When set to
'visible' all zvol snapshots for the dataset will be visible.

This functionality was largely added because when automatic
snapshoting is enabled large numbers of read-only zvol snapshots
will be created.  When creating these devices the kernel will
attempt to read their partition tables, and blkid will attempt
to identify any filesystems on those partitions.  This leads
to a variety of issues:

1) The zvol partition tables will be read in the context of
   the `modprobe zfs` for automatically imported pools.  This
   is undesirable and should be done asynchronously, but for
   now reducing the number of visible devices helps.

2) Udev expects to be able to complete its work for a new
   block devices fairly quickly.  When many zvol devices are
   added at the same time this is no longer be true.  It can
   lead to udev timeouts and missing /dev/zvol links.

3) Simply having lots of devices in /dev/ can be aukward from
   a management standpoint.  Hidding the devices your unlikely
   to ever use helps with this.  Any snapshot device which is
   needed can be made visible by changing the snapdev property.

NOTE: This patch changes the default behavior for zvols which
      was effectively 'snapdev=visible'.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #1235
Closes #945
Issue #956
Issue #756
2013-03-05 12:37:54 -08:00
Pawel Jakub Dawidek
0cee24064a Speed up 'zfs list -t snapshot -o name -s name'
FreeBSD #xxx:  Dramatically optimize listing snapshots when user
requests only snapshot names and wants to sort them by name, ie.
when executes:

  # zfs list -t snapshot -o name -s name

Because only name is needed we don't have to read all snapshot
properties.

Below you can find how long does it take to list 34509 snapshots
from a single disk pool before and after this change with cold and
warm cache:

    before:

        # time zfs list -t snapshot -o name -s name > /dev/null
        cold cache: 525s
        warm cache: 218s

    after:

        # time zfs list -t snapshot -o name -s name > /dev/null
        cold cache: 1.7s
        warm cache: 1.1s

NOTE: This patch only appears in FreeBSD.  If/when Illumos picks up
the change we may want to drop this patch and adopt their version.
However, for now this addresses a real issue.

Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #450
2012-06-14 09:49:04 -07:00
Brian Behlendorf
6283f55ea1 Support custom build directories and move includes
One of the neat tricks an autoconf style project is capable of
is allow configurion/building in a directory other than the
source directory.  The major advantage to this is that you can
build the project various different ways while making changes
in a single source tree.

For example, this project is designed to work on various different
Linux distributions each of which work slightly differently.  This
means that changes need to verified on each of those supported
distributions perferably before the change is committed to the
public git repo.

Using nfs and custom build directories makes this much easier.
I now have a single source tree in nfs mounted on several different
systems each running a supported distribution.  When I make a
change to the source base I suspect may break things I can
concurrently build from the same source on all the systems each
in their own subdirectory.

wget -c http://github.com/downloads/behlendorf/zfs/zfs-x.y.z.tar.gz
tar -xzf zfs-x.y.z.tar.gz
cd zfs-x-y-z

------------------------- run concurrently ----------------------
<ubuntu system>  <fedora system>  <debian system>  <rhel6 system>
mkdir ubuntu     mkdir fedora     mkdir debian     mkdir rhel6
cd ubuntu        cd fedora        cd debian        cd rhel6
../configure     ../configure     ../configure     ../configure
make             make             make             make
make check       make check       make check       make check

This change also moves many of the include headers from individual
incude/sys directories under the modules directory in to a single
top level include directory.  This has the advantage of making
the build rules cleaner and logically it makes a bit more sense.
2010-09-08 12:38:56 -07:00