Commit Graph

199 Commits

Author SHA1 Message Date
Brian Behlendorf
dbf763b39b Retire zpool_id infrastructure
In the interest of maintaining only one udev helper to give vdevs
user friendly names, the zpool_id and zpool_layout infrastructure
is being retired.  They are superseded by vdev_id which incorporates
all the previous functionality.

Documentation for the new vdev_id(8) helper and its configuration
file, vdev_id.conf(5), can be found in their respective man pages.
Several useful example files are installed under /etc/zfs/.

  /etc/zfs/vdev_id.conf.alias.example
  /etc/zfs/vdev_id.conf.multipath.example
  /etc/zfs/vdev_id.conf.sas_direct.example
  /etc/zfs/vdev_id.conf.sas_switch.example

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #981
2013-01-29 12:23:17 -08:00
Brian Behlendorf
79c6e4c445 Remove NPTL_GUARD_WITHIN_STACK
Commit 4b2f65b253 increased the user
space stack by 4x to resolve certain stack overflows.  As such it
no longer makes sense to worry about a single extra page which
might or might not be part of the process stack.  There is now
ample headroom for normal usage.

By eliminating this configure check we are also resolving the
following segfault which intentionally occurs at configure time
and may be logged in dmesg.

  conftest[22156]: segfault at 7fbf18a47e48 ip 00000000004007fe
  sp 00007fbf18a4be50 error 6 in conftest[400000+1000]

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2013-01-29 10:58:20 -08:00
Eric Dillmann
9759c60f1a Illumos #3035 LZ4 compression support in ZFS and GRUB
3035 LZ4 compression support in ZFS and GRUB

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Christopher Siden <christopher.siden@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Christopher Siden <csiden@delphix.com>

References:
  illumos/illumos-gate@a6f561b4ae
  https://www.illumos.org/issues/3035
  http://wiki.illumos.org/display/illumos/LZ4+Compression+In+ZFS

This patch has been slightly modified from the upstream Illumos
version to be compatible with Linux.  Due to the very limited
stack space in the kernel a lz4 workspace kmem cache is used.
Since we are using gcc we are also able to take advantage of the
gcc optimized __builtin_ctz functions.

Support for GRUB has been dropped from this patch.  That code
is available but those changes will need to made to the upstream
GRUB package.

Lastly, several hunks of dead code were dropped for clarity.  They
include the functions real_LZ4_uncompress(), LZ4_compressBound()
and the Visual Studio specific hunks wrapped in _MSC_VER.

Ported-by: Eric Dillmann <eric@jave.fr>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #1217
2013-01-29 09:28:20 -08:00
Brian Behlendorf
14ee71efbc Use strerror() not strerror_r()
The differ() function used strerror_r() instead of strerror() because
it allowed the error message to be directly copied in to a buffer.
This causes two issues under Linux.

* There are two versions of strerror_r() available an XSI-compliant
  version which returns an 'int' error code.  And a GNU-specific
  version which return a 'char *' to the resulting error string.

    int strerror_r(int errnum, char *buf, size_t buflen);   /* XSI */
    char *strerror_r(int errnum, char *buf, size_t buflen); /* GNU */

* The most recent versions of strerror_r() are annotated with the
  warn_unused_result attribute.  This causes the following warning
  since the upstream implementation casts the result to void.

    warning: ignoring return value of 'strerror_r', declared with
    attribute warn_unused_result [-Wunused-result]

The cleanest way to resolve both of these problems is just to use
strerror() and make a copy of the result in to the buffer.  This
resolves both issues and this is the only instance of strerror_r()
in the code base.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #1231
2013-01-28 10:02:38 -08:00
Darik Horn
38145d6129 Ensure that zfs diff prints unicode safely.
In the stream_bytes() library function used by `zfs diff`, explicitly
cast each byte in the input string to an unsigned character so that the
Linux fprintf() correctly escapes to octal and does not mangle the output.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #1172
2013-01-16 10:15:57 -08:00
Garrett D'Amore
844793c3cc Illumos #1557 assertion failed in userland taskq_destroy()
1557 assertion failed in userland taskq_destroy()

Reviewed by: Richard Lowe <richlowe@richlowe.net>
Reviewed by: George Wilson <gwilson@zfsmail.com>
Approved by: Eric Schrock <eric.schrock@delphix.com>

References:
  illumos/illumos-gate@aa846ad9bc
  illumos changeset: 13597:3eac1e8e0f4c
  https://www.illumos.org/issues/1557

Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
2013-01-11 09:17:06 -08:00
Christopher Siden
b9b24bb4ca Illumos #2762: zpool command should have better support for feature flags
2762 zpool command should have better support for feature flags
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Eric Schrock <Eric.Schrock@delphix.com>

References:
  illumos/illumos-gate@57221772c3
  https://www.illumos.org/issues/2762

Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
2013-01-08 10:35:43 -08:00
George Wilson
3bc7e0fb0f Illumos #3090 and #3102
3090 vdev_reopen() during reguid causes vdev to be treated as corrupt
3102 vdev_uberblock_load() and vdev_validate() may read the wrong label

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Christopher Siden <chris.siden@delphix.com>
Reviewed by: Garrett D'Amore <garrett@damore.org>
Approved by: Eric Schrock <Eric.Schrock@delphix.com>

References:
  illumos/illumos-gate@dfbb943217
  illumos changeset: 13777:b1e53580146d
  https://www.illumos.org/issues/3090
  https://www.illumos.org/issues/3102

Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #939
2013-01-08 10:35:42 -08:00
Christopher Siden
9ae529ec5d Illumos #2619 and #2747
2619 asynchronous destruction of ZFS file systems
2747 SPA versioning with zfs feature flags
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <gwilson@delphix.com>
Reviewed by: Richard Lowe <richlowe@richlowe.net>
Reviewed by: Dan Kruchinin <dan.kruchinin@gmail.com>
Approved by: Eric Schrock <Eric.Schrock@delphix.com>

References:
  illumos/illumos-gate@53089ab7c8
  illumos/illumos-gate@ad135b5d64
  illumos changeset: 13700:2889e2596bd6
  https://www.illumos.org/issues/2619
  https://www.illumos.org/issues/2747

NOTE: The grub specific changes were not ported.  This change
must be made to the Linux grub packages.

Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
2013-01-08 10:35:35 -08:00
Massimo Maggi
5e6320cd12 Fix get/set users/groups in quota props via numeric id
Fix setting/getting users/groups in quota properties through
numeric identifier.  This support was accidentally disabled
in the original port by applying the HAVE_IDMAP wrapper macro
too broadly.

Fix obtained by moving #ifdef HAVE_IDMAP to exclude only
the part of code that really needs IDMAP.  Now zfs (get|set)
(user|group)quota@1000 works as expected.

Signed-off-by: Massimo Maggi <massimo@mmmm.it>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #1147
2012-12-17 09:52:58 -08:00
Jorgen Lundman
53c2ec1d1b Fix 'zpool create' segfault due to bad syntax
Incorrect syntax should never cause a segfault.  In this case
listing multiple comma delimited options after '-o' triggered
the problem.  For example:

  zpool create -o ashift=12,listsnaps=on

This patch resolves the issue by wrapping the calls which use
hdr with a NULL test.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #1118
2012-12-04 11:15:25 -08:00
Turbo Fredriksson
645fb9cc21 Implemented sharing datasets via SMB using libshare
Add the initial support for the 'smbshare' option using the
existing libshare infrastructure.  Because this implementation
relies on usershares samba version 3.0.23 is required.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #493
2012-12-03 09:42:15 -08:00
Brian Behlendorf
c372b36e3e Allow GPT+EFI vdevs for root pools
Commit 57a4edd allows the bootfs property to be set on any pool.
However, many of the zpool commands still prevent you from using
EFI labeled devices for the root pool.  For example:

    # zpool attach rpool /dev/sda /dev/sdb
    cannot label 'sdb': EFI labeled devices are not supported on
    root pools. on root devices.

For non-Solaris builds such as Linux disable this error.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #1077
2012-11-30 13:45:14 -08:00
Brian Behlendorf
0e20a31b4b Recreate minors when renaming zvols
When a zvol with snapshots is renamed the device files under
/dev/zvol/ are not renamed.  This patch resolves the problem
by destroying and recreating the minors with the new name so
the links can be recreated bu udev.

Original-patch-by: Suman Chakravartula <schakrava@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #408
2012-11-19 16:59:44 -08:00
Brian Behlendorf
e95853a331 Add txgs-<pool> kstat file
Create a kstat file which contains useful statistics about the
last N txgs processed.  This can be helpful when analyzing pool
performance.  The new KSTAT_TYPE_TXG type was added for this
purpose and it tracks the following statistics per-txg.

  txg          - Unique txg number
  state        - State (O)pen/(Q)uiescing/(S)yncing/(C)ommitted
  birth;       - Creation time
  nread        - Bytes read
  nwritten;    - Bytes written
  reads        - IOPs read
  writes       - IOPs write
  open_time;   - Length in nanoseconds the txg was open
  quiesce_time - Length in nanoseconds the txg was quiescing
  sync_time;   - Length in nanoseconds the txg was syncing

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2012-11-02 15:45:56 -07:00
Brian Behlendorf
30b937ee15 Update spare and cache device names on import
During 'zpool import' all ZPOOL_CONFIG_PATH names are supposed
to be updated by fix_paths().  This was not happening for spare
and cache devices because the proper names were getting filtered
out of the pool_list_t->names.  Interestingly, the names were
being filtered because the spare and cache devices do not
contain the pool name in their vdev label.

The fix is to exclude the device path from the list only if:

  1) has a valid ZPOOL_CONFIG_POOL_NAME key in the label, and
  2) that pool name does not match the specified pool name.

Since the label is valid and because it does properly store the
vdev guid it will be correctly assembled without the pool name.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #725
2012-10-22 08:46:02 -07:00
Brian Behlendorf
eac4720465 Allow 'zpool replace' to use short device names
The 'zpool replace' command would fail when given a short name
because unlike on other platforms the short name cannot be
deterministically expanded to a single path.  Multiple path
prefixes must be checked and in addition the partition suffix
for whole disks is determined by the prefix.

To handle this complexity a zfs_strcmp_pathname() function was
added which takes either a short or fully qualified device name.
Short names will be expanded using the prefixes in the default
import search path, or the ZPOOL_IMPORT_PATH environment variable
if it's defined.  All posible expansions are then compared against
the comparison path.  Care is taken to strip redundant slashes to
ensure legitimate matches are not missed.

In the context of this work the existing zfs_resolve_shortname()
function was extended to consider the ZPOOL_IMPORT_PATH when set.
The zfs_append_partition() interface was also simplified to take
only a single buffer.

The vast majority of these changes rework existing Linux specific
code which was originally written to accomidate udev.  However,
there is some minimal cleanup which removes Illumos specific code.
This was done to improve readability but the basic flow and intent
of the upstream code was maintained.

These changes are the logical conclusion of the previos work to
adjust the 'zpool import' search behavior, see commit 44867b6a.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #544
Closes #976
2012-10-22 08:45:58 -07:00
Etienne Dechamps
142e6dd100 Add atomic_sub_* functions to libspl.
Both the SPL and the ZFS libspl export most of the atomic_* functions,
except atomic_sub_* functions which are only exported by the SPL, not by
libspl. This patch remedies that by implementing atomic_sub_* functions
in libspl.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #1013
2012-10-17 08:56:37 -07:00
Matthew Ahrens
04434775b7 Illumos #3100: zvol rename fails with EBUSY when dirty.
illumos/illumos-gate@2e2c135528
Illumos changeset: 13780:6da32a929222

3100 zvol rename fails with EBUSY when dirty

Reviewed by: Christopher Siden <chris.siden@delphix.com>
Reviewed by: Adam H. Leventhal <ahl@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Garrett D'Amore <garrett@damore.org>
Approved by: Eric Schrock <eric.schrock@delphix.com>

Ported-by: Etienne Dechamps <etienne.dechamps@ovh.net>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #995
2012-10-03 13:59:02 -07:00
Etienne Dechamps
0aebd4f9e3 Create threads in detached state in userspace.
Currently, thread_create(), when called in userspace, creates a
joinable (i.e. not detached thread). This is the pthread default.

Unfortunately, this does not reproduce kthreads behavior (kthreads
are always detached). In addition, this contradicts the original
Solaris code which creates userspace threads in detached mode.

These joinable threads are never joined, which leads to a leakage of
pthread thread objects ("zombie threads"). This in turn results in
excessive ressource consumption, and possible ressource exhaustion in
extreme cases (e.g. long ztest runs).

This patch fixes the issue by creating userspace threads in detached
mode. The only exception is ztest worker threads which are meant to be
joinable.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #989
2012-10-03 13:32:48 -07:00
Bill Pijewski
37abac6d55 Illumos #2703: add mechanism to report ZFS send progress
Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Reviewed by: Richard Lowe <richlowe@richlowe.net>
Approved by: Eric Schrock <Eric.Schrock@delphix.com>

References:
  https://www.illumos.org/issues/2703

Ported by: Martin Matuska <martin@matuska.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2012-09-19 13:39:06 -07:00
Chris Siden
1bd201e70d Illumos #1948: zpool list should show more detailed pool info
Reviewed by: Adam Leventhal <ahl@delphix.com>
Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: Eric Schrock <eric.schrock@delphix.com>
Reviewed by: Richard Lowe <richlowe@richlowe.net>
Reviewed by: Albert Lee <trisk@nexenta.com>
Reviewed by: Dan McDonald <danmcd@nexenta.com>
Reviewed by: Garrett D'Amore <garrett@damore.org>
Approved by: Eric Schrock <eric.schrock@delphix.com>

References:
  https://www.illumos.org/issues/1948

Ported by:	Martin Matuska <martin@matuska.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #685
2012-09-19 13:39:05 -07:00
Brian Behlendorf
0a2f7b3662 Seg fault 'zpool import -d /dev/disk/by-id -a'
Introduced by commit 44867b6d6e.
We should of course check to ensure best isn't NULL before
attempting to dereference it.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #974
2012-09-18 12:33:37 -07:00
Brian Behlendorf
44867b6d6e Improve zpool import search behavior
The goal of this change is to make 'zpool import' prefer to use
the peristent /dev/mapper or /dev/disk/by-* paths.  These are far
preferable to the devices in /dev/ whos names are not persistent
and are determined by the order in which a device is detected.

This patch improves things by changing the default search path from
just to the top level /dev/ directory to (in order):

  /dev/disk/by-vdev   - Custom rules, use first if they exist
  /dev/disk/zpool     - Custom rules, use first if they exist
  /dev/mapper         - Use multipath devices before components
  /dev/disk/by-uuid   - Single unique entry and persistent
  /dev/disk/by-id     - May be multiple entries and persistent
  /dev/disk/by-path   - Encodes physical location and persistent
  /dev/disk/by-label  - Custom persistent labels
  /dev                - UNSAFE device names will change

The default search path can be overriden by setting the
ZPOOL_IMPORT_PATH environment variable.  This must be a colon
delimited list of paths which are searched for vdevs.  If the
'zpool import -d' option is specified only those listed paths
will be searched.

Finally, when multiple paths to the same device are found.  If one
of the paths is an exact match for the path used last time to import
the pool it will be used.  When there are no exact matches the
prefered path will be determined by the provided search order.

This means you can still import a pool and force specific names by
providing the -d <path> option.  And the prefered names will persist
as long as those paths exist on your system.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #965
2012-09-17 13:49:07 -07:00
Cyril Plisko
27ccd4147b Avoid running exportfs on each zfs/zpool command invocation
Delay executing exportfs command until its results are actually
required.

Signed-off-by: Cyril Plisko <cyril.plisko@mountall.com>
Signed-off-by: Gunnar Beutner <gunnar@beutner.name>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2012-09-11 10:21:49 -07:00
Etienne Dechamps
4b2f65b253 Increase the stack space in userspace.
In 1e33ac1e26, the maximum stack size for
userspace tools was set to 8k to mimic the available kernel stack size.

Unfortunately, due to differences in how the stack is used in userspace
vs kernel space, spurious stack overflows could occur in userspace
tools due to the limited stack size. This is especially true in ztest
when debugging is enabled.

This patch multiplies the userspace stack size by 4, which fixes the
stack overflow issues. This comes at the price of not being able to
catch stack size issues in userspace, but the previous solution proved
unreliable anyway.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Fixes #934.
2012-09-06 11:59:59 -07:00
Michael Martin
fc24f7c887 Fix missing vdev names in zpool status output
Commit 858219c makes more sense down below in the 'if (verbose)'
section of the code.  Initially, buf and path will never point
to the same location.  Once 'path = buf' is set on a raidz vdev,
the code may drop into the verbose section depending on the
verbose flag.  In here, using a tmpbuf makes sense since now
'buf == path'.

This issue does not occur in the upstream Solaris code because
their implementations of snprintf() allow for buf and path to
be the same address.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #57
2012-09-05 22:09:12 -07:00
Brian Behlendorf
ca8b5af89d Remove autotools products
Remove all of the generated autotools products from the repository
and update the .gitignore files accordingly.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #718
2012-08-27 11:47:44 -07:00
Garrett D'Amore
08b1b21d58 Illumos #2803: zfs get guid pretty-prints the output
Reviewed by: Eric Schrock <eric.schrock@delphix.com>
Reviewed by: Richard Elling <richard.elling@gmail.com>
Reviewed by: Alexander Eremin <alexander.eremin@nexenta.com>
Approved by: Dan McDonald <danmcd@nexenta.com>

References:
  https://www.illumos.org/issues/2803

Ported by: Martin Matuska <martin@matuska.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2012-08-23 10:40:14 -07:00
Christopher Siden
e956d65106 Illumos #1796, #2871, #2903, #2957
1796 "ZFS HOLD" should not be used when doing "ZFS SEND" from a read-only pool
2871 support for __ZFS_POOL_RESTRICT used by ZFS test suite
2903 zfs destroy -d does not work
2957 zfs destroy -R/r sometimes fails when removing defer-destroyed snapshot
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Eric Schrock <Eric.Schrock@delphix.com>

References:
  https://www.illumos.org/issues/1796
  https://www.illumos.org/issues/2871
  https://www.illumos.org/issues/2903
  https://www.illumos.org/issues/2957

Ported by: Martin Matuska <martin@matuska.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2012-08-23 10:40:02 -07:00
Eric Schrock
db49968e5c Illumos #2635: 'zfs rename -f' to perform force unmount
Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: George Wilson <George.Wilson@delphix.com>
Reviewed by: Bill Pijewski <wdp@joyent.com>
Reviewed by: Richard Elling <richard.elling@richardelling.com>
Approved by: Richard Lowe <richlowe@richlowe.net>

References:
  https://www.illumos.org/issues/2635

Ported by: Martin Matuska <martin@matuska.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #717
2012-08-23 10:39:43 -07:00
Martin Matuska
cf997d797b Properly initialize and free destroydata
This regression was accidentally introduced by commit
330d06f90d due to ZoL
specific code.  The fix is to simply ensure the passed
nvlist is initialized and freed.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #876
2012-08-23 09:42:21 -07:00
Dan McDonald
d96eb2b153 Illumos #1693: persistent 'comment' field for a zpool
Reviewed by: George Wilson <gwilson@zfsmail.com>
Reviewed by: Eric Schrock <eric.schrock@delphix.com>
Approved by: Richard Lowe <richlowe@richlowe.net>

References:
  https://www.illumos.org/issues/1693

Ported by: Martin Matuska <martin@matuska.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #678
2012-08-08 11:49:37 -07:00
Etienne Dechamps
ee5fd0bb80 Set zvol discard_granularity to the volblocksize.
Currently, zvols have a discard granularity set to 0, which suggests to
the upper layer that discard requests of arbirarily small size and
alignment can be made efficiently.

In practice however, ZFS does not handle unaligned discard requests
efficiently: indeed, it is unable to free a part of a block. It will
write zeros to the specified range instead, which is both useless and
inefficient (see dnode_free_range).

With this patch, zvol block devices expose volblocksize as their discard
granularity, so the upper layer is aware that it's not supposed to send
discard requests smaller than volblocksize.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #862
2012-08-07 14:55:31 -07:00
Matthew Ahrens
330d06f90d Illumos #1644, #1645, #1646, #1647, #1708
1644 add ZFS "clones" property
1645 add ZFS "written" and "written@..." properties
1646 "zfs send" should estimate size of stream
1647 "zfs destroy" should determine space reclaimed by
     destroying multiple snapshots
1708 adjust size of zpool history data

References:
  https://www.illumos.org/issues/1644
  https://www.illumos.org/issues/1645
  https://www.illumos.org/issues/1646
  https://www.illumos.org/issues/1647
  https://www.illumos.org/issues/1708

This commit modifies the user to kernel space ioctl ABI.  Extra
care should be taken when updating to ensure both the kernel
modules and utilities are updated.  This change has reordered
all of the new ioctl()s to the end of the list.  This should
help minimize this issue in the future.

Reviewed by: Richard Lowe <richlowe@richlowe.net>
Reviewed by: George Wilson <gwilson@zfsmail.com>
Reviewed by: Albert Lee <trisk@opensolaris.org>
Approved by: Garrett D'Amore <garret@nexenta.com>

Ported by: Martin Matuska <martin@matuska.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #826
Closes #664
2012-07-31 09:25:30 -07:00
Etienne Dechamps
f09398cec6 Use /sys/module instead of /proc/modules.
When libzfs checks if the module is loaded or not, it currently reads
/proc/modules and searches for a line matching the module name.

Unfortunately, if the module is included in the kernel itself (built-in
module), then /proc/modules won't list it, so libzfs will wrongly conclude
that the module is not loaded, thus making all ZFS userspace tools unusable.

Fortunately, all loaded modules appear as directories in /sys/module, even
built-in ones. Thus we can use /sys/module in lieu of /proc/modules to fix
the issue.

As a bonus, the code for checking becomes much simpler.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #851
2012-07-26 13:45:33 -07:00
Richard Yao
739a1a82e0 Linux 3.5 compat, end_writeback() changed to clear_inode()
The end_writeback() function was changed by moving the call to
inode_sync_wait() earlier in to evict().   This effecitvely changes
the ordering of the sync but it does not impact the details of
the zfs implementation.

However, as part of this change end_writeback() was renamed to
clear_inode() to reflect the new semantics.  This change does
impact us and clear_inode() now maps to end_writeback() for
kernels prior to 3.5.

Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #784
2012-07-23 12:29:36 -07:00
Richard Yao
ea1fdf46e2 Linux 3.5 compat, iops->truncate_range() removed
The vmtruncate_range() support has been removed from the kernel in
favor of using the fallocate method in the file_operations table.

Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #784
2012-07-23 12:29:32 -07:00
Richard Yao
756c3e5a9c Linux 3.5 compat, eops->encode_fh() takes inodes
The export_operations member ->encode_fh() has been updated to
take both the child and parent inodes.  This interface used to
take the child dentry and a bool describing if the parent is needed.

NOTE: While updating this code I noticed that we do not currently
cleanly handle the case where we're passed a connectable parent.
This code should be audited to make sure we're doing the right thing.

Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #784
2012-07-23 12:29:23 -07:00
Etienne Dechamps
b5a28807cd Move partition scanning from userspace to module.
Currently, zpool online -e (dynamic vdev expansion) doesn't work on
whole disks because we're invoking ioctl(BLKRRPART) from userspace
while ZFS still has a partition open on the disk, which results in
EBUSY.

This patch moves the BLKRRPART invocation from the zpool utility to the
module. Specifically, this is done just before opening the device in
vdev_disk_open() which is called inside vdev_reopen(). This requires
jumping through some hoops to get to the disk device from the partition
device, and to make sure we can still open the partition after the
BLKRRPART call.

Note that this new code path is triggered on dynamic vdev expansion
only; other actions, like creating a new pool, are unchanged and still
call BLKRRPART from userspace.

This change also depends on API changes which are available in 2.6.37
and latter kernels.  The build system has been updated to detect this,
but there is no compatibility mode for older kernels.  This means that
online expansion will NOT be available in older kernels.  However, it
will still be possible to expand the vdev offline.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #808
2012-07-17 09:17:31 -07:00
Brian Behlendorf
7535251dcf Add PowerPC to supported VTOCs
This code was was inherited from Solaris which was careful to define
the expected VTOC for various supported architectures.  While this
check may have made sense there it's something we should be able to
safely drop under Linux.

However, I'm not quite ready to do that yet.  So for the moment I'm
just doing the very safe thing of adding PowerPC as a supported type.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2012-07-12 11:52:34 -07:00
Etienne Dechamps
cee43a7477 Fix efi_use_whole_disk() when efi_nparts == 128.
Commit e5dc681a changed EFI_NUMPAR from 9 to 128. This means that the
on-disk EFI label has efi_nparts = 128 instead of 9. The index of the
reserved partition, however, is still 8. This breaks
efi_use_whole_disk(), which uses efi_nparts-1 as the index of the
reserved partition.

This commit fixes efi_use_whole_disk() when the index of the reserved
partition is not efi_nparts-1. It rewrites the algorithm and makes it
more robust by using the order of the partitions instead of their
numbering. It assumes that the last non-empty partition is the reserved
partition, and that the non-empty partition before that is the data
partition.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #808
2012-07-12 08:59:22 -07:00
Etienne Dechamps
7608bd0dd0 Use the right device path when relabeling.
Currently, zpool_vdev_online() calls zpool_relabel_disk() with a short
partition device name, which is obviously wrong because (1)
zpool_relabel_disk() expects a full, absolute path to use with open()
and (2) efi_write() must be called on an opened disk device, not a
partition device.

With this patch, zpool_relabel_disk() gets called with a full disk
device path. The path is determined using the same algorithm as
zpool_find_vdev().

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #808
2012-07-12 08:59:16 -07:00
Etienne Dechamps
8adf486422 Fix error handling for "zpool online -e".
The error handling code around zpool_relabel_disk() is either inexistent
or wrong. The function call itself is not checked, and
zpool_relabel_disk() is generating error messages from an unitialized
buffer.

Before:

    # zpool online -e homez sdb; echo $?
    `: cannot relabel 'sdb1': unable to open device: 2
    0

After:

    # zpool online -e homez sdb; echo $?
    cannot expand sdb: cannot relabel 'sdb1': unable to open device: 2
    1

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #808
2012-07-12 08:58:19 -07:00
George Wilson
c7f2d69de3 Illumos #1949, #1953
1949 crash during reguid causes stale config
1953 allow and unallow missing from zpool history since removal of pyzfs

Reviewed by: Adam Leventhal <ahl@delphix.com>
Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: Eric Schrock <eric.schrock@delphix.com>
Reviewed by: Bill Pijewski <wdp@joyent.com>
Reviewed by: Richard Lowe <richlowe@richlowe.net>
Reviewed by: Garrett D'Amore <garrett.damore@gmail.com>
Reviewed by: Dan McDonald <danmcd@nexenta.com>
Reviewed by: Steve Gonczi <gonczi@comcast.net>
Approved by: Eric Schrock <eric.schrock@delphix.com>

References:
  https://www.illumos.org/issues/1949
  https://www.illumos.org/issues/1953

Ported by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #665
2012-07-11 13:33:31 -07:00
Garrett D'Amore
3541dc6d02 Illumos #1748: desire support for reguid in zfs
Reviewed by: George Wilson <gwilson@zfsmail.com>
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Reviewed by: Alexander Eremin <alexander.eremin@nexenta.com>
Reviewed by: Alexander Stetsenko <ams@nexenta.com>
Approved by: Richard Lowe <richlowe@richlowe.net>

References:
  https://www.illumos.org/issues/1748

This commit modifies the user to kernel space ioctl ABI.  Extra
care should be taken when updating to ensure both the kernel
modules and utilities are updated.  If only the user space
component is updated both the 'zpool events' command and the
'zpool reguid' command will not work until the kernel modules
are updated.

Ported by:     Martin Matuska <martin@matuska.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #665
2012-07-11 13:08:56 -07:00
Pawel Jakub Dawidek
0cee24064a Speed up 'zfs list -t snapshot -o name -s name'
FreeBSD #xxx:  Dramatically optimize listing snapshots when user
requests only snapshot names and wants to sort them by name, ie.
when executes:

  # zfs list -t snapshot -o name -s name

Because only name is needed we don't have to read all snapshot
properties.

Below you can find how long does it take to list 34509 snapshots
from a single disk pool before and after this change with cold and
warm cache:

    before:

        # time zfs list -t snapshot -o name -s name > /dev/null
        cold cache: 525s
        warm cache: 218s

    after:

        # time zfs list -t snapshot -o name -s name > /dev/null
        cold cache: 1.7s
        warm cache: 1.1s

NOTE: This patch only appears in FreeBSD.  If/when Illumos picks up
the change we may want to drop this patch and adopt their version.
However, for now this addresses a real issue.

Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #450
2012-06-14 09:49:04 -07:00
Richard Yao
bc98d6c809 Make zvol_remove_link() print a more useful error message
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2012-06-13 16:27:19 -07:00
Daniel Verite
c6327b63e6 Retry removal of busy minors
When failing to remove a zvol device link because it's busy, wait
a bit and retry in a loop instead of giving up immediately.  This
technique is similar to the loop in zpool_label_disk_wait(), with
the same goal: waiting for the asynchronous udev processes to finish
their work.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #692
2012-06-11 10:50:20 -07:00
Richard Yao
6a0936babc Linux 3.4 compat, d_make_root() replaces d_alloc_root()
torvalds/linux@adc0e91ab1 introduced
introduced d_make_root() as a replacement for d_alloc_root(). Further
commits appear to have removed d_alloc_root() from the Linux source
tree. This causes the following failure:

  error: implicit declaration of function 'd_alloc_root'
  [-Werror=implicit-function-declaration]

To correct this we update the code to use the current d_make_root()
interface for readability.  Then we introduce an autotools check
to determine if d_make_root() is available.  If it isn't then we
define some compatibility logic which used the older d_alloc_root()
interface.

Signed-off-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #776
2012-06-11 10:04:49 -07:00