Compare commits

...

148 Commits

Author SHA1 Message Date
Tony Hutter 1222e921c9 Tag zfs-0.8.2
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2019-09-25 11:27:51 -07:00
Kody A Kantor c37fa0d5a8 Disabled resilver_defer feature leads to looping resilvers
When a disk is replaced with another on a pool with the resilver_defer
feature present, but not enabled the resilver activity restarts during
each spa_sync. This patch checks to make sure that the resilver_defer
feature is first enabled before requesting a deferred resilver.

This was originally fixed in illumos-joyent as OS-7982.

Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed by: Jerry Jelinek <jerry.jelinek@joyent.com>
Signed-off-by: Kody A Kantor <kody@kkantor.com>
External-issue: illumos-joyent OS-7982
Closes #9299
Closes #9338
2019-09-25 11:27:51 -07:00
Andriy Gapon 12a78fbb4f Fix dsl_scan_ds_clone_swapped logic
The was incorrect with respect to swapping dataset IDs both in the
on-disk ZAP object and the in-memory queue.

In both cases, if ds1 was already present, then it would be first
replaced with ds2 and then ds would be replaced back with ds1.
Also, both cases did not properly handle a situation where both ds1 and
ds2 are already queued.  A duplicate insertion would be attempted and
its failure would result in a panic.

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Andriy Gapon <avg@FreeBSD.org>
Closes #9140
Closes #9163
2019-09-25 11:27:51 -07:00
loli10K 63d8f57fe7 Scrubbing root pools may deadlock on kernels without elevator_change() (#9321)
Originally the zfs_vdev_elevator module option was added as a
convenience so the requested elevator would be automatically set
on the underlying block devices. At the time this was simple
because the kernel provided an API function which did exactly this.

This API was then removed in the Linux 4.12 kernel which prompted
us to add compatibly code to set the elevator via a usermodehelper.

Unfortunately changing the evelator via usermodehelper requires reading
some userland binaries, most notably modprobe(8) or sh(1), from a zfs
dataset on systems with root-on-zfs. This can deadlock the system if
used during the following call path because it may need, if the data
is not already cached in the ARC, reading directly from disk while
holding the spa config lock as a writer:

  zfs_ioc_pool_scan()
    -> spa_scan()
      -> spa_scan()
        -> vdev_reopen()
          -> vdev_elevator_switch()
            -> call_usermodehelper()

While the usermodehelper waits sh(1), modprobe(8) is blocked in the
ZIO pipeline trying to read from disk:

  INFO: task modprobe:2650 blocked for more than 10 seconds.
       Tainted: P           OE     5.2.14
  modprobe        D    0  2650    206 0x00000000
  Call Trace:
   ? __schedule+0x244/0x5f0
   schedule+0x2f/0xa0
   cv_wait_common+0x156/0x290 [spl]
   ? do_wait_intr_irq+0xb0/0xb0
   spa_config_enter+0x13b/0x1e0 [zfs]
   zio_vdev_io_start+0x51d/0x590 [zfs]
   ? tsd_get_by_thread+0x3b/0x80 [spl]
   zio_nowait+0x142/0x2f0 [zfs]
   arc_read+0xb2d/0x19d0 [zfs]
   ...
   zpl_iter_read+0xfa/0x170 [zfs]
   new_sync_read+0x124/0x1b0
   vfs_read+0x91/0x140
   ksys_read+0x59/0xd0
   do_syscall_64+0x4f/0x130
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

This commit changes how we use the usermodehelper functionality from
synchronous (UMH_WAIT_PROC) to asynchronous (UMH_NO_WAIT) which prevents
scrubs, and other vdev_elevator_switch() consumers, from triggering the
aforementioned issue.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Issue #8664
Closes #9321
2019-09-25 11:27:51 -07:00
Chengfei ZHu 9fa8b5b55b QAT related bug fixes
1. Fix issue:  Kernel BUG with QAT during decompression  #9276.
   Now it is uninterruptible for a specific given QAT request,
   but Ctrl-C interrupt still works in user-space process.

2. Copy the digest result to the buffer only when doing encryption,
   and vise-versa for decryption.

Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chengfei Zhu <chengfeix.zhu@intel.com>
Closes #9276
Closes #9303
2019-09-25 11:27:51 -07:00
Brian Behlendorf e17445d1f7 kmodtool: depmod path
Determine the location of depmod on the system, either /sbin/depmod or
/usr/sbin/depmod.  Then use that path when generating the specfile.

Additionally, update the Requires lines to reference the package which
provides depmod rather than the binary itself.  For CentOS/RHEL 7+8
and all supported Fedora releases this is the kmod package, and for
CentOS/RHEL 6 it is the module-init-tools package.

Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #8724
Closes #9310
2019-09-25 11:27:51 -07:00
Brian Behlendorf 97d4986214 Fix /etc/hostid on root pool deadlock
Accidentally introduced by dc04a8c which now takes the SCL_VDEV lock
as a reader in zfs_blkptr_verify().  A deadlock can occur if the
/etc/hostid file resides on a dataset in the same pool.  This is
because reading the /etc/hostid file may occur while the caller is
holding the SCL_VDEV lock as a writer.  For example, to perform a
`zpool attach` as shown in the abbreviated stack below.

To resolve the issue we cache the system's hostid when initializing
the spa_t, or when modifying the multihost property.  The cached
value is then relied upon for subsequent accesses.

Call Trace:
    spa_config_enter+0x1e8/0x350 [zfs]
    zfs_blkptr_verify+0x33c/0x4f0 [zfs] <--- trying read lock
    zio_read+0x6c/0x140 [zfs]
    ...
    vfs_read+0xfc/0x1e0
    kernel_read+0x50/0x90
    ...
    spa_get_hostid+0x1c/0x38 [zfs]
    spa_config_generate+0x1a0/0x610 [zfs]
    vdev_label_init+0xa0/0xc80 [zfs]
    vdev_create+0x98/0xe0 [zfs]
    spa_vdev_attach+0x14c/0xb40 [zfs] <--- grabbed write lock

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9256
Closes #9285
2019-09-25 11:27:51 -07:00
Olaf Faaland 0ae5f0c8d2 BuildRequires libtirpc-devel needed for RHEL 8
Building against RHEL 8 requires libtirpc-devel, as with fedora 28.
Add rhel8 and centos8 options to the test, to account for that.

BuildRequires Originally added for fedora 28 via commit
1a62a305be

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #9289
2019-09-25 11:27:51 -07:00
loli10K 146d7d8846 Fix zpool subcommands error message with some unsupported options
Both 'detach' and 'online' zpool subcommands, when provided with an
unsupported option, forget to print it in the error message:

   # zpool online -t rpool vda3
   invalid option ''
   usage:
      online [-e] <pool> <device> ...

This changes fixes the error message in order to include the actual
option that is not supported.

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #9270
2019-09-25 11:27:51 -07:00
loli10K 9f261b1be6 Fix zfs-dkms .deb package warning in prerm script
Debian zfs-dkms package generated by alien doesn't call the prerm script
(rpm's %preun) with an integer as first parameter, which results in the
following warning when the package is uninstalled:

   "zfs-dkms.prerm: line 3: [: remove: integer expression expected"

Modify the if-condition to avoid the warning.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #9271
2019-09-25 11:27:51 -07:00
Pavel Zakharov 5acba22ec0 zvol_wait script should ignore partially received zvols
Partially received zvols won't have links in /dev/zvol.

Reviewed-by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Closes #9260
2019-09-25 11:27:51 -07:00
Pavel Zakharov 38528476bf New service that waits on zvol links to be created
The zfs-volume-wait.service scans existing zvols and waits for their
links under /dev to be created. Any service that depends on zvol
links to be there should add a dependency on zfs-volumes.target.
By default, this target is not enabled.

Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Reviewed-by: Antonio Russo <antonio.e.russo@gmail.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: John Gallagher <john.gallagher@delphix.com>
Reviewed-by: George Wilson <gwilson@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Pavel Zakharov <pzakharov@delphix.com>
Closes #8975
2019-09-25 11:27:51 -07:00
Andriy Gapon beb21db3c6 Always refuse receving non-resume stream when resume state exists
This fixes a hole in the situation where the resume state is left from
receiving a new dataset and, so, the state is set on the dataset itself
(as opposed to %recv child).

Additionally, distinguish incremental and resume streams in error
messages.

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andriy Gapon <avg@FreeBSD.org>
Closes #9252
2019-09-25 11:27:51 -07:00
loli10K 13e5e396a3 Fix Intel QAT / ZFS compatibility on v4.7.1+ kernels
This change use the compat code introduced in 9cc1844a.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #9268
Closes #9269
2019-09-25 11:27:51 -07:00
Georgy Yakovlev 3cf4ecb03f etc/init.d/zfs-functions.in: remove arch warning
Remove the x86_64 warning, it's no longer the case that this is the
only supported architecture.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Georgy Yakovlev <gyakovlev@gentoo.org>
Closes: #9177
2019-09-25 11:27:51 -07:00
Pavel Zakharov 0e765c4eb8 zfs_handle used after being closed/freed in change_one callback
This is a typical case of use after free. We would call zfs_close(zhp)
which would free the handle, and then call zfs_iter_children() on that
handle later.  This change ensures that the zfs_handle is only closed
when we are ready to return.

Running `zfs inherit -r sharenfs pool` was failing with an error
code without any error messages. After some debugging I've pinpointed
the issue to be memory corruption, which would cause zfs to try to
issue an ioctl to the wrong device and receive ENOTTY.

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: George Wilson <gwilson@delphix.com>
Reviewed-by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alek Pinchuk <apinchuk@datto.com>
Signed-off-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Issue #7967
Closes #9165
2019-09-25 11:27:51 -07:00
Chunwei Chen c7a4255f12 Fix zil replay panic when TX_REMOVE followed by TX_CREATE
If TX_REMOVE is followed by TX_CREATE on the same object id, we need to
make sure the object removal is completely finished before creation. The
current implementation relies on dnode_hold_impl with
DNODE_MUST_BE_ALLOCATED returning ENOENT. While this check seems to work
fine before, in current version it does not guarantee the object removal
is completed.

We fix this by checking if DNODE_MUST_BE_FREE returns successful
instead. Also add test and remove dead code in dnode_hold_impl.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #7151
Closes #8910
Closes #9123
Closes #9145
2019-09-25 11:27:51 -07:00
Andriy Gapon 931bef81c8 zfs_ioc_snapshot: check user-prop permissions on snapshotted datasets
Previously, the permissions were checked on the pool which was obviously
incorrect.

After this change, zfs_check_userprops() only validates the properties
without any permission checks.  The permissions are checked individually
for each snapshotted dataset.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Signed-off-by: Andriy Gapon <avg@FreeBSD.org>
Closes #9179
Closes #9180
2019-09-25 11:27:50 -07:00
Richard Allen ea34735203 Fix Plymouth passphrase prompt in initramfs script
Entering the ZFS encryption passphrase under Plymouth wasn't working
because in the ZFS initrd script, Plymouth was calling zfs via
"--command", which wasn't passing through the filesystem argument to
zfs load-key properly (it was passing through the single quotes around
the filesystem name intended to handle spaces literally,
which zfs load-key couldn't understand).

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Garrett Fields <ghfields@gmail.com>
Signed-off-by: Richard Allen <belperite@gmail.com>
Issue #9193
Closes #9202
2019-09-25 11:27:50 -07:00
Tom Caputi 95319fc569 Fix deadlock in 'zfs rollback'
Currently, the 'zfs rollback' code can end up deadlocked due to
the way the kernel handles unreferenced inodes on a suspended fs.
Essentially, the zfs_resume_fs() code path may cause zfs to spawn
new threads as it reinstantiates the suspended fs's zil. When a
new thread is spawned, the kernel may attempt to free memory for
that thread by freeing some unreferenced inodes. If it happens to
select inodes that are a a part of the suspended fs a deadlock
will occur because freeing inodes requires holding the fs's
z_teardown_inactive_lock which is still held from the suspend.

This patch corrects this issue by adding an additional reference
to all inodes that are still present when a suspend is initiated.
This prevents them from being freed by the kernel for any reason.

Reviewed-by: Alek Pinchuk <apinchuk@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #9203
2019-09-25 11:27:50 -07:00
Ryan Moeller 33374f21f0 Make slog test setup more robust
The slog tests fail when attempting to create pools using file vdevs
that already exist from previous test runs. Remove these files in the
setup for the test.

Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Ryan Moeller <ryan@ixsystems.com>
Closes #9194
2019-09-25 11:27:50 -07:00
yshui 512a50f38d zfs-mount-genrator: dependencies should be space-separated
Reviewed-by: Antonio Russo <antonio.e.russo@gmail.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com>
Closes #9174
2019-09-25 11:27:50 -07:00
Tony Hutter 023ab67a64 Linux 5.3: Fix switch() fall though compiler errors
Fix some switch() fall-though compiler errors:

    abd.c:1504:9: error: this statement may fall through

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #9170
2019-09-25 11:27:50 -07:00
Dominic Pearson 65469f6e30 Linux 5.3 compat: Makefile subdir-m no longer supported
Uses obj-m instead, due to kernel changes.

See LKML: Masahiro Yamada, Tue, 6 Aug 2019 19:03:23 +0900

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Dominic Pearson <dsp@technoanimal.net>
Closes #9169
2019-09-25 11:27:50 -07:00
Chunwei Chen 569f5d5d05 Fix out-of-order ZIL txtype lost on hardlinked files
We should only call zil_remove_async when an object is removed. However,
in current implementation, it is called whenever TX_REMOVE is called. In
the case of hardlinked file, every unlink will generate TX_REMOVE and
causing operations to be dropped even when the object is not removed.

We fix this by only calling zil_remove_async when the file is fully
unlinked.

Reviewed-by: George Wilson <gwilson@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Prakash Surya <prakash.surya@delphix.com>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #8769
Closes #9061
2019-09-25 11:27:50 -07:00
Michael Niewöhner 6d1599c1e1 Increase default zcmd allocation to 256K
When creating hundreds of clones (for example using containers with
LXD) cloning slows down as the number of clones increases over time.
The reason for this is that the fetching of the clone information
using a small zcmd buffer requires two ioctl calls, one to determine
the size and a second to return the data. However, this requires
gathering the data twice, once to determine the size and again to
populate the zcmd buffer to return it to userspace.
These are expensive ioctl() calls, so instead, make the default buffer
size much larger: 256K.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Michael Niewöhner <foss@mniewoehner.de>
Closes #9084
2019-09-25 11:27:50 -07:00
Matthew Ahrens 6c9882d5db Improve performance by using dmu_tx_hold_*_by_dnode()
In zfs_write() and dmu_tx_hold_sa(), we can use dmu_tx_hold_*_by_dnode()
instead of dmu_tx_hold_*(), since we already have a dbuf from the target
dnode in hand.  This eliminates some calls to dnode_hold(), which can be
expensive.  This is especially impactful if several threads are
accessing objects that are in the same block of dnodes, because they
will contend for that dbuf's lock.

We are seeing 10-20% performance wins for the sequential_writes tests in
the performance test suite, when doing >=128K writes to files with
recordsize=8K.

This also removes some unnecessary casts that are in the area.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #9081
2019-09-25 11:27:50 -07:00
Brian Behlendorf 8c00159411 Fix channel programs on s390x
When adapting the original sources for s390x the JMP_BUF_CNT was
mistakenly halved due to an incorrect assumption of the size of
a unsigned long.  They are 8 bytes for the s390x architecture.
Increase JMP_BUF_CNT accordingly.

Authored-by: Don Brady <don.brady@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reported-by: Colin Ian King <canonical.com>
Tested-by: Colin Ian King <canonical.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #8992
Closes #9080
2019-09-25 11:27:50 -07:00
George Wilson a8c5bcb5de Race between zfs-share and zfs-mount services
When a system boots the zfs-mount.service and the
zfs-share.service can start simultaneously. What may be
unclear is that sharing a filesystem will first mount
the filesystem if it's not already mounted. This means
that both service can race to mount the same fileystem.
This race can result in a SEGFAULT or EBUSY conditions.

This change explicitly defines the start ordering between the
two services such that the zfs-mount.service is solely
responsible for mounting filesystems eliminating the race
between "zfs mount -a" and "zfs share -a" commands.

Reviewed-by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Wilson <george.wilson@delphix.com>
Closes #9083
2019-09-25 11:27:50 -07:00
Tomohiro Kusumi 6c68594675 Implement secpolicy_vnode_setid_retain()
Don't unconditionally return 0 (i.e. retain SUID/SGID).
Test CAP_FSETID capability.

https://github.com/pjd/pjdfstest/blob/master/tests/chmod/12.t
which expects SUID/SGID to be dropped on write(2) by non-owner fails
without this. Most filesystems make this decision within VFS by using
a generic file write for fops.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Closes #9035
Closes #9043
2019-09-25 11:27:50 -07:00
Matthew Ahrens 1f5979d23f zed crashes when devid not present
zed core dumps due to a NULL pointer in zfs_agent_iter_vdev(). The
gs_devid is NULL, but the nvl has a "devid" entry.

zfs_agent_post_event() checks that ZFS_EV_VDEV_GUID or DEV_IDENTIFIER is
present in nvl, but then later it and zfs_agent_iter_vdev() assume that
DEV_IDENTIFIER is present and thus gs_devid is set.

Typically this is not a problem because usually either all vdevs have
devid's, or none of them do. Since zfs_agent_iter_vdev() first checks if
the vdev has devid before dereferencing gs_devid, the problem isn't
typically encountered. However, if some vdevs have devid's and some do
not, then the problem is easily reproduced.  This can happen if the pool
has been moved from a system that has devid's to one that does not.

The fix is for zfs_agent_iter_vdev() to only try to match the devid's if
both nvl and gsp have devid's present.

Reviewed-by: Prashanth Sreenivasa <pks@delphix.com>
Reviewed-by: Don Brady <don.brady@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-65090
Closes #9054
Closes #9060
2019-09-25 11:27:50 -07:00
Tomohiro Kusumi 4f951b183c Don't directly cast unsigned long to void*
Cast to uintptr_t first for portability on integer to/from pointer
conversion.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Closes #9065
2019-09-25 11:27:50 -07:00
Tomohiro Kusumi 65a0b28b42 Fix module_param() type for zfs_read_chunk_size
zfs_read_chunk_size is unsigned long.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Closes #9051
2019-09-25 11:27:50 -07:00
Tony Hutter be068aeea8 Move some tests to cli_user/zpool_status
The tests in tests/functional/cli_root/zpool_status should all require
root. However, linux.run has "user =" specified for those tests, which
means they run as a normal user.  When I removed that line to run them
as root, the following tests did not pass:

zpool_status_003_pos
zpool_status_-c_disable
zpool_status_-c_homedir
zpool_status_-c_searchpath

These tests need to be run as a normal user.  To fix this, move these
tests to a new tests/functional/cli_user/zpool_status directory.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #9057
2019-09-25 11:27:50 -07:00
Serapheim Dimitropoulos 1c4b0fc745 Race condition between spa async threads and export
In the past we've seen multiple race conditions that have
to do with open-context threads async threads and concurrent
calls to spa_export()/spa_destroy() (including the one
referenced in issue #9015).

This patch ensures that only one thread can execute the
main body of spa_export_common() at a time, with subsequent
threads returning with a new error code created just for
this situation, eliminating this way any race condition
bugs introduced by concurrent calls to this function.

Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes #9015
Closes #9044
2019-09-25 11:27:50 -07:00
Serapheim Dimitropoulos bbbe4b0a98 hdr_recl calls zthr_wakeup() on destroyed zthr
There exists a race condition were hdr_recl() calls
zthr_wakeup() on a destroyed zthr. The timeline is the
following:

[1] hdr_recl() runs first and goes intro zthr_wakeup()
    because arc_initialized is set.
[2] arc_fini() is called by another thread, zeroes
    that flag, destroying the zthr, and goes into
    buf_init().
[3] hdr_recl() tries to enter the destroyed mutex
    and we blow up.

This patch ensures that the ARC's zthrs are not offloaded
any new work once arc_initialized is set and then destroys
them after all of the ARC state has been deleted.

Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes #9047
2019-09-25 11:27:50 -07:00
Tomohiro Kusumi 3c144b9267 Fix wrong comment on zcr_blksz_{min,max}
These aren't tunable; illumos has this comment fixed in
"3742 zfs comments need cleaner, more consistent style",
so sync with that.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Closes #9052
2019-09-25 11:27:50 -07:00
Brian Behlendorf 428a63cc62 Retire unused spl_{mutex,rwlock}_{init_fini}
These functions are unused and can be removed along
with the spl-mutex.c and spl-rwlock.c source files.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9029
2019-09-25 11:27:49 -07:00
Brian Behlendorf 3982d959c5 Linux 5.3 compat: retire rw_tryupgrade()
The Linux kernel's rwsem's have never provided an interface to
allow a reader to be upgraded to a writer.  Historically, this
functionality has been implemented by a SPL wrapper function.
However, this approach depends on internal knowledge of the
rw_semaphore and is therefore rather brittle.

Since the ZFS code must always be able to fallback to rw_exit()
and rw_enter() when an rw_tryupgrade() fails; this functionality
isn't critical.  Furthermore, the only potentially performance
sensitive consumer is dmu_zfetch() and no decrease in performance
was observed with this change applied.  See the PR comments for
additional testing details.

Therefore, it is being retired to make the build more robust and
to simplify the rwlock implementation.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9029
2019-09-25 11:27:49 -07:00
Brian Behlendorf 54561073e7 Linux 5.3 compat: rw_semaphore owner
Commit https://github.com/torvalds/linux/commit/94a9717b updated the
rwsem's owner field to contain additional flags describing the rwsem's
state.  Rather then update the wrappers to mask out these bits, the
code no longer relies on the owner stored by the kernel.  This does
increase the size of a krwlock_t but it makes the implementation
less sensitive to future kernel changes.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9029
2019-09-25 11:27:49 -07:00
jdike 4c98586daf Fix lockdep recursive locking false positive in dbuf_destroy
lockdep reports a possible recursive lock in dbuf_destroy.

It is true that dbuf_destroy is acquiring the dn_dbufs_mtx
on one dnode while holding it on another dnode.  However,
it is impossible for these to be the same dnode because,
among other things,dbuf_destroy checks MUTEX_HELD before
acquiring the mutex.

This fix defines a class NESTED_SINGLE == 1 and changes
that lock to call mutex_enter_nested with a subclass of
NESTED_SINGLE.

In order to make the userspace code compile,
include/sys/zfs_context.h now defines mutex_enter_nested and
NESTED_SINGLE.

This is the lockdep report:

[  122.950921] ============================================
[  122.950921] WARNING: possible recursive locking detected
[  122.950921] 4.19.29-4.19.0-debug-d69edad5368c1166 #1 Tainted: G           O
[  122.950921] --------------------------------------------
[  122.950921] dbu_evict/1457 is trying to acquire lock:
[  122.950921] 0000000083e9cbcf (&dn->dn_dbufs_mtx){+.+.}, at: dbuf_destroy+0x3c0/0xdb0 [zfs]
[  122.950921]
               but task is already holding lock:
[  122.950921] 0000000055523987 (&dn->dn_dbufs_mtx){+.+.}, at: dnode_evict_dbufs+0x90/0x740 [zfs]
[  122.950921]
               other info that might help us debug this:
[  122.950921]  Possible unsafe locking scenario:

[  122.950921]        CPU0
[  122.950921]        ----
[  122.950921]   lock(&dn->dn_dbufs_mtx);
[  122.950921]   lock(&dn->dn_dbufs_mtx);
[  122.950921]
                *** DEADLOCK ***

[  122.950921]  May be due to missing lock nesting notation

[  122.950921] 1 lock held by dbu_evict/1457:
[  122.950921]  #0: 0000000055523987 (&dn->dn_dbufs_mtx){+.+.}, at: dnode_evict_dbufs+0x90/0x740 [zfs]
[  122.950921]
               stack backtrace:
[  122.950921] CPU: 0 PID: 1457 Comm: dbu_evict Tainted: G           O      4.19.29-4.19.0-debug-d69edad5368c1166 #1
[  122.950921] Hardware name: Supermicro H8SSL-I2/H8SSL-I2, BIOS 080011  03/13/2009
[  122.950921] Call Trace:
[  122.950921]  dump_stack+0x91/0xeb
[  122.950921]  __lock_acquire+0x2ca7/0x4f10
[  122.950921]  lock_acquire+0x153/0x330
[  122.950921]  dbuf_destroy+0x3c0/0xdb0 [zfs]
[  122.950921]  dbuf_evict_one+0x1cc/0x3d0 [zfs]
[  122.950921]  dbuf_rele_and_unlock+0xb84/0xd60 [zfs]
[  122.950921]  dnode_evict_dbufs+0x3a6/0x740 [zfs]
[  122.950921]  dmu_objset_evict+0x7a/0x500 [zfs]
[  122.950921]  dsl_dataset_evict_async+0x70/0x480 [zfs]
[  122.950921]  taskq_thread+0x979/0x1480 [spl]
[  122.950921]  kthread+0x2e7/0x3e0
[  122.950921]  ret_from_fork+0x27/0x50

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Jeff Dike <jdike@akamai.com>
Closes #8984
2019-09-25 11:27:49 -07:00
Michael Niewöhner ceb516ac2f Add missing __GFP_HIGHMEM flag to vmalloc
Make use of __GFP_HIGHMEM flag in vmem_alloc, which is required for
some 32-bit systems to make use of full available memory.
While kernel versions >=4.12-rc1 add this flag implicitly, older
kernels do not.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Sebastian Gottschall <s.gottschall@dd-wrt.com>
Signed-off-by: Michael Niewöhner <foss@mniewoehner.de>
Closes #9031
2019-09-25 11:27:49 -07:00
Tomohiro Kusumi 2b9f73e5e6 Use zfsctl_snapshot_hold() wrapper
zfs_refcount_*() are to be wrapped by zfsctl_snapshot_*() in this file.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Closes #9039
2019-09-25 11:27:49 -07:00
Brian Behlendorf 984bfb373f Minor style cleanup
Resolve an assortment of style inconsistencies including
use of white space, typos, capitalization, and line wrapping.
There is no functional change.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #9030
2019-09-25 11:27:49 -07:00
Brian Behlendorf 446d08fba4 Fix get_special_prop() build failure
The cast of the size_t returned by strlcpy() to a uint64_t by the
VERIFY3U can result in a build failure when CONFIG_FORTIFY_SOURCE
is set.  This is due to the additional hardening.  Since the token
is expected to always fit in strval the VERIFY3U has been removed.
If somehow it doesn't, it will still be safely truncated.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Don Brady <don.brady@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #8999
Closes #9020
2019-09-25 11:27:49 -07:00
Antonio Russo af7a5672c3 systemd encryption key support
Modify zfs-mount-generator to produce a dependency on new
zfs-import-key-*.service units, dynamically created at boot to call
zfs load-key for the encryption root, before attempting to mount any
encrypted datasets.

These units are created by zfs-mount-generator, and RequiresMountsFor on
the keyfile, if present, or call systemd-ask-password if a passphrase is
requested.

This patch includes suggestions from @Fabian-Gruenbichler, @ryanjaeb and
@rlaager, as well an adaptation of @rlaager's script to retry on
incorrect password entry.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Antonio Russo <antonio.e.russo@gmail.com>
Closes #8750
Closes #8848
2019-09-25 11:27:49 -07:00
Tomohiro Kusumi 73e50a7d5d Drop redundant POSIX ACL check in zpl_init_acl()
ZFS_ACLTYPE_POSIXACL has already been tested in zpl_init_acl(),
so no need to test again on POSIX ACL access.

Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Closes #9009
2019-09-25 11:27:49 -07:00
Brian Behlendorf d751b12a9d Export dnode symbols
External consumers such as Lustre require access to the dnode
interfaces in order to correctly manipulate dnodes.

Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #8994
Closes #9027
2019-09-25 11:27:49 -07:00
Tom Caputi 78831d4290 Ensure dsl_destroy_head() decrypts objsets
This patch corrects a small issue where the dsl_destroy_head()
code that runs when the async_destroy feature is disabled would
not properly decrypt the dataset before beginning processing.
If the dataset is not able to be decrypted, the optimization
code now simply does not run and the dataset is completely
destroyed in the DSL sync task.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #9021
2019-09-25 11:27:49 -07:00
Tomohiro Kusumi 0a223246e1 Disable unused pathname::pn_path* (unneeded in Linux)
struct pathname is originally from Solaris VFS, and it has been used
in ZoL to merely call VOP from Linux VFS interface without API change,
therefore pathname::pn_path* are unused and unneeded. Technically,
struct pathname is a wrapper for C string in ZoL.

Saves stack a bit on lookup and unlink.

(#if0'd members instead of removing since comments refer to them.)

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Closes #9025
2019-09-25 11:27:49 -07:00
Nick Mattis cf966cb19a Fixes: #8934 Large kmem_alloc
Large allocation over the spl_kmem_alloc_warn value was being performed.
Switched to vmem_alloc interface as specified for large allocations.
Changed the subsequent frees to match.

Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: nmattis <nickm970@gmail.com>
Closes #8934
Closes #9011
2019-09-25 11:27:49 -07:00
Attila Fülöp 6e19cc77cf Fix ZTS killed processes detection
log_neg_expect was using the wrong exit status to detect if a process
got killed by SIGSEGV or SIGBUS, resulting in false positives.

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes #9003
2019-09-25 11:27:49 -07:00
Shaun Tancheff c3a3c5a30f pkg-utils python sitelib for SLES15
Use python -Esc to set __python_sitelib.

Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Closes #8969
2019-09-25 11:27:49 -07:00
Tomohiro Kusumi ccd8125e45 Fix race in parallel mount's thread dispatching algorithm
Strategy of parallel mount is as follows.

1) Initial thread dispatching is to select sets of mount points that
 don't have dependencies on other sets, hence threads can/should run
 lock-less and shouldn't race with other threads for other sets. Each
 thread dispatched corresponds to top level directory which may or may
 not have datasets to be mounted on sub directories.

2) Subsequent recursive thread dispatching for each thread from 1)
 is to mount datasets for each set of mount points. The mount points
 within each set have dependencies (i.e. child directories), so child
 directories are processed only after parent directory completes.

The problem is that the initial thread dispatching in
zfs_foreach_mountpoint() can be multi-threaded when it needs to be
single-threaded, and this puts threads under race condition. This race
appeared as mount/unmount issues on ZoL for ZoL having different
timing regarding mount(2) execution due to fork(2)/exec(2) of mount(8).
`zfs unmount -a` which expects proper mount order can't unmount if the
mounts were reordered by the race condition.

There are currently two known patterns of input list `handles` in
`zfs_foreach_mountpoint(..,handles,..)` which cause the race condition.

1) #8833 case where input is `/a /a /a/b` after sorting.
 The problem is that libzfs_path_contains() can't correctly handle an
 input list with two same top level directories.
 There is a race between two POSIX threads A and B,
  * ThreadA for "/a" for test1 and "/a/b"
  * ThreadB for "/a" for test0/a
 and in case of #8833, ThreadA won the race. Two threads were created
 because "/a" wasn't considered as `"/a" contains "/a"`.

2) #8450 case where input is `/ /var/data /var/data/test` after sorting.
 The problem is that libzfs_path_contains() can't correctly handle an
 input list containing "/".
 There is a race between two POSIX threads A and B,
  * ThreadA for "/" and "/var/data/test"
  * ThreadB for "/var/data"
 and in case of #8450, ThreadA won the race. Two threads were created
 because "/var/data" wasn't considered as `"/" contains "/var/data"`.
 In other words, if there is (at least one) "/" in the input list,
 the initial thread dispatching must be single-threaded since every
 directory is a child of "/", meaning they all directly or indirectly
 depend on "/".

In both cases, the first non_descendant_idx() call fails to correctly
determine "path1-contains-path2", and as a result the initial thread
dispatching creates another thread when it needs to be single-threaded.
Fix a conditional in libzfs_path_contains() to consider above two.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Closes #8450
Closes #8833
Closes #8878
2019-09-25 11:27:49 -07:00
loli10K 2ac233c633 Fix dracut Debian/Ubuntu packaging
This commit ensures make(1) targets that build .deb packages fail if
alien(1) can't convert all .rpm files; additionally it also updates
the zfs-dracut package name which was changed to "noarch" in ca4e5a7.

Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #8990
Closes #8991
2019-09-25 11:27:49 -07:00
Tom Caputi 1f72a18f59 Remove VERIFY from dsl_dataset_crypt_stats()
This patch fixes an issue where dsl_dataset_crypt_stats() would
VERIFY that it was able to hold the encryption root. This function
should instead silently continue without populating the related
field in the nvlist, as is the convention for this code.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #8976
2019-09-25 11:27:49 -07:00
Paul Zuchowski 14a11bf2f6 Improve "Unable to automount" error message.
Having the mountpoint and dataset name both in the message made it
confusing to read.  Additionally, convert this to a zfs_dbgmsg rather than
sending it to the console.

Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Zuchowski <pzuchowski@datto.com>
Closes #8959
2019-09-25 11:27:49 -07:00
Brian Behlendorf 7a03d7c73c Check b_freeze_cksum under ZFS_DEBUG_MODIFY conditional
The b_freeze_cksum field can only have data when ZFS_DEBUG_MODIFY
is set.  Therefore, the EQUIV check must be wrapped accordingly.
For the same reason the ASSERT in arc_buf_fill() in unsafe.
However, since it's largely redundant it has simply been removed.

Reviewed-by: George Wilson <gwilson@delphix.com>
Reviewed-by: Allan Jude <allanjude@freebsd.org>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #8979
2019-09-25 11:27:49 -07:00
Tom Caputi 9e09826b33 Fix error text for EINVAL in zfs_receive_one()
This small patch fixes the EINVAL case for zfs_receive_one(). A
missing 'else' has been added to the two possible cases, which
will ensure the intended error message is printed.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #8977
2019-09-25 11:27:48 -07:00
Tomohiro Kusumi 093bb64461 Don't use d_path() for automount mount point for chroot'd process
Chroot'd process fails to automount snapshots due to realpath(3)
failure in mount.zfs(8).

Construct a mount point path from sb of the ctldir inode and dirent
name, instead of from d_path(), so that chroot'd process doesn't get
affected by its view of fs.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Closes #8903
Closes #8966
2019-09-25 11:27:48 -07:00
George Wilson 7d2489cfad nopwrites on dmu_sync-ed blocks can result in a panic
After device removal, performing nopwrites on a dmu_sync-ed block
will result in a panic. This panic can show up in two ways:
1. an attempt to issue an IOCTL in vdev_indirect_io_start()
2. a failed comparison of zio->io_bp and zio->io_bp_orig in
   zio_done()
To resolve both of these panics, nopwrites of blocks on indirect
vdevs should be ignored and new allocations should be performed on
concrete vdevs.

Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Don Brady <don.brady@delphix.com>
Signed-off-by: George Wilson <gwilson@delphix.com>
Closes #8957
2019-09-25 11:27:48 -07:00
Alexander Motin 04d4df89f4 Avoid extra taskq_dispatch() calls by DMU
DMU sync code calls taskq_dispatch() for each sublist of os_dirty_dnodes
and os_synced_dnodes.  Since the number of sublists by default is equal
to number of CPUs, it will dispatch equal, potentially large, number of
tasks, waking up many CPUs to handle them, even if only one or few of
sublists actually have any work to do.

This change adds check for empty sublists to avoid this.

Reviewed by: Sean Eric Fagan <sef@ixsystems.com>
Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by:  Alexander Motin <mav@FreeBSD.org>
Closes #8909
2019-09-25 11:27:48 -07:00
Igor K 05006f125c -Y option for zdb is valid
The -Y option was added for ztest to test split block reconstruction.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Signed-off-by: Igor Kozhukhov <igor@dilos.org>
Closes #8926
2019-09-25 11:27:48 -07:00
Tom Caputi bfe5f029cf Fix error message on promoting encrypted dataset
This patch corrects the error message reported when attempting
to promote a dataset outside of its encryption root.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #8905
Closes #8935
2019-09-25 11:27:48 -07:00
Brian Behlendorf cc7fe8a599 Fix out-of-tree build failures
Resolve the incorrect use of srcdir and builddir references for
various files in the build system.  These have crept in over time
and went unnoticed because when building in the top level directory
srcdir and builddir are identical.

With this change it's again possible to build in a subdirectory.

    $ mkdir obj
    $ cd obj
    $ ../configure
    $ make

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Don Brady <don.brady@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #8921
Closes #8943
2019-09-25 11:27:48 -07:00
Matthew Ahrens 7d64595c25 dn_struct_rwlock can not be held in dmu_tx_try_assign()
The thread calling dmu_tx_try_assign() can't hold the dn_struct_rwlock
while assigning the tx, because this can lead to deadlock. Specifically,
if this dnode is already assigned to an earlier txg, this thread may
need to wait for that txg to sync (the ERESTART case below).  The other
thread that has assigned this dnode to an earlier txg prevents this txg
from syncing until its tx can complete (calling dmu_tx_commit()), but it
may need to acquire the dn_struct_rwlock to do so (e.g. via
dmu_buf_hold*()).

This commit adds an assertion to dmu_tx_try_assign() to ensure that this
deadlock is not inadvertently introduced.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #8929
2019-09-25 11:27:48 -07:00
gordan-bobic be4a282a8d Remove arch and relax version dependency
Remove arch and relax version dependency for zfs-dracut
package.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gordan Bobic <gordan@redsleeve.org>
Issue #8913
Closes #8914
2019-09-25 11:27:48 -07:00
Harry Mallon 2d88230d97 Add libnvpair to libzfs pkg-config
Functions such as `fnvlist_lookup_nvlist` need libnvpair to be linked.
Default pkg-config file did not contain it.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Harry Mallon <hjmallon@gmail.com>
Closes #8919
2019-09-25 11:27:48 -07:00
Don Brady 95fcb04215 Let zfs mount all tolerate in-progress mounts
The zfs-mount service can unexpectedly fail to start when zfs
encounters a mount that is in progress. This service uses
zfs mount -a, which has a window between the time it checks if
the dataset was mounted and when the actual mount (via mount.zfs
binary) occurs.

The reason for the racing mounts is that both zfs-mount.target
and zfs-share.target are allowed to execute concurrently after
the import.  This is more of an issue with the relatively recent
addition of parallel mounting, and we should consider serializing
the mount and share targets.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Allan Jude <allanjude@freebsd.org>
Signed-off-by: Don Brady <don.brady@delphix.com>
Closes #8881
2019-09-25 11:27:48 -07:00
Allan Jude d053481523 zstreamdump: add per-record-type counters and an overhead counter
Count the bytes of payload for each replication record type

Count the bytes of overhead (replication records themselves)

Include these counters in the output summary at the end of the run.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Signed-off-by: Allan Jude <allanjude@freebsd.org>
Sponsored-By: Klara Systems and Catalogic
Closes #8432
2019-09-25 11:27:48 -07:00
Paul Dagnelie 7a5f4656ce Fix comments on zfs_bookmark_phys
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #8945
2019-09-25 11:27:48 -07:00
Paul Dagnelie 1fd28bd8d4 Add SCSI_PASSTHROUGH to zvols to enable UNMAP support
When exporting ZVOLs as SCSI LUNs, by default Windows will not
issue them UNMAP commands. This reduces storage efficiency in
many cases.

We add the SCSI_PASSTHROUGH flag to the zvol's device queue,
which lets the SCSI target logic know that it can handle SCSI
commands.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: John Gallagher <john.gallagher@delphix.com>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #8933
2019-09-25 11:27:48 -07:00
Tomohiro Kusumi ab24c9cd4c Prevent pointer to an out-of-scope local variable
`show_str` could be a pointer to a local variable in stack
which is out-of-scope by the time
`return (snprintf(buf, buflen, "%s\n", show_str));`
is called.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Closes #8924
Closes #8940
2019-09-25 11:27:48 -07:00
Matthew Ahrens 3c2a42fd25 dedup=verify doesn't clear the blkptr's dedup flag
The logic to handle strong checksum collisions where the data doesn't
match is incorrect. It is not clearing the dedup bit of the blkptr,
which can cause a panic later in zio_ddt_free() due to the dedup table
not matching what is in the blkptr.

Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-48097
Closes #8936
2019-09-25 11:27:48 -07:00
Igor K 9af524b0ee Update vdev_ops_t from illumos
Align vdev_ops_t from illumos for better compatibility.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Igor Kozhukhov <igor@dilos.org>
Closes #8925
2019-09-25 11:27:48 -07:00
Tom Caputi b96ceeead2 Allow unencrypted children of encrypted datasets
When encryption was first added to ZFS, we made a decision to
prevent users from creating unencrypted children of encrypted
datasets. The idea was to prevent users from inadvertently
leaving some of their data unencrypted. However, since the
release of 0.8.0, some legitimate reasons have been brought up
for this behavior to be allowed. This patch simply removes this
limitation from all code paths that had checks for it and updates
the tests accordingly.

Reviewed-by: Jason King <jason.king@joyent.com>
Reviewed-by: Sean Eric Fagan <sef@ixsystems.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #8737
Closes #8870
2019-09-25 11:27:48 -07:00
dacianstremtan 01cc94f68d Replace whereis with type in zfs-lib.sh
The whereis command should not be used since it may not exist
in the initramfs.  The dracut plymouth module also uses the type
command instead of whereis.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Garrett Fields <ghfields@gmail.com>
Signed-off-by: Dacian Reece-Stremtan <dacianstremtan@gmail.com>
Closes #8920
Closes #8938
2019-09-25 11:27:48 -07:00
Tomohiro Kusumi fb6f6b47d6 Use ZFS_DEV macro instead of literals
The rest of the code/comments use ZFS_DEV, so sync with that.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Closes #8912
2019-09-25 11:27:48 -07:00
Michael Niewöhner 2087b6cf49 Fix memory leak in check_disk()
Reviewed-by: Allan Jude <allanjude@freebsd.org>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Signed-off-by: Michael Niewöhner <foss@mniewoehner.de>
Closes #8897
Closes #8911
2019-09-25 11:27:47 -07:00
Olaf Faaland 5b0327bc57 kmod-zfs-devel rpm should provide kmod-spl-devel
When configure is run with --with-spec=redhat, and rpms are built, the
kmod-zfs-devel package is missing

Provides: kmod-spl-devel = %{version}

which is required by software such as Lustre which builds against zfs
kmods.  Adding it makes it easier for such software to build against
both zfs-0.7 (where SPL is separate and may be missing) and zfs-0.8.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #8930
2019-09-25 11:27:47 -07:00
Brian Behlendorf b5e8d14a4b ZTS: Fix mmp_interval failure
The mmp_interval test case was failing on Fedora 30 due to the built-in
'echo' command terminating the script when it was unable to write to
the sysfs module parameter.  This change in behavior was observed with
ksh-2020.0.0-alpha1.  Resolve the issue by using the external cat
command which fails gracefully as expected.

Additionally, remove some incorrect quotes around the $? return values.

Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #8906
2019-09-25 11:27:47 -07:00
Alexander Motin ed7b0d357a Minimize aggsum_compare(&arc_size, arc_c) calls.
For busy ARC situation when arc_size close to arc_c is desired.  But
then it is quite likely that aggsum_compare(&arc_size, arc_c) will need
to flush per-CPU buckets to find exact comparison result.  Doing that
often in a hot path penalizes whole idea of aggsum usage there, since it
replaces few simple atomic additions with dozens of lock acquisitions.

Replacing aggsum_compare() with aggsum_upper_bound() in code increasing
arc_p when ARC is growing (arc_size < arc_c) according to PMC profiles
allows to save ~5% of CPU time in aggsum code during sequential write
to 12 ZVOLs with 16KB block size on large dual-socket system.

I suppose there some minor arc_p behavior change due to lower precision
of the new code, but I don't think it is a big deal, since it should
affect only very small window in time (aggsum buckets are flushed every
second) and in ARC size (buckets are limited to 10 average ARC blocks
per CPU).

Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Allan Jude <allanjude@freebsd.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by:  Alexander Motin <mav@FreeBSD.org>
Closes #8901
2019-09-25 11:27:47 -07:00
Ryan Moeller 9e54b9d930 Python config cleanup
Don't require Python at configure/build unless building pyzfs.
Move ZFS_AC_PYTHON_MODULE to always-pyzfs.m4 where it is used.
Make test syntax more consistent.

Sponsored by: iXsystems, Inc.
Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@ixsystems.com>
Closes #8895
2019-09-25 11:27:47 -07:00
Matthew Ahrens b033353b25 lz4_decompress_abd declared but not defined
`lz4_decompress_abd` is declared in zio_compress.h but it is not defined
anywhere. The declaration should be removed.

Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed-by: Allan Jude <allanjude@freebsd.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-47477
Closes #8894
2019-09-25 11:27:47 -07:00
Matthew Ahrens 6083f40387 panic in removal_remap test on 4K devices
If the zfs_remove_max_segment tunable is changed to be not a multiple of
the sector size, then the device removal code will malfunction and try
to create mappings that are smaller than one sector, leading to a panic.

On debug bits this assertion will fail in spa_vdev_copy_segment():
    ASSERT3U(DVA_GET_ASIZE(&dst), ==, size);

On nondebug, the system panics with a stack like:
    metaslab_free_concrete()
    metaslab_free_impl()
    metaslab_free_impl_cb()
    vdev_indirect_remap()
    free_from_removing_vdev()
    metaslab_free_impl()
    metaslab_free_dva()
    metaslab_free()

Fortunately, the default for zfs_remove_max_segment is 1MB, so this
can't occur by default.  We hit it during this test because
removal_remap.ksh changes zfs_remove_max_segment to 1KB. When testing on
4KB-sector disks, we hit the bug.

This change makes the zfs_remove_max_segment tunable more robust,
automatically rounding it up to a multiple of the sector size. We also
turn some key assertions into VERIFY's so that similar bugs would be
caught before they are encoded on disk (and thus avoid a
panic-reboot-loop).

Reviewed-by: Sean Eric Fagan <sef@ixsystems.com>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-61342
Closes #8893
2019-09-25 11:27:47 -07:00
Matthew Ahrens 592ee2e6dd compress metadata in later sync passes
Starting in sync pass 5 (zfs_sync_pass_dont_compress), we disable
compression (including of metadata).  Ostensibly this helps the sync
passes to converge (i.e. for a sync pass to not need to allocate
anything because it is 100% overwrites).

However, in practice it increases the average number of sync passes,
because when we turn compression off, a lot of block's size will change
and thus we have to re-allocate (not overwrite) them.  It also increases
the number of 128KB allocations (e.g. for indirect blocks and spacemaps)
because these will not be compressed.  The 128K allocations are
especially detrimental to performance on highly fragmented systems,
which may have very few free segments of this size, and may need to load
new metaslabs to satisfy 128K allocations.

We should increase zfs_sync_pass_dont_compress.  In practice on a highly
fragmented system we see a few 5-pass txg's, a tiny number of 6-pass
txg's, and no txg's with more than 6 passes.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: George Wilson <george.wilson@delphix.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-63431
Closes #8892
2019-09-25 11:27:47 -07:00
Alexander Motin cab7d856ea Move write aggregation memory copy out of vq_lock
Memory copy is too heavy operation to do under the congested lock.
Moving it out reduces congestion by many times to almost invisible.
Since the original zio removed from the queue, and the child zio is
not executed yet, I don't see why would the copy need protection.
My guess it just remained like this from the time when lock was not
dropped here, which was added later to fix lock ordering issue.

Multi-threaded sequential write tests with both HDD and SSD pools
with ZVOL block sizes of 4KB, 16KB, 64KB and 128KB all show major
reduction of lock congestion, saving from 15% to 35% of CPU time
and increasing throughput from 10% to 40%.

Reviewed-by: Richard Yao <ryao@gentoo.org>
Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by:  Alexander Motin <mav@FreeBSD.org>
Closes #8890
2019-09-25 11:27:47 -07:00
Tulsi Jain 19cebf0518 Restrict filesystem creation if name referred either '.' or '..'
This change restricts filesystem creation if the given name
contains either '.' or '..'

Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Signed-off-by: TulsiJain <tulsi.jain@delphix.com>
Closes #8842
Closes #8564
2019-09-25 11:27:47 -07:00
Matthew Ahrens 77e64c6fff ztest: dmu_tx_assign() gets ENOSPC in spa_vdev_remove_thread()
When running zloop, we occasionally see the following crash:

    dmu_tx_assign(tx, TXG_WAIT) == 0 (0x1c == 0)
    ASSERT at ../../module/zfs/vdev_removal.c:1507:spa_vdev_remove_thread()/sbin/ztest(+0x89c3)[0x55faf567b9c3]

The error value 0x1c is ENOSPC.

The transaction used by spa_vdev_remove_thread() should not be able to
fail due to being out of space. i.e. we should not call
dmu_tx_hold_space().  This will allow the removal thread to schedule its
work even when the pool is low on space.  The "slop space" will provide
enough free space to sync out the txg.

Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-37853
Closes #8889
2019-09-25 11:27:47 -07:00
Tomohiro Kusumi 4f809bddc6 Fix lockdep warning on insmod
sysfs_attr_init() is required to make lockdep happy for dynamically
allocated sysfs attributes. This fixed #8868 on Fedora 29 running
kernel-debug.

This requirement was introduced in 2.6.34.
See include/linux/sysfs.h for what it actually does.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Closes #8868
Closes #8884
2019-09-25 11:27:47 -07:00
Matthew Ahrens 516a08ebb4 fat zap should prefetch when iterating
When iterating over a ZAP object, we're almost always certain to iterate
over the entire object. If there are multiple leaf blocks, we can
realize a performance win by issuing reads for all the leaf blocks in
parallel when the iteration begins.

For example, if we have 10,000 snapshots, "zfs destroy -nv
pool/fs@1%9999" can take 30 minutes when the cache is cold. This change
provides a >3x performance improvement, by issuing the reads for all ~64
blocks of each ZAP object in parallel.

Reviewed-by: Andreas Dilger <andreas.dilger@whamcloud.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-58347
Closes #8862
2019-09-25 11:27:47 -07:00
Matthew Ahrens 812c36fc71 Target ARC size can get reduced to arc_c_min
Sometimes the target ARC size is reduced to arc_c_min, which impacts
performance.  We've seen this happen as part of the random_reads
performance regression test, where the ARC size is reduced before the
reads test starts which impacts how long it takes for system to reach
good IOPS performance.

We call arc_reduce_target_size when arc_reap_cb_check() returns TRUE,
and arc_available_memory() is less than arc_c>>arc_shrink_shift.

However, arc_available_memory() could easily be low, even when arc_c is
low, because we can have tons of unused bufs in the abd kmem cache. This
would be especially true just after the DMU requests a bunch of stuff be
evicted from the ARC (e.g. due to "zpool export").

To fix this, the ARC should reduce arc_c by the requested amount, not
all the way down to arc_size (or arc_c_min), which can be very small.

Reviewed-by: Tim Chase <tim@chase2k.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-59431
Closes #8864
2019-09-25 11:27:47 -07:00
bnjf fe11968bbf Fix typo in vdev_raidz_math.c
Fix typo in vdev_raidz_math.c

Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brad Forschinger <github@bnjf.id.au>
Closes #8875
Closes #8880
2019-09-25 11:27:47 -07:00
Richard Elling 4be4dedb9f Improve ZTS block_device_wait debugging
The udevadm settle timeout can be 120 or 180 seconds by default
for some distributions. If a long delay is experienced, it could
be due to some strangeness in a malfunctioning device that isn't
related to the devices under test. To help debug this condition,
a notice is given if settle takes too long.

Arguments can now be passed to block_device_wait. The expected
arguments are block device pathnames.

Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Elling <Richard.Elling@RichardElling.com>
Closes #8839
2019-09-25 11:27:47 -07:00
Richard Elling fb52bf9b1d Block_device_wait does not return an error code
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Elling <Richard.Elling@RichardElling.com>
Closes #8839
2019-09-25 11:27:47 -07:00
Richard Elling a22b00f924 Remove redundant redundant remove
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Elling <Richard.Elling@RichardElling.com>
Closes #8839
2019-09-25 11:27:47 -07:00
Richard Elling c350e62309 Fix logic error in setpartition function
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Elling <Richard.Elling@RichardElling.com>
Closes #8839
2019-09-25 11:27:47 -07:00
Paul Dagnelie 6f7bc75825 Allow metaslab to be unloaded even when not freed from
On large systems, the memory used by loaded metaslabs can become
a concern. While range trees are a fairly efficient data structure,
on heavily fragmented pools they can still consume a significant
amount of memory. This problem is amplified when we fail to unload
metaslabs that we aren't using. Currently, we only unload a metaslab
during metaslab_sync_done; in order for that function to be called
on a given metaslab in a given txg, we have to have dirtied that
metaslab in that txg. If the dirtying was the result of an allocation,
we wouldn't be unloading it (since it wouldn't be 8 txgs since it
was selected), so in effect we only unload a metaslab during txgs
where it's being freed from.

We move the unload logic from sync_done to a new function, and
call that function on all metaslabs in a given vdev during
vdev_sync_done().

Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #8837
2019-09-25 11:27:47 -07:00
Jorgen Lundman 06900c409b Avoid updating zfs_gitrev.h when rev is unchanged
Build process would always re-compile spa_history.c due to touching
zfs_gitrev.h - avoid if no change in gitrev.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Reviewed-by: Allan Jude <allanjude@freebsd.org>
Signed-off-by: Jorgen Lundman <lundman@lundman.net>
Closes #8860
2019-09-25 11:27:47 -07:00
Allan Jude 60cbc18136 l2arc_apply_transforms: Fix typo in comment
Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Signed-off-by: Allan Jude <allanjude@freebsd.org>
Closes #8822
2019-09-25 11:27:46 -07:00
Serapheim Dimitropoulos b63ed49c29 Reduced IOPS when all vdevs are in the zfs_mg_fragmentation_threshold
Historically while doing performance testing we've noticed that IOPS
can be significantly reduced when all vdevs in the pool are hitting
the zfs_mg_fragmentation_threshold percentage. Specifically in a
hypothetical pool with two vdevs, what can happen is the following:
Vdev A would go above that threshold and only vdev B would be used.
Then vdev B would pass that threshold but vdev A would go below it
(we've been freeing from A to allocate to B). The allocations would
go back and forth utilizing one vdev at a time with IOPS taking a hit.

Empirically, we've seen that our vdev selection for allocations is
good enough that fragmentation increases uniformly across all vdevs
the majority of the time. Thus we set the threshold percentage high
enough to avoid hitting the speed bump on pools that are being pushed
to the edge. We effectively disable its effect in the majority of the
cases but we don't remove (at least for now) just in case we hit any
weird behavior in the future.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Closes #8859
2019-09-25 11:27:46 -07:00
Tomohiro Kusumi fafe72712a Drop objid argument in zfs_znode_alloc() (sync with OpenZFS)
Since zfs_znode_alloc() already takes dmu_buf_t*, taking another
uint64_t argument for objid is redundant. inode's ->i_ino does and
needs to match znode's ->z_id.

zfs_znode_alloc() in FreeBSD and illumos doesn't have this argument
since vnode doesn't have vnode# in VFS (hence ->z_id exists).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@osnexus.com>
Closes #8841
2019-09-25 11:27:46 -07:00
Tomohiro Kusumi 328c95e391 Remove vn_set_fs_pwd()/vn_set_pwd() (no need to be at / during insmod)
Per suggestion from @behlendorf in #8777, remove vn_set_fs_pwd() and
vn_set_pwd() which are only used in zfs_ioctl.c:_init() while loading
zfs.ko.

The rest of initialization functions being called here after cwd set
to / don't depend on cwd of the process except for spa_config_load().
spa_config_load() uses a relative path ".//etc/zfs/zpool.cache" when
`rootdir` is non-NULL, which is "/etc/zfs/zpool.cache" given cwd is /,
so just unconditionally use the absolute path without "./", so that
`vn_set_pwd("/")` as well as the entire functions can be removed.
This is also what FreeBSD does.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@osnexus.com>
Closes #8826
2019-09-25 11:27:46 -07:00
Josh Soref 6ce10fdabb grammar: it is / plural agreement
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
Closes #8818
2019-09-25 11:27:46 -07:00
Tomohiro Kusumi e4a11acfac Refactor parent dataset handling in libzfs zfs_rename()
For recursive renaming, simplify the code by moving `zhrp` and
`parentname` to inner scope. `zhrp` is only used to test existence
of a parent dataset for recursive dataset dir scan since ba6a24026c.

Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@osnexus.com>
Closes #8815
2019-09-25 11:27:46 -07:00
Ryan Moeller 90d8067a77 Update comments to match code
s/get_vdev_spec/make_root_vdev

The former doesn't exist anymore.

Sponsored by: iXsystems, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Ryan Moeller <ryan@freqlabs.com>
Closes #8759
2019-09-25 11:27:46 -07:00
Tomohiro Kusumi e5a877c5d0 Update descriptions for vnops
These descriptions are not uptodate with the code.

Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Closes #8767
2019-09-25 11:27:46 -07:00
Tomohiro Kusumi 4933b0a25b Drop local definition of MOUNT_BUSY
It's accessible via <sys/mntent.h>.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@osnexus.com>
Closes #8765
2019-09-25 11:27:46 -07:00
Rafael Kitover e0cd6c28a3 kernel timer API rework
In `config/kernel-timer.m4` refactor slightly to check more generally
for the new `timer_setup()` APIs, but also check the callback signature
because some kernels (notably 4.14) have the new `timer_setup()` API but
use the old callback signature. Also add a check for a `flags` member in
`struct timer_list`, which was added in 4.1-rc8.

Add compatibility shims to `include/spl/sys/timer.h` to allow using the
new timer APIs with the only two caveats being that the callback
argument type must be declared as `spl_timer_list_t` and an explicit
assignment is required to get the timer variable for the `timer_of()`
macro. So the callback would look like this:

```c
__cv_wakeup(spl_timer_list_t t)
{
        struct timer_list *tmr = (struct timer_list *)t;
	struct thing *parent = from_timer(parent, tmr,
		parent_timer_field);
	... /* do stuff with parent */
```

Make some minor changes to `spl-condvar.c` and `spl-taskq.c` to use the
new timer APIs instead of conditional code.

Reviewed-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rafael Kitover <rkitover@gmail.com>
Closes #8647
2019-09-25 11:27:46 -07:00
Tony Hutter 63b88f7e22 Tag zfs-0.8.1
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2019-06-14 09:43:18 -07:00
Alexander Motin 72888812b0 Fix comparison signedness in arc_is_overflowing()
When ARC size is very small, aggsum_lower_bound(&arc_size) may return
negative values, that due to unsigned comparison caused delays, waiting
for arc_adjust() to "fix" it by calling aggsum_value(&arc_size).  Use
of signed comparison there fixes the problem.

Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Closes #8873
2019-06-11 10:45:23 -07:00
Tom Caputi 581c77e725 Fix incorrect error message for raw receive
This patch fixes an incorrect error message that comes up when
doing a non-forcing, raw, incremental receive into a dataset
that has a newer snapshot than the "from" snapshot. In this
case, the current code prints a confusing message about an IVset
guid mismatch.

This functionality is supported by non-raw receives as an
undocumented feature, but was never supported by the raw receive
code. If this is desired in the future, we can probably figure
out a way to make it work.

Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Issue #8758
Closes #8863
2019-06-11 10:45:17 -07:00
Eli Schwartz ba505f90d8 arc_summary: prefer python3 version and install when there is no python
This matches the behavior of other python scripts, such as arcstat and
dbufstat, which are always installed but whose install-exec-hook actions
will simply touch up the shebang if a python interpreter was configured
*and* that interpreter is a python2 interpreter.

Fixes installation in a minimal build chroot without python available.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@freqlabs.com>
Signed-off-by: Eli Schwartz <eschwartz@archlinux.org>
Closes #8851
2019-06-11 10:45:02 -07:00
Samuel VERSCHELDE eaa21b2349 Fix %post and %postun generation in kmodtool
During zfs-kmod RPM build, $(uname -r) gets unintentionally evaluated on
the build host, once and for all. It should be evaluated during the
execution of the scriptlets on the installation host. Escaping the $
character avoids evaluating it during build.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Signed-off-by: Samuel Verschelde <stormi-xcp@ylix.fr>
Closes #8866
2019-06-11 10:44:54 -07:00
Tom Caputi 8dc8bbde6e Reinstate raw receive check when truncating
This patch re-adds a check that was removed in 369aa50. The check
confirms that a raw receive is not occuring before truncating an
object's dn_maxblkid. At the time, it was believed that all cases
that would hit this code path would be handled in other places,
but that was not the case.

Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #8852 
Closes #8857
2019-06-07 12:45:40 -07:00
Garrett Fields 9fd95a2f1b If $ZFS_BOOTFS contains guid, replace the guid portion with $pool
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Signed-off-by: Garrett Fields <ghfields@gmail.com>
Closes #8356
2019-06-07 12:45:40 -07:00
Tom Caputi 35050ef39e Fix integer overflow of ZTOI(zp)->i_generation
The ZFS on-disk format stores each inode's generation ID as a 64
bit number on disk and in-core. However, the Linux kernel's inode
is only a 32 bit number. In most places, the code handles this
correctly, but the cast is missing in zfs_rezget(). For many pools,
this isn't an issue since the generation ID is computed as the
current txg when the inode is created and many pools don't have
more than 2^32 txgs.

For the pools that have more txgs, this issue causes any inode with
a high enough generation number to report IO errors after a call to
"zfs rollback" while holding the file or directory open. This patch
simply adds the missing cast.

Reviewed-by: Alek Pinchuk <apinchuk@datto.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #8858
2019-06-07 12:45:40 -07:00
Don Brady 5108d27aec hkdf_test binary should only have one icp instance
The build for test binary hkdf_test was linking both against libicp 
and libzpool. This results in two instances of libicp inside the 
binary but the call to icp_init() only initializes one of them!

Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Don Brady <don.brady@delphix.com>
Closes #8850
2019-06-07 12:45:40 -07:00
Peter Wirdemo 02010e9c2c Fixed a small typo in man/man1/raidz_test.1
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Signed-off-by: Peter Wirdemo <peter.wirdemo@gmail.com>
Closes #8855
2019-06-07 12:45:40 -07:00
Torsten Wörtwein a0bf24952d Allow TRIM_UNUSED_KSYM when build as a builtin-module
If ZFS is built with enable_linux_builtin, it seems to be possible
to compile the kernel with TRIM_UNUSED_KSYM.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Torsten Wörtwein <twoertwein@gmail.com>
Closes #8820
2019-06-07 12:45:40 -07:00
Ryan Moeller d6920fb996 Make Python detection optional and more portable
Previously, --without-python would cause ./configure to fail. Now it is
able to proceed, and the Python scripts will not be built.

Use portable parameter expansion matching instead of nonstandard
substring matching to detect the Python version.  This test is
duplicated in several places, so define a function for it.

Don't assume the full path to binaries, since different platforms do
install things in different places.  Use AC_CHECK_PROGS instead.

When building without Python, also build without pyzfs.

Sponsored by: iXsystems, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Eli Schwartz <eschwartz93@gmail.com>
Signed-off-by: Ryan Moeller <ryan@freqlabs.com>
Closes #8809 
Closes #8731
2019-06-07 12:45:40 -07:00
DeHackEd 58b2de6420 Wait in 'S' state when send/recv pipe is blocking
Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: DHE <git@dehacked.net>
Closes #8733 
Closes #8752
2019-06-07 12:45:40 -07:00
TulsiJain 11ad06d1d8 Make zfs_async_block_max_blocks handle zero correctly
Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: TulsiJain <tulsi.jain@delphix.com>
Closes #8829
Closes #8289
2019-06-07 12:45:40 -07:00
Brian Behlendorf 4f8eef29e0 Revert "Report holes when there are only metadata changes"
This reverts commit ec4f9b8f30 which introduced a narrow race which
can lead to lseek(, SEEK_DATA) incorrectly returning ENXIO.  Resolve
the issue by revering this change to restore the previous behavior
which depends solely on checking the dirty list.

Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #8816 
Closes #8834
2019-06-07 12:45:40 -07:00
Tomohiro Kusumi 94866d8309 Add link count test for root inode
Add tests for
97aa3ba44("Fix link count of root inode when snapdir is visible")
as suggested in #8727.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@osnexus.com>
Closes #8732
2019-06-07 12:45:40 -07:00
Brian Behlendorf a1eaf0dde0 Exclude log device ashift from normal class
When opening a log device during import its allocation bias will
not yet have been set by vdev_load().  This results in the log
device's ashift being incorrectly applied to the maximum ashift
of the vdevs in the normal class.  Which in turn prevents the
removal of any top-level devices due to the ashift check in the
spa_vdev_remove_top_check() function.

This issue is resolved by including vdev_islog in the check since
it will be set correctly during vdev_open().

Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #8735
2019-06-07 12:45:40 -07:00
madz 580256045b Fix integer overflow in get_next_chunk()
dn->dn_datablksz type is uint32_t and need to be casted to uint64_t
to avoid an overflow when the record size is greater than 4 MiB.

Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olivier Mazouffre <olivier.mazouffre@ims-bordeaux.fr>
Closes #8778 
Closes #8797
2019-06-07 12:45:40 -07:00
loli10K aaf3b30dcf Double-free of encryption wrapping key due to invalid pool properties
This commits fixes a double-free in zfs_ioc_pool_create() triggered by
specifying an unsupported combination of properties when creating a pool
with encryption enabled.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #8791
2019-06-07 12:45:40 -07:00
Stoiko Ivanov 27b446f799 tests: fix cosmetic permission issues during make install
files in dist_*_SCRIPTS get installed with 0755, those in dist_*_DATA
with 0644. This commit moves all .kshlib, .shlib and .cfg files in the
testsuite to dist_pkgdata_DATA, and removes the shebang from
zpool_import.kshlib.

This ensures that the files are installed with appropriate permissions
and silences some warnings from lintian

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
Closes #8803
2019-06-07 12:45:40 -07:00
Stoiko Ivanov 0c6206e7f1 test-runner.py: change shebang to python3
In commit 6e72a5b9b6 python scripts which
work with python2 and python3 changed the shebang from /usr/bin/python
to /usr/bin/python3. This gets adapted by the build-system on systems
which don't provide python3.
This commit changes test-runner.py to also use /usr/bin/python3,
enabling the change during buildtime and fixing a minor lintian issue
for those Debian packages, which depend on a specific python version
(python3/python2).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
Closes #8803
2019-06-07 12:45:40 -07:00
loli10K 51de7ccb42 Endless loop in zpool_do_remove() on platforms with unsigned char
On systems where "char" is an unsigned type the value returned by
getopt() will never be negative (-1), leading to an endless loop:
this issue prevents both 'zpool remove' and 'zstreamdump' for
working on some systems.

Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #8789
2019-06-07 12:45:40 -07:00
Tom Caputi 69ae34076f Fix embedded bp accounting in count_block()
Currently, count_block() does not correctly account for the
possibility that the bp that is passed to it could be embedded.
These blocks shouldn't be counted since the work of scanning
these blocks in already handled when the containing block is
scanned. This patch simply resolves this issue by returning
early in this case.

Reviewed by: Allan Jude <allanjude@freebsd.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Authored-by: Bill Sommerfeld <sommerfeld@alum.mit.edu>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #8800 
Closes #8766
2019-06-07 12:39:13 -07:00
Tom Caputi b746c397e3 Disable parallel processing for 'zfs mount -l'
Currently, 'zfs mount -a' will always attempt to parallelize
work related to mounting as best it can. Unfortunately, when
the user passes the '-l' option to load keys, this causes
all threads to prompt the user for their keys at once,
resulting in a confusing and racy user experience. This patch
simply disables parallel mounting when using the '-l' flag.

Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #8762 
Closes #8811
2019-06-07 12:39:13 -07:00
Tomohiro Kusumi 2fb37bcadd Linux 5.2 compat: Directly call wait_on_page_bit()
wait_on_page_writeback() was made GPL only in torvalds/linux@19343b5bdd.

Directly call wait_on_page_bit() without using wait_on_page_writeback()
interface, given zfs_putpage() is the only caller for now.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@osnexus.com>
Closes #8794
2019-06-07 12:39:13 -07:00
Tomohiro Kusumi a727f69e52 Linux 5.2 compat: Fix config/kernel-shrink.m4 test failure
"whether ->count_objects callback exists" test failed with
"error: error" message for using an incomplete function shrinker_cb().

This is caused by torvalds/linux@83da1bed86. It's configurable,
but we would want to be able to compile with default kbuild setting.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@osnexus.com>
Closes #8776
2019-06-07 12:39:13 -07:00
Tomohiro Kusumi 8ec352be1f Linux 5.2 compat: Remove config/kernel-set-fs-pwd.m4
This failed on 5.2-rc1 with "error: unknown" message, for set_fs_pwd()
not being visible in both const and non-const tests.

This is caused by torvalds/linux@83da1bed86. It's configurable,
but we would want to be able to compile with default kbuild setting.

set_fs_pwd() has never been exported with exception of some distro
kernels, and set_fs_pwd() wasn't used in ZoL to begin with. The test
result was used for a spl function vn_set_fs_pwd().

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@osnexus.com>
Closes #8777
2019-06-07 12:39:13 -07:00
loli10K df717bb835 zpool: status -t is not documented in help message
This commit adds the undocumented "-t" option to zpool(8) help message.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #8782
2019-06-07 12:39:13 -07:00
loli10K e0b3689ed5 zfs-tests: fix warnings when packaging some .shlib files
This change prevents the following warning when packaging some zfs-tests
files:

   *** WARNING: ./usr/src/zfs-0.8.0/tests/zfs-tests/include/zpool_script.shlib
   is executable but has empty or no shebang, removing executable bit

Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #8787
2019-06-07 12:39:13 -07:00
loli10K 438275c9a0 VERIFY3P() message is missing a space character
This commit just reintroduces a [space] character inadvertently removed
in a887d653.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #8786
2019-06-07 12:39:13 -07:00
loli10K 8cfa6d4a1c zfs-tests: verify zfs(8) and zpool(8) help message is under 80 columns
This commit updates the ZFS Test Suite to detect incorrect wrapping of
both zfs(8) and zpool(8) help message

Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #8785
2019-06-07 12:39:13 -07:00
loli10K ad0157ec91 zfs: don't pretty-print objsetid property
The objsetid property, while being stored as a number, is a dataset
identifier and should not be pretty-printed.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #8784
2019-06-07 12:39:13 -07:00
loli10K cd75d5f710 zfs: missing newline character in zfs_do_channel_program() error message
This commit simply adds a missing newline ("\n") character to the error
message printed by the zfs command when the provided pool parameter
can't be found.

Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #8783
2019-06-07 12:39:13 -07:00
siv0 c6bbacebc8 Fix ksh-path for random_readwrite_fixed.ksh
The test in zfs-tests/tests/perf/regression/random_readwrite_fixed.ksh
is the only file to use /usr/bin/ksh in the shebang.
Change it to /bin/ksh for consistency.

Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
Closes #8779
2019-06-07 12:39:13 -07:00
Tomohiro Kusumi 4d7cb872e8 Linux 2.6.39 compat: Test if kstrtoul() exists
kstrtoul() exists only after torvalds/linux@33ee3b2e2e in 2.6.39.
Use strict_strtoul() if kstrtoul() doesn't exist.
Note that strict_strtoul() has existed as an alias for kstrtoul()
for a while, but removed in torvalds/linux@3db2e9cdc0.

It looks like RHEL6 (2.6.32 based) has backported kstrtoul(),
and this caused build CI to pass compilation test.
It should fail on vanilla < 2.6.39 kernels or distro kernels without
backport as reported in #8760.

--
 # grep "kstrtoul(" /lib/modules/2.6.32-754.12.1.el6.x86_64/build/ \
 include/linux/kernel.h >/dev/null
 # echo $?
 0

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Closes #8760 
Closes #8761
2019-06-07 12:39:13 -07:00
loli10K f91e7e6284 Device removal panics on 32-bit systems
The issue is caused by an incorrect usage of the sizeof() operator
in vdev_obsolete_sm_object(): on 64-bit systems this is not an issue
since both "uint64_t" and "uint64_t*" are 8 bytes in size. However on
32-bit systems pointers are 4 bytes long which is not supported by
zap_lookup_impl(). Trying to remove a top-level vdev on a 32-bit system
will cause the following failure:

VERIFY3(0 == vdev_obsolete_sm_object(vd, &obsolete_sm_object)) failed (0 == 22)
PANIC at vdev_indirect.c:833:vdev_indirect_sync_obsolete()
Showing stack for process 1315
CPU: 6 PID: 1315 Comm: txg_sync Tainted: P           OE   4.4.69+ #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014
 c1abc6e7 0ae10898 00000286 d4ac3bc0 c14397bc da4cd7d8 d4ac3bf0 d4ac3bd0
 d790e7ce d7911cc1 00000523 d4ac3d00 d790e7d7 d7911ce4 da4cd7d8 00000341
 da4ce664 da4cd8c0 da33fa6e 49524556 28335946 3d3d2030 65647620 626f5f76
Call Trace:
 [<>] dump_stack+0x58/0x7c
 [<>] spl_dumpstack+0x23/0x27 [spl]
 [<>] spl_panic.cold.0+0x5/0x41 [spl]
 [<>] ? dbuf_rele+0x3e/0x90 [zfs]
 [<>] ? zap_lookup_norm+0xbe/0xe0 [zfs]
 [<>] ? zap_lookup+0x57/0x70 [zfs]
 [<>] ? vdev_obsolete_sm_object+0x102/0x12b [zfs]
 [<>] vdev_indirect_sync_obsolete+0x3e1/0x64d [zfs]
 [<>] ? txg_verify+0x1d/0x160 [zfs]
 [<>] ? dmu_tx_create_dd+0x80/0xc0 [zfs]
 [<>] vdev_sync+0xbf/0x550 [zfs]
 [<>] ? mutex_lock+0x10/0x30
 [<>] ? txg_list_remove+0x9f/0x1a0 [zfs]
 [<>] ? zap_contains+0x4d/0x70 [zfs]
 [<>] spa_sync+0x9f1/0x1b10 [zfs]
 ...
 [<>] ? kthread_stop+0x110/0x110

This commit simply corrects the "integer_size" parameter used to lookup
the vdev's ZAP object.

Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #8790
2019-06-07 12:39:13 -07:00
loli10K abe267f677 zpool: trim -p is not a valid option
This commit removes the documented but not handled "-p" option from
zpool(8) help message.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #8781
2019-06-07 12:39:13 -07:00
loli10K cc434dcf45 Fix coverity defects: CID 186143
CID 186143: Memory - illegal accesses (USE_AFTER_FREE)

This patch fixes an use-after-free in spa_import_progress_destroy()
moving the kmem_free() call at the end of the function.

Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #8788
2019-06-07 12:39:13 -07:00
Igor K e2e7b0a2cd Rename reservation tests from *.sh to *.ksh
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Igor Kozhukhov <igor@dilos.org>
Closes #8729
2019-06-07 12:39:13 -07:00
254 changed files with 3154 additions and 1644 deletions
+11
View File
@@ -63,3 +63,14 @@ cscope.*
*.log
venv
#
# Module leftovers
#
/module/avl/zavl.mod
/module/icp/icp.mod
/module/lua/zlua.mod
/module/nvpair/znvpair.mod
/module/spl/spl.mod
/module/unicode/zunicode.mod
/module/zcommon/zcommon.mod
/module/zfs/zfs.mod
+2 -2
View File
@@ -1,10 +1,10 @@
Meta: 1
Name: zfs
Branch: 1.0
Version: 0.8.0
Version: 0.8.2
Release: 1
Release-Tags: relext
License: CDDL
Author: OpenZFS on Linux
Linux-Maximum: 5.1
Linux-Maximum: 5.3
Linux-Minimum: 2.6.32
+4 -2
View File
@@ -52,7 +52,8 @@ distclean-local::
-type f -print | xargs $(RM)
all-local:
-${top_srcdir}/scripts/zfs-tests.sh -c
-[ -x ${top_builddir}/scripts/zfs-tests.sh ] && \
${top_builddir}/scripts/zfs-tests.sh -c
dist-hook: gitrev
cp ${top_srcdir}/include/zfs_gitrev.h $(distdir)/include; \
@@ -111,9 +112,10 @@ mancheck:
fi
testscheck:
@find ${top_srcdir}/tests/zfs-tests/tests -type f \
@find ${top_srcdir}/tests/zfs-tests -type f \
\( -name '*.ksh' -not -executable \) -o \
\( -name '*.kshlib' -executable \) -o \
\( -name '*.shlib' -executable \) -o \
\( -name '*.cfg' -executable \) | \
xargs -r stat -c '%A %n' | \
awk '{c++; print} END {if(c>0) exit 1}'
+7 -2
View File
@@ -1,3 +1,8 @@
SUBDIRS = zfs zpool zdb zhack zinject zstreamdump ztest
SUBDIRS += mount_zfs fsck_zfs zvol_id vdev_id arcstat dbufstat zed
SUBDIRS += arc_summary raidz_test zgenhostid
SUBDIRS += fsck_zfs vdev_id raidz_test zgenhostid
if USING_PYTHON
SUBDIRS += arcstat arc_summary dbufstat
endif
SUBDIRS += mount_zfs zed zvol_id zvol_wait
+1 -3
View File
@@ -4,9 +4,7 @@ if USING_PYTHON_2
dist_bin_SCRIPTS = arc_summary2
install-exec-hook:
mv $(DESTDIR)$(bindir)/arc_summary2 $(DESTDIR)$(bindir)/arc_summary
endif
if USING_PYTHON_3
else
dist_bin_SCRIPTS = arc_summary3
install-exec-hook:
mv $(DESTDIR)$(bindir)/arc_summary3 $(DESTDIR)$(bindir)/arc_summary
+2 -55
View File
@@ -1,12 +1,11 @@
SUBDIRS = zed.d
include $(top_srcdir)/config/Rules.am
DEFAULT_INCLUDES += \
-I$(top_srcdir)/include \
-I$(top_srcdir)/lib/libspl/include
EXTRA_DIST = zed.d/README \
zed.d/history_event-zfs-list-cacher.sh.in
sbin_PROGRAMS = zed
ZED_SRC = \
@@ -47,55 +46,3 @@ zed_LDADD = \
zed_LDADD += -lrt
zed_LDFLAGS = -pthread
zedconfdir = $(sysconfdir)/zfs/zed.d
dist_zedconf_DATA = \
zed.d/zed-functions.sh \
zed.d/zed.rc
zedexecdir = $(zfsexecdir)/zed.d
dist_zedexec_SCRIPTS = \
zed.d/all-debug.sh \
zed.d/all-syslog.sh \
zed.d/data-notify.sh \
zed.d/generic-notify.sh \
zed.d/resilver_finish-notify.sh \
zed.d/scrub_finish-notify.sh \
zed.d/statechange-led.sh \
zed.d/statechange-notify.sh \
zed.d/vdev_clear-led.sh \
zed.d/vdev_attach-led.sh \
zed.d/pool_import-led.sh \
zed.d/resilver_finish-start-scrub.sh
nodist_zedexec_SCRIPTS = zed.d/history_event-zfs-list-cacher.sh
$(nodist_zedexec_SCRIPTS): %: %.in
-$(SED) -e 's,@bindir\@,$(bindir),g' \
-e 's,@runstatedir\@,$(runstatedir),g' \
-e 's,@sbindir\@,$(sbindir),g' \
-e 's,@sysconfdir\@,$(sysconfdir),g' \
$< >'$@'
zedconfdefaults = \
all-syslog.sh \
data-notify.sh \
resilver_finish-notify.sh \
scrub_finish-notify.sh \
statechange-led.sh \
statechange-notify.sh \
vdev_clear-led.sh \
vdev_attach-led.sh \
pool_import-led.sh \
resilver_finish-start-scrub.sh
install-data-hook:
$(MKDIR_P) "$(DESTDIR)$(zedconfdir)"
for f in $(zedconfdefaults); do \
test -f "$(DESTDIR)$(zedconfdir)/$${f}" -o \
-L "$(DESTDIR)$(zedconfdir)/$${f}" || \
ln -s "$(zedexecdir)/$${f}" "$(DESTDIR)$(zedconfdir)"; \
done
chmod 0600 "$(DESTDIR)$(zedconfdir)/zed.rc"
+2 -1
View File
@@ -116,7 +116,8 @@ zfs_agent_iter_vdev(zpool_handle_t *zhp, nvlist_t *nvl, void *arg)
/*
* On a devid match, grab the vdev guid and expansion time, if any.
*/
if ((nvlist_lookup_string(nvl, ZPOOL_CONFIG_DEVID, &path) == 0) &&
if (gsp->gs_devid != NULL &&
(nvlist_lookup_string(nvl, ZPOOL_CONFIG_DEVID, &path) == 0) &&
(strcmp(gsp->gs_devid, path) == 0)) {
(void) nvlist_lookup_uint64(nvl, ZPOOL_CONFIG_GUID,
&gsp->gs_vdev_guid);
+57
View File
@@ -0,0 +1,57 @@
include $(top_srcdir)/config/Rules.am
EXTRA_DIST = \
README \
history_event-zfs-list-cacher.sh.in
zedconfdir = $(sysconfdir)/zfs/zed.d
dist_zedconf_DATA = \
zed-functions.sh \
zed.rc
zedexecdir = $(zfsexecdir)/zed.d
dist_zedexec_SCRIPTS = \
all-debug.sh \
all-syslog.sh \
data-notify.sh \
generic-notify.sh \
resilver_finish-notify.sh \
scrub_finish-notify.sh \
statechange-led.sh \
statechange-notify.sh \
vdev_clear-led.sh \
vdev_attach-led.sh \
pool_import-led.sh \
resilver_finish-start-scrub.sh
nodist_zedexec_SCRIPTS = history_event-zfs-list-cacher.sh
$(nodist_zedexec_SCRIPTS): %: %.in
-$(SED) -e 's,@bindir\@,$(bindir),g' \
-e 's,@runstatedir\@,$(runstatedir),g' \
-e 's,@sbindir\@,$(sbindir),g' \
-e 's,@sysconfdir\@,$(sysconfdir),g' \
$< >'$@'
zedconfdefaults = \
all-syslog.sh \
data-notify.sh \
resilver_finish-notify.sh \
scrub_finish-notify.sh \
statechange-led.sh \
statechange-notify.sh \
vdev_clear-led.sh \
vdev_attach-led.sh \
pool_import-led.sh \
resilver_finish-start-scrub.sh
install-data-hook:
$(MKDIR_P) "$(DESTDIR)$(zedconfdir)"
for f in $(zedconfdefaults); do \
test -f "$(DESTDIR)$(zedconfdir)/$${f}" -o \
-L "$(DESTDIR)$(zedconfdir)/$${f}" || \
ln -s "$(zedexecdir)/$${f}" "$(DESTDIR)$(zedconfdir)"; \
done
chmod 0600 "$(DESTDIR)$(zedconfdir)/zed.rc"
@@ -47,7 +47,7 @@ case "${ZEVENT_HISTORY_INTERNAL_NAME}" in
# Only act if one of the tracked properties is altered.
case "${ZEVENT_HISTORY_INTERNAL_STR%%=*}" in
canmount|mountpoint|atime|relatime|devices|exec| \
readonly|setuid|nbmand) ;;
readonly|setuid|nbmand|encroot|keylocation) ;;
*) exit 0 ;;
esac
;;
@@ -62,7 +62,7 @@ zed_lock zfs-list
trap abort_alter EXIT
PROPS="name,mountpoint,canmount,atime,relatime,devices,exec,readonly"
PROPS="${PROPS},setuid,nbmand"
PROPS="${PROPS},setuid,nbmand,encroot,keylocation"
"${ZFS}" list -H -t filesystem -o $PROPS -r "${ZEVENT_POOL}" > "${FSLIST_TMP}"
+29 -8
View File
@@ -28,6 +28,7 @@
* Copyright 2016 Igor Kozhukhov <ikozhukhov@gmail.com>.
* Copyright 2016 Nexenta Systems, Inc.
* Copyright (c) 2019 Datto Inc.
* Copyright (c) 2019, loli10K <ezomori.nozomu@gmail.com>
*/
#include <assert.h>
@@ -2238,7 +2239,7 @@ zfs_do_upgrade(int argc, char **argv)
boolean_t showversions = B_FALSE;
int ret = 0;
upgrade_cbdata_t cb = { 0 };
signed char c;
int c;
int flags = ZFS_ITER_ARGS_CAN_BE_PATHS;
/* check options */
@@ -3932,7 +3933,7 @@ static int
zfs_do_snapshot(int argc, char **argv)
{
int ret = 0;
signed char c;
int c;
nvlist_t *props;
snap_cbdata_t sd = { 0 };
boolean_t multiple_snaps = B_FALSE;
@@ -6445,8 +6446,25 @@ share_mount_one(zfs_handle_t *zhp, int op, int flags, char *protocol,
return (1);
}
if (zfs_mount(zhp, options, flags) != 0)
if (zfs_mount(zhp, options, flags) != 0) {
/*
* Check if a mount sneaked in after we checked
*/
if (!explicit &&
libzfs_errno(g_zfs) == EZFS_MOUNTFAILED) {
usleep(10 * MILLISEC);
libzfs_mnttab_cache(g_zfs, B_FALSE);
if (zfs_is_mounted(zhp, NULL)) {
(void) fprintf(stderr, gettext(
"Ignoring previous 'already "
"mounted' error for '%s'\n"),
zfs_get_name(zhp));
return (0);
}
}
return (1);
}
break;
}
@@ -6621,10 +6639,13 @@ share_mount(int op, int argc, char **argv)
/*
* libshare isn't mt-safe, so only do the operation in parallel
* if we're mounting.
* if we're mounting. Additionally, the key-loading option must
* be serialized so that we can prompt the user for their keys
* in a consistent manner.
*/
zfs_foreach_mountpoint(g_zfs, cb.cb_handles, cb.cb_used,
share_mount_one_cb, &share_mount_state, op == OP_MOUNT);
share_mount_one_cb, &share_mount_state,
op == OP_MOUNT && !(flags & MS_CRYPT));
ret = share_mount_state.sm_status;
for (int i = 0; i < cb.cb_used; i++)
@@ -6729,8 +6750,8 @@ unshare_unmount_compare(const void *larg, const void *rarg, void *unused)
/*
* Convenience routine used by zfs_do_umount() and manual_unmount(). Given an
* absolute path, find the entry /proc/self/mounts, verify that its a
* ZFS filesystems, and unmount it appropriately.
* absolute path, find the entry /proc/self/mounts, verify that it's a
* ZFS filesystem, and unmount it appropriately.
*/
static int
unshare_unmount_path(int op, char *path, int flags, boolean_t is_manual)
@@ -7522,7 +7543,7 @@ zfs_do_channel_program(int argc, char **argv)
}
if ((zhp = zpool_open(g_zfs, poolname)) == NULL) {
(void) fprintf(stderr, gettext("cannot open pool '%s'"),
(void) fprintf(stderr, gettext("cannot open pool '%s'\n"),
poolname);
if (fd != 0)
(void) close(fd);
+12 -13
View File
@@ -30,6 +30,7 @@
* Copyright (c) 2017 Datto Inc.
* Copyright (c) 2017 Open-E, Inc. All Rights Reserved.
* Copyright (c) 2017, Intel Corporation.
* Copyright (c) 2019, loli10K <ezomori.nozomu@gmail.com>
*/
#include <assert.h>
@@ -384,11 +385,11 @@ get_usage(zpool_help_t idx)
case HELP_RESILVER:
return (gettext("\tresilver <pool> ...\n"));
case HELP_TRIM:
return (gettext("\ttrim [-dp] [-r <rate>] [-c | -s] <pool> "
return (gettext("\ttrim [-d] [-r <rate>] [-c | -s] <pool> "
"[<device> ...]\n"));
case HELP_STATUS:
return (gettext("\tstatus [-c [script1,script2,...]] "
"[-igLpPsvxD] [-T d|u] [pool] ... \n"
"[-igLpPstvxD] [-T d|u] [pool] ... \n"
"\t [interval [count]]\n"));
case HELP_UPGRADE:
return (gettext("\tupgrade\n"
@@ -784,7 +785,7 @@ add_prop_list_default(const char *propname, char *propval, nvlist_t **props,
* -P Display full path for vdev name.
*
* Adds the given vdevs to 'pool'. As with create, the bulk of this work is
* handled by get_vdev_spec(), which constructs the nvlist needed to pass to
* handled by make_root_vdev(), which constructs the nvlist needed to pass to
* libzfs.
*/
int
@@ -882,7 +883,7 @@ zpool_do_add(int argc, char **argv)
}
}
/* pass off to get_vdev_spec for processing */
/* pass off to make_root_vdev for processing */
nvroot = make_root_vdev(zhp, props, force, !force, B_FALSE, dryrun,
argc, argv);
if (nvroot == NULL) {
@@ -972,7 +973,7 @@ zpool_do_remove(int argc, char **argv)
int i, ret = 0;
zpool_handle_t *zhp = NULL;
boolean_t stop = B_FALSE;
char c;
int c;
boolean_t noop = B_FALSE;
boolean_t parsable = B_FALSE;
@@ -1231,9 +1232,9 @@ errout:
* -O Set fsproperty=value in the pool's root file system
*
* Creates the named pool according to the given vdev specification. The
* bulk of the vdev processing is done in get_vdev_spec() in zpool_vdev.c. Once
* we get the nvlist back from get_vdev_spec(), we either print out the contents
* (if '-n' was specified), or pass it to libzfs to do the creation.
* bulk of the vdev processing is done in make_root_vdev() in zpool_vdev.c.
* Once we get the nvlist back from make_root_vdev(), we either print out the
* contents (if '-n' was specified), or pass it to libzfs to do the creation.
*/
int
zpool_do_create(int argc, char **argv)
@@ -1387,7 +1388,7 @@ zpool_do_create(int argc, char **argv)
goto errout;
}
/* pass off to get_vdev_spec for bulk processing */
/* pass off to make_root_vdev for bulk processing */
nvroot = make_root_vdev(NULL, props, force, !force, B_FALSE, dryrun,
argc - 1, argv + 1);
if (nvroot == NULL)
@@ -6110,9 +6111,8 @@ zpool_do_detach(int argc, char **argv)
int ret;
/* check options */
while ((c = getopt(argc, argv, "f")) != -1) {
while ((c = getopt(argc, argv, "")) != -1) {
switch (c) {
case 'f':
case '?':
(void) fprintf(stderr, gettext("invalid option '%c'\n"),
optopt);
@@ -6341,12 +6341,11 @@ zpool_do_online(int argc, char **argv)
int flags = 0;
/* check options */
while ((c = getopt(argc, argv, "et")) != -1) {
while ((c = getopt(argc, argv, "e")) != -1) {
switch (c) {
case 'e':
flags |= ZFS_ONLINE_EXPAND;
break;
case 't':
case '?':
(void) fprintf(stderr, gettext("invalid option '%c'\n"),
optopt);
+1
View File
@@ -433,6 +433,7 @@ check_disk(const char *path, blkid_cache cache, int force,
char *value = blkid_get_tag_value(cache, "TYPE", path);
(void) fprintf(stderr, gettext("%s is in use and contains "
"a %s filesystem.\n"), path, value ? value : "unknown");
free(value);
return (-1);
}
+42 -23
View File
@@ -53,7 +53,6 @@
*/
#define DUMP_GROUPING 4
uint64_t total_write_size = 0;
uint64_t total_stream_len = 0;
FILE *send_stream = 0;
boolean_t do_byteswap = B_FALSE;
@@ -219,6 +218,9 @@ main(int argc, char *argv[])
{
char *buf = safe_malloc(SPA_MAXBLOCKSIZE);
uint64_t drr_record_count[DRR_NUMTYPES] = { 0 };
uint64_t total_payload_size = 0;
uint64_t total_overhead_size = 0;
uint64_t drr_byte_count[DRR_NUMTYPES] = { 0 };
char salt[ZIO_DATA_SALT_LEN * 2 + 1];
char iv[ZIO_DATA_IV_LEN * 2 + 1];
char mac[ZIO_DATA_MAC_LEN * 2 + 1];
@@ -237,7 +239,7 @@ main(int argc, char *argv[])
struct drr_write_embedded *drrwe = &thedrr.drr_u.drr_write_embedded;
struct drr_object_range *drror = &thedrr.drr_u.drr_object_range;
struct drr_checksum *drrc = &thedrr.drr_u.drr_checksum;
char c;
int c;
boolean_t verbose = B_FALSE;
boolean_t very_verbose = B_FALSE;
boolean_t first = B_TRUE;
@@ -336,7 +338,9 @@ main(int argc, char *argv[])
}
drr_record_count[drr->drr_type]++;
total_overhead_size += sizeof (*drr);
total_records++;
payload_size = 0;
switch (drr->drr_type) {
case DRR_BEGIN:
@@ -390,6 +394,7 @@ main(int argc, char *argv[])
nvlist_print(stdout, nv);
nvlist_free(nv);
}
payload_size = sz;
}
break;
@@ -554,7 +559,6 @@ main(int argc, char *argv[])
if (dump) {
print_block(buf, payload_size);
}
total_write_size += payload_size;
break;
case DRR_WRITE_BYREF:
@@ -683,6 +687,7 @@ main(int argc, char *argv[])
print_block(buf,
P2ROUNDUP(drrwe->drr_psize, 8));
}
payload_size = P2ROUNDUP(drrwe->drr_psize, 8);
break;
case DRR_OBJECT_RANGE:
if (do_byteswap) {
@@ -723,6 +728,8 @@ main(int argc, char *argv[])
(longlong_t)drrc->drr_checksum.zc_word[3]);
}
pcksum = zc;
drr_byte_count[drr->drr_type] += payload_size;
total_payload_size += payload_size;
}
free(buf);
fletcher_4_fini();
@@ -730,28 +737,40 @@ main(int argc, char *argv[])
/* Print final summary */
(void) printf("SUMMARY:\n");
(void) printf("\tTotal DRR_BEGIN records = %lld\n",
(u_longlong_t)drr_record_count[DRR_BEGIN]);
(void) printf("\tTotal DRR_END records = %lld\n",
(u_longlong_t)drr_record_count[DRR_END]);
(void) printf("\tTotal DRR_OBJECT records = %lld\n",
(u_longlong_t)drr_record_count[DRR_OBJECT]);
(void) printf("\tTotal DRR_FREEOBJECTS records = %lld\n",
(u_longlong_t)drr_record_count[DRR_FREEOBJECTS]);
(void) printf("\tTotal DRR_WRITE records = %lld\n",
(u_longlong_t)drr_record_count[DRR_WRITE]);
(void) printf("\tTotal DRR_WRITE_BYREF records = %lld\n",
(u_longlong_t)drr_record_count[DRR_WRITE_BYREF]);
(void) printf("\tTotal DRR_WRITE_EMBEDDED records = %lld\n",
(u_longlong_t)drr_record_count[DRR_WRITE_EMBEDDED]);
(void) printf("\tTotal DRR_FREE records = %lld\n",
(u_longlong_t)drr_record_count[DRR_FREE]);
(void) printf("\tTotal DRR_SPILL records = %lld\n",
(u_longlong_t)drr_record_count[DRR_SPILL]);
(void) printf("\tTotal DRR_BEGIN records = %lld (%llu bytes)\n",
(u_longlong_t)drr_record_count[DRR_BEGIN],
(u_longlong_t)drr_byte_count[DRR_BEGIN]);
(void) printf("\tTotal DRR_END records = %lld (%llu bytes)\n",
(u_longlong_t)drr_record_count[DRR_END],
(u_longlong_t)drr_byte_count[DRR_END]);
(void) printf("\tTotal DRR_OBJECT records = %lld (%llu bytes)\n",
(u_longlong_t)drr_record_count[DRR_OBJECT],
(u_longlong_t)drr_byte_count[DRR_OBJECT]);
(void) printf("\tTotal DRR_FREEOBJECTS records = %lld (%llu bytes)\n",
(u_longlong_t)drr_record_count[DRR_FREEOBJECTS],
(u_longlong_t)drr_byte_count[DRR_FREEOBJECTS]);
(void) printf("\tTotal DRR_WRITE records = %lld (%llu bytes)\n",
(u_longlong_t)drr_record_count[DRR_WRITE],
(u_longlong_t)drr_byte_count[DRR_WRITE]);
(void) printf("\tTotal DRR_WRITE_BYREF records = %lld (%llu bytes)\n",
(u_longlong_t)drr_record_count[DRR_WRITE_BYREF],
(u_longlong_t)drr_byte_count[DRR_WRITE_BYREF]);
(void) printf("\tTotal DRR_WRITE_EMBEDDED records = %lld (%llu "
"bytes)\n", (u_longlong_t)drr_record_count[DRR_WRITE_EMBEDDED],
(u_longlong_t)drr_byte_count[DRR_WRITE_EMBEDDED]);
(void) printf("\tTotal DRR_FREE records = %lld (%llu bytes)\n",
(u_longlong_t)drr_record_count[DRR_FREE],
(u_longlong_t)drr_byte_count[DRR_FREE]);
(void) printf("\tTotal DRR_SPILL records = %lld (%llu bytes)\n",
(u_longlong_t)drr_record_count[DRR_SPILL],
(u_longlong_t)drr_byte_count[DRR_SPILL]);
(void) printf("\tTotal records = %lld\n",
(u_longlong_t)total_records);
(void) printf("\tTotal write size = %lld (0x%llx)\n",
(u_longlong_t)total_write_size, (u_longlong_t)total_write_size);
(void) printf("\tTotal payload size = %lld (0x%llx)\n",
(u_longlong_t)total_payload_size, (u_longlong_t)total_payload_size);
(void) printf("\tTotal header overhead = %lld (0x%llx)\n",
(u_longlong_t)total_overhead_size,
(u_longlong_t)total_overhead_size);
(void) printf("\tTotal stream length = %lld (0x%llx)\n",
(u_longlong_t)total_stream_len, (u_longlong_t)total_stream_len);
return (0);
+17 -1
View File
@@ -2745,8 +2745,24 @@ ztest_spa_create_destroy(ztest_ds_t *zd, uint64_t id)
VERIFY3U(EEXIST, ==,
spa_create(zo->zo_pool, nvroot, NULL, NULL, NULL));
nvlist_free(nvroot);
/*
* We open a reference to the spa and then we try to export it
* expecting one of the following errors:
*
* EBUSY
* Because of the reference we just opened.
*
* ZFS_ERR_EXPORT_IN_PROGRESS
* For the case that there is another ztest thread doing
* an export concurrently.
*/
VERIFY3U(0, ==, spa_open(zo->zo_pool, &spa, FTAG));
VERIFY3U(EBUSY, ==, spa_destroy(zo->zo_pool));
int error = spa_destroy(zo->zo_pool);
if (error != EBUSY && error != ZFS_ERR_EXPORT_IN_PROGRESS) {
fatal(0, "spa_destroy(%s) returned unexpected value %d",
spa->spa_name, error);
}
spa_close(spa, FTAG);
(void) pthread_rwlock_unlock(&ztest_name_lock);
+1
View File
@@ -0,0 +1 @@
dist_bin_SCRIPTS = zvol_wait
+112
View File
@@ -0,0 +1,112 @@
#!/bin/sh
count_zvols() {
if [ -z "$zvols" ]; then
echo 0
else
echo "$zvols" | wc -l
fi
}
filter_out_zvols_with_links() {
while read -r zvol; do
if [ ! -L "/dev/zvol/$zvol" ]; then
echo "$zvol"
fi
done
}
filter_out_deleted_zvols() {
while read -r zvol; do
if zfs list "$zvol" >/dev/null 2>&1; then
echo "$zvol"
fi
done
}
list_zvols() {
zfs list -t volume -H -o name,volmode,receive_resume_token |
while read -r zvol_line; do
name=$(echo "$zvol_line" | awk '{print $1}')
volmode=$(echo "$zvol_line" | awk '{print $2}')
token=$(echo "$zvol_line" | awk '{print $3}')
#
# /dev links are not created for zvols with volmode = "none".
#
[ "$volmode" = "none" ] && continue
#
# We also also ignore partially received zvols if it is
# not an incremental receive, as those won't even have a block
# device minor node created yet.
#
if [ "$token" != "-" ]; then
#
# Incremental receives create an invisible clone that
# is not automatically displayed by zfs list.
#
if ! zfs list "$name/%recv" >/dev/null 2>&1; then
continue
fi
fi
echo "$name"
done
}
zvols=$(list_zvols)
zvols_count=$(count_zvols)
if [ "$zvols_count" -eq 0 ]; then
echo "No zvols found, nothing to do."
exit 0
fi
echo "Testing $zvols_count zvol links"
outer_loop=0
while [ "$outer_loop" -lt 20 ]; do
outer_loop=$((outer_loop + 1))
old_zvols_count=$(count_zvols)
inner_loop=0
while [ "$inner_loop" -lt 30 ]; do
inner_loop=$((inner_loop + 1))
zvols="$(echo "$zvols" | filter_out_zvols_with_links)"
zvols_count=$(count_zvols)
if [ "$zvols_count" -eq 0 ]; then
echo "All zvol links are now present."
exit 0
fi
sleep 1
done
echo "Still waiting on $zvols_count zvol links ..."
#
# Although zvols should normally not be deleted at boot time,
# if that is the case then their links will be missing and
# we would stall.
#
if [ "$old_zvols_count" -eq "$zvols_count" ]; then
echo "No progress since last loop."
echo "Checking if any zvols were deleted."
zvols=$(echo "$zvols" | filter_out_deleted_zvols)
zvols_count=$(count_zvols)
if [ "$old_zvols_count" -ne "$zvols_count" ]; then
echo "$((old_zvols_count - zvols_count)) zvol(s) deleted."
fi
if [ "$zvols_count" -ne 0 ]; then
echo "Remaining zvols:"
echo "$zvols"
else
echo "All zvol links are now present."
exit 0
fi
fi
done
echo "Timed out waiting on zvol links"
exit 1
+25 -61
View File
@@ -1,36 +1,3 @@
dnl #
dnl # ZFS_AC_PYTHON_VERSION(version, [action-if-true], [action-if-false])
dnl #
dnl # Verify Python version
dnl #
AC_DEFUN([ZFS_AC_PYTHON_VERSION], [
ver_check=`$PYTHON -c "import sys; print (sys.version.split()[[0]] $1)"`
AS_IF([test "$ver_check" = "True"], [
m4_ifvaln([$2], [$2])
], [
m4_ifvaln([$3], [$3])
])
])
dnl #
dnl # ZFS_AC_PYTHON_MODULE(module_name, [action-if-true], [action-if-false])
dnl #
dnl # Checks for Python module. Freely inspired by AX_PYTHON_MODULE
dnl # https://www.gnu.org/software/autoconf-archive/ax_python_module.html
dnl # Required by ZFS_AC_CONFIG_ALWAYS_PYZFS.
dnl #
AC_DEFUN([ZFS_AC_PYTHON_MODULE], [
PYTHON_NAME=`basename $PYTHON`
AC_MSG_CHECKING([for $PYTHON_NAME module: $1])
AS_IF([$PYTHON -c "import $1" 2>/dev/null], [
AC_MSG_RESULT(yes)
m4_ifvaln([$2], [$2])
], [
AC_MSG_RESULT(no)
m4_ifvaln([$3], [$3])
])
])
dnl #
dnl # The majority of the python scripts are written to be compatible
dnl # with Python 2.6 and Python 3.4. Therefore, they may be installed
@@ -46,50 +13,47 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_PYTHON], [
[with_python=check])
AS_CASE([$with_python],
[check],
[AS_IF([test -x /usr/bin/python3],
[PYTHON="python3"],
[AS_IF([test -x /usr/bin/python2],
[PYTHON="python2"],
[PYTHON=""]
)]
)],
[check], [AC_CHECK_PROGS([PYTHON], [python3 python2], [:])],
[2*], [PYTHON="python${with_python}"],
[*python2*], [PYTHON="${with_python}"],
[3*], [PYTHON="python${with_python}"],
[*python3*], [PYTHON="${with_python}"],
[no], [PYTHON=""],
[no], [PYTHON=":"],
[AC_MSG_ERROR([Unknown --with-python value '$with_python'])]
)
AS_IF([$PYTHON --version >/dev/null 2>&1], [ /bin/true ], [
AC_MSG_ERROR([Cannot find $PYTHON in your system path])
])
AM_PATH_PYTHON([2.6], [], [:])
AM_CONDITIONAL([USING_PYTHON], [test "$PYTHON" != :])
AM_CONDITIONAL([USING_PYTHON_2], [test "${PYTHON_VERSION:0:2}" = "2."])
AM_CONDITIONAL([USING_PYTHON_3], [test "${PYTHON_VERSION:0:2}" = "3."])
dnl #
dnl # Minimum supported Python versions for utilities:
dnl # Python 2.6.x, or Python 3.4.x
dnl # Python 2.6 or Python 3.4
dnl #
AS_IF([test "${PYTHON_VERSION:0:2}" = "2."], [
ZFS_AC_PYTHON_VERSION([>= '2.6'], [ /bin/true ],
[AC_MSG_ERROR("Python >= 2.6.x is not available")])
AM_PATH_PYTHON([], [], [:])
AS_IF([test -z "$PYTHON_VERSION"], [
PYTHON_VERSION=$(basename $PYTHON | tr -cd 0-9.)
])
PYTHON_MINOR=${PYTHON_VERSION#*\.}
AS_IF([test "${PYTHON_VERSION:0:2}" = "3."], [
ZFS_AC_PYTHON_VERSION([>= '3.4'], [ /bin/true ],
[AC_MSG_ERROR("Python >= 3.4.x is not available")])
])
AS_CASE([$PYTHON_VERSION],
[2.*], [
AS_IF([test $PYTHON_MINOR -lt 6],
[AC_MSG_ERROR("Python >= 2.6 is required")])
],
[3.*], [
AS_IF([test $PYTHON_MINOR -lt 4],
[AC_MSG_ERROR("Python >= 3.4 is required")])
],
[:|2|3], [],
[PYTHON_VERSION=3]
)
AM_CONDITIONAL([USING_PYTHON], [test "$PYTHON" != :])
AM_CONDITIONAL([USING_PYTHON_2], [test "x${PYTHON_VERSION%%\.*}" = x2])
AM_CONDITIONAL([USING_PYTHON_3], [test "x${PYTHON_VERSION%%\.*}" = x3])
dnl #
dnl # Request that packages be built for a specific Python version.
dnl #
AS_IF([test $with_python != check], [
PYTHON_PKG_VERSION=`echo ${PYTHON} | tr -d 'a-zA-Z.'`
AS_IF([test "x$with_python" != xcheck], [
PYTHON_PKG_VERSION=$(echo $PYTHON_VERSION | tr -d .)
DEFINE_PYTHON_PKG_VERSION='--define "__use_python_pkg_version '${PYTHON_PKG_VERSION}'"'
DEFINE_PYTHON_VERSION='--define "__use_python '${PYTHON}'"'
], [
+35 -15
View File
@@ -1,5 +1,24 @@
dnl #
dnl # Determines if pyzfs can be built, requires Python 2.7 or latter.
dnl # ZFS_AC_PYTHON_MODULE(module_name, [action-if-true], [action-if-false])
dnl #
dnl # Checks for Python module. Freely inspired by AX_PYTHON_MODULE
dnl # https://www.gnu.org/software/autoconf-archive/ax_python_module.html
dnl # Required by ZFS_AC_CONFIG_ALWAYS_PYZFS.
dnl #
AC_DEFUN([ZFS_AC_PYTHON_MODULE], [
PYTHON_NAME=$(basename $PYTHON)
AC_MSG_CHECKING([for $PYTHON_NAME module: $1])
AS_IF([$PYTHON -c "import $1" 2>/dev/null], [
AC_MSG_RESULT(yes)
m4_ifvaln([$2], [$2])
], [
AC_MSG_RESULT(no)
m4_ifvaln([$3], [$3])
])
])
dnl #
dnl # Determines if pyzfs can be built, requires Python 2.7 or later.
dnl #
AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_PYZFS], [
AC_ARG_ENABLE([pyzfs],
@@ -18,7 +37,12 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_PYZFS], [
DEFINE_PYZFS='--without pyzfs'
])
], [
DEFINE_PYZFS=''
AS_IF([test "$PYTHON" != :], [
DEFINE_PYZFS=''
], [
enable_pyzfs=no
DEFINE_PYZFS='--without pyzfs'
])
])
AC_SUBST(DEFINE_PYZFS)
@@ -26,20 +50,16 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_PYZFS], [
dnl # Require python-devel libraries
dnl #
AS_IF([test "x$enable_pyzfs" = xcheck -o "x$enable_pyzfs" = xyes], [
AS_IF([test "${PYTHON_VERSION:0:2}" = "2."], [
PYTHON_REQUIRED_VERSION=">= '2.7.0'"
], [
AS_IF([test "${PYTHON_VERSION:0:2}" = "3."], [
PYTHON_REQUIRED_VERSION=">= '3.4.0'"
], [
AC_MSG_ERROR("Python $PYTHON_VERSION unknown")
])
])
AS_CASE([$PYTHON_VERSION],
[3.*], [PYTHON_REQUIRED_VERSION=">= '3.4.0'"],
[2.*], [PYTHON_REQUIRED_VERSION=">= '2.7.0'"],
[AC_MSG_ERROR("Python $PYTHON_VERSION unknown")]
)
AX_PYTHON_DEVEL([$PYTHON_REQUIRED_VERSION], [
AS_IF([test "x$enable_pyzfs" = xyes], [
AC_MSG_ERROR("Python $PYTHON_REQUIRED_VERSION development library is not installed")
], [test ! "x$enable_pyzfs" = xno], [
], [test "x$enable_pyzfs" != xno], [
enable_pyzfs=no
])
])
@@ -52,7 +72,7 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_PYZFS], [
ZFS_AC_PYTHON_MODULE([setuptools], [], [
AS_IF([test "x$enable_pyzfs" = xyes], [
AC_MSG_ERROR("Python $PYTHON_VERSION setuptools is not installed")
], [test ! "x$enable_pyzfs" = xno], [
], [test "x$enable_pyzfs" != xno], [
enable_pyzfs=no
])
])
@@ -65,7 +85,7 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_PYZFS], [
ZFS_AC_PYTHON_MODULE([cffi], [], [
AS_IF([test "x$enable_pyzfs" = xyes], [
AC_MSG_ERROR("Python $PYTHON_VERSION cffi is not installed")
], [test ! "x$enable_pyzfs" = xno], [
], [test "x$enable_pyzfs" != xno], [
enable_pyzfs=no
])
])
@@ -76,7 +96,7 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_PYZFS], [
dnl #
AS_IF([test "x$enable_pyzfs" = xcheck], [enable_pyzfs=yes])
AM_CONDITIONAL([PYZFS_ENABLED], [test x$enable_pyzfs = xyes])
AM_CONDITIONAL([PYZFS_ENABLED], [test "x$enable_pyzfs" = xyes])
AC_SUBST([PYZFS_ENABLED], [$enable_pyzfs])
AC_SUBST(pythonsitedir, [$PYTHON_SITE_PKG])
+4 -4
View File
@@ -20,7 +20,7 @@ deb-kmod: deb-local rpm-kmod
arch=`$(RPM) -qp $${name}-kmod-$${version}.src.rpm --qf %{arch} | tail -1`; \
debarch=`$(DPKG) --print-architecture`; \
pkg1=kmod-$${name}*$${version}.$${arch}.rpm; \
fakeroot $(ALIEN) --bump=0 --scripts --to-deb --target=$$debarch $$pkg1; \
fakeroot $(ALIEN) --bump=0 --scripts --to-deb --target=$$debarch $$pkg1 || exit 1; \
$(RM) $$pkg1
@@ -30,7 +30,7 @@ deb-dkms: deb-local rpm-dkms
arch=`$(RPM) -qp $${name}-dkms-$${version}.src.rpm --qf %{arch} | tail -1`; \
debarch=`$(DPKG) --print-architecture`; \
pkg1=$${name}-dkms-$${version}.$${arch}.rpm; \
fakeroot $(ALIEN) --bump=0 --scripts --to-deb --target=$$debarch $$pkg1; \
fakeroot $(ALIEN) --bump=0 --scripts --to-deb --target=$$debarch $$pkg1 || exit 1; \
$(RM) $$pkg1
deb-utils: deb-local rpm-utils
@@ -45,7 +45,7 @@ deb-utils: deb-local rpm-utils
pkg5=libzpool2-$${version}.$${arch}.rpm; \
pkg6=libzfs2-devel-$${version}.$${arch}.rpm; \
pkg7=$${name}-test-$${version}.$${arch}.rpm; \
pkg8=$${name}-dracut-$${version}.$${arch}.rpm; \
pkg8=$${name}-dracut-$${version}.noarch.rpm; \
pkg9=$${name}-initramfs-$${version}.$${arch}.rpm; \
pkg10=`ls python*-pyzfs-$${version}* | tail -1`; \
## Arguments need to be passed to dh_shlibdeps. Alien provides no mechanism
@@ -63,7 +63,7 @@ deb-utils: deb-local rpm-utils
env PATH=$${path_prepend}:$${PATH} \
fakeroot $(ALIEN) --bump=0 --scripts --to-deb --target=$$debarch \
$$pkg1 $$pkg2 $$pkg3 $$pkg4 $$pkg5 $$pkg6 $$pkg7 \
$$pkg8 $$pkg9 $$pkg10; \
$$pkg8 $$pkg9 $$pkg10 || exit 1; \
$(RM) $${path_prepend}/dh_shlibdeps; \
rmdir $${path_prepend}; \
$(RM) $$pkg1 $$pkg2 $$pkg3 $$pkg4 $$pkg5 $$pkg6 $$pkg7 \
+6 -3
View File
@@ -18,7 +18,8 @@ AC_DEFUN([ZFS_AC_KERNEL_FPU], [
#include <asm/fpu/api.h>
],[
],[
AC_DEFINE(HAVE_KERNEL_FPU_API_HEADER, 1, [kernel has asm/fpu/api.h])
AC_DEFINE(HAVE_KERNEL_FPU_API_HEADER, 1,
[kernel has asm/fpu/api.h])
AC_MSG_RESULT(asm/fpu/api.h)
],[
AC_MSG_RESULT(i387.h & xcr.h)
@@ -39,8 +40,10 @@ AC_DEFUN([ZFS_AC_KERNEL_FPU], [
kernel_fpu_end();
], [kernel_fpu_begin], [arch/x86/kernel/fpu/core.c], [
AC_MSG_RESULT(kernel_fpu_*)
AC_DEFINE(HAVE_KERNEL_FPU, 1, [kernel has kernel_fpu_* functions])
AC_DEFINE(KERNEL_EXPORTS_X86_FPU, 1, [kernel exports FPU functions])
AC_DEFINE(HAVE_KERNEL_FPU, 1,
[kernel has kernel_fpu_* functions])
AC_DEFINE(KERNEL_EXPORTS_X86_FPU, 1,
[kernel exports FPU functions])
],[
ZFS_LINUX_TRY_COMPILE_SYMBOL([
#include <linux/module.h>
+21
View File
@@ -0,0 +1,21 @@
dnl #
dnl # 2.6.39 API change
dnl #
dnl # 33ee3b2e2eb9 kstrto*: converting strings to integers done (hopefully) right
dnl #
dnl # If kstrtoul() doesn't exist, fallback to use strict_strtoul() which has
dnl # existed since 2.6.25.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_KSTRTOUL], [
AC_MSG_CHECKING([whether kstrtoul() exists])
ZFS_LINUX_TRY_COMPILE([
#include <linux/kernel.h>
],[
int ret __attribute__ ((unused)) = kstrtoul(NULL, 10, NULL);
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_KSTRTOUL, 1, [kstrtoul() exists])
],[
AC_MSG_RESULT(no)
])
])
-39
View File
@@ -1,39 +0,0 @@
dnl #
dnl # 3.9 API change
dnl # set_fs_pwd takes const struct path *
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SET_FS_PWD_WITH_CONST],
tmp_flags="$EXTRA_KCFLAGS"
EXTRA_KCFLAGS="-Werror"
[AC_MSG_CHECKING([whether set_fs_pwd() requires const struct path *])
ZFS_LINUX_TRY_COMPILE([
#include <linux/spinlock.h>
#include <linux/fs_struct.h>
#include <linux/path.h>
void (*const set_fs_pwd_func)
(struct fs_struct *, const struct path *)
= set_fs_pwd;
],[
return 0;
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_SET_FS_PWD_WITH_CONST, 1,
[set_fs_pwd() needs const path *])
],[
ZFS_LINUX_TRY_COMPILE([
#include <linux/spinlock.h>
#include <linux/fs_struct.h>
#include <linux/path.h>
void (*const set_fs_pwd_func)
(struct fs_struct *, struct path *)
= set_fs_pwd;
],[
return 0;
],[
AC_MSG_RESULT(no)
],[
AC_MSG_ERROR(unknown)
])
])
EXTRA_KCFLAGS="$tmp_flags"
])
+15 -7
View File
@@ -144,7 +144,9 @@ AC_DEFUN([ZFS_AC_KERNEL_SHRINKER_CALLBACK],[
ZFS_LINUX_TRY_COMPILE([
#include <linux/mm.h>
int shrinker_cb(int nr_to_scan, gfp_t gfp_mask);
int shrinker_cb(int nr_to_scan, gfp_t gfp_mask) {
return 0;
}
],[
struct shrinker cache_shrinker = {
.shrink = shrinker_cb,
@@ -166,8 +168,10 @@ AC_DEFUN([ZFS_AC_KERNEL_SHRINKER_CALLBACK],[
ZFS_LINUX_TRY_COMPILE([
#include <linux/mm.h>
int shrinker_cb(struct shrinker *, int nr_to_scan,
gfp_t gfp_mask);
int shrinker_cb(struct shrinker *shrink, int nr_to_scan,
gfp_t gfp_mask) {
return 0;
}
],[
struct shrinker cache_shrinker = {
.shrink = shrinker_cb,
@@ -190,8 +194,10 @@ AC_DEFUN([ZFS_AC_KERNEL_SHRINKER_CALLBACK],[
ZFS_LINUX_TRY_COMPILE([
#include <linux/mm.h>
int shrinker_cb(struct shrinker *,
struct shrink_control *sc);
int shrinker_cb(struct shrinker *shrink,
struct shrink_control *sc) {
return 0;
}
],[
struct shrinker cache_shrinker = {
.shrink = shrinker_cb,
@@ -215,8 +221,10 @@ AC_DEFUN([ZFS_AC_KERNEL_SHRINKER_CALLBACK],[
#include <linux/mm.h>
unsigned long shrinker_cb(
struct shrinker *,
struct shrink_control *sc);
struct shrinker *shrink,
struct shrink_control *sc) {
return 0;
}
],[
struct shrinker cache_shrinker = {
.count_objects = shrinker_cb,
-24
View File
@@ -1,24 +0,0 @@
dnl #
dnl # 2.6.36 API change,
dnl # The 'struct fs_struct->lock' was changed from a rwlock_t to
dnl # a spinlock_t to improve the fastpath performance.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_FS_STRUCT_SPINLOCK], [
AC_MSG_CHECKING([whether struct fs_struct uses spinlock_t])
tmp_flags="$EXTRA_KCFLAGS"
EXTRA_KCFLAGS="-Werror"
ZFS_LINUX_TRY_COMPILE([
#include <linux/sched.h>
#include <linux/fs_struct.h>
],[
static struct fs_struct fs;
spin_lock_init(&fs.lock);
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_FS_STRUCT_SPINLOCK, 1,
[struct fs_struct uses spinlock_t])
],[
AC_MSG_RESULT(no)
])
EXTRA_KCFLAGS="$tmp_flags"
])
+52 -11
View File
@@ -1,26 +1,51 @@
dnl # 4.14-rc3 API change
dnl # https://lwn.net/Articles/735887/
dnl #
dnl # 4.15 API change
dnl # https://lkml.org/lkml/2017/11/25/90
dnl # Check if timer_list.func get passed a timer_list or an unsigned long
dnl # (older kernels). Also sanity check the from_timer() and timer_setup()
dnl # macros are available as well, since they will be used in the same newer
dnl # kernels that support the new timer_list.func signature.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_TIMER_FUNCTION_TIMER_LIST], [
AC_MSG_CHECKING([whether timer_list.function gets a timer_list])
dnl # Also check for the existance of flags in struct timer_list, they were
dnl # added in 4.1-rc8 via 0eeda71bc30d.
AC_DEFUN([ZFS_AC_KERNEL_TIMER_SETUP], [
AC_MSG_CHECKING([whether timer_setup() is available])
tmp_flags="$EXTRA_KCFLAGS"
EXTRA_KCFLAGS="-Werror"
ZFS_LINUX_TRY_COMPILE([
#include <linux/timer.h>
struct my_task_timer {
struct timer_list timer;
int data;
};
void task_expire(struct timer_list *tl)
{
struct my_task_timer *task_timer = from_timer(task_timer, tl, timer);
task_timer->data = 42;
}
],[
struct my_task_timer task_timer;
timer_setup(&task_timer.timer, task_expire, 0);
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_KERNEL_TIMER_SETUP, 1,
[timer_setup() is available])
],[
AC_MSG_RESULT(no)
])
AC_MSG_CHECKING([whether timer function expects timer_list])
ZFS_LINUX_TRY_COMPILE([
#include <linux/timer.h>
void task_expire(struct timer_list *tl) {}
],[
#ifndef from_timer
#error "No from_timer() macro"
#endif
struct timer_list timer;
timer.function = task_expire;
timer_setup(&timer, NULL, 0);
struct timer_list tl;
tl.function = task_expire;
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_KERNEL_TIMER_FUNCTION_TIMER_LIST, 1,
@@ -28,5 +53,21 @@ AC_DEFUN([ZFS_AC_KERNEL_TIMER_FUNCTION_TIMER_LIST], [
],[
AC_MSG_RESULT(no)
])
AC_MSG_CHECKING([whether struct timer_list has flags])
ZFS_LINUX_TRY_COMPILE([
#include <linux/timer.h>
],[
struct timer_list tl;
tl.flags = 2;
],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_KERNEL_TIMER_LIST_FLAGS, 1,
[struct timer_list has a flags member])
],[
AC_MSG_RESULT(no)
])
EXTRA_KCFLAGS="$tmp_flags"
])
+5 -5
View File
@@ -11,9 +11,7 @@ AC_DEFUN([ZFS_AC_CONFIG_KERNEL], [
ZFS_AC_KERNEL_CONFIG
ZFS_AC_KERNEL_CTL_NAME
ZFS_AC_KERNEL_PDE_DATA
ZFS_AC_KERNEL_SET_FS_PWD_WITH_CONST
ZFS_AC_KERNEL_2ARGS_VFS_FSYNC
ZFS_AC_KERNEL_FS_STRUCT_SPINLOCK
ZFS_AC_KERNEL_KUIDGID_T
ZFS_AC_KERNEL_FALLOCATE
ZFS_AC_KERNEL_2ARGS_ZLIB_DEFLATE_WORKSPACESIZE
@@ -37,7 +35,7 @@ AC_DEFUN([ZFS_AC_CONFIG_KERNEL], [
ZFS_AC_KERNEL_GROUP_INFO_GID
ZFS_AC_KERNEL_WRITE
ZFS_AC_KERNEL_READ
ZFS_AC_KERNEL_TIMER_FUNCTION_TIMER_LIST
ZFS_AC_KERNEL_TIMER_SETUP
ZFS_AC_KERNEL_DECLARE_EVENT_CLASS
ZFS_AC_KERNEL_CURRENT_BIO_TAIL
ZFS_AC_KERNEL_SUPER_USER_NS
@@ -167,6 +165,7 @@ AC_DEFUN([ZFS_AC_CONFIG_KERNEL], [
ZFS_AC_KERNEL_TOTALHIGH_PAGES
ZFS_AC_KERNEL_BLK_QUEUE_DISCARD
ZFS_AC_KERNEL_BLK_QUEUE_SECURE_ERASE
ZFS_AC_KERNEL_KSTRTOUL
AS_IF([test "$LINUX_OBJ" != "$LINUX"], [
KERNEL_MAKE="$KERNEL_MAKE O=$LINUX_OBJ"
@@ -529,10 +528,11 @@ AC_DEFUN([ZFS_AC_KERNEL_CONFIG_TRIM_UNUSED_KSYMS], [
AC_MSG_RESULT([yes])
],[
AC_MSG_RESULT([no])
AC_MSG_ERROR([
AS_IF([test "x$enable_linux_builtin" != xyes], [
AC_MSG_ERROR([
*** This kernel has unused symbols trimming enabled, please disable.
*** Rebuild the kernel with CONFIG_TRIM_UNUSED_KSYMS=n set.])
])
])])
])
dnl #
+4
View File
@@ -120,8 +120,10 @@ AC_CONFIG_FILES([
cmd/dbufstat/Makefile
cmd/arc_summary/Makefile
cmd/zed/Makefile
cmd/zed/zed.d/Makefile
cmd/raidz_test/Makefile
cmd/zgenhostid/Makefile
cmd/zvol_wait/Makefile
contrib/Makefile
contrib/bash_completion.d/Makefile
contrib/dracut/Makefile
@@ -271,6 +273,7 @@ AC_CONFIG_FILES([
tests/zfs-tests/tests/functional/cli_user/zfs_list/Makefile
tests/zfs-tests/tests/functional/cli_user/zpool_iostat/Makefile
tests/zfs-tests/tests/functional/cli_user/zpool_list/Makefile
tests/zfs-tests/tests/functional/cli_user/zpool_status/Makefile
tests/zfs-tests/tests/functional/compression/Makefile
tests/zfs-tests/tests/functional/cp_files/Makefile
tests/zfs-tests/tests/functional/ctime/Makefile
@@ -326,6 +329,7 @@ AC_CONFIG_FILES([
tests/zfs-tests/tests/functional/snapshot/Makefile
tests/zfs-tests/tests/functional/snapused/Makefile
tests/zfs-tests/tests/functional/sparse/Makefile
tests/zfs-tests/tests/functional/suid/Makefile
tests/zfs-tests/tests/functional/alloc_class/Makefile
tests/zfs-tests/tests/functional/threadsappend/Makefile
tests/zfs-tests/tests/functional/tmpfile/Makefile
+1 -1
View File
@@ -144,7 +144,7 @@ ask_for_password() {
{ flock -s 9;
# Prompt for password with plymouth, if installed and running.
if whereis plymouth >/dev/null 2>&1 && plymouth --ping 2>/dev/null; then
if type plymouth >/dev/null 2>&1 && plymouth --ping 2>/dev/null; then
plymouth ask-for-password \
--prompt "$ply_prompt" --number-of-tries="$ply_tries" \
--command="$ply_cmd"
+13 -8
View File
@@ -11,13 +11,18 @@ EXTRA_DIST = \
$(top_srcdir)/contrib/initramfs/README.initramfs.markdown
install-initrdSCRIPTS: $(EXTRA_DIST)
for d in conf.d conf-hooks.d hooks scripts scripts/local-top; do \
$(MKDIR_P) $(DESTDIR)$(initrddir)/$$d; \
cp $(top_srcdir)/contrib/initramfs/$$d/zfs \
$(DESTDIR)$(initrddir)/$$d/; \
for d in conf.d conf-hooks.d scripts/local-top; do \
$(MKDIR_P) $(DESTDIR)$(initrddir)/$$d; \
cp $(top_srcdir)/contrib/initramfs/$$d/zfs \
$(DESTDIR)$(initrddir)/$$d/; \
done
if [ -f etc/init.d/zfs ]; then \
$(MKDIR_P) $(DESTDIR)$(DEFAULT_INITCONF_DIR); \
cp $(top_srcdir)/etc/init.d/zfs \
$(DESTDIR)$(DEFAULT_INITCONF_DIR)/; \
for d in hooks scripts; do \
$(MKDIR_P) $(DESTDIR)$(initrddir)/$$d; \
cp $(top_builddir)/contrib/initramfs/$$d/zfs \
$(DESTDIR)$(initrddir)/$$d/; \
done
if [ -f $(top_builddir)/etc/init.d/zfs ]; then \
$(MKDIR_P) $(DESTDIR)$(DEFAULT_INITCONF_DIR); \
cp $(top_builddir)/etc/init.d/zfs \
$(DESTDIR)$(DEFAULT_INITCONF_DIR)/; \
fi
+11 -9
View File
@@ -411,29 +411,29 @@ decrypt_fs()
# Determine dataset that holds key for root dataset
ENCRYPTIONROOT=$(${ZFS} get -H -o value encryptionroot "${fs}")
DECRYPT_CMD="${ZFS} load-key '${ENCRYPTIONROOT}'"
# If root dataset is encrypted...
if ! [ "${ENCRYPTIONROOT}" = "-" ]; then
TRY_COUNT=3
# Prompt with plymouth, if active
if [ -e /bin/plymouth ] && /bin/plymouth --ping 2>/dev/null; then
plymouth ask-for-password --prompt "Encrypted ZFS password for ${ENCRYPTIONROOT}" \
--number-of-tries="3" \
--command="${DECRYPT_CMD}"
while [ $TRY_COUNT -gt 0 ]; do
plymouth ask-for-password --prompt "Encrypted ZFS password for ${ENCRYPTIONROOT}" | \
$ZFS load-key "${ENCRYPTIONROOT}" && break
TRY_COUNT=$((TRY_COUNT - 1))
done
# Prompt with systemd, if active
elif [ -e /run/systemd/system ]; then
TRY_COUNT=3
while [ $TRY_COUNT -gt 0 ]; do
systemd-ask-password "Encrypted ZFS password for ${ENCRYPTIONROOT}" --no-tty | \
${DECRYPT_CMD} && break
$ZFS load-key "${ENCRYPTIONROOT}" && break
TRY_COUNT=$((TRY_COUNT - 1))
done
# Prompt with ZFS tty, otherwise
else
eval "${DECRYPT_CMD}"
$ZFS load-key "${ENCRYPTIONROOT}"
fi
fi
fi
@@ -878,7 +878,9 @@ mountroot()
pool="$("${ZPOOL}" get name,guid -o name,value -H | \
awk -v pool="${ZFS_RPOOL}" '$2 == pool { print $1 }')"
if [ -n "$pool" ]; then
ZFS_BOOTFS="${pool}/${ZFS_BOOTFS#*/}"
# If $ZFS_BOOTFS contains guid, replace the guid portion with $pool
ZFS_BOOTFS=$(echo "$ZFS_BOOTFS" | \
sed -e "s/$("${ZPOOL}" get guid -o value "$pool" -H)/$pool/g")
ZFS_RPOOL="${pool}"
fi
+1 -1
View File
@@ -24,7 +24,7 @@ all-local:
# files are later created by manually loading the Python modules.
#
install-exec-local:
$(PYTHON) $(srcdir)/setup.py install \
$(PYTHON) $(builddir)/setup.py install \
--prefix $(prefix) \
--root $(DESTDIR)/ \
--install-lib $(pythonsitedir) \
-7
View File
@@ -294,13 +294,6 @@ checksystem()
# Just make sure that /dev/zfs is created.
udev_trigger
if ! [ "$(uname -m)" = "x86_64" ]; then
echo "Warning: You're not running 64bit. Currently native zfs in";
echo " Linux is only supported and tested on 64bit.";
# should we break here? People doing this should know what they
# do, thus i'm not breaking here.
fi
return 0
}
@@ -71,6 +71,8 @@ process_line() {
p_readonly="${8}"
p_setuid="${9}"
p_nbmand="${10}"
p_encroot="${11}"
p_keyloc="${12}"
# Check for canmount=off .
if [ "${p_canmount}" = "off" ] ; then
@@ -168,6 +170,54 @@ process_line() {
"${dataset}" >/dev/kmsg
fi
# Minimal pre-requisites to mount a ZFS dataset
wants="zfs-import.target"
if [ -n "${p_encroot}" ] &&
[ "${p_encroot}" != "-" ] ; then
keyloadunit="zfs-load-key-$(systemd-escape "${p_encroot}").service"
if [ "${p_encroot}" = "${dataset}" ] ; then
pathdep=""
if [ "${p_keyloc%%://*}" = "file" ] ; then
pathdep="RequiresMountsFor='${p_keyloc#file://}'"
keyloadcmd="@sbindir@/zfs load-key '${dataset}'"
elif [ "${p_keyloc}" = "prompt" ] ; then
keyloadcmd="sh -c 'set -eu;"\
"count=0;"\
"while [ \$\$count -lt 3 ];do"\
" systemd-ask-password --id=\"zfs:${dataset}\""\
" \"Enter passphrase for ${dataset}:\"|"\
" @sbindir@/zfs load-key \"${dataset}\" && exit 0;"\
" count=\$\$((count + 1));"\
"done;"\
"exit 1'"
else
printf 'zfs-mount-generator: (%s) invalid keylocation\n' \
"${dataset}" >/dev/kmsg
fi
cat > "${dest_norm}/${keyloadunit}" << EOF
# Automatically generated by zfs-mount-generator
[Unit]
Description=Load ZFS key for ${dataset}
SourcePath=${cachefile}
Documentation=man:zfs-mount-generator(8)
DefaultDependencies=no
Wants=${wants}
After=${wants}
${pathdep}
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=${keyloadcmd}
ExecStop=@sbindir@/zfs unload-key '${dataset}'
EOF
fi
# Update the dependencies for the mount file to require the
# key-loading unit.
wants="${wants} ${keyloadunit}"
fi
# If the mountpoint has already been created, give it precedence.
if [ -e "${dest_norm}/${mountfile}" ] ; then
printf 'zfs-mount-generator: %s already exists\n' "${mountfile}" \
@@ -183,8 +233,8 @@ process_line() {
SourcePath=${cachefile}
Documentation=man:zfs-mount-generator(8)
Before=local-fs.target zfs-mount.service
After=zfs-import.target
Wants=zfs-import.target
After=${wants}
Wants=${wants}
[Mount]
Where=${p_mountpoint}
+1
View File
@@ -5,4 +5,5 @@ enable zfs-import.target
enable zfs-mount.service
enable zfs-share.service
enable zfs-zed.service
enable zfs-volume-wait.service
enable zfs.target
+4
View File
@@ -7,7 +7,9 @@ systemdunit_DATA = \
zfs-import-scan.service \
zfs-mount.service \
zfs-share.service \
zfs-volume-wait.service \
zfs-import.target \
zfs-volumes.target \
zfs.target
EXTRA_DIST = \
@@ -17,6 +19,8 @@ EXTRA_DIST = \
$(top_srcdir)/etc/systemd/system/zfs-mount.service.in \
$(top_srcdir)/etc/systemd/system/zfs-share.service.in \
$(top_srcdir)/etc/systemd/system/zfs-import.target.in \
$(top_srcdir)/etc/systemd/system/zfs-volume-wait.service.in \
$(top_srcdir)/etc/systemd/system/zfs-volumes.target.in \
$(top_srcdir)/etc/systemd/system/zfs.target.in \
$(top_srcdir)/etc/systemd/system/50-zfs.preset.in
+1
View File
@@ -5,6 +5,7 @@ After=nfs-server.service nfs-kernel-server.service
After=smb.service
Before=rpc-statd-notify.service
Wants=zfs-mount.service
After=zfs-mount.service
PartOf=nfs-server.service nfs-kernel-server.service
PartOf=smb.service
@@ -0,0 +1,13 @@
[Unit]
Description=Wait for ZFS Volume (zvol) links in /dev
DefaultDependencies=no
After=systemd-udev-settle.service
After=zfs-import.target
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=@bindir@/zvol_wait
[Install]
WantedBy=zfs-volumes.target
+7
View File
@@ -0,0 +1,7 @@
[Unit]
Description=ZFS volumes are ready
After=zfs-volume-wait.service
Requires=zfs-volume-wait.service
[Install]
WantedBy=zfs.target
+1
View File
@@ -147,6 +147,7 @@ typedef enum zfs_error {
EZFS_NO_TRIM, /* no active trim */
EZFS_TRIM_NOTSUP, /* device does not support trim */
EZFS_NO_RESILVER_DEFER, /* pool doesn't support resilver_defer */
EZFS_EXPORT_IN_PROGRESS, /* currently exporting the pool */
EZFS_UNKNOWN
} zfs_error_t;
+4 -2
View File
@@ -26,8 +26,10 @@
* USER API:
*
* Kernel fpu methods:
* kfpu_begin()
* kfpu_end()
* kfpu_allowed()
* kfpu_initialize()
* kfpu_begin()
* kfpu_end()
*/
#ifndef _SIMD_AARCH64_H
+25 -23
View File
@@ -26,8 +26,10 @@
* USER API:
*
* Kernel fpu methods:
* kfpu_begin()
* kfpu_end()
* kfpu_allowed()
* kfpu_initialize()
* kfpu_begin()
* kfpu_end()
*
* SIMD support:
*
@@ -37,31 +39,31 @@
* all relevant feature test functions should be called.
*
* Supported features:
* zfs_sse_available()
* zfs_sse2_available()
* zfs_sse3_available()
* zfs_ssse3_available()
* zfs_sse4_1_available()
* zfs_sse4_2_available()
* zfs_sse_available()
* zfs_sse2_available()
* zfs_sse3_available()
* zfs_ssse3_available()
* zfs_sse4_1_available()
* zfs_sse4_2_available()
*
* zfs_avx_available()
* zfs_avx2_available()
* zfs_avx_available()
* zfs_avx2_available()
*
* zfs_bmi1_available()
* zfs_bmi2_available()
* zfs_bmi1_available()
* zfs_bmi2_available()
*
* zfs_avx512f_available()
* zfs_avx512cd_available()
* zfs_avx512er_available()
* zfs_avx512pf_available()
* zfs_avx512bw_available()
* zfs_avx512dq_available()
* zfs_avx512vl_available()
* zfs_avx512ifma_available()
* zfs_avx512vbmi_available()
* zfs_avx512f_available()
* zfs_avx512cd_available()
* zfs_avx512er_available()
* zfs_avx512pf_available()
* zfs_avx512bw_available()
* zfs_avx512dq_available()
* zfs_avx512vl_available()
* zfs_avx512ifma_available()
* zfs_avx512vbmi_available()
*
* NOTE(AVX-512VL): If using AVX-512 instructions with 128Bit registers
* also add zfs_avx512vl_available() to feature check.
* also add zfs_avx512vl_available() to feature check.
*/
#ifndef _SIMD_X86_H
@@ -190,7 +192,7 @@ typedef struct cpuid_feature_desc {
* Descriptions of supported instruction sets
*/
static const cpuid_feature_desc_t cpuid_features[] = {
[SSE] = {1U, 0U, 1U << 25, EDX },
[SSE] = {1U, 0U, 1U << 25, EDX },
[SSE2] = {1U, 0U, 1U << 26, EDX },
[SSE3] = {1U, 0U, 1U << 0, ECX },
[SSSE3] = {1U, 0U, 1U << 9, ECX },
+1 -1
View File
@@ -102,7 +102,7 @@ void spl_dumpstack(void);
if (!(_verify3_left OP _verify3_right)) \
spl_panic(__FILE__, __FUNCTION__, __LINE__, \
"VERIFY3(" #LEFT " " #OP " " #RIGHT ") " \
"failed (%px" #OP " %px)\n", \
"failed (%px " #OP " %px)\n", \
(void *) (_verify3_left), \
(void *) (_verify3_right)); \
} while (0)
+2 -3
View File
@@ -127,6 +127,8 @@ spl_mutex_lockdep_on_maybe(kmutex_t *mp) \
})
/* END CSTYLED */
#define NESTED_SINGLE 1
#ifdef CONFIG_DEBUG_LOCK_ALLOC
#define mutex_enter_nested(mp, subclass) \
{ \
@@ -179,7 +181,4 @@ spl_mutex_lockdep_on_maybe(kmutex_t *mp) \
/* NOTE: do not dereference mp after this point */ \
}
int spl_mutex_init(void);
void spl_mutex_fini(void);
#endif /* _SPL_MUTEX_H */
+12 -120
View File
@@ -29,43 +29,6 @@
#include <linux/rwsem.h>
#include <linux/sched.h>
/* Linux kernel compatibility */
#if defined(CONFIG_PREEMPT_RT_FULL)
#define SPL_RWSEM_SINGLE_READER_VALUE (1)
#define SPL_RWSEM_SINGLE_WRITER_VALUE (0)
#elif defined(CONFIG_RWSEM_GENERIC_SPINLOCK)
#define SPL_RWSEM_SINGLE_READER_VALUE (1)
#define SPL_RWSEM_SINGLE_WRITER_VALUE (-1)
#elif defined(RWSEM_ACTIVE_MASK)
#define SPL_RWSEM_SINGLE_READER_VALUE (RWSEM_ACTIVE_READ_BIAS)
#define SPL_RWSEM_SINGLE_WRITER_VALUE (RWSEM_ACTIVE_WRITE_BIAS)
#endif
/* Linux 3.16 changed activity to count for rwsem-spinlock */
#if defined(CONFIG_PREEMPT_RT_FULL)
#define RWSEM_COUNT(sem) sem->read_depth
#elif defined(HAVE_RWSEM_ACTIVITY)
#define RWSEM_COUNT(sem) sem->activity
/* Linux 4.8 changed count to an atomic_long_t for !rwsem-spinlock */
#elif defined(HAVE_RWSEM_ATOMIC_LONG_COUNT)
#define RWSEM_COUNT(sem) atomic_long_read(&(sem)->count)
#else
#define RWSEM_COUNT(sem) sem->count
#endif
#if defined(RWSEM_SPINLOCK_IS_RAW)
#define spl_rwsem_lock_irqsave(lk, fl) raw_spin_lock_irqsave(lk, fl)
#define spl_rwsem_unlock_irqrestore(lk, fl) \
raw_spin_unlock_irqrestore(lk, fl)
#define spl_rwsem_trylock_irqsave(lk, fl) raw_spin_trylock_irqsave(lk, fl)
#else
#define spl_rwsem_lock_irqsave(lk, fl) spin_lock_irqsave(lk, fl)
#define spl_rwsem_unlock_irqrestore(lk, fl) spin_unlock_irqrestore(lk, fl)
#define spl_rwsem_trylock_irqsave(lk, fl) spin_trylock_irqsave(lk, fl)
#endif /* RWSEM_SPINLOCK_IS_RAW */
#define spl_rwsem_is_locked(rwsem) rwsem_is_locked(rwsem)
typedef enum {
RW_DRIVER = 2,
RW_DEFAULT = 4,
@@ -78,15 +41,9 @@ typedef enum {
RW_READER = 2
} krw_t;
/*
* If CONFIG_RWSEM_SPIN_ON_OWNER is defined, rw_semaphore will have an owner
* field, so we don't need our own.
*/
typedef struct {
struct rw_semaphore rw_rwlock;
#ifndef CONFIG_RWSEM_SPIN_ON_OWNER
kthread_t *rw_owner;
#endif
#ifdef CONFIG_LOCKDEP
krw_type_t rw_type;
#endif /* CONFIG_LOCKDEP */
@@ -97,31 +54,19 @@ typedef struct {
static inline void
spl_rw_set_owner(krwlock_t *rwp)
{
/*
* If CONFIG_RWSEM_SPIN_ON_OWNER is defined, down_write, up_write,
* downgrade_write and __init_rwsem will set/clear owner for us.
*/
#ifndef CONFIG_RWSEM_SPIN_ON_OWNER
rwp->rw_owner = current;
#endif
}
static inline void
spl_rw_clear_owner(krwlock_t *rwp)
{
#ifndef CONFIG_RWSEM_SPIN_ON_OWNER
rwp->rw_owner = NULL;
#endif
}
static inline kthread_t *
rw_owner(krwlock_t *rwp)
{
#ifdef CONFIG_RWSEM_SPIN_ON_OWNER
return (SEM(rwp)->owner);
#else
return (rwp->rw_owner);
#endif
}
#ifdef CONFIG_LOCKDEP
@@ -148,6 +93,11 @@ spl_rw_lockdep_on_maybe(krwlock_t *rwp) \
#define spl_rw_lockdep_on_maybe(rwp)
#endif /* CONFIG_LOCKDEP */
static inline int
RW_LOCK_HELD(krwlock_t *rwp)
{
return (rwsem_is_locked(SEM(rwp)));
}
static inline int
RW_WRITE_HELD(krwlock_t *rwp)
@@ -155,55 +105,10 @@ RW_WRITE_HELD(krwlock_t *rwp)
return (rw_owner(rwp) == current);
}
static inline int
RW_LOCK_HELD(krwlock_t *rwp)
{
return (spl_rwsem_is_locked(SEM(rwp)));
}
static inline int
RW_READ_HELD(krwlock_t *rwp)
{
if (!RW_LOCK_HELD(rwp))
return (0);
/*
* rw_semaphore cheat sheet:
*
* < 3.16:
* There's no rw_semaphore.owner, so use rwp.owner instead.
* If rwp.owner == NULL then it's a reader
*
* 3.16 - 4.7:
* rw_semaphore.owner added (https://lwn.net/Articles/596656/)
* and CONFIG_RWSEM_SPIN_ON_OWNER introduced.
* If rw_semaphore.owner == NULL then it's a reader
*
* 4.8 - 4.16.16:
* RWSEM_READER_OWNED added as an internal #define.
* (https://lore.kernel.org/patchwork/patch/678590/)
* If rw_semaphore.owner == 1 then it's a reader
*
* 4.16.17 - 4.19:
* RWSEM_OWNER_UNKNOWN introduced as ((struct task_struct *)-1L)
* (https://do-db2.lkml.org/lkml/2018/5/15/985)
* If rw_semaphore.owner == 1 then it's a reader.
*
* 4.20+:
* RWSEM_OWNER_UNKNOWN changed to ((struct task_struct *)-2L)
* (https://lkml.org/lkml/2018/9/6/986)
* If rw_semaphore.owner & 1 then it's a reader, and also the reader's
* task_struct may be embedded in rw_semaphore->owner.
*/
#if defined(CONFIG_RWSEM_SPIN_ON_OWNER) && defined(RWSEM_OWNER_UNKNOWN)
if (RWSEM_OWNER_UNKNOWN == (struct task_struct *)-2L) {
/* 4.20+ kernels with CONFIG_RWSEM_SPIN_ON_OWNER */
return ((unsigned long) SEM(rwp)->owner & 1);
}
#endif
/* < 4.20 kernel or !CONFIG_RWSEM_SPIN_ON_OWNER */
return (rw_owner(rwp) == NULL || (unsigned long) rw_owner(rwp) == 1);
return (RW_LOCK_HELD(rwp) && rw_owner(rwp) == NULL);
}
/*
@@ -228,6 +133,12 @@ RW_READ_HELD(krwlock_t *rwp)
*/
#define rw_destroy(rwp) ((void) 0)
/*
* Upgrading a rwsem from a reader to a writer is not supported by the
* Linux kernel. The lock must be dropped and reacquired as a writer.
*/
#define rw_tryupgrade(rwp) RW_WRITE_HELD(rwp)
#define rw_tryenter(rwp, rw) \
({ \
int _rc_ = 0; \
@@ -285,25 +196,6 @@ RW_READ_HELD(krwlock_t *rwp)
downgrade_write(SEM(rwp)); \
spl_rw_lockdep_on_maybe(rwp); \
})
#define rw_tryupgrade(rwp) \
({ \
int _rc_ = 0; \
\
if (RW_WRITE_HELD(rwp)) { \
_rc_ = 1; \
} else { \
spl_rw_lockdep_off_maybe(rwp); \
if ((_rc_ = rwsem_tryupgrade(SEM(rwp)))) \
spl_rw_set_owner(rwp); \
spl_rw_lockdep_on_maybe(rwp); \
} \
_rc_; \
})
/* END CSTYLED */
int spl_rw_init(void);
void spl_rw_fini(void);
int rwsem_tryupgrade(struct rw_semaphore *rwsem);
#endif /* _SPL_RWLOCK_H */
+4
View File
@@ -28,4 +28,8 @@
#define bcopy(src, dest, size) memmove(dest, src, size)
#define bcmp(src, dest, size) memcmp((src), (dest), (size_t)(size))
#ifndef HAVE_KSTRTOUL
#define kstrtoul strict_strtoul
#endif
#endif /* _SPL_SYS_STRINGS_H */
+25
View File
@@ -72,4 +72,29 @@ usleep_range(unsigned long min, unsigned long max)
#define USEC_TO_TICK(us) usecs_to_jiffies(us)
#define NSEC_TO_TICK(ns) usecs_to_jiffies(ns / NSEC_PER_USEC)
#ifndef from_timer
#define from_timer(var, timer, timer_field) \
container_of(timer, typeof(*var), timer_field)
#endif
#ifdef HAVE_KERNEL_TIMER_FUNCTION_TIMER_LIST
typedef struct timer_list *spl_timer_list_t;
#else
typedef unsigned long spl_timer_list_t;
#endif
#ifndef HAVE_KERNEL_TIMER_SETUP
static inline void
timer_setup(struct timer_list *timer, void (*func)(spl_timer_list_t), u32 fl)
{
#ifdef HAVE_KERNEL_TIMER_LIST_FLAGS
(timer)->flags = fl;
#endif
init_timer(timer);
setup_timer(timer, func, (spl_timer_list_t)(timer));
}
#endif /* HAVE_KERNEL_TIMER_SETUP */
#endif /* _SPL_TIMER_H */
-1
View File
@@ -182,7 +182,6 @@ extern int vn_space(vnode_t *vp, int cmd, struct flock *bfp, int flag,
extern file_t *vn_getf(int fd);
extern void vn_releasef(int fd);
extern void vn_areleasef(int fd, uf_info_t *fip);
extern int vn_set_pwd(const char *filename);
int spl_vn_init(void);
void spl_vn_fini(void);
+2 -5
View File
@@ -46,6 +46,7 @@ extern "C" {
*/
#define DNODE_MUST_BE_ALLOCATED 1
#define DNODE_MUST_BE_FREE 2
#define DNODE_DRY_RUN 4
/*
* dnode_next_offset() flags.
@@ -415,6 +416,7 @@ int dnode_hold_impl(struct objset *dd, uint64_t object, int flag, int dn_slots,
boolean_t dnode_add_ref(dnode_t *dn, void *ref);
void dnode_rele(dnode_t *dn, void *ref);
void dnode_rele_and_unlock(dnode_t *dn, void *tag, boolean_t evicting);
int dnode_try_claim(objset_t *os, uint64_t object, int slots);
void dnode_setdirty(dnode_t *dn, dmu_tx_t *tx);
void dnode_sync(dnode_t *dn, dmu_tx_t *tx);
void dnode_allocate(dnode_t *dn, dmu_object_type_t ot, int blocksize, int ibs,
@@ -532,11 +534,6 @@ typedef struct dnode_stats {
* a range of dnode slots which would overflow the dnode_phys_t.
*/
kstat_named_t dnode_hold_free_overflow;
/*
* Number of times a dnode_hold(...) was attempted on a dnode
* which had already been unlinked in an earlier txg.
*/
kstat_named_t dnode_hold_free_txg;
/*
* Number of times dnode_free_interior_slots() needed to retry
* acquiring a slot zrl lock due to contention.
+3 -1
View File
@@ -37,9 +37,11 @@ typedef struct zfs_bookmark_phys {
uint64_t zbm_creation_txg; /* birth transaction group */
uint64_t zbm_creation_time; /* bookmark creation time */
/* the following fields are reserved for redacted send / recv */
/* fields used for redacted send / recv */
uint64_t zbm_redaction_obj; /* redaction list object */
uint64_t zbm_flags; /* ZBM_FLAG_* */
/* fields used for bookmark written size */
uint64_t zbm_referenced_bytes_refd;
uint64_t zbm_compressed_bytes_refd;
uint64_t zbm_uncompressed_bytes_refd;
-1
View File
@@ -209,7 +209,6 @@ void dsl_dataset_create_crypt_sync(uint64_t dsobj, dsl_dir_t *dd,
struct dsl_dataset *origin, dsl_crypto_params_t *dcp, dmu_tx_t *tx);
uint64_t dsl_crypto_key_create_sync(uint64_t crypt, dsl_wrapping_key_t *wkey,
dmu_tx_t *tx);
int dmu_objset_clone_crypt_check(dsl_dir_t *parentdd, dsl_dir_t *origindd);
uint64_t dsl_crypto_key_clone_sync(dsl_dir_t *origindd, dmu_tx_t *tx);
void dsl_crypto_key_destroy_sync(uint64_t dckobj, dmu_tx_t *tx);
+1
View File
@@ -1318,6 +1318,7 @@ typedef enum {
ZFS_ERR_FROM_IVSET_GUID_MISSING,
ZFS_ERR_FROM_IVSET_GUID_MISMATCH,
ZFS_ERR_SPILL_BLOCK_FLAG_MISSING,
ZFS_ERR_EXPORT_IN_PROGRESS,
} zfs_errno_t;
/*
+1
View File
@@ -50,6 +50,7 @@ int metaslab_init(metaslab_group_t *, uint64_t, uint64_t, uint64_t,
void metaslab_fini(metaslab_t *);
int metaslab_load(metaslab_t *);
void metaslab_potentially_unload(metaslab_t *, uint64_t);
void metaslab_unload(metaslab_t *);
uint64_t metaslab_allocated_space(metaslab_t *);
+2
View File
@@ -89,6 +89,8 @@ void multilist_sublist_insert_head(multilist_sublist_t *, void *);
void multilist_sublist_insert_tail(multilist_sublist_t *, void *);
void multilist_sublist_move_forward(multilist_sublist_t *mls, void *obj);
void multilist_sublist_remove(multilist_sublist_t *, void *);
int multilist_sublist_is_empty(multilist_sublist_t *);
int multilist_sublist_is_empty_idx(multilist_t *, unsigned int);
void *multilist_sublist_head(multilist_sublist_t *);
void *multilist_sublist_tail(multilist_sublist_t *);
+2
View File
@@ -54,8 +54,10 @@ extern "C" {
*/
typedef struct pathname {
char *pn_buf; /* underlying storage */
#if 0 /* unused in ZoL */
char *pn_path; /* remaining pathname */
size_t pn_pathlen; /* remaining length */
#endif
size_t pn_bufsize; /* total size of pn_buf */
} pathname_t;
+1 -1
View File
@@ -1104,7 +1104,7 @@ extern uint64_t spa_missing_tvds_allowed(spa_t *spa);
extern void spa_set_missing_tvds(spa_t *spa, uint64_t missing);
extern boolean_t spa_top_vdevs_spacemap_addressable(spa_t *spa);
extern boolean_t spa_multihost(spa_t *spa);
extern unsigned long spa_get_hostid(void);
extern uint32_t spa_get_hostid(spa_t *spa);
extern void spa_activate_allocation_classes(spa_t *, dmu_tx_t *);
extern int spa_mode(spa_t *spa);
+2
View File
@@ -219,6 +219,7 @@ struct spa {
spa_taskqs_t spa_zio_taskq[ZIO_TYPES][ZIO_TASKQ_TYPES];
dsl_pool_t *spa_dsl_pool;
boolean_t spa_is_initializing; /* true while opening pool */
boolean_t spa_is_exporting; /* true while exporting pool */
metaslab_class_t *spa_normal_class; /* normal data class */
metaslab_class_t *spa_log_class; /* intent log data class */
metaslab_class_t *spa_special_class; /* special allocation class */
@@ -394,6 +395,7 @@ struct spa {
mmp_thread_t spa_mmp; /* multihost mmp thread */
list_t spa_leaf_list; /* list of leaf vdevs */
uint64_t spa_leaf_list_gen; /* track leaf_list changes */
uint32_t spa_hostid; /* cached system hostid */
/*
* spa_refcount & spa_config_lock must be the last elements
+4 -4
View File
@@ -14,7 +14,7 @@
*/
/*
* Copyright (c) 2014, 2017 by Delphix. All rights reserved.
* Copyright (c) 2014, 2019 by Delphix. All rights reserved.
*/
#ifndef _SYS_VDEV_REMOVAL_H
@@ -81,13 +81,13 @@ extern void spa_vdev_condense_suspend(spa_t *);
extern int spa_vdev_remove(spa_t *, uint64_t, boolean_t);
extern void free_from_removing_vdev(vdev_t *, uint64_t, uint64_t);
extern int spa_removal_get_stats(spa_t *, pool_removal_stat_t *);
extern void svr_sync(spa_t *spa, dmu_tx_t *tx);
extern void svr_sync(spa_t *, dmu_tx_t *);
extern void spa_vdev_remove_suspend(spa_t *);
extern int spa_vdev_remove_cancel(spa_t *);
extern void spa_vdev_removal_destroy(spa_vdev_removal_t *svr);
extern void spa_vdev_removal_destroy(spa_vdev_removal_t *);
extern uint64_t spa_remove_max_segment(spa_t *);
extern int vdev_removal_max_span;
extern int zfs_remove_max_segment;
#ifdef __cplusplus
}
+5 -2
View File
@@ -21,7 +21,7 @@
/*
* Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved.
* Copyright (c) 2013 by Delphix. All rights reserved.
* Copyright (c) 2012, 2018 by Delphix. All rights reserved.
* Copyright 2017 Nexenta Systems, Inc.
*/
@@ -350,6 +350,7 @@ typedef struct zap_cursor {
uint64_t zc_serialized;
uint64_t zc_hash;
uint32_t zc_cd;
boolean_t zc_prefetch;
} zap_cursor_t;
typedef struct {
@@ -375,7 +376,9 @@ typedef struct {
* Initialize a zap cursor, pointing to the "first" attribute of the
* zapobj. You must _fini the cursor when you are done with it.
*/
void zap_cursor_init(zap_cursor_t *zc, objset_t *ds, uint64_t zapobj);
void zap_cursor_init(zap_cursor_t *zc, objset_t *os, uint64_t zapobj);
void zap_cursor_init_noprefetch(zap_cursor_t *zc, objset_t *os,
uint64_t zapobj);
void zap_cursor_fini(zap_cursor_t *zc);
/*
+2
View File
@@ -257,6 +257,8 @@ extern void mutex_enter(kmutex_t *mp);
extern void mutex_exit(kmutex_t *mp);
extern int mutex_tryenter(kmutex_t *mp);
#define NESTED_SINGLE 1
#define mutex_enter_nested(mp, class) mutex_enter(mp)
/*
* RW locks
*/
+2 -1
View File
@@ -196,6 +196,7 @@ typedef struct znode {
uint8_t z_atime_dirty; /* atime needs to be synced */
uint8_t z_zn_prefetch; /* Prefetch znodes? */
uint8_t z_moved; /* Has this znode been moved? */
boolean_t z_suspended; /* extra ref from a suspend? */
uint_t z_blksz; /* block size in bytes */
uint_t z_seq; /* modification sequence number */
uint64_t z_mapcnt; /* number of pages mapped to file */
@@ -371,7 +372,7 @@ extern void zfs_log_create(zilog_t *zilog, dmu_tx_t *tx, uint64_t txtype,
extern int zfs_log_create_txtype(zil_create_t, vsecattr_t *vsecp,
vattr_t *vap);
extern void zfs_log_remove(zilog_t *zilog, dmu_tx_t *tx, uint64_t txtype,
znode_t *dzp, char *name, uint64_t foid);
znode_t *dzp, char *name, uint64_t foid, boolean_t unlinked);
#define ZFS_NO_OBJECT 0 /* no object id */
extern void zfs_log_link(zilog_t *zilog, dmu_tx_t *tx, uint64_t txtype,
znode_t *dzp, znode_t *zp, char *name);
+1 -2
View File
@@ -105,8 +105,7 @@ extern size_t lz4_compress_zfs(void *src, void *dst, size_t s_len, size_t d_len,
int level);
extern int lz4_decompress_zfs(void *src, void *dst, size_t s_len, size_t d_len,
int level);
extern int lz4_decompress_abd(abd_t *src, void *dst, size_t s_len, size_t d_len,
int level);
/*
* Compress and decompress data if necessary.
*/
+2
View File
@@ -43,6 +43,8 @@ typedef enum {
NAME_ERR_RESERVED, /* entire name is reserved */
NAME_ERR_DISKLIKE, /* reserved disk name (c[0-9].*) */
NAME_ERR_TOOLONG, /* name is too long */
NAME_ERR_SELF_REF, /* reserved self path name ('.') */
NAME_ERR_PARENT_REF, /* reserved parent path name ('..') */
NAME_ERR_NO_AT, /* permission set is missing '@' */
} namecheck_err_t;
+1 -1
View File
@@ -9,4 +9,4 @@ Version: @VERSION@
URL: http://zfsonlinux.org
Requires: libzfs_core
Cflags: -I${includedir}/libzfs -I${includedir}/libspl
Libs: -L${libdir} -lzfs
Libs: -L${libdir} -lzfs -lnvpair
+18 -12
View File
@@ -475,9 +475,10 @@ change_one(zfs_handle_t *zhp, void *data)
prop_changelist_t *clp = data;
char property[ZFS_MAXPROPLEN];
char where[64];
prop_changenode_t *cn;
prop_changenode_t *cn = NULL;
zprop_source_t sourcetype = ZPROP_SRC_NONE;
zprop_source_t share_sourcetype = ZPROP_SRC_NONE;
int ret = 0;
/*
* We only want to unmount/unshare those filesystems that may inherit
@@ -493,8 +494,7 @@ change_one(zfs_handle_t *zhp, void *data)
zfs_prop_get(zhp, clp->cl_prop, property,
sizeof (property), &sourcetype, where, sizeof (where),
B_FALSE) != 0) {
zfs_close(zhp);
return (0);
goto out;
}
/*
@@ -506,8 +506,7 @@ change_one(zfs_handle_t *zhp, void *data)
zfs_prop_get(zhp, clp->cl_shareprop, property,
sizeof (property), &share_sourcetype, where, sizeof (where),
B_FALSE) != 0) {
zfs_close(zhp);
return (0);
goto out;
}
if (clp->cl_alldependents || clp->cl_allchildren ||
@@ -518,8 +517,8 @@ change_one(zfs_handle_t *zhp, void *data)
share_sourcetype == ZPROP_SRC_INHERITED))) {
if ((cn = zfs_alloc(zfs_get_handle(zhp),
sizeof (prop_changenode_t))) == NULL) {
zfs_close(zhp);
return (-1);
ret = -1;
goto out;
}
cn->cn_handle = zhp;
@@ -541,16 +540,23 @@ change_one(zfs_handle_t *zhp, void *data)
uu_avl_insert(clp->cl_tree, cn, idx);
} else {
free(cn);
zfs_close(zhp);
cn = NULL;
}
if (!clp->cl_alldependents)
return (zfs_iter_children(zhp, change_one, data));
} else {
zfs_close(zhp);
ret = zfs_iter_children(zhp, change_one, data);
/*
* If we added the handle to the changelist, we will re-use it
* later so return without closing it.
*/
if (cn != NULL)
return (ret);
}
return (0);
out:
zfs_close(zhp);
return (ret);
}
static int
+1 -40
View File
@@ -740,14 +740,6 @@ zfs_crypto_create(libzfs_handle_t *hdl, char *parent_name, nvlist_t *props,
pcrypt = ZIO_CRYPT_OFF;
}
/* Check for encryption being explicitly truned off */
if (crypt == ZIO_CRYPT_OFF && pcrypt != ZIO_CRYPT_OFF) {
ret = EINVAL;
zfs_error_aux(hdl, dgettext(TEXT_DOMAIN,
"Invalid encryption value. Dataset must be encrypted."));
goto out;
}
/* Get the inherited encryption property if we don't have it locally */
if (!local_crypt)
crypt = pcrypt;
@@ -849,10 +841,7 @@ int
zfs_crypto_clone_check(libzfs_handle_t *hdl, zfs_handle_t *origin_zhp,
char *parent_name, nvlist_t *props)
{
int ret;
char errbuf[1024];
zfs_handle_t *pzhp = NULL;
uint64_t pcrypt, ocrypt;
(void) snprintf(errbuf, sizeof (errbuf),
dgettext(TEXT_DOMAIN, "Encryption clone error"));
@@ -865,40 +854,12 @@ zfs_crypto_clone_check(libzfs_handle_t *hdl, zfs_handle_t *origin_zhp,
nvlist_exists(props, zfs_prop_to_name(ZFS_PROP_KEYLOCATION)) ||
nvlist_exists(props, zfs_prop_to_name(ZFS_PROP_ENCRYPTION)) ||
nvlist_exists(props, zfs_prop_to_name(ZFS_PROP_PBKDF2_ITERS))) {
ret = EINVAL;
zfs_error_aux(hdl, dgettext(TEXT_DOMAIN,
"Encryption properties must inherit from origin dataset."));
goto out;
return (EINVAL);
}
/* get a reference to parent dataset, should never be NULL */
pzhp = make_dataset_handle(hdl, parent_name);
if (pzhp == NULL) {
zfs_error_aux(hdl, dgettext(TEXT_DOMAIN,
"Failed to lookup parent."));
return (ENOENT);
}
/* Lookup parent's crypt */
pcrypt = zfs_prop_get_int(pzhp, ZFS_PROP_ENCRYPTION);
ocrypt = zfs_prop_get_int(origin_zhp, ZFS_PROP_ENCRYPTION);
/* all children of encrypted parents must be encrypted */
if (pcrypt != ZIO_CRYPT_OFF && ocrypt == ZIO_CRYPT_OFF) {
ret = EINVAL;
zfs_error_aux(hdl, dgettext(TEXT_DOMAIN,
"Cannot create unencrypted clone as a child "
"of encrypted parent."));
goto out;
}
zfs_close(pzhp);
return (0);
out:
if (pzhp != NULL)
zfs_close(pzhp);
return (ret);
}
typedef struct loadkeys_cbdata {
+31 -20
View File
@@ -31,6 +31,7 @@
* Copyright 2016 Igor Kozhukhov <ikozhukhov@gmail.com>
* Copyright 2017-2018 RackTop Systems.
* Copyright (c) 2019 Datto Inc.
* Copyright (c) 2019, loli10K <ezomori.nozomu@gmail.com>
*/
#include <ctype.h>
@@ -196,6 +197,16 @@ zfs_validate_name(libzfs_handle_t *hdl, const char *path, int type,
"reserved disk name"));
break;
case NAME_ERR_SELF_REF:
zfs_error_aux(hdl, dgettext(TEXT_DOMAIN,
"self reference, '.' is found in name"));
break;
case NAME_ERR_PARENT_REF:
zfs_error_aux(hdl, dgettext(TEXT_DOMAIN,
"parent reference, '..' is found in name"));
break;
default:
zfs_error_aux(hdl, dgettext(TEXT_DOMAIN,
"(%d) not defined"), why);
@@ -2969,8 +2980,10 @@ zfs_prop_get(zfs_handle_t *zhp, zfs_prop_t prop, char *propbuf, size_t proplen,
case ZFS_PROP_GUID:
case ZFS_PROP_CREATETXG:
case ZFS_PROP_OBJSETID:
/*
* GUIDs are stored as numbers, but they are identifiers.
* These properties are stored as numbers, but they are
* identifiers.
* We don't want them to be pretty printed, because pretty
* printing mangles the ID into a truncated and useless value.
*/
@@ -4104,6 +4117,16 @@ zfs_promote(zfs_handle_t *zhp)
if (ret != 0) {
switch (ret) {
case EACCES:
/*
* Promoting encrypted dataset outside its
* encryption root.
*/
zfs_error_aux(hdl, dgettext(TEXT_DOMAIN,
"cannot promote dataset outside its "
"encryption root"));
return (zfs_error(hdl, EZFS_EXISTS, errbuf));
case EEXIST:
/* There is a conflicting snapshot name. */
zfs_error_aux(hdl, dgettext(TEXT_DOMAIN,
@@ -4467,8 +4490,6 @@ zfs_rename(zfs_handle_t *zhp, const char *target, boolean_t recursive,
zfs_cmd_t zc = {"\0"};
char *delim;
prop_changelist_t *cl = NULL;
zfs_handle_t *zhrp = NULL;
char *parentname = NULL;
char parent[ZFS_MAX_DATASET_NAME_LEN];
libzfs_handle_t *hdl = zhp->zfs_hdl;
char errbuf[1024];
@@ -4563,7 +4584,8 @@ zfs_rename(zfs_handle_t *zhp, const char *target, boolean_t recursive,
}
if (recursive) {
parentname = zfs_strdup(zhp->zfs_hdl, zhp->zfs_name);
zfs_handle_t *zhrp;
char *parentname = zfs_strdup(zhp->zfs_hdl, zhp->zfs_name);
if (parentname == NULL) {
ret = -1;
goto error;
@@ -4571,10 +4593,12 @@ zfs_rename(zfs_handle_t *zhp, const char *target, boolean_t recursive,
delim = strchr(parentname, '@');
*delim = '\0';
zhrp = zfs_open(zhp->zfs_hdl, parentname, ZFS_TYPE_DATASET);
free(parentname);
if (zhrp == NULL) {
ret = -1;
goto error;
}
zfs_close(zhrp);
} else if (zhp->zfs_type != ZFS_TYPE_SNAPSHOT) {
if ((cl = changelist_gather(zhp, ZFS_PROP_NAME,
CL_GATHER_ITER_MOUNTED,
@@ -4618,16 +4642,9 @@ zfs_rename(zfs_handle_t *zhp, const char *target, boolean_t recursive,
"with the new name"));
(void) zfs_error(hdl, EZFS_EXISTS, errbuf);
} else if (errno == EACCES) {
if (zfs_prop_get_int(zhp, ZFS_PROP_ENCRYPTION) ==
ZIO_CRYPT_OFF) {
zfs_error_aux(hdl, dgettext(TEXT_DOMAIN,
"cannot rename an unencrypted dataset to "
"be a decendent of an encrypted one"));
} else {
zfs_error_aux(hdl, dgettext(TEXT_DOMAIN,
"cannot move encryption child outside of "
"its encryption root"));
}
zfs_error_aux(hdl, dgettext(TEXT_DOMAIN,
"cannot move encrypted child outside of "
"its encryption root"));
(void) zfs_error(hdl, EZFS_CRYPTOFAILED, errbuf);
} else {
(void) zfs_standard_error(zhp->zfs_hdl, errno, errbuf);
@@ -4647,12 +4664,6 @@ zfs_rename(zfs_handle_t *zhp, const char *target, boolean_t recursive,
}
error:
if (parentname != NULL) {
free(parentname);
}
if (zhrp != NULL) {
zfs_close(zhrp);
}
if (cl != NULL) {
changelist_free(cl);
}
+4 -2
View File
@@ -1302,12 +1302,14 @@ mountpoint_cmp(const void *arga, const void *argb)
}
/*
* Return true if path2 is a child of path1.
* Return true if path2 is a child of path1 or path2 equals path1 or
* path1 is "/" (path2 is always a child of "/").
*/
static boolean_t
libzfs_path_contains(const char *path1, const char *path2)
{
return (strstr(path2, path1) == path2 && path2[strlen(path1)] == '/');
return (strcmp(path1, path2) == 0 || strcmp(path1, "/") == 0 ||
(strstr(path2, path1) == path2 && path2[strlen(path1)] == '/'));
}
/*
+33 -35
View File
@@ -2827,7 +2827,7 @@ recv_fix_encryption_hierarchy(libzfs_handle_t *hdl, const char *destname,
is_clone = zhp->zfs_dmustats.dds_origin[0] != '\0';
(void) zfs_crypto_get_encryption_root(zhp, &is_encroot, NULL);
/* we don't need to do anything for unencrypted filesystems */
/* we don't need to do anything for unencrypted datasets */
if (crypt == ZIO_CRYPT_OFF) {
zfs_close(zhp);
continue;
@@ -3992,11 +3992,18 @@ zfs_receive_one(libzfs_handle_t *hdl, int infd, const char *tosnap,
}
} else {
/*
* if the fs does not exist, look for it based on the
* fromsnap GUID
* If the fs does not exist, look for it based on the
* fromsnap GUID.
*/
(void) snprintf(errbuf, sizeof (errbuf), dgettext(TEXT_DOMAIN,
"cannot receive incremental stream"));
if (resuming) {
(void) snprintf(errbuf, sizeof (errbuf),
dgettext(TEXT_DOMAIN,
"cannot receive resume stream"));
} else {
(void) snprintf(errbuf, sizeof (errbuf),
dgettext(TEXT_DOMAIN,
"cannot receive incremental stream"));
}
(void) strcpy(name, destsnap);
*strchr(name, '@') = '\0';
@@ -4210,34 +4217,6 @@ zfs_receive_one(libzfs_handle_t *hdl, int infd, const char *tosnap,
goto out;
}
/*
* It is invalid to receive a properties stream that was
* unencrypted on the send side as a child of an encrypted
* parent. Technically there is nothing preventing this, but
* it would mean that the encryption=off property which is
* locally set on the send side would not be received correctly.
* We can infer encryption=off if the stream is not raw and
* properties were included since the send side will only ever
* send the encryption property in a raw nvlist header. This
* check will be avoided if the user specifically overrides
* the encryption property on the command line.
*/
if (!raw && rcvprops != NULL &&
!nvlist_exists(cmdprops,
zfs_prop_to_name(ZFS_PROP_ENCRYPTION))) {
uint64_t crypt;
crypt = zfs_prop_get_int(zhp, ZFS_PROP_ENCRYPTION);
if (crypt != ZIO_CRYPT_OFF) {
zfs_error_aux(hdl, dgettext(TEXT_DOMAIN,
"parent '%s' must not be encrypted to "
"receive unenecrypted property"), name);
err = zfs_error(hdl, EZFS_BADPROP, errbuf);
zfs_close(zhp);
goto out;
}
}
zfs_close(zhp);
newfs = B_TRUE;
@@ -4274,6 +4253,24 @@ zfs_receive_one(libzfs_handle_t *hdl, int infd, const char *tosnap,
&oxprops, &wkeydata, &wkeylen, errbuf)) != 0)
goto out;
/*
* When sending with properties (zfs send -p), the encryption property
* is not included because it is a SETONCE property and therefore
* treated as read only. However, we are always able to determine its
* value because raw sends will include it in the DRR_BDEGIN payload
* and non-raw sends with properties are not allowed for encrypted
* datasets. Therefore, if this is a non-raw properties stream, we can
* infer that the value should be ZIO_CRYPT_OFF and manually add that
* to the received properties.
*/
if (stream_wantsnewfs && !raw && rcvprops != NULL &&
!nvlist_exists(cmdprops, zfs_prop_to_name(ZFS_PROP_ENCRYPTION))) {
if (oxprops == NULL)
oxprops = fnvlist_alloc();
fnvlist_add_uint64(oxprops,
zfs_prop_to_name(ZFS_PROP_ENCRYPTION), ZIO_CRYPT_OFF);
}
err = ioctl_err = lzc_receive_with_cmdprops(destsnap, rcvprops,
oxprops, wkeydata, wkeylen, origin, flags->force, flags->resumable,
raw, infd, drr_noswap, cleanup_fd, &read_bytes, &errflags,
@@ -4428,14 +4425,15 @@ zfs_receive_one(libzfs_handle_t *hdl, int infd, const char *tosnap,
*cp = '@';
break;
case EINVAL:
if (flags->resumable)
if (flags->resumable) {
zfs_error_aux(hdl, dgettext(TEXT_DOMAIN,
"kernel modules must be upgraded to "
"receive this stream."));
if (embedded && !raw)
} else if (embedded && !raw) {
zfs_error_aux(hdl, dgettext(TEXT_DOMAIN,
"incompatible embedded data stream "
"feature with encrypted receive."));
}
(void) zfs_error(hdl, EZFS_BADSTREAM, errbuf);
break;
case ECKSUM:
+6 -1
View File
@@ -303,6 +303,8 @@ libzfs_error_description(libzfs_handle_t *hdl)
case EZFS_NO_RESILVER_DEFER:
return (dgettext(TEXT_DOMAIN, "this action requires the "
"resilver_defer feature"));
case EZFS_EXPORT_IN_PROGRESS:
return (dgettext(TEXT_DOMAIN, "pool export in progress"));
case EZFS_UNKNOWN:
return (dgettext(TEXT_DOMAIN, "unknown error"));
default:
@@ -598,6 +600,9 @@ zpool_standard_error_fmt(libzfs_handle_t *hdl, int error, const char *fmt, ...)
case ZFS_ERR_VDEV_TOO_BIG:
zfs_verror(hdl, EZFS_VDEV_TOO_BIG, fmt, ap);
break;
case ZFS_ERR_EXPORT_IN_PROGRESS:
zfs_verror(hdl, EZFS_EXPORT_IN_PROGRESS, fmt, ap);
break;
case ZFS_ERR_IOC_CMD_UNAVAIL:
zfs_error_aux(hdl, dgettext(TEXT_DOMAIN, "the loaded zfs "
"module does not support this operation. A reboot may "
@@ -1134,7 +1139,7 @@ int
zcmd_alloc_dst_nvlist(libzfs_handle_t *hdl, zfs_cmd_t *zc, size_t len)
{
if (len == 0)
len = 16 * 1024;
len = 256 * 1024;
zc->zc_nvlist_dst_size = len;
zc->zc_nvlist_dst =
(uint64_t)(uintptr_t)zfs_alloc(hdl, zc->zc_nvlist_dst_size);
+3 -3
View File
@@ -52,7 +52,7 @@
*
* - Thin Layer. libzfs_core is a thin layer, marshaling arguments
* to/from the kernel ioctls. There is generally a 1:1 correspondence
* between libzfs_core functions and ioctls to /dev/zfs.
* between libzfs_core functions and ioctls to ZFS_DEV.
*
* - Clear Atomicity. Because libzfs_core functions are generally 1:1
* with kernel ioctls, and kernel ioctls are general atomic, each
@@ -135,7 +135,7 @@ libzfs_core_init(void)
{
(void) pthread_mutex_lock(&g_lock);
if (g_refcount == 0) {
g_fd = open("/dev/zfs", O_RDWR);
g_fd = open(ZFS_DEV, O_RDWR);
if (g_fd < 0) {
(void) pthread_mutex_unlock(&g_lock);
return (errno);
@@ -499,7 +499,7 @@ lzc_sync(const char *pool_name, nvlist_t *innvl, nvlist_t **outnvl)
* The snapshots must all be in the same pool.
* The value is the name of the hold (string type).
*
* If cleanup_fd is not -1, it must be the result of open("/dev/zfs", O_EXCL).
* If cleanup_fd is not -1, it must be the result of open(ZFS_DEV, O_EXCL).
* In this case, when the cleanup_fd is closed (including on process
* termination), the holds will be released. If the system is shut down
* uncleanly, the holds will be released when the pool is next opened
+1 -1
View File
@@ -223,7 +223,7 @@ pool_active(void *unused, const char *name, uint64_t guid,
* Use ZFS_IOC_POOL_SYNC to confirm if a pool is active
*/
fd = open("/dev/zfs", O_RDWR);
fd = open(ZFS_DEV, O_RDWR);
if (fd < 0)
return (-1);
+1 -1
View File
@@ -1,4 +1,4 @@
dist_man_MANS = zhack.1 ztest.1 raidz_test.1
dist_man_MANS = zhack.1 ztest.1 raidz_test.1 zvol_wait.1
EXTRA_DIST = cstyle.1
install-data-local:
+1 -1
View File
@@ -25,7 +25,7 @@
.TH raidz_test 1 "2016" "ZFS on Linux" "User Commands"
.SH NAME
\fBraidz_test\fR \- raidz implementation verification and bencmarking tool
\fBraidz_test\fR \- raidz implementation verification and benchmarking tool
.SH SYNOPSIS
.LP
.BI "raidz_test <options>"
+21
View File
@@ -0,0 +1,21 @@
.Dd July 5, 2019
.Dt ZVOL_WAIT 1 SMM
.Os Linux
.Sh NAME
.Nm zvol_wait
.Nd Wait for ZFS volume links in
.Em /dev
to be created.
.Sh SYNOPSIS
.Nm
.Sh DESCRIPTION
When a ZFS pool is imported, ZFS will register each ZFS volume
(zvol) as a disk device with the system. As the disks are registered,
.Xr \fBudev 7\fR
will asynchronously create symlinks under
.Em /dev/zvol
using the zvol's name.
.Nm
will wait for all those symlinks to be created before returning.
.Sh SEE ALSO
.Xr \fBudev 7\fR
+67 -3
View File
@@ -104,6 +104,18 @@ to a log2 fraction of the target arc size.
Default value: \fB6\fR.
.RE
.sp
.ne 2
.na
\fBdmu_prefetch_max\fR (int)
.ad
.RS 12n
Limit the amount we can prefetch with one call to this amount (in bytes).
This helps to limit the amount of memory that can be used by prefetching.
.sp
Default value: \fB134,217,728\fR (128MB).
.RE
.sp
.ne 2
.na
@@ -502,6 +514,19 @@ regular reads (but there's no reason it has to be the same).
Default value: \fB32,768\fR.
.RE
.sp
.ne 2
.na
\fBzap_iterate_prefetch\fR (int)
.ad
.RS 12n
If this is set, when we start iterating over a ZAP object, zfs will prefetch
the entire object (all leaf blocks). However, this is limited by
\fBdmu_prefetch_max\fR.
.sp
Use \fB1\fR for on (default) and \fB0\fR for off.
.RE
.sp
.ne 2
.na
@@ -1817,7 +1842,7 @@ this value. If a metaslab group exceeds this threshold then it will be
skipped unless all metaslab groups within the metaslab class have also
crossed this threshold.
.sp
Default value: \fB85\fR.
Default value: \fB95\fR.
.RE
.sp
@@ -2169,6 +2194,33 @@ pool cannot be returned to a healthy state prior to removing the device.
Default value: \fB0\fR.
.RE
.sp
.ne 2
.na
\fBzfs_removal_suspend_progress\fR (int)
.ad
.RS 12n
.sp
This is used by the test suite so that it can ensure that certain actions
happen while in the middle of a removal.
.sp
Default value: \fB0\fR.
.RE
.sp
.ne 2
.na
\fBzfs_remove_max_segment\fR (int)
.ad
.RS 12n
.sp
The largest contiguous segment that we will attempt to allocate when removing
a device. This can be no larger than 16MB. If there is a performance
problem with attempting to allocate large blocks, consider decreasing this.
.sp
Default value: \fB16,777,216\fR (16MB).
.RE
.sp
.ne 2
.na
@@ -2419,9 +2471,21 @@ Default value: \fB25\fR.
\fBzfs_sync_pass_dont_compress\fR (int)
.ad
.RS 12n
Don't compress starting in this pass
Starting in this sync pass, we disable compression (including of metadata).
With the default setting, in practice, we don't have this many sync passes,
so this has no effect.
.sp
Default value: \fB5\fR.
The original intent was that disabling compression would help the sync passes
to converge. However, in practice disabling compression increases the average
number of sync passes, because when we turn compression off, a lot of block's
size will change and thus we have to re-allocate (not overwrite) them. It
also increases the number of 128KB allocations (e.g. for indirect blocks and
spacemaps) because these will not be compressed. The 128K allocations are
especially detrimental to performance on highly fragmented systems, which may
have very few free segments of this size, and may need to load new metaslabs
to satisfy 128K allocations.
.sp
Default value: \fB8\fR.
.RE
.sp
+1 -1
View File
@@ -26,7 +26,7 @@ information on ZFS mountpoints must be stored separately. The output
of the command
.PP
.RS 4
zfs list -H -o name,mountpoint,canmount,atime,relatime,devices,exec,readonly,setuid,nbmand
zfs list -H -o name,mountpoint,canmount,atime,relatime,devices,exec,readonly,setuid,nbmand,encroot,keylocation
.RE
.PP
for datasets that should be mounted by systemd, should be kept
+14 -13
View File
@@ -1,11 +1,11 @@
subdir-m += avl
subdir-m += icp
subdir-m += lua
subdir-m += nvpair
subdir-m += spl
subdir-m += unicode
subdir-m += zcommon
subdir-m += zfs
obj-m += avl/
obj-m += icp/
obj-m += lua/
obj-m += nvpair/
obj-m += spl/
obj-m += unicode/
obj-m += zcommon/
obj-m += zfs/
INSTALL_MOD_DIR ?= extra
@@ -60,14 +60,15 @@ modules_install:
modules_uninstall:
@# Uninstall the kernel modules
kmoddir=$(DESTDIR)$(INSTALL_MOD_PATH)/lib/modules/@LINUX_VERSION@
list='$(subdir-m)'; for subdir in $$list; do \
$(RM) -R $$kmoddir/$(INSTALL_MOD_DIR)/$$subdir; \
list='$(obj-m)'; for objdir in $$list; do \
$(RM) -R $$kmoddir/$(INSTALL_MOD_DIR)/$$objdir; \
done
distdir:
list='$(subdir-m)'; for subdir in $$list; do \
(cd @top_srcdir@/module && find $$subdir -name '*.c' -o -name '*.h' -o -name '*.S' |\
xargs cp --parents -t $$distdir); \
list='$(obj-m)'; for objdir in $$list; do \
(cd @top_srcdir@/module && find $$objdir \
-name '*.c' -o -name '*.h' -o -name '*.S' | \
xargs cp --parents -t @abs_top_builddir@/module/$$distdir); \
done
distclean maintainer-clean: clean
+8 -3
View File
@@ -303,16 +303,21 @@ aes_impl_init(void)
}
aes_supp_impl_cnt = c;
/* set fastest implementation. assume hardware accelerated is fastest */
/*
* Set the fastest implementation given the assumption that the
* hardware accelerated version is the fastest.
*/
#if defined(__x86_64)
#if defined(HAVE_AES)
if (aes_aesni_impl.is_supported())
if (aes_aesni_impl.is_supported()) {
memcpy(&aes_fastest_impl, &aes_aesni_impl,
sizeof (aes_fastest_impl));
else
} else
#endif
{
memcpy(&aes_fastest_impl, &aes_x86_64_impl,
sizeof (aes_fastest_impl));
}
#else
memcpy(&aes_fastest_impl, &aes_generic_impl,
sizeof (aes_fastest_impl));
+6 -4
View File
@@ -646,7 +646,7 @@ const gcm_impl_ops_t *gcm_all_impl[] = {
/* Indicate that benchmark has been completed */
static boolean_t gcm_impl_initialized = B_FALSE;
/* Select aes implementation */
/* Select GCM implementation */
#define IMPL_FASTEST (UINT32_MAX)
#define IMPL_CYCLE (UINT32_MAX-1)
@@ -713,13 +713,15 @@ gcm_impl_init(void)
/* set fastest implementation. assume hardware accelerated is fastest */
#if defined(__x86_64) && defined(HAVE_PCLMULQDQ)
if (gcm_pclmulqdq_impl.is_supported())
if (gcm_pclmulqdq_impl.is_supported()) {
memcpy(&gcm_fastest_impl, &gcm_pclmulqdq_impl,
sizeof (gcm_fastest_impl));
else
} else
#endif
{
memcpy(&gcm_fastest_impl, &gcm_generic_impl,
sizeof (gcm_fastest_impl));
}
strcpy(gcm_fastest_impl.name, "fastest");
@@ -742,7 +744,7 @@ static const struct {
* If we are called before init(), user preference will be saved in
* user_sel_impl, and applied in later init() call. This occurs when module
* parameter is specified on module load. Otherwise, directly update
* icp_aes_impl.
* icp_gcm_impl.
*
* @val Name of gcm implementation to use
* @param Unused.
+1 -1
View File
@@ -162,7 +162,7 @@ typedef enum aes_mech_type {
#endif /* _AES_IMPL */
/*
* Methods used to define aes implementation
* Methods used to define AES implementation
*
* @aes_gen_f Key generation
* @aes_enc_f Function encrypts one block
+2 -2
View File
@@ -37,12 +37,12 @@ extern "C" {
#include <sys/crypto/common.h>
/*
* Methods used to define gcm implementation
* Methods used to define GCM implementation
*
* @gcm_mul_f Perform carry-less multiplication
* @gcm_will_work_f Function tests whether implementation will function
*/
typedef void (*gcm_mul_f)(uint64_t *, uint64_t *, uint64_t *);
typedef void (*gcm_mul_f)(uint64_t *, uint64_t *, uint64_t *);
typedef boolean_t (*gcm_will_work_f)(void);
#define GCM_IMPL_NAME_MAX (16)
+1 -1
View File
@@ -61,7 +61,7 @@
#elif defined(__mips__)
#define JMP_BUF_CNT 12
#elif defined(__s390x__)
#define JMP_BUF_CNT 9
#define JMP_BUF_CNT 18
#else
#define JMP_BUF_CNT 1
#endif
+6 -3
View File
@@ -431,9 +431,12 @@ static int llex (LexState *ls, SemInfo *seminfo) {
if (sep >= 0) {
read_long_string(ls, seminfo, sep);
return TK_STRING;
}
else if (sep == -1) return '[';
else lexerror(ls, "invalid long string delimiter", TK_STRING);
} else if (sep == -1) {
return '[';
} else {
lexerror(ls, "invalid long string delimiter", TK_STRING);
break;
}
}
case '=': {
next(ls);
-2
View File
@@ -16,10 +16,8 @@ $(MODULE)-objs += spl-kmem.o
$(MODULE)-objs += spl-kmem-cache.o
$(MODULE)-objs += spl-kobj.o
$(MODULE)-objs += spl-kstat.o
$(MODULE)-objs += spl-mutex.o
$(MODULE)-objs += spl-proc.o
$(MODULE)-objs += spl-procfs-list.o
$(MODULE)-objs += spl-rwlock.o
$(MODULE)-objs += spl-taskq.o
$(MODULE)-objs += spl-thread.o
$(MODULE)-objs += spl-tsd.o
+21 -8
View File
@@ -154,26 +154,39 @@ EXPORT_SYMBOL(__cv_wait_sig);
#if defined(HAVE_IO_SCHEDULE_TIMEOUT)
#define spl_io_schedule_timeout(t) io_schedule_timeout(t)
#else
struct spl_task_timer {
struct timer_list timer;
struct task_struct *task;
};
static void
__cv_wakeup(unsigned long data)
__cv_wakeup(spl_timer_list_t t)
{
wake_up_process((struct task_struct *)data);
struct timer_list *tmr = (struct timer_list *)t;
struct spl_task_timer *task_timer = from_timer(task_timer, tmr, timer);
wake_up_process(task_timer->task);
}
static long
spl_io_schedule_timeout(long time_left)
{
long expire_time = jiffies + time_left;
struct timer_list timer;
struct spl_task_timer task_timer;
struct timer_list *timer = &task_timer.timer;
init_timer(&timer);
setup_timer(&timer, __cv_wakeup, (unsigned long)current);
timer.expires = expire_time;
add_timer(&timer);
task_timer.task = current;
timer_setup(timer, __cv_wakeup, 0);
timer->expires = expire_time;
add_timer(timer);
io_schedule();
del_timer_sync(&timer);
del_timer_sync(timer);
time_left = expire_time - jiffies;
return (time_left < 0 ? 0 : time_left);
+13 -25
View File
@@ -694,51 +694,41 @@ spl_init(void)
if ((rc = spl_kvmem_init()))
goto out1;
if ((rc = spl_mutex_init()))
if ((rc = spl_tsd_init()))
goto out2;
if ((rc = spl_rw_init()))
if ((rc = spl_taskq_init()))
goto out3;
if ((rc = spl_tsd_init()))
if ((rc = spl_kmem_cache_init()))
goto out4;
if ((rc = spl_taskq_init()))
if ((rc = spl_vn_init()))
goto out5;
if ((rc = spl_kmem_cache_init()))
if ((rc = spl_proc_init()))
goto out6;
if ((rc = spl_vn_init()))
if ((rc = spl_kstat_init()))
goto out7;
if ((rc = spl_proc_init()))
goto out8;
if ((rc = spl_kstat_init()))
goto out9;
if ((rc = spl_zlib_init()))
goto out10;
goto out8;
return (rc);
out10:
spl_kstat_fini();
out9:
spl_proc_fini();
out8:
spl_vn_fini();
spl_kstat_fini();
out7:
spl_kmem_cache_fini();
spl_proc_fini();
out6:
spl_taskq_fini();
spl_vn_fini();
out5:
spl_tsd_fini();
spl_kmem_cache_fini();
out4:
spl_rw_fini();
spl_taskq_fini();
out3:
spl_mutex_fini();
spl_tsd_fini();
out2:
spl_kvmem_fini();
out1:
@@ -755,8 +745,6 @@ spl_fini(void)
spl_kmem_cache_fini();
spl_taskq_fini();
spl_tsd_fini();
spl_rw_fini();
spl_mutex_fini();
spl_kvmem_fini();
}
+2 -1
View File
@@ -180,7 +180,8 @@ spl_kmem_alloc_impl(size_t size, int flags, int node)
*/
if ((size > spl_kmem_alloc_max) || use_vmem) {
if (flags & KM_VMEM) {
ptr = __vmalloc(size, lflags, PAGE_KERNEL);
ptr = __vmalloc(size, lflags | __GFP_HIGHMEM,
PAGE_KERNEL);
} else {
return (NULL);
}
-30
View File
@@ -1,30 +0,0 @@
/*
* Copyright (C) 2007-2010 Lawrence Livermore National Security, LLC.
* Copyright (C) 2007 The Regents of the University of California.
* Produced at Lawrence Livermore National Laboratory (cf, DISCLAIMER).
* Written by Brian Behlendorf <behlendorf1@llnl.gov>.
* UCRL-CODE-235197
*
* This file is part of the SPL, Solaris Porting Layer.
* For details, see <http://zfsonlinux.org/>.
*
* The SPL is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License as published by the
* Free Software Foundation; either version 2 of the License, or (at your
* option) any later version.
*
* The SPL is distributed in the hope that it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
* FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
* for more details.
*
* You should have received a copy of the GNU General Public License along
* with the SPL. If not, see <http://www.gnu.org/licenses/>.
*
* Solaris Porting Layer (SPL) Mutex Implementation.
*/
#include <sys/mutex.h>
int spl_mutex_init(void) { return 0; }
void spl_mutex_fini(void) { }
-132
View File
@@ -1,132 +0,0 @@
/*
* Copyright (C) 2007-2010 Lawrence Livermore National Security, LLC.
* Copyright (C) 2007 The Regents of the University of California.
* Produced at Lawrence Livermore National Laboratory (cf, DISCLAIMER).
* Written by Brian Behlendorf <behlendorf1@llnl.gov>.
* UCRL-CODE-235197
*
* This file is part of the SPL, Solaris Porting Layer.
* For details, see <http://zfsonlinux.org/>.
*
* The SPL is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License as published by the
* Free Software Foundation; either version 2 of the License, or (at your
* option) any later version.
*
* The SPL is distributed in the hope that it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
* FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
* for more details.
*
* You should have received a copy of the GNU General Public License along
* with the SPL. If not, see <http://www.gnu.org/licenses/>.
*
* Solaris Porting Layer (SPL) Reader/Writer Lock Implementation.
*/
#include <sys/rwlock.h>
#include <linux/module.h>
#if defined(CONFIG_PREEMPT_RT_FULL)
#include <linux/rtmutex.h>
#define RT_MUTEX_OWNER_MASKALL 1UL
static int
__rwsem_tryupgrade(struct rw_semaphore *rwsem)
{
#if defined(READER_BIAS) && defined(WRITER_BIAS)
/*
* After the 4.9.20-rt16 kernel the realtime patch series lifted the
* single reader restriction. While this could be accommodated by
* adding additional compatibility code assume the rwsem can never
* be upgraded. All caller must already cleanly handle this case.
*/
return (0);
#else
ASSERT((struct task_struct *)
((unsigned long)rwsem->lock.owner & ~RT_MUTEX_OWNER_MASKALL) ==
current);
/*
* Prior to 4.9.20-rt16 kernel the realtime patch series, rwsem is
* implemented as a single mutex held by readers and writers alike.
* However, this implementation would prevent a thread from taking
* a read lock twice, as the mutex would already be locked on
* the second attempt. Therefore the implementation allows a
* single thread to take a rwsem as read lock multiple times
* tracking that nesting as read_depth counter.
*/
if (rwsem->read_depth <= 1) {
/*
* In case, the current thread has not taken the lock
* more than once as read lock, we can allow an
* upgrade to a write lock. rwsem_rt.h implements
* write locks as read_depth == 0.
*/
rwsem->read_depth = 0;
return (1);
}
return (0);
#endif
}
#elif defined(CONFIG_RWSEM_GENERIC_SPINLOCK)
static int
__rwsem_tryupgrade(struct rw_semaphore *rwsem)
{
int ret = 0;
unsigned long flags;
spl_rwsem_lock_irqsave(&rwsem->wait_lock, flags);
if (RWSEM_COUNT(rwsem) == SPL_RWSEM_SINGLE_READER_VALUE &&
list_empty(&rwsem->wait_list)) {
ret = 1;
RWSEM_COUNT(rwsem) = SPL_RWSEM_SINGLE_WRITER_VALUE;
}
spl_rwsem_unlock_irqrestore(&rwsem->wait_lock, flags);
return (ret);
}
#elif defined(RWSEM_ACTIVE_MASK)
#if defined(HAVE_RWSEM_ATOMIC_LONG_COUNT)
static int
__rwsem_tryupgrade(struct rw_semaphore *rwsem)
{
long val;
val = atomic_long_cmpxchg(&rwsem->count, SPL_RWSEM_SINGLE_READER_VALUE,
SPL_RWSEM_SINGLE_WRITER_VALUE);
return (val == SPL_RWSEM_SINGLE_READER_VALUE);
}
#else
static int
__rwsem_tryupgrade(struct rw_semaphore *rwsem)
{
typeof(rwsem->count) val;
val = cmpxchg(&rwsem->count, SPL_RWSEM_SINGLE_READER_VALUE,
SPL_RWSEM_SINGLE_WRITER_VALUE);
return (val == SPL_RWSEM_SINGLE_READER_VALUE);
}
#endif
#else
static int
__rwsem_tryupgrade(struct rw_semaphore *rwsem)
{
return (0);
}
#endif
int
rwsem_tryupgrade(struct rw_semaphore *rwsem)
{
if (__rwsem_tryupgrade(rwsem)) {
rwsem_release(&rwsem->dep_map, 1, _RET_IP_);
rwsem_acquire(&rwsem->dep_map, 0, 1, _RET_IP_);
#ifdef CONFIG_RWSEM_SPIN_ON_OWNER
rwsem->owner = current;
#endif
return (1);
}
return (0);
}
EXPORT_SYMBOL(rwsem_tryupgrade);
int spl_rw_init(void) { return 0; }
void spl_rw_fini(void) { }
+4 -20
View File
@@ -24,6 +24,7 @@
* Solaris Porting Layer (SPL) Task Queue Implementation.
*/
#include <sys/timer.h>
#include <sys/taskq.h>
#include <sys/kmem.h>
#include <sys/tsd.h>
@@ -242,20 +243,13 @@ task_expire_impl(taskq_ent_t *t)
wake_up(&tq->tq_work_waitq);
}
#ifdef HAVE_KERNEL_TIMER_FUNCTION_TIMER_LIST
static void
task_expire(struct timer_list *tl)
task_expire(spl_timer_list_t tl)
{
taskq_ent_t *t = from_timer(t, tl, tqent_timer);
struct timer_list *tmr = (struct timer_list *)tl;
taskq_ent_t *t = from_timer(t, tmr, tqent_timer);
task_expire_impl(t);
}
#else
static void
task_expire(unsigned long data)
{
task_expire_impl((taskq_ent_t *)data);
}
#endif
/*
* Returns the lowest incomplete taskqid_t. The taskqid_t may
@@ -597,9 +591,6 @@ taskq_dispatch(taskq_t *tq, task_func_t func, void *arg, uint_t flags)
t->tqent_func = func;
t->tqent_arg = arg;
t->tqent_taskq = tq;
#ifndef HAVE_KERNEL_TIMER_FUNCTION_TIMER_LIST
t->tqent_timer.data = 0;
#endif
t->tqent_timer.function = NULL;
t->tqent_timer.expires = 0;
t->tqent_birth = jiffies;
@@ -649,9 +640,6 @@ taskq_dispatch_delay(taskq_t *tq, task_func_t func, void *arg,
t->tqent_func = func;
t->tqent_arg = arg;
t->tqent_taskq = tq;
#ifndef HAVE_KERNEL_TIMER_FUNCTION_TIMER_LIST
t->tqent_timer.data = (unsigned long)t;
#endif
t->tqent_timer.function = task_expire;
t->tqent_timer.expires = (unsigned long)expire_time;
add_timer(&t->tqent_timer);
@@ -744,11 +732,7 @@ taskq_init_ent(taskq_ent_t *t)
{
spin_lock_init(&t->tqent_lock);
init_waitqueue_head(&t->tqent_waitq);
#ifdef HAVE_KERNEL_TIMER_FUNCTION_TIMER_LIST
timer_setup(&t->tqent_timer, NULL, 0);
#else
init_timer(&t->tqent_timer);
#endif
INIT_LIST_HEAD(&t->tqent_list);
t->tqent_id = 0;
t->tqent_func = NULL;
+2 -1
View File
@@ -153,8 +153,9 @@ spl_kthread_create(int (*func)(void *), void *data, const char namefmt[], ...)
if (PTR_ERR(tsk) == -ENOMEM)
continue;
return (NULL);
} else
} else {
return (tsk);
}
} while (1);
}
EXPORT_SYMBOL(spl_kthread_create);
-62
View File
@@ -641,68 +641,6 @@ vn_areleasef(int fd, uf_info_t *fip)
} /* releasef() */
EXPORT_SYMBOL(areleasef);
static void
#ifdef HAVE_SET_FS_PWD_WITH_CONST
vn_set_fs_pwd(struct fs_struct *fs, const struct path *path)
#else
vn_set_fs_pwd(struct fs_struct *fs, struct path *path)
#endif /* HAVE_SET_FS_PWD_WITH_CONST */
{
struct path old_pwd;
#ifdef HAVE_FS_STRUCT_SPINLOCK
spin_lock(&fs->lock);
old_pwd = fs->pwd;
fs->pwd = *path;
path_get(path);
spin_unlock(&fs->lock);
#else
write_lock(&fs->lock);
old_pwd = fs->pwd;
fs->pwd = *path;
path_get(path);
write_unlock(&fs->lock);
#endif /* HAVE_FS_STRUCT_SPINLOCK */
if (old_pwd.dentry)
path_put(&old_pwd);
}
int
vn_set_pwd(const char *filename)
{
struct path path;
mm_segment_t saved_fs;
int rc;
/*
* user_path_dir() and __user_walk() both expect 'filename' to be
* a user space address so we must briefly increase the data segment
* size to ensure strncpy_from_user() does not fail with -EFAULT.
*/
saved_fs = get_fs();
set_fs(KERNEL_DS);
rc = user_path_dir(filename, &path);
if (rc)
goto out;
rc = inode_permission(path.dentry->d_inode, MAY_EXEC | MAY_ACCESS);
if (rc)
goto dput_and_out;
vn_set_fs_pwd(current->fs, &path);
dput_and_out:
path_put(&path);
out:
set_fs(saved_fs);
return (-rc);
} /* vn_set_pwd() */
EXPORT_SYMBOL(vn_set_pwd);
static int
vn_cache_constructor(void *buf, void *cdrarg, int kmflags)
{
+3 -3
View File
@@ -592,8 +592,9 @@ fletcher_4_incremental_byteswap(void *buf, size_t size, void *data)
}
#if defined(_KERNEL)
/* Fletcher 4 kstats */
/*
* Fletcher 4 kstats
*/
static int
fletcher_4_kstat_headers(char *buf, size_t size)
{
@@ -669,7 +670,6 @@ fletcher_4_benchmark_impl(boolean_t native, char *data, uint64_t data_size)
zio_cksum_t zc;
uint32_t i, l, sel_save = IMPL_READ(fletcher_4_impl_chosen);
fletcher_checksum_func_t *fletcher_4_test = native ?
fletcher_4_native : fletcher_4_byteswap;
+21
View File
@@ -232,6 +232,27 @@ entity_namecheck(const char *path, namecheck_err_t *why, char *what)
}
}
if (*end == '\0' || *end == '/') {
int component_length = end - start;
/* Validate the contents of this component is not '.' */
if (component_length == 1) {
if (start[0] == '.') {
if (why)
*why = NAME_ERR_SELF_REF;
return (-1);
}
}
/* Validate the content of this component is not '..' */
if (component_length == 2) {
if (start[0] == '.' && start[1] == '.') {
if (why)
*why = NAME_ERR_PARENT_REF;
return (-1);
}
}
}
/* Snapshot or bookmark delimiter found */
if (*end == '@' || *end == '#') {
/* Multiple delimiters are not allowed */
+4
View File
@@ -1370,8 +1370,10 @@ abd_raidz_gen_iterate(abd_t **cabds, abd_t *dabd,
switch (parity) {
case 3:
len = MIN(caiters[2].iter_mapsize, len);
/* falls through */
case 2:
len = MIN(caiters[1].iter_mapsize, len);
/* falls through */
case 1:
len = MIN(caiters[0].iter_mapsize, len);
}
@@ -1461,9 +1463,11 @@ abd_raidz_rec_iterate(abd_t **cabds, abd_t **tabds,
case 3:
len = MIN(xiters[2].iter_mapsize, len);
len = MIN(citers[2].iter_mapsize, len);
/* falls through */
case 2:
len = MIN(xiters[1].iter_mapsize, len);
len = MIN(citers[1].iter_mapsize, len);
/* falls through */
case 1:
len = MIN(xiters[0].iter_mapsize, len);
len = MIN(citers[0].iter_mapsize, len);
+22 -12
View File
@@ -21,7 +21,7 @@
/*
* Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved.
* Copyright (c) 2018, Joyent, Inc.
* Copyright (c) 2011, 2018 by Delphix. All rights reserved.
* Copyright (c) 2011, 2019 by Delphix. All rights reserved.
* Copyright (c) 2014 by Saso Kiselkov. All rights reserved.
* Copyright 2017 Nexenta Systems, Inc. All rights reserved.
*/
@@ -1872,7 +1872,8 @@ arc_buf_try_copy_decompressed_data(arc_buf_t *buf)
* There were no decompressed bufs, so there should not be a
* checksum on the hdr either.
*/
EQUIV(!copied, hdr->b_l1hdr.b_freeze_cksum == NULL);
if (zfs_flags & ZFS_DEBUG_MODIFY)
EQUIV(!copied, hdr->b_l1hdr.b_freeze_cksum == NULL);
return (copied);
}
@@ -2253,7 +2254,6 @@ arc_buf_fill(arc_buf_t *buf, spa_t *spa, const zbookmark_phys_t *zb,
*/
if (arc_buf_try_copy_decompressed_data(buf)) {
/* Skip byteswapping and checksumming (already done) */
ASSERT3P(hdr->b_l1hdr.b_freeze_cksum, !=, NULL);
return (0);
} else {
error = zio_decompress_data(HDR_GET_COMPRESS(hdr),
@@ -4801,8 +4801,6 @@ arc_reduce_target_size(int64_t to_free)
if (c > to_free && c - to_free > arc_c_min) {
arc_c = c - to_free;
atomic_add_64(&arc_p, -(arc_p >> arc_shrink_shift));
if (asize < arc_c)
arc_c = MAX(asize, arc_c_min);
if (arc_p > arc_c)
arc_p = (arc_c >> 1);
ASSERT(arc_c >= arc_c_min);
@@ -5081,6 +5079,9 @@ arc_kmem_reap_soon(void)
static boolean_t
arc_adjust_cb_check(void *arg, zthr_t *zthr)
{
if (!arc_initialized)
return (B_FALSE);
/*
* This is necessary so that any changes which may have been made to
* many of the zfs_arc_* module parameters will be propagated to
@@ -5168,6 +5169,9 @@ arc_adjust_cb(void *arg, zthr_t *zthr)
static boolean_t
arc_reap_cb_check(void *arg, zthr_t *zthr)
{
if (!arc_initialized)
return (B_FALSE);
int64_t free_memory = arc_available_memory();
/*
@@ -5480,7 +5484,7 @@ static boolean_t
arc_is_overflowing(void)
{
/* Always allow at least one block of overflow */
uint64_t overflow = MAX(SPA_MAXBLOCKSIZE,
int64_t overflow = MAX(SPA_MAXBLOCKSIZE,
arc_c >> zfs_arc_overflow_shift);
/*
@@ -5492,7 +5496,7 @@ arc_is_overflowing(void)
* in the ARC. In practice, that's in the tens of MB, which is low
* enough to be safe.
*/
return (aggsum_lower_bound(&arc_size) >= arc_c + overflow);
return (aggsum_lower_bound(&arc_size) >= (int64_t)arc_c + overflow);
}
static abd_t *
@@ -5608,7 +5612,7 @@ arc_get_data_impl(arc_buf_hdr_t *hdr, uint64_t size, void *tag)
* If we are growing the cache, and we are adding anonymous
* data, and we have outgrown arc_p, update arc_p
*/
if (aggsum_compare(&arc_size, arc_c) < 0 &&
if (aggsum_upper_bound(&arc_size) < arc_c &&
hdr->b_l1hdr.b_state == arc_anon &&
(zfs_refcount_count(&arc_anon->arcs_size) +
zfs_refcount_count(&arc_mru->arcs_size) > arc_p))
@@ -7926,11 +7930,9 @@ arc_fini(void)
list_destroy(&arc_prune_list);
mutex_destroy(&arc_prune_mtx);
(void) zthr_cancel(arc_adjust_zthr);
zthr_destroy(arc_adjust_zthr);
(void) zthr_cancel(arc_adjust_zthr);
(void) zthr_cancel(arc_reap_zthr);
zthr_destroy(arc_reap_zthr);
mutex_destroy(&arc_adjust_lock);
cv_destroy(&arc_adjust_waiters_cv);
@@ -7943,6 +7945,14 @@ arc_fini(void)
buf_fini();
arc_state_fini();
/*
* We destroy the zthrs after all the ARC state has been
* torn down to avoid the case of them receiving any
* wakeup() signals after they are destroyed.
*/
zthr_destroy(arc_adjust_zthr);
zthr_destroy(arc_reap_zthr);
ASSERT0(arc_loaned_bytes);
}
@@ -8760,7 +8770,7 @@ l2arc_apply_transforms(spa_t *spa, arc_buf_hdr_t *hdr, uint64_t asize,
/*
* If this data simply needs its own buffer, we simply allocate it
* and copy the data. This may be done to elimiate a depedency on a
* and copy the data. This may be done to eliminate a dependency on a
* shared buffer or to reallocate the buffer to match asize.
*/
if (HDR_HAS_RABD(hdr) && asize != psize) {
+2 -2
View File
@@ -73,7 +73,7 @@ bqueue_enqueue(bqueue_t *q, void *data, uint64_t item_size)
mutex_enter(&q->bq_lock);
obj2node(q, data)->bqn_size = item_size;
while (q->bq_size + item_size > q->bq_maxsize) {
cv_wait(&q->bq_add_cv, &q->bq_lock);
cv_wait_sig(&q->bq_add_cv, &q->bq_lock);
}
q->bq_size += item_size;
list_insert_tail(&q->bq_list, data);
@@ -91,7 +91,7 @@ bqueue_dequeue(bqueue_t *q)
uint64_t item_size;
mutex_enter(&q->bq_lock);
while (q->bq_size == 0) {
cv_wait(&q->bq_pop_cv, &q->bq_lock);
cv_wait_sig(&q->bq_pop_cv, &q->bq_lock);
}
ret = list_remove_head(&q->bq_list);
ASSERT3P(ret, !=, NULL);
+2 -1
View File
@@ -2591,7 +2591,8 @@ dbuf_destroy(dmu_buf_impl_t *db)
if (db->db_blkid != DMU_BONUS_BLKID) {
boolean_t needlock = !MUTEX_HELD(&dn->dn_dbufs_mtx);
if (needlock)
mutex_enter(&dn->dn_dbufs_mtx);
mutex_enter_nested(&dn->dn_dbufs_mtx,
NESTED_SINGLE);
avl_remove(&dn->dn_dbufs, db);
atomic_dec_32(&dn->dn_dbufs_count);
membar_producer();
+13 -1
View File
@@ -21,6 +21,7 @@
/*
* Copyright (c) 2009, 2010, Oracle and/or its affiliates. All rights reserved.
* Copyright (c) 2018 by Delphix. All rights reserved.
*/
#include <sys/zfs_context.h>
@@ -117,7 +118,18 @@ ddt_zap_walk(objset_t *os, uint64_t object, ddt_entry_t *dde, uint64_t *walk)
zap_attribute_t za;
int error;
zap_cursor_init_serialized(&zc, os, object, *walk);
if (*walk == 0) {
/*
* We don't want to prefetch the entire ZAP object, because
* it can be enormous. Also the primary use of DDT iteration
* is for scrubbing, in which case we will be issuing many
* scrub I/Os for each ZAP block that we read in, so
* reading the ZAP is unlikely to be the bottleneck.
*/
zap_cursor_init_noprefetch(&zc, os, object);
} else {
zap_cursor_init_serialized(&zc, os, object, *walk);
}
if ((error = zap_cursor_retrieve(&zc, &za)) == 0) {
uchar_t cbuf[sizeof (dde->dde_phys) + 1];
uint64_t csize = za.za_num_integers;
+21 -30
View File
@@ -81,6 +81,13 @@ int zfs_dmu_offset_next_sync = 0;
*/
int zfs_object_remap_one_indirect_delay_ms = 0;
/*
* Limit the amount we can prefetch with one call to this amount. This
* helps to limit the amount of memory that can be used by prefetching.
* Larger objects should be prefetched a bit at a time.
*/
int dmu_prefetch_max = 8 * SPA_MAXBLOCKSIZE;
const dmu_object_type_info_t dmu_ot[DMU_OT_NUMTYPES] = {
{DMU_BSWAP_UINT8, TRUE, FALSE, FALSE, "unallocated" },
{DMU_BSWAP_ZAP, TRUE, TRUE, FALSE, "object directory" },
@@ -667,6 +674,11 @@ dmu_prefetch(objset_t *os, uint64_t object, int64_t level, uint64_t offset,
return;
}
/*
* See comment before the definition of dmu_prefetch_max.
*/
len = MIN(len, dmu_prefetch_max);
/*
* XXX - Note, if the dnode for the requested object is not
* already cached, we will do a *synchronous* read in the
@@ -719,8 +731,8 @@ get_next_chunk(dnode_t *dn, uint64_t *start, uint64_t minimum, uint64_t *l1blks)
uint64_t blks;
uint64_t maxblks = DMU_MAX_ACCESS >> (dn->dn_indblkshift + 1);
/* bytes of data covered by a level-1 indirect block */
uint64_t iblkrange =
dn->dn_datablksz * EPB(dn->dn_indblkshift, SPA_BLKPTRSHIFT);
uint64_t iblkrange = (uint64_t)dn->dn_datablksz *
EPB(dn->dn_indblkshift, SPA_BLKPTRSHIFT);
ASSERT3U(minimum, <=, *start);
@@ -2373,39 +2385,14 @@ dmu_offset_next(objset_t *os, uint64_t object, boolean_t hole, uint64_t *off)
return (err);
/*
* Check if there are dirty data blocks or frees which have not been
* synced. Dirty spill and bonus blocks which are external to the
* object can ignored when reporting holes.
* Check if dnode is dirty
*/
mutex_enter(&dn->dn_mtx);
for (i = 0; i < TXG_SIZE; i++) {
if (multilist_link_active(&dn->dn_dirty_link[i])) {
if (dn->dn_free_ranges[i] != NULL) {
clean = B_FALSE;
break;
}
list_t *list = &dn->dn_dirty_records[i];
dbuf_dirty_record_t *dr;
for (dr = list_head(list); dr != NULL;
dr = list_next(list, dr)) {
dmu_buf_impl_t *db = dr->dr_dbuf;
if (db->db_blkid == DMU_SPILL_BLKID ||
db->db_blkid == DMU_BONUS_BLKID)
continue;
clean = B_FALSE;
break;
}
}
if (clean == B_FALSE)
clean = B_FALSE;
break;
}
}
mutex_exit(&dn->dn_mtx);
/*
* If compatibility option is on, sync any current changes before
@@ -2654,6 +2641,10 @@ module_param(zfs_dmu_offset_next_sync, int, 0644);
MODULE_PARM_DESC(zfs_dmu_offset_next_sync,
"Enable forcing txg sync to find holes");
module_param(dmu_prefetch_max, int, 0644);
MODULE_PARM_DESC(dmu_prefetch_max,
"Limit one prefetch call to this size");
/* END CSTYLED */
#endif

Some files were not shown because too many files have changed in this diff Show More