Historically a dynamic misc minor number was registered for the
/dev/zfs device in order to prevent minor number collisions. This
was fine but it prevented us from being able to use the kernel
module auto-loaded which requires a known reserved value.
Resolve this issue by adding a configure test to find an available
misc minor number which can then be used in MODULE_ALIAS_MISCDEV at
build time. By adding this alias the zfs kmod is added to the list
of known static-nodes and the systemd-tmpfiles-setup-dev service
will create a /dev/zfs character device at boot time.
This in turn allows us to update the 90-zfs.rules file to make it
aware this is a static node. The upshot of this is that whenever
a process (zpool, zfs, zed) opens the /dev/zfs the kmods will be
automatic loaded. This even works for unprivileged users so there
is no longer a need to manually load the modules at boot time.
As an additional bonus the zed now no longer needs to start after
the zfs-import.service since it will trigger the module load.
In the unlikely event the minor number we selected conflicts with
another out of tree unregistered minor number the code falls back
to dynamically allocating it. In this case the modules again
must be manually loaded.
Note that due to the change in the method of registering the minor
number the zimport.sh test case may incorrectly fail when the
static node for the installed packages is created instead of the
dynamic one. This issue will only transiently impact zimport.sh
for this single commit when we transition and are mixing and
matching methods.
Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
TEST_ZIMPORT_SKIP="yes"
Closes#7287
This reverts commit 2a16d4cfaf.
The commit was causing a "attempt to access beyond the end
of device" error:
list.zfsonlinux.org/pipermail/zfs-discuss/2018-September/032217.html
This is a port of 047116ac - Raw sends must be able to decrease nlevels,
to the zfs-0.7-stable branch. It includes the various fixes to the
problem of receiving incremental streams which include reallocated dnodes
in which the number of dnode slots has changed but excludes the parts
which are related to raw streams.
From 047116ac:
Currently, when a raw zfs send file includes a
DRR_OBJECT record that would decrease the number of
levels of an existing object, the object is reallocated
with dmu_object_reclaim() which creates the new dnode
using the old object's nlevels. For non-raw sends this
doesn't really matter, but raw sends require that
nlevels on the receive side match that of the send
side so that the checksum-of-MAC tree can be properly
maintained. This patch corrects the issue by freeing
the object completely before allocating it again in
this case.
This patch also corrects several issues with
dnode_hold_impl() and related functions that prevented
dnodes (particularly multi-slot dnodes) from being
reallocated properly due to the fact that existing
dnodes were not being fully cleaned up when they
were freed.
This patch adds a test to make sure that zfs recv
functions properly with incremental streams containing
dnodes of different sizes.
This also includes a one-liner fix from loli10K to fix a test failure:
https://github.com/zfsonlinux/zfs/pull/7792#discussion_r212769264
Authored-by: Tom Caputi <tcaputi@datto.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tim Chase <tim@chase2k.com>
Ported-by: Tim Chase <tim@chase2k.com>
Closes#6821Closes#6864
NOTE: This is the first of the port of 3 related patches patches to the
zfs-0.7-release branch of ZoL. The other two patches should immediately
follow this one.
Fix a bunch of truncation compiler warnings that show up
on Fedora 28 (GCC 8.0.1).
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #7368Closes#7826Closes#7830
When receiving an incremental send stream with intermediary snapshots
zfs_receive_one() does not correctly identify the top-level dataset:
consequently we restore said snapshots as if they were children
datasets in the hierarchy, forcing inheritance of any property received
with 'zfs send -o' and effectively removing any locally set value.
The test case did not correctly verify this situation because it uses
adjacent snapshots, basically testing 'zfs send -i' instead of
'zfs send -I': this commit adds an additional intermediary snapshot to
the test script.
Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes#7478
Update the SA_COPY_DATA macro to check if architecture supports
efficient unaligned memory accesses at compile time. Otherwise
fallback to using the sa_copy_data() function.
The kernel provided CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is
used to determine availability in kernel space. In user space
the x86_64, x86, powerpc, and sometimes arm architectures will
define the HAVE_EFFICIENT_UNALIGNED_ACCESS macro.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#7642Closes#7684
1. Add a proc entry to display the pool's state:
$ cat /proc/spl/kstat/zfs/tank/state
ONLINE
This is done without using the spa config locks, so it will
never hang.
2. Fix 'zpool status' and 'zpool list -o health' output to print
"SUSPENDED" instead of "ONLINE" for suspended pools.
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes#7331Closes#7563
Update bdev_capacity to have wholedisk vdevs query the
size of the underlying block device (correcting for the size
of the efi parition and partition alignment) and therefore detect
expanded space.
Correct vdev_get_stats_ex so that the expandsize is aligned
to metaslab size and new space is only reported if it is large
enough for a new metaslab.
Reviewed by: Don Brady <don.brady@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: John Wren Kennedy <jwk404@gmail.com>
Signed-off-by: sara hartse <sara.hartse@delphix.com>
External-issue: LX-165
Closes#7546
Issue #7582
Commit torvalds/linux@95582b0 changes the inode i_atime, i_mtime,
and i_ctime members form timespec's to timespec64's to make them
2038 safe. As part of this change the current_time() function was
also updated to return the timespec64 type.
Resolve this issue by introducing a new inode_timespec_t type which
is defined to match the timespec type used by the inode. It should
be used when working with inode timestamps to ensure matching types.
The timestruc_t type under Illumos was used in a similar fashion but
was specified to always be a timespec_t. Rather than incorrectly
define this type all timespec_t types have been replaced by the new
inode_timespec_t type.
Finally, the kernel and user space 'sys/time.h' headers were aligned
with each other. They define as appropriate for the context several
constants as macros and include static inline implementation of
gethrestime(), gethrestime_sec(), and gethrtime().
Reviewed-by: Chunwei Chen <tuxoko@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#7643
Backported-by: Richard Yao <ryao@gentoo.org>
Fix a bunch of (mostly) sprintf/snprintf truncation compiler
warnings that show up on Fedora 28 (GCC 8.0.1).
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes#7361Closes#7368
Adds a devid for nvme devices. This is very similar to how the
other 'bus' (scsi|sata|usb) devids are generated. The devid
resides in a name/value pair in the leaf vdevs in a zpool config.
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Don Brady <don.brady@delphix.com>
Closes#7356
This treats /dev/nvme.. devices the same way as /dev/sd... devices. The
motivation behind this is that whole disk detection did not work on nvme
SSDs without that, because it DKC_UNKNOWN was returned for such devices.
Perhaps there should be a separate DKC_ type for this, but I don't know
enough about the code to know the implications of that.
Reviewed-by: Don Brady <don.brady@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: timor <timor.dd@googlemail.com>
Closes#7304
When the pool is suspended, record whether it was due to an I/O error or
due to MMP writes failing to succeed within the required time.
Change spa_suspended from uint8_t to zio_suspend_reason_t to store the
reason.
When userspace queries pool status via spa_tryimport(), report the
reason the pool was suspended in a new key,
ZPOOL_CONFIG_SUSPENDED_REASON.
In libzfs, when interpreting the returned config nvlist, report
suspension due to MMP with a new pool status enum value,
ZPOOL_STATUS_IO_FAILURE_MMP.
In status_callback(), which generates and emits the message when 'zpool
status' is executed, add a case to print an appropriate message for the
new pool status enum value.
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes#7296
When a single pool contains more vdevs than the CONFIG_HZ for
for the kernel the mmp thread will not delay properly. Switch
to using cv_timedwait_sig_hires() to handle higher resolution
delays.
This issue was reported on Arch Linux where HZ defaults to only
100 and this could be fairly easily reproduced with a reasonably
large pool. Most distribution kernels set CONFIG_HZ=250 or
CONFIG_HZ=1000 and thus are unlikely to be impacted.
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#7205Closes#7289
Receiving an incremental stream after an interrupted "zfs receive -s"
fails with the message "dataset is busy": this is because we still have
the hidden clone ../%recv from the resumable receive.
Improve the error message suggesting the existence of a partially
complete resumable stream from "zfs receive -s" which can be either
aborted ("zfs receive -A") or resumed ("zfs send -t").
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes#7129Closes#7154
When used in conjunction with one of '-e' or '-d' zfs receive options
none of the properties requested to be set (-o) are actually applied:
this is caused by a wrong assumption made about the toplevel dataset
in zfs_receive_one().
Fix this by correctly detecting the toplevel dataset.
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes#7088
Requires-spl: refs/pull/679/head
Resolve new warnings and errors from cppcheck v1.80.
* [lib/libshare/libshare.c:543]: (warning)
Possible null pointer dereference: protocol
* [lib/libzfs/libzfs_dataset.c:2323]: (warning)
Possible null pointer dereference: srctype
* [lib/libzfs/libzfs_import.c:318]: (error)
Uninitialized variable: link
* [module/zfs/abd.c:353]: (error) Uninitialized variable: sg
* [module/zfs/abd.c:353]: (error) Uninitialized variable: i
* [module/zfs/abd.c:385]: (error) Uninitialized variable: sg
* [module/zfs/abd.c:385]: (error) Uninitialized variable: i
* [module/zfs/abd.c:553]: (error) Uninitialized variable: i
* [module/zfs/abd.c:553]: (error) Uninitialized variable: sg
* [module/zfs/abd.c:763]: (error) Uninitialized variable: i
* [module/zfs/abd.c:763]: (error) Uninitialized variable: sg
* [module/zfs/abd.c:305]: (error) Uninitialized variable: tmp_page
* [module/zfs/zpl_xattr.c:342]: (warning)
Possible null pointer dereference: value
* [module/zfs/zvol.c:208]: (error) Uninitialized variable: p
Convert the following suppression to inline.
* [module/zfs/zfs_vnops.c:840]: (error)
Possible null pointer dereference: aiov
Exclude HAVE_UIO_ZEROCOPY and HAVE_DNLC from analysis since
these macro's will never be defined until this functionality
is implemented.
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#6879
Currently the function documentation states that two strings are
allocated, this is outdated. Only one char ** parameter is passed
into the function now, clearly only a pointer to a single string
is returned and needs to be free'd.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Closes#6754
The current code base almost compiles on SPARC, but a few fixes are
required for the code to compile (and work efficiently). Code in this
PR comes from OpenZFS project which was initially dropped when porting
the crypto framework.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Pengcheng Xu <i@jsteward.moe>
Closes#6733Closes#6738Closes#6750
Because resuming from a token requires "guid" -> "snapshot" mapping
we have to walk the whole dataset hierarchy to find the right snapshot
to send; when both source and destination exists, for an incremental
resumable stream, libzfs gets confused and picks up the wrong snapshot
to send from: this results in attempting to send
"destination@snap1 -> source@snap2"
instead of
"source@snap1 -> source@snap2"
which fails with a "Invalid cross-device link" error (EXDEV).
Fix this by adjusting the logic behind dataset traversal in
zfs_iter_children() to pick the right snapshot to send from.
Additionally update dry-run 'zfs send -t' to print its output to
stderr: this is consistent with other dry-run commands.
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes#6618Closes#6619Closes#6623
ZFS buildbot STYLE builder was moved to Ubuntu 17.04
which has a newer version of cppcheck. Handle the
new cppcheck errors.
uu_* functions removed in this commit were unused
and effectively dead code. They are now retired.
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes#6653
By default the mount(8) command, as invoked by 'zfs mount', will try
to resolve any path parameter in its canonical form: this could lead
to mount failures when the cwd contains a symlink having the same name
of the dataset being mounted.
Fix this by explicitly disabling mount(8) path canonicalization.
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes#1791Closes#6429Closes#6437
Turning the multihost property on requires that a hostid be set to allow
ZFS to determine when a foreign system is attemping to import a pool.
The error message instructing the user to set a hostid refers to
genhostid(1).
Genhostid(1) is not available on SUSE Linux. This commit adds a script
modeled after genhostid(1) for those users.
Zgenhostid checks for an /etc/hostid file; if it does not exist, it
creates one and stores a value. If the user has provided a hostid as an
argument, that value is used. Otherwise, a random hostid is generated
and stored.
This differs from the CENTOS 6/7 versions of genhostid, which overwrite
the /etc/hostid file even though their manpages state otherwise.
A man page for zgenhostid is added. The one for genhostid is in (1), but
I put zgenhostid in (8) because I believe it's more appropriate.
The mmp tests are modified to use zgenhostid to set the hostid instead
of using the spl_hostid module parameter. zgenhostid will not replace
an existing /etc/hostid file, so new mmp_clear_hostid calls are
required.
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes#6358Closes#6379
Add multihost=on|off pool property to control MMP. When enabled
a new thread writes uberblocks to the last slot in each label, at a
set frequency, to indicate to other hosts the pool is actively imported.
These uberblocks are the last synced uberblock with an updated
timestamp. Property defaults to off.
During tryimport, find the "best" uberblock (newest txg and timestamp)
repeatedly, checking for change in the found uberblock. Include the
results of the activity test in the config returned by tryimport.
These results are reported to user in "zpool import".
Allow the user to control the period between MMP writes, and the
duration of the activity test on import, via a new module parameter
zfs_multihost_interval. The period is specified in milliseconds. The
activity test duration is calculated from this value, and from the
mmp_delay in the "best" uberblock found initially.
Add a kstat interface to export statistics about Multiple Modifier
Protection (MMP) updates. Include the last synced txg number, the
timestamp, the delay since the last MMP update, the VDEV GUID, the VDEV
label that received the last MMP update, and the VDEV path. Abbreviated
output below.
$ cat /proc/spl/kstat/zfs/mypool/multihost
31 0 0x01 10 880 105092382393521 105144180101111
txg timestamp mmp_delay vdev_guid vdev_label vdev_path
20468 261337 250274925 68396651780 3 /dev/sda
20468 261339 252023374 6267402363293 1 /dev/sdc
20468 261340 252000858 6698080955233 1 /dev/sdx
20468 261341 251980635 783892869810 2 /dev/sdy
20468 261342 253385953 8923255792467 3 /dev/sdd
20468 261344 253336622 042125143176 0 /dev/sdab
20468 261345 253310522 1200778101278 2 /dev/sde
20468 261346 253286429 0950576198362 2 /dev/sdt
20468 261347 253261545 96209817917 3 /dev/sds
20468 261349 253238188 8555725937673 3 /dev/sdb
Add a new tunable zfs_multihost_history to specify the number of MMP
updates to store history for. By default it is set to zero meaning that
no MMP statistics are stored.
When using ztest to generate activity, for automated tests of the MMP
function, some test functions interfere with the test. For example, the
pool is exported to run zdb and then imported again. Add a new ztest
function, "-M", to alter ztest behavior to prevent this.
Add new tests to verify the new functionality. Tests provided by
Giuseppe Di Natale.
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Ned Bass <bass6@llnl.gov>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes#745Closes#6279
If no spl_hostid was set, and no /etc/hostid file existed, the user
and kernel would have different values for the hostid.
The kernel's would be 0. User space's would depend on the libc
implementation. On systems with glibc, it would be a generated value,
probably the first 4 bytes of an IP address (see man 3 gethostid and
comments above hostid_read in SPL for details).
This then causes the hostid stored in the labels and in the pool
config not to match the hostid userspace obtains from
get_system_hostid().
Since the kernel has no way to know the libc's generated hostid value,
it serves no purpose for ZFS to use the value.
This patch changes user space's get_system_hostid() to conform to the
kernel's method, first checking for the spl_hostid via sysfs, and then
reading from /etc/hostid directly.
It does not look up spl_hostid_path, because if that is set and the
file it pointed to exists, spl_hostid will reflect its contents.
It eliminates the call to libc's gethostid().
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Ned Bass <bass6@llnl.gov>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes#745Closes#6279
When VERIFY3_IMPL() was adjusted in 682ce104, the values of
the operands were omitted from the variadic arguments list.
This patch simply corrects this.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes#6343
Currently, there is no way to pause a scrub. Pausing may
be useful when the pool is busy with other I/O to preserve
bandwidth.
This patch adds the ability to pause and resume scrubbing.
This is achieved by maintaining a persistent on-disk scrub state.
While the state is 'paused' we do not scrub any more blocks.
We do however perform regular scan housekeeping such as
freeing async destroyed and deadlist blocks while paused.
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Thomas Caputi <tcaputi@datto.com>
Reviewed-by: Serapheim Dimitropoulos <serapheimd@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alek Pinchuk <apinchuk@datto.com>
Closes#6167
Musl libc's <stdio.h> doesn't include <stdarg.h>, which cause
`va_start` and `va_end` end up being undefined symbols.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Leorize <alaviss@users.noreply.github.com>
Closes#6310
Authored by: Andriy Gapon <avg@FreeBSD.org>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: Giuseppe Di Natale <dinatale2@llnl.gov>
The existing kernel-side code only provides a method to rollback to a
latest snapshot, whatever it happens to be at the time when the rollback
is actually done. That could be unsafe or confusing in environments
where concurrent DSL changes are possible as the resulting state could
correspond to a newer or older snapshot than the originally requested
one.
This change allows to amend that method such that the rollback is
performed only when the latest snapshot has a specific name. That is,
if a new snapshot is concurrently created or the target snapshot is
destroyed, then no rollback is done and EXDEV error is returned.
New libzfs_core function lzc_rollback_to() is provided for the new
functionality. libzfs is changed to use lzc_rollback_to() to implement
zfs rollback command.
Perhaps we should return different errors to distinguish the case where
the desired snapshot exists but it's not the latest snapshot and the
case where the desired snapshot does not exist.
OpenZFS-issue: https://www.illumos.org/issues/7600
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/3d645ebCloses#6292
Authored by: Sowrabha Gopal <sowrabha.gopal@delphix.com>
Reviewed by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Reviewed by: Igor Kozhukhov <igor@dilos.org>
Approved by: Robert Mustacchi <rm@joyent.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Ported-by: Giuseppe Di Natale <dinatale2@llnl.gov>
dir_is_empty_readdir() immediately returns if fdopendir() fails.
We should close dirfd when that happens.
OpenZFS-issue: https://www.illumos.org/issues/8430
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/e165e20Closes#6289
GCC 7.1 with will warn when we're not checking the snprintf()
return code in cases where the buffer could be truncated. This
patch either checks the snprintf return code (where applicable),
or simply disables the warnings (ztest.c).
Reviewed-by: Chunwei Chen <david.chen@osnexus.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes#6253
Authored by: Andrew Stormont <astormont@racktopsystems.com>
Reviewed by: Marcel Telka <marcel@telka.sk>
Reviewed by: Toomas Soome <tsoome@me.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Ported-by: Giuseppe Di Natale <dinatale2@llnl.gov>
OpenZFS-issue: https://www.illumos.org/issues/8331
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/4f4378cCloses#6255
This prints dashes instead of zeros for zero latency values in
'zpool iostat -p'. You'll get zero latencies reported when the
disk is idle, but technically a zero latency is invalid, since you
can't measure the latency of doing nothing.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes#6210
When spare or l2cache device path changes, zpool import will not fix up
their paths like normal vdev. The issue is that when you supply a pool
name argument to zpool import, it will use it to filter out device which
doesn't have the pool name in the label. Since spare and l2cache device
never have that in the label, they'll always get filtered out.
We fix this by making sure we never filter out a spare or l2cache
device.
Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes#6158
This addition will enable us to sync an open TXG to the main pool
on demand. The functionality is similar to 'sync(2)' but 'zpool sync'
will return when data has hit the main storage instead of potentially
just the ZIL as is the case with the 'sync(2)' cmd.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Alek Pinchuk <apinchuk@datto.com>
Closes#6122
This patch adds a '-f' option to 'zpool offline' to fault a vdev
instead of bringing it offline. Unlike the OFFLINE state, the
FAULTED state will trigger the FMA code, allowing for things like
autoreplace and triggering the slot fault LED. The -f faults
persist across imports, unless they were set with the temporary
(-t) flag. Both persistent and temporary faults can be cleared
with zpool clear.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes#6094
In glibc-2.23 <sys/sysmacros.h> isn't automatically included in
<sys/types.h> [1], so we need ot explicitely include it.
https://sourceware.org/ml/libc-alpha/2015-11/msg00253.html
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Justin Lecher <jlec@gentoo.org>
Closes#6132
This allows users to specify "-o property=value" to override and
"-x property" to exclude properties when receiving a zfs send stream.
Both native and user properties can be specified.
This is useful when using zfs send/receive for periodic
backup/replication because it lets users change properties such as
canmount, mountpoint, or compression without modifying the source.
References:
https://www.illumos.org/issues/2745https://www.illumos.org/issues/3753
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Alek Pinchuk <apinchuk@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes#1350Closes#5349
Document the existence of `createtxg` and `guid` native properties
in man pages and zfs command output.
One of the great features of ZFS is incremental replication of
snapshots, possibly between pools on different machines.
Shell scripts are commonly used to auomate this procedure. They have to
find the most recent common snapshot between both sides and then
perform incremental send & recv.
Currently, scripts rely on the sorting order of `zfs list`, which
defaults to `createtxg`, and the assumption that snapshot names on
either side do not change.
By making `createtxg` and `guid` part of the public ZFS interface,
scripts are enabled to use
a) `createtxg` to determine the logical & temporal order of snapshots
(the creation property is not an equivalent substitute since
multiple snapshots may be created within one second)
b) `guid` to uniquely identify a snapshot, independent of its current
display name
This has the potential of making scripts safer and correct.
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: DHE <git@dehacked.net>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Signed-off-by: Christian Schwarz <me@cschwarz.com>
Closes#6102
A race condition between 'zpool export' and 'zfs create' can crash the
latter: this is because we never check libzfs`zpool_open() return
value in libzfs`zfs_create().
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes#6096
zfsonlinux/spl@8f87971 added __spl_pf_fstrans_check for the xfs related
check, so we use them accordingly.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes#6113
This commit allow higher ashift values (up to 16) in 'zpool create'
The ashift value was previously limited to 13 (8K block) in b41c990
because the limited number of uberblocks we could fit in the
statically sized (128K) vdev label ring buffer could prevent the
ability the safely roll back a pool to recover it.
Since b02fe35 the largest uberblock size we support is 8K: this
allow us to store a minimum number of 16 uberblocks in the vdev
label, even with higher ashift values.
Additionally change 'ashift' pool property behaviour: if set it will
be used as the default hint value in subsequent vdev operations
('zpool add', 'attach' and 'replace'). A custom ashift value can still
be specified from the command line, if desired.
Finally, fix a bug in add-o_ashift.ksh caused by a missing variable.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes#2024Closes#4205Closes#4740Closes#5763
* Add zfs_nicebytes() to print human-readable sizes
Some 'zfs', 'zpool' and 'zdb' output strings can be confusing to the
user when no units are specified. This add a new zfs_nicenum_format
"ZFS_NICENUM_BYTES" used to print bytes in their human-readable form.
Additionally, update some test cases to use machine-parsable 'zfs get'.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes#2414Closes#3185Closes#3594Closes#6032
OpenZFS 7252 - compressed zfs send / receive
OpenZFS 7628 - create long versions of ZFS send / receive options
Authored by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed by: David Quigley <dpquigl@davequigley.com>
Reviewed by: Thomas Caputi <tcaputi@datto.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Reviewed by: David Quigley <dpquigl@davequigley.com>
Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Ported-by: bunder2015 <omfgbunder@gmail.com>
Ported-by: Don Brady <don.brady@intel.com>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
Porting Notes:
- Most of 7252 was already picked up during ABD work. This
commit represents the gap from the final commit to openzfs.
- Fixed split_large_blocks check in do_dump()
- An alternate version of the write_compressible() function was
implemented for Linux which does not depend on fio. The behavior
of fio differs significantly based on the exact version.
- mkholes was replaced with truncate for Linux.
OpenZFS-issue: https://www.illumos.org/issues/7252
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/5602294Closes#6067
zdb -e for active cache-less pools fails:
$ sudo zpool create -o cachefile=none basic mirror sdk sdl
$ sudo zdb -e -b basic
zdb: can't open 'basic': No such file or directory
This is a recent regression introduce by commit c30d8de.
Reviewed-by: Richard Yao <ryao@gentoo.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Don Brady <don.brady@intel.com>
Closes#6059
This patch updates the "zpool status/iostat -c" commands to only run
"pre-baked" scripts from the /etc/zfs/zpool.d directory (or wherever
you install to). The scripts can only be run from -c as an unprivileged
user (unless the ZPOOL_SCRIPTS_AS_ROOT environment var is
set by root). This was done to encourage scripts to be written is such
a way that normal users can use them, and to be cautious. If your
script needs to run a privileged command, consider adding the
appropriate line in /etc/sudoers. See zpool(8) for an example of how
to do this.
The patch also allows the scripts to output custom column names. If
the script outputs a line like:
name=value
then "name" is used for the column name, and "value" is its value.
Multiple columns can be specified by outputting multiple lines. Column
names and values can have spaces. If the value is empty, a dash (-) is
printed instead.
After all the "name=value" lines are read (if any), zpool will take the
next the next line of output (if any) and print it without a column
header. After that, no more lines will be processed. This can be
useful for printing errors.
Lastly, this patch also disables the -c option with the latency and
request size histograms, since it produced awkward output and made the
code harder to maintain.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes#5852
Fix a leak when generating a replication stream of a cloned dataset.
Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Tim Crawford <tcrawford@datto.com>
Closes#6034