It turns out that "do LZ4 and zstd-1 both fail" is a great heuristic
for "don't even bother trying higher zstd tiers".
By way of illustration:
$ cat /incompress | mbuffer | zfs recv -o compression=zstd-12 evenfaster/lowcomp_1M_zstd12_normal
summary: 39.8 GiByte in 3min 40.2sec - average of 185 MiB/s
$ echo 3 | sudo tee /sys/module/zzstd/parameters/zstd_lz4_pass
3
$ cat /incompress | mbuffer -m 4G | zfs recv -o compression=zstd-12 evenfaster/lowcomp_1M_zstd12_patched
summary: 39.8 GiByte in 48.6sec - average of 839 MiB/s
$ sudo zfs list -p -o name,used,lused,ratio evenfaster/lowcomp_1M_zstd12_normal evenfaster/lowcomp_1M_zstd12_patched
NAME USED LUSED RATIO
evenfaster/lowcomp_1M_zstd12_normal 39549931520 42721221632 1.08
evenfaster/lowcomp_1M_zstd12_patched 39626399744 42721217536 1.07
$ python3 -c "print(39626399744 - 39549931520)"
76468224
$
I'll take 76 MB out of 42 GB for > 4x speedup.
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes#13244
Makes getmntent and getmntany thread-safe for external consumers of
libzfs zpool_disable_datasets, zfs_iter_mounted, libzfs_mnttab_update,
libzfs_mnttab_find.
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes#13484
The bug this was working around is no longer present.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes#13480Closes#13490
When scrubbing/resilvering a pool it can be counter productive to
cancel the scan and kick of a replace operation to a hot spare
when encountering checksum errors. In this case, the best course
of action is to allow the scrub/resilver to complete as quickly
as possible and to keep the vdevs fully online if possible.
Realistically, this is less of an issue for a RAIDZ since a
traditional resilver must be used and checksums will be verified.
However, this is not the case for a mirror or dRAID pool which is
sequentially resilvered and checksum verification is deferred
until after the replace operation completes.
Regardless, we apply this policy to all pool types since it's
a good idea for all vdevs. Degrading additional vdevs has the
potential to make a bad situation worse. Note the checksum
errors will still be reported as both an event and by
`zpool status`. This change only prevents the ZED from
proactively taking any action.
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#13499
We want `zpool import` to be highly robust and never panic, even
when encountering corrupt metadata. This is already handled in the
arc_read() code path, which covers most cases, but spa_load_verify_cb()
relies on zio_read() and is responsible for verifying the block pointer.
During import it is also possible to encounter blocks pointers which
contain ZIO_COMPRESS_INHERIT and ZIO_CHECKSUM_INHERIT values. Relax
the verification function slightly to allow this.
Futhermore, extend dsl_scan_recurse() to verify the block pointer
contents of level zero blocks which are not of type DMU_OT_DNODE or
DMU_OT_OBJSET. This is handled by arc_read() in the other cases.
Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#13124Closes#13360
The SA attribute containing the symlink target does not include a nul
terminator, so when printing the target zdb would sometimes include
garbage at the end of the string.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes#13482
This also expands the zfs version output from 127 characters to However
Many Are Actually Set
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes#13330
The short-path is now one access() call,
we always modprobe zfs (ZFS_MODULE_LOADING which doesn't use the libzfs
boolean parsing is gone),
and we use a simple inotify IN_CREATE loop with a timerfd timeout
rather than 10ms kernel-style polling
There's one substantial difference: ZFS_MODULE_TIMEOUT=-1
now means "never give up", rather than "wait 10 minutes"
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes#13330
Clang trunk now warns -Wstrict-prototypes on this, and they're removed
in C2x
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes#13447
Some minimal MUAs don't support passing the subjects as cmdline option.
This commit checks if "@SUBJECT@" is missing in ZED_EMAIL_OPTS and then
prepends a subject header to the notification message.
Also set a default for ${subject}.
Reviewed-by: Ahelenia Ziemia<C5><84>ska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Daniel Hiepler <d-git@coderdu.de>
Closes#13440
There are times when end-users may wish to have
a fast and convenient method to get zpool guid
without having to use libzfs. This commit
exposes the zpool guid via kstats in similar
manner to the zpool state.
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andrew Walker <awalker@ixsystems.com>
Closes#13466
The compiler appears to be expanding the unused NULL pointer into a
zero-length array via the inline bitops code. When -Werror=array-bounds
is used, this causes a build failure. Recommended solution is allocate
temporary structures, fill with zeros (to avoid uninitialized data use
warnings), and pass the pointer to those to the inline calls.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes#13463Closes#13465
Recognise initial whitespace, + in both cases,
and - also in unsigneds
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes#13434
With the additional benefit of removing all the _all() functions and
treating a NULL list as "all" ‒ the remaining all function is for all
/datasets/, which is consistent with the rest of the API
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes#13165
This makes it so we don't leak a consistent 64 bytes anymore,
makes the searches simpler and faster, removes /all allocations/
from the driver (quite trivially, since they were absolutely needless),
and makes libshare thread-safe (except, maybe, linux/smb, but that only
does pointer-width loads/stores so it's also mostly fine, except for
leaking smb_shares)
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes#13165
This renders it thread-safe
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes#13165
When the HAVE_IOPS_MKDIR_USERNS check fails output result
as required.
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#13454
The zloop.sh script is primarily designed to randomly stress
the DMU and SPA layers. This can result in some unrealistic
(or even impossible) scenarios being tested which then fail.
Since the longer we run zloop.sh the more likely this is to occur
this commit reduces the runtime. The intention being that normally
this will result in a clean CI run unless the PR does introduce
serious breaking change.
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#13453
Linux passes -Wframe-larger-than=1024, which breaks
our build in a number of places with -Werror.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes#13450
This changes the behaviour of -B from the illumos one which would,
in the example in the manual, take just ./chroots/lenny;
this, however, is more versatile, and scales much better for systems
with ZFS in /usr/local, for example
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes#13411Closes#1770
Even on Illumos it's only available in the 32-bit programming
environment, and, quoth enable_extended_FILE_stdio(3C):
> Historically, 32-bit Solaris applications have been limited to using
> only the file descriptors 0 through 255 with the standard I/O
> functions (see stdio(3C)) in the C library. The extended FILE
> facility allows well-behaved 32-bit applications to use any
> valid file descriptor with the standard I/O functions.
where "well-behaved" means that it
> does not directly access any fields in the FILE structure pointed
> to by the FILE pointer associated with any standard I/O stream,
And the stdio/flush.c implementation reads:
/*
* if this is not an internal extended FILE then check
* if _file is being changed from underneath us.
* It should not be because if
* it is then then we lose our ability to guard against
* silent data corruption.
*/
if (!iop->__xf_nocheck && bad_fd > -1 && iop->_magic != bad_fd) {
(void) fprintf(stderr,
"Application violated extended FILE safety mechanism.\n"
"Please read the man page for extendedFILE.\nAborting\n");
abort();
}
This appears to be an insane workaround for broken implementation with
exposed FILE internals and _file being an u8, both only on non-LP64;
it's shimmed out on all LP64 targets in Illumos,
and we shim it out as well: just get rid of it
This appears to've been originally fixed in illumos-gate
a5f69788de7ac07553de47f7fec8c05a9a94c105 ("PSARC 2006/162 Extended FILE
space for 32-bit Solaris processes", "1085341 32-bit stdio routines
should support file descriptors >255"), which also bears extendedFILE
and enable_extended_FILE_stdio(3C):
- unsigned char _file; /* UNIX System file descriptor */
+ unsigned char _magic; /* Old home of the file descriptor */
+ /* Only fileno(3C) can retrieve the
value now */
and
+/*
+ * Macros to aid the extended fd FILE work.
+ * This helps isolate the changes to only the 32-bit code
+ * since 64-bit Solaris is not affected by this.
+ */
+#ifdef _LP64
+#define GET_FD(iop) ((iop)->_file)
+#define SET_FILE(iop, fd) ((iop)->_file = (fd))
+#else
+#define GET_FD(iop) \
+ (((iop)->__extendedfd) ? _file_get(iop) : (iop)->_magic)
+#define SET_FILE(iop, fd) (iop)->_magic = (fd); (iop)->__extendedfd = 0
+#endif
Also remove the 1k setrlimit(NOFILE) calls: that's the default on Linux,
with 64k on Illumos and 171k on FreeBSD
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes#13411