/*
 * CDDL HEADER START
 *
 * This file and its contents are supplied under the terms of the
 * Common Development and Distribution License ("CDDL"), version 1.0.
 * You may only use this file in accordance with the terms of version
 * 1.0 of the CDDL.
 *
 * A full copy of the text of the CDDL should have accompanied this
 * source. A copy of the CDDL is also available via the Internet at
 * http://www.illumos.org/license/CDDL.
 *
 * CDDL HEADER END
 */

/*
 * Copyright (c) 2016, 2018 by Delphix. All rights reserved.
 */

/*
 * ZFS Channel Programs (ZCP)
 *
 * The ZCP interface allows ZFS administrative operations (e.g. creating and
 * destroying snapshots, typically performed via an ioctl to /dev/zfs by the
 * zfs(8) command and libzfs/libzfs_core) to be run programmatically as a Lua
 * script. A ZCP script is run as a dsl_sync_task and fully executed during
 * one transaction group sync. This ensures that no other changes can be
 * written concurrently with a running Lua script. Combining multiple calls to
 * the exposed ZFS functions into one script gives a number of benefits:
 *
 * 1. Atomicity. For some compound or iterative operations, it's useful to be
 * able to guarantee that the state of a pool has not changed between calls to
 * ZFS.
 *
 * 2. Performance. If a large number of changes need to be made (e.g. deleting
 * many filesystems), there can be a significant performance penalty as a
 * result of the need to wait for a transaction group sync to pass for every
 * single operation. When expressed as a single ZCP script, all these changes
 * can be performed at once in one txg sync.
 *
 * A modified version of the Lua 5.2 interpreter is used to run channel program
 * scripts. The Lua 5.2 manual can be found at:
 *
 * http://www.lua.org/manual/5.2/
 *
 * If being run by a user (via an ioctl syscall), executing a ZCP script
 * requires root privileges in the global zone.
 *
 * Scripts are passed to zcp_eval() as a string, then run in a synctask by
 * zcp_eval_sync(). Arguments can be passed into the Lua script as an nvlist,
 * which will be converted to a Lua table. Similarly, values returned from
 * a ZCP script will be converted to an nvlist. See zcp_lua_to_nvlist_impl()
 * for details on the exact allowed types and conversions.
 *
 * ZFS functionality is exposed to a ZCP script as a library of function calls.
 * These calls are sorted into submodules, such as zfs.list and zfs.sync, for
 * iterators and synctasks, respectively. Each of these submodules resides in
 * its own source file, with a zcp_*_info structure describing each library
 * call in the submodule.
 *
 * Error handling in ZCP scripts is handled by a number of different methods
 * based on severity:
 *
 * 1. Memory and time limits are in place to prevent a channel program from
 * consuming excessive system memory or running forever. If one of these
 * limits is hit, the channel program will be stopped immediately and return
 * from zcp_eval() with an error code. No attempt will be made to roll back
 * or undo any changes made by the channel program before the error occurred.
 * Consumers invoking zcp_eval() from elsewhere in the kernel may pass a time
 * limit of 0, disabling the time limit.
 *
 * 2. Internal Lua errors can occur as a result of a syntax error, calling a
 * library function with incorrect arguments, invoking the error() function,
 * failing an assert(), or other runtime errors. In these cases the channel
 * program will stop executing and return from zcp_eval() with an error code.
 * In place of a return value, an error message will also be returned in the
 * 'result' nvlist containing information about the error. No attempt will be
 * made to roll back or undo any changes made by the channel program before the
 * error occurred.
 *
 * 3. If an error occurs inside a ZFS library call which returns an error code,
 * the error is returned to the Lua script to be handled as desired.
 *
 * In the first two cases, Lua's error-throwing mechanism is used, which
 * longjumps out of the script execution with luaL_error() and returns with the
 * error.
 *
 * See zfs-program(8) for more information on high level usage.
 */
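
/*
 * For illustration only: a channel program is an ordinary Lua script.
 * A hypothetical script that atomically destroys a list of snapshots
 * passed in via the arguments nvlist might look like:
 *
 *	args = ...
 *	argv = args["argv"]
 *	for _, snap in ipairs(argv) do
 *		err = zfs.sync.destroy(snap)
 *		assert(err == 0, "failed to destroy " .. snap)
 *	end
 *	return argv
 *
 * See zfs-program(8) for the authoritative description of the Lua
 * environment and the available zfs.* library calls.
 */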

#include <sys/lua/lua.h>
#include <sys/lua/lualib.h>
#include <sys/lua/lauxlib.h>

#include <sys/dsl_prop.h>
#include <sys/dsl_synctask.h>
#include <sys/dsl_dataset.h>
#include <sys/zcp.h>
#include <sys/zcp_iter.h>
#include <sys/zcp_prop.h>
#include <sys/zcp_global.h>
#include <sys/zvol.h>

#ifndef KM_NORMALPRI
#define	KM_NORMALPRI	0
#endif

#define	ZCP_NVLIST_MAX_DEPTH 20

static const uint64_t zfs_lua_check_instrlimit_interval = 100;

uint64_t zfs_lua_max_instrlimit = ZCP_MAX_INSTRLIMIT;
uint64_t zfs_lua_max_memlimit = ZCP_MAX_MEMLIMIT;

/*
 * Forward declarations for mutually recursive functions
 */
static int zcp_nvpair_value_to_lua(lua_State *, nvpair_t *, char *, int);
static int zcp_lua_to_nvlist_impl(lua_State *, int, nvlist_t *, const char *,
    int);

/*
 * The outer-most error callback handler for use with lua_pcall(). On
 * error Lua will call this callback with a single argument that
 * represents the error value. In most cases this will be a string
 * containing an error message, but channel programs can use Lua's
 * error() function to return arbitrary objects as errors. This callback
 * returns (on the Lua stack) the original error object along with a traceback.
 *
 * Fatal Lua errors can occur while resources are held, so we also call any
 * registered cleanup function here.
 */
static int
zcp_error_handler(lua_State *state)
{
	const char *msg;

	zcp_cleanup(state);

	VERIFY3U(1, ==, lua_gettop(state));
	msg = lua_tostring(state, 1);
	luaL_traceback(state, state, msg, 1);
	return (1);
}

int
zcp_argerror(lua_State *state, int narg, const char *msg, ...)
{
	va_list alist;

	va_start(alist, msg);
	const char *buf = lua_pushvfstring(state, msg, alist);
	va_end(alist);

	return (luaL_argerror(state, narg, buf));
}

/*
 * Install a new cleanup function, which will be invoked with the given
 * opaque argument if a fatal error causes the Lua interpreter to longjump out
 * of a function call.
 *
 * If an error occurs, the cleanup function will be invoked exactly once and
 * then unregistered.
 *
 * Returns the registered cleanup handler so the caller can deregister it
 * if no error occurs.
 */
zcp_cleanup_handler_t *
zcp_register_cleanup(lua_State *state, zcp_cleanup_t cleanfunc, void *cleanarg)
{
	zcp_run_info_t *ri = zcp_run_info(state);

	zcp_cleanup_handler_t *zch = kmem_alloc(sizeof (*zch), KM_SLEEP);
	zch->zch_cleanup_func = cleanfunc;
	zch->zch_cleanup_arg = cleanarg;
	list_insert_head(&ri->zri_cleanup_handlers, zch);

	return (zch);
}

void
zcp_deregister_cleanup(lua_State *state, zcp_cleanup_handler_t *zch)
{
	zcp_run_info_t *ri = zcp_run_info(state);
	list_remove(&ri->zri_cleanup_handlers, zch);
	kmem_free(zch, sizeof (*zch));
}

/*
 * Execute the currently registered cleanup handlers then free them and
 * destroy the handler list.
 */
void
zcp_cleanup(lua_State *state)
{
	zcp_run_info_t *ri = zcp_run_info(state);

	for (zcp_cleanup_handler_t *zch =
	    list_remove_head(&ri->zri_cleanup_handlers); zch != NULL;
	    zch = list_remove_head(&ri->zri_cleanup_handlers)) {
		zch->zch_cleanup_func(zch->zch_cleanup_arg);
		kmem_free(zch, sizeof (*zch));
	}
}

/*
 * Convert the lua table at the given index on the Lua stack to an nvlist
 * and return it.
 *
 * If the table can not be converted for any reason, NULL is returned and
 * an error message is pushed onto the Lua stack.
 */
static nvlist_t *
zcp_table_to_nvlist(lua_State *state, int index, int depth)
{
	nvlist_t *nvl;
	/*
	 * Converting a Lua table to an nvlist with key uniqueness checking is
	 * O(n^2) in the number of keys in the nvlist, which can take a long
	 * time when we return a large table from a channel program.
	 * Furthermore, Lua's table interface *almost* guarantees unique keys
	 * on its own (details below). Therefore, we don't use fnvlist_alloc()
	 * here to avoid the built-in uniqueness checking.
	 *
	 * The *almost* is because it's possible to have key collisions between
	 * e.g. the string "1" and the number 1, or the string "true" and the
	 * boolean true, so we explicitly check that when we're looking at a
	 * key which is an integer / boolean or a string that can be parsed as
	 * one of those types. In the worst case this could still devolve into
	 * O(n^2), so we only start doing these checks on boolean/integer keys
	 * once we've seen a string key which fits this weird usage pattern.
	 *
	 * Ultimately, we still want callers to know that the keys in this
	 * nvlist are unique, so before we return this we set the nvlist's
	 * flags to reflect that.
	 */
	VERIFY0(nvlist_alloc(&nvl, 0, KM_SLEEP));

	/*
	 * Push an empty stack slot where lua_next() will store each
	 * table key.
	 */
	lua_pushnil(state);
	boolean_t saw_str_could_collide = B_FALSE;
	while (lua_next(state, index) != 0) {
		/*
		 * The next key-value pair from the table at index is
		 * now on the stack, with the key at stack slot -2 and
		 * the value at slot -1.
		 */
		int err = 0;
		char buf[32];
		const char *key = NULL;
		boolean_t key_could_collide = B_FALSE;

		switch (lua_type(state, -2)) {
		case LUA_TSTRING:
			key = lua_tostring(state, -2);

			/* check if this could collide with a number or bool */
			long long tmp;
			int parselen;
			if ((sscanf(key, "%lld%n", &tmp, &parselen) > 0 &&
			    parselen == strlen(key)) ||
			    strcmp(key, "true") == 0 ||
			    strcmp(key, "false") == 0) {
				key_could_collide = B_TRUE;
				saw_str_could_collide = B_TRUE;
			}
			break;
		case LUA_TBOOLEAN:
			key = (lua_toboolean(state, -2) == B_TRUE ?
			    "true" : "false");
			if (saw_str_could_collide) {
				key_could_collide = B_TRUE;
			}
			break;
		case LUA_TNUMBER:
			(void) snprintf(buf, sizeof (buf), "%lld",
			    (longlong_t)lua_tonumber(state, -2));
			key = buf;
			if (saw_str_could_collide) {
				key_could_collide = B_TRUE;
			}
			break;
		default:
			fnvlist_free(nvl);
			(void) lua_pushfstring(state, "Invalid key "
			    "type '%s' in table",
			    lua_typename(state, lua_type(state, -2)));
			return (NULL);
		}
		/*
		 * Check for type-mismatched key collisions, and throw an error.
		 */
		if (key_could_collide && nvlist_exists(nvl, key)) {
			fnvlist_free(nvl);
			(void) lua_pushfstring(state, "Collision of "
			    "key '%s' in table", key);
			return (NULL);
		}
		/*
		 * Recursively convert the table value and insert into
		 * the new nvlist with the parsed key. To prevent
		 * stack overflow on circular or heavily nested tables,
		 * we track the current nvlist depth.
		 */
		if (depth >= ZCP_NVLIST_MAX_DEPTH) {
			fnvlist_free(nvl);
			(void) lua_pushfstring(state, "Maximum table "
			    "depth (%d) exceeded for table",
			    ZCP_NVLIST_MAX_DEPTH);
			return (NULL);
		}
		err = zcp_lua_to_nvlist_impl(state, -1, nvl, key,
		    depth + 1);
		if (err != 0) {
			fnvlist_free(nvl);
			/*
			 * Error message has been pushed to the lua
			 * stack by the recursive call.
			 */
			return (NULL);
		}
		/*
		 * Pop the value pushed by lua_next().
		 */
		lua_pop(state, 1);
	}

	/*
	 * Mark the nvlist as having unique keys. This is a little ugly, but we
	 * ensured above that there are no duplicate keys in the nvlist.
	 */
	nvl->nvl_nvflag |= NV_UNIQUE_NAME;

	return (nvl);
}
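
/*
 * Hypothetical illustration of the collision check in zcp_table_to_nvlist():
 * the Lua table { [1] = "a", ["1"] = "b" } has two distinct Lua keys, but
 * both map to the nvlist key "1", so the conversion fails with a
 * "Collision of key '1' in table" error rather than silently producing an
 * nvlist with duplicate keys.
 */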

/*
 * Convert a value from the given index into the lua stack to an nvpair, adding
 * it to an nvlist with the given key.
 *
 * Values are converted as follows:
 *
 * string -> string
 * number -> int64
 * boolean -> boolean
 * nil -> boolean (no value)
 *
 * Lua tables are converted to nvlists and then inserted. The table's keys
 * are converted to strings then used as keys in the nvlist to store each table
 * element. Keys are converted as follows:
 *
 * string -> no change
 * number -> "%lld"
 * boolean -> "true" | "false"
 * nil -> error
 *
 * In the case of a key collision, an error is thrown.
 *
 * If an error is encountered, a nonzero error code is returned, and an error
 * string will be pushed onto the Lua stack.
 */
static int
zcp_lua_to_nvlist_impl(lua_State *state, int index, nvlist_t *nvl,
    const char *key, int depth)
{
	/*
	 * Verify that we have enough remaining space in the lua stack to parse
	 * a key-value pair and push an error.
	 */
	if (!lua_checkstack(state, 3)) {
		(void) lua_pushstring(state, "Lua stack overflow");
		return (1);
	}

	index = lua_absindex(state, index);

	switch (lua_type(state, index)) {
	case LUA_TNIL:
		fnvlist_add_boolean(nvl, key);
		break;
	case LUA_TBOOLEAN:
		fnvlist_add_boolean_value(nvl, key,
		    lua_toboolean(state, index));
		break;
	case LUA_TNUMBER:
		fnvlist_add_int64(nvl, key, lua_tonumber(state, index));
		break;
	case LUA_TSTRING:
		fnvlist_add_string(nvl, key, lua_tostring(state, index));
		break;
	case LUA_TTABLE: {
		nvlist_t *value_nvl = zcp_table_to_nvlist(state, index, depth);
		if (value_nvl == NULL)
			return (SET_ERROR(EINVAL));

		fnvlist_add_nvlist(nvl, key, value_nvl);
		fnvlist_free(value_nvl);
		break;
	}
	default:
		(void) lua_pushfstring(state,
		    "Invalid value type '%s' for key '%s'",
		    lua_typename(state, lua_type(state, index)), key);
		return (SET_ERROR(EINVAL));
	}

	return (0);
}
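
/*
 * Illustration of the conversion rules above (hypothetical example, not
 * used by the code): the Lua table
 *
 *	{ foo = "bar", [5] = true, nested = { x = 1 } }
 *
 * becomes an nvlist holding the string "bar" under key "foo", a boolean
 * value under key "5" (the number key is rendered via "%lld"), and a
 * nested nvlist under key "nested" holding the int64 value 1 under
 * key "x".
 */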

/*
 * Convert a lua value to an nvpair, adding it to an nvlist with the given key.
 */
static void
zcp_lua_to_nvlist(lua_State *state, int index, nvlist_t *nvl, const char *key)
{
	/*
	 * On error, zcp_lua_to_nvlist_impl pushes an error string onto the Lua
	 * stack before returning with a nonzero error code. If an error is
	 * returned, throw a fatal lua error with the given string.
	 */
	if (zcp_lua_to_nvlist_impl(state, index, nvl, key, 0) != 0)
		(void) lua_error(state);
}
|
|
|
|
|
OpenZFS 9424 - ztest failure: "unprotected error in call to Lua API (Invalid value type 'function' for key 'error')"
Ztest failed with the following crash.
::status
debugging core file of ztest (64-bit) from clone-dc-slave-280-bc7947b1.dcenter
file: /usr/bin/amd64/ztest
initial argv: /usr/bin/amd64/ztest
threading model: raw lwps
status: process terminated by SIGABRT (Abort), pid=2150 uid=1025 code=-1
panic message: failure for thread 0xfffffd7fff112a40, thread-id 1: unprotected error in call to Lua API (Invalid
value type 'function' for key 'error')
::stack
libc.so.1`_lwp_kill+0xa()
libc.so.1`_assfail+0x182(fffffd7fffdfe8d0, 0, 0)
libc.so.1`assfail+0x19(fffffd7fffdfe8d0, 0, 0)
libzpool.so.1`vpanic+0x3d(fffffd7ffaa58c20, fffffd7fffdfeb00)
0xfffffd7ffaa28146()
0xfffffd7ffaa0a109()
libzpool.so.1`luaD_throw+0x86(3011a48, 2)
0xfffffd7ffa9350d3()
0xfffffd7ffa93e3f1()
libzpool.so.1`zcp_lua_to_nvlist+0x33(3011a48, 1, 2686470, fffffd7ffaa2e2c3)
libzpool.so.1`zcp_convert_return_values+0xa4(3011a48, 2686470, fffffd7ffaa2e2c3, fffffd7fffdfedd0)
libzpool.so.1`zcp_pool_error+0x59(fffffd7fffdfedd0, 1e0f450)
libzpool.so.1`zcp_eval+0x6f8(1e0f450, fffffd7ffaa483f8, 1, 0, 6400000, 1d33b30)
libzpool.so.1`dsl_destroy_snapshots_nvl+0x12c(2786b60, 0, 484750)
libzpool.so.1`dsl_destroy_snapshot+0x4f(fffffd7fffdfef70, 0)
ztest_dsl_dataset_cleanup+0xea(fffffd7fffdff4c0, 1)
ztest_dataset_destroy+0x53(1)
ztest_run+0x59f(fffffd7fff0e0498)
main+0x7ff(1, fffffd7fffdffa88)
_start+0x6c()
The problem is that zcp_convert_return_values() assumes that there's
exactly one value on the stack, but that isn't always true. It ends up
putting the wrong thing on the stack which is then consumed by
zcp_convert_return values, which either adds the wrong message to the
nvlist, or blows up.
The fix is to make sure that callers of zcp_convert_return_values()
clear the stack before pushing their error message, and
zcp_convert_return_values() should VERIFY that the stack is the expected
size.
Authored by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Don Brady <don.brady@delphix.com>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
Approved by: Robert Mustacchi <rm@joyent.com>
OpenZFS-issue: https://www.illumos.org/issues/9424
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/eb7e57429
Closes #7696
static int
zcp_lua_to_nvlist_helper(lua_State *state)
{
	nvlist_t *nv = (nvlist_t *)lua_touserdata(state, 2);
	const char *key = (const char *)lua_touserdata(state, 1);
	zcp_lua_to_nvlist(state, 3, nv, key);
	return (0);
}

static void
zcp_convert_return_values(lua_State *state, nvlist_t *nvl,
    const char *key, int *result)
{
	int err;
	VERIFY3U(1, ==, lua_gettop(state));
	lua_pushcfunction(state, zcp_lua_to_nvlist_helper);
	lua_pushlightuserdata(state, (char *)key);
	lua_pushlightuserdata(state, nvl);
	lua_pushvalue(state, 1);
	lua_remove(state, 1);
	err = lua_pcall(state, 3, 0, 0); /* zcp_lua_to_nvlist_helper */
	if (err != 0) {
		zcp_lua_to_nvlist(state, 1, nvl, ZCP_RET_ERROR);
		*result = SET_ERROR(ECHRNG);
	}
}

/*
 * Push a Lua table representing nvl onto the stack.  If it can't be
 * converted, return EINVAL, fill in errbuf, and push nothing.  errbuf may
 * be specified as NULL, in which case no error string will be output.
 *
 * Most nvlists are converted as simple key->value Lua tables, but we make
 * an exception for the case where all nvlist entries are BOOLEANs (a string
 * key without a value).  In Lua, a table key pointing to a value of Nil
 * (no value) is equivalent to the key not existing, so a BOOLEAN nvlist
 * entry can't be directly converted to a Lua table entry.  Nvlists of
 * entirely BOOLEAN entries are frequently used to pass around lists of
 * datasets, so for convenience we check for this case, and convert it to
 * a simple Lua array of strings.
 */
int
zcp_nvlist_to_lua(lua_State *state, nvlist_t *nvl,
    char *errbuf, int errbuf_len)
{
	nvpair_t *pair;
	lua_newtable(state);
	boolean_t has_values = B_FALSE;
	/*
	 * If the list doesn't have any values, just convert it to a string
	 * array.
	 */
	for (pair = nvlist_next_nvpair(nvl, NULL);
	    pair != NULL; pair = nvlist_next_nvpair(nvl, pair)) {
		if (nvpair_type(pair) != DATA_TYPE_BOOLEAN) {
			has_values = B_TRUE;
			break;
		}
	}
	if (!has_values) {
		int i = 1;
		for (pair = nvlist_next_nvpair(nvl, NULL);
		    pair != NULL; pair = nvlist_next_nvpair(nvl, pair)) {
			(void) lua_pushinteger(state, i);
			(void) lua_pushstring(state, nvpair_name(pair));
			(void) lua_settable(state, -3);
			i++;
		}
	} else {
		for (pair = nvlist_next_nvpair(nvl, NULL);
		    pair != NULL; pair = nvlist_next_nvpair(nvl, pair)) {
			int err = zcp_nvpair_value_to_lua(state, pair,
			    errbuf, errbuf_len);
			if (err != 0) {
				lua_pop(state, 1);
				return (err);
			}
			(void) lua_setfield(state, -2, nvpair_name(pair));
		}
	}
	return (0);
}

/*
 * Push a Lua object representing the value of "pair" onto the stack.
 *
 * Only understands boolean_value, string, int64, nvlist, string_array,
 * uint64_array, and int64_array type values.  For other types, returns
 * EINVAL, fills in errbuf, and pushes nothing.
 */
static int
zcp_nvpair_value_to_lua(lua_State *state, nvpair_t *pair,
    char *errbuf, int errbuf_len)
{
	int err = 0;

	if (pair == NULL) {
		lua_pushnil(state);
		return (0);
	}

	switch (nvpair_type(pair)) {
	case DATA_TYPE_BOOLEAN_VALUE:
		(void) lua_pushboolean(state,
		    fnvpair_value_boolean_value(pair));
		break;
	case DATA_TYPE_STRING:
		(void) lua_pushstring(state, fnvpair_value_string(pair));
		break;
	case DATA_TYPE_INT64:
		(void) lua_pushinteger(state, fnvpair_value_int64(pair));
		break;
	case DATA_TYPE_NVLIST:
		err = zcp_nvlist_to_lua(state,
		    fnvpair_value_nvlist(pair), errbuf, errbuf_len);
		break;
	case DATA_TYPE_STRING_ARRAY: {
		const char **strarr;
		uint_t nelem;
		(void) nvpair_value_string_array(pair, &strarr, &nelem);
		lua_newtable(state);
		for (int i = 0; i < nelem; i++) {
			(void) lua_pushinteger(state, i + 1);
			(void) lua_pushstring(state, strarr[i]);
			(void) lua_settable(state, -3);
		}
		break;
	}
	case DATA_TYPE_UINT64_ARRAY: {
		uint64_t *intarr;
		uint_t nelem;
		(void) nvpair_value_uint64_array(pair, &intarr, &nelem);
		lua_newtable(state);
		for (int i = 0; i < nelem; i++) {
			(void) lua_pushinteger(state, i + 1);
			(void) lua_pushinteger(state, intarr[i]);
			(void) lua_settable(state, -3);
		}
		break;
	}
	case DATA_TYPE_INT64_ARRAY: {
		int64_t *intarr;
		uint_t nelem;
		(void) nvpair_value_int64_array(pair, &intarr, &nelem);
		lua_newtable(state);
		for (int i = 0; i < nelem; i++) {
			(void) lua_pushinteger(state, i + 1);
			(void) lua_pushinteger(state, intarr[i]);
			(void) lua_settable(state, -3);
		}
		break;
	}
	default: {
		if (errbuf != NULL) {
			(void) snprintf(errbuf, errbuf_len,
			    "Unhandled nvpair type %d for key '%s'",
			    nvpair_type(pair), nvpair_name(pair));
		}
		return (SET_ERROR(EINVAL));
	}
	}
	return (err);
}

int
zcp_dataset_hold_error(lua_State *state, dsl_pool_t *dp, const char *dsname,
    int error)
{
	if (error == ENOENT) {
		(void) zcp_argerror(state, 1, "no such dataset '%s'", dsname);
		return (0); /* not reached; zcp_argerror will longjmp */
	} else if (error == EXDEV) {
		(void) zcp_argerror(state, 1,
		    "dataset '%s' is not in the target pool '%s'",
		    dsname, spa_name(dp->dp_spa));
		return (0); /* not reached; zcp_argerror will longjmp */
	} else if (error == EIO) {
		(void) luaL_error(state,
		    "I/O error while accessing dataset '%s'", dsname);
		return (0); /* not reached; luaL_error will longjmp */
	} else if (error != 0) {
		(void) luaL_error(state,
		    "unexpected error %d while accessing dataset '%s'",
		    error, dsname);
		return (0); /* not reached; luaL_error will longjmp */
	}
	return (0);
}

/*
 * Note: will longjmp (via lua_error()) on error.
 * Assumes that the dsname is argument #1 (for error reporting purposes).
 */
dsl_dataset_t *
zcp_dataset_hold(lua_State *state, dsl_pool_t *dp, const char *dsname,
    const void *tag)
{
	dsl_dataset_t *ds;
	int error = dsl_dataset_hold(dp, dsname, tag, &ds);
	(void) zcp_dataset_hold_error(state, dp, dsname, error);
	return (ds);
}

static int zcp_debug(lua_State *);
static const zcp_lib_info_t zcp_debug_info = {
	.name = "debug",
	.func = zcp_debug,
	.pargs = {
		{ .za_name = "debug string", .za_lua_type = LUA_TSTRING },
		{NULL, 0}
	},
	.kwargs = {
		{NULL, 0}
	}
};

static int
zcp_debug(lua_State *state)
{
	const char *dbgstring;
	zcp_run_info_t *ri = zcp_run_info(state);
	const zcp_lib_info_t *libinfo = &zcp_debug_info;

	zcp_parse_args(state, libinfo->name, libinfo->pargs, libinfo->kwargs);

	dbgstring = lua_tostring(state, 1);

	zfs_dbgmsg("txg %lld ZCP: %s", (longlong_t)ri->zri_tx->tx_txg,
	    dbgstring);

	return (0);
}

static int zcp_exists(lua_State *);
static const zcp_lib_info_t zcp_exists_info = {
	.name = "exists",
	.func = zcp_exists,
	.pargs = {
		{ .za_name = "dataset", .za_lua_type = LUA_TSTRING },
		{NULL, 0}
	},
	.kwargs = {
		{NULL, 0}
	}
};

static int
zcp_exists(lua_State *state)
{
	zcp_run_info_t *ri = zcp_run_info(state);
	dsl_pool_t *dp = ri->zri_pool;
	const zcp_lib_info_t *libinfo = &zcp_exists_info;

	zcp_parse_args(state, libinfo->name, libinfo->pargs, libinfo->kwargs);

	const char *dsname = lua_tostring(state, 1);

	dsl_dataset_t *ds;
	int error = dsl_dataset_hold(dp, dsname, FTAG, &ds);
	if (error == 0) {
		dsl_dataset_rele(ds, FTAG);
		lua_pushboolean(state, B_TRUE);
	} else if (error == ENOENT) {
		lua_pushboolean(state, B_FALSE);
	} else if (error == EXDEV) {
		return (luaL_error(state, "dataset '%s' is not in the "
		    "target pool", dsname));
	} else if (error == EIO) {
		return (luaL_error(state, "I/O error opening dataset '%s'",
		    dsname));
	} else if (error != 0) {
		return (luaL_error(state, "unexpected error %d", error));
	}

	return (1);
}

/*
 * Allocate/realloc/free a buffer for the lua interpreter.
 *
 * When nsize is 0, behaves as free() and returns NULL.
 *
 * If ptr is NULL, behaves as malloc() and returns an allocated buffer of
 * size at least nsize.
 *
 * Otherwise, behaves as realloc(), changing the allocation from osize to
 * nsize.  Shrinking the buffer size never fails.
 *
 * The original allocated buffer size is stored as a uint64 at the beginning
 * of the buffer to avoid actually reallocating when shrinking a buffer,
 * since lua requires that this operation never fail.
 */
static void *
zcp_lua_alloc(void *ud, void *ptr, size_t osize, size_t nsize)
{
	zcp_alloc_arg_t *allocargs = ud;

	if (nsize == 0) {
		if (ptr != NULL) {
			int64_t *allocbuf = (int64_t *)ptr - 1;
			int64_t allocsize = *allocbuf;
			ASSERT3S(allocsize, >, 0);
			ASSERT3S(allocargs->aa_alloc_remaining + allocsize, <=,
			    allocargs->aa_alloc_limit);
			allocargs->aa_alloc_remaining += allocsize;
			vmem_free(allocbuf, allocsize);
		}
		return (NULL);
	} else if (ptr == NULL) {
		int64_t *allocbuf;
		int64_t allocsize = nsize + sizeof (int64_t);

		if (!allocargs->aa_must_succeed &&
		    (allocsize <= 0 ||
		    allocsize > allocargs->aa_alloc_remaining)) {
			return (NULL);
		}
		/*
		 * Channel programs could spuriously fail with "Memory limit
		 * exhausted" when vmem_alloc(KM_NOSLEEP) could not
		 * immediately satisfy a request even though the program's
		 * memory limit had not been reached.  Allocate with KM_SLEEP
		 * instead, relying on the system to free up memory (e.g. by
		 * evicting from the ARC); even at the highest memory limit
		 * of 100MB, the channel program will not truly exhaust the
		 * system's memory.
		 */
		allocbuf = vmem_alloc(allocsize, KM_SLEEP);
		allocargs->aa_alloc_remaining -= allocsize;

		*allocbuf = allocsize;
		return (allocbuf + 1);
	} else if (nsize <= osize) {
		/*
		 * If shrinking the buffer, lua requires that the reallocation
		 * never fail.
		 */
		return (ptr);
	} else {
		ASSERT3U(nsize, >, osize);

		uint64_t *luabuf = zcp_lua_alloc(ud, NULL, 0, nsize);
		if (luabuf == NULL) {
			return (NULL);
		}
		(void) memcpy(luabuf, ptr, osize);
		VERIFY3P(zcp_lua_alloc(ud, ptr, osize, 0), ==, NULL);
		return (luabuf);
	}
}

static void
zcp_lua_counthook(lua_State *state, lua_Debug *ar)
{
	(void) ar;
	lua_getfield(state, LUA_REGISTRYINDEX, ZCP_RUN_INFO_KEY);
	zcp_run_info_t *ri = lua_touserdata(state, -1);

	/*
	 * Check if we were canceled while waiting for the
	 * txg to sync or from our open context thread
	 */
	if (ri->zri_canceled || (!ri->zri_sync && issig())) {
		ri->zri_canceled = B_TRUE;
		(void) lua_pushstring(state, "Channel program was canceled.");
		(void) lua_error(state);
		/* Unreachable */
	}

	/*
	 * Check how many instructions the channel program has
	 * executed so far, and compare against the limit.
	 */
	ri->zri_curinstrs += zfs_lua_check_instrlimit_interval;
	if (ri->zri_maxinstrs != 0 && ri->zri_curinstrs > ri->zri_maxinstrs) {
		ri->zri_timed_out = B_TRUE;
		(void) lua_pushstring(state,
		    "Channel program timed out.");
		(void) lua_error(state);
		/* Unreachable */
	}
}

static int
zcp_panic_cb(lua_State *state)
{
	panic("unprotected error in call to Lua API (%s)\n",
	    lua_tostring(state, -1));
	return (0);
}

static void
zcp_eval_impl(dmu_tx_t *tx, zcp_run_info_t *ri)
{
	int err;
	lua_State *state = ri->zri_state;

	VERIFY3U(3, ==, lua_gettop(state));

	/* finish initializing our runtime state */
	ri->zri_pool = dmu_tx_pool(tx);
	ri->zri_tx = tx;
	list_create(&ri->zri_cleanup_handlers, sizeof (zcp_cleanup_handler_t),
	    offsetof(zcp_cleanup_handler_t, zch_node));

	/*
	 * Store the zcp_run_info_t struct for this run in the Lua registry.
	 * Registry entries are not directly accessible by the Lua scripts but
	 * can be accessed by our callbacks.
	 */
	lua_pushlightuserdata(state, ri);
	lua_setfield(state, LUA_REGISTRYINDEX, ZCP_RUN_INFO_KEY);
	VERIFY3U(3, ==, lua_gettop(state));

	/*
	 * Tell the Lua interpreter to call our handler every count
	 * instructions.  Channel programs that execute too many instructions
	 * should die with ETIME.
	 */
	(void) lua_sethook(state, zcp_lua_counthook, LUA_MASKCOUNT,
	    zfs_lua_check_instrlimit_interval);

	/*
	 * Tell the Lua memory allocator to stop using KM_SLEEP before handing
	 * off control to the channel program.  Channel programs that use too
	 * much memory should die with ENOSPC.
	 */
	ri->zri_allocargs->aa_must_succeed = B_FALSE;

	/*
	 * Call the Lua function that open-context passed us.  This pops the
	 * function and its input from the stack and pushes any return
	 * or error values.
	 */
	err = lua_pcall(state, 1, LUA_MULTRET, 1);

	/*
	 * Let Lua use KM_SLEEP while we interpret the return values.
	 */
	ri->zri_allocargs->aa_must_succeed = B_TRUE;

	/*
	 * Remove the error handler callback from the stack.  At this point,
	 * there shouldn't be any cleanup handler registered in the handler
	 * list (zri_cleanup_handlers), regardless of whether it ran or not.
	 */
	list_destroy(&ri->zri_cleanup_handlers);
	lua_remove(state, 1);

	switch (err) {
	case LUA_OK: {
		/*
		 * Lua supports returning multiple values in a single return
		 * statement.  Return values will have been pushed onto the
		 * stack:
		 * 1: Return value 1
		 * 2: Return value 2
		 * 3: etc...
		 * To simplify the process of retrieving a return value from a
		 * channel program, we disallow returning more than one value
		 * to ZFS from the Lua script, yielding a singleton return
		 * nvlist of the form { "return": Return value 1 }.
		 */
		int return_count = lua_gettop(state);

		if (return_count == 1) {
			ri->zri_result = 0;
			zcp_convert_return_values(state, ri->zri_outnvl,
			    ZCP_RET_RETURN, &ri->zri_result);
		} else if (return_count > 1) {
			ri->zri_result = SET_ERROR(ECHRNG);
			lua_settop(state, 0);
			(void) lua_pushfstring(state, "Multiple return "
			    "values not supported");
			zcp_convert_return_values(state, ri->zri_outnvl,
			    ZCP_RET_ERROR, &ri->zri_result);
		}
		break;
	}
	case LUA_ERRRUN:
	case LUA_ERRGCMM: {
		/*
		 * The channel program encountered a fatal error within the
		 * script, such as failing an assertion, or calling a function
		 * with incompatible arguments.  The error value and the
		 * traceback generated by zcp_error_handler() should be on the
		 * stack.
		 */
		VERIFY3U(1, ==, lua_gettop(state));
		if (ri->zri_timed_out) {
			ri->zri_result = SET_ERROR(ETIME);
		} else if (ri->zri_canceled) {
			ri->zri_result = SET_ERROR(EINTR);
		} else {
			ri->zri_result = SET_ERROR(ECHRNG);
		}

		zcp_convert_return_values(state, ri->zri_outnvl,
		    ZCP_RET_ERROR, &ri->zri_result);

		if (ri->zri_result == ETIME && ri->zri_outnvl != NULL) {
			(void) nvlist_add_uint64(ri->zri_outnvl,
			    ZCP_ARG_INSTRLIMIT, ri->zri_curinstrs);
		}
		break;
	}
	case LUA_ERRERR: {
		/*
		 * The channel program encountered a fatal error within the
		 * script, and we encountered another error while trying to
		 * compute the traceback in zcp_error_handler().  We can only
		 * return the error message.
		 */
		VERIFY3U(1, ==, lua_gettop(state));
		if (ri->zri_timed_out) {
			ri->zri_result = SET_ERROR(ETIME);
		} else if (ri->zri_canceled) {
			ri->zri_result = SET_ERROR(EINTR);
		} else {
			ri->zri_result = SET_ERROR(ECHRNG);
		}

		zcp_convert_return_values(state, ri->zri_outnvl,
		    ZCP_RET_ERROR, &ri->zri_result);
		break;
	}
	case LUA_ERRMEM:
		/*
		 * Lua ran out of memory while running the channel program.
		 * There's not much we can do.
		 */
		ri->zri_result = SET_ERROR(ENOSPC);
		break;
	default:
		VERIFY0(err);
	}
}

static void
zcp_pool_error(zcp_run_info_t *ri, const char *poolname, int error)
{
	ri->zri_result = SET_ERROR(ECHRNG);
	lua_settop(ri->zri_state, 0);
	(void) lua_pushfstring(ri->zri_state, "Could not open pool: %s "
	    "errno: %d", poolname, error);
	zcp_convert_return_values(ri->zri_state, ri->zri_outnvl,
	    ZCP_RET_ERROR, &ri->zri_result);
}

/*
 * This callback is called when txg_wait_synced_sig encounters a signal.
 * txg_wait_synced_sig will continue to wait for the txg to complete
 * after calling this callback.
 */
static void
zcp_eval_sig(void *arg, dmu_tx_t *tx)
{
	(void) tx;
	zcp_run_info_t *ri = arg;

	ri->zri_canceled = B_TRUE;
}

static void
zcp_eval_sync(void *arg, dmu_tx_t *tx)
{
	zcp_run_info_t *ri = arg;

	/*
	 * Open context should have set up the stack to contain:
	 * 1: Error handler callback
	 * 2: Script to run (converted to a Lua function)
	 * 3: nvlist input to function (converted to Lua table or nil)
	 */
	VERIFY3U(3, ==, lua_gettop(ri->zri_state));

	zcp_eval_impl(tx, ri);
}

static void
zcp_eval_open(zcp_run_info_t *ri, const char *poolname)
{
	int error;
	dsl_pool_t *dp;
	dmu_tx_t *tx;

	/*
	 * See comment from the same assertion in zcp_eval_sync().
	 */
	VERIFY3U(3, ==, lua_gettop(ri->zri_state));

	error = dsl_pool_hold(poolname, FTAG, &dp);
	if (error != 0) {
		zcp_pool_error(ri, poolname, error);
		return;
	}

	/*
	 * As we are running in open-context, we have no transaction associated
	 * with the channel program.  At the same time, functions from the
	 * zfs.check submodule need to be associated with a transaction as
	 * they are basically dry-runs of their counterparts in the zfs.sync
	 * submodule.  These functions should be able to run in open-context.
	 * Therefore we create a new transaction that we later abort once
	 * the channel program has been evaluated.
	 */
	tx = dmu_tx_create_dd(dp->dp_mos_dir);

	zcp_eval_impl(tx, ri);

	dmu_tx_abort(tx);

	dsl_pool_rele(dp, FTAG);
}

int
zcp_eval(const char *poolname, const char *program, boolean_t sync,
    uint64_t instrlimit, uint64_t memlimit, nvpair_t *nvarg, nvlist_t *outnvl)
{
	int err;
	lua_State *state;
	zcp_run_info_t runinfo;

	if (instrlimit > zfs_lua_max_instrlimit)
		return (SET_ERROR(EINVAL));
	if (memlimit == 0 || memlimit > zfs_lua_max_memlimit)
		return (SET_ERROR(EINVAL));

	zcp_alloc_arg_t allocargs = {
		.aa_must_succeed = B_TRUE,
		.aa_alloc_remaining = (int64_t)memlimit,
		.aa_alloc_limit = (int64_t)memlimit,
	};

	/*
	 * Create a Lua state with a memory allocator that uses KM_SLEEP.
	 * This should never fail.
	 */
	state = lua_newstate(zcp_lua_alloc, &allocargs);
	VERIFY(state != NULL);
	(void) lua_atpanic(state, zcp_panic_cb);

	/*
	 * Load core Lua libraries we want access to.
	 */
	VERIFY3U(1, ==, luaopen_base(state));
	lua_pop(state, 1);
	VERIFY3U(1, ==, luaopen_coroutine(state));
	lua_setglobal(state, LUA_COLIBNAME);
	VERIFY0(lua_gettop(state));
	VERIFY3U(1, ==, luaopen_string(state));
	lua_setglobal(state, LUA_STRLIBNAME);
	VERIFY0(lua_gettop(state));
	VERIFY3U(1, ==, luaopen_table(state));
	lua_setglobal(state, LUA_TABLIBNAME);
	VERIFY0(lua_gettop(state));

	/*
	 * Load globally visible variables such as errno aliases.
	 */
	zcp_load_globals(state);
	VERIFY0(lua_gettop(state));

	/*
	 * Load ZFS-specific modules.
	 */
	lua_newtable(state);
	VERIFY3U(1, ==, zcp_load_list_lib(state));
	lua_setfield(state, -2, "list");
	VERIFY3U(1, ==, zcp_load_synctask_lib(state, B_FALSE));
	lua_setfield(state, -2, "check");
	VERIFY3U(1, ==, zcp_load_synctask_lib(state, B_TRUE));
	lua_setfield(state, -2, "sync");
	VERIFY3U(1, ==, zcp_load_get_lib(state));
	lua_pushcclosure(state, zcp_debug_info.func, 0);
	lua_setfield(state, -2, zcp_debug_info.name);
	lua_pushcclosure(state, zcp_exists_info.func, 0);
	lua_setfield(state, -2, zcp_exists_info.name);
	lua_setglobal(state, "zfs");
	VERIFY0(lua_gettop(state));

	/*
	 * Push the error-callback that calculates Lua stack traces on
	 * unexpected failures.
	 */
	lua_pushcfunction(state, zcp_error_handler);
	VERIFY3U(1, ==, lua_gettop(state));

	/*
	 * Load the actual script as a function onto the stack as text ("t").
	 * The only valid error condition is a syntax error in the script.
	 * ERRMEM should not be possible because our allocator is using
	 * KM_SLEEP.  ERRGCMM should not be possible because we have not
	 * added any objects with __gc metamethods to the interpreter that
	 * could fail.
	 */
	err = luaL_loadbufferx(state, program, strlen(program),
	    "channel program", "t");
	if (err == LUA_ERRSYNTAX) {
		fnvlist_add_string(outnvl, ZCP_RET_ERROR,
		    lua_tostring(state, -1));
		lua_close(state);
		return (SET_ERROR(EINVAL));
	}
	VERIFY0(err);
	VERIFY3U(2, ==, lua_gettop(state));

	/*
	 * Convert the input nvlist to a Lua object and put it on top of the
	 * stack.
	 */
	char errmsg[128];
	err = zcp_nvpair_value_to_lua(state, nvarg,
	    errmsg, sizeof (errmsg));
	if (err != 0) {
		fnvlist_add_string(outnvl, ZCP_RET_ERROR, errmsg);
		lua_close(state);
		return (SET_ERROR(EINVAL));
	}
	VERIFY3U(3, ==, lua_gettop(state));

	runinfo.zri_state = state;
	runinfo.zri_allocargs = &allocargs;
	runinfo.zri_outnvl = outnvl;
	runinfo.zri_result = 0;
	runinfo.zri_cred = CRED();
	runinfo.zri_proc = curproc;
	runinfo.zri_timed_out = B_FALSE;
	runinfo.zri_canceled = B_FALSE;
	runinfo.zri_sync = sync;
	runinfo.zri_space_used = 0;
	runinfo.zri_curinstrs = 0;
	runinfo.zri_maxinstrs = instrlimit;
	runinfo.zri_new_zvols = fnvlist_alloc();

	if (sync) {
		err = dsl_sync_task_sig(poolname, NULL, zcp_eval_sync,
		    zcp_eval_sig, &runinfo, 0, ZFS_SPACE_CHECK_ZCP_EVAL);
		if (err != 0)
			zcp_pool_error(&runinfo, poolname, err);
	} else {
		zcp_eval_open(&runinfo, poolname);
	}
	lua_close(state);

	/*
	 * Create device minor nodes for any new zvols.
	 */
	for (nvpair_t *pair = nvlist_next_nvpair(runinfo.zri_new_zvols, NULL);
	    pair != NULL;
	    pair = nvlist_next_nvpair(runinfo.zri_new_zvols, pair)) {
		zvol_create_minor(nvpair_name(pair));
	}
	fnvlist_free(runinfo.zri_new_zvols);

	return (runinfo.zri_result);
}

/*
 * Retrieve metadata about the currently running channel program.
 */
zcp_run_info_t *
zcp_run_info(lua_State *state)
{
	zcp_run_info_t *ri;

	lua_getfield(state, LUA_REGISTRYINDEX, ZCP_RUN_INFO_KEY);
	ri = lua_touserdata(state, -1);
	lua_pop(state, 1);
	return (ri);
}

/*
 * Argument Parsing
 * ================
 *
 * The Lua language allows methods to be called with any number of arguments
 * of any type. When calling back into ZFS we need to sanitize arguments from
 * channel programs to make sure unexpected arguments, or arguments of the
 * wrong type, result in clear error messages. To do this in a uniform way,
 * all callbacks from channel programs should use the zcp_parse_args()
 * function to interpret inputs.
 *
 * Positional vs Keyword Arguments
 * ===============================
 *
 * Every callback function takes a fixed set of required positional arguments
 * and optional keyword arguments. For example, the destroy function takes
 * a single positional string argument (the name of the dataset to destroy)
 * and an optional "defer" keyword boolean argument. When calling Lua
 * functions with parentheses, only positional arguments can be used:
 *
 * zfs.sync.snapshot("rpool@snap")
 *
 * To use keyword arguments, functions should be called with a single
 * argument that is a Lua table containing mappings of integer -> positional
 * arguments and string -> keyword arguments:
 *
 * zfs.sync.snapshot({[1]="rpool@snap", defer=true})
 *
 * The Lua language allows curly braces to be used in place of parentheses
 * as syntactic sugar for this calling convention:
 *
 * zfs.sync.snapshot{"rpool@snap", defer=true}
 */

/*
 * Throw an error and print the given arguments.  If there are too many
 * arguments to fit in the output buffer, only the error format string is
 * output.
 */
static void
zcp_args_error(lua_State *state, const char *fname, const zcp_arg_t *pargs,
    const zcp_arg_t *kwargs, const char *fmt, ...)
{
	int i;
	char errmsg[512];
	size_t len = sizeof (errmsg);
	size_t msglen = 0;
	va_list argp;

	va_start(argp, fmt);
	VERIFY3U(len, >, vsnprintf(errmsg, len, fmt, argp));
	va_end(argp);

	/*
	 * Calculate the total length of the final string, including extra
	 * formatting characters.  If the argument dump would be too large,
	 * only print the error string.
	 */
	msglen = strlen(errmsg);
	msglen += strlen(fname) + 4;	/* : + {} + null terminator */
	for (i = 0; pargs[i].za_name != NULL; i++) {
		msglen += strlen(pargs[i].za_name);
		msglen += strlen(lua_typename(state, pargs[i].za_lua_type));
		if (pargs[i + 1].za_name != NULL || kwargs[0].za_name != NULL)
			msglen += 5;	/* < + ( + )> + , */
		else
			msglen += 4;	/* < + ( + )> */
	}
	for (i = 0; kwargs[i].za_name != NULL; i++) {
		msglen += strlen(kwargs[i].za_name);
		msglen += strlen(lua_typename(state, kwargs[i].za_lua_type));
		if (kwargs[i + 1].za_name != NULL)
			msglen += 4;	/* =( + ) + , */
		else
			msglen += 3;	/* =( + ) */
	}

	if (msglen >= len)
		(void) luaL_error(state, errmsg);

	VERIFY3U(len, >, strlcat(errmsg, ": ", len));
	VERIFY3U(len, >, strlcat(errmsg, fname, len));
	VERIFY3U(len, >, strlcat(errmsg, "{", len));
	for (i = 0; pargs[i].za_name != NULL; i++) {
		VERIFY3U(len, >, strlcat(errmsg, "<", len));
		VERIFY3U(len, >, strlcat(errmsg, pargs[i].za_name, len));
		VERIFY3U(len, >, strlcat(errmsg, "(", len));
		VERIFY3U(len, >, strlcat(errmsg,
		    lua_typename(state, pargs[i].za_lua_type), len));
		VERIFY3U(len, >, strlcat(errmsg, ")>", len));
		if (pargs[i + 1].za_name != NULL ||
		    kwargs[0].za_name != NULL) {
			VERIFY3U(len, >, strlcat(errmsg, ", ", len));
		}
	}
	for (i = 0; kwargs[i].za_name != NULL; i++) {
		VERIFY3U(len, >, strlcat(errmsg, kwargs[i].za_name, len));
		VERIFY3U(len, >, strlcat(errmsg, "=(", len));
		VERIFY3U(len, >, strlcat(errmsg,
		    lua_typename(state, kwargs[i].za_lua_type), len));
		VERIFY3U(len, >, strlcat(errmsg, ")", len));
		if (kwargs[i + 1].za_name != NULL) {
			VERIFY3U(len, >, strlcat(errmsg, ", ", len));
		}
	}
	VERIFY3U(len, >, strlcat(errmsg, "}", len));

	(void) luaL_error(state, errmsg);
	panic("unreachable code");
}

static void
zcp_parse_table_args(lua_State *state, const char *fname,
    const zcp_arg_t *pargs, const zcp_arg_t *kwargs)
{
	int i;
	int type;

	for (i = 0; pargs[i].za_name != NULL; i++) {
		/*
		 * Check the table for this positional argument, leaving it
		 * on the top of the stack once we finish validating it.
		 */
		lua_pushinteger(state, i + 1);
		lua_gettable(state, 1);

		type = lua_type(state, -1);
		if (type == LUA_TNIL) {
			zcp_args_error(state, fname, pargs, kwargs,
			    "too few arguments");
			panic("unreachable code");
		} else if (type != pargs[i].za_lua_type) {
			zcp_args_error(state, fname, pargs, kwargs,
			    "arg %d wrong type (is '%s', expected '%s')",
			    i + 1, lua_typename(state, type),
			    lua_typename(state, pargs[i].za_lua_type));
			panic("unreachable code");
		}

		/*
		 * Remove the positional argument from the table.
		 */
		lua_pushinteger(state, i + 1);
		lua_pushnil(state);
		lua_settable(state, 1);
	}

	for (i = 0; kwargs[i].za_name != NULL; i++) {
		/*
		 * Check the table for this keyword argument, which may be
		 * nil if it was omitted.  Leave the value on the top of
		 * the stack after validating it.
		 */
		lua_getfield(state, 1, kwargs[i].za_name);

		type = lua_type(state, -1);
		if (type != LUA_TNIL && type != kwargs[i].za_lua_type) {
			zcp_args_error(state, fname, pargs, kwargs,
			    "kwarg '%s' wrong type (is '%s', expected '%s')",
			    kwargs[i].za_name, lua_typename(state, type),
			    lua_typename(state, kwargs[i].za_lua_type));
			panic("unreachable code");
		}

		/*
		 * Remove the keyword argument from the table.
		 */
		lua_pushnil(state);
		lua_setfield(state, 1, kwargs[i].za_name);
	}

	/*
	 * Any entries remaining in the table are invalid inputs; print
	 * an error message based on what the entry is.
	 */
	lua_pushnil(state);
	if (lua_next(state, 1)) {
		if (lua_isnumber(state, -2) && lua_tointeger(state, -2) > 0) {
			zcp_args_error(state, fname, pargs, kwargs,
			    "too many positional arguments");
		} else if (lua_isstring(state, -2)) {
			zcp_args_error(state, fname, pargs, kwargs,
			    "invalid kwarg '%s'", lua_tostring(state, -2));
		} else {
			zcp_args_error(state, fname, pargs, kwargs,
			    "kwarg keys must be strings");
		}
		panic("unreachable code");
	}

	lua_remove(state, 1);
}

static void
zcp_parse_pos_args(lua_State *state, const char *fname, const zcp_arg_t *pargs,
    const zcp_arg_t *kwargs)
{
	int i;
	int type;

	for (i = 0; pargs[i].za_name != NULL; i++) {
		type = lua_type(state, i + 1);
		if (type == LUA_TNONE) {
			zcp_args_error(state, fname, pargs, kwargs,
			    "too few arguments");
			panic("unreachable code");
		} else if (type != pargs[i].za_lua_type) {
			zcp_args_error(state, fname, pargs, kwargs,
			    "arg %d wrong type (is '%s', expected '%s')",
			    i + 1, lua_typename(state, type),
			    lua_typename(state, pargs[i].za_lua_type));
			panic("unreachable code");
		}
	}
	if (lua_gettop(state) != i) {
		zcp_args_error(state, fname, pargs, kwargs,
		    "too many positional arguments");
		panic("unreachable code");
	}

	for (i = 0; kwargs[i].za_name != NULL; i++) {
		lua_pushnil(state);
	}
}

/*
 * Checks the current Lua stack against an expected set of positional and
 * keyword arguments.  If the stack does not match the expected arguments,
 * this aborts the current channel program with a useful error message;
 * otherwise it re-arranges the stack so that it contains the positional
 * arguments followed by the keyword argument values in declaration order.
 * Any missing keyword argument is represented by a nil value on the stack.
 *
 * If the stack contains exactly one argument of type LUA_TTABLE, the curly
 * braces calling convention is assumed; otherwise the stack is parsed for
 * positional arguments only.
 *
 * This function should be used by every function callback.  It should be
 * called before the callback manipulates the Lua stack, as it assumes the
 * stack represents the function arguments.
 */
void
zcp_parse_args(lua_State *state, const char *fname, const zcp_arg_t *pargs,
    const zcp_arg_t *kwargs)
{
	if (lua_gettop(state) == 1 && lua_istable(state, 1)) {
		zcp_parse_table_args(state, fname, pargs, kwargs);
	} else {
		zcp_parse_pos_args(state, fname, pargs, kwargs);
	}
}

ZFS_MODULE_PARAM(zfs_lua, zfs_lua_, max_instrlimit, U64, ZMOD_RW,
	"Max instruction limit that can be specified for a channel program");

ZFS_MODULE_PARAM(zfs_lua, zfs_lua_, max_memlimit, U64, ZMOD_RW,
	"Max memory limit that can be specified for a channel program");