ZTS: fail test run if test runner crashes unexpectedly

zfs-tests.sh executes test-runner.py to do the actual test work. Any
exit code < 4 is interpreted as success, with the actual value
describing the outcome of the tests inside.

If a Python program crashes in some way (eg an uncaught exception), the
process exit code is 1.

Taken together, this means that test-runner.py can crash during setup,
but return a "success" error code to zfs-tests.sh, which will report and
exit 0. This in turn causes the CI runner to believe the test run
completed successfully.

This commit addresses this by making zfs-tests.sh interpret an exit code
of 255 as a failure in the runner itself. Then, in test-runner.py, the
"fail()" function defaults to a 255 return, and the main function gets
wrapped in a generic exception handler, which prints it and calls
fail().

All together, this should mean that any unexpected failure in the test
runner itself will be propagated out of zfs-tests.sh for CI or any other
calling program to deal with.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17858
This commit is contained in:
Rob Norris
2025-10-22 08:34:39 +11:00
committed by GitHub
parent 3a55e76b84
commit 72f41454a6
2 changed files with 23 additions and 14 deletions
+4
View File
@@ -797,6 +797,10 @@ msg "${TEST_RUNNER}" \
2>&1; echo $? >"$REPORT_FILE"; } | tee "$RESULTS_FILE"
read -r RUNRESULT <"$REPORT_FILE"
if [[ "$RUNRESULT" -eq "255" ]] ; then
fail "$TEST_RUNNER failed, test aborted."
fi
#
# Analyze the results.
#