Only interrupt active disk I/Os in failmode=continue

failmode=continue is in a sorry state. Originally designed to fix a very
specific problem, it causes crashes and panics for most people who end
up trying to use it. At this point, we should either remove it entirely,
or try to make it more usable.

With this patch, I choose the latter. While the feature is fundamentally
unpredictable and prone to race conditions, it should be possible to get
it to the point where it can at least sometimes be useful for some
users. This patch fixes one of the major issues with failmode=continue:
the deadman interrupts even ZIOs that are patiently waiting in line
behind stuck IOs. With this change, it interrupts only ZIOs that are
active on a leaf vdev.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Rob Norris <rob.norris@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Co-authored-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Closes #17372
Paul Dagnelie 2025-05-28 15:31:32 -07:00 committed by GitHub
parent 0372def8c9
commit c464f1d014
GPG Key ID: B5690EEEBB952194


@@ -2307,12 +2307,12 @@ zio_deadman_impl(zio_t *pio, int ziodepth)
 	zio_t *cio, *cio_next;
 	zio_link_t *zl = NULL;
 	vdev_t *vd = pio->io_vd;
+	uint64_t failmode = spa_get_deadman_failmode(pio->io_spa);
 
 	if (zio_deadman_log_all || (vd != NULL && vd->vdev_ops->vdev_op_leaf)) {
 		vdev_queue_t *vq = vd ? &vd->vdev_queue : NULL;
 		zbookmark_phys_t *zb = &pio->io_bookmark;
 		uint64_t delta = gethrtime() - pio->io_timestamp;
-		uint64_t failmode = spa_get_deadman_failmode(pio->io_spa);
 
 		zfs_dbgmsg("slow zio[%d]: zio=%px timestamp=%llu "
 		    "delta=%llu queued=%llu io=%llu "
@@ -2336,11 +2336,15 @@ zio_deadman_impl(zio_t *pio, int ziodepth)
 		    pio->io_error);
 		(void) zfs_ereport_post(FM_EREPORT_ZFS_DEADMAN,
 		    pio->io_spa, vd, zb, pio, 0);
+	}
 
-		if (failmode == ZIO_FAILURE_MODE_CONTINUE &&
-		    taskq_empty_ent(&pio->io_tqent)) {
-			zio_interrupt(pio);
-		}
+	if (vd != NULL && vd->vdev_ops->vdev_op_leaf &&
+	    list_is_empty(&pio->io_child_list) &&
+	    failmode == ZIO_FAILURE_MODE_CONTINUE &&
+	    taskq_empty_ent(&pio->io_tqent) &&
+	    pio->io_queue_state == ZIO_QS_ACTIVE) {
+		pio->io_error = EINTR;
+		zio_interrupt(pio);
 	}
 
 	mutex_enter(&pio->io_lock);
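
The new condition in the second hunk can be read as a single predicate over the state of one ZIO. The following is a minimal user-space sketch of that predicate, not the kernel code: every `model_*` and `MODEL_*` name is a hypothetical stand-in invented for illustration, and each boolean field simply records the result of the corresponding kernel check named in its comment.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these are NOT the real OpenZFS definitions. */
typedef enum {
	MODEL_QS_NONE,
	MODEL_QS_QUEUED,	/* waiting in the vdev queue */
	MODEL_QS_ACTIVE		/* issued to the device */
} model_queue_state_t;

typedef enum {
	MODEL_FAILMODE_WAIT,
	MODEL_FAILMODE_CONTINUE,
	MODEL_FAILMODE_PANIC
} model_failmode_t;

typedef struct model_zio {
	bool on_leaf_vdev;	/* vd != NULL && vd->vdev_ops->vdev_op_leaf */
	bool has_children;	/* !list_is_empty(&pio->io_child_list) */
	bool taskq_ent_idle;	/* taskq_empty_ent(&pio->io_tqent) */
	model_queue_state_t queue_state;	/* pio->io_queue_state */
	int error;		/* pio->io_error */
} model_zio_t;

/* Mirrors the five-way condition added by the patch. */
static bool
model_should_interrupt(const model_zio_t *zio, model_failmode_t failmode)
{
	return (zio->on_leaf_vdev &&
	    !zio->has_children &&
	    failmode == MODEL_FAILMODE_CONTINUE &&
	    zio->taskq_ent_idle &&
	    zio->queue_state == MODEL_QS_ACTIVE);
}

/*
 * Models one deadman visit: a qualifying ZIO gets EINTR, any other ZIO
 * is left untouched. Returns true when the ZIO would be interrupted.
 */
static bool
model_deadman_visit(model_zio_t *zio, model_failmode_t failmode)
{
	if (!model_should_interrupt(zio, failmode))
		return (false);
	zio->error = EINTR;
	return (true);
}

/* Usage: an active leaf I/O is interrupted, a queued one is not. */
static void
model_demo(void)
{
	model_zio_t active = { .on_leaf_vdev = true, .has_children = false,
	    .taskq_ent_idle = true, .queue_state = MODEL_QS_ACTIVE };
	model_zio_t queued = active;
	queued.queue_state = MODEL_QS_QUEUED;

	assert(model_deadman_visit(&active, MODEL_FAILMODE_CONTINUE));
	assert(active.error == EINTR);
	assert(!model_deadman_visit(&queued, MODEL_FAILMODE_CONTINUE));
	assert(queued.error == 0);
}
```

In this model, a ZIO that is still queued behind a stuck I/O fails the `MODEL_QS_ACTIVE` test and keeps its error field at zero, which corresponds to the behavior the commit message describes.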