FreeBSD: zfs_putpage: handle page writeback errors

Page writeback is considered completed when the associated itx callback
completes. A syncing writeback will receive the error in its callback
directly, but an in-flight async writeback that was promoted to sync by
the ZIL may also receive an error.

Writeback errors, even syncing writeback errors, are not especially
serious on their own, because the error will ultimately be returned to
the zil_commit() caller: either zfs_fsync() for an explicit sync op
(e.g. msync()), or zfs_putpage() itself for a syncing
(VM_PAGER_PUT_SYNC) writeback.

The only thing we need to do when a page writeback fails is to skip
marking the page clean ("undirty"), since we don't know if it made it to
disk yet. This ensures that it gets written out again in the future,
either by some scheduled async writeback or by another explicit syncing
call.
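
As a rough userspace illustration of that rule (not the kernel code in
this commit), the sketch below models a completion callback that always
releases its pages but clears the dirty flag only on success. The
struct toy_page type and the toy_* functions are hypothetical names
invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a VM page; not the FreeBSD vm_page_t. */
struct toy_page {
	bool dirty;	/* page holds data not known to be on disk */
	bool busy;	/* page is held while writeback is in flight */
};

/*
 * Toy writeback-completion callback. Pages are always released, but
 * they are marked clean only when the writeback reported success.
 */
static void
toy_putpage_commit_cb(struct toy_page *pages, size_t npages, int err)
{
	assert(pages != NULL && npages > 0);

	for (size_t i = 0; i < npages; i++) {
		if (err == 0)
			pages[i].dirty = false;
		pages[i].busy = false;
	}
}

/* Count pages that a later writeback pass would still have to write. */
static size_t
toy_count_dirty(const struct toy_page *pages, size_t npages)
{
	size_t n = 0;

	for (size_t i = 0; i < npages; i++)
		if (pages[i].dirty)
			n++;
	return (n);
}
```

In this model a failed callback leaves every page dirty, so a later
pass picks the same pages up again; a successful one leaves none.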

On the other side, we need to make sure that if a syncing op arrives,
any changes on dirty pages are written back to the DMU and/or the ZIL
first. We do this by starting an async writeback on the vnode cache
before committing, so that any dirty data has been recorded in the ZIL,
ready for the followup zfs_fsync()->zil_commit() to find.
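
The ordering described above can be sketched in userspace as follows.
This is a simplified model under stated assumptions, not the ZFS
implementation: struct toy_file and the toy_* functions are
hypothetical, the "log" is just a counter, and a failed flush is
simulated with a flag.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical file state; none of these names are ZFS or FreeBSD APIs. */
struct toy_file {
	int dirty_pages;	/* mmap()'d pages not yet pushed down */
	int logged_records;	/* writes recorded and awaiting commit */
	int committed_records;	/* writes made durable by a commit */
	bool flush_fails;	/* simulate a failed page flush */
};

/* Push dirty pages into the log. Returns false if the flush failed. */
static bool
toy_flush_dirty_pages(struct toy_file *f)
{
	if (f->flush_fails)
		return (false);
	f->logged_records += f->dirty_pages;
	f->dirty_pages = 0;
	return (true);
}

/* Commit everything currently in the log. */
static int
toy_commit(struct toy_file *f)
{
	f->committed_records += f->logged_records;
	f->logged_records = 0;
	return (0);
}

/*
 * Toy fsync: flush dirty pages first so the commit can see them. If
 * the flush fails, return EIO without attempting the commit, because
 * the log may not contain the data the caller asked to sync.
 */
static int
toy_fsync(struct toy_file *f)
{
	assert(f != NULL);

	if (f->dirty_pages > 0 && !toy_flush_dirty_pages(f))
		return (EIO);
	return (toy_commit(f));
}
```

Returning early on a failed flush is the conservative choice in this
model: the pages stay dirty for a later attempt, and the caller sees an
error instead of a commit result that says nothing about them.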

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17398
Rob Norris 2025-06-04 19:20:24 +10:00 committed by Brian Behlendorf
parent 3d6ee9a68c
commit ef4058fcdc

@@ -4314,7 +4314,6 @@ typedef struct {
 static void
 zfs_putpage_commit_cb(void *arg, int err)
 {
-	(void) err;
 	putpage_commit_arg_t *pca = arg;
 	vm_object_t object = pca->pca_pages[0]->object;
@@ -4322,7 +4321,17 @@ zfs_putpage_commit_cb(void *arg, int err)
 	for (uint_t i = 0; i < pca->pca_npages; i++) {
 		vm_page_t pp = pca->pca_pages[i];
-		vm_page_undirty(pp);
+		if (err == 0) {
+			/*
+			 * Writeback succeeded, so undirty the page. If it
+			 * fails, we leave it in the same state it was. That's
+			 * most likely dirty, so it will get tried again some
+			 * other time.
+			 */
+			vm_page_undirty(pp);
+		}
 		vm_page_sunbusy(pp);
 	}
@@ -5228,8 +5237,32 @@ struct vop_fsync_args {
 static int
 zfs_freebsd_fsync(struct vop_fsync_args *ap)
 {
+	vnode_t *vp = ap->a_vp;
+	int err = 0;
-	return (zfs_fsync(VTOZ(ap->a_vp), 0, ap->a_td->td_ucred));
+	/*
+	 * Push any dirty mmap()'d data out to the DMU and ZIL, ready for
+	 * zil_commit() to be called in zfs_fsync().
+	 */
+	if (vm_object_mightbedirty(vp->v_object)) {
+		zfs_vmobject_wlock(vp->v_object);
+		if (!vm_object_page_clean(vp->v_object, 0, 0, 0))
+			err = SET_ERROR(EIO);
+		zfs_vmobject_wunlock(vp->v_object);
+		if (err) {
+			/*
+			 * Unclear what state things are in. zfs_putpages()
+			 * will ensure the pages remain dirty if they haven't
+			 * been written down to the DMU, but because there may
+			 * be nothing logged, we can't assume that zfs_sync()
+			 * -> zil_commit() will give us a useful error. It's
+			 * safest if we just error out here.
+			 */
+			return (err);
+		}
+	}
+	return (zfs_fsync(VTOZ(vp), 0, ap->a_td->td_ucred));
 }
 #ifndef _SYS_SYSPROTO_H_