Speed up WB_SYNC_NONE when a WB_SYNC_ALL occurs simultaneously

mirror of https://git.proxmox.com/git/mirror_zfs.git synced 2026-05-22 18:40:43 +03:00

Page writebacks with WB_SYNC_NONE can take several seconds to complete
since they wait for the transaction group to close before being
committed. This is usually not a problem since the caller does not
need to wait. However, if we're simultaneously doing a writeback
with WB_SYNC_ALL (e.g. via msync), the latter can block for several
seconds (up to zfs_txg_timeout) behind the active WB_SYNC_NONE
writeback, since it must wait for the transaction to complete
and the PG_writeback bit to be cleared.
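
For illustration, the latency described above can be observed from
userspace by timing msync(MS_SYNC) on a dirty shared mapping. The
following is a minimal sketch, not part of this commit: the helper name
timed_msync and the file path passed to it are hypothetical, and the
probe only measures one call. It does not by itself start a concurrent
WB_SYNC_NONE writeback, so a stall would only show up if background
writeback happens to hit the same page.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: dirty one page of a shared mapping of `path`
 * and return the number of seconds msync(MS_SYNC) took, or -1.0 on
 * any error.
 */
static double
timed_msync(const char *path)
{
	long pagesz = sysconf(_SC_PAGESIZE);
	assert(pagesz > 0);

	int fd = open(path, O_RDWR | O_CREAT, 0644);
	if (fd < 0)
		return (-1.0);
	if (ftruncate(fd, (off_t)pagesz) != 0) {
		close(fd);
		return (-1.0);
	}

	char *p = mmap(NULL, (size_t)pagesz, PROT_READ | PROT_WRITE,
	    MAP_SHARED, fd, 0);
	if (p == MAP_FAILED) {
		close(fd);
		return (-1.0);
	}

	/* Dirty the page so that msync has something to write back. */
	memset(p, 0xab, (size_t)pagesz);

	struct timespec t0, t1;
	clock_gettime(CLOCK_MONOTONIC, &t0);
	int rc = msync(p, (size_t)pagesz, MS_SYNC);
	clock_gettime(CLOCK_MONOTONIC, &t1);

	munmap(p, (size_t)pagesz);
	close(fd);

	if (rc != 0)
		return (-1.0);
	return ((double)(t1.tv_sec - t0.tv_sec) +
	    (double)(t1.tv_nsec - t0.tv_nsec) / 1e9);
}
```

Calling timed_msync() in a loop on a file in a ZFS dataset, while
another process keeps dirtying the same mapping, is one way to look for
multi-second outliers.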

This commit deals with two cases:

- No page writeback is active. A WB_SYNC_ALL page writeback starts
  and even completes. But when it's about to check if the PG_writeback
  bit has been cleared, another writeback with WB_SYNC_NONE starts.
  The sync page writeback ends up waiting for the non-sync page
  writeback to complete.

- A page writeback with WB_SYNC_NONE is already active when a
  WB_SYNC_ALL writeback starts. The WB_SYNC_ALL writeback ends up
  waiting for the WB_SYNC_NONE writeback.

The fix keeps track of active sync and non-sync writebacks on each
znode (the new z_sync_writes_cnt and z_async_writes_cnt counters) and
commits early when that lets a waiting sync writeback finish sooner.
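
The bookkeeping idea can be sketched in userspace as a toy model. This
is an assumption-laden illustration, not the code from this commit:
the type wb_tracker, the functions wb_init, wb_begin, wb_end and
wb_should_commit, and the exact commit rule are all hypothetical, and
the enum values merely stand in for WB_SYNC_NONE and WB_SYNC_ALL.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-ins for the kernel's WB_SYNC_NONE / WB_SYNC_ALL modes. */
enum wb_mode { WB_NONE, WB_ALL };

/* Hypothetical per-file tracker of in-flight writebacks. */
struct wb_tracker {
	atomic_uint sync_writes;	/* active sync writebacks */
	atomic_uint async_writes;	/* active non-sync writebacks */
};

static void
wb_init(struct wb_tracker *t)
{
	atomic_init(&t->sync_writes, 0);
	atomic_init(&t->async_writes, 0);
}

/* Record that a writeback of the given mode has started. */
static void
wb_begin(struct wb_tracker *t, enum wb_mode m)
{
	atomic_fetch_add(m == WB_ALL ? &t->sync_writes : &t->async_writes, 1);
}

/* Record that a writeback of the given mode has finished. */
static void
wb_end(struct wb_tracker *t, enum wb_mode m)
{
	atomic_uint *cnt = (m == WB_ALL) ? &t->sync_writes : &t->async_writes;
	assert(atomic_load(cnt) > 0);
	atomic_fetch_sub(cnt, 1);
}

/*
 * Assumed rule for this sketch: commit early whenever a writeback of
 * the other kind is in flight, since that is when a sync writeback
 * could be left waiting on a non-sync one.
 */
static bool
wb_should_commit(struct wb_tracker *t, enum wb_mode m)
{
	if (m == WB_ALL)
		return (atomic_load(&t->async_writes) > 0);
	return (atomic_load(&t->sync_writes) > 0);
}
```

In this model a sync writeback that starts while a non-sync one is
active (the second case above) sees wb_should_commit() return true,
and a non-sync writeback that starts while a sync one is still active
(the first case) sees the same.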

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Shaan Nobee <sniper111@gmail.com>
Closes #12662
Closes #12790

Author: Shaan Nobee
Date: 2022-05-04 00:23:26 +04:00
Committed by: Tony Hutter
Parent: 8a315a30ab
Commit: 9e5a297de6

19 changed files with 208 additions and 23 deletions

									
include/sys/zfs_znode.h (+2)

@@ -198,6 +198,8 @@ typedef struct znode {
 	uint64_t	z_size;		/* file size (cached) */
 	uint64_t	z_pflags;	/* pflags (cached) */
 	uint32_t	z_sync_cnt;	/* synchronous open count */
+	uint32_t	z_sync_writes_cnt; /* synchronous write count */
+	uint32_t	z_async_writes_cnt; /* asynchronous write count */
 	mode_t		z_mode;		/* mode (cached) */
 	kmutex_t	z_acl_lock;	/* acl data lock */
 	zfs_acl_t	*z_acl_cached;	/* cached acl */