Initialize mmp_last_write when the mmp thread starts

A great deal of time may go by between when mmp_init() is called and
the MMP thread starts, particularly if there are bad devices, because
there is I/O checking configs etc.  If this time is too long,

    (gethrtime() - mmp_last_write) > mmp_fail_ns

at the time the MMP thread starts.  If MMP is configured to suspend
the pool, the pool will be suspended immediately.

This can be seen in issue #10838

The value of mmp_last_write doesn't matter before the mmp thread
starts.  To give the MMP thread time to issue and land MMP writes,
initialize mmp_last_write when the MMP thread starts.

Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #10873
This commit is contained in:
Olaf Faaland 2020-09-09 10:12:54 -07:00 committed by GitHub
parent cba6d86638
commit a74259cea0
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23

View File

@ -198,14 +198,6 @@ mmp_init(spa_t *spa)
cv_init(&mmp->mmp_thread_cv, NULL, CV_DEFAULT, NULL); cv_init(&mmp->mmp_thread_cv, NULL, CV_DEFAULT, NULL);
mutex_init(&mmp->mmp_io_lock, NULL, MUTEX_DEFAULT, NULL); mutex_init(&mmp->mmp_io_lock, NULL, MUTEX_DEFAULT, NULL);
mmp->mmp_kstat_id = 1; mmp->mmp_kstat_id = 1;
/*
* mmp_write_done() calculates mmp_delay based on prior mmp_delay and
* the elapsed time since the last write. For the first mmp write,
* there is no "last write", so we start with fake non-zero values.
*/
mmp->mmp_last_write = gethrtime();
mmp->mmp_delay = MSEC2NSEC(MMP_INTERVAL_OK(zfs_multihost_interval));
} }
void void
@ -557,6 +549,18 @@ mmp_thread(void *arg)
mmp_thread_enter(mmp, &cpr); mmp_thread_enter(mmp, &cpr);
/*
* There have been no MMP writes yet. Setting mmp_last_write here gives
* us one mmp_fail_ns period, which is consistent with the activity
* check duration, to try to land an MMP write before MMP suspends the
* pool (if so configured).
*/
mutex_enter(&mmp->mmp_io_lock);
mmp->mmp_last_write = gethrtime();
mmp->mmp_delay = MSEC2NSEC(MMP_INTERVAL_OK(zfs_multihost_interval));
mutex_exit(&mmp->mmp_io_lock);
while (!mmp->mmp_thread_exiting) { while (!mmp->mmp_thread_exiting) {
hrtime_t next_time = gethrtime() + hrtime_t next_time = gethrtime() +
MSEC2NSEC(MMP_DEFAULT_INTERVAL); MSEC2NSEC(MMP_DEFAULT_INTERVAL);