Improve performance of zio_taskq_member

__zio_execute() calls zio_taskq_member() to determine if we are running
in a zio interrupt taskq, in which case we may need to switch to
processing this zio in a zio issue taskq.  The call to
zio_taskq_member() can become a performance bottleneck when we are
processing a high rate of zios.

zio_taskq_member() calls taskq_member() on each of the zio interrupt
taskqs, of which there are 21.  This is slow because each call to
taskq_member() does tsd_get(taskq_tsd), which on Linux is relatively
slow.

This commit improves the performance of zio_taskq_member() by having it
cache the value of tsd_get(taskq_tsd), reducing the number of those
calls to 1/21st of what it was.
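
The difference between the two lookup patterns can be sketched in
userland C as follows.  This is a minimal illustration, not the code
from this commit: queue_t, current_queue(), set_current_queue(),
is_member_uncached(), is_member_cached() and the tsd_lookups counter are
hypothetical names, and pthread_getspecific() stands in for tsd_get().

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define	NUM_QUEUES	21	/* queue count quoted in the commit message */

/* Hypothetical stand-in for taskq_t. */
typedef struct queue {
	int q_id;
} queue_t;

static queue_t queues[NUM_QUEUES];
static pthread_key_t queue_key;
static pthread_once_t queue_key_once = PTHREAD_ONCE_INIT;
static int tsd_lookups;	/* demo counter, not thread-safe */

static void
queue_key_init(void)
{
	int rc = pthread_key_create(&queue_key, NULL);
	assert(rc == 0);
	(void) rc;
}

/* One thread-specific-data read per call; this is the "slow" step. */
static queue_t *
current_queue(void)
{
	(void) pthread_once(&queue_key_once, queue_key_init);
	tsd_lookups++;
	return (pthread_getspecific(queue_key));
}

/* Record which queue (if any) the calling thread belongs to. */
static void
set_current_queue(queue_t *q)
{
	(void) pthread_once(&queue_key_once, queue_key_init);
	int rc = pthread_setspecific(queue_key, q);
	assert(rc == 0);
	(void) rc;
}

/* Old pattern: a fresh TSD read for every queue that is checked. */
static int
is_member_uncached(void)
{
	for (int i = 0; i < NUM_QUEUES; i++) {
		if (current_queue() == &queues[i])
			return (1);
	}
	return (0);
}

/* New pattern: read the TSD once, then compare plain pointers. */
static int
is_member_cached(void)
{
	queue_t *cur = current_queue();

	for (int i = 0; i < NUM_QUEUES; i++) {
		if (&queues[i] == cur)
			return (1);
	}
	return (0);
}
```

Both functions return the same answer; only the number of TSD reads
differs, which is the cost this commit targets.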

In a test case running `zfs send -c >/dev/null` of a filesystem with
small blocks (average 2.5KB/block), zio_taskq_member() was using 6.7% of
one CPU; with this change that drops to 1.3%.  Overall throughput of the
`zfs send` improved by 10% (~150,000 blocks/sec to ~165,000 blocks/sec).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #10070
Matthew Ahrens
2020-03-03 10:29:38 -08:00
committed by GitHub
parent 0a0f9a7dc6
commit b3212d2fa6
5 changed files with 24 additions and 2 deletions
@@ -34,6 +34,8 @@ int taskq_now;
taskq_t *system_taskq;
taskq_t *system_delay_taskq;
static pthread_key_t taskq_tsd;
#define TASKQ_ACTIVE 0x00010000
static taskq_ent_t *
@@ -213,6 +215,8 @@ taskq_thread(void *arg)
	taskq_ent_t *t;
	boolean_t prealloc;
	VERIFY0(pthread_setspecific(taskq_tsd, tq));
	mutex_enter(&tq->tq_lock);
	while (tq->tq_flags & TASKQ_ACTIVE) {
		if ((t = tq->tq_task.tqent_next) == &tq->tq_task) {
@@ -343,6 +347,12 @@ taskq_member(taskq_t *tq, kthread_t *t)
	return (0);
}
taskq_t *
taskq_of_curthread(void)
{
	return (pthread_getspecific(taskq_tsd));
}
int
taskq_cancel_id(taskq_t *tq, taskqid_t id)
{
@@ -352,6 +362,7 @@ taskq_cancel_id(taskq_t *tq, taskqid_t id)
void
system_taskq_init(void)
{
	VERIFY0(pthread_key_create(&taskq_tsd, NULL));
	system_taskq = taskq_create("system_taskq", 64, maxclsyspri, 4, 512,
	    TASKQ_DYNAMIC | TASKQ_PREPOPULATE);
	system_delay_taskq = taskq_create("delay_taskq", 4, maxclsyspri, 4,
@@ -365,4 +376,5 @@ system_taskq_fini(void)
	system_taskq = NULL; /* defensive */
	taskq_destroy(system_delay_taskq);
	system_delay_taskq = NULL;
	VERIFY0(pthread_key_delete(taskq_tsd));
}
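
The key lifecycle used in the hunks above (create the key at init, tag
each worker thread with its owning queue, read the tag back, delete the
key at fini) can be exercised in isolation.  The following is a minimal
sketch under assumed names: demo_taskq_t, demo_tsd, demo_of_curthread(),
demo_worker() and demo_run() are hypothetical and are not part of this
commit; only the pthread calls are real.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for taskq_t. */
typedef struct demo_taskq {
	const char *dt_name;
} demo_taskq_t;

static pthread_key_t demo_tsd;

/* Analogous to taskq_of_curthread(): NULL if the thread was never tagged. */
static demo_taskq_t *
demo_of_curthread(void)
{
	return (pthread_getspecific(demo_tsd));
}

/* Worker entry point: tag this thread with its owning queue. */
static void *
demo_worker(void *arg)
{
	int rc = pthread_setspecific(demo_tsd, arg);
	assert(rc == 0);
	(void) rc;

	/* Report what the lookup sees from inside the worker. */
	return (demo_of_curthread());
}

/*
 * Returns 1 if the worker saw its own queue and the calling thread,
 * which never set a value, saw NULL.  Returns 0 otherwise.
 */
static int
demo_run(void)
{
	demo_taskq_t tq = { "demo" };
	pthread_t tid;
	void *seen = NULL;
	int ok;

	if (pthread_key_create(&demo_tsd, NULL) != 0)
		return (0);
	if (pthread_create(&tid, NULL, demo_worker, &tq) != 0) {
		(void) pthread_key_delete(demo_tsd);
		return (0);
	}
	(void) pthread_join(tid, &seen);

	ok = (seen == &tq) && (demo_of_curthread() == NULL);
	(void) pthread_key_delete(demo_tsd);
	return (ok);
}
```

Calling demo_run() from a thread that is not a worker should return 1,
since a key that was never set for a thread reads back as NULL.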