Message-ID: <237bd7d8-9e75-01b5-ebe7-8b1eb747474b@kernel.dk>
Date: Sun, 19 Dec 2021 08:28:18 -0700
From: Jens Axboe <axboe@...nel.dk>
To: "Alex Xu (Hello71)" <alex_y_xu@...oo.ca>,
linux-block@...r.kernel.org, Dexuan Cui <decui@...rosoft.com>,
ming.lei@...hat.com, hch@....de, Long Li <longli@...rosoft.com>,
"Michael Kelley (LINUX)" <mikelley@...rosoft.com>,
linux-kernel@...r.kernel.org
Subject: Re: very low IOPS due to "block: reduce kblockd_mod_delayed_work_on() CPU consumption"
On 12/19/21 7:58 AM, Jens Axboe wrote:
> On 12/18/21 12:02 PM, Jens Axboe wrote:
>> On 12/18/21 11:57 AM, Alex Xu (Hello71) wrote:
>>> Hi,
>>>
>>> I recently noticed that between 6441998e2e and 9eaa88c703, I/O became
>>> much slower on my machine using ext4 on dm-crypt on NVMe with bfq
>>> scheduler. Checking iostat during heavy usage (find / -xdev and fstrim
>>> -v /), maximum IOPS had fallen from ~10000 to ~100. Reverting cb2ac2912a
>>> ("block: reduce kblockd_mod_delayed_work_on() CPU consumption") resolves
>>> the issue.
>>
>> Hmm interesting. I'll try and see if I can reproduce this and come up
>> with a fix.
>
> I can reproduce this. Alex, can you see if this one helps? Trying to see
> if we can hit a happy medium here that avoids hammering on that timer,
> but it really depends on what the mix is here of delay with pending,
> or no delay with no pending.
>
> Dexuan, can you test this for your test case too? I'm going to queue
> up a revert for -rc6 just in case.

This one should be better...

diff --git a/block/blk-core.c b/block/blk-core.c
index c1833f95cb97..5e9e3c2b7a94 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1481,12 +1481,17 @@ int kblockd_schedule_work(struct work_struct *work)
 }
 EXPORT_SYMBOL(kblockd_schedule_work);
 
-int kblockd_mod_delayed_work_on(int cpu, struct delayed_work *dwork,
-				unsigned long delay)
+void kblockd_mod_delayed_work_on(int cpu, struct delayed_work *dwork,
+				 unsigned long msecs)
 {
-	if (!delay)
-		return queue_work_on(cpu, kblockd_workqueue, &dwork->work);
-	return mod_delayed_work_on(cpu, kblockd_workqueue, dwork, delay);
+	if (!msecs) {
+		cancel_delayed_work(dwork);
+		queue_work_on(cpu, kblockd_workqueue, &dwork->work);
+	} else {
+		unsigned long delay = msecs_to_jiffies(msecs);
+
+		mod_delayed_work_on(cpu, kblockd_workqueue, dwork, delay);
+	}
 }
 EXPORT_SYMBOL(kblockd_mod_delayed_work_on);
 
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 8874a63ae952..95288a98dae1 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1155,8 +1155,7 @@ EXPORT_SYMBOL(blk_mq_kick_requeue_list);
 void blk_mq_delay_kick_requeue_list(struct request_queue *q,
 				    unsigned long msecs)
 {
-	kblockd_mod_delayed_work_on(WORK_CPU_UNBOUND, &q->requeue_work,
-				    msecs_to_jiffies(msecs));
+	kblockd_mod_delayed_work_on(WORK_CPU_UNBOUND, &q->requeue_work, msecs);
 }
 EXPORT_SYMBOL(blk_mq_delay_kick_requeue_list);
 
@@ -1868,7 +1867,7 @@ static void __blk_mq_delay_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async,
 	}
 
 	kblockd_mod_delayed_work_on(blk_mq_hctx_next_cpu(hctx), &hctx->run_work,
-				    msecs_to_jiffies(msecs));
+				    msecs);
 }
 
 /**
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index bd4370baccca..40748eedddbb 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1159,7 +1159,7 @@ static inline unsigned int block_size(struct block_device *bdev)
 }
 
 int kblockd_schedule_work(struct work_struct *work);
-int kblockd_mod_delayed_work_on(int cpu, struct delayed_work *dwork, unsigned long delay);
+void kblockd_mod_delayed_work_on(int cpu, struct delayed_work *dwork, unsigned long msecs);
 
 #define MODULE_ALIAS_BLOCKDEV(major,minor) \
 	MODULE_ALIAS("block-major-" __stringify(major) "-" __stringify(minor))
--
Jens Axboe
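
Note on the zero-delay path: the following is a minimal userspace sketch, not kernel code, of why the added cancel_delayed_work() call appears to matter. It rests on an assumption that the thread does not spell out: queue_work_on() does nothing when the work item is already pending, and a delayed work item whose timer has not fired yet counts as pending. All model_* names are hypothetical, the cancel helper is simplified, and time is counted in abstract ticks rather than milliseconds or jiffies.

```c
#include <stdbool.h>

/* Hypothetical userspace model of one delayed work item; not kernel code. */
struct model_dwork {
	bool pending;		/* stands in for the work item's pending bit */
	unsigned long run_at;	/* tick at which the handler would run */
};

/* Stand-in for queue_work_on(): a no-op if the item is already pending. */
bool model_queue_work(struct model_dwork *w, unsigned long now)
{
	if (w->pending)
		return false;
	w->pending = true;
	w->run_at = now;
	return true;
}

/* Stand-in for mod_delayed_work_on(): always (re)arms the timer. */
void model_mod_delayed_work(struct model_dwork *w, unsigned long now,
			    unsigned long delay)
{
	w->pending = true;
	w->run_at = now + delay;
}

/* Simplified stand-in for cancel_delayed_work(): drops a pending item. */
bool model_cancel_delayed_work(struct model_dwork *w)
{
	bool was_pending = w->pending;

	w->pending = false;
	return was_pending;
}

/* Zero-delay handling as on the '-' lines of the blk-core.c hunk. */
void model_kick_before(struct model_dwork *w, unsigned long now,
		       unsigned long delay)
{
	if (!delay)
		model_queue_work(w, now);
	else
		model_mod_delayed_work(w, now, delay);
}

/* Zero-delay handling as on the '+' lines of the blk-core.c hunk. */
void model_kick_after(struct model_dwork *w, unsigned long now,
		      unsigned long delay)
{
	if (!delay) {
		model_cancel_delayed_work(w);
		model_queue_work(w, now);
	} else {
		model_mod_delayed_work(w, now, delay);
	}
}
```

In this model, a kick with a 100-tick delay at tick 0 followed by a zero-delay kick at tick 1 leaves run_at at 100 with model_kick_before(), because the second kick is dropped, and moves it to 1 with model_kick_after(). That would be consistent with the IOPS drop reported above, though the model says nothing about the CPU cost of the extra cancel.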