Message-ID: <8f2ddabc-01d0-dae9-f958-1b26a6bdf58c@grimberg.me>
Date: Mon, 11 May 2020 02:23:14 -0700
From: Sagi Grimberg <sagi@...mberg.me>
To: Ming Lei <ming.lei@...hat.com>
Cc: Baolin Wang <baolin.wang7@...il.com>,
Christoph Hellwig <hch@...radead.org>, axboe@...nel.dk,
Ulf Hansson <ulf.hansson@...aro.org>,
Adrian Hunter <adrian.hunter@...el.com>,
Arnd Bergmann <arnd@...db.de>,
Linus Walleij <linus.walleij@...aro.org>,
Paolo Valente <paolo.valente@...aro.org>,
Orson Zhai <orsonzhai@...il.com>,
Chunyan Zhang <zhang.lyra@...il.com>,
linux-mmc <linux-mmc@...r.kernel.org>,
linux-block <linux-block@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH v2 1/7] block: Extand commit_rqs() to do batch
processing
>>> Basically, my idea is to dequeue request one by one, and for each
>>> dequeued request:
>>>
>>> - we try to get a budget and driver tag, if both succeed, add the
>>> request to one per-task list which can be stored in stack variable,
>>> then continue to dequeue more request
>>>
>>> - if either budget or driver tag can't be allocated for this request,
>>> mark the last request in the per-task list as .last, and send the
>>> batched requests stored in the list to the LLD
>>>
>>> - when queueing batched requests to the LLD, if one request isn't queued
>>> to the driver successfully, call .commit_rqs() as before, and meanwhile
>>> add the remaining requests in the per-task list back to the scheduler
>>> queue or hctx->dispatch.
>>
>> Sounds good to me.
>>
>>> One issue is that this way might degrade sequential IO performance if
>>> the LLD just tells queue busy to blk-mq via return value of .queue_rq(),
>>> so I guess we still may need one flag, such as BLK_MQ_F_BATCHING_SUBMISSION.
>>
>> Why is that degrading sequential I/O performance? because the specific
>
> Some devices may only return BLK_STS_RESOURCE from .queue_rq(), so more
> requests are dequeued from the scheduler queue if we always queue batched
> IOs to the LLD. The chance of IO merging is reduced, so sequential IO
> performance will be affected.
>
> Such as some scsi device which doesn't use sdev->queue_depth for
> throttling IOs.
>
> For virtio-scsi or virtio-blk, we may stop the queue to avoid the
> potential effect.

Do we have a way to characterize such devices? I'd assume that most
devices will benefit from the batching, so maybe the flag needs to be
inverted? BLK_MQ_F_DONT_BATCHING_SUBMISSION?
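
To make the quoted dispatch scheme and the inverted flag concrete, here is a rough user-space sketch. It is only a model under stated assumptions, not blk-mq code: every name in it (sim_queue, sim_dispatch, dont_batch and so on) is hypothetical, the scheduler queue is a plain array, and budget and driver tag are checked before dequeueing so that a request is never left half-acquired. dont_batch stands in for the proposed BLK_MQ_F_DONT_BATCHING_SUBMISSION opt-out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SIM_QUEUE_CAP 16
#define SIM_MAX_BATCH 8

/* All types and functions here are hypothetical stand-ins, not kernel APIs. */
struct sim_request {
	int id;
	bool last;	/* set on the final request of a batch */
};

struct sim_queue {
	struct sim_request *sched[SIM_QUEUE_CAP];	/* scheduler queue (FIFO) */
	size_t head, tail;
	int budget;		/* free device budget */
	int tags;		/* free driver tags */
	int lld_room;		/* requests the LLD accepts before reporting busy */
	bool dont_batch;	/* models the proposed inverted opt-out flag */
	int accepted;		/* requests the LLD took */
	int doorbells;		/* hardware notifications (.last or commit) */
};

void sim_enqueue(struct sim_queue *q, struct sim_request *rq)
{
	assert(q->tail < SIM_QUEUE_CAP);
	q->sched[q->tail++] = rq;
}

/* Models .queue_rq(): returning false stands in for BLK_STS_RESOURCE. */
bool sim_queue_rq(struct sim_queue *q, struct sim_request *rq)
{
	if (q->lld_room == 0)
		return false;
	q->lld_room--;
	q->accepted++;
	if (rq->last)
		q->doorbells++;	/* the driver notifies hardware on .last */
	return true;
}

/* Models .commit_rqs(): notify hardware about what was already queued. */
void sim_commit_rqs(struct sim_queue *q)
{
	q->doorbells++;
}

/* Dequeue a batch, issue it, and return how many requests the LLD took. */
int sim_dispatch(struct sim_queue *q)
{
	struct sim_request *batch[SIM_MAX_BATCH];	/* per-task list on the stack */
	int limit = q->dont_batch ? 1 : SIM_MAX_BATCH;
	int n = 0, i;

	/* Collect requests while both a budget and a driver tag are available. */
	while (n < limit && q->head < q->tail) {
		if (q->budget == 0 || q->tags == 0)
			break;
		q->budget--;
		q->tags--;
		batch[n++] = q->sched[q->head++];
	}
	if (n == 0)
		return 0;
	batch[n - 1]->last = true;

	for (i = 0; i < n; i++) {
		if (sim_queue_rq(q, batch[i]))
			continue;
		/* LLD is busy: commit what was queued, put the rest back. */
		if (i > 0)
			sim_commit_rqs(q);
		batch[n - 1]->last = false;
		q->budget += n - i;
		q->tags += n - i;
		q->head -= (size_t)(n - i);	/* slots still hold the pointers */
		break;
	}
	return i;
}
```

In this model a batch of three requests reaches the device with a single notification, while the same three requests with dont_batch set take three dispatch calls and three notifications. The busy path also shows the concern raised above: requests pulled out of the scheduler queue and then put back were unavailable for merging in the meantime.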